How do I write data from Flink 1.17 to Doris with Flink CDC?

To capture changes with Flink CDC and write them from Flink 1.17 into Doris, follow these steps:


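1. Add dependencies

Add the Doris connector and the MySQL CDC connector to the project's pom.xml. The versions shown are examples of releases built for Flink 1.17; check each project's compatibility notes before pinning them:

<!-- Doris connector; the artifact name carries the Flink version it targets -->
<dependency>
    <groupId>org.apache.doris</groupId>
    <artifactId>flink-doris-connector-1.17</artifactId>
    <version>1.4.0</version>
</dependency>
<!-- MySQL CDC connector; the 2.4.x line supports Flink 1.17 -->
<dependency>
    <groupId>com.ververica</groupId>
    <artifactId>flink-connector-mysql-cdc</artifactId>
    <version>2.4.0</version>
</dependency>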

2. Create the Flink CDC source

Create a Flink CDC source that reads change events from a MySQL database. Each event is emitted as a Debezium-style JSON string:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;

public class FlinkCDCSourceExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // The CDC source records its binlog position in checkpoints
        env.enableCheckpointing(10000);
        MySqlSource<String> source = MySqlSource.<String>builder()
                .hostname("localhost")
                .port(3306)
                .databaseList("mydb") // database to monitor
                .tableList("mydb.mytable") // table to monitor, qualified with the database name
                .username("root")
                .password("password")
                .deserializer(new JsonDebeziumDeserializationSchema()) // emit each change event as JSON
                .build();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source").print();
        env.execute("Flink CDC Example");
    }
}
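The table list passed to the source uses database-qualified names such as mydb.mytable. When several tables are monitored, a small helper can build those names. The sketch below is only an illustration: `TableNames`, `qualify`, and the `orders` table are hypothetical and not part of Flink CDC.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Flink CDC): builds database-qualified table names. */
final class TableNames {
    private TableNames() {}

    /** Prefixes each bare table name with the database, e.g. "mydb" + "mytable" gives "mydb.mytable". */
    static String[] qualify(String database, String... tables) {
        List<String> qualified = new ArrayList<>();
        for (String table : tables) {
            // Names that already carry the database prefix are left untouched.
            qualified.add(table.startsWith(database + ".") ? table : database + "." + table);
        }
        return qualified.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "orders" is an assumed second table, used only for illustration.
        System.out.println(String.join(",", qualify("mydb", "mytable", "mydb.orders")));
    }
}
```

The returned array can be passed directly to `tableList(...)`, which accepts a variable number of names.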

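3. Create the Doris sink

Create a Doris sink that writes the stream into Doris through Stream Load. The imports below match connector 1.4.0; in some later releases the serializer classes live under org.apache.doris.flink.sink.writer.serializer, so adjust the import if it does not resolve:

import java.util.Properties;
import org.apache.doris.flink.cfg.DorisExecutionOptions;
import org.apache.doris.flink.cfg.DorisOptions;
import org.apache.doris.flink.cfg.DorisReadOptions;
import org.apache.doris.flink.sink.DorisSink;
import org.apache.doris.flink.sink.writer.SimpleStringSerializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DorisSinkExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // The sink commits data to Doris when a checkpoint completes
        env.enableCheckpointing(10000);
        // Assume dataStream carries one JSON object per record (placeholder, supply your own stream)
        DataStream<String> dataStream = ...;
        Properties loadProps = new Properties(); // Stream Load properties
        loadProps.setProperty("format", "json");
        loadProps.setProperty("read_json_by_line", "true");
        DorisOptions dorisOptions = DorisOptions.builder()
                .setFenodes("localhost:8030") // Doris FE address
                .setTableIdentifier("mydb.mytable") // target Doris table
                .setUsername("root")
                .setPassword("password")
                .build();
        DorisExecutionOptions executionOptions = DorisExecutionOptions.builder()
                .setLabelPrefix("doris-sink-example") // should be unique per job
                .setStreamLoadProp(loadProps)
                .build();
        DorisSink<String> dorisSink = DorisSink.<String>builder()
                .setDorisReadOptions(DorisReadOptions.builder().build())
                .setDorisExecutionOptions(executionOptions)
                .setDorisOptions(dorisOptions)
                .setSerializer(new SimpleStringSerializer()) // pass each string through unchanged
                .build();
        dataStream.sinkTo(dorisSink);
        env.execute("Doris Sink Example");
    }
}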
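The load properties handed to the sink are ordinary key/value settings forwarded to Doris Stream Load. As a minimal sketch using only the JDK, the helper below collects the settings for line-delimited JSON input and builds a label prefix with a random suffix, so that a restarted job is less likely to reuse an earlier label. `StreamLoadSettings` and its methods are hypothetical names chosen for illustration, not connector API.

```java
import java.util.Properties;
import java.util.UUID;

/** Hypothetical helper (not connector API): gathers Stream Load settings for JSON input. */
final class StreamLoadSettings {
    private StreamLoadSettings() {}

    /** Stream Load properties for input that holds one JSON object per line. */
    static Properties jsonLineProps() {
        Properties props = new Properties();
        props.setProperty("format", "json");
        props.setProperty("read_json_by_line", "true");
        return props;
    }

    /** Appends a random UUID so that two runs do not share a label prefix. */
    static String uniqueLabelPrefix(String base) {
        return base + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(jsonLineProps());
        System.out.println(uniqueLabelPrefix("doris-sink-example"));
    }
}
```

The resulting `Properties` object and prefix would then be supplied to the sink's load-property and label-prefix settings.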

4. Combine the Flink CDC source and the Doris sink

Connect the Flink CDC source to the Doris sink to synchronize data from MySQL to Doris. The connector's JsonDebeziumSchemaSerializer turns each Debezium JSON event into a row for Doris, so no manual conversion step is needed. The import path matches connector 1.4.0 and may differ in later releases:

import java.util.Properties;
import java.util.UUID;
import org.apache.doris.flink.cfg.DorisExecutionOptions;
import org.apache.doris.flink.cfg.DorisOptions;
import org.apache.doris.flink.cfg.DorisReadOptions;
import org.apache.doris.flink.sink.DorisSink;
import org.apache.doris.flink.sink.writer.JsonDebeziumSchemaSerializer;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;

public class FlinkCDCToDorisExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Both the CDC source and the Doris sink rely on checkpoints
        env.enableCheckpointing(10000);
        MySqlSource<String> source = MySqlSource.<String>builder()
                .hostname("localhost")
                .port(3306)
                .databaseList("mydb") // database to monitor
                .tableList("mydb.mytable") // table to monitor
                .username("root")
                .password("password")
                .deserializer(new JsonDebeziumDeserializationSchema()) // emit each change event as JSON
                .build();
        DataStream<String> dataStream =
                env.fromSource(source, WatermarkStrategy.noWatermarks(), "MySQL CDC Source");
        Properties loadProps = new Properties(); // Stream Load properties
        loadProps.setProperty("format", "json");
        loadProps.setProperty("read_json_by_line", "true");
        DorisOptions dorisOptions = DorisOptions.builder()
                .setFenodes("localhost:8030") // Doris FE address
                .setTableIdentifier("mydb.mytable") // target Doris table
                .setUsername("root")
                .setPassword("password")
                .build();
        DorisExecutionOptions executionOptions = DorisExecutionOptions.builder()
                .setLabelPrefix("cdc-to-doris-" + UUID.randomUUID()) // unique label prefix
                .setStreamLoadProp(loadProps)
                .setDeletable(true) // propagate deletes; the target should be a Unique Key table
                .build();
        DorisSink<String> dorisSink = DorisSink.<String>builder()
                .setDorisReadOptions(DorisReadOptions.builder().build())
                .setDorisExecutionOptions(executionOptions)
                .setDorisOptions(dorisOptions)
                .setSerializer(JsonDebeziumSchemaSerializer.builder()
                        .setDorisOptions(dorisOptions)
                        .build())
                .build();
        dataStream.sinkTo(dorisSink);
        env.execute("Flink CDC to Doris Example");
    }
}
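A Debezium-style change event carries an `op` code (`c` for insert, `u` for update, `d` for delete, `r` for snapshot read) together with `before` and `after` row images. When such events are written to a Doris Unique Key table, deletes are usually expressed through the hidden `__DORIS_DELETE_SIGN__` column. The sketch below illustrates that idea with plain Java maps. It is not the connector's implementation, and `ChangeEventFlattener` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: flattens a change event into a single row with a delete marker. */
final class ChangeEventFlattener {
    static final String DELETE_SIGN = "__DORIS_DELETE_SIGN__";

    private ChangeEventFlattener() {}

    /**
     * Picks the row image that describes the change and tags it.
     * Deletes use the "before" image; every other operation uses "after".
     */
    static Map<String, Object> flatten(String op,
                                       Map<String, Object> before,
                                       Map<String, Object> after) {
        boolean isDelete = "d".equals(op);
        Map<String, Object> image = isDelete ? before : after;
        if (image == null) {
            throw new IllegalArgumentException("missing row image for op=" + op);
        }
        Map<String, Object> row = new LinkedHashMap<>(image);
        row.put(DELETE_SIGN, isDelete ? 1 : 0); // 1 marks the row for deletion
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> image = new LinkedHashMap<>();
        image.put("id", 1);
        image.put("name", "alice"); // sample values, assumed for illustration
        System.out.println(flatten("c", null, image));
        System.out.println(flatten("d", image, null));
    }
}
```

Inserts, updates, and snapshot reads all become upserts of the `after` image, which is why the target table needs a unique key for the result to match the source.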

This completes the process of synchronizing data from a MySQL database to Doris with Flink CDC.

Original article by 未希. If you repost it, please credit the source: https://www.kdun.com/ask/561373.html
