Kafka Connect

https://kafka.apache.org/documentation/#connect_overview

What is Kafka Connect?

A tool for streaming data between Kafka and other systems.

What does it provide?

A common framework for Kafka connectors
Distributed and standalone modes
REST interface
Automatic offset management
Distributed and scalable by default
Streaming/batch integration

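How is it used?

Why use it?

The Kafka producer and consumer APIs are simple, but arguably not very practical on their own for integrating with other systems
- You have to build the integration yourself on top of the client libraries
Connect makes ETL between Kafka and other systems easier
- Developers can focus on their own data-processing logic
Connect covers the following requirements
- Data conversion (serialization)
- Parallelism, Scaling
- Load balancing
- Fault tolerance, Failover
- General management
- (+) Move data to/from sink/source
- (+) Support relevant delivery semantics (exactly once)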

Connector types

Source connector : imports data from another system into Kafka
Sink connector : exports data from Kafka to another system

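Main components (concepts)

Connector : manages the data movement; defines where and how the data is stored
Task : does the actual data processing
Worker : runs Connectors and Tasks
Converter : converts data between Connect's format and the serialized form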
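
As a rough illustration of how a Connector divides its work among Tasks, the sketch below splits a list of inputs (for example tables or files) into at most tasks.max groups, one group per Task. The function name group_partitions and the table names are hypothetical, and this is a simplified model, not Connect's own code.

```python
def group_partitions(partitions, max_tasks):
    """Split partitions into at most max_tasks contiguous, evenly sized groups."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    # Never create more groups than there are inputs to hand out.
    num_groups = min(len(partitions), max_tasks)
    groups = []
    start = 0
    for i in range(num_groups):
        # The first (len % num_groups) groups receive one extra element.
        extra = 1 if i < len(partitions) % num_groups else 0
        size = len(partitions) // num_groups + extra
        groups.append(partitions[start:start + size])
        start += size
    return groups


# Hypothetical inputs for a source connector with tasks.max=3.
tables = ["orders", "users", "payments", "items", "reviews"]
for task_id, assigned in enumerate(group_partitions(tables, 3)):
    print(f"task {task_id}: {assigned}")
```

With five inputs and a limit of three, the groups have sizes 2, 2 and 1. With fewer inputs than the limit, fewer Tasks are created.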

Execution modes

Standalone mode

Single process. For testing and development.

$ bin/connect-standalone worker.properties connector1.properties [connector2.properties connector3.properties ...]

Distributed mode

Multiple processes, running on one or more machines
fault tolerant, scalable

$ bin/connect-distributed worker.properties
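
To give a feel for what "fault tolerant" means here, the sketch below models workers sharing a set of tasks, and the tasks of a failed worker being picked up by the remaining workers. It is a deliberately simplified round-robin model with made-up worker and task names. Connect's actual rebalancing protocol is more involved.

```python
def assign_tasks(workers, tasks):
    """Spread tasks over workers round-robin; returns {worker: [tasks]}."""
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for i, task in enumerate(tasks):
        assignment[workers[i % len(workers)]].append(task)
    return assignment


# Hypothetical cluster of three workers running five tasks.
workers = ["worker-1", "worker-2", "worker-3"]
tasks = [f"task-{i}" for i in range(5)]
print(assign_tasks(workers, tasks))

# worker-2 goes away: the survivors take over all five tasks.
survivors = [w for w in workers if w != "worker-2"]
print(assign_tasks(survivors, tasks))
```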

Configuration

By mode

Standalone mode : configured through properties files passed on the command line
Distributed mode : REST API (JSON payload; can also be used for monitoring)
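
In distributed mode a connector is typically created by POSTing a JSON document with name and config fields to a worker's /connectors endpoint. The sketch below builds such a body from the key=value form used in standalone mode. The helper names parse_properties and to_rest_payload are hypothetical, and the sample settings are only an illustration.

```python
import json


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def to_rest_payload(props):
    """Build a request body of the form {"name": ..., "config": {...}}."""
    config = dict(props)
    name = config.pop("name")
    return {"name": name, "config": config}


standalone_config = """
name=local-file-sink
connector.class=FileStreamSinkConnector
tasks.max=1
file=test.sink.txt
topics=connect-test
"""

payload = to_rest_payload(parse_properties(standalone_config))
print(json.dumps(payload, indent=2))
```

Saved to a file (say payload.json, a made-up name), the body could be sent with something like `curl -X POST -H "Content-Type: application/json" --data @payload.json http://localhost:8083/connectors`. This assumes a worker listening on localhost at the default REST port 8083.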

(1) Connector

Sets which connector to use and how many Tasks to run

name : Unique name for the connector.
connector.class : The Java class for the connector
tasks.max : The maximum number of tasks that should be created for this connector.
topics : List of input topics (sink connectors only)

Example: connector1.properties

name=local-file-sink
connector.class=FileStreamSinkConnector
tasks.max=1
file=test.sink.txt
topics=connect-test
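
A quick sanity check of a connector configuration against the common settings listed above might look like the following. The function check_connector_config and its is_sink flag are hypothetical helpers for illustration. Connect performs its own validation when a connector is created.

```python
# name and connector.class must always be present; tasks.max has a default.
REQUIRED_KEYS = ("name", "connector.class")


def check_connector_config(config, is_sink):
    """Return a list of problems found in a connector config dict."""
    problems = [
        f"missing required setting: {key}"
        for key in REQUIRED_KEYS
        if key not in config
    ]
    tasks_max = config.get("tasks.max")
    if tasks_max is not None and (not tasks_max.isdigit() or int(tasks_max) < 1):
        problems.append("tasks.max must be a positive integer")
    # Sink connectors read from Kafka, so they need to know which topics.
    # topics.regex is an alternative to topics not mentioned in these notes.
    if is_sink and "topics" not in config and "topics.regex" not in config:
        problems.append("sink connectors need topics or topics.regex")
    return problems


config = {
    "name": "local-file-sink",
    "connector.class": "FileStreamSinkConnector",
    "tasks.max": "1",
    "file": "test.sink.txt",
    "topics": "connect-test",
}
print(check_connector_config(config, is_sink=True))  # prints []
```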

(2) Worker

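Sets how Connectors and Tasks are run

bootstrap.servers : Kafka cluster to connect to
key.converter : Converter class for key Connect data
value.converter : Converter class for value Connect data
offset.storage.file.filename : [Standalone mode] File in which to store offsets
offset.storage.topic : [Distributed mode] Topic in which to store offsets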
...

(3) Converter

key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
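
The effect of schemas.enable on JsonConverter output can be sketched as follows. With schemas enabled, each message is wrapped in an envelope with schema and payload fields. With schemas disabled, only the bare payload is written. The function to_json_message is a simplified imitation for illustration, not the converter's implementation, and the sample value is made up.

```python
import json


def to_json_message(value, schema, schemas_enable):
    """Serialize value roughly the way JsonConverter lays out its output."""
    if schemas_enable:
        # Envelope form: the schema travels with every message.
        return json.dumps({"schema": schema, "payload": value})
    # Plain form: just the value itself.
    return json.dumps(value)


schema = {"type": "string", "optional": False}
print(to_json_message("hello", schema, schemas_enable=True))
print(to_json_message("hello", schema, schemas_enable=False))
```

Disabling schemas gives smaller messages, but consumers then have to know the structure of the data by other means.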

(4) Overriding Producer & Consumer Settings

producer.retries=1
consumer.max.partition.fetch.bytes=10485760
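
The producer. and consumer. prefixes in the worker configuration mark settings that are handed to the underlying Kafka clients (the producer used by source connectors and the consumer used by sink connectors). A simplified sketch of that prefix handling is shown below. The helper with_prefix and the broker address are assumptions for illustration.

```python
def with_prefix(props, prefix):
    """Return the settings that start with prefix, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in props.items()
        if key.startswith(prefix)
    }


worker_props = {
    "bootstrap.servers": "localhost:9092",  # assumed broker address
    "producer.retries": "1",
    "consumer.max.partition.fetch.bytes": "10485760",
}

print(with_prefix(worker_props, "producer."))  # {'retries': '1'}
print(with_prefix(worker_props, "consumer."))  # {'max.partition.fetch.bytes': '10485760'}
```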

Kafka connectors

https://www.confluent.io/product/connectors/
