Pushing data to Kafka

Six Stars

Hi,

 

I'm currently trying to push database data to Kafka in a near real-time manner.

[Screenshot of the Talend job: talend_cdc_kafka.JPG]

Basically, this job checks a SQL Server table (i.e. the Talend CDC table) for any new rows. If there are, the data is converted into JSON and then serialized into a byte[].
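
For context, the conversion step is roughly doing something like the sketch below (the column names are simplified placeholders for the real CDC schema, and I'm using Jackson for the JSON part):

import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.LinkedHashMap;
import java.util.Map;

public class RowToJsonBytes {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    // Turn one CDC row into a JSON document, then into the byte[] that tKafkaOutput expects.
    public static byte[] toJsonBytes(String id, String name, String changeType) throws Exception {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", id);
        row.put("name", name);
        row.put("change_type", changeType);
        return MAPPER.writeValueAsBytes(row);
    }
}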

I've added a tReplicate and tLogRow to verify that the data output is correct.

 

All other components are successfully executed except tKafkaOutput, which returns the following error:

org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
[ERROR]: testing_kafka.talendcdc_consumecdctable_0_1.TalendCDC_ConsumeCDCTable - tKafkaOutput_1 - Failed to update metadata after 60000 ms.
[ERROR]: testing_kafka.talendcdc_consumecdctable_0_1.TalendCDC_ConsumeCDCTable - tKafkaOutput_1 - Failed to update metadata after 60000 ms.

Also, I noticed that there are only a few options for Kafka versions in Talend 7.0:

  • 0.8.2.0
  • 0.9.0.1
  • 0.10.0.1

But the version configured on our Hadoop cluster is 2.0. Is this a possible cause of the problem?

 

 

Regards,

Veronica

Thirteen Stars

Re: Pushing data to Kafka

Hi Veronica,

 

On one hand, Kafka is sensitive to the version number, but with 0.10.0.1 it should work.

 

First of all, check whether you can connect from the Talend host to Kafka, because your error looks like a connection timeout.

-----------
Six Stars

Re: Pushing data to Kafka

Hi vapukov,

 

Thanks for responding.

I am able to connect to Kafka through Talend, since I was able to create a Kafka topic from it. But loading data into the topic doesn't seem to work.

Thirteen Stars

Re: Pushing data to Kafka

Hi,

I use Confluent 5.1, which is Kafka 2.0,
and Kafka 1.?? from Apache.

In both cases Talend can create topics and send data to them.

Try with any other Kafka client, but from the Talend machine and with the same settings - does it work or not?
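
For example, a small standalone producer like the sketch below (the broker address and topic name are placeholders - use the same values as in your tKafkaOutput). If this also times out, the problem is the connection, not Talend:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import java.util.Properties;

public class ProducerSmokeTest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:9092");   // same broker list as in tKafkaOutput
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());
        props.put("max.block.ms", "10000");                    // fail fast instead of waiting 60 s for metadata

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            // send one test message and wait for the broker acknowledgement
            producer.send(new ProducerRecord<>("test_topic", "hello from plain Java".getBytes())).get();
            System.out.println("Message sent OK");
        }
    }
}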

 

Also, what Kafka settings do you use in tKafkaOutput?

-----------
Six Stars

Re: Pushing data to Kafka

Hi,

 

I'm using Apache Kafka version 2.0 under the HDF distribution.

 

I'm configuring only the broker list and the topic name in tKafkaOutput.
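
In plain producer terms, I believe those two settings amount to roughly this (the broker address below is just a placeholder for our real broker list):

import org.apache.kafka.common.serialization.ByteArraySerializer;
import java.util.Properties;

public class MyProducerConfig {
    // Roughly what my tKafkaOutput configuration boils down to.
    public static Properties props() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker-host:6667");
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());
        // Nothing else is set, so the producer runs on defaults - including
        // max.block.ms = 60000, which matches the 60000 ms in the error above.
        return props;
    }
}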

 

Do you have any other configurations set in your tKafkaOutput?

Six Stars

Re: Pushing data to Kafka

If it helps, I managed to set up a local ZooKeeper and Kafka to test Kafka 0.10, matching the Kafka version supported by Talend 7.1. The data loads into the topic just fine this way.

 

So I'm guessing it really is a compatibility issue after all.

Thirteen Stars

Re: Pushing data to Kafka

I use this - https://docs.confluent.io/current/release-notes.html

5.1.0 is a major release of Confluent Platform that provides you with Apache Kafka 2.1.0, the latest stable version of Kafka.

The technical details of this release are summarized below.

and in my case, it works.

 

It could depend on many factors. I don't have Kafka installed from HDP, so I don't know what exactly is wrong in your case.

But you could test your job against the software above - install it locally in the same way, because testing against a different version on a local instance is not a proper test.

 

Try to publish from the command line, from your machine to the HDP Kafka - does it work or not?

Try to connect to the HDP Kafka from a tool like www.kafkatools.com and check whether you have any issues.

 

-----------
Six Stars

Re: Pushing data to Kafka

Hi,

 

I have tried sending a message through Kafka CLI, and this works. The consumer prints out the payload just fine.

 

I initially thought that it might be a compatibility issue, but it seems to work without problems now, so it's definitely not one. I might have to check why it wasn't working before.

 

Thanks for the help!
