Four Stars

How to stream JSON with Kafka?

Hi,

I'm trying to stream a JSON message from Kafka and save it as a .json file. However, the output file is empty.

If somebody can help, I will be grateful.

 

I'm using TOS for Big Data 6.4

Kafka ver.: 0.9.0.1

Kafka ver. set in TOS: 0.9.0.1

 

I'm also looking for example solutions where Kafka is used to receive messages in JSON format and the output is written to a .json file.
I want to evaluate whether TOS is a good tool for reading JSON using Kafka.

 

Kafka producer and consumer input/output (example):

kafka_producer.JPG

 

My simple jobs.

1.

kafka_json.JPG

 

2.
kafka_json_2.JPG

4 REPLIES
Seven Stars

Re: How to stream JSON with Kafka?

Hi,

 

If the output file is empty, it means the JSON is being parsed incorrectly.

It would help if you attached screenshots of the tExtractJSON and tWriteJSON settings.

-----------
Four Stars

Re: How to stream JSON with Kafka?

Hi,

 

I've tested two variants of the same job.

 

1. With Kafka input: no data in the .json file (the file is empty), and no data in the .txt file either when I use the tFileOutputRaw component as the output.

2. With the tFixedFlowInput component: all the data and structure are in the .json file. I also tried exporting to an .xls file, and that works as well.

 

In both cases I'm not using the tWriteJSON component. Should I use it?

 

1. Kafka input job

kafka_json.JPG

 

tExtractJSON config

tExtractJSON.JPG

 

tFileOutputJSON

tFileOutputJSON.JPG

Output schema, columns 1:1

output_schema.JPG

 

2. Job using the tFixedFlowInput component

kafka_json_2.JPG

 

tFixedFlowInput

 

tFixedFlowInput.JPG

JSON here:

{"data":[{"Service_Description":"Pets Allowed","Service_Code":"PET"},{"Service_Description":"Swimming Pool","Service_Code":"SWI"},{"Service_Description":"Tennis Court","Service_Code":"TEN"},{"Service_Description":"Dry Cleaning","Service_Code":"DRY"},{"Service_Description":"Internet Access","Service_Code":"INT"},{"Service_Description":"WIFI Internet Access","Service_Code":"WIF"},{"Service_Description":"Fitness Room","Service_Code":"FIT"},{"Service_Description":"Concierge","Service_Code":"CON"}]}

output .json file

json_file.JPG

 

output .xlsx file

Excel_export.JPG

 

Please NOTE that tLogRow_2 displays the same output in both cases/jobs.

Seven Stars

Re: How to stream JSON with Kafka?

I will check later with your sample.
Generally I use JSONPath to parse JSON.
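
To make the JSONPath idea concrete for the sample message posted above: the loop path `$.data[*]` selects each object in the `data` array, and `Service_Code` and `Service_Description` are the fields of each row. Here is a minimal sketch in plain Python. It uses only the standard library, so it walks the structure by hand instead of evaluating a JSONPath expression. The function name `extract_rows` is made up for illustration, the sample is a shortened copy of the message from this thread, and none of this claims to be what tExtractJSON does internally.

```python
import json


def extract_rows(message):
    """Return (Service_Code, Service_Description) pairs from one JSON message."""
    document = json.loads(message)
    # Hand-written equivalent of looping over the JSONPath $.data[*]
    return [
        (item.get("Service_Code"), item.get("Service_Description"))
        for item in document.get("data", [])
    ]


# Shortened version of the sample message from this thread
sample = (
    '{"data":[{"Service_Description":"Pets Allowed","Service_Code":"PET"},'
    '{"Service_Description":"Swimming Pool","Service_Code":"SWI"}]}'
)

for code, description in extract_rows(sample):
    print(code, description)
```

If a check like this returns the expected rows for the exact string the Kafka consumer prints, the parsing step is probably not the cause of the empty file.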

-----------
Seven Stars

Re: How to stream JSON with Kafka?

I ran a few tests with your schema.

 

My jobs are set up a little differently and I was not affected by this, but it looks like everything depends on the configuration of the KafkaInput component.

 

Kafka, like any message queue, is designed to run continuously, and depending on how your component is set up, the output file is opened and closed differently.

If you use auto-disconnect by timeout or by number of received messages, everything works fine:

Screen Shot 2017-06-05 at 1.44.33 AM.png

 

If you stop the job manually, the file is not closed properly.
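
One plausible reason for the empty file (an assumption on my side, not verified against the code Talend generates) is ordinary output buffering: rows sit in a memory buffer until the writer is flushed or closed, so a job that is killed before the close step leaves nothing on disk. The sketch below illustrates the effect with a plain Python file writer, not with Talend or Kafka. The helper name `bytes_on_disk_before_close` is hypothetical.

```python
import os
import tempfile


def bytes_on_disk_before_close(messages, flush_each):
    """Write messages to a temp file and report its size on disk before close()."""
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    out = open(path, "w", encoding="utf-8", newline="\n")
    try:
        for message in messages:
            out.write(message + "\n")
            if flush_each:
                # Push the buffered text to the operating system right away
                out.flush()
        # Size as seen by another process while the writer is still open
        return os.path.getsize(path)
    finally:
        out.close()
        os.remove(path)


messages = ['{"Service_Code":"PET"}', '{"Service_Code":"SWI"}']
print(bytes_on_disk_before_close(messages, flush_each=False))
print(bytes_on_disk_before_close(messages, flush_each=True))
```

With small messages and the default buffer size, the first call should report an empty file and the second the full size. That matches the behaviour described here: letting the job end on its own (timeout or message count) gives the writer a chance to close the file.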

 

As an alternative, you can use tFlowToIterate and route the output to separate JSON files, or append to a single delimited file, something like:

 

Screen Shot 2017-06-05 at 1.49.56 AM.png

Screen Shot 2017-06-05 at 1.50.06 AM.png

Screen Shot 2017-06-05 at 1.50.20 AM.png

 

In this case each file contains a single message from Kafka; each file is closed independently and can be processed afterwards.
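
The one-file-per-message idea can be sketched outside Talend as well. This is only an illustration in plain Python, assuming the messages are already available as strings; the function name `write_one_file_per_message` and the `message_0001.json` naming scheme are made up for the example and are not what tFlowToIterate produces.

```python
import json
import os
import tempfile


def write_one_file_per_message(messages, out_dir):
    """Write each message to its own .json file and return the file paths."""
    paths = []
    for index, message in enumerate(messages, start=1):
        path = os.path.join(out_dir, f"message_{index:04d}.json")
        # The with-block closes the file before the next message is handled,
        # so stopping the loop later cannot leave this file half-written.
        with open(path, "w", encoding="utf-8") as out:
            out.write(message)
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as out_dir:
    written = write_one_file_per_message(
        ['{"Service_Code":"PET"}', '{"Service_Code":"SWI"}'], out_dir
    )
    for path in written:
        with open(path, encoding="utf-8") as handle:
            print(os.path.basename(path), json.load(handle))
```

The trade-off is many small files instead of one, but every file is complete as soon as it appears.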

-----------