Read Column Names and Extract JSON Data

Seven Stars

Read Column Names and Extract JSON Data

I have an JSON object:

 

{
	"RESULT_1": {
		"columns": ["col1", "col2", "col3", "col4"],
		"data": [
			["0", null, "12345", "other"],
			["1", "a", "54321", "MA"],
			["0", null, "76543", "RI"]
		]
	}
}

I am able to read the "data" part without issues using the tExtractJSONFields component.

 

capture_only_data.JPGCAPTURE_ONLY_DATA

However, I want to make sure that I read the column names from the "columns section" and make sure that the correct data is read and pushed in. For eg.

col1 should contain values 0,1,0

col2 should contain null, a, null

col3 should contain 12345, 54321, 76543

 

How can I achieve this? Any help is appreciated.

Sixteen Stars

Re: Read Column Names and Extract JSON Data

I *think* I understand what you want and here is an example of how it can be achieved.....

Screen Shot 2018-03-21 at 21.39.09.png

 

The tFileInputJSON_1 retrieves the column names and is configured like this....

Screen Shot 2018-03-21 at 21.39.19.png

The tFileInputJSON_2 retrieves the data columns and is configured like this....

Screen Shot 2018-03-21 at 21.39.32.png

 

You don't need to use tFileInputJSON components for this, I just did as it made it quicker for me to try out. The logic will still work with the tExtractJSONField components.

 

Hope this helps.

Seven Stars

Re: Read Column Names and Extract JSON Data

@rhall_2_0: Thank you. This makes sense and unite the rows/column together. But, how can I use the columns to make sure that the data is inserted correctly? I mean, tomorrow, if col3 comes before col1, (same for data also), then, I want to ensure that Talend is able to read the column names and save it correctly in a DB table.

Thanks.
Sixteen Stars

Re: Read Column Names and Extract JSON Data

This problem sounds almost precisely like a problem I wrote a tutorial for a while ago. Take a look at this and see if it helps....

https://www.rilhia.com/tutorials/dynamic-column-order

Seven Stars

Re: Read Column Names and Extract JSON Data

@rhall_2_0: Thanks for sharing your article. My challenge is that twice already, a couple of columns were added, and now, as you know I am referring to each element using the array placeholder.. "[0]". So, if a column is added/ removed, I need to reconfigure the whole tExtractJsonFields structure.
Sixteen Stars

Re: Read Column Names and Extract JSON Data

Ah, I see. This is going to be tricky figuring out how you want to work with this data, but I can show you how to get the header data one header value at a time....

Screen Shot 2018-03-23 at 19.17.59.png

 

This will return your column headers one row at a time. You can get a count here and each header name.

 

To get the data you can do the following to get a "row". What this does is return the array structure one row at a time. So this essentially...

["0", null, "12345", "other"]
["1", "a", "54321", "MA"]
["0", null, "76543", "RI"]

Screen Shot 2018-03-23 at 19.18.20.png

 

You can use a bit of simple Java String manipulation to extract this data into columns.

Seven Stars

Re: Read Column Names and Extract JSON Data

@rhall_2_0: Thanks. I think this may work...

Am i correct in assuming that this will be done in tMssqlRow, slightly slower than consuming data directly via tMssqlOutput component?

Thanks.
Sixteen Stars

Re: Read Column Names and Extract JSON Data

If your data is needed inside a database then yes, you could use a tMSSqlRow to dynamically create your update statements. However, you could also use a dynamic schema if you have the Enterprise Edition. However, this will mean that you will need your database to have every column you may possibly receive in your data. Alternatively (if you want to get really dynamic) you could create dynamic DDL statements and add new columns when they appear in your JSON. In this case you would need to use the tMSSqlRow component as your DB schema would not be known.