Four Stars

Parsing a nested json (json within a json) using tExtractJSONFields

Hi all,


I've done a bit of searching and I haven't found a post yet that addresses this scenario. I'm trying to use the tExtractJSONFields component to parse JSON that looks something like this:


"CardId": "123456789",
"NestedJSON": "{"transactionNetTotal":107.88,"shippingAndHandling":0.0,"transactionTotalTax":0.0,"transactionNumber":0,"currencyCode":"USD"}"


So basically, I have a JSON within a JSON.  I've tried using tExtractJSONFields with both JsonPath and XPath, but the only way I can get it to work is if I escape out the double quotes in the nested JSON.  Unfortunately, it doesn't look like the service upstream of my job will be able to generate the JSON this way, so I'm stuck trying to parse the nested JSON without the double quotes escaped.


Does anybody have any experience with this and know of a way (using stock Talend components) that I can parse the JSON above?  My team tries to avoid using external java libraries and raw code as much as we can, but I'm starting to think that might be my only option.  



  • Data Integration
Community Manager

Re: Parsing a nested json (json within a json) using tExtractJSONFields

I have made a testing, tried to create a json metadata, but it fails to parse the json data due to the wrong format. To resolve it, remove the double quotes that encloses the nested json before tExtractJSONFields, the json data will be:
"CardId": "123456789",
"NestedJSON": {"transactionNetTotal":107.88,"shippingAndHandling":0.0,"transactionTotalTax":0.0,"transactionNumber":0,"currencyCode":"USD"}
...your json data-->tJavaRow--tExtractJsonField-->

on tJavarow:
//Code generated according to input schema and output schema
output_row.json_data= ((input_row.json_data).replace("\"{","{")).replace("}\"","}");


Talend | Data Agility for Modern Business