Issue with TextractJSONField - Read by Xpath

Highlighted
Six Stars

Issue with TextractJSONField - Read by Xpath

I have a requirement of pulling data from RestAPI and putting it into database. I created below mentioned job

 

tRestClient-->TextractJSONFields-->tLogRow-->tOracleOutput.

 

 Sample JSON

{
"chats":[
{

"id":"2001",
"comment": null,
"triggered_response": true,

"visitor":"Guest1",
"history": [
{
"name": "Guest1",
"msg_id": "1534494221340",
"msg": "Test message from Guest",
},
{
"name": "Support",
"msg_id": "1534494221340",
"msg": "Test message from support"
}
]
}
"count": 179049,
"next_url": "www.nexturl..."
}

 

The loop is on the history where I have to get all the chat messages for each visitor. For this I have used the

tExtractJSONField

Read By : Xpath 

Loop XPathQuery : "/chats/history[*]"

Mappin

Column                   xPathQuery

 

ChatID                   "../id"

Chat_Text              "msg"

Chat_Text_By       "name"

 

 

Everything works fine except sometimes the chat will have attachments and there are extra elements to 'history' and JSON looks like

 

{
"chats":[
{

"id":"2001"
"comment": null,
"triggered_response": true,
"history": [
{
"name": "Guest1",
"msg_id": "1534494221340",
"msg": "Test message from Guest",
},
{
"name": "Support",
"msg_id": "1534494221340",
"msg": "Test message from support"
}

"attachment": {
"mime_type$string": "image/jpeg",
"name$string": "image.jpg",
"url$string": "image_url"
}

]
}
"count": 179049,
"next_url": "www.nexturl..."
}

 

whenever this appears in output the job gives error

Error on line 28 of document  : Element type "mime_type" must be followed by either attribute specifications, ">" or "/>". Nested exception: Element type "mime_type" must be followed by either attribute specifications, ">" or "/>".

 

When I use Read By : JsonPath and supply JSON query there is no error but the parent elements to looping elements do not appear in result

tExtractJSONField Properties

Read By : JsonPath

Loop XPathQuery : "$.chats[*].history[*]"

Mappin

Column                   Json Query

 

ChatID                  "$.chats[*].id"

Chat_Text              "msg"

Chat_Text_By       "name"

 

So there are two issues I am looking for help

 

1) tExtractJSONFields with XPath option gives error when JASON output contains "$" in element name

2) tExtractJSONFields with JsonPath option does not give parent element data when loop jasonpath query is used

Moderator

Re: Issue with TextractJSONField - Read by Xpath

Hello,

Have you tried to use tfileinputjson component with XPath option and JsonPath to read your sample json to see if this two issues repro?

Best regards

Sabrina

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.

What’s New for Talend Summer ’19

Watch the recorded webinar!

Watch Now

Best Practices for Using Context Variables with Talend – Part 4

Pick up some tips and tricks with Context Variables

Blog

How Media Organizations Achieved Success with Data Integration

Learn how media organizations have achieved success with Data Integration

Read

APIs for Dummies

View this on-demand webinar about APIs....

Watch Now

6 Ways to Start Utilizing Machine Learning with Amazon We Services and Talend

Look at6 ways to start utilizing Machine Learning with Amazon We Services and Talend

Blog