Issues with tExtractJSONFields Component xPath v jsonPath


Good Morning All,
I'm having some issues processing data from an API.
I make a call to a URL, for example:
and I get back a JSON object.
I make the call using a tRESTClient component and parse the JSON using a tExtractJSONFields component. If I stop here, I do not get an error.
If I connect the tExtractJSONFields component to anything else (tMap or tLogRow, for example), I receive the error:
Invalid white space character (0x1) in text to output (in xml 1.1, could output as a character entity)
I've tried all the debug run options, traces and log catchers, but this is the only message I ever see.
In case it is important, the tExtractJSONFields component is configured using XPath. It is also worth noting that when I request a different timeframe the job works perfectly, so I know it's a data issue.
The problem is that an external company provides the data, so I cannot 'fix' it at the source; I need to implement something within Talend to overcome the issue.
I have googled extensively and I cannot find anything Talend-specific for this. The best I have seen is to use a regex function to remove the foreign characters, but this doesn't really seem to be a solution - rather a workaround.
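For reference, the character-stripping workaround mentioned above could look roughly like the sketch below. It keeps only the code points that the XML 1.0 specification allows, which drops control characters such as 0x1. The class and method names are hypothetical, and where it would be called in a job (for example, on the response body before tExtractJSONFields) is an assumption, not something confirmed for Talend. If the payload carries the character as a JSON escape sequence rather than as a raw character, that escape would need separate handling.

```java
public final class XmlSafeText {
    private XmlSafeText() {}

    /** Returns the input with every code point that XML 1.0 forbids removed. */
    public static String stripInvalidXmlChars(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            if (isValidXml10(cp)) {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    /** The Char production from the XML 1.0 specification. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // Hypothetical payload fragment containing a raw 0x1 character.
        String raw = "{\"subject\":\"Printer\u0001 offline\"}";
        System.out.println(stripInvalidXmlChars(raw));
    }
}
```

Filtering by code point rather than with a regex keeps the allowed ranges explicit and also discards unpaired surrogates.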
I am using TOS 5.6.
Thanks in advance,

Re: Issues with tExtractJSONFields Component xPath v jsonPath

I thought it worth mentioning:
I have tested processing the same JSON streams in other tools and loading them into the database without any issues.
Therefore I do not believe this is a general data quality issue, but rather an issue with how Talend handles the data.
My research suggests that the tExtractJSONFields component parses the data from the JSON and compiles it into an XML output using jaxen-1.1.1.jar.
I believe that this version of jaxen uses an older XML protocol and that I should be using a newer version.
I have located and downloaded jaxen-1.1.4.jar and added it to Talend, but the Modules pane does not appear to have updated.
I'm not quite sure where to go from here: this is clearly Talend-related, but it is not apparent that there is a way to circumvent it.
I don't want to have to move away from Talend after investing significant time in getting it to this point in our organization.

Re: Issues with tExtractJSONFields Component xPath v jsonPath

Could really do with help on this please...

Re: Issues with tExtractJSONFields Component xPath v jsonPath

So if I change the tExtractJSONFields component from XPath to JSONPath, the error goes away...
This changes my question...
A sample JSON is:
  tickets: ,
  Count: 1000,
  end_time: 1434370040
Using XPath, I was able to set the loop path to tickets and still use the ../count query to refer to the parent level.
Using JSONPath, if I set the loop query to $.tickets.* then I cannot access the count value, as the parent operator is not applicable to JSONPath.
I have tried using the JSON query $.count, but this does not return a value for me.
And of course, if I change the loop path to $.* I only get the first ticket in the array and not all of them...
The big issue I have with this is that using other tools I can get this to work without any issues. I don't understand why this is so difficult in Talend...
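To make the goal concrete, here is a minimal sketch in plain Java of what the XPath version achieved: read the root-level values once, then loop over the tickets and attach those values to every row. It assumes the JSON has already been parsed into nested Map/List objects. The class name, the ticket field id and the exact key spelling count are assumptions for illustration (key lookups are case-sensitive), and this is not how tExtractJSONFields itself is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class TicketRows {
    private TicketRows() {}

    /** Produces one row per ticket, each carrying the root-level count and end_time. */
    public static List<Map<String, Object>> flatten(Map<String, Object> root) {
        // Read the parent-level values once, outside the loop.
        Object count = root.get("count");
        Object endTime = root.get("end_time");

        List<Map<String, Object>> rows = new ArrayList<>();
        @SuppressWarnings("unchecked")
        List<Map<String, Object>> tickets = (List<Map<String, Object>>) root.get("tickets");
        if (tickets == null) {
            return rows;
        }
        for (Map<String, Object> ticket : tickets) {
            // Copy the ticket fields, then append the parent-level values.
            Map<String, Object> row = new LinkedHashMap<>(ticket);
            row.put("count", count);
            row.put("end_time", endTime);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical tickets; only count and end_time come from the sample above.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("id", 1);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("id", 2);

        Map<String, Object> root = new LinkedHashMap<>();
        root.put("tickets", Arrays.asList(first, second));
        root.put("count", 1000);
        root.put("end_time", 1434370040L);

        for (Map<String, Object> row : flatten(root)) {
            System.out.println(row);
        }
    }
}
```

The same idea applies inside a job: extract the root-level fields in one pass and the ticket array in another, then combine the two, instead of relying on a parent operator inside the loop query.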

Re: Issues with tExtractJSONFields Component xPath v jsonPath

I'm still struggling with this and need assistance.

Re: Issues with tExtractJSONFields Component xPath v jsonPath

I've even tried upgrading to v6.2, but this upgrade has broken everything (see other ticket).
I guess I have no choice but to move to a different tool, which is very disappointing.
