Five Stars

XML API to CSV - Content is not allowed in prolog.

Hello

 

I need some help from the community with a tREST --> tExtractXMLField --> tFileOutputDelimited flow.

 

  1. I am trying to extract information from https://tatts.com/pagedata/racing/2017/7/7/RaceDay.xml (this is a free public API).
  2. I have configured the flow as accurately as I can, based on my research on these forums.
  3. When I set the URL in tREST, its schema only ever shows Body and ERROR_CODE. I assumed this was normal, so I then used tExtractXMLField to extract the data from Body.
  4. In tExtractXMLField, clicking "Sync columns" only syncs the two columns Body and ERROR_CODE. So instead I clicked "Edit schema" and created the columns I wanted so that they would appear in the Mapping table.
  5. I then set the Loop XPath query to "/RaceDay/RaceDay", as I only wanted to test whether I could get the very first level.
  6. I then mapped each column to its exact path.
  7. When I run the job I get the error "Error on line 1 of document  : Content is not allowed in prolog. Nested exception: Content is not allowed in prolog." The CSV file is created with a heading for each column but no data.

Any help on this would be much appreciated. Screenshots are below.

[Screenshots: Flow.JPG, tRest.JPG, xmlfield.JPG, error.JPG]
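
For readers following the steps above, here is a minimal plain-Java sketch of what a loop XPath plus per-column paths does, roughly the idea behind tExtractXMLField feeding a delimited output. The sample XML, its element and attribute names (Meeting, code, venue, date) and the class XPathLoopSketch are made up for illustration and are not the real feed's layout. The loop expression selects one node per output row, and each column path is evaluated relative to that node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathLoopSketch {
    // Made-up sample document; element and attribute names are assumptions.
    static final String SAMPLE =
        "<RaceDay date=\"2017-07-07\">"
      + "<Meeting code=\"M1\" venue=\"Alpha\"/>"
      + "<Meeting code=\"M2\" venue=\"Bravo\"/>"
      + "</RaceDay>";

    // Returns one semicolon-delimited line per node matched by the loop expression.
    static List<String> toCsv(String xml, String loopXPath, String... columnXPaths)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList rows = (NodeList) xpath.evaluate(
            loopXPath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < rows.getLength(); i++) {
            List<String> cells = new ArrayList<>();
            for (String column : columnXPaths) {
                // Column paths are evaluated relative to the current loop node.
                cells.add(xpath.evaluate(column, rows.item(i)));
            }
            lines.add(String.join(";", cells));
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // Loop over each Meeting; "../@date" reaches up to the parent RaceDay.
        toCsv(SAMPLE, "/RaceDay/Meeting", "@code", "@venue", "../@date")
            .forEach(System.out::println);
    }
}
```

In this sketch a loop expression that matches no nodes, such as "/RaceDay/RaceDay" against the sample, simply yields no rows.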

 

 

  • Data Integration
  • SDI
Community Manager

Re: XML API to CSV - Content is not allowed in prolog.

Hello
I got the same error with tExtractXMLField. However, it works if I write the response to a file with tHttpRequest and then read that file with tFileInputXML. See the screenshots:

[Screenshots: 1.png, 2.png]
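
As background on the error itself: one common cause of "Content is not allowed in prolog", not confirmed for this particular API, is a byte order mark or other stray characters ahead of the XML declaration when the document is parsed from a string. A parser reading the raw bytes typically detects and skips a UTF-8 BOM. That would be consistent with the file-based workaround, but it is only a guess. A minimal plain-Java sketch (the class PrologSketch and its method names are made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class PrologSketch {
    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    // Parses from a character stream: a leading U+FEFF counts as content before the prolog.
    static String rootViaString(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)))
            .getDocumentElement().getTagName();
    }

    // Parses from raw bytes: the parser sniffs the encoding and skips a UTF-8 BOM.
    static String rootViaBytes(byte[] bytes) throws Exception {
        return newBuilder().parse(new ByteArrayInputStream(bytes))
            .getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // Assumption for illustration: the payload starts with a byte order mark.
        String withBom = "\uFEFF<?xml version=\"1.0\" encoding=\"UTF-8\"?><RaceDay/>";
        try {
            rootViaString(withBom);
        } catch (SAXParseException e) {
            System.out.println("String parse failed: " + e.getMessage());
        }
        System.out.println("Byte parse root: "
            + rootViaBytes(withBom.getBytes(StandardCharsets.UTF_8)));
    }
}
```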

 

Regards

Shong

----------------------------------------------------------
Talend | Data Agility for Modern Business
Five Stars

Re: XML API to CSV - Content is not allowed in prolog.

Hi Shong,

 

That worked well, thank you very much. I'm now exploring how to create a variable and rotate through the dates in the URL. For example, I would like the job to run for every date from today through three days from now. Do you have any advice on where to start with this?

 

kind regards

 

David

Community Manager

Re: XML API to CSV - Content is not allowed in prolog.

Hello
If I understand your request correctly, you want to loop over a date range and call the API once for each date. To do that, use a tLoop to drive the API calls, e.g.:
tLoop--iterate--tJava--oncomponentok--tHttpRequest--oncomponentok--other components.
On tLoop, select the 'For' loop type and set From to 0, To to 3, and Step to 1.
On tJava, build the dynamic URL to be used by tHttpRequest (URL must already be defined as a String context variable in the job):

// Day offset from today, supplied by tLoop (0 to 3)
int i = ((Integer) globalMap.get("tLoop_1_CURRENT_VALUE"));
java.util.Date currentDate = TalendDate.getCurrentDate();
// Format today + i days as yyyy/MM/dd, then strip the leading zeros, e.g. 2017/07/07 -> 2017/7/7
String stringDate = (TalendDate.formatDate("yyyy/MM/dd", TalendDate.addDate(currentDate, i, "dd"))).replaceAll("/0", "/");

//System.out.println(stringDate);
context.URL = "https://tatts.com/pagedata/racing/" + stringDate + "/RaceDay.xml";

On tHttpRequest, set the URL to the context variable context.URL.
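
To check the URL pattern outside Talend, the same date logic can be sketched in plain Java with java.time. This is only an illustration: the class RaceDayUrls and its method names are made up, and it uses the "yyyy/M/d" pattern to drop the zero padding instead of the replaceAll step.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class RaceDayUrls {
    // "M" and "d" print month and day without zero padding, e.g. 2017/7/7.
    private static final DateTimeFormatter PATH_FORMAT =
        DateTimeFormatter.ofPattern("yyyy/M/d");

    // Builds the feed URL for a single date.
    static String urlFor(LocalDate date) {
        return "https://tatts.com/pagedata/racing/" + date.format(PATH_FORMAT) + "/RaceDay.xml";
    }

    // Builds URLs for start, start+1, ..., start+daysAhead (inclusive, like tLoop From 0 To 3).
    static List<String> urlsFrom(LocalDate start, int daysAhead) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i <= daysAhead; i++) {
            urls.add(urlFor(start.plusDays(i)));
        }
        return urls;
    }

    public static void main(String[] args) {
        urlsFrom(LocalDate.now(), 3).forEach(System.out::println);
    }
}
```

With From 0 and To 3 the loop covers four dates: today and the following three days.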

Regards
Shong

Five Stars

Re: XML API to CSV - Content is not allowed in prolog.

Thanks again Shong. The proposed solution worked. Originally I got it working by just placing

"https://tatts.com/pagedata/racing/"+TalendDate.getDate("yyyy/M/d")+"/NRFields.xml"

in the URL field, but it would only fetch one date.

I will start another post with a different heading, as I am now trying to get the URL to change based on a dynamic list that comes from another HTTP XML request.

 

thanks again, much appreciated.

 

kind regards


David