tExtractJSONFields - An invalid XML character (Unicode: 0xf) was found in the element content of the document

Hi,

I'm facing a problem with non-printable characters in the tExtractJSONFields component. I have an input JSON with this structure:

{"vid":1, "properties": {"value": "asd\u000f6ój"}}

Inside the value there is a non-printable character, written as the escape

\u000f

When I try to process it, I get this error:

Starting job test at 10:49 15/02/2018.

[statistics] connecting to socket on port 3836
[statistics] connected
Error on line 1 of document  : An invalid XML character (Unicode: 0xf) was found in the element content of the document. Nested exception: An invalid XML character (Unicode: 0xf) was found in the element content of the document.
[statistics] disconnected
Job test ended at 10:49 15/02/2018. [exit code=0]

Can you please help me figure out how to process or decode it?

1 REPLY

Re: tExtractJSONFields - An invalid XML character (Unicode: 0xf) was found in the element content of the document

Hello,

That is a Unicode escape sequence (\uXXXX). Use a Java component to convert it into the actual characters.

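// Needs StringEscapeUtils on the classpath, e.g. org.apache.commons.lang3.StringEscapeUtils
String text = "S\\u00e3o"; // doubled backslash: the string holds the escape as raw text
text = StringEscapeUtils.unescapeJava(text);
System.out.println("text " + text);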
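
If the value still contains a character that XML 1.0 rejects after unescaping, such as U+000F, the same error is likely to come back, because the message suggests the component places the value in an XML document. One possible workaround is to drop those escapes from the raw JSON text before it reaches tExtractJSONFields. The following is only a minimal sketch under that assumption: the class and method names are hypothetical and not part of Talend, it only handles \uXXXX escapes for control characters below U+0020, and it assumes that losing those characters is acceptable.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Talend API.
class JsonControlEscapeStripper {

    // Matches one JSON escape at a time: a backslash followed by either
    // 'u' plus four hex digits, or any single other character.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\(?:u([0-9a-fA-F]{4})|.)");

    // XML 1.0 allows only tab, line feed and carriage return below U+0020.
    static boolean isXmlInvalidControl(int codePoint) {
        return codePoint < 0x20
                && codePoint != 0x9 && codePoint != 0xA && codePoint != 0xD;
    }

    // Returns the JSON text with escapes for XML-invalid control characters removed.
    static String stripInvalidEscapes(String json) {
        Matcher m = ESCAPE.matcher(json);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String hex = m.group(1); // null when the escape is not a four-digit hex escape
            boolean drop = hex != null
                    && isXmlInvalidControl(Integer.parseInt(hex, 16));
            // Every other escape is copied back unchanged; quoteReplacement
            // keeps backslashes and dollar signs literal.
            m.appendReplacement(out, drop ? "" : Matcher.quoteReplacement(m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw JSON modelled on the question; a doubled backslash in Java
        // source yields one literal backslash in the string.
        String json = "{\"vid\":1, \"properties\": {\"value\": \"asd\\u000f6\\u00f3j\"}}";
        System.out.println(stripInvalidEscapes(json));
    }
}
```

In a job, logic like stripInvalidEscapes could be applied to the JSON column in a Java component placed before tExtractJSONFields. Because escapes are consumed one at a time, an escaped backslash followed by the text u000f is left untouched.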

Hope it helps.

Regards
Lojdr