Handling special characters

Four Stars

Hi Guys,

 

I need to transform special characters like "á". Whenever I read these characters and create an output file, it shows "". Please help: how can these characters be handled so that they are loaded into the output? I am badly stuck with this.

 

Thanks,

Srinath


Accepted Solutions
Four Stars

Re: Handling special characters

I was able to solve the problem by editing the *.bat file in Notepad and adding -Dfile.encoding=utf8 after the word java. It works. Thanks a lot, mate.
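To check whether the -Dfile.encoding setting actually reached the JVM, something like the following could be run with the same java command line as the job. This is only a minimal sketch: the class name DefaultCharsetCheck is made up for illustration and is not part of Talend or of the generated job.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Talend.
class DefaultCharsetCheck {
    // True when the JVM's default charset is UTF-8.
    static boolean defaultIsUtf8() {
        return StandardCharsets.UTF_8.equals(Charset.defaultCharset());
    }

    public static void main(String[] args) {
        // The system property that -Dfile.encoding=... sets on the command line.
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        // The charset used by readers and writers that do not name one explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default is UTF-8: " + defaultIsUtf8());
    }
}
```

If the last line prints false when started from the *.bat file, the flag is probably missing or placed after the class name instead of right after java.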


All Replies
Fifteen Stars TRF

Re: Handling special characters

Hi,

Try using regex:

row1.theStringYouWishToTransform.replaceAll("[^\\w]", "")

This replaces every non-word character (any character outside [a-zA-Z_0-9]) with an empty string.

 

If that doesn't match your requirements, you can specify the characters to be replaced yourself:

 

row1.theStringYouWishToTransform.replaceAll("[àâäéèêëîïôöùûü]", "")

This replaces each of these characters (àâäéèêëîïôöùûü) with nothing.

You just have to complete the list of characters you want to remove.
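As a rough illustration of what the two expressions above do outside of a Talend job, here is a small standalone sketch. The class and method names are made up for the example, and the accented characters are written as \u escapes only so that the source file's own encoding cannot interfere.

```java
// Hypothetical demo class, not part of Talend.
class RegexStripDemo {
    // Removes every character outside [a-zA-Z_0-9].
    static String stripNonWord(String s) {
        return s.replaceAll("[^\\w]", "");
    }

    // Removes only the explicitly listed accented characters:
    // à â ä é è ê ë î ï ô ö ù û ü
    static String stripListed(String s) {
        return s.replaceAll(
            "[\u00e0\u00e2\u00e4\u00e9\u00e8\u00ea\u00eb\u00ee\u00ef\u00f4\u00f6\u00f9\u00fb\u00fc]",
            "");
    }

    public static void main(String[] args) {
        String city = "K\u00f6ln_1!";               // "Köln_1!"
        System.out.println(stripNonWord(city));      // ö and ! are both removed
        System.out.println(stripListed("d\u00e9j\u00e0 vu")); // é and à removed, space kept
        // Both variants shorten the string, so the output is smaller than the input.
        System.out.println(city.length() + " -> " + stripNonWord(city).length());
    }
}
```

Note that both variants delete characters, so the result is always shorter than the input whenever a match is found.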

 

 


TRF
Four Stars

Re: Handling special characters

Thanks for the reply.

But my requirement is not to replace special characters with an empty string. It is to load the special characters into the output file/table with the same size as the input file, and the data should not get trimmed.

In my case, it is populating special characters as empty strings, but I want to know how Talend handles special characters.

 

Thanks

Seven Stars

Re: Handling special characters

Hi @srkalakonda,

 

I encountered a similar issue.

 

Make sure your source and target files use the same encoding.

 

If you use UTF-8 character encoding this should not occur. 
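To make the point about matching encodings concrete, here is a small plain-Java sketch (not Talend component code; the class and method names are invented for the example). It writes "á" to a temporary file as UTF-8 and reads it back once with the same encoding and once with ISO-8859-1.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of Talend.
class EncodingRoundTrip {
    // Writes text to a temp file with one charset and reads it back with another.
    static String writeThenRead(String text, Charset writeAs, Charset readAs) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                Files.write(file, text.getBytes(writeAs));
                return new String(Files.readAllBytes(file), readAs);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String input = "\u00e1"; // "á"
        String same = writeThenRead(input, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        String mixed = writeThenRead(input, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        // Matching encodings give back the original single character.
        System.out.println("matching read equals input: " + same.equals(input));
        // A mismatched read turns the two UTF-8 bytes into two wrong characters.
        System.out.println("chars after mismatched read: " + mixed.length());
    }
}
```

With matching encodings the character survives unchanged; with a mismatch the data is not lost but is decoded into the wrong characters.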

 

Cheers!

Gatha

Five Stars

Re: Handling special characters

Hi,

 

I had a similar issue. 

 

I was sending a message to a SOAP endpoint and Köln was converted to K?ln. This issue didn't occur when running from the Studio, but only when I ran the standalone job as a scheduled task.

 

This is how I fixed it: http://talendhowto.com/2017/09/02/add-encoding-batch-file/
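The K?ln symptom can be reproduced in plain Java. The sketch below is only an illustration of one way it can happen, assuming the text gets encoded with a charset that has no "ö" (US-ASCII here); it does not claim to show what the scheduled job actually did, and the class name is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Talend.
class LossyEncodingDemo {
    // Encodes text to bytes with the given charset, then decodes it again.
    static String encodeAndDecode(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset); // unmappable characters are replaced
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String city = "K\u00f6ln"; // "Köln"
        // US-ASCII cannot represent ö, so it is replaced with '?'.
        System.out.println(encodeAndDecode(city, StandardCharsets.US_ASCII)); // prints "K?ln"
        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(encodeAndDecode(city, StandardCharsets.UTF_8).equals(city)); // prints "true"
    }
}
```

Once the character has been replaced with ? the original cannot be recovered, which is why the encoding has to be right before the data is written or sent.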

 

Six Stars

Re: Handling special characters

Using UTF-8 as the encoding will solve the problem. If the data contains Latin characters, you also have to override the JVM parameters with the UTF-8 parameter.

Four Stars

Re: Handling special characters

Hello,

 

Could you please give details on how you resolved it? I am still getting a mark whenever there is a ' mark in my Excel source.

 

thanks

Ravi
