how to remove special characters from string while loading csv file

Four Stars


I have a requirement to create a comma-separated CSV file from a Hive table.

 

While inserting rows, Talend is inserting special characters into the string.

 

for example 

source : Sliver 2 – Sell

 

target : 

Sliver 2 – Sell

 

Please help me remove this special character.

 

jira.PNG

Employee

Re: how to remove special characters from string while loading csv file

Hi @prasad_nayani ,

 

Could you please change the Encoding to UTF-8 in the Advanced settings of the tFileOutputDelimited component and let us know if that helps?

csv.png

 

Regards,

Pratheek Manjunath

 

Four Stars

Re: how to remove special characters from string while loading csv file

I set the encoding to UTF-8, but it is actually inserting more special characters instead of removing them.

 

Sliver 2 – Sell

 

I want the data to look like my source:

 

Sliver 2 – Sell

 

Employee

Re: how to remove special characters from string while loading csv file

Hi @prasad_nayani 

 

I read the string you gave, Sliver 2 – Sell, from a CSV file and wrote the data back to another CSV file.

 

Our assumption is that you are using UTF-8 encoding while reading and writing the files (this needs to be set in the Advanced settings of the Talend components). Even if you are using Hive, you will have to check the underlying Hadoop files.
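To illustrate why the read and write encodings need to match, here is a minimal Python sketch, run outside Talend. It writes the sample string to a CSV file as UTF-8 and reads it back twice: once as UTF-8 and once as Windows-1252 (`cp1252`). The file name `out.csv` and the choice of `cp1252` as the mismatched encoding are assumptions for illustration; your environment may be using a different code page.

```python
import csv
import os
import tempfile

# "\u2013" is the en dash between "2" and "Sell" in the sample string.
rows = [["id", "name"], ["1", "Sliver 2 \u2013 Sell"]]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")  # hypothetical file name

    # Write the CSV as UTF-8.
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)

    # Read with the same encoding: the data round-trips unchanged.
    with open(path, "r", encoding="utf-8", newline="") as f:
        same_encoding = list(csv.reader(f))

    # Read with a different encoding (assumed cp1252): the dash is garbled.
    with open(path, "r", encoding="cp1252", newline="") as f:
        other_encoding = list(csv.reader(f))

print(same_encoding[1][1])
print(other_encoding[1][1])
```

With the matching encoding the row comes back unchanged. With `cp1252`, the three UTF-8 bytes of the dash are read as three separate characters, which is one way a Euro symbol can end up in the output.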

 

Now, let's assume that the input data is in the correct format. If you print the data in Talend, it will look like the output below.

image.png

 

The reason is that Talend uses the Courier font for log printing. If you write the data to a file, however, you can see that the file contains the data shown below.

image.png

The data above is the output file opened in Notepad after running the job below.

image.png

 

If you copy the data into MS Word, you can see it in the original format (like the font in this post).

image.png

So I believe that as long as you consistently use UTF-8, which is one of the Unicode encodings, you should be fine. Only on very rare occasions will you need UTF-16, and any such encoding can be added by selecting the custom encoding option in the Talend components.
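As a rough illustration that UTF-8 and UTF-16 are two different byte encodings of the same Unicode character, the Python sketch below (run outside Talend) prints the bytes of the dash from the sample string under three encodings. Including Windows-1252 (`cp1252`) is an assumption, chosen only as an example of a legacy single-byte code page.

```python
# EN DASH (U+2013), the character between "2" and "Sell" in the sample string.
dash = "\u2013"

# Encode the same character with three different encodings.
# cp1252 is included only as an assumed example of a legacy code page.
encoded = {name: dash.encode(name) for name in ("utf-8", "utf-16-le", "cp1252")}

for name, raw in encoded.items():
    # bytes.hex with a separator needs Python 3.8 or later.
    print(f"{name:10} {raw.hex(' ')}")

# utf-8      e2 80 93
# utf-16-le  13 20
# cp1252     96
```

The character is the same in every case; only the bytes on disk differ, so the reader of the file has to be told which encoding the writer used.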

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)

Four Stars

Re: how to remove special characters from string while loading csv file

hi @nikhilthampi,

 

Thanks for your effort, but

I actually want to remove the character from the output, because my source doesn't have a special character in it, as shown below.

 

source

Sliver 2 â“ Sell

 

But the target .csv file contains that special character.

 

Sliver 2 – Sell

 

 

 

 

Employee

Re: how to remove special characters from string while loading csv file

@prasad_nayani 

 

OK. Your earlier posts had the Euro symbol in both source and target, so I added it to the source data.

 

I copied your new data Sliver 2 â“ Sell into the job and printed the output. I did not get any extra Euro symbol and the output is as shown below. It could be due to UTF-8 settings in your environment.

image.png

 

Could you please double check all your job settings and see all the underlying Hadoop files once again?
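One way to check a file outside Talend is sketched below in Python. The helper names `is_valid_utf8` and `undo_cp1252_misread` are hypothetical, and the repair step assumes the text was UTF-8 that got decoded as Windows-1252 (`cp1252`) somewhere along the way, which may not be what happened in your job.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def undo_cp1252_misread(text: str) -> str:
    """Reverse a UTF-8-read-as-cp1252 mix-up; return text unchanged if it does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Not encodable as cp1252, or the result is not UTF-8: leave it alone.
        return text


clean = "Sliver 2 \u2013 Sell"                      # sample string with an en dash
garbled = clean.encode("utf-8").decode("cp1252")   # simulate the assumed misread

print(is_valid_utf8(garbled.encode("utf-8")))      # True
print(undo_cp1252_misread(garbled) == clean)       # True
print(undo_cp1252_misread(clean) == clean)         # True
```

Note that text which was garbled and then saved again as UTF-8 still passes the validity check, so it is worth looking at the characters themselves as well as the encoding setting.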

 

Warm Regards,
Nikhil Thampi

