One Star

Pig problem - Dealing with two delimiters. Please help.

I'm new to Pig. I have a problem -
My input textfile.txt ->
aaaa,bbbb,cccc
aaaa,'xxxx,yyyy',zzzz
qqqq,wwww,eeee
The output should have exactly three columns. Anything in single quotes should be treated as a single column, so 'xxxx,yyyy' should be loaded as one field.
If I use the statement below -
A = LOAD 'textfile.txt' USING PigStorage(',') AS (col1:chararray, col2:chararray, col3:chararray);
I get -
(aaaa,bbbb,cccc)
(aaaa,'xxxx,yyyy')
(qqqq,wwww,eeee)
Please help me with this.
3 REPLIES
Community Manager

Re: Pig problem - Dealing with two delimiters. Please help.

Hi 
It seems Pig doesn't support an escape character. Could you please report an issue in our bug tracker and discuss it with our developers?
Best regards
Shong
Six Stars

Re: Pig problem - Dealing with two delimiters. Please help.

The tPigLoad component would need to be updated to use a loader that supports this, such as the CSVLoader in the piggybank. Otherwise you will need to preprocess the file to have clear delimiters that do not appear in the data.
http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVLoader.html
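For the preprocessing route, here is a minimal sketch in Python, run outside Pig before the load. It assumes the quoted fields always use single quotes as in the sample, and that a tab never appears in the data. The function name requote_to_tabs is made up for illustration. It parses each line with the standard csv module and rewrites it with tabs as the delimiter, dropping the surrounding quotes.

```python
import csv
import io


def requote_to_tabs(text, quotechar="'"):
    """Re-delimit comma-separated text with tabs.

    Fields wrapped in `quotechar` may contain commas; the quotes are
    removed and the embedded commas are kept as ordinary data.
    """
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar=quotechar)
    lines = []
    for row in reader:
        # Assumption: tab is a safe delimiter. Fail loudly if it is not.
        if any("\t" in field for field in row):
            raise ValueError("a field already contains a tab: %r" % (row,))
        lines.append("\t".join(row))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = "aaaa,bbbb,cccc\naaaa,'xxxx,yyyy',zzzz\nqqqq,wwww,eeee\n"
    print(requote_to_tabs(sample), end="")
```

The rewritten file can then be loaded with PigStorage('\t'), which is also PigStorage's default delimiter, and the second row should come through as three fields with xxxx,yyyy in col2.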
One Star

Re: Pig problem - Dealing with two delimiters. Please help.

Has the CSVLoader been added to Talend? May I know the current status?