Five Stars

Issues with .csv files created by the tFileOutputDelimited component.

Hi,
Hopefully someone can shed some light on this for me.
I have a job that reads a large table and then writes the data to .csv files in increments of 25000 rows. What I have noticed is that every .csv file created after the first one has all of its data on a single row, whereas the first .csv file has the data in 25000 rows (as I want it).
Is there a setting on the tFileOutputDelimited component that will make the rows in all subsequent .csv files get written as they are in the first (and 'good') .csv file? I am thinking it may be due to the 'Escape char' value on the 'Advanced settings' tab, but I am not sure. Unfortunately, the documentation I found for the tFileOutputDelimited component's 'Advanced settings' tab is lacking with regard to the CSV options.
Attached are a couple of screenshots showing the basic and advanced settings in place for the tFileOutputDelimited component.
Thank you in advance.
Tom
3 REPLIES
Moderator

Re: Issues with .csv files created by the tFileOutputDelimited component.

Hi,
What's the rate of the subsequent .csv files? If you uncheck the CSV options, does the rate increase?
Best regards
Sabrina

Re: Issues with .csv files created by the tFileOutputDelimited component.

Hi,
I'm not sure I understand your question. When you ask about rate, are you referring to how fast the data is being loaded? If so, speed (or rate) is not the issue. The issue is that the data loaded into the split .csv files is not laid out as I want it.
For example, I turned on the 'Split output in several files' option and set 'Rows in each output file' to 25,000 on the 'Advanced settings' tab of the tFileOutputDelimited component. With this in place, a new .csv file gets created every 25,000 rows. So far, so good: this part works as expected.
Now the problem is that all of the .csv files that get created via the split (this would be the second file and all subsequent files) have the rows' data on the same line as the header. See the below example.
File #1...
header row
row 1
row 2
...
row 25000
File #2...
header rowrow1row2...row25000
File #3...
header rowrow1row2...row25000
and so on.
Looking at the above, you can see that the rows are not being placed on separate lines, which in turn is causing issues downstream in another process.
Hopefully the above provides a good understanding of the issue being encountered.
Thank you and please let me know if you need additional information from me on this.
Regards,
Tom

Re: Issues with .csv files created by the tFileOutputDelimited component.

Solved the issue. It looks like the tFileOutputDelimited component's CRLF("\r\n") option (on the 'Basic settings' tab) is not working as expected when data is split among multiple .csv files. After some research, I tried the LF("\n") option and that addressed the problem.
I am not sure whether what I encountered is a bug, but I did notice in the generated Java code that there seems to be some confusion (for lack of a better term) in the way the rowSeparator is managed.
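For anyone who wants to compare against the expected behavior, here is a minimal standalone sketch of split output in plain Java. It is not the code Talend generates; the class name SplitCsvSketch, the method writeSplit, and the file naming are all hypothetical. It only illustrates the intended result: each split file starts with the header, and every line, header included, ends with the configured row separator.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT Talend's generated code.
public class SplitCsvSketch {

    /**
     * Writes rows into files holding at most rowsPerFile data rows each.
     * Every file begins with the header, and every line (header included)
     * is terminated with rowSeparator.
     */
    static List<Path> writeSplit(Path dir, String baseName, String header,
                                 List<String> rows, int rowsPerFile,
                                 String rowSeparator) throws IOException {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        List<Path> files = new ArrayList<>();
        Writer out = null;
        try {
            for (int i = 0; i < rows.size(); i++) {
                if (i % rowsPerFile == 0) {
                    // Start a new file: close the previous one, write the header.
                    if (out != null) {
                        out.close();
                    }
                    Path file = dir.resolve(baseName + files.size() + ".csv");
                    files.add(file);
                    out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
                    out.write(header);
                    out.write(rowSeparator); // the header needs a terminator too
                }
                out.write(rows.get(i));
                out.write(rowSeparator);
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split-csv");
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            rows.add("row " + i);
        }
        // Print the line count of each split file as a quick sanity check.
        for (Path f : writeSplit(dir, "out", "header row", rows, 2, "\n")) {
            System.out.println(f.getFileName() + ": "
                    + Files.readAllLines(f).size() + " lines");
        }
    }
}
```

Counting the lines in each output file, as main does here, is also a quick way to spot the symptom described above: a split file that reports a single line has lost its row separators.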