The text file I use as a data source contains stray newline characters at random points in the middle of records. It's produced by someone else's system, and I have no control over the format I receive.
Normally, every line should start with a date in a format like "dd/MM/YYYY HH:mm". So basically, what I need is a way to append each line that doesn't start with a date to the previous line.
Example of input:
09/05/2017 16:52:51 JOB
_Ref_IO810
09/05/2017 18:39:10 JOB
What I need in the output:
09/05/2017 16:52:51 JOB_Ref_IO810
09/05/2017 18:39:10 JOB_Ref_IO811
Any idea how to do this?
Enclose the source file fields in double quotes, like this:
"09/05/2017 16:52:51" "JOB _Ref_IO810" "09/05/2017 18:39:10" "JOB _Ref _IO811"
If that is not possible, refer to this post for a more advanced solution based on tMap variables.
The second approach is the way to go if you really want to remove the unwanted line feeds.
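For reference, here is a minimal sketch of the joining logic in plain Java. It is not the tMap-variable solution from the linked post, just one way to pre-clean the lines before they are parsed. The class name `LineJoiner` and method name `joinContinuationLines` are hypothetical, the regular expression assumes every real record starts with a "dd/MM/yyyy HH:mm" date, and the last sample fragment (`_Ref_IO811`) is assumed from the expected output, since the input example is cut off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LineJoiner {
    // Assumption: a real record starts with a date such as "09/05/2017 16:52".
    private static final Pattern STARTS_WITH_DATE =
            Pattern.compile("\\d{2}/\\d{2}/\\d{4} \\d{2}:\\d{2}");

    // Appends every line that does not start with a date to the previous record.
    public static List<String> joinContinuationLines(List<String> rawLines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawLines) {
            // lookingAt() only tests the beginning of the line.
            if (STARTS_WITH_DATE.matcher(line).lookingAt()) {
                if (current != null) {
                    records.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null) {
                // Continuation fragment: glue it to the record in progress.
                current.append(line);
            } else {
                // Fragment before any dated line: keep it rather than drop it.
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            records.add(current.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        // Sample data; the last fragment is an assumption (see above).
        List<String> raw = Arrays.asList(
                "09/05/2017 16:52:51 JOB",
                "_Ref_IO810",
                "09/05/2017 18:39:10 JOB",
                "_Ref_IO811");
        for (String record : joinContinuationLines(raw)) {
            System.out.println(record);
        }
    }
}
```

The fragments are concatenated without a separator because the expected output shows "JOB_Ref_IO810" with no space. If your data needs a space between the pieces, append one before the fragment.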