A Talend 6.4.1 Job reads data from a database (with a tOracleInput component) and writes it to an HDFS file (with a tHDFSOutput component). The tOracleInput component uses a schema with a Dynamic type column so that it can retrieve data from generic SQL queries. However, tHDFSOutput does not support the Dynamic type.
So, a tMap component is used to convert the Dynamic type column to an Object type column.
However, when the data is written to the HDFS file, the field separator configured in the tHDFSOutput component (';' in this example) is not taken into account. The field separator is always '-' in the rows written to the HDFS file:
    7369 - SMITH - CLERK - 7902 - 17/12/1980 - 800 - null - 20
    7499 - ALLEN - SALESMAN - 7698 - 20/02/1981 - 1600 - 300 - 30
    7521 - WARD - SALESMAN - 7698 - 22/02/1981 - 1250 - 500 - 30
So, the question is how to write data into HDFS with a field separator other than '-'.
Problem root cause
The default field separator for the Dynamic type is '-'. By design, this field separator cannot be customized in Talend components, in particular the tHDFSOutput component.
Solution or Workaround
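The workaround consists of converting the Dynamic type column to a String in the tMap component, passing the desired field separator to the toString method of the Dynamic column.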
To illustrate this solution, suppose that the desired field separator is '|'.
In the tMap component, the Dynamic column is mapped to a String output column. The chosen field separator, '|', is passed to the toString method as a parameter: row1.newColumn.toString("|").
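As a rough illustration of the difference between the default and the custom separator, the sketch below models a Dynamic column with a small stand-in class. DynamicSketch and its " - " default are assumptions made for this example, chosen to match the output shown earlier; this is not Talend's own Dynamic class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a Dynamic column value; NOT Talend's actual class.
final class DynamicSketch {
    private final List<Object> values;

    DynamicSketch(Object... values) {
        this.values = Arrays.asList(values);
    }

    // Joins the column values with the given separator; null values print as "null".
    String toString(String separator) {
        return values.stream()
                .map(v -> String.valueOf(v))
                .collect(Collectors.joining(separator));
    }

    // Assumed default: values joined with " - ", as observed in the HDFS file.
    @Override
    public String toString() {
        return toString(" - ");
    }

    public static void main(String[] args) {
        DynamicSketch row = new DynamicSketch(
                7369, "SMITH", "CLERK", 7902, "17/12/1980", 800, null, 20);

        // Handled as a plain Object, the value is rendered with the default separator.
        Object asObject = row;
        System.out.println(String.valueOf(asObject));

        // Passing the separator explicitly, as in the tMap expression.
        System.out.println(row.toString("|"));
    }
}
```

Because the tMap output is then a single String column, tHDFSOutput presumably has no column boundaries left at which to apply its own field separator, so the separator passed to toString is the one that ends up in the file.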
As a result, the tHDFSOutput component writes the String rows from the tMap output to the HDFS file with '|' as the field separator:
    7369|SMITH|CLERK|7902|17/12/1980|800|null|20
    7499|ALLEN|SALESMAN|7698|20/02/1981|1600|300|30
    7521|WARD|SALESMAN|7698|22/02/1981|1250|500|30