One Star

How does output stream work?

Hi, I would like to have a job that uses the output from a tFileOutputDelimited as the input for a tFileInputDelimited.
I'd like to do this without actually writing to a physical file.
Is there a default expression to use for the output stream to get the data from this component?
For example, my tFileInputDelimited component uses a tFileFetch as its source and the File name/Stream field looks like this:
((java.io.InputStream)globalMap.get("tFileFetch_1_INPUT_STREAM"))
Can I use something similar, e.g.
((java.io.InputStream)globalMap.get("tFileOutputDelimited_1_INPUT_STREAM"))
?
I tried doing just this but I get the error message: outputStream cannot be resolved to a variable
Thanks
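
For readers landing on this question: the general idea of passing data from a writer to a reader without a file on disk can be sketched in plain Java with an in-memory buffer. This is only an illustration of the concept, not a confirmed Talend configuration. The class and method names (InMemoryStreamSketch, roundTrip) are made up for the example, and whether tFileOutputDelimited can be pointed at such a buffer is an assumption that is not verified here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class InMemoryStreamSketch {

    // Hypothetical helper: writes rows to an in-memory buffer, then reads
    // them back through an InputStream, so no file on disk is involved.
    static List<String> roundTrip(List<String> rows) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            // "Output" side: write each row followed by a newline.
            try (Writer writer = new OutputStreamWriter(buffer, StandardCharsets.UTF_8)) {
                for (String row : rows) {
                    writer.write(row);
                    writer.write('\n');
                }
            }
            // "Input" side: expose the same bytes as an InputStream.
            InputStream in = new ByteArrayInputStream(buffer.toByteArray());
            List<String> result = new ArrayList<>();
            try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    result.add(line);
                }
            }
            return result;
        } catch (IOException e) {
            // In-memory streams should not fail, but the API declares IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : roundTrip(Arrays.asList("typeA|111111|x", "typeA|111112|x"))) {
            System.out.println(line);
        }
    }
}
```

Inside a job, the buffer would presumably have to be created in a tJava and shared through globalMap, in the same way the tFileFetch stream is read above; treat that wiring as untested.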
11 REPLIES
One Star

Re: How does output stream work?

Streaming is faster but still creates a physical file.
A workaround that avoids files on disk is to use tBufferInput and tBufferOutput. You can read up on them here:
https://help.talend.com/search/all?query=tBufferInput&content-lang=en
If you don't find these components in your palette in your Studio, you'd need to add them by going to File --> Edit Project properties --> Designer --> Palette Settings, and adding to your palette.
One Star

Re: How does output stream work?

Hi and thanks.
Do you know if I can feed the stream of a tBufferInput into a tFileInputDelimited?
I have to define my schema mid-job with some special delimiter and line break characters (it's just a single-column string before that).
It seems I need a tFileInputDelimited to specify what those characters are in the incoming file.
One Star

Re: How does output stream work?

Sounds like you need the tExtractDelimitedFields after your tBufferInput to split the single column into multiple columns...
See usage in this help file: https://help.talend.com/search/all?query=tExtractDelimitedFields&content-lang=en
One Star

Re: How does output stream work?

Hi, I had looked at tExtractDelimitedFields, but I am also using a special character to denote line breaks, so I didn't know how to proceed.
One Star

Re: How does output stream work?

See attached screenshots for what the job could look like....
One Star

Re: How does output stream work?

Presuming you start by reading a file (like I did in my screenshots above), you'd specify what special characters denote line breaks. In the attached screenshot, it's "\n". Change it to yours...
One Star

Re: How does output stream work?

Thanks for sharing the screen caps.
The issue is that I basically have a delimited file within a delimited file.
On my first pass I get the normal delimited fields; on the second pass I get the embedded file that was within one of the columns.
What it amounts to is that I need to define my line breaks mid-job, which is why I asked whether I could stream the tBufferInput to a tFileInputDelimited.
One Star

Re: How does output stream work?

Sample layout of your file?
One Star

Re: How does output stream work?

Here it is:
ga:eventLabel,ga:totalEvents
typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|x,20
typeA|111115|x;typeA|111116|x;typeB|111117|x;typeB|111118|x,32

The first column has "|" as the field delimiter and ";" as the line break.
In step one of my job I drop the second column and do a string replace to replace the "x" with the value of column 2, "ga:totalEvents".
So then I have one column with everything in it, e.g.:
typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|20
typeA|111115|x;typeA|111116|x;typeB|111117|x;typeB|111118|32

This works fine if I then output to a file with a one-column schema and then read the same file back with a three-column schema, but of course I'm trying to avoid writing to physical files.
One Star

Re: How does output stream work?

What would your sample output look like? Please write out 3 lines so it's clear. It sounds like an iterate on the input flow, a tJava, and some code to store the rows in a HashMap could be used to skip the file outputs. How many records do you anticipate having?
One Star

Re: How does output stream work?

I know it's an old post, but I'm replying in case anyone is looking for a similar solution.
You could use the tNormalize component. There you specify which column should be expanded into multiple rows and which character acts as the line break. For the given example data, the job would be tFileInputDelimited --> tNormalize --> tExtractDelimitedFields --> tLogRow.
First, the tFileInputDelimited component reads the file with the default line break and "," (comma) as the field delimiter. I presume the first line is a header row, so enter 1 in the header row input box.
Second, the tNormalize component acts on the first column, which holds the delimited values and uses a different line break, ";" (semicolon).
Third, the tExtractDelimitedFields component takes the output of tNormalize and splits the single column into multiple columns based on the field delimiter "|" (pipe).
Last, the tLogRow outputs the rows with the modified schema produced by the tExtractDelimitedFields component.
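
To make the data flow concrete, here is a rough plain-Java sketch of what those steps do to the sample file from earlier in the thread. It is not the code Talend generates; the class and method names (NormalizeSketch, process) are made up for illustration, and carrying ga:totalEvents along as a fourth column is an assumption about the desired output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class NormalizeSketch {

    // Hypothetical helper: mimics the read, normalize and extract steps
    // on lines that are already held in memory.
    static List<String[]> process(String[] lines) {
        List<String[]> out = new ArrayList<>();
        // Start at index 1 to skip the header row.
        for (int i = 1; i < lines.length; i++) {
            // Like tFileInputDelimited: split the line on "," into two columns.
            String[] cols = lines[i].split(",", -1);
            String eventLabel = cols[0];
            String totalEvents = cols[1];
            // Like tNormalize: one output row per ";"-separated item.
            for (String item : eventLabel.split(Pattern.quote(";"))) {
                // Like tExtractDelimitedFields: split the item on "|".
                String[] fields = item.split(Pattern.quote("|"), -1);
                out.add(new String[] {fields[0], fields[1], fields[2], totalEvents});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] lines = {
            "ga:eventLabel,ga:totalEvents",
            "typeA|111111|x;typeA|111112|x;typeB|111113|x;typeB|111114|x,20",
            "typeA|111115|x;typeA|111116|x;typeB|111117|x;typeB|111118|x,32"
        };
        // Like tLogRow: print each resulting row.
        for (String[] row : process(lines)) {
            System.out.println(String.join("|", row));
        }
    }
}
```

With the two sample data lines this prints eight rows, the first being typeA|111111|x|20. No intermediate file is involved because every step works on values already in the flow.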