Hello all, I have three components in my job: tFileList_1 ---Iterate---> tFileInputDelimited_1 ---row main---> tFileOutputDelimited_1. The tFileList iterates over a directory and sends each file to the tFileInputDelimited, which transfers all of its records to the output. In tFileInputDelimited, the "File name/Stream" parameter is set to: ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")). I have looked at the code of all the components and understand that tFileList iterates over the directory and puts each file into the global map in its begin section, and that tFileInputDelimited opens the file in its begin section. That means the order of execution is the begin of tFileList, then the begin of tFileInputDelimited. But I read in the component creation document that begin starts from the last component, i.e. begin2 -> begin1, then main1 -> main2, then end1 -> end2. Please tell me why these conflict, or whether I have misunderstood something.
So you mean that when the first component is connected to another by an Iterate connection, the first iteration of the first component starts (i.e. the loop's first iteration in its begin section), then the second component runs its whole lifecycle (begin to end), then the first component starts its second iteration, and the second component again runs its whole lifecycle (begin to end). And that means the execution order is different for a row main connection and an Iterate connection, right? Please reply, and if you want to elaborate or give more information, that is welcome. Thanks.
You are getting into too much detail and it is not necessary. When you write a component you do need to consider the timing of the Begin, Main and End code, but this scenario is very simple. When the tFileList runs, its Begin code initiates everything. Then its Main code is cycled for every file. The Iterate link is fired every time the Main code is fired. The components after the link are treated as if they were in a separate subjob. So the tFileInputDelimited is instantiated (Begin) and each row is retrieved from the file (Main). The tFileOutputDelimited's Main code is called for every row. Its End code is called once all of the rows have been transferred. After this has occurred, the next file is iterated from the tFileList and the whole process happens again until there are no more files and the tFileList's End code is called.
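The order described above can be sketched as a plain-Java simulation. This is not the code Talend generates; the class `IterateOrderSketch`, the trace strings and the sample file names are made up for illustration. The one ordering detail not covered in the reply, that the output's begin runs before the input's begin inside the row-linked part, is taken from the component creation document as quoted in the first post and should be treated as an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the execution order; not Talend-generated code.
class IterateOrderSketch {

    // files maps a (hypothetical) file path to the rows it contains.
    static List<String> run(Map<String, List<String>> files) {
        List<String> trace = new ArrayList<>();
        Map<String, Object> globalMap = new LinkedHashMap<>();

        trace.add("tFileList begin");
        for (String path : files.keySet()) {
            // Each pass of the file loop fires the Iterate link once.
            globalMap.put("tFileList_1_CURRENT_FILEPATH", path);
            trace.add("tFileList iterate " + path);

            // Everything behind the Iterate link behaves like its own subjob.
            String current = (String) globalMap.get("tFileList_1_CURRENT_FILEPATH");
            trace.add("output begin");            // assumed order: begin2 -> begin1
            trace.add("input begin " + current);  // the file is opened here
            for (String row : files.get(current)) {
                trace.add("input main " + row);   // main1 -> main2 for every row
                trace.add("output main " + row);
            }
            trace.add("input end");               // end1 -> end2
            trace.add("output end");
        }
        trace.add("tFileList end");
        return trace;
    }

    public static void main(String[] args) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        files.put("a.csv", List.of("a1", "a2"));
        files.put("b.csv", List.of("b1"));
        run(files).forEach(System.out::println);
    }
}
```

Seen this way, the two orders do not conflict: the begin2 -> begin1 rule applies inside the row-linked part, and that whole part is replayed once per file by the Iterate link.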
Thanks, I got your point. The reason I am going so deep is that I am creating my own component, and I didn't find any documentation for the coding (other than the document posted on the forum, which is not in depth). In another post of mine you advised me to do R&D on the available components, so I am looking closely at these existing components and how they work. Let me tell you what component I have created and what I want to achieve next. I am doing R&D, but if you have a solution it is welcome. I have created my own component, say X. X reads a JSON file, but not the whole file at once. It reads the first n records and creates a batch, where n is the batch size (entered by the user). It then reads each record one by one from the batch and transfers it. Then it continues reading the file, creates the second batch and follows the same procedure. It repeats the process until the last record and then closes the file. Example job: X (created component) ---row main---> tMap ---output connection---> tMysqlOutput. Because tMap transfers all the records, the output connection shows the final size (i.e. the total number of records transferred). What I want is for it to show the number of records transferred from the batch (the number of records in the last batch can be less than the batch size, depending on the records available in the file). For this purpose I am first thinking about a suitable job design, before going further with coding. The job structure above is not compulsory, but the purpose should be the same, i.e. (1) batch-wise transfer and (2) the connection should show the number of records transferred for the batch. So I want a good job structure so that I can proceed with my component accordingly. Please let me know if anything in my wording is difficult to understand.
I see. I didn't realise that you wanted to create a component. I thought you wanted to create the job structure you described, and wondered why you needed to know so much detail about how the components work. A good way to think about the code sections is: Begin = instantiate variables used for the lifetime of the component (connections, lists, containers, etc.); Main = the code that deals with row-level detail and the input and output of records; End = the shutdown phase, where you tie up loose ends and maybe post-process some data (aggregates, etc.). I agree about the lack of documentation. It is a pain. I have written one component, out of a need to pass data rows to a child job and being fed up with writing loads of Java to do it in a job. My experience there was not a happy one, but I did learn a fair bit about how it all works.
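To make the three sections concrete, here is a minimal sketch that collapses them into ordinary Java methods. `LifecycleSketch` and its fields are hypothetical stand-ins, not a real component or Talend API; an actual component keeps each section in its own template rather than in methods like these.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a component; the method names mirror the three code sections.
class LifecycleSketch {
    private List<String> buffer; // created in begin, lives for the component's lifetime
    private int rowCount;

    // Begin: instantiate whatever the component needs for its whole run.
    void begin() {
        buffer = new ArrayList<>();
        rowCount = 0;
    }

    // Main: called once per incoming row; returns the row handed downstream.
    String main(String row) {
        rowCount++;
        String out = row.trim().toUpperCase();
        buffer.add(out);
        return out;
    }

    // End: shut down and post-process, here by building a simple summary.
    String end() {
        String summary = rowCount + " rows: " + String.join(",", buffer);
        buffer = null; // release what begin created
        return summary;
    }

    public static void main(String[] args) {
        LifecycleSketch c = new LifecycleSketch();
        c.begin();
        for (String row : List.of(" a ", "b")) {
            System.out.println(c.main(row));
        }
        System.out.println(c.end());
    }
}
```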
Thanks for that. But I still have not understood what the job structure should be for my purpose, and I want to settle that before proceeding with the design. Which existing component should I compare my component to and base it on? Please read my previous post again.
I don't believe there is a component which batches up records to be output. The closest thing I can think of would be an aggregate component. You might find this easier to achieve with two components: one to batch the data up and store it in a List of some type, and another linked to it by an Iterate link to release the records from the batch. Maybe store the List (containing the batches) in a context variable. Take a look at my component for storing records using context variables (https://exchange.talend.com/#marketplaceproductoverview:gallery=marketplace%252F1&pi=marketplace%252...)
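A rough sketch of that two-component idea in plain Java follows. Nothing here is an existing Talend component or API: `BatchSketch`, `toBatches`, `releaseBatches` and the key `"X_BATCHES"` are hypothetical, and an ordinary `Map` stands in for wherever the List of batches would be stored. For simplicity it builds all batches up front, whereas a real reader would more likely build them one at a time while reading the file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BatchSketch {

    // Stage 1 (hypothetical batching component): group records into batches of at most batchSize.
    static List<List<String>> toBatches(List<String> records, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            int end = Math.min(i + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(i, end)));
        }
        return batches;
    }

    // Stage 2 (hypothetical releasing component): one iteration per batch,
    // releasing records one by one and counting the rows of that batch only.
    static List<Integer> releaseBatches(Map<String, Object> shared, String key) {
        @SuppressWarnings("unchecked")
        List<List<String>> batches = (List<List<String>>) shared.get(key);
        List<Integer> rowsPerBatch = new ArrayList<>();
        for (List<String> batch : batches) {
            int rows = 0;
            for (String record : batch) {
                rows++; // a real component would emit the record on its row link here
            }
            rowsPerBatch.add(rows);
        }
        return rowsPerBatch;
    }

    public static void main(String[] args) {
        Map<String, Object> shared = new HashMap<>(); // stand-in for the shared store
        shared.put("X_BATCHES", toBatches(List.of("r1", "r2", "r3", "r4", "r5"), 2));
        System.out.println(releaseBatches(shared, "X_BATCHES"));
    }
}
```

Because the counter restarts with every batch, the last batch naturally reports fewer rows when the file does not divide evenly by the batch size.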
Yes, I was thinking along the same lines: my first component will create the batches, the second component will be attached to it by an Iterate connection, and the second component will transfer the records one by one. Custom component X ---Iterate---> custom component Y ---row main---> tMap, or continue with any other components or job. X will create the first batch and send it to Y; Y will read the batch and transfer the individual records, and so the row main connection will show the number of records transferred for each batch. X will store the batch in a global map variable, which Y will read. The second solution I can think of is simply X ---Iterate---> tMap ---> ..., i.e. modify tMap so that it takes a bulk (batch), reads the records one by one (simulating input arriving one record at a time) and transfers them. But I think that one is difficult. (Actually, I don't know very much about the built-in components.) What do you think, am I going the right way?