Hi, I have been working with Talend for a while now. I wanted to clarify a few things I couldn't find clear answers to: what is possible with TOS (Talend Open Studio), and when would I need the Talend Enterprise version?

1. Job distribution: Is it possible to run Talend jobs on different machines or processors?
2. Data pipelining: Can the different steps of an ETL process run on different machines or processors?
3. Partitioning: Is it possible to partition, for example by product code, to determine on which machine or processor the data is processed?
4. Support for analytical functions: During the loading process, is it possible to invoke analytical functions such as forecasting, basket analysis, or regression?
5. Key lookups in memory: Can you load a table completely into memory and search it, without having to make joins?
6. Key lookups reusable across processes: Are these lookup tables reusable across different loading processes, so that the table is loaded into memory only once?
I am sure Talend provides all of the above; I am just looking for some documentation and explanation of these points. Thanks!
1.) Yes, it is possible. With the Enterprise edition you install job servers on your target machines and deploy jobs to them; in the open-source edition you export your job as a standalone program and move it to the server where you need it.

2.) If you understand a "step" as a job that does only one part of the whole transformation, the answer is yes. If you mean that the whole data transformation takes place in one job, the answer is no. I would generally recommend separating the complete transformation into small jobs that can be tested independently.

3.) Talend currently provides partitioning within a job. You can of course design the transformation so that you have multiple jobs and run them on different servers (as explained in point 2). Splitting the data in order to call different jobs is possible, but it is not available out of the box.

4.) No, Talend does not provide analytical functions, because that is not its focus as an ETL tool. Those tasks should be done in a statistics tool like R or SAS. Of course you can implement them with Java libraries, since Talend is essentially a highly abstracted Java development environment.

5.) Yes, that is exactly how the common mapping components such as tMap work.

6.) You can keep the keys in components like tHashOutput/tHashInput, and that way you can reuse them for any lookup task.

I would recommend you talk to the Talend sales team and send them your requirements. Talend usually provides proof-of-concept projects for new customers; I have done a couple of them, and that is the best way to prove Talend's capabilities.
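To illustrate points 5 and 6: conceptually, an in-memory lookup in a tMap (or a lookup cached via tHashOutput/tHashInput) amounts to reading the lookup table once into a hash map keyed on the join column, then matching each main-flow row with an O(1) get instead of a database join. This is only a minimal plain-Java sketch of that idea, not Talend's generated code; the class, method, and sample data below are all illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class InMemoryLookup {

    // Load the lookup "table" once into memory, keyed on the join
    // column (here: a hypothetical product code). tMap does an
    // equivalent load on first use; tHashOutput lets several
    // subjobs reuse the cached rows.
    static Map<String, String> loadProducts() {
        Map<String, String> products = new HashMap<>();
        products.put("P100", "Widget");
        products.put("P200", "Gadget");
        return products;
    }

    // Resolve one main-flow key against the cached map; returns
    // null when there is no match (comparable to a left-outer-join
    // setting on a tMap lookup).
    static String lookup(Map<String, String> products, String code) {
        return products.get(code);
    }

    public static void main(String[] args) {
        Map<String, String> products = loadProducts();
        System.out.println(lookup(products, "P100")); // prints "Widget"
        System.out.println(lookup(products, "P999")); // prints "null"
    }
}
```

The key design point is that the load happens once per process, while the lookup runs per row; this is why keeping the table in memory beats re-joining against the source for every record.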