Coming from a SAS background, I am finding the basic task of deduplicating a dataset quite a chore in Talend. I am sure this is down to my lack of experience with the tool.
I need to remove duplicates from a target dataset after inserting the data. I can't seem to run multiple SQL statements using tMysqlRow, and I get the error "You have an error in your SQL syntax". The same set of queries works fine in MySQL Workbench.
With SAS Data Management Studio, I could add a SAS code node and run the sort procedure with the noduplicates modifier. I was trying to do something similar by running the set of SQL queries through the tMysqlRow component.
How do you guys do that? All I need is the Talend way to deduplicate the data in the data source.
To run multiple SQL statements in one tMysqlRow, you have to set an additional JDBC parameter, allowMultiQueries=true, in the component's advanced settings.
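For reference, here is a rough sketch of what that parameter looks like once it reaches the MySQL JDBC URL. In Talend you would normally just type allowMultiQueries=true into the additional JDBC parameters field; the helper name withParam and the host, port and schema below are made up for illustration.

```java
final class JdbcUrlSketch {
    // Appends a "key=value" connection property to a JDBC URL,
    // using '?' for the first property and '&' for any later one.
    static String withParam(String url, String key, String value) {
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + key + "=" + value;
    }

    public static void main(String[] args) {
        // Hypothetical host, port and schema name.
        String base = "jdbc:mysql://localhost:3306/mydb";
        // prints jdbc:mysql://localhost:3306/mydb?allowMultiQueries=true
        System.out.println(withParam(base, "allowMultiQueries", "true"));
    }
}
```

Without this property the MySQL driver rejects a string containing several statements separated by semicolons, which would explain a syntax error on queries that run fine in Workbench.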
Could you please try the tUniqRow component for your use case?
Please refer to the help documentation for this component.
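To give an idea of what key-based deduplication does, here is a minimal sketch in plain Java. It is not Talend's implementation, only an illustration of the idea: the first row seen for each key goes to a "uniques" output and any later row with the same key goes to a "duplicates" output. The class name, the split method and the sample rows are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class UniqRowSketch {
    /** Holds the two outputs: first occurrences and later repeats. */
    static final class Result<T> {
        final List<T> uniques = new ArrayList<>();
        final List<T> duplicates = new ArrayList<>();
    }

    /** Splits rows by key, preserving the input order in both outputs. */
    static <T, K> Result<T> split(List<T> rows, Function<T, K> keyOf) {
        Set<K> seen = new HashSet<>();
        Result<T> result = new Result<>();
        for (T row : rows) {
            // Set.add returns false when the key was already present.
            if (seen.add(keyOf.apply(row))) {
                result.uniques.add(row);
            } else {
                result.duplicates.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical rows with two columns: id and name.
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "alice"},
            new String[] {"2", "bob"},
            new String[] {"1", "alice"});
        // Use both columns as the deduplication key.
        Result<String[]> r = split(rows, row -> row[0] + "|" + row[1]);
        // prints "2 unique, 1 duplicate"
        System.out.println(r.uniques.size() + " unique, " + r.duplicates.size() + " duplicate");
    }
}
```

In a job, the equivalent would be to read the rows with an input component, mark the key columns in tUniqRow, and write only the unique flow to the target, so the duplicates never have to be deleted afterwards.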
I was trying to do something similar by running the set of SQL queries through the tMysqlRow component.
The tMysqlRow component is not a component that provides output. It can execute a query (or multiple queries) for each input row, but it does not give you data like the tMysqlInput component.