Group data by a column and then sum multiple columns for each group
I've got a CSV with the following headers Date|Name|Type|Status|Gross|Fee|Net|Shipping|Balance What I'm trying to do is group the data by the Type column (Payment, Refund, Declined) and then sum the Gross, Fee, Net, & Shipping columns for each group. Currently my work flow is as follows tFileInputDelimited->tSortRow(sort by column Type)->tMap(removing unnecessary columns)->tFileOutputDelimited I see that the tAggregateSortedRow function provides the ability to group by and sort but it looks like it's only for databases. Is there a way to preform this operation on a CSV file? Cheers, AJ
Re: Group data by a column and then sum multiple columns for each group
i have a mysql table with the columns date|customer_name|purchase_amount| i want to group the data based on the customer name and date. and i want to load this data to another table of the same database with the grouped result and it should be update on the basis of current date. so,please help me i am new in this technology... Regards, Chandan kumar
/** something like **/ select ...... , sum(..) from ..... group by .....
As you have to load result in table in the same DB , you also can use ELT that have a graphic Talend way to design your job but do not extract data to manage it but let the DB engine do the request ans insert result into yout table.