One Star

Aggregate data and make a sum

Hello everyone,
Sorry for my english, i'm french !
I have a source file delimited as same as this exemple :
Date;Serveur;Drive;Name;Space;FreeSpace
13/11/2008;SERVER1;C:;SYSTEM;8578932736;2527985664
13/11/2008;SERVER1;D:;SWAP;2138540032;917469184
13/11/2008;SERVER1;E:;DATA;2138537984;1242279936
13/11/2008;SERVER1;G:;Save AD;2138540032;876894208
13/11/2008;SERVER2;C:;BACKUP_NT;2097413632;1934547968
13/11/2008;SERVER2;D:;Disque NT;8595417088;4937967104
13/11/2008;SERVER2;E:;Applicatif;6440357888;4147671040
13/11/2008;SERVER2;F:;Data;10733957120;5570387968
13/11/2008;SERVER2;T:;Applicatif;6440357888;4147671040
.
.
.
.
06/02/2009;SERVER1;C:;System_SAN;8587157504;3538583552
06/02/2009;SERVER1;D:;Swap Local;28021305344;14857166848
06/02/2009;SERVER1;E:;Application;8587157504;5822963712
06/02/2009;SERVER1;F:;Data;257689763840;60264726528
06/02/2009;SERVER1;Y:;System_Local;8389242880;2561630208
06/02/2009;SERVER2;C:;BACKUP_NT;2097413632;1934547968
06/02/2009;SERVER2;D:;Disque NT;8595417088;4896002048
06/02/2009;SERVER2;E:;Applicatif;6440357888;3076538368
06/02/2009;SERVER2;F:;Data;10733957120;5566181376
06/02/2009;SERVER2;T:;Applicatif;6440357888;3076538368

I want to aggregate the space and FreeSpace data and load this datas in an output table same as :
Date;Serveur;SumSpace,SumFreeSpace
13/11/2008;SERVER1;14994550784;5564628992
13/11/2008;SERVER2;32210089984;18803697152
.
.
.
.
06/02/2009;SERVER1;14994550784;6894628992
06/02/2009;SERVER2;32210089984;15603697152

Have you got any idea about this ?
How I can make this in a Job Design ?
I'm trying to make that with the tAggregateRow component but it doesn't work Smiley Sad
Thank's a lot,
François.
8 REPLIES
One Star

Re: Aggregate data and make a sum

For the moment I obtain this result :
.------------+-------------+----------+----------------.
| tLogRow_1 |
|=-----------+-------------+----------+---------------=|
|DATE_COLLECT|SERVER |SUM_SPACE|SUM_FREE_SPACE|
|=-----------+-------------+----------+---------------=|
|06/02/2009 |SERVER1 |0.0 |0.0 |
|06/02/2009 |SERVER2 |0.0 |0.0 |
|06/02/2009 |SERVER3 |0.0 |0.0 |
|06/02/2009 |SERVER4 |0.0 |0.0 |
|06/02/2009 |SERVER5 |0.0 |0.0 |
|06/02/2009 |SERVER6 |0.0 |0.0 |
|06/02/2009 |SERVER7 |0.0 |0.0 |
|06/02/2009 |SERVER8 |0.0 |0.0 |

With this settings of tAggregateRow component :
- Group BY :
----> DATE_COLLECT
----> SERVER
- Operations :
----> Out : SUM_SPACE => SUM => In : SPACE
----> Out : SUM_FREE_SPACE => SUM => In : FREESPACE
I don't understand why the third and fourth collumn are not calculated.
François.
One Star

Re: Aggregate data and make a sum

Hum,
Another test....
I have loaded all the datas (includes in my csv file) line per line into an SQL Database (with the same table schema), and I have made this following query :
select date_collect, server, sum(space), sum(free_space)
from statging_table
group by (date_collect, server);

And the result was good....!!!!
So I think the tAggregateRow component cannot make a sum with two (or more) String datas...but when I'm trying to convert String to Integer (especially this data : 8578932736) Talend return to me this error :
Starting job WIN_Disk at 16:41 23/02/2009.
Exception in component tConvertType_1
java.lang.NumberFormatException: For input string: "8578932736"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.valueOf(Unknown Source)
at routines.system.TypeConvert.String2Integer(TypeConvert.java:4186)
at collect_it.win_disk_0_1.WIN_Disk.tFileInputDelimited_1Process(WIN_Disk.java:1086)
at collect_it.win_disk_0_1.WIN_Disk.runJobInTOS(WIN_Disk.java:1415)
at collect_it.win_disk_0_1.WIN_Disk.main(WIN_Disk.java:1328)
Job WIN_Disk ended at 16:41 23/02/2009.

So...I decided to throw my computer out the window... Smiley Sad
François.
One Star

Re: Aggregate data and make a sum

Please keep your computer, or tell me where you work, I'll go under your window ! Smiley Tongue
Ok, I tried with this file source
Date;Serveur;Drive;Name;Space;FreeSpace
13/11/2008;SERVER1;C:;SYSTEM;8578932736;2527985664
13/11/2008;SERVER1;D:;SWAP;2138540032;917469184
13/11/2008;SERVER1;E:;DATA;2138537984;1242279936
13/11/2008;SERVER1;G:;Save AD;2138540032;876894208
13/11/2008;SERVER2;C:;BACKUP_NT;2097413632;1934547968
13/11/2008;SERVER2;D:;Disque NT;8595417088;4937967104
13/11/2008;SERVER2;E:;Applicatif;6440357888;4147671040
13/11/2008;SERVER2;F:;Data;10733957120;5570387968
13/11/2008;SERVER2;T:;Applicatif;6440357888;4147671040
06/02/2009;SERVER1;C:;System_SAN;8587157504;3538583552
06/02/2009;SERVER1;D:;Swap Local;28021305344;14857166848
06/02/2009;SERVER1;E:;Application;8587157504;5822963712
06/02/2009;SERVER1;F:;Data;257689763840;60264726528
06/02/2009;SERVER1;Y:;System_Local;8389242880;2561630208
06/02/2009;SERVER2;C:;BACKUP_NT;2097413632;1934547968
06/02/2009;SERVER2;D:;Disque NT;8595417088;4896002048
06/02/2009;SERVER2;E:;Applicatif;6440357888;3076538368
06/02/2009;SERVER2;F:;Data;10733957120;5566181376
06/02/2009;SERVER2;T:;Applicatif;6440357888;3076538368

...and I get this:
.----------+-------+------------+-----------.
| tLogRow_1 |
|=---------+-------+------------+----------=|
|Date |Serveur|Space |FreeSpace |
|=---------+-------+------------+----------=|
|06/02/2009|SERVER2|34307503616 |18549808128|
|06/02/2009|SERVER1|311274627072|87045070848|
|13/11/2008|SERVER2|34307503616 |20738245120|
|13/11/2008|SERVER1|14994550784 |5564628992 |
'----------+-------+------------+-----------'

Is something like this you expected ?
Arnaud
One Star

Re: Aggregate data and make a sum

Thank's Arnaud for your reply Smiley Happy
Yes, that's exactly what I want it!
One Star

Re: Aggregate data and make a sum

Ok. Then, here is the parameter I applied, as screenshot...
If you need further details, please ask (english or french accepted Smiley Wink)
Arnaud
One Star

Re: Aggregate data and make a sum

Hey,
Merci beaucoup de ton aide.
J'ai donc fait exactement comme sur tes captures d'écrans, et bingo, ça fonctionne !
C'était donc bien le passage en "Long" au lieu de "Integer" qui a fait tout fonctionner Smiley Very Happy
Reste à régler un problème....de temps en temps j'ai des lignes "vides" de type :
13/11/2008;SERVER1;C:;SYSTEM;8578932736;2527985664
13/11/2008;SERVER1;D:;;;
13/11/2008;SERVER1;E:;;;
13/11/2008;SERVER1;G:;;;
13/11/2008;SERVER2;C:;BACKUP_NT;2097413632;1934547968
13/11/2008;SERVER2;D:;Disque NT;8595417088;4937967104
13/11/2008;SERVER2;E:;Applicatif;6440357888;4147671040
13/11/2008;SERVER2;F:;Data;10733957120;5570387968
13/11/2008;SERVER2;T:;Applicatif;6440357888;4147671040

Car j'ai des serveurs en cluster...et là du coup ça plante...donc il faut que j'utilise un petit tFilterRow Smiley Very Happy
Je te tiens au courant de l'avancement, et si j'y arrive pas, j'espère bénéficier de ton expertise Smiley Very Happy
@+
François.
One Star

Re: Aggregate data and make a sum

Finalement c'était un peu plus compliqué que cela car :
- impossible de tester si une variable de type "Long" est vide (enfin je n'ai pas réussi)
Du coup voici mon job finale (avec insertion dans une BDD oracle) :
- je récupère les lignes du fichier (tout en String)
- je fais mes tests et remplacement des valeurs vides
- je transforme les variables "space" et "freespace" en Long
- je lance l'agrégation et la somme des données
- je retransforme les variables "Long" en "String" et incrémente un ID unique
- enfin je balance le tout dans ma table Oracle
C'est assez long, mais ça fonctionne Smiley Very Happy
Merci beaucoup Arnaud pour ton aide précieuse!
@+
François.
One Star

Re: Aggregate data and make a sum

Une autre méthode (à voir si c'est plus rapide pour toi), c'est de faire un tMap juste avant l'aggrégation pour remplacer toutes les valeurs null par un 0.
Pour ceci, il y a la forme conditionnelle suivante: (row1.Space==null)?0:row1.Space (cf capture d'écran)
Arnaud