How to Load a CSV file based upon the metadata and different structure in the File.

Four Stars

I have 138,000 files that I need to load, all with the following structure. I have added the first column as an explanation of the contents of each line.

Line 1 (# of Atoms) drives how many Atom lines are to be loaded, here Atom 1 to Atom 19 (Child Table).

Line 2 (Molecular_energies): its data needs to go into the Parent Table.

Frequencies need to go into the Parent Table.

The last 2 lines need to go into the Parent and Child Tables.

 

Any idea how to do this?

 

 

 

Explanation
# of Atoms          19
Molecular_energies  gdb 13.15803571.24363291.10602161.131277.92-0.212360.023960.236321176.69950.155411-422.59307-422.58379-422.58285-422.6270334.695
Atom 1              C -1.8396130.529295573.18206792-0.414977
Atom 2              C -2.1465361-0.16446751.842511390.232941
Atom 3              C -3.58660020.132658721.38668197-0.415379
Atom 4              O -1.9452669-1.57364171.95806872-0.461014
Atom 5              C -1.16446160.268958510.791004570.388553
Atom 6              C -0.2471162-0.39756230.04289842-0.278918
Atom 7              C 0.385209510.58702844-0.7883678-0.178412
Atom 8              C -0.19714011.77247527-0.4801599-0.025883
Atom 9              O -1.14642861.597741070.48275977-0.201531
Atom 10             H -2.56554970.22276973.944423310.100196
Atom 11             H -0.83769960.255238873.521592370.128535
Atom 12             H -1.89109761.616636723.080944810.130435
Atom 13             H -4.3011711-0.17126032.160777280.100233
Atom 14             H -3.72488961.200332281.196584410.130552
Atom 15             H -3.8074834-0.41876660.469412710.128541
Atom 16             H -2.5619847-1.89787642.622093660.278056
Atom 17             H -0.0515173-1.45582610.083738290.120535
Atom 18             H 1.166691590.4258948-1.51446230.1088
Atom 19             H -0.06722312.78381413-0.82781740.128737
Frequencies         47.3286140.8247190.1321220.2055235.6849277.8749328.4914342.2134347.5775466.2143484.9254606.1277613.3396707.4789739.7643814.5836843.6529854.6878901.6639
SMILES              CC(C)(O)C1=CC=CO1  CC(C)(O)c1ccco1
InChI               InChI=1S/C7H10O2/c1-7(2,8)6-4-3-5-9-6/h3-5,8H,1-2H3  InChI=1S/C7H10O2/c1-7(2,8)6-4-3-5-9-6/h3-5,8H,1-2H3

Thirteen Stars

Re: How to Load a CSV file based upon the metadata and different structure in the File.

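As one variant, split the job into 3 steps:

1) Read the file (no header), limited to 1 line, and store the number in a variable.

2) Read the file again, skipping the lines above the atom block (2 in your sample: the count line and the Line 2 data), with LIMIT set to the number stored in step 1.

3) Read the file again with the header set to everything above the last 2 lines (in your sample: 3 + the variable, because the Frequencies line sits between the atoms and the last 2 lines), limit = 2.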
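
Outside Talend, the same skip/limit idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not a Talend job: the names `read_rows` and `load_molecule` are made up for this example, the files are assumed to be comma-delimited, and the real files are assumed not to contain the explanation column that was added to the sample. The offsets assume the layout described in the question: one count line, one Line 2 data line, N atom lines, one Frequencies line, then the last 2 lines.

```python
import csv
from itertools import islice


def read_rows(path, skip, limit, delimiter=","):
    """Return up to `limit` rows after skipping the first `skip` rows."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        return list(islice(reader, skip, skip + limit))


def load_molecule(path):
    # Step 1: read only the first line; its first field is the atom count.
    n_atoms = int(read_rows(path, skip=0, limit=1)[0][0])

    # Step 2: skip the count line and the Line 2 data, then read n_atoms rows.
    atoms = read_rows(path, skip=2, limit=n_atoms)

    # Step 3: skip everything above the last 2 lines, then read 2 rows.
    last_two = read_rows(path, skip=3 + n_atoms, limit=2)

    # Parent-only lines: Line 2 and the Frequencies line after the atoms.
    line2 = read_rows(path, skip=1, limit=1)[0]
    frequencies = read_rows(path, skip=2 + n_atoms, limit=1)[0]

    return {
        "n_atoms": n_atoms,
        "line2": line2,
        "atoms": atoms,
        "frequencies": frequencies,
        "last_two": last_two,
    }
```

Each call to `read_rows` reopens the file, which mirrors reading the same file several times with different skip and limit settings. Reading the file once and slicing the rows in memory would give the same result.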

 

 

-----------
Four Stars

Re: How to Load a CSV file based upon the metadata and different structure in the File.

Do I have to write code for it, or can it be done in the GUI?

Thirteen Stars

Re: How to Load a CSV file based upon the metadata and different structure in the File.

If all files have the same structure (number of columns), it is all possible with standard components:

tFileInputDelimited

tFlowToIterate

tMap

etc.

If not, it is more complicated.
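
For files that do share one layout, the overall flow (loop over the files, read the delimited rows, map them to a parent output and a child output) can be sketched in Python as below. It does not show how the Talend components work internally; it only mirrors the flow. Everything named here is an assumption for illustration: `split_file`, `iter_folder`, the `molecule_id` key taken from the file name, the comma delimiter, the `*.csv` pattern, and the idea that the first field of each atom row is the element symbol. The thread does not say how the last 2 lines should feed the child table, so the sketch keeps them on the parent row only.

```python
import csv
from pathlib import Path


def split_file(path, delimiter=","):
    """Map one file to (parent_row, child_rows) linked by a shared key."""
    path = Path(path)
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter=delimiter))

    n_atoms = int(rows[0][0])        # line 1: number of atom rows
    key = path.stem                  # hypothetical key: file name without extension
    atom_rows = rows[2:2 + n_atoms]  # atom block starts after line 2

    parent = {
        "molecule_id": key,
        "line2": rows[1],
        "frequencies": rows[2 + n_atoms],
        "last_two": rows[3 + n_atoms:5 + n_atoms],
    }
    children = [
        {"molecule_id": key, "atom_no": i, "element": row[0], "values": row[1:]}
        for i, row in enumerate(atom_rows, start=1)
    ]
    return parent, children


def iter_folder(folder, pattern="*.csv"):
    """Yield (parent_row, child_rows) for each matching file, one at a time."""
    for path in sorted(Path(folder).glob(pattern)):
        yield split_file(path)
```

Yielding one file at a time keeps memory use flat, which matters with a large number of files: each parent row and its child rows can be written to the target tables before the next file is opened.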

-----------
Four Stars

Re: How to Load a CSV file based upon the metadata and different structure in the File.

All files are the same. Can you give me a few more details on these steps, please? It will help me shorten my learning curve.

 

Thanks!
