Five Stars

tFileInputDelimited simple dynamic schema (number of columns)?

I have a single job that iterates through a bunch of files based on dynamically loaded contexts.

It then gets whatever's in these files and uploads it to a sheet.

My challenge currently is with the fact these files will have a different number of columns - how can I add via a context variable "NumColumns" a dynamic schema option so each of the files has the correct number of columns uploaded?

 

I am thinking of setting the tFileInputDelimited schema to a max number of columns and then using tMap to dynamically only pull the number of columns available for the current file.

I am very novice with Talend and Java so not sure how and if I can do this transofrmation in tMap - any ideas?

  • Data Integration
8 REPLIES
Five Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

The last reply in this thread could be a workaround solution:
https://www.talendforge.org/forum/viewtopic.php?id=37949

However, I am not sure how to "mark" columns as null and then use tJavaRow to filter them - any pointers?
Ten Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

This tutorial may work for you. It might make sense to extend the ideas here and write a routine to handle the processing of the logic, but that is up to you. However this shows a relatively easy way of achieving the majority of your goal

https://www.rilhia.com/tutorials/dynamic-column-order

Rilhia Solutions
Five Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

Hey thanks for the quick reply.
I have seen similar column mapping solutions and it's definitely something I am saving for later use.
Unless I am missing something however, it is of no use to me as I don't have the same number of columns.
My problem is that some of the files I work with have for example column A-M whilst others will have additional columns A-W.
When I set in the schema of the tFileInputDelimited all the columns up to "W", I get N-W with null values which upset my sheet (I have some "self dragging" array formulas which break as they see the null values).
Therefore I need a way to discard the columns with null values from being passed.
Ten Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

When you say "sheet" are you talking Excel or just another CSV type file? There are ways around this, but you may end up having to use a bit of Java. If you are just outputting to a CSV (or similar) file, this example may help. It was actually written to deal with a scenario where every row may need to be added to a different file (depending on a key in the file). However, you can ignore the multiple file part and look at the section where the file content is built. If you build the structure manually you could add checks for null fields and remove them.

 

https://www.rilhia.com/tutorials/load-data-dynamic-number-files

 

Of course, the tutorial above will only work with flat file creation. If you are wanting to do something with Excel it will still be possible, but may require a bit more thought.

 

 

Rilhia Solutions
Five Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

I am taking .csv and uploading it to a Google Sheet, using tGoogleSheetOutput.
Checked your example but can't follow your logic - can you kindly explain your thinking a bit more?
Cheers
Ten Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

OK, my thinking was regarding avoiding a fixed schema. While Talend is an awesome tool, sometimes the functionality to make it "drag and drop" and "easy" are somewhat limiting. Loading to a CSV file you can take total control of the columns being loaded using Java. You simply need to create a String of columns formatted in a suitable format for CSV. When you take control of that yourself, you can handle dynamically removing null columns. The part of the tutorial I was aiming at you was the building of the CSV row. I didn't actually include any logic to dynamically remove columns, but that is easy. The bit I was hoping you would latch onto was the creation of the rows and how easy it is to write to a flat file with a tiny bit of Java.

 

That aside, it does not really suit your case here. The tGoogleSheetOutput I have not used, but assume you are required to use a fixed schema. If you really do not want to add null columns at the end (or cannot for fear of breaking your file) you can use the Google API for manually creating your data row in a similar way to above....although you will have to code this yourself. Take a look here (https://developers.google.com/sheets/api/) for an idea of what you will need. You can call any web API available using Talend components and all you need to think about after that is formatting JSON or XML and maybe dealing with OAuth authentication. These are big steps to start with, but really beneficial in the long run and open a lot of doors.

Rilhia Solutions
Five Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

Thanks for taking the time to explain more, that sheds some light.

I think the API route is an overkill for my use as all I am trying to avoid doing is breaking some helper columns in my Google Sheet - a simpler solution would be to simply move them to the left so when I'm dropping the CSV with null data to the right it doesn't affect my data.
Just on the API as it might be a useful exercise in the future; what Talent components do you use to connect?
Ten Stars

Re: tFileInputDelimited simple dynamic schema (number of columns)?

I use the tRestClient or tRest component (there are subtle differences between the two). 

If you can shift your columns to the left and use the component you mentioned earlier, that will certainly be easier, but may leave you open to your formula issue with nulls. However I am sure there is another workaround for that, maybe within the spreadsheet.

 

Don't be afraid to use third party APIs with Talend. That is one of the best things about it from my perspective. It allows you to do things in a simplified manner, while still allowing you to do some technical stuff. Good luck

Rilhia Solutions