Zeropad filenames of splitting large file into multiple small files?

Nine Stars


I have a 108GB file I am splitting into 600 files.

 

Is there any way to zero-pad the numbers in the generated filenames to a fixed width when splitting a delimited output file into multiple smaller files?

 

For example the generated filenames are:

myFile1.csv

myFile2.csv

myFile3.csv

myFile4.csv

myFile5.csv

myFile6.csv

 

Desired output format:

myFile001.csv

myFile002.csv

myFile003.csv

myFile004.csv

myFile005.csv

myFile006.csv


Accepted Solutions
Employee

Re: Zeropad filenames of splitting large file into multiple small files?

Hi,

 

    Below is a quick way to rename the string with zero padding. The same expression works whether one or two zeros need to be added.

[Image: Scenario 1 - myFile1.csv is the input file name]

[Image: Scenario 2 - myFile10 is the input file name]

[Image: tMap]

 

 

 

row1.input.substring(0,row1.input.lastIndexOf(".") ).replaceAll("[0-9]", "")+
("000" + row1.input.replaceAll("[^0-9]", "")).substring(row1.input.replaceAll("[^0-9]", "").length()) + row1.input.substring(row1.input.lastIndexOf(".") ) 

Note that I am concatenating three strings with +. The first string is the file name without its digits. The second string contains only the digits from the file name, zero-padded to three characters. The third string is the extension, .csv.
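To make the three parts easier to follow, here is a minimal sketch of the same expression as a plain Java method. The class name ZeroPadDemo, the method name zeroPad, and the parameter input (standing in for row1.input) are hypothetical names chosen for illustration only.

```java
class ZeroPadDemo {
    // Same logic as the tMap expression; `input` stands in for row1.input.
    static String zeroPad(String input) {
        int dot = input.lastIndexOf(".");
        // Part 1: name before the extension, with every digit removed.
        String base = input.substring(0, dot).replaceAll("[0-9]", "");
        // Part 2: keep only the digits, prepend "000", then drop as many
        // leading characters as there are digits, leaving three characters.
        String digits = input.replaceAll("[^0-9]", "");
        String padded = ("000" + digits).substring(digits.length());
        // Part 3: the extension, including the dot.
        String ext = input.substring(dot);
        return base + padded + ext;
    }

    public static void main(String[] args) {
        System.out.println(zeroPad("myFile1.csv"));   // myFile001.csv
        System.out.println(zeroPad("myFile10.csv"));  // myFile010.csv
        System.out.println(zeroPad("myFile600.csv")); // myFile600.csv
    }
}
```

Because the prefix is "000", this only works for numbers of up to three digits. For example, myFile1000.csv would come out as myFile000.csv. An input with no dot at all would make substring throw a StringIndexOutOfBoundsException.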

 

Please note that this is a quick way of parsing, and it assumes the only digits in your file names are the sequence number. If the original file name already contains other digits, you will have to adjust the expression yourself. But I have given you the idea to move forward :-)
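One possible adjustment for names that contain other digits is to pad only the run of digits directly before the extension. The sketch below is an assumption about how that could be done and is not part of the original tMap expression. The class name TrailingNumberPad, the method padTrailingNumber, and the width parameter are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrailingNumberPad {
    // Group 1: prefix (lazy). Group 2: digits right before the extension.
    // Group 3: the extension, including the dot.
    private static final Pattern TRAILING_NUMBER =
            Pattern.compile("^(.*?)(\\d+)(\\.[^.]+)$");

    static String padTrailingNumber(String name, int width) {
        Matcher m = TRAILING_NUMBER.matcher(name);
        if (!m.matches()) {
            return name; // no number before the extension: leave unchanged
        }
        String number = m.group(2);
        StringBuilder padded = new StringBuilder();
        // Add zeros only while the number is shorter than the target width.
        for (int i = number.length(); i < width; i++) {
            padded.append('0');
        }
        padded.append(number);
        return m.group(1) + padded + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(padTrailingNumber("myFile1.csv", 3));
        System.out.println(padTrailingNumber("export2019_part7.csv", 3));
        System.out.println(padTrailingNumber("myFile1000.csv", 3));
    }
}
```

Digits elsewhere in the name are left alone, and numbers already wider than the target width are kept whole instead of being truncated.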

 

If the reply has helped you, could you please mark the topic as resolved? Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi



All Replies
Nine Stars

Re: Zeropad filenames of splitting large file into multiple small files?

Is there a better way than this two step process?

 

[Attached image: Talend_rename.png]

Nine Stars

Re: Zeropad filenames of splitting large file into multiple small files?

 

Thanks!

For future reference, here is my recap of your solution.

 

//PART 1: THE NAME BEFORE THE EXTENSION, WITH ALL DIGITS REMOVED:
row1.myInput.substring(0, row1.myInput.lastIndexOf(".")).replaceAll("[0-9]", "")
//PART 2: PREPEND "000" TO THE DIGITS, THEN DROP AS MANY LEADING CHARACTERS AS THERE ARE DIGITS (LEAVES 3 CHARACTERS):
+ ("000" + row1.myInput.replaceAll("[^0-9]", "")).substring(row1.myInput.replaceAll("[^0-9]", "").length())
//PART 3: THE FILE EXTENSION:
+ row1.myInput.substring(row1.myInput.lastIndexOf("."));
