Zeropad filenames of splitting large file into multiple small files?

Nine Stars

I have a 108GB file that I am splitting into 600 files.

When a delimited output file is split into multiple smaller files, is there a way to rename the generated files, or to zero-pad the number in each filename to a fixed width?

For example, the generated filenames are:

myFile1.csv

myFile2.csv

myFile3.csv

myFile4.csv

myFile5.csv

myFile6.csv

 

Desired output format:

myFile001.csv

myFile002.csv

myFile003.csv

myFile004.csv

myFile005.csv

myFile006.csv


Accepted Solutions
Employee

Re: Zeropad filenames of splitting large file into multiple small files?

Hi,

 

Below is a quick way of renaming the string with zero padding. The same expression works whether one or several zeros need to be added.

[Image: Scenario 1 - myFile1.csv is the input file name]

[Image: Scenario 2 - myFile10 is the input file name]

[Image: tMap]

 

 

 

row1.input.substring(0,row1.input.lastIndexOf(".") ).replaceAll("[0-9]", "")+
("000" + row1.input.replaceAll("[^0-9]", "")).substring(row1.input.replaceAll("[^0-9]", "").length()) + row1.input.substring(row1.input.lastIndexOf(".") ) 

As you can see, I am concatenating three strings with +. The first string is the file name up to the extension with all digits removed. The second is the digits from the file name, left-padded with zeros to three characters. The third is the file extension (.csv).
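For anyone who wants to try the expression outside tMap, here is a minimal sketch that splits it into the same three parts with one variable each. The class and method names (ZeroPadDemo, zeroPad) are made up for illustration, and a plain String parameter stands in for row1.input. It assumes the name contains a dot and that the number has at most three digits.

```java
final class ZeroPadDemo {
    // Mirrors the tMap expression above, one named variable per part.
    // Hypothetical helper: "input" stands in for row1.input.
    static String zeroPad(String input) {
        int dot = input.lastIndexOf(".");
        // Part 1: everything before the extension, with digits removed.
        String prefix = input.substring(0, dot).replaceAll("[0-9]", "");
        // Part 2: the digits only, left-padded with zeros to three characters.
        String digits = input.replaceAll("[^0-9]", "");
        String padded = ("000" + digits).substring(digits.length());
        // Part 3: the extension, including the dot.
        String extension = input.substring(dot);
        return prefix + padded + extension;
    }

    public static void main(String[] args) {
        System.out.println(zeroPad("myFile1.csv"));  // myFile001.csv
        System.out.println(zeroPad("myFile10.csv")); // myFile010.csv
    }
}
```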

 

Please note that this is a quick way of parsing, and it assumes the only digits in the file name are the sequence number. If the original file name already contains other digits, you will have to amend the expression yourself. But I have given you the idea to move forward :-)
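If the base name may already contain digits, one possible amendment is to pad only the number that sits directly before the extension. The sketch below is just one way to do that, not part of the original tMap setup: the names TrailingNumberPadDemo and padTrailingNumber, the width parameter, and the sample name export2019_7.csv are all invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TrailingNumberPadDemo {
    // Group 1: text before the trailing number, group 2: the number, group 3: the extension.
    private static final Pattern TRAILING_NUMBER =
            Pattern.compile("^(.*?)(\\d+)(\\.[^.]*)$");

    // Hypothetical helper: pads only the digits directly before the extension.
    static String padTrailingNumber(String name, int width) {
        Matcher m = TRAILING_NUMBER.matcher(name);
        if (!m.matches()) {
            return name; // no number directly before the extension: leave unchanged
        }
        StringBuilder padded = new StringBuilder();
        // Prepend zeros until the number reaches the requested width.
        for (int i = m.group(2).length(); i < width; i++) {
            padded.append('0');
        }
        padded.append(m.group(2));
        return m.group(1) + padded + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(padTrailingNumber("export2019_7.csv", 3)); // export2019_007.csv
        System.out.println(padTrailingNumber("myFile10.csv", 3));     // myFile010.csv
    }
}
```

Digits elsewhere in the name (2019 in the made-up example) are left alone, and names without a trailing number are returned unchanged.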

 

If the reply has helped you, could you please mark the topic as resolved? Kudos are also welcome :-)

 

Warm Regards,

 

Nikhil Thampi


All Replies
Nine Stars

Re: Zeropad filenames of splitting large file into multiple small files?

Is there a better way than this two-step process?

 

[Image: Talend_rename.png]

Nine Stars

Re: Zeropad filenames of splitting large file into multiple small files?

 

Thanks!

For future reference, here is my recap of your solution.

 

//REMOVE ALL NUMBERS IN FILENAME:
row1.myInput.substring(0,row1.myInput.lastIndexOf(".") ).replaceAll("[0-9]", "")
//ADD ZEROS FOR PADDING:
+("000"
//COMBINE THE FILE NUMBERS WITH THE ZEROS:
+ row1.myInput.replaceAll("[^0-9]", "")).substring(row1.myInput.replaceAll("[^0-9]", "").length())
//ADD THE FILE EXTENSION:
+ row1.myInput.substring(row1.myInput.lastIndexOf(".") );
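One caveat with the substring trick: it keeps only three characters, so a four-digit number such as 1000 would come out as 000. A possible alternative, sketched here under the same assumptions (the name contains a dot and at least one digit, and the only digits are the file number), is to let String.format do the padding. FormatPadDemo and zeroPad are hypothetical names, and a plain String parameter stands in for row1.myInput.

```java
import java.util.Locale;

final class FormatPadDemo {
    // Hypothetical helper: "name" stands in for row1.myInput.
    static String zeroPad(String name) {
        int dot = name.lastIndexOf('.');
        // File name without the extension and without digits.
        String prefix = name.substring(0, dot).replaceAll("[0-9]", "");
        // The file number; throws NumberFormatException if the name has no digits.
        int number = Integer.parseInt(name.replaceAll("[^0-9]", ""));
        // %03d pads to at least three digits and never truncates longer numbers.
        return prefix + String.format(Locale.ROOT, "%03d", number) + name.substring(dot);
    }

    public static void main(String[] args) {
        System.out.println(zeroPad("myFile7.csv"));    // myFile007.csv
        System.out.println(zeroPad("myFile1000.csv")); // myFile1000.csv
    }
}
```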
