I need to read a txt.gz file, but I don't want to extract it because I only need the first row (the header) and the number of lines in the file.
Any help will be appreciated.
But how do you plan to count the rows without decompressing the file? Even if it happens in memory, it is still decompression.
If you are on a Linux machine, check whether you have the zgrep/gunzip commands.
If so, you could use tSystem with zgrep/gunzip.
If you are on a Linux machine, check whether you have the zipgrep command.
If so, you could use tSystem with zipgrep.
For grep (pattern search), yes.
But for row counts?
In any case, zipgrep is a shell script and requires egrep(1) and unzip(1L) to function.
It's not a "magic button": it uncompresses the file and then does the job :-)
Thanks for responding. Actually, I was wondering whether Talend has an existing component that can do something similar to what is possible in Unix.
zcat file.txt.gz | wc -l
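The one-liner above gives the line count. To get the header as well, the decompressed stream can be piped through awk once. This is only a rough sketch: the function name gz_header_and_count is made up for illustration, and it assumes gzip and awk are available on the machine. Nothing is written to disk, but the data is still decompressed as it streams through the pipe.

```shell
# Hypothetical helper (name is an assumption, not a Talend or Unix built-in):
# prints the first line and the total line count of a gzip-compressed text file.
gz_header_and_count() {
    # gzip -dc decompresses to stdout, like zcat on most Linux systems
    gzip -dc "$1" | awk '
        NR == 1 { print "header: " $0 }   # first row of the file
        END     { print "lines: " NR }    # NR holds the number of lines read
    '
}
```

It could be called as gz_header_and_count file.txt.gz, for example from a tSystem component, with the two output lines parsed afterwards.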