how to archive a file which is 90 days older than the run date passed during job run

Six Stars

Hi,

I will pass a run date (format: yyyymmdd) when the job runs, and I have to archive the files in a folder that are more than 90 days older than the run date I passed.

 

The files in the folder are of the format

 

abcdefgh.Dyyyymmdd. 

How can this be achieved?

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

I assume the date passed to the job resides in a context variable named "inDate" with data type String.

First you need to compute the value of "context.inDate - 90 days".

You can do that within a tSetGlobalVar component by defining a global variable named "referenceDate" like this:

global.png

Then you need to get files in the source folder using tFileList.

Connect the tFileList component to another tSetGlobalVar using the "Iterate" row.

In this tSetGlobalVar component, extract the date part from the filename of each file:

global.png

Connect tFileCopy to this tSetGlobalVar using an "If" trigger with the following expression:

((String)globalMap.get("referenceDate")).compareTo((String)globalMap.get("fileDate")) <= 0

Finally, configure the tFileCopy component as needed to archive the current file:

copy.png

The current file will be moved to the folder designated by the "archiveFolder" context variable.


TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

pnt.png (job design)

Hi,

 

Should the job be designed as shown above, or should there be 2 different tFileList components?

When I ran the above job I got the following error for tSetGlobalVar_2:

Exception in thread "main" java.lang.Error: Unresolved compilation problem:
The method replaceALL(String, String) is undefined for the type String.

I am not sure whether I have designed it wrong.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Change replaceALL to replaceAll (Java method names are case-sensitive).

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run


I have changed it as suggested and ran the job, but now it throws the error below:
tSetGlobalVar_2 null
[statistics] disconnected
Exception in component tSetGlobalVar_2
java.lang.NullPointerException

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Share the component settings

TRF
Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Here is what your job should look like:

job.png

tSetGlobalVar_1 to define the "reference date" based on the inDate context variable with the following expression:

TalendDate.addDate(context.inDate, "yyyyMMdd", -90, "dd")
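For readers who want to check the result outside Talend, here is a minimal plain-Java sketch of the same "run date minus 90 days" computation. It uses java.time instead of TalendDate; the class name ReferenceDate, the helper minusDays and the sample run date 20191010 are assumptions made for illustration only.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class ReferenceDate {
    // Same yyyyMMdd pattern as the one used for context.inDate in this thread.
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Hypothetical helper: returns runDate minus the given number of days, formatted as yyyyMMdd.
    static String minusDays(String runDate, int days) {
        return LocalDate.parse(runDate, FMT).minusDays(days).format(FMT);
    }

    public static void main(String[] args) {
        // Sample run date chosen for illustration.
        System.out.println(minusDays("20191010", 90)); // prints 20190712
    }
}
```

The value printed here is what the "referenceDate" global variable would be expected to hold for that run date.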

tFileList_2 to get file list (no comment)

tSetGlobalVar_3 to compute the "file date" based on the current filename with the following expression:

((String)globalMap.get("tFileList_2_CURRENT_FILE")).replaceAll("^.*\\.D", "")

If trigger is defined with the following expression:

((String)globalMap.get("referenceDate")).compareTo((String)globalMap.get("fileDate")) <= 0

Note: we can use String.compareTo() because of the date format (yyyyMMdd).
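To illustrate why the format matters, here is a small sketch. With fixed-width yyyyMMdd strings, the lexicographic order is the same as the chronological order, so referenceDate.compareTo(fileDate) is positive exactly when the file date is earlier than the reference date. The class and method names are hypothetical, and the ddMMyyyy case is included only as a counterexample.

```java
class DateStringCompare {
    // True when fileDate is strictly earlier than referenceDate.
    // Only reliable when both strings use the fixed-width yyyyMMdd format.
    static boolean isOlder(String referenceDate, String fileDate) {
        return referenceDate.compareTo(fileDate) > 0;
    }

    public static void main(String[] args) {
        // yyyyMMdd: 15 Dec 2018 is correctly seen as older than 1 Jan 2019.
        System.out.println(isOlder("20190101", "20181215")); // prints true
        // ddMMyyyy: the same two dates compare the wrong way round.
        System.out.println(isOlder("01012019", "15122018")); // prints false
    }
}
```

With any other date layout, parsing both values into real dates before comparing would be the safer choice.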

 

tFileCopy to move the current file into the archived folder:

copy.png


TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

Attaching the job, component by component.

Now the job runs successfully, but no file is copied to the archive path (there are files more than 90 days old in the source location).

 

I don't understand what has gone wrong.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Do you have some files successfully archived?
Can you share the filenames of the files not archived?

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

No files got archived, even though there were files matching the criteria (more than 90 days old).

Attaching the list of files from the path (for testing purposes I created the files with touch, which is why they all show up as 0 KB).

 

Thanks.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

No problem with empty files.
Probably something is wrong elsewhere in your job (context variable, global variable and so on). Double-check everything or share the whole job design with the component settings.

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

Sharing the whole job design along with individual component setting as well as the context variables.

 

Thanks

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

The problem is probably due to filenames.
In your initial post you mentioned filenames like "ABCDEFGH.DyyyyMMdd", but in fact there seems to be an extension (based on your tFileList filemask, which is "ABCDEFH.D*.gz").
So, replace the tSetGlobalVar_2 expression with the following to remove the file extension:

(((String)globalMap.get("tFileList_2_CURRENT_FILE")).replaceAll("^.*\\.D", "")).replaceAll("\\.gz$", "")

Start with this correction, but a more generic solution could be:

(((String)globalMap.get("tFileList_2_CURRENT_FILE")).replaceAll("^.*\\.D", "")).replaceAll("\\." + (String)globalMap.get("tFileList_2_CURRENT_FILEEXTENSION) + "$" , "")

In this case, if you change the extension or filemask in the tFileList component, you don't have to change anything in the tSetGlobalVar expression.

 


TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

Thanks for your reply, and sorry for the confusion on my end.

 

Actually I want to archive files whose extension is gz (which is why I included "ABCDEFH.D*.gz" in the filemask), for example a file named ABCDEFGH.DyyyyMMdd.gz.

 

 

Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,
The job has only 1 tFileList component.
The flow is like this:
tSetGlobalVar_1 -> OnSubjobOk -> tFileList -> Iterate -> tSetGlobalVar_2 -> If -> tFileCopy.

Please let me know if this is correct. If another tFileList has to be included, please let me know, as I couldn't understand that part.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

2 things you could have found by yourself:

- for the "If" expression, replace "<= 0" with "> 0" (a file must be archived when its date is earlier than the reference date)

- if you decide to use the generic solution for tSetGlobalVar_2, the correct syntax is:

(((String)globalMap.get("tFileList_2_CURRENT_FILE")).replaceAll("^.*\\.D", "")).replaceAll("\\." + (String)globalMap.get("tFileList_2_CURRENT_FILEEXTENSION") + "$" , "")

It works (tested).
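For reference, the whole flow (reference date, file listing, date extraction, "> 0" test, move) can be sketched in plain Java outside Talend. This is only an illustration under stated assumptions: the class ArchiveOldFiles, its methods, the glob "*.D*.gz", the temporary folders and the sample filenames are all hypothetical, and it does not reproduce what the Talend components generate.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

class ArchiveOldFiles {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMdd");

    // Moves files named <prefix>.DyyyyMMdd.gz whose date is earlier than runDate minus 90 days.
    // Returns the names of the moved files.
    static List<String> archive(Path sourceDir, Path archiveDir, String runDate) throws IOException {
        String referenceDate = LocalDate.parse(runDate, FMT).minusDays(90).format(FMT);

        // Collect the candidates first so that nothing is moved while the listing is still open.
        List<Path> candidates = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sourceDir, "*.D*.gz")) {
            for (Path file : files) {
                candidates.add(file);
            }
        }

        List<String> moved = new ArrayList<>();
        for (Path file : candidates) {
            String name = file.getFileName().toString();
            // Same two regex steps as in the thread: strip up to ".D", then strip ".gz".
            String fileDate = name.replaceAll("^.*\\.D", "").replaceAll("\\.gz$", "");
            if (referenceDate.compareTo(fileDate) > 0) {
                Files.move(file, archiveDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
                moved.add(name);
            }
        }
        return moved;
    }

    // Self-contained demo on temporary folders with two empty sample files.
    static List<String> demo() {
        try {
            Path src = Files.createTempDirectory("src");
            Path arc = Files.createTempDirectory("arc");
            Files.createFile(src.resolve("ABCDEFGH.D20180511.gz"));
            Files.createFile(src.resolve("ABCDEFGH.D20191001.gz"));
            return archive(src, arc, "20191010");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [ABCDEFGH.D20180511.gz]
    }
}
```

With the sample run date 20191010 the reference date is 20190712, so only the file dated 20180511 is moved and the one dated 20191001 stays in place.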

 


TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

Sorry, I am new to this. I don't understand what the new tSetGlobalVar_2 will do.

 

Why is there a "$" when I only have to remove the .gz extension from my incoming file?

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

tSetGlobalVar_2 is responsible for extracting the date from the current filename. It uses 2 regexes:
1- to remove everything from the beginning up to and including ".D"
2- to remove the extension ("$" anchors the match at the end of the filename, in case the same string also appears elsewhere in the filename)
So, if you have something like "ABCDEF.D20180511.gz" the result will be "20180511".
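The two steps can be tried in isolation with a small sketch like the one below. The class FileDateExtractor and the helper extractDate are hypothetical names, and Pattern.quote is an addition not present in the original expression; it only makes sure the extension is matched literally.

```java
import java.util.regex.Pattern;

class FileDateExtractor {
    // Hypothetical helper mirroring the two replaceAll calls from the thread.
    static String extractDate(String fileName, String extension) {
        return fileName
                // Step 1: remove everything from the start up to and including ".D".
                .replaceAll("^.*\\.D", "")
                // Step 2: remove "." + extension, but only at the very end ("$").
                .replaceAll("\\." + Pattern.quote(extension) + "$", "");
    }

    public static void main(String[] args) {
        System.out.println(extractDate("ABCDEF.D20180511.gz", "gz")); // prints 20180511
    }
}
```

Without the "$", the second pattern could also remove a ".gz" that appears in the middle of a name, which is the situation the anchor guards against.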

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

Thanks for helping me understand.

 

In tFileList_2 I set the directory to context.sourcefilepath and the filemask to context.filename.

 

Is this where it goes wrong?

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

If the value of context.sourcefilepath is still /data/Archive/ and the files are in this folder, it's ok for this one.
If the value of context.filename is still ABCDEFGH.D, change it to ABCDEFGH.D.*.gz or use context.filename + ".*.gz" for the tFileList file mask.
Also, please tell me what goes wrong (as I gave you a working solution).

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

I did a test run of all components except tFileList (by passing input through tFixedFlowInput, computing the refdate value in a tMap and writing it to tLogRow) to check where I have gone wrong.

 

But I don't know how to test tFileList on its own.

 

Sorry for this, and thanks for your time.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

You need to create input files yourself and connect a tJava to tFileList to print the global variables associated with this component.
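As a rough idea of what such a tJava check could print, here is a standalone sketch. Inside Talend, globalMap is supplied by the generated job code; here it is simulated with a HashMap so the snippet runs on its own. The keys follow the "<component name>_CURRENT_FILE" pattern used in this thread, and the class FileListDebug and helper describe are hypothetical. A null value for a key is one possible explanation for a NullPointerException when the result is cast and used directly, for instance if the component name in the key does not match the actual tFileList component; that cause is an assumption, not something confirmed in this thread.

```java
import java.util.HashMap;
import java.util.Map;

class FileListDebug {
    // Formats one global variable and flags a missing key instead of failing on it.
    static String describe(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return key + " = " + (value == null ? "<null: check the component name in the key>" : value);
    }

    public static void main(String[] args) {
        // Simulated globalMap; in a real job these entries are set by tFileList.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_2_CURRENT_FILE", "ABCDEFGH.D20180511.gz");
        globalMap.put("tFileList_2_CURRENT_FILEEXTENSION", "gz");

        System.out.println(describe(globalMap, "tFileList_2_CURRENT_FILE"));
        System.out.println(describe(globalMap, "tFileList_2_CURRENT_FILEEXTENSION"));
        // Key built from a component name that is not present in the map.
        System.out.println(describe(globalMap, "tFileList_1_CURRENT_FILE"));
    }
}
```

In an actual tJava component only the println lines would be needed, since globalMap already exists there.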
Could be time to mark your case as solved...

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Hi,

 

As mentioned earlier, the folder already contains files that are more than 90 days older than the run date passed as a context variable. I connected tFileList to tJava to see the output of the former and found that, with context.filename+"*.gz" as the filemask in the tFileList component, I got this output:
/data/Archive/ (sourcepath)
20190712 (the date that is 90 days before the context run date).

But my tSetGlobalVar_2 still throws a null pointer exception. :(

I couldn't figure it out.

Also, this gives me only 1 particular file date, whereas my requirement is to archive all the files that are more than 90 days old.

 

Any help would be appreciated.

 

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Export your job (export element) with dependencies and send it to me

TRF
Six Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

Attaching the exported job.

Fifteen Stars TRF
Fifteen Stars

Re: how to archive a file which is 90 days older than the run date passed during job run

something is missing...

TRF
