Auditing and Preprocessing of File

Eight Stars

Hi All,

 

Hope everyone is doing fine :)

On Friday I got a requirement that mainly concerns preprocessing and auditing of files. It is as follows:

1. Suppose a file ABC.txt arrives with 2 columns, like below:

ID|NAME

1|ABHIJIT

2|ABHIJIT1

Now suppose that one week the file arrives with an extra column, e.g. ID|NAME|ADDRESS. This should trigger a notification by mail. The ADDRESS column may appear at the end or in the middle. How can I achieve this?

2. I am processing files through the tFileList component. Now suppose a file arrives as ABC.zip. How can I get a notification that a file came with an invalid extension (the valid ones are .csv and .txt)? I have set the file mask to ABC.*, so the file is picked up irrespective of whether it is .csv or .txt.

3. If any special character arrives, how can I get a notification? If the value can still be inserted, that is fine.

Any help is much appreciated.

@rhall_2_0 @TRF @vboppudi @manodwhb @vapukov @alevy


Accepted Solutions
Thirteen Stars

Re: Auditing and Preprocessing of File

@abhi90,

 

For point 1:

If you enable "Check each row structure against schema" in the Advanced settings of tFileInputDelimited and capture the rejects from tFileInputDelimited, the rows that do not match the schema will go to the reject file.

 

[Screenshot: Untitled.png]

Edited:

[Screenshot: Untitled.png]
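To illustrate the idea behind a row structure check, here is a minimal plain-Java sketch. It is not Talend's implementation, and the class and method names (RowStructureAudit, badRows) are hypothetical. It assumes that a row is structurally valid when splitting it on the expected delimiter yields exactly the expected number of fields, so it flags both an extra column and a row that arrives with the wrong delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: flags rows whose field count is off.
class RowStructureAudit {

    // Returns the 1-based line numbers whose field count differs from expectedFields.
    static List<Integer> badRows(List<String> lines, String delimiter, int expectedFields) {
        List<Integer> bad = new ArrayList<>();
        // Quote the delimiter so that "|" is matched literally, not as regex alternation.
        String regex = Pattern.quote(delimiter);
        for (int i = 0; i < lines.size(); i++) {
            // Limit -1 keeps trailing empty fields, so "1|ABHIJIT|" counts as 3 fields.
            int fields = lines.get(i).split(regex, -1).length;
            if (fields != expectedFields) {
                bad.add(i + 1);
            }
        }
        return bad;
    }
}
```

With "|" as the delimiter and 2 expected fields, a row such as 2|ABHIJIT1|X is flagged because it has 3 fields, and a row such as 3,ABC is flagged because it contains no "|" and therefore has only 1 field.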

Manohar B
Thirteen Stars

Re: Auditing and Preprocessing of File

@abhi90, for point 2:

You can check whether the filename has the extension .csv or .txt in the way shown below.

 

[Screenshots: Untitled.png, Untitled.png]
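As a rough plain-Java sketch of such an extension check (not necessarily what the screenshots show): the class and method names (ExtensionAudit, hasValidExtension) are hypothetical, and the list of valid extensions is taken from the question. In a job, the file name would typically come from the tFileList global variable, e.g. ((String)globalMap.get("tFileList_1_CURRENT_FILE")), where the component name tFileList_1 is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Talend: validates a file name's extension.
class ExtensionAudit {

    // Valid extensions as stated in the question (.csv and .txt).
    static final List<String> VALID = Arrays.asList("csv", "txt");

    // Returns true only if the name ends in one of the valid extensions (case-insensitive).
    static boolean hasValidExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // No dot at all, or a dot as the last character, means there is no extension.
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return VALID.contains(extension);
    }
}
```

The negated result could then drive a "Run if" link towards the mail notification, while valid files continue down the normal processing path.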

Manohar B
Thirteen Stars

Re: Auditing and Preprocessing of File

@abhi90, you should design the job in the way shown below.

[Screenshot: Untitled.png]

Manohar B

All Replies
Seven Stars

Re: Auditing and Preprocessing of File

@abhi90, there is a component, validateschema, with which you can validate the data. If the column names are part of the file, you can check the first line yourself and cross-verify it against the schema. For example, you can store the schema (the column names, comma-separated) in a database, fetch that column list, and compare it with the column list in the input file. When you find a mismatch, you can send an email with the email component.
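The header cross-check could look roughly like the following plain-Java sketch, which could be adapted to a tJava component or a routine. The class and method names (HeaderAudit, unexpectedColumns) are hypothetical, and the expected column list is passed in directly here, whereas in practice it would be loaded from wherever the schema is stored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: compares a header line with the expected columns.
class HeaderAudit {

    // Returns the columns found in the header that are not in the expected schema,
    // regardless of whether they appear in the middle or at the end.
    static List<String> unexpectedColumns(String headerLine, String delimiter, List<String> expected) {
        List<String> extras = new ArrayList<>();
        // Quote the delimiter so that "|" is matched literally, not as regex alternation.
        for (String column : headerLine.split(Pattern.quote(delimiter), -1)) {
            String name = column.trim();
            if (!expected.contains(name)) {
                extras.add(name);
            }
        }
        return extras;
    }
}
```

For the example in the question, an expected schema of ID and NAME with the header ID|NAME|ADDRESS or ID|ADDRESS|NAME yields ADDRESS in both cases. A non-empty result is the trigger for the notification mail.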

 

Once you get the file from tFileList, take the file name and validate its extension against the allowed ones.

 

Hope this solves your problem. If so, please mark the thread as solved.

Thirteen Stars

Re: Auditing and Preprocessing of File

@abhi90, check the link below for tSchemaComplianceCheck.

 

https://help.talend.com/reader/g8zdjVE7fWNUh3u4ztO6Dw/dmSJyJcYe2p5cHIsy22nnw

Manohar B
Eight Stars

Re: Auditing and Preprocessing of File

Hi @mailforsaggy,

Thanks for your feedback. Can you please tell me how to get that column list and cross-verify it against the input column list? Also, can you tell me how to validate the delimiter against my baseline? Suppose my delimiter should be "|" but it suddenly arrives as ",". Likewise, if 6 columns are expected, there should be 5 delimiters per row. If one extra delimiter arrives, how do I check for that as well?

Hi @manodwhb,

I have checked tSchemaComplianceCheck, but that covers data accuracy checks such as nullability and length. If anyone has done this, any help is much appreciated.

@rhall_2_0 @vapukov @cterenzi @shong

Eight Stars

Re: Auditing and Preprocessing of File

Hi @manodwhb,

Thanks for your solution. I am doing it your way. I have accepted the solution for now.

Thirteen Stars

Re: Auditing and Preprocessing of File

@abhi90, please give Kudos as well :)

Manohar B