[resolved] tFileList filemask regex

Hi All,
In my job I am trying to delete all files from a directory that are not *.xls or *.xlsx. To do this I'm using tFileList -> tFileDelete.
The tFileList has 'exclude filemask' set to "*.xls*" and the 'Use Glob Expressions as Filemask' box checked. This config works fine when I run the job from TOS on my machine (Windows), but when I run a script generated from the job on Unix, it fails with the following error:
'java.lang.NoClassDefFoundError: org/apache/oro/text/GlobCompiler'
I tried switching to Perl5 regex expressions ('Use Glob Expressions as Filemask' box unchecked) with ".*\.xls.*" as the exclude filemask, but that fails with a different error:
'Unresolved compilation problem: Invalid escape sequence (valid ones are \b \t \n \f \r \" \' \\ )'
Any advice?
Thank you in advance
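
For reference, the filter described above (keep *.xls and *.xlsx, select everything else) can be sketched in plain Java outside Talend. This is only an illustration: it uses the JDK's own glob matcher rather than the one tFileList uses, and the class and method names are made up for the example.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Talend: picks the file names that
// an exclude mask of "*.xls*" would leave over for deletion.
public class NonExcelFilter {

    // JDK glob matcher (assumption: close enough to the Talend glob for this mask)
    private static final PathMatcher EXCEL =
            FileSystems.getDefault().getPathMatcher("glob:*.xls*");

    static List<String> nonExcel(List<String> fileNames) {
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            // Keep only names that do NOT match the Excel glob
            if (!EXCEL.matches(Paths.get(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("report.xls", "report.xlsx", "notes.txt", "data.csv");
        System.out.println(nonExcel(names)); // prints [notes.txt, data.csv]
    }
}
```

Note that "*.xls*" also matches names such as "old.xls.bak", so those would be kept as well.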

Accepted Solutions

Re: [resolved] tFileList filemask regex

Finally resolved!
The real problem was the first error message (the missing GlobCompiler class). The "*.xls*" filemask is fine.
The job worked on Windows because TOS is installed there and it bundles all the standard Talend libraries. It failed on Unix because TOS is not installed there at all; I only run the Java scripts generated by Talend, so the libraries were missing.
My solution was:
1. The missing GlobCompiler class, which is required for glob expressions, is in the jakarta-oro-2.0.8.jar library, so I added that jar to the user libraries in the Talend preferences (Preferences -> Java -> Build Path -> User Libraries). Once it is added there, the library is copied into the 'lib' folder that Talend creates when it generates the Java scripts.
2. The jar containing GlobCompiler also needs to be loaded into the job using a tLibraryLoad component.
Now everything works.
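
If anyone wants to confirm the same problem before changing the job, a small check like the one below may help. It is only a sketch: the class name ClasspathCheck is made up, and it has to be run with the same classpath that the generated launch script uses, which depends on your export.

```java
// Hypothetical diagnostic, not generated by Talend: reports whether a
// class can be found on the classpath this program was started with.
public class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // 'false' = look the class up without running its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class named in the NoClassDefFoundError from the original post
        String name = "org.apache.oro.text.GlobCompiler";
        System.out.println(name + (isOnClasspath(name) ? " found" : " missing"));
    }
}
```

If it reports "missing" on the Unix machine, the jakarta-oro jar is not in the exported 'lib' folder or not on the script's classpath.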

All Replies
Community Manager

Re: [resolved] tFileList filemask regex

Hi
Which version of TOS are you using? The second error is a Java compilation error: the filemask is a Java string, so the backslash itself has to be escaped. Try ".*\\.xls.*"
Best regards
Shong

Re: [resolved] tFileList filemask regex

Also keep in mind that regular expressions are a bit trickier in Java because the expression is interpreted twice.
For example, to match a single backslash with a Java regex, the expression must be "double-escaped": first the Java compiler processes the string literal, then the regex engine parses the resulting string:
"\\\\" -- string literal compiled --> \\ -- regex string parsed --> \ (one literal backslash)
The error you received is most likely due to an unescaped backslash in a sequence such as "\w+" instead of "\\w+".
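
A quick way to see the double escaping in action is a standalone test like this. It is a minimal sketch that uses java.util.regex from the JDK as a stand-in (an assumption; tFileList may use a different regex engine internally), and the class name is made up.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: shows how a Java string literal turns into a regex.
public class FilemaskEscapeDemo {

    // The source literal ".*\\.xls.*" becomes the regex .*\.xls.* at runtime.
    // Writing ".*\.xls.*" here would not compile ("Invalid escape sequence").
    static final Pattern XLS = Pattern.compile(".*\\.xls.*");

    static boolean isExcel(String fileName) {
        return XLS.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        System.out.println(isExcel("report.xls"));  // true
        System.out.println(isExcel("report.xlsx")); // true
        System.out.println(isExcel("notes.txt"));   // false
        // Four backslashes in source -> regex \\ -> matches one backslash character
        System.out.println(Pattern.matches("\\\\", "\\")); // true
    }
}
```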