One Star

Small NLS_LANG issue

Could somebody explain, why do we have the following lines of code in generated perl script for Talend/Oracle connection? I don't want to set any NLS during connection, I want to use the default one. And even if I would like one - why do we only have 2 types of NLS lines here? We don't use UTF8 or ISO8859 in our environment at all. I have to manually comment it out every time I run the script on my linux box, or have to override the NLS settings to something else, like 'NO_NLS' in open studio. Can we have an option in the environment to skip NLS by default and not worry about changing any NLS settings? If I'll need to change NLS - I'll just change my environment variable "NLS_LANG" before running my Talend scripts
if (XXXXXXX =~ m/^utf-?8$/i) {
$ENV{NLS_LANG} = 'AMERICAN_AMERICA.AL32UTF8';
}
elsif (XXXXXXXXX =~ m/^iso-?8859-?1$/i) {
$ENV{NLS_LANG} = 'AMERICAN_AMERICA.WE8ISO8859P1';
}
Tags (1)
6 REPLIES
Employee

Re: Small NLS_LANG issue

Could somebody explain, why do we have the following lines of code in generated perl script for Talend/Oracle connection?

I think you timson have understood the purpose of these Perl instructions, but for other readers : it's about telling the database server what is the encoding of the characters sent by the client.
I don't want to set any NLS during connection, I want to use the default one. I'll just change my environment variable "NLS_LANG" before running my Talend scripts

All right, I understand your vision :-) You may also understand that "manually" setting the NLS_LANG environment variable is not as user friendly as setting the encoding at the Talend Open Studio job connection level.
What would you think of :
if (not defined $ENV{NLS_LANG}) {
# set the NLS_LANG environment variable depending on the encoding
# the user has chosen in the GUI
}

In addition to this it would be useful to make obvious the fact that even if the user has chosen an encoding, it may not be used if the NLS_LANG is already set, so a checkbox "override any existing NLS_LANG environment variable" in the "Advanced settings" tab, uncheked by default.
Your opinion on this?
One Star

Re: Small NLS_LANG issue

This is a great idea! I already like it Smiley Happy
But... I have a couple of concerns...
First. What if I want to skip the NLS_LANG definition? For example, the script might work with NLS_LANG set to some value on some development environment, but fail in the production just because NLS_LANG is incorrect. In 90% of my applications I just want to leave it undefined and don't bother setting it up.
So this is how I envision the ideal scenario:
1) If NLS_LANG is defined in the Environment - don't touch it
2) By default it should be undefined in open studio as well as in generated script and should remain undefined UNLESS I explicitly want to change it in Open Studio.
3) So if it's undefined in the environment and in the script - don't do anything
4) If it's defined in Open Studio AND in the environment - I would prefer the script variable to take precedence. Well... I am not 100% sure about that, we can discuss it.
Another big concern about this NLS_LANG code:
let's say - I want to change the value of the NLS_LANG variable to: FRENCH_FRANCE.UTF8 and manually set it as a custom value
The following code
if (XXXXXXX =~ m/^utf-?8$/i) {
$ENV{NLS_LANG} = 'AMERICAN_AMERICA.AL32UTF8';
}
elsif (XXXXXXXXX =~ m/^iso-?8859-?1$/i) {
$ENV{NLS_LANG} = 'AMERICAN_AMERICA.WE8ISO8859P1';
}
will match it with the 1st regex (because we have UTF8 in our line) and actually convert the value to 'AMERICAN_AMERICA.AL32UTF8'! that's why I asked you what is the purpose of the code - why can't we just leave the value as it is defined by the user in Open Studio.
Let me know if I am thinking in the right direction, or maybe I am missing something here? I really appreciate your answers!
Here is the document that helped me to go through the NLS_LANG issues:
http://www.oracle.com/technology/tech/globalization/htdocs/nls_lang%20faq.htm#_Toc110410546
Employee

Re: Small NLS_LANG issue

let's say - I want to change the value of the NLS_LANG variable to: FRENCH_FRANCE.UTF8 and manually set it as a custom value

In that case, you use a tSetEnv with 'NLS_LANG' as key and 'FRENCH_FRANCE.UTF8' as value. Then you disable in the tOracle* the "override any existing NLS_LANG environment variable" option, so that you're sure that anywhat the Encoding is set to, it won't update the NLS_LANG environment variable.
What's "complex" with Oracle is that NLS_LANG does not only set the encoding, but also the language and the territory. Thanks to your link on Oracle documentation, I've understood that I could set the NLS_LAND to ".AL32UTF8", language and territory will be set based on NLS_LANGUAGE and NLS_TERRITORY. Maybe I should skip the language and territory information.
You're also right that what a user set as custom encoding should be used "as is" in the $ENV{NLS_LANG}.
Summary of proposed changes:
- don't set language/territory in $ENV{NLS_LANG}
- option "override any existing NLS_LANG environment variable", unchecked by default
- when the encoding is "CUSTOM", then it's use "as is" to set $ENV{NLS_LANG}
One Star

Re: Small NLS_LANG issue

let's say - I want to change the value of the NLS_LANG variable to: FRENCH_FRANCE.UTF8 and manually set it as a custom value

In that case, you use a tSetEnv with 'NLS_LANG' as key and 'FRENCH_FRANCE.UTF8' as value. Then you disable in the tOracle* the "override any existing NLS_LANG environment variable" option, so that you're sure that anywhat the Encoding is set to, it won't update the NLS_LANG environment variable.
What's "complex" with Oracle is that NLS_LANG does not only set the encoding, but also the language and the territory. Thanks to your link on Oracle documentation, I've understood that I could set the NLS_LAND to ".AL32UTF8", language and territory will be set based on NLS_LANGUAGE and NLS_TERRITORY. Maybe I should skip the language and territory information.
You're also right that what a user set as custom encoding should be used "as is" in the $ENV{NLS_LANG}.
Summary of proposed changes:
- don't set language/territory in $ENV{NLS_LANG}
- option "override any existing NLS_LANG environment variable", unchecked by default
- when the encoding is "CUSTOM", then it's use "as is" to set $ENV{NLS_LANG}
Thank you! great summary - you nailed it Smiley Happy
Employee

Re: Small NLS_LANG issue

I have reported 4927
One Star

Re: Small NLS_LANG issue

Could somebody explain, why do we have the following lines of code in generated perl script for Talend/Oracle connection?

I think you timson have understood the purpose of these Perl instructions, but for other readers : it's about telling the database server what is the encoding of the characters sent by the client.
I don't want to set any NLS during connection, I want to use the default one. I'll just change my environment variable "NLS_LANG" before running my Talend scripts

All right, I understand your vision :-) You may also understand that "manually" setting the NLS_LANG environment variable is not as user friendly as setting the encoding at the Talend Open Studio job connection level.
What would you think of :
if (not defined $ENV{NLS_LANG}) {
# set the NLS_LANG environment variable depending on the encoding
# the user has chosen in the GUI
}

In addition to this it would be useful to make obvious the fact that even if the user has chosen an encoding, it may not be used if the NLS_LANG is already set, so a checkbox "override any existing NLS_LANG environment variable" in the "Advanced settings" tab, uncheked by default.
Your opinion on this?
thank you!