Five Stars ami

I am facing the following issues while loading UTF8 files.

Hi guys,
Unable to pass the character set encoding to TPT: To load UTF8 data into a Teradata table, we need an attribute/option in the tTeradataTPTUtility or tTeradataTPTExec job that produces "USING CHARACTER SET UTF8" in the generated script. There is an option under "Advanced Settings → Define Character set", which Talend uses to execute tbuild with the -e flag. However, it is of no use to us, because it specifies the character set encoding of the script file, i.e. whether the script file itself is written in UTF8 or ASCII. What we need is the Teradata session character set encoding, so that we can load Arabic data into a Teradata table with Unicode columns.
3 REPLIES
Moderator

Re: I am facing the following issues while loading UTF8 files.

Hi,
Have you tried adding "-Dfile.encoding=utf-8" to the JVM parameters to see if it works?
Best regards
Sabrina
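
To confirm that the JVM parameter actually took effect, a small check like the one below could be run with the same JVM arguments as the job. This is only a sketch: the class name `JvmEncodingCheck` is made up for illustration. Note that this setting governs the JVM's default text encoding; the Teradata session character set is a separate setting, controlled in the TPT script itself.

```java
import java.nio.charset.Charset;

public class JvmEncodingCheck {
    /** Reports the file.encoding property and the JVM's effective default charset. */
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run with the same JVM arguments as the job, e.g. -Dfile.encoding=utf-8
        System.out.println(describe());
    }
}
```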
Five Stars ami
Hello,
I tried, but the issue persists.
Five Stars ami
The following is the TPT script generated by Talend:
DEFINE JOB Job_tTeradataTPTUtility_1_tTPTInput
  (
  DEFINE OPERATOR Operator_tTeradataTPTUtility_1_tTPTInput
  TYPE LOAD
  SCHEMA *
  ATTRIBUTES
  (
  VARCHAR UserName, 
  VARCHAR UserPassword, 
  VARCHAR LogTable, 
  VARCHAR TargetTable, 
  INTEGER BufferSize, 
  VARCHAR DataEncryption, 
  INTEGER ErrorLimit, 
  INTEGER MaxSessions, 
  INTEGER MinSessions, 
  INTEGER TenacityHours, 
  INTEGER TenacitySleep, 
  VARCHAR AccountId, 
  VARCHAR DateForm, 
  VARCHAR ErrorTable1, 
  VARCHAR ErrorTable2, 
  VARCHAR LogSQL, 
  VARCHAR LogonMech, 
  VARCHAR LogonMechData, 
  VARCHAR NotifyExit, 
  VARCHAR NotifyExitIsDLL, 
  VARCHAR NotifyLevel, 
  VARCHAR NotifyMethod, 
  VARCHAR NotifyString, 
  VARCHAR PauseAcq, 
  VARCHAR PrivateLogName,
  VARCHAR QueryBandSessInfo,
  VARCHAR TdpId, 
  VARCHAR ARRAY TraceLevel, 
  VARCHAR WildcardInsert, 
  VARCHAR WorkingDatabase
  );
  
  DEFINE SCHEMA Schema_tTeradataTPTUtility_1_tTPTInput
  (
               sCity VARCHAR(50),
               CityArabic VARCHAR(50),
               sZone VARCHAR(50),
               Region VARCHAR(50),
               Province VARCHAR(50),
               CourierId VARCHAR(50),
               LAST_UPDATE_DATE VARCHAR(10),
               BATCH_ID VARCHAR(10),
               FILE_ID VARCHAR(10)
  );
  
  DEFINE OPERATOR Connector_tTeradataTPTUtility_1_tTPTInput
  TYPE DATACONNECTOR PRODUCER
  SCHEMA Schema_tTeradataTPTUtility_1_tTPTInput
  ATTRIBUTES
  (
  VARCHAR FileName, 
  VARCHAR Format, 
  VARCHAR OpenMode, 
  INTEGER BlockSize, 
  INTEGER BufferSize, 
  INTEGER RetentionPeriod, 
  INTEGER RowsPerInstance, 
  INTEGER SecondarySpace, 
  INTEGER UnitCount, 
  INTEGER VigilElapsedTime, 
  INTEGER VigilWaitTime, 
  INTEGER VolumeCount, 
  VARCHAR AccessModuleName, 
  VARCHAR AccessModuleInitStr, 
  VARCHAR DirectoryPath, 
  VARCHAR ExpirationDate, 
  VARCHAR IndicatorMode, 
  VARCHAR PrimarySpace, 
  VARCHAR PrivateLogName, 
  VARCHAR RecordFormat, 
  VARCHAR RecordLength, 
  VARCHAR SpaceUnit, 
  VARCHAR TextDelimiter, 
  VARCHAR VigilNoticeFileName, 
  VARCHAR VigilStartTime, 
  VARCHAR VigilStopTime, 
  VARCHAR VolSerNumber, 
  VARCHAR UnitType
  );
  
    APPLY
        (
            'INSERT INTO DP_SAPPHIRE_STG.LMD_ZONE (sCity,CityArabic,sZone,Region,Province,CourierId,LAST_UPDATE_DATE,BATCH_ID,FILE_ID) VALUES (:sCity,:CityArabic,:sZone,:Region,:Province,:CourierId,:LAST_UPDATE_DATE,:BATCH_ID,:FILE_ID);'
        )
    TO OPERATOR
    (
        Operator_tTeradataTPTUtility_1_tTPTInput
  
        ATTRIBUTES
        (
            UserName = 'up_manjaneyulu', 
            UserPassword = 'STCedw123',
            TdpId = '172.20.37.74'
            ,TargetTable = 'DP_SAPPHIRE_STG.LMD_ZONE'
            ,LogTable = 'DP_SAPPHIRE_STG.LOGTABLE_LMDZONE'
            ,ERRORTABLE1 = 'DP_SAPPHIRE_STG.ERROR_LMDZONE'
            ,ERRORLIMIT = 1000
        )
    )
    SELECT * FROM OPERATOR
    (
        Connector_tTeradataTPTUtility_1_tTPTInput
  
        ATTRIBUTES
        (
            FileName = 'E:\Talend-Studio-20150908_1633-V6.0.1\Talend-Studio-20150908_1633-V6.0.1\workspace\out.csv', 
            Format = 'DELIMITED', 
            OpenMode = 'Read', 
            DirectoryPath = '', 
            IndicatorMode = 'N', 
            TextDelimiter = '^~$'
        )
    );
  );

Is there any way to alter this script?
We would like to add the following as the first line of the script:
USING CHARACTER SET UTF8

If this is added, we will be able to see the Unicode characters.
This script is generated automatically by Talend.
What needs to be done? Is it a bug?
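
One possible workaround, sketched here under assumptions: if the job can be arranged so that the generated script file is on disk before tbuild runs (whether the Talend components allow this is not confirmed in this thread), a small Java step could prepend the clause. The class name `TptScriptPatcher` and its methods are hypothetical helpers, not part of Talend or Teradata.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TptScriptPatcher {
    // The statement the poster wants as the first line of the script.
    public static final String CHARSET_CLAUSE = "USING CHARACTER SET UTF8";

    /** Returns the script with the clause prepended, unless it already starts with it. */
    public static String withCharsetClause(String script) {
        String trimmed = script.trim();
        // Case-insensitive check so an existing clause is not duplicated.
        if (trimmed.regionMatches(true, 0, CHARSET_CLAUSE, 0, CHARSET_CLAUSE.length())) {
            return script;
        }
        return CHARSET_CLAUSE + System.lineSeparator() + script;
    }

    /** Rewrites the script file in place. Assumes the file is UTF-8 encoded. */
    public static void patchFile(Path scriptFile) throws IOException {
        String original = new String(Files.readAllBytes(scriptFile), StandardCharsets.UTF_8);
        Files.write(scriptFile, withCharsetClause(original).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TptScriptPatcher <path-to-generated-script>");
            return;
        }
        patchFile(Paths.get(args[0]));
    }
}
```

The check in `withCharsetClause` makes the step safe to run more than once on the same file. The patched script would still have to be submitted to tbuild separately.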