I have read in several posts that people are having problems querying UTF-8 (non-English) data from Access databases, but none of the posted solutions helps, or the requests have simply gone unanswered. My problem is specifically with querying Korean characters from an Access 2007 DB. I'm currently using Talend Enterprise Data Integration 5.0.2, but I have also tested TOS 5.3.1 and 5.4.0 with the same results. I have tried Java 1.6.0_21 and 1.6.0_38.

What DOES work:
(In MS Access) Export the table from Access as an Excel (xlsx) file.
Import the data into Talend using tFileInputExcel (Encoding UTF8) ----> tFileOutputDelimited (Encoding UTF8).
Output file (txt) shows Korean characters as expected.
File Properties show UTF8 encoding.
Data Viewer (in 5.0.2) shows ?????? for Korean characters.

What does NOT work:
Import the data into Talend using tAccessInput ----> tFileOutputDelimited (Encoding UTF8).
Output file (txt) shows ?????? for Korean characters.
File Properties show ANSI encoding.
Data Viewer (in 5.0.2) shows ?????? for Korean characters.

What does NOT work:
Import the data into Talend using tAccessInput (JDBC Parameter characterEncoding=UTF-8) ----> tFileOutputDelimited (Encoding UTF8).
Output file (txt) shows ?????? for Korean characters.
File Properties show ANSI encoding.
Data Viewer (in 5.0.2) shows ?????? for Korean characters.

I have also tried pulling the data from the Access table with the tDBInput component, with exactly the same results as with tAccessInput.

How do I query an Access 2007 DB directly from Talend and retain the Korean characters?
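The ?????? output combined with an ANSI-encoded file is consistent with the text passing through a single-byte character set somewhere between Access and the output file, although the thread does not confirm where that happens. The following is only a minimal Java sketch of that general mechanism, not of Talend's or the driver's internals; the class and method names are hypothetical. It shows that encoding Korean text with a charset that cannot represent it replaces each character with "?", and that no later UTF-8 setting can recover the original.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Talend or of any JDBC driver.
class QuestionMarkDemo {

    // Encode the text to bytes with the given charset, then decode it back.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String korean = "\uD55C\uAE00"; // two Hangul syllables

        // ISO-8859-1 has no Hangul, so each character becomes '?'.
        System.out.println(roundTrip(korean, StandardCharsets.ISO_8859_1)); // prints ??

        // UTF-8 can represent every Unicode character, so nothing is lost.
        System.out.println(roundTrip(korean, StandardCharsets.UTF_8).equals(korean)); // prints true
    }
}
```

Once the "?" substitution has happened, the original characters are gone, which would explain why setting the output component to UTF-8 makes no difference in the failing cases.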
Hi, Sorry for the delay. We have confirmed this with our component team, and it is indeed a bug. Could you open a Jira issue on the Talend Bug Tracker so that our developers can work on it? Please post the issue link on the forum so that other community users can see it. Thanks for your contribution to Talend. Best regards, Sabrina
I am experiencing the same problem going from tAccessInput to any UTF-8 output (I tried tFileOutputMSDelimited and tMysqlOutput). The source is a *.accdb file. I can't figure out exactly what encoding the Access file uses; it is just the default encoding applied when an Access file is created. All characters like é get converted to ?
For French characters, please add "charSet=iso8859-9" or "charSet=iso-8859-9" to "Additional JDBC Parameters" in the advanced settings of tAccessInput. For Korean characters, please refer to: https://jira.talendforge.org/browse/TDI-28224
Hi Talend Team - After adding the ISO 8859-9 character set, I have managed to get my tAccessInput to generate a .csv file through tFileOutputDelimited with the accented characters displayed correctly. However, when I import that file into Salesforce via tSalesforceBulkExec_1, the resulting records in Salesforce have the accented characters replaced by ?. Do you have any advice on what I need to do to resolve this?
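One possible cause, not confirmed anywhere in this thread, is an encoding mismatch between the CSV file and the bulk load: if the file is written in ISO-8859-9 but read as UTF-8 (worth checking against the Salesforce documentation and the encoding set on tFileOutputDelimited), accented characters would not survive. The sketch below is plain Java with hypothetical names, not Talend-generated code. It shows that ISO-8859-9 bytes are invalid when decoded as UTF-8, and that decoding with the source charset and re-encoding as UTF-8 preserves the text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; illustrates re-encoding only, not the Salesforce API.
class CsvReencodeSketch {

    // Decode the bytes with the charset they were written in,
    // then encode the resulting text with the target charset.
    static byte[] reencode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        Charset latin5 = Charset.forName("ISO-8859-9");
        String original = "caf\u00E9"; // "café"

        // In ISO-8859-9, é is the single byte 0xE9.
        byte[] latin5Bytes = original.getBytes(latin5);

        // Read as UTF-8, that lone byte is malformed and becomes U+FFFD.
        String misread = new String(latin5Bytes, StandardCharsets.UTF_8);
        System.out.println(misread.contains("\uFFFD")); // prints true

        // Re-encoding to UTF-8 first keeps the accented character intact.
        byte[] utf8Bytes = reencode(latin5Bytes, latin5, StandardCharsets.UTF_8);
        System.out.println(new String(utf8Bytes, StandardCharsets.UTF_8).equals(original)); // prints true
    }
}
```

If this is the cause, the equivalent step in a job would be to keep the charSet parameter on the input side and make sure the file handed to the bulk component is written as UTF-8.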