[resolved] decode PDF

I have a set of .txt files, each containing a base64-encoded PDF. I am trying to decode them and save them back as .pdf files, starting with a single .txt file as a test.
tFileList ---> tFileInputDelimited -----> tJavaRow ------> tFileOutputDelimited
In tFileInputDelimited, I set the row separator to a string that never occurs in the data, something like "\nnnnnnnnnnnnnnn\nnnnnnnn", so the whole file is read as one row.
In tJavaRow, I have:
  byte[] buf = new sun.misc.BASE64Decoder().decodeBuffer(input_row.pdf_in);
  output_row.pdf_out = new String(buf);
However, the output file test.pdf is not readable (Adobe Reader reports it as damaged and says it could not be repaired).
What am I doing wrong?

Accepted Solutions

Re: [resolved] decode PDF

I suggest you test the extraction and decoding outside Talend in a simple Java project. Once you know how to do it correctly, you can carry that over into a Talend job. By the way, I would create a routine instead of coding it entirely in a tJavaRow. A static method in a routine can easily be developed and tested outside Talend.
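
As a rough illustration of that suggestion, here is a minimal sketch of such a static helper that can be tried in a plain Java project first. It is not taken from the thread: the class name PdfBase64 and its method names are hypothetical, and it assumes Java 8 or later so that java.util.Base64 can replace the internal sun.misc.BASE64Decoder. One likely culprit in the original snippet is new String(buf), which turns binary data into text using the platform charset, so the bytes may no longer match when they are written out again. The sketch keeps the decoded data as a byte[] all the way to the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical helper; class and method names are illustrative only.
class PdfBase64 {

    // Decode base64 text into raw bytes. The MIME decoder skips line breaks
    // and other characters outside the base64 alphabet, which wrapped
    // base64 files usually contain.
    static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    // Read a base64 text file and write the decoded bytes to outPath.
    // The decoded bytes never pass through a String, so no charset
    // conversion can alter them.
    static void decodeFile(String inPath, String outPath) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get(inPath));
        String text = new String(raw, StandardCharsets.US_ASCII); // base64 is plain ASCII
        Files.write(Paths.get(outPath), decode(text));
    }

    // Sanity check: a well-formed PDF normally begins with the bytes "%PDF-".
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Once something like PdfBase64.decodeFile("test.txt", "test.pdf") produces a file that opens correctly outside Talend, the same static method could be placed in a routine and called from the job. How the file paths reach it there is left open in this sketch.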
