I have complex JSON files that I have to parse and insert into table and I have been using the talend data mapper for this.
below is a screenshot of a portion of my json
and the result of my parsing is
any null values in the json is being neglected and the next non null value is showed for the null keys.
I would like to bring in the null values and have the values under corresponding keys. also attaching the complete json file.
can someone help please?
thanks in advance
what is the output format of the DataMapper?
Could you please provide the output file as well?
Be careful when opening non Excel file within Excel: you might think file is incorrect while it's just the Excel transformation which makes things wrong!
thanks for your response
Im attaching the jobs with all its dependencies and the source file as well.
My job actually has 2 steps where in the 1st step I use a tFileInputFullRow to remove the remove the pretty format and replace output a csv file and in the next step consume the csv with a tFileInputFullRow component and a tHmap with the input structure from the actual JSON file and mapping it to output.
What I think I'm missing is the logic to populate null values for the optional fields(or the fields that are not available).
I wanna be able to get the keys as header and all the 9 different rows(since there are 9 loops) which has to be loaded to MqSql.
also attaching the output delimited file I have now. the issue with this is any optional field is being skipped instead of the particular key being null
this is a bit urgent requirement. if you could provide me any input at all, it would really be helpful.
thanks in advance
thanks for your response.
Attaching the talend job I've built, source json(renamed it to txt file) and the output csv I have now.
I think the optional fields have to be handled differently which is not being done here I think. My job has 2 parts, in the first I have a tFileInputFullRow where I read the json ,remove the pretty format and output the json as csv. And in the 2nd part, the csv is read with tFileInputFullRow and passed to tHMap where the input structure is created by importing the JSON file(not sure if this is right) and output as csv.
I wanna be able to have null values under corresponding keys for the optional fields that aren't there but they are currently being skipped and the key gets the next non null value.
this is a bit urgent. if you could provide any input/solution it would really be helpful.
Thanks in advance
I just had a quick look on your job and files and there are multiple parts that I find strange.
First of all, your input file does not look like a valid json.
Then, I think you should have : tFileInputRaw ==> tHMap ==> tFileOutputWhatever
In your mapping, either you try to output Flat data (and then use a tFileOutputRaw) or you output Map data and use a tFileOutputDelimited for example.
The type of output depends from the Mapping and the output Structure.
Try with that simple job and only a single json row in your input file (do not use your tMap to flatten the json: it's useless).
Hi @lennelei ,
thanks for the suggestion.
when I used an online json validator, it did say my JSON is invalid. But this is the exact file I receive from my client.
so do you suggest removing the first set of components and just starting with a tFileInputRaw with tHMap and the outputdelimited like this?
This also gives me the same output. Is there something I have to change?
If you use Flat as output format, I don't think you should have a tFileOutputDelimited but a tFileOutputRaw.
However I'm not sure it will change anything to your issue.
You may try to add a "Map" output Structure representation (in the Mapping perspective, under the output Structure, you can right click on Representations and then "New Map"). Unless I'm mistaken, a map representation allows you to have all your columns in the tHMap output row (instead of only one column).
I found this easier to use with CSV type data.
Once you've add your Map representation, you open your Map (the mapping) and change the output representation from Flat to Map (at the top of the mapping window, you'll find Output (Flat) : click on Flat and select Map).
After that, you can save everything and return to your job in order to update the output row of the tHMap and add all the columns.
But honestly, I'm really not sure it will help...
I have imported your project and I have found that you have an invalid json because you use NumberInt function.
However, your problem is to flatten the input json as a csv file?
when I directly used a simple job
I already found your data
I think it is better that you create a new structure that contains your desired csv structure and used as a reference to the outputdelimitedfile
Hi @rriahi ,
thanks for your time.
have you used data mapper to do this? Can you please provide the job you created?
In the csv output screenshot, the data is still off, like for comment the value is Project X in your output but is it actually null in the source
yes I used Talend Data Mapper.
what is your studiio version?
I think your problem is the output structure you need to define a flat structure related to your CSV schema then you do the corresponding mapping.
I have do that using Talend Data Mapper
and you can obtain this result (I just used Json to show you the result and I had only creat a part from the mapping)
Talend named a Leader.
Kickstart your first data integration and ETL projects.
Watch the recorded webinar!
Move from On-Premises to the Cloud by following the advice of experts
Create systems and workflow to manage clean data ingestion and data transformation.
Introduction to Talend Open Studio for Data Integration.