Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

One Star

Dear friends,
I have a problem importing data from a flat file into MongoDB using tMongoDBOutput, tExtractJSONFields and tWriteJSONField, as described in the scenario below.
Reference: the example comes from the internet. Sorry, I cannot attach a URL, but you can find it by searching for a few words from the data set in my example.
Here is a simple example of how to read a CSV file and convert the data into JSON format.
CSV data (original data):
WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
JSON data (result wanted) in Mongo:

{
   "_id" : ObjectId("554b7fdcb42309e07933f70f"),
   "name" : "WDCi",
   "locations" :
           },
           {
               "location" : "Sydney",
               "staffs" :
           },
           {
               "location" : "US",
               "staffs" : {
                   "staff" : "Terry"
               }
           }
       ]
   }
{
   "_id" : ObjectId("554b7fdcb42309e07933f710"),
   "name" : "FPT",
   "locations" :
           }
       ]
   }
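
To make the target shape concrete, here is a minimal sketch in plain Python (outside Talend) of the grouping I am after: one document per company, with an embedded array of locations. The function name `build_company_docs` is hypothetical, and since the wanted JSON above is cut off, I am assuming that "staffs" is always a list of {"staff": ...} objects, even when there is only one staff member.

```python
import csv
import io

# Sample rows from the post: company, staff, location
CSV_TEXT = """WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
"""


def build_company_docs(csv_text):
    """Group flat (company, staff, location) rows into one nested dict per company."""
    # company name -> {location -> [staff names]}, in first-seen order
    companies = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        name, staff, location = row
        companies.setdefault(name, {}).setdefault(location, []).append(staff)

    docs = []
    for name, locations in companies.items():
        docs.append({
            "name": name,
            # Assumed shape: "locations" is directly an array of embedded documents
            "locations": [
                {"location": loc, "staffs": [{"staff": s} for s in staff_names]}
                for loc, staff_names in locations.items()
            ],
        })
    return docs


docs = build_company_docs(CSV_TEXT)
```

Each dict in `docs` could then be inserted as one MongoDB document; the "_id" field is left for MongoDB to generate.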


Firstly, I will use a tFileInputDelimited component to read the CSV file: 
http://blog.wdcigroup.net/wp-content/uploads/2012/07/fileschema.png
Figure 1.1
Then I drop the tWriteJSONField component from the palette and link the output row of the tFileInputDelimited component to it. After that, I define the schema for the tWriteJSONField component. In this scenario, I create one column named “company” and set it as the output column of the component. The output row of the tWriteJSONField component is linked to a tExtractJSONFields component, whose output I then insert into MongoDB.

The result I see in MongoDB is:

{
   "_id" : ObjectId("554b7fdcb42309e07933f70f"),
   "name" : "WDCi",
   "locations" : {
       "locations" :
           },
           {
               "location" : "Sydney",
               "staffs" :
           },
           {
               "location" : "US",
               "staffs" : {
                   "staff" : "Terry"
               }
           }
       ]
   }
}
/* 2 */
{
   "_id" : ObjectId("554b7fdcb42309e07933f710"),
   "name" : "FPT",
   "locations" : {
       "locations" :
           }
       ]
   }
}



The problem is that the array of embedded location documents is wrapped inside an extra "locations" element. How can I get the wanted result?
Thanks very much.  
P.S. I tried unchecking the "Get nodes" box for location in tExtractJSONFields, but the result in MongoDB was:
{
    "_id" : ObjectId("554b7ba4b42394748949a3e3"),
    "name" : "WDCi",
    "locations" : ""
}
{
    "_id" : ObjectId("554b7ba4b42394748949a3e4"),
    "name" : "FPT",
    "locations" : ""
}
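
For reference, the difference between what I get and what I want can be shown as a small post-processing step in plain Python. This is only a sketch of the transformation, not a Talend setting: the function name `unwrap_locations` is hypothetical, and it assumes the document has already been read into a dict.

```python
def unwrap_locations(doc):
    """Return a copy of doc with {"locations": {"locations": [...]}} flattened
    to {"locations": [...]}; documents without the wrapper are returned unchanged."""
    inner = doc.get("locations")
    if isinstance(inner, dict) and "locations" in inner:
        fixed = dict(doc)  # shallow copy so the input is not modified
        fixed["locations"] = inner["locations"]
        return fixed
    return doc


# What I currently get (shortened): the array sits one level too deep
wrapped = {
    "name": "WDCi",
    "locations": {
        "locations": [
            {"location": "US", "staffs": {"staff": "Terry"}},
        ]
    },
}

# What I want: "locations" is the array itself
fixed = unwrap_locations(wrapped)
```

Ideally the job would produce the second shape directly, so that no clean-up pass like this is needed.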

Community Manager

Re: Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

Hi 
You can generate the desired result directly with tMongoDBOutput. For the configuration of the XML tree on tMongoDBOutput, please see the following screenshots.
BRS
Shong

Four Stars

Re: Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

Dear Friends,
How can I prevent the tMongoDBOutput component from creating keys with empty values?
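
To illustrate what I mean by empty value keys, here is a minimal sketch in plain Python of the clean-up I would like to happen before the insert. It is not a tMongoDBOutput option; the function name `drop_empty` is hypothetical, and treating None, "", [] and {} as "empty" is my own assumption.

```python
def drop_empty(value):
    """Recursively remove dict keys whose value is None, "", [] or {}.

    Lists are cleaned element by element; the elements themselves are kept.
    Values such as 0 or False are not treated as empty.
    """
    if isinstance(value, dict):
        cleaned = {k: drop_empty(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if v not in (None, "", [], {})}
    if isinstance(value, list):
        return [drop_empty(v) for v in value]
    return value


doc = {"name": "WDCi", "locations": "", "meta": {"note": None, "count": 0}}
clean_doc = drop_empty(doc)
```

Here `clean_doc` keeps "name" and "meta.count" but no longer has the "locations" or "note" keys.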
