Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

One Star

Dear friends,
I have a problem importing data from a flat file into MongoDB using tMongoDBOutput, tExtractJSONFields and tWriteJSONField, as described in the scenario below.
Reference: the scenario comes from a page on the internet that I cannot link here; you can find it by searching for some of the words in my example data set.
Here is a simple example of how to read a CSV file and convert the data into JSON format.
CSV data (original data):
WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
JSON data (wanted result) in MongoDB:

{
   "_id" : ObjectId("554b7fdcb42309e07933f70f"),
   "name" : "WDCi",
   "locations" :
           },
           {
               "location" : "Sydney",
               "staffs" :
           },
           {
               "location" : "US",
               "staffs" : {
                   "staff" : "Terry"
               }
           }
       ]
   }
{
   "_id" : ObjectId("554b7fdcb42309e07933f710"),
   "name" : "FPT",
   "locations" :
           }
       ]
   }
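
As a cross-check outside Talend, the grouping that the wanted result implies can be sketched in plain Python. This is only an illustration, not the Talend job itself: the function name `build_company_docs` is hypothetical, and the shape of each staff entry (a list of `{"staff": name}` objects) is an assumption, since the thread only shows the single-staff case `"staffs" : { "staff" : "Terry" }` in full.

```python
import csv
import io

# The CSV data from the post: company, staff, location
CSV_TEXT = """\
WDCi,Lean,KL
WDCi,Kai Herng,KL
WDCi,Walter,Sydney
WDCi,Deborah,Sydney
WDCi,Terry,US
FPT,Minh-Hieu,Hanoi
FPT,Anthony,Paris
FPT,Luis,Paris
"""


def build_company_docs(rows):
    """Group (company, staff, location) rows into one nested document per company."""
    # company name -> {location -> [staff names]}; dicts keep insertion order
    companies = {}
    for company, staff, location in rows:
        companies.setdefault(company, {}).setdefault(location, []).append(staff)

    return [
        {
            "name": company,
            # "locations" is the array itself, with no extra wrapper element
            "locations": [
                # Assumed shape: one {"staff": name} object per staff member
                {"location": loc, "staffs": [{"staff": s} for s in names]}
                for loc, names in locations.items()
            ],
        }
        for company, locations in companies.items()
    ]


docs = build_company_docs(csv.reader(io.StringIO(CSV_TEXT)))
for doc in docs:
    print(doc["name"], [loc["location"] for loc in doc["locations"]])
```

Each resulting dictionary has "locations" directly as an array, which is the layout asked for above. MongoDB would add the "_id" field on insert.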


First, I use a tFileInputDelimited component to read the CSV file:
http://blog.wdcigroup.net/wp-content/uploads/2012/07/fileschema.png
Figure 1.1
Then I drop a tWriteJSONField component from the palette and link the output row of the tFileInputDelimited component to it. Next, I define the schema for the tWriteJSONField component. In this scenario, I create one column named "company" and set it as the output column of the component. The output row of the tWriteJSONField component is linked to a tExtractJSONFields component, so that I can use its output to insert into MongoDB later.

The result I see in MongoDB is:

{
   "_id" : ObjectId("554b7fdcb42309e07933f70f"),
   "name" : "WDCi",
   "locations" : {
       "locations" :
           },
           {
               "location" : "Sydney",
               "staffs" :
           },
           {
               "location" : "US",
               "staffs" : {
                   "staff" : "Terry"
               }
           }
       ]
   }
}
/* 2 */
{
   "_id" : ObjectId("554b7fdcb42309e07933f710"),
   "name" : "FPT",
   "locations" : {
       "locations" :
           }
       ]
   }
}



The problem is that the array of embedded location documents ends up nested inside an extra "locations" element. How can I get the wanted result?
Thanks very much.
P.S. I tried unchecking the "Get nodes" box for location in tExtractJSONFields, but then the result in MongoDB was:
{
    "_id" : ObjectId("554b7ba4b42394748949a3e3"),
    "name" : "WDCi",
    "locations" : ""
}
{
    "_id" : ObjectId("554b7ba4b42394748949a3e4"),
    "name" : "FPT",
    "locations" : ""
}
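
To make the difference between the two shapes concrete, here is a minimal Python sketch that lifts the inner "locations" array out of its wrapper. It operates on plain dictionaries and is not a Talend setting; the helper name `unwrap_locations` is hypothetical, and the sample document reuses only the "US" entry from the post.

```python
def unwrap_locations(doc, key="locations"):
    """Return a copy of doc in which doc[key][key] is lifted up to doc[key].

    Documents that do not have the extra wrapper are returned unchanged.
    """
    inner = doc.get(key)
    # Only unwrap when the value is an object whose single key repeats the name
    if isinstance(inner, dict) and set(inner) == {key}:
        fixed = dict(doc)  # shallow copy so the input is not mutated
        fixed[key] = inner[key]
        return fixed
    return doc


# The shape currently stored: the array sits inside a second "locations" element
wrapped = {
    "name": "WDCi",
    "locations": {
        "locations": [
            {"location": "US", "staffs": {"staff": "Terry"}},
        ]
    },
}

unwrapped = unwrap_locations(wrapped)
print(unwrapped)
```

Applied to documents that are already stored, a transformation like this would have to be followed by an update of each document. Fixing the mapping in the job avoids that extra step.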

Community Manager

Re: Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

Hi 
You can generate the desired result directly with tMongoDBOutput. For the configuration of the XML tree on tMongoDBOutput, please see the following screenshots.
BRS
Shong

Four Stars

Re: Import data into MongoDB - Bug with tMongoDBOutput, tExtractJSONFields

Dear Friends,
How can I prevent the tMongoDBOutput component from creating keys with empty values?
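
For illustration, the effect being asked for can be sketched in plain Python as a recursive clean-up applied to a document before it is inserted. This is not a tMongoDBOutput option, and the thread does not say whether the component offers one. The helper names are hypothetical, and treating "", null, empty objects and empty arrays as "empty" is an assumption.

```python
def _is_empty(value):
    """Assumed definition of 'empty': None, "", {} or []. Zero and False are kept."""
    return value is None or value == "" or value == {} or value == []


def drop_empty(value):
    """Recursively remove keys and list items whose values are empty."""
    if isinstance(value, dict):
        cleaned = {k: drop_empty(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if not _is_empty(v)}
    if isinstance(value, list):
        cleaned = [drop_empty(v) for v in value]
        return [v for v in cleaned if not _is_empty(v)]
    return value


# A document like the one shown earlier in the thread, with an empty "locations" key
doc = {"name": "WDCi", "locations": ""}
print(drop_empty(doc))
```

Nested values are cleaned first, so an object that becomes empty after its own keys are dropped is removed as well.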
