Cannot extract fields from Couchbase query in Open Studio 7.1

Seven Stars

Hi,

I have a job that calls Couchbase. If I use a tLogRow, I can see what is returned. There are the fields:

couchbasefields.PNG

The column "content" holds a JSON document, and that is what I need to extract. The JSON document doesn't always contain the same fields; there are 2 types of JSON documents. I don't know how to handle this situation or how to extract the fields. I have tried a bit of everything, without success.

 

I need to read the structure in a job with the tCouchbaseInput component, then extract the content of the JSON document and, depending on the value of the field "class", decompose it into one of 2 different types of records.

 

Any help is welcome.

 


Accepted Solutions
Employee

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Hi,

 

    Could you please try the following method?

 

a) Once the data is extracted, convert the JSON from Document to String using tConvertType

b) Pass the data to a tMap or tJavaRow, where you need to write a plain Java function that parses the "class" field and the data after it, using substring and regex functions.

c) Check the value of "class" and, based on that value, redirect the row to one of two output flows.

d) In the first output flow, design the JSON extraction structure for the first format (using a tExtractJSONFields).

e) In the second output flow, design the JSON extraction structure for the second format (using a tExtractJSONFields).

f) Once the JSON data is extracted, pass the data to output flow.
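
Steps a) to c) can be sketched in plain Java roughly as follows. This is only an illustration of the idea, not Talend-generated code: the class name `ClassRouter`, the helper names, and the sample value "customer" are hypothetical, and UTF-8 is assumed as the encoding of the document. A regex is a rough stand-in for a real JSON parser and can be fooled by nested or escaped content.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ClassRouter {
    // Matches a "class": "value" pair anywhere in the text (hypothetical helper).
    private static final Pattern CLASS_FIELD =
            Pattern.compile("\"class\"\\s*:\\s*\"([^\"]*)\"");

    // Step a): turn the raw document bytes into a String (UTF-8 is an assumption).
    public static String toJsonString(byte[] content) {
        return new String(content, StandardCharsets.UTF_8);
    }

    // Step b): pull out the value of "class", or null if the field is absent.
    public static String extractClass(String json) {
        Matcher m = CLASS_FIELD.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] raw = "{\"class\":\"customer\",\"name\":\"Ann\"}"
                .getBytes(StandardCharsets.UTF_8);
        String json = toJsonString(raw);

        // Step c): route on the value; "customer" is a made-up example value.
        if ("customer".equals(extractClass(json))) {
            System.out.println("first flow: " + json);
        } else {
            System.out.println("second flow: " + json);
        }
    }
}
```

In a tMap, the same test would typically go into the filter expression of each output table, so that each tExtractJSONFields only receives documents of its own format.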

 

Warm Regards,
Nikhil Thampi

Please appreciate our Talend community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved :-)



All Replies
Seven Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Hi nikhilthampi,

I am currently encountering the following issue:

 

The tCouchbaseInput component has the following mapping (it takes it from the couchbase bucket by default):

 

tcouchbaseinput.PNG

 

When I connect the tCouchbaseInput to a tLogRow, I can see all the data, no problem.

 

Then I connect the tCouchbaseInput to a tExtractJSONFields:

extractjsonfields.PNG

 

 

 It doesn't work. This is the error I receive:

 

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/DEV/TOS_BD-20181026_1147-V7.1.1/configuration/.m2/repository/org/talend/libraries/slf4j-log4j12-1.7.2/6.0.0/slf4j-log4j12-1.7.2-6.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/DEV/TOS_BD-20181026_1147-V7.1.1/configuration/.m2/repository/org/talend/libraries/slf4j-log4j12-1.7.5/6.0.0/slf4j-log4j12-1.7.5-6.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in component tCouchbaseInput_1 (Couchbase)
java.lang.ClassCastException: [B cannot be cast to java.lang.String
at local_project.couchbase_0_1.Couchbase.tCouchbaseInput_1Process(Couchbase.java:1219)
at local_project.couchbase_0_1.Couchbase.runJobInTOS(Couchbase.java:1939)
at local_project.couchbase_0_1.Couchbase.main(Couchbase.java:1788)

 

What can I do?

 

Thanks in advance for your help and support.
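
For readers hitting the same exception: `[B` is the JVM's internal name for `byte[]`, so the message says that a byte array was cast directly to `String`. The sketch below reproduces that situation outside Talend and shows a defensive conversion. The class and method names are hypothetical, and UTF-8 is an assumption about how the document bytes are encoded.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayCast {
    // Converts a column value to String whether it arrives as byte[] or not.
    public static String asString(Object value) {
        if (value instanceof byte[]) {
            return new String((byte[]) value, StandardCharsets.UTF_8);
        }
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        // The column value is really a byte[], even if the schema says String.
        Object content = "{\"class\":\"customer\"}".getBytes(StandardCharsets.UTF_8);
        System.out.println(content.getClass().getName()); // JVM name of byte[]

        try {
            String broken = (String) content; // same kind of cast that fails in the job
            System.out.println(broken);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }

        System.out.println(asString(content)); // decoded JSON text
    }
}
```

In a job, this corresponds to declaring the column as byte[] in the schema and converting it explicitly before any component that expects a String.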

Seven Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

In the end I used a tFileOutputDelimited, even though this article advises extracting the fields with a tExtractJSONFields. Now that I am past that first step, I have run into another issue. I have 21000 documents in Couchbase, but I can only fetch around 11000 (never the same number). The job hangs waiting for Couchbase and I end up having to kill it. As I cannot use a cursor, I really don't know what to do.

 

Any help is welcome...

Employee

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Hi,

 

   Could you please share a screenshot of your overall job flow, so that I can better understand your issue?

 

Warm Regards,
Nikhil Thampi


Seven Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Hi,

 

I deleted the job, created a new one, and inserted a new tCouchbaseInput; this time the field containing the JSON document had the type byte[]. Then I understood the problem. I linked the tCouchbaseInput to a tMap and wrote an expression that converts the byte[] into a String and sets a condition with the INDEX function. This way I could finally extract the JSON fields from Couchbase, but there is one small issue.

My JSON document has a loop in the customer address, because a customer can have several addresses, so I set the JSON loop to the address level.


tExtractJSONfields.PNG

 

I get as many rows as the customer has addresses, BUT the fields class, validFrom, createdByTransactionId... are null in the output. This means that I cannot reach the information in the parent node, even though the mapping is defined for those fields. I tried adding ../, which gives me errors, then /../, which returns nothing, and now I have run out of ideas...

 

Any help is welcome. Here is my job:

JobExtractCouchDB.PNG

 

And the function to filter the records:

 

tmapfunction.PNG

 

Here are the results:

tlogresults.PNG

 

Any help is welcome.

Seven Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Ok,

 

I changed the read mode from JSONPath to XPath in the tExtractJSONFields and now it is working perfectly.

 

Thank you very much for your help!

 

change2XPATHPNG.PNG
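
For anyone wondering why XPath helps here: XPath has a parent axis (`../`), so a query that loops over each address can still read fields from the enclosing document. The sketch below shows that behaviour with the JDK's own XPath API on an XML stand-in for the document. It does not show what tExtractJSONFields does internally; the element names `doc` and `city` and the sample values are made up, while `class`, `validFrom` and `address` follow the fields mentioned in this thread.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ParentFieldLoop {
    // Returns one "class|validFrom|city" row per address element.
    public static List<String> rows(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Loop query: one node per address.
            NodeList addresses = (NodeList) xpath.evaluate(
                    "/doc/address",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);

            List<String> rows = new ArrayList<>();
            for (int i = 0; i < addresses.getLength(); i++) {
                Node address = addresses.item(i);
                // "../" steps up from the address to the parent document.
                String cls = xpath.evaluate("../class", address);
                String validFrom = xpath.evaluate("../validFrom", address);
                String city = xpath.evaluate("city", address);
                rows.add(cls + "|" + validFrom + "|" + city);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><class>customer</class><validFrom>2019-01-01</validFrom>"
                + "<address><city>Madrid</city></address>"
                + "<address><city>Paris</city></address></doc>";
        rows(xml).forEach(System.out::println);
    }
}
```

Each address produces its own row, and the parent-level fields are repeated on every row instead of coming back null.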

Four Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Can you please share how to write into Couchbase? I already have JSON input, but while writing, a bad character (\) ends up in the bucket.


@spr654 wrote:

Ok,

I changed the JSONPath to XPath in the tExtractJSONFields and now it is working perfectly.

Thank you very much for your help!

change2XPATHPNG.PNG

 

 

Seven Stars

Re: Cannot extract fields from Couchbase query in Open Studio 7.1

Hi vjalagam,

 

Unfortunately, I have never uploaded to Couchbase; I have only downloaded from Couchbase.
