Custom component - setting schema in runtime

Five Stars

Hello good people of Talend,

 

I'm building a custom input component that should, for starters, fetch some JSON from my REST endpoint and print its content using the tLogRow component. I need to be able to generate a schema for that JSON so that tLogRow (or some other component in the future) can consume the received data.

 

I found that the tSalesforceInput component is really close to my use case. In the Salesforce input component I just have to define my user credentials and select an endpoint, and I can then print the received data using tLogRow without manually setting the schema (or clicking the "Guess schema" button). When I checked the code of the Salesforce component, I saw that they build their records in the @Producer method using the newRecordBuilder(Schema schema) method, which is available in component-api version 1.1.4. (https://talend.github.io/component-runtime/apidocs/1.1.5/api/org/talend/sdk/component/api/service/re...

 

The problem is that I have the latest Talend Open Studio for DI, version 7.2.1 milestone 3, which uses component-api version 1.1.2 (https://talend.github.io/component-runtime/apidocs/1.1.4/api/org/talend/sdk/component/api/service/re...), so using newRecordBuilder(Schema schema) in my custom component results in

java.lang.NoSuchMethodError:org.talend.sdk.component.api.service.record.RecordBuilderFactory.newRecordBuilder(Lorg/talend/sdk/component/api/record/Schema;)
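
A NoSuchMethodError like this one usually means the code was compiled against a newer API than the one present at runtime. As a rough sketch (not something taken from the Talend code base), a component could probe for the overload with plain reflection before relying on it. The MethodProbe class and hasMethod helper below are hypothetical names, and the demo uses java.lang.String as a stand-in so it runs without any Talend jar:

```java
final class MethodProbe {
    private MethodProbe() {
    }

    // Returns true when 'type' exposes a public method with exactly this name
    // and these parameter types, false otherwise.
    static boolean hasMethod(final Class<?> type, final String name, final Class<?>... parameterTypes) {
        try {
            type.getMethod(name, parameterTypes);
            return true;
        } catch (final NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        // Stand-in demo: String.valueOf(int) exists, String.valueOf(List) does not.
        System.out.println(hasMethod(String.class, "valueOf", int.class));
        System.out.println(hasMethod(String.class, "valueOf", java.util.List.class));
        // With the component-api on the classpath, the equivalent check would be
        // hasMethod(RecordBuilderFactory.class, "newRecordBuilder", Schema.class).
    }
}
```

The probe only tells you whether the overload is there; the component still needs a fallback path for runtimes where it is missing.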

 

My questions would be:

  1. Is there any direct replacement for newRecordBuilder(Schema schema) in component-api 1.1.2? How can I programmatically build a Schema for my Records at runtime, without the user needing to click the "Guess schema" button (something like what the Salesforce component does, as described above)?
  2. Is it possible to update the Talend Open Studio component-runtime .jars to at least version 1.1.5, which contains the desired newRecordBuilder(Schema schema) method?
  3. If newRecordBuilder(Schema schema) isn't the way to build a schema at runtime, can you suggest what the right way is?

Please note that I've read and tried to implement all the examples from the docs, like

but nothing helps to recreate the Salesforce component's behaviour (that's why I turned to the forums).

 

Thank you!

Matija

Employee

Re: Custom component - setting schema in runtime

Hello @matijapetanjek ,

 

 

1. If you don't care about the record data and just want to build a schema there is a dedicated newSchemaBuilder method (see https://talend.github.io/component-runtime/main/1.1.4/record-types.html). For records you can just omit the schema in previous versions and it should be fine,

2. Our studio team is working on upgrading the framework version for coming releases,

3. Guess 1. answers it
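
To make point 1 a bit more concrete, here is a rough sketch of the inference step that could run on the first decoded JSON row before handing the columns to newSchemaBuilder. It deliberately uses plain Java stand-ins: FieldType, SchemaInference, typeOf and infer are hypothetical names invented for this illustration and are not part of component-api, and the type mapping is just one possible choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a column type; NOT the component-api Schema.Type enum.
enum FieldType { STRING, LONG, DOUBLE, BOOLEAN }

final class SchemaInference {
    private SchemaInference() {
    }

    // Maps one decoded JSON value to a column type.
    // Anything unrecognized (including null) falls back to STRING.
    static FieldType typeOf(final Object value) {
        if (value instanceof Boolean) {
            return FieldType.BOOLEAN;
        }
        if (value instanceof Integer || value instanceof Long) {
            return FieldType.LONG;
        }
        if (value instanceof Number) {
            return FieldType.DOUBLE;
        }
        return FieldType.STRING;
    }

    // Builds an ordered column-name -> type mapping from the first decoded row.
    static Map<String, FieldType> infer(final Map<String, Object> firstRow) {
        final Map<String, FieldType> columns = new LinkedHashMap<>();
        firstRow.forEach((name, value) -> columns.put(name, typeOf(value)));
        return columns;
    }

    public static void main(final String[] args) {
        final Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 42);
        row.put("name", "Matija");
        row.put("score", 9.5);
        row.put("active", true);
        System.out.println(infer(row));
    }
}
```

In a real component, each name/type pair would then be turned into a schema entry through the builder API instead of being kept in a map.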

 

Note that you will still need to click "Guess schema", though. This guarantees that the data are accurate and not auto-populated, since auto-population can easily break the user's work or slow down their workflow. We made sure the action is intentional and explicit.

 

Hope it helps,

Romain
Talend Component Kit Documentation: https://talend.github.io/component-runtime/
Five Stars

Re: Custom component - setting schema in runtime

Hi @rmannibucau ,

 

thank you very much for the response! Since I posted this question I had some new successes and failures with Talend. Long story short, I cannot trigger my method annotated with @DiscoverSchema.

 

I was following your GitHub example. In my service class I put

@DiscoverSchema(value = "guessTableSchema")
public Schema guessTableSchema(final DataSet dataSet) {
    System.out.println("HERE IS BREAKPOINT");
    // ... generate schema
}

And in DataSet I have:

    @Option
    @Structure(discoverSchema = "guessTableSchema.",type= Structure.Type.IN)
    @Documentation(value = "List of field names.")
    private List<String> fields = new ArrayList<>();

 

When I click the "Guess schema" button, the breakpoint stops in the (component-runtime) TaCoKitGuessSchema#guessInputComponentSchema method. For some reason, the action field has a null value in that class, so guessSchemaThroughAction() always fails and guessInputComponentSchemaThroughResult() is called (same behavior with TOS 7.2.1 Milestone 3 and TOS 7.1.1). I'm not sure whether this is a bug or I'm doing something wrong. Do you see something obviously wrong in my examples? (Note that I also tried adding the @Option annotation to the DataSet parameter in guessTableSchema, but the result is the same.)

 

Thank you!

Matija

 

Employee

Re: Custom component - setting schema in runtime

Note that @DiscoverSchema parameter name must match a @DataSet name so this is a first point to check.

I also fixed an issue earlier this morning in the @DiscoverSchema implementation where the action was very likely ignored (this was on 7.2.1 M3), so that may be the case you ran into.
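
To picture why a name mismatch would leave the action null, here is a toy sketch. It assumes the lookup is an exact string match, which is an assumption for illustration and not something confirmed from the framework source. ActionRegistry, register and find are hypothetical names. Note that the snippet posted above uses "guessTableSchema." (with a trailing dot) in @Structure, so that is worth checking too.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical stand-in for an action lookup keyed by name; not the Studio implementation.
final class ActionRegistry {
    private final Map<String, Supplier<String>> actions = new HashMap<>();

    // Registers an action under the exact name it was declared with.
    void register(final String name, final Supplier<String> action) {
        actions.put(name, action);
    }

    // Exact-match lookup: any difference in the name yields an empty result.
    Optional<Supplier<String>> find(final String name) {
        return Optional.ofNullable(actions.get(name));
    }

    public static void main(final String[] args) {
        final ActionRegistry registry = new ActionRegistry();
        registry.register("guessTableSchema", () -> "discovered schema");
        System.out.println(registry.find("guessTableSchema").isPresent());
        System.out.println(registry.find("guessTableSchema.").isPresent());
    }
}
```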

 

 

Romain
Five Stars

Re: Custom component - setting schema in runtime


@rmannibucau wrote:

I also fixed an issue earlier this morning in the @DiscoverSchema implementation where the action was very likely ignored (this was on 7.2.1 M3), so that may be the case you ran into.


Hey @rmannibucau, it looks to me like there is another issue in TOS 7.2.1 M3. I spent the whole day trying to figure out why, when I run a job in the Studio, my discovered schema resets all field types to String, after which the job breaks with

javax.json.bind.JsonbException: Unable to parse 0 to class java.lang.String

In the end, I deployed the same component to TOS 7.1.1 and finally got the data logged with the tLogRow component, and the discovered schema stayed the same. Does this issue sound familiar?

Employee

Re: Custom component - setting schema in runtime

Hi @matijapetanjek ,

 

yes, it has been fixed on the SNAPSHOT.

Romain
Four Stars

Re: Custom component - setting schema in runtime

Dear community,

 

I am also trying to develop a custom component. It should get the schema of the previous component, save/hold/clone it in some way, and use it to overwrite the schema of the following component (tLogRow).

I have read all the related entries (community posts and the development kit documentation), but I am not making any further progress.

Therefore I would like to ask in general, what basic steps have to be followed?

I have chosen the processor type of component and I am able to get the schema of the input data, but how do I set the schema of the following component?

What role does the service class play? Do I need any other classes?

 

Thank you for help!

Employee

Re: Custom component - setting schema in runtime

Hi @the_integrator ,

 

The output schema(s) is (are) filled in automatically using the "Guess schema" button. The manual mode is still available, i.e. you open the schema editor and fill in the columns yourself.

 

Services just let you avoid evaluating the schema through an actual run of the beginning of the job (all the steps before the current component). But for a processor it is generally fine to actually run it.

 

Romain
Four Stars

Re: Custom component - setting schema in runtime

Thank you for your response, @rmannibucau.

 

Follow up question:

What if I wanted to include the logic of the "Guess schema" button in the custom component, so that the user does not have to click on it? Does that make sense at all? How can it be achieved?

I think this has been asked here, but it is not really clear to me.

 

In this method I can easily get the schema of the input. How would I get/set the schema of the output in this method?

 

@ElementListener
public void onNext(
        @Input final Record defaultInput,
        @Output final OutputEmitter<Record> defaultOutput) {
    // this is the method allowing you to handle the input(s) and emit the output(s)
    // after some custom logic you put here, to send a value to next element you can use an
    // output parameter and call emit(value).
}

Thank you

Employee

Re: Custom component - setting schema in runtime

Hi @the_integrator,

You can't set a predefined schema because it does not map to any concept in the cloud: there is no design-time schema.

Also, the element listener is a runtime method, so it is not usable at design time.

However, you can normally rename the "Guess schema" button, which would let you use a more explicit name if the default one is unsatisfying.
Romain