I'm trying to write JSON documents to Elasticsearch in a Spark streaming job. I've set the document type to "doc" in the component configuration. I'm getting the following exception:
[ERROR]: org.apache.spark.executor.Executor - Exception in task 7.0 in stage 0.0 (TID 7)
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: invalid pattern given doc/
    at org.elasticsearch.hadoop.util.Assert.isTrue(Assert.java:50)
    at org.elasticsearch.hadoop.serialization.field.AbstractIndexExtractor.compile(AbstractIndexExtractor.java:51)
    at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:565)
    at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:58)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:102)
    at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:102)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
There is an existing JIRA issue for this: https://jira.talendforge.org/browse/TBD-3707.
Could you please let us know whether this is what you are looking for?
That seems to be unrelated. I already have my content in a single column of type string:
This Stack Overflow answer makes me think the wrong parameters are being passed to org.elasticsearch.spark.rdd.EsSpark.saveJsonToEs(). It expects a resource string of the form "indexName/documentType", but it is somehow receiving something like "indexName/documentType/" with a trailing slash. Here's the relevant configuration for my tElasticSearchOutput component:
I tried swapping the fields, putting the index name under "Type" and the type ("doc") under "Index", in case it was a simple transposition, but that didn't work either.
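For comparison, here is a minimal sketch of calling EsSpark.saveJsonToEs directly with a well-formed resource string, bypassing the component. The index name, node address, and document contents below are hypothetical placeholders; the point is that elasticsearch-hadoop rejects any resource whose index or type segment is empty, which is exactly what a trailing slash produces.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark.rdd.EsSpark

object SaveJsonSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("es-json-writer")
      .set("es.nodes", "localhost:9200") // assumption: a local Elasticsearch node

    val sc = new SparkContext(conf)

    // Each element is a complete JSON document held as a single string,
    // matching the single string-typed column described above.
    val docs = sc.makeRDD(Seq(
      """{"id": 1, "message": "first document"}""",
      """{"id": 2, "message": "second document"}"""
    ))

    // The resource must be exactly "index/type" -- no trailing slash.
    // A value like "indexName/doc/" yields the EsHadoopIllegalArgumentException
    // ("invalid pattern given") seen in the stack trace.
    EsSpark.saveJsonToEs(docs, "indexName/doc")

    sc.stop()
  }
}
```

If this direct call succeeds against the same cluster, that would confirm the problem is in how tElasticSearchOutput assembles the resource string rather than in the data itself.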