I have a question about the Talend Real-Time Big Data product.
Is it possible to use Cassandra as a source when building a Big Data Streaming Job?
In the trial version of Talend Real-Time Big Data, I don't see a source component for Cassandra.
Tool image below:
Most real-time streaming use cases start with a Kafka, Kinesis, Flume, or MQTT source. Normally you receive events in real time and look them up against your pre-computed views in a Cassandra database. This is how the lambda architecture combines a batch layer with a real-time streaming layer.
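To make the lookup pattern concrete, here is a minimal sketch in plain Python. It is not Talend code: the in-memory dictionary only stands in for a pre-computed view stored in Cassandra, and the names `enrich_events`, `batch_view`, `caller_id`, and `total_calls` are hypothetical.

```python
from typing import Dict, Iterable, List


def enrich_events(
    events: Iterable[Dict[str, object]],
    batch_view: Dict[str, Dict[str, object]],
) -> List[Dict[str, object]]:
    """Join each incoming event with its pre-computed view row.

    `batch_view` is a stand-in for a Cassandra table keyed by caller_id
    (an assumption made for illustration only).
    """
    enriched = []
    for event in events:
        # Look up the pre-computed row; fall back to an empty row if absent.
        view_row = batch_view.get(str(event["caller_id"]), {})
        # Fields from the live event take precedence over the view's fields.
        enriched.append({**view_row, **event})
    return enriched


# One micro-batch of streaming events (hypothetical data).
micro_batch = [{"caller_id": "A1", "call_time": "10:00"}]
# Pre-computed view produced by the batch layer (hypothetical data).
view = {"A1": {"total_calls": 3}}
result = enrich_events(micro_batch, view)
```

The streaming source drives the flow, and the store is only consulted per event or per micro-batch, which is why it usually appears as a lookup rather than as the source.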
Why do you need to read from Cassandra for each micro-batch? What would each micro-batch contain, i.e. what query would it run?
Thanks for your reply.
We are receiving call center data, i.e. caller_id, call time, drop_time, etc.
The source team receives the data, fixes it where needed, and saves it to Cassandra tables in real time.
We need to read this real-time data from the Cassandra tables for further processing and load it into Google BigQuery.
1) Does the Talend real-time product provide Cassandra as a streaming source (perhaps in the licensed version)?
2) Does the Talend real-time product provide Google BigQuery as a real-time target?
We have the BigQuery components tBigQueryInput and tBigQueryOutput in Talend 6.4.1.
You can use a design as follows to get your data from Cassandra. You can generate the WHERE condition for your Cassandra lookup, or pass in some variable data, and trigger the lookup in the tMap; that will bring the data you need into your flow.
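As a rough illustration of generating the WHERE condition for each lookup, the sketch below builds a parameterized CQL statement from the keys seen in one micro-batch. It is plain Python, not Talend-generated code; the function `build_lookup_query`, the table name `call_events`, and the key column `caller_id` are assumptions for the example.

```python
from typing import List, Sequence, Tuple


def build_lookup_query(
    table: str, key_column: str, keys: Sequence[str]
) -> Tuple[str, List[str]]:
    """Build a parameterized CQL SELECT restricted to the given keys.

    Returns the statement text with one `?` bind marker per distinct key,
    plus the list of values to bind in the same order.
    """
    # De-duplicate and sort so the same batch always yields the same query.
    unique_keys = sorted(set(keys))
    if not unique_keys:
        raise ValueError("at least one key is required")
    placeholders = ", ".join("?" for _ in unique_keys)
    cql = f"SELECT * FROM {table} WHERE {key_column} IN ({placeholders})"
    return cql, unique_keys


# Keys collected from one micro-batch (hypothetical data).
query, params = build_lookup_query("call_events", "caller_id", ["B2", "A1", "B2"])
```

Binding the values instead of concatenating them into the string keeps the statement safe to reuse across batches; the table and column names are assumed to come from trusted configuration, not from the incoming data.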