One Star

Limitations of tCassandraRow; Advice to perform dynamic query

Am I reading the docs correctly that tCassandraRow does not support select statements in CQL?
I have 2 column families that represent events and the details for those events. We are using DataStax Enterprise with Solr enabled to support indexing and selection against any column in the rows. Using CQL constructs I can easily select a set of events of interest, then use one of the values in each row to select the related rows from the details column family. This would be trivial to implement in Java code, but I can't figure out how to configure the steps in a Talend Open Studio job to perform sequential selects from 2 column families, constructing the 2nd select from values retrieved by the first.
As an aside, I tried to do the same thing using the JasperSoft connector for Cassandra, but while their input step for Cassandra allows the query to reference "variables", it does not allow values from a prior Cassandra input step to be bound to those variables. I have seen others suggest using custom Java steps to implement the second select, but I'm hoping Talend offers a better solution.
To make the use case more concrete, what I want to do is basically (pseudo SQL):
select * from events where eventType = Deposit
then
select * from details where eventId = <id from previous select>
In CQL with Solr enabled in DataStax Enterprise I would do:
select * from events where solr_query='eventType:Deposit'
Then, if I were writing the Java code, I would simply extract the eventId from that result set and do:
select * from details where solr_query='eventId:<value of eventId from previous results>'
Examples for other tXXXRow components looked like they would address this need, but if tCassandraRow is not able to do selects, I'm not sure how to do this.
Thanks in advance for any help or suggestions,
Joel
3 REPLIES
Community Manager

Re: Limitations of tCassandraRow; Advice to perform dynamic query

Hi Joel
We usually use a tXXXInput component to execute a select query rather than a tXXXRow component. Your request can be implemented with this job design:
tCassandraInput_1--main(row1)--tFlowToIterate--tCassandraInput_2--main--tLogRow
on tCassandraInput_1, "select id from events where eventType = 'Deposit'"
on tCassandraInput_2, "select * from details where eventId = "+(Integer)globalMap.get("row1.id")
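For reference, here is a rough sketch of what that second expression evaluates to on each iteration. It assumes the input component accepts a query expression at all, and that tFlowToIterate has stored the current row's id under the key "row1.id". The class name IterateQuerySketch and the plain HashMap standing in for Talend's globalMap are only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; inside a Talend job, globalMap is provided for you.
class IterateQuerySketch {

    // Builds the second select from the value stored for the current row of row1.
    static String detailsQuery(Map<String, Object> globalMap) {
        Integer id = (Integer) globalMap.get("row1.id");
        if (id == null) {
            throw new IllegalStateException("row1.id has not been set");
        }
        return "select * from details where eventId = " + id;
    }

    public static void main(String[] args) {
        // Stand-in for the map that tFlowToIterate fills once per incoming row.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("row1.id", 42);
        System.out.println(detailsQuery(globalMap));
    }
}
```

The cast to Integer only works if the id column is defined as an Integer in the schema; for a String key the value would also need to be quoted in the query.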
----------------------------------------------------------
Talend | Data Agility for Modern Business
One Star

Re: Limitations of tCassandraRow; Advice to perform dynamic query

Shong - Thank you for the reply. I tried to implement the job design you suggested, but I must be missing something. Where in the configuration of the tCassandraInput component can I specify the select statement?
I tried searching for "tCassandraInput CQL" and various other things as well as reviewing the docs on this component, but I don't see where a query can be specified using select semantics. The only options I see are specifying row keys directly.
Ideally I would like to use a CQL select statement, since the column of interest is indexed using the Solr integration of DataStax rather than a native Cassandra index. Therefore I need something of the form:
select * from Events where solr_query='eventType:DEPOSIT';
In your reply you say on tCassandraInput_1 "select ..." and similar for tCassandraInput_2, but I can't see how to set that on the component.
- Joel -
Community Manager

Re: Limitations of tCassandraRow; Advice to perform dynamic query

Hi Joel
Sorry, I assumed tCassandraInput could be used as a general DB component like tMysqlInput and replied too quickly, but this component doesn't have a query field where a select statement can be written directly. Is it possible for you to read all the data from Events with tCassandraInput? If so, you can filter the rows in another component such as tFilterRow after tCassandraInput.
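To show the idea in plain Java, here is a minimal sketch of the "read everything, then filter" approach. It is not how tFilterRow is implemented; it only mirrors the logic. It assumes each event row is available as a map keyed by column name, with the eventType and eventId columns from your example. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of filtering rows after a full read of the Events column family.
class FilterRowSketch {

    // Keeps the eventId of every row whose eventType matches the wanted value.
    static List<Object> matchingEventIds(List<Map<String, Object>> events, String eventType) {
        List<Object> ids = new ArrayList<>();
        for (Map<String, Object> row : events) {
            if (eventType.equals(row.get("eventType"))) {
                ids.add(row.get("eventId"));
            }
        }
        return ids;
    }

    // Small helper to build one in-memory row for the demo.
    static Map<String, Object> row(Object eventId, String eventType) {
        Map<String, Object> r = new HashMap<>();
        r.put("eventId", eventId);
        r.put("eventType", eventType);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> events = new ArrayList<>();
        events.add(row(1, "Deposit"));
        events.add(row(2, "Withdrawal"));
        events.add(row(3, "Deposit"));
        // Each id kept here would then drive one lookup against the details column family.
        System.out.println(matchingEventIds(events, "Deposit"));
    }
}
```

Note that this reads every event before filtering, so it only makes sense if the Events column family is small enough to scan in full.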
Shong