Retrieve row count from a Amazon Redshift Spectrum table

Six Stars

Retrieve row count from a Amazon Redshift Spectrum table

Hi,

 

I have a requirement wherein i need to iterate a list of table names from SQl DB and then fire a query on each of these tables to retrieve the count of records in each.

 

I have made the below design 

 

tMSSQLInput-->tRedshiftRow-->tLogRow

 

The challenge I am facing is that, as per documentation, tRedshiftRow does not return any output. Hence, I tried using tJdbcRow to achieve this and still no luck. The counts always come back blank (not zero but blank).

 

Can anyone please suggest a way to achieve this?


Accepted Solutions
Highlighted
Nine Stars

Re: Retrieve row count from a Amazon Redshift Spectrum table

Hi,

 

Are you looking for approach, where you would like to send table name as input and get record count in that table and load in table?

 

Please try the below, here i used oracle. You can use other data base as well.

 

TableRecordCnt.PNG

 

Step1: Create context variable tableName and assign in put table name to context by using tJavarow. 

Step2: Using tMap join tDB input to input as cross join. Select load lookup for each row.

Step3: Use override Query in DB input and 

"select COUNT(*) REC_CNT from "+context.tableName

 

Regards,

Veeru Boppudi

All Replies
Moderator

Re: Retrieve row count from a Amazon Redshift Spectrum table

Hello,

Normally, Nb_LINE  is used to count the total number of records have been proceed. When this option is set to NONE, there is no a global variable for counting the total number of records have been inserted, updated, or deleted.

When inserted item is choosen, there will be a global variale such as ((Integer)globalMap.get("tRedshiftOutput_1_NB_LINE_INSERTED")) will be available, which counts the total number of records have been inserted into database. The same to updated, deleted items.

Best regards

Sabrina

 

--
Don't forget to give kudos when a reply is helpful and click Accept the solution when you think you're good with it.
Six Stars

Re: Retrieve row count from a Amazon Redshift Spectrum table

Thanks Sabrina for the response. But, my requirement is not to perform any DML on the table. The target table whose record count I require is a spectrum table and hence table cannot be modified.

 

Let me try to explain with a scenario. Lets say, I have 3 tables in Redshift Spectrum namely, TableA, TableB and TableC. Now I have the names of these tables available with me and I want to find the count of records in each of these tables.

 

So, I will iterate the list of table names and then fire SQL query on each.

 

 

select count(*) from TableA

select count(*) from TableB

select count(*) from TableC

and so on. 

 

I want to achieve this in a generic manner so that irrespective of how many table names I have, i can execute the same piece of code in a loop to retrieve the counts and provide in the output.

 

Please let me know if you require further details.

Highlighted
Nine Stars

Re: Retrieve row count from a Amazon Redshift Spectrum table

Hi,

 

Are you looking for approach, where you would like to send table name as input and get record count in that table and load in table?

 

Please try the below, here i used oracle. You can use other data base as well.

 

TableRecordCnt.PNG

 

Step1: Create context variable tableName and assign in put table name to context by using tJavarow. 

Step2: Using tMap join tDB input to input as cross join. Select load lookup for each row.

Step3: Use override Query in DB input and 

"select COUNT(*) REC_CNT from "+context.tableName

 

Regards,

Veeru Boppudi
Six Stars

Re: Retrieve row count from a Amazon Redshift Spectrum table

Thanks Veeranjaneyulu. That was an awesome solution. "Select load lookup for each row." was exactly what did the trick. Never thought of using it this way. Kudos.