Process the contents of the sql server database per slice

Seven Stars

Process the contents of the sql server database per slice

Hi, 

I have a large SQL Server database and I would like to process its rows in slices.

For example, for every 100,000 rows, I apply a process using a Python script that I call with the t_system component.

How can I do it?

 

Best regards,

Thirteen Stars

Re: Process the contents of the sql server database per slice

hi,

 

The same way you would do it in any language (for example SQL):

1. Identify your id column (or any column you could use for partitioning) and count the number of rows.

2. Run your script with the parameters id > n * 100000 and id <= (n+1) * 100000, where n goes from 0 to (number of rows / 100000). The upper bound is inclusive so that ids that are exact multiples of 100000 are not skipped.

3. Increment n: n = n + 1.

4. Repeat while n * 100000 < number of rows.
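
The steps above can be sketched in Python roughly as follows. This is only an illustration, assuming the ids are integers running from 1 to the row count without large gaps. The names slice_bounds and slice_query, and the table and column names in the usage note, are hypothetical. The upper bound is inclusive so that an id equal to an exact multiple of the slice size is not skipped.

```python
def slice_bounds(row_count, slice_size=100_000):
    """Yield (low, high) pairs such that low < id <= high covers ids 1..row_count."""
    n = 0
    # Keep going while the lower bound of slice n is still below the row count.
    while n * slice_size < row_count:
        yield n * slice_size, (n + 1) * slice_size
        n += 1


def slice_query(table, id_column, low, high):
    """Build the SELECT for one slice (table and column names are placeholders)."""
    # low and high are integers computed by slice_bounds, not user input.
    return (
        f"SELECT * FROM {table} "
        f"WHERE {id_column} > {low} AND {id_column} <= {high}"
    )
```

For example, looping over slice_bounds(250000) and passing each pair to slice_query("my_table", "id", low, high) would produce three queries, the last one covering ids 200001 to 300000.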

 

-----------
Seven Stars

Re: Process the contents of the sql server database per slice

Hi, 

 

Thank you for your reply.
If I understand correctly, I should translate this logic into SQL and write it in the tmssql_input component, and then use t_system to apply my script?

 

Best regards. 

Thirteen Stars

Re: Process the contents of the sql server database per slice

There are many ways to do this.

One variant: use an Input component with select count(*) from table_name,

store this value in a global variable, and create a loop with Talend components.
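
As a rough sketch of what such a loop has to compute, the Python below derives the number of iterations from the count(*) value and builds the command line for one iteration. It is not Talend code. The script name process_slice.py and the convention of passing the two bounds as arguments are assumptions made for illustration only.

```python
import sys


def loop_iterations(row_count, slice_size=100_000):
    """Number of loop iterations needed to cover row_count rows (rounded up)."""
    return (row_count + slice_size - 1) // slice_size


def script_command(n, slice_size=100_000, script="process_slice.py"):
    """Command line for iteration n; the script name is a hypothetical placeholder."""
    low = n * slice_size          # exclusive lower bound of the slice
    high = (n + 1) * slice_size   # inclusive upper bound of the slice
    return [sys.executable, script, str(low), str(high)]
```

With a count of 250,000 rows, loop_iterations returns 3, and script_command(2) gives the command for the last slice, with bounds 200000 and 300000. Each such list could be run with subprocess.run, which is roughly the job the system-call component does inside the loop.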

-----------
Seven Stars

Re: Process the contents of the sql server database per slice

OK!

But first, how can I get a row id? I don't have an auto-increment id that I can use for the condition (id > n*100000).

 

Best regards.
