Adding and removing rows in a Talend job


I have the following problem:

I receive data from a supplier whose system cannot handle daylight saving time.

The data is structured with one row for each hour.

They deliver 24 rows for every day of the year, whereas they should deliver 23 on the day daylight saving time starts and 25 on the day it ends.

My target system needs the correct number of hours, so I need to remove the excess row on the correct day in spring and add the missing row on the correct day in fall.

 

Is there a way to do this in a Talend job?

 


Re: Adding and removing rows in a Talend job

Hi,

 

I have a few questions to better understand your requirement. Could you please share the details?

 

a) How can we identify the date and time of each row? Could you please share the layout of the data structure in CSV format?

b) Would you like to delete the data, or change its value based on the date and time?

    

In simple terms, you can cross-check the incoming data against a reference table that stores the daylight saving cutoff dates for the next 10 years. If an incoming row's date and time match a cutoff, you can add or remove rows based on the conditions in your reference table.
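The cutoff dates for such a reference table do not have to be typed in by hand. Since Talend jobs run on Java, they could be generated from the time zone rules shipped with the JDK. The following is only a minimal sketch: the class name `DstCutoffs`, the method `cutoffs`, and the zone ID `Europe/Berlin` are assumptions for illustration and should be replaced with whatever matches the supplier's data.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Talend: lists the DST transitions of a zone.
final class DstCutoffs {

    // Returns every transition from 1 January of fromYear up to,
    // but not including, 1 January of toYear.
    static List<ZoneOffsetTransition> cutoffs(ZoneId zone, int fromYear, int toYear) {
        ZoneRules rules = zone.getRules();
        Instant start = LocalDate.of(fromYear, 1, 1).atStartOfDay(zone).toInstant();
        Instant end = LocalDate.of(toYear, 1, 1).atStartOfDay(zone).toInstant();

        List<ZoneOffsetTransition> result = new ArrayList<>();
        ZoneOffsetTransition t = rules.nextTransition(start);
        while (t != null && t.getInstant().isBefore(end)) {
            result.add(t);
            t = rules.nextTransition(t.getInstant());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed zone; any region that follows CET/CEST would behave the same way.
        ZoneId zone = ZoneId.of("Europe/Berlin");
        for (ZoneOffsetTransition t : cutoffs(zone, 2018, 2028)) {
            // A gap means a local hour is skipped (a row to remove);
            // an overlap means a local hour occurs twice (a row to add).
            String action = t.isGap() ? "remove row" : "add row";
            System.out.println(t.getDateTimeBefore() + " -> " + action);
        }
    }
}
```

The printed list could be loaded into the reference table once, or the lookup could be done directly in the job so that no table has to be maintained.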

 

Warm Regards,

 

Nikhil Thampi



Re: Adding and removing rows in a Talend job

The time zone is CET.

In the data below there is a value for 25-03-2018 02:00, even though this hour does not exist because of the change to summer time.

I need to remove the 02:00 hour and its corresponding values from the CSV file that I have to create.

The incoming data is as follows:

<?xml version="1.0"?>
<data>
	<series>
		<value>value1</value>
		<value>value2</value>
	</series>
	<datarow date="25-03-2018 00:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 01:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 02:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 03:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 04:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 05:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 06:00">
		<value>1,05</value>
		<value>1,33</value>
	</datarow>
	<datarow date="25-03-2018 07:00">
		<value>13,99</value>
		<value>7,59</value>
	</datarow>
	<datarow date="25-03-2018 08:00">
		<value>47,58</value>
		<value>15,46</value>
	</datarow>
	<datarow date="25-03-2018 09:00">
		<value>95,28</value>
		<value>32,27</value>
	</datarow>
	<datarow date="25-03-2018 10:00">
		<value>84,95</value>
		<value>39,45</value>
	</datarow>
	<datarow date="25-03-2018 11:00">
		<value>102,86</value>
		<value>47,75</value>
	</datarow>
	<datarow date="25-03-2018 12:00">
		<value>100,61</value>
		<value>47,15</value>
	</datarow>
	<datarow date="25-03-2018 13:00">
		<value>104,07</value>
		<value>46,68</value>
	</datarow>
	<datarow date="25-03-2018 14:00">
		<value>84,31</value>
		<value>47,04</value>
	</datarow>
	<datarow date="25-03-2018 15:00">
		<value>72,81</value>
		<value>29,38</value>
	</datarow>
	<datarow date="25-03-2018 16:00">
		<value>70,65</value>
		<value>20,24</value>
	</datarow>
	<datarow date="25-03-2018 17:00">
		<value>38,34</value>
		<value>10,27</value>
	</datarow>
	<datarow date="25-03-2018 18:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 19:00">
		<value>0</value>
		<value>0</value>
	</datarow>
	<datarow date="25-03-2018 20:00">
		<value>0,04</value>
		<value>0,04</value>
	</datarow>
	<datarow date="25-03-2018 21:00">
		<value>0,04</value>
		<value>0,04</value>
	</datarow>
	<datarow date="25-03-2018 22:00">
		<value>0,02</value>
		<value>0,03</value>
	</datarow>
	<datarow date="25-03-2018 23:00">
		<value>0,02</value>
		<value>0,03</value>
	</datarow>
</data>
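One possible way to decide per row whether it must be dropped, kept, or duplicated is to ask the JDK time zone rules how many times the local hour in the `date` attribute actually occurs. This is a sketch under assumptions, not a tested Talend job: the class `DstRows`, the method `occurrences`, and the zone ID `Europe/Berlin` are hypothetical names chosen for illustration. The pattern `dd-MM-yyyy HH:mm` is taken from the sample data above.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: could be adapted into a Talend user routine.
final class DstRows {

    // Matches the date attribute in the sample, e.g. "25-03-2018 02:00".
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm");

    // Assumed zone for data described as CET with summer time.
    private static final ZoneId ZONE = ZoneId.of("Europe/Berlin");

    // Returns how often the given local date-time exists in the zone:
    // 0 = skipped hour in spring (drop the row),
    // 1 = normal hour (keep the row),
    // 2 = repeated hour in fall (emit the row twice).
    static int occurrences(String date) {
        LocalDateTime local = LocalDateTime.parse(date, FMT);
        return ZONE.getRules().getValidOffsets(local).size();
    }

    public static void main(String[] args) {
        String[] samples = {"25-03-2018 01:00", "25-03-2018 02:00", "28-10-2018 02:00"};
        for (String s : samples) {
            System.out.println(s + " occurs " + occurrences(s) + " time(s)");
        }
    }
}
```

In a job, a filter condition such as `occurrences(date) > 0` would remove the nonexistent spring hour. For the fall, the rows where the result is 2 would have to be emitted a second time; how the values of that extra row are filled depends on what the target system expects.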

 
