I have been experiencing a weird problem. On a tHiveInput I have a simple select query (select A, B, C from Table). Now let’s assume the table has 1000 records, 4 of which have A = 1. Running this query, I would expect to get all thousand rows from the input table (with A = 1 appearing 4 times). However, that does not happen: one of the records with A = 1 is missing, so A = 1 appears only 3 times. It gets even weirder, because the missing record does appear when I add a where clause to the query: where A = 1. The original query has no filters at all, so I do not understand why a record is missing, and especially why it is no longer missing once I introduce the where clause. I double-checked the schema and simplified it as much as possible, but I still get the same behavior.
Thanks in advance
Would you mind posting a screenshot of your job design on the forum? It would help us address your issue. An example showing the input values and the expected output values would also be useful.
Thanks for answering. The job is quite simple (I am only showing the part that produces the error). The problem is in the first node, which has a query such as the following one:
SELECT a.A, a.B, a.C, a.D FROM Table_1 a
The query above produces 3 rows for A = '1'.
However, with the following query:
SELECT a.A, a.B, a.C, a.D FROM Table_1 a WHERE a.A = '1'
The output is 4 rows.
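One way to narrow this down might be to count the rows directly in Hive, outside the Talend job, and compare the numbers with what tHiveInput returns. The queries below are only a sketch: they assume the table is really called Table_1, that A is stored as a string, and that the alias a and the column name row_count are free to use. If Hive itself reports 4 rows for A = '1' here, the row is more likely being lost somewhere between Hive and the component output than in the table.

```sql
-- Sketch only: Table_1, the alias a, and row_count are assumed names.
-- Run directly in Hive (for example from the Hive CLI or Beeline), not through the job.

-- Total number of rows in the table; expected to be 1000 in the example above.
SELECT COUNT(*) AS row_count
FROM Table_1;

-- Number of rows per value of A; the row for A = '1' is expected to show 4.
SELECT a.A, COUNT(*) AS row_count
FROM Table_1 a
GROUP BY a.A;
```

Comparing these counts with the number of rows flowing out of tHiveInput for the unfiltered query should show where the 3 versus 4 difference first appears.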
Thanks in advance