I have a table names EMPLOYEE table, with 1 million records. I have a source file which I read and then perform some transformations and insert record into the EMPLOYEE table. There is a column in EMPLOYEE table called EMPLOYEE_ID. Now while inserting the records into the table, I need to check if the incoming EMPLOYEE_ID is already present in the table, if it is present should not insert raise exception, else insert the record. The solution that I thought is something like, before inserting into EMPLOYEE table, join with the EMPLOYEE table itself and identify the rejects. This design will work for small data volume records, in case huge data volume this will not be a good/ideal solution because we need to unload the table fully, anything you can suggest for better implementation.
It depends of which db you're using.
I have this kind of design with Vertica db which is very fast for queries, so this is not a problem in such a case.
Else, you may have a unique index on EMPLOYEE_ID so duplicates will be rejected by db when record already exists.