The Definitive Guide to Data Quality
In the tMSSQLOutput there are options for table (Action on table) and rows (Action on data).
From your post i assumed you use inner join on the primary key fields and create 2 outputs (inner join data & inner join rejects data). The matched joins data shall be updated and the unmatched data shall be inserted into the table right?
So, you can use the options i mentioned earlier. For matched data, use Action on data = update and for unmatched data, use Action on data = insert. The Action on data = update will check the primary keys and then check for any changes in each fields of the row. It will then update the changed fields only. It's advisable that you add an extra field in you table (date_updated) and use getCurrentDate functions so you can have a better tracking on your updated data..
Hope this will help you mate.. (^_^)
No the "action on data" option wont change only the updated values, actually it will update the whole row even if no change occured.
I f you want to update only the updated records, you should set up a CDC (stand for Change Data Capture) process in your job. To sum it up : all the change in your source database will be save in a log DB (column X : new value : ...) or log file. Then your jobs will read that log DB and update the target database only with the changes.
Take a look on http://en.wikipedia.org/wiki/Change_data_capture to have a good resume about CDC
Depending on your software but many databases vendors have integrated CDC system in their products.
A workaround would be to make a job to compare your file and the last one which was loaded in the database. The delta will contain the data to update or insert. But honestly this kind of comparaison is really hard to do...
Edit : take a look on this OpenSource projet JXyDiff it might help you http://potiron.loria.fr/projects/jxydiff