Here are some guidelines I have followed:
(1) Load data in chunks: use a cursor size when selecting data; it helps reduce the memory requirement of the job. Use a commit size of, say, 10K at the output component. Keeping all data uncommitted needs too much memory while the job is processing, while setting the commit size too low (e.g., 50 or 100) leads to too many database hits.
(2) Avoid heavy lookups; instead, join the lookup table in the source input component (again with a cursor size set).
(3) Create indexes on the join attributes before data processing.
(4) Load and process only incremental data wherever possible.
(5) Gather table and index statistics; set up a scheduler that gathers stats outside business hours.
(6) Run jobs in parallel after identifying the dependencies among them.
.......and many more
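The chunked-read / batched-commit idea in point (1) can be sketched in plain Python with sqlite3; the table names, row counts, and the use of sqlite3 itself are just illustrative assumptions here (in an ETL tool these are usually settings on the input and output components rather than hand-written code):

```python
import sqlite3

# Hypothetical demo: stream rows from a source table and commit at the
# target in batches of COMMIT_SIZE, instead of one transaction per row
# (too many database hits) or one giant transaction (too much memory).
CURSOR_SIZE = 5_000    # rows fetched from the source per round trip
COMMIT_SIZE = 10_000   # rows written to the target per transaction

src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")

# Build a small demo source table.
src.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(i, i * 1.5) for i in range(25_000)])
src.commit()

dst.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

cur = src.execute("SELECT id, amount FROM orders")
pending = 0
while True:
    rows = cur.fetchmany(CURSOR_SIZE)   # chunked fetch keeps memory flat
    if not rows:
        break
    dst.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    pending += len(rows)
    if pending >= COMMIT_SIZE:          # commit in batches, not per row
        dst.commit()
        pending = 0
dst.commit()                            # flush the final partial batch

total = dst.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(total)  # 25000
```

The same two knobs (fetch size on the read side, commit interval on the write side) exist in most databases and ETL tools under different names.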
My 2 cents on this:
1 - Reduce the amount of data you actually need... if you want to join millions of records with millions of records, focus on producing the correctly matched output and trim the (un)needed columns down to the minimum required... varchar/string columns are memory drainers.
2 - And for matching, I sometimes upload the lookup data to the database, join it there, and then fetch the output. Use your DB engine wisely.
3 - On the hardware side: no Wi-Fi connections, prefer cable...
4 - Plain insert statements are by far the best performing. With insert/update, you can plan your trip to Mars, pick up some sand, get back, and still have time for coffee.
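Point 2 above ("join it in the database") can be sketched like this; the table and column names are made up for illustration, and sqlite3 stands in for whatever engine you use:

```python
import sqlite3

# Hedged sketch: instead of doing row-by-row lookups in the ETL job,
# upload the (small) lookup set into a temp table and let the database
# engine perform the join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id INTEGER, country_code TEXT)")
conn.executemany("INSERT INTO facts VALUES (?, ?)",
                 [(1, "NL"), (2, "US"), (3, "NL")])

# Upload the lookup data into a temp table...
conn.execute("CREATE TEMP TABLE country_lookup (code TEXT, name TEXT)")
conn.executemany("INSERT INTO country_lookup VALUES (?, ?)",
                 [("NL", "Netherlands"), ("US", "United States")])

# ...and join in the database instead of in job memory.
rows = conn.execute("""
    SELECT f.id, l.name
    FROM facts f
    JOIN country_lookup l ON l.code = f.country_code
    ORDER BY f.id
""").fetchall()
print(rows)  # [(1, 'Netherlands'), (2, 'United States'), (3, 'Netherlands')]
```

This also pairs well with the earlier advice to index the join attributes before processing.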