There is one (in my opinion strong) reason in favour of #2 or #3:
A staging DB (assuming it is physically separate from the DB holding the original source data) lets you put less load on the source DB: just reading the data and writing it unchanged to a staging DB creates the least possible traffic on the source DB server. In many customer environments the source systems are under heavy load almost constantly, so every additional task put onto those servers makes the source system slower; this staging approach can really reduce that extra load.
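To illustrate the idea, here is a minimal sketch of such a staging load in Python with sqlite3. The table and column names are made up for the example; the point is that the only query hitting the source is a plain read, while all further work runs against the staging copy:

```python
import sqlite3

source = sqlite3.connect(":memory:")
staging = sqlite3.connect(":memory:")

# Set up a toy source table (hypothetical names, for illustration only).
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Alice"), (2, "Bob")])

# The only statement executed against the source system:
# a plain full read, with no joins, filtering, or cleansing logic.
rows = source.execute("SELECT id, name FROM customers").fetchall()

# All subsequent work (quality checks, transformations) then runs
# against the staging copy instead of the source system.
staging.execute("CREATE TABLE stg_customers (id INTEGER, name TEXT)")
staging.executemany("INSERT INTO stg_customers VALUES (?, ?)", rows)
staging.commit()

print(staging.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0])
```

This mirrors what a READ + WRITE mapping does conceptually: copy first, analyse later, so the source only ever pays for the read.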
Now on to #2 vs. #3. I would consider two things. First, how much PowerCenter knowledge do you have in-house? In the long run you will need someone on-site who can help with PowerCenter-related questions, even though they need not be busy with PowerCenter all day long. So you need some PowerCenter experience in-house.
On the other hand, as far as I can tell PowerCenter needs fewer system resources than IDQ, so approach #3 will probably let you use the resources of the IDQ machine better than performing this staging process in IDQ itself.
But please be aware that you need a PowerCenter license to do this in PowerCenter; as far as I understand Informatica's license model, the IDQ license alone does not allow you to run PowerCenter. So this will probably incur additional costs for you.
There is a fourth approach, assuming you're willing to work more with PowerCenter.
You can develop your load processes in IDQ (and this way utilise the features of the Developer GUI) and then transfer those processes to PowerCenter, including the IDQ transformations. This might be an option for you, and in fact you may gain performance this way.
Again, assuming you do have PowerCenter know-how in-house and an appropriate license.
In short: from a performance perspective I would prefer approach #3 (staging in PowerCenter and then running IDQ to check the data) or #4 (developing the staging and checking processes in IDQ and then executing them in PowerCenter).
Hi Nico, and thanks a lot for your fast and precise reply!
About PWC, our customer has the license and we also have some skilled people in another group, but unless it is really necessary we would prefer to keep the DQ branch separate from the PWC branch because of people, deployment processes and so on.
For this reason we would proceed with option #2 and try to implement it in the best-performing way. About this we have two more questions:
a) From a mapping point of view, do you think that the best way to load data from a source DB to a target DB is to use two mappings (READ + WRITE and READ + DECISION + EXCEPTION), as we are supposing?
b) About the first mapping (READ + WRITE), is there any best practice? On the READ transformation we didn't find any particular option, while on the WRITE transformation it should be possible to set the option "Load type" to BULK (instead of NORMAL). The BULK setting should speed up the loading process, but our tests didn't show any difference in timing.
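For intuition on why a bulk load type can matter (and why it sometimes doesn't), here is an illustrative sketch, not Informatica's actual loader: it contrasts row-by-row inserts with per-row commits against one batched insert with a single commit. Table name and row counts are invented for the example. With small data volumes the two styles converge, which is one common reason a BULK-vs-NORMAL test shows no measurable difference:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stg_rows (id INTEGER, payload TEXT)")
rows = [(i, "x" * 20) for i in range(20_000)]

# "Normal"-style load: one statement and one commit per row.
t0 = time.perf_counter()
for r in rows:
    db.execute("INSERT INTO stg_rows VALUES (?, ?)", r)
    db.commit()
normal_secs = time.perf_counter() - t0

db.execute("DELETE FROM stg_rows")
db.commit()

# "Bulk"-style load: one batched statement, one commit.
t0 = time.perf_counter()
db.executemany("INSERT INTO stg_rows VALUES (?, ?)", rows)
db.commit()
bulk_secs = time.perf_counter() - t0

count = db.execute("SELECT COUNT(*) FROM stg_rows").fetchone()[0]
print(f"rows: {count}, normal: {normal_secs:.3f}s, bulk: {bulk_secs:.3f}s")
```

If your BULK test shows no gain on realistic volumes, it may also be worth checking whether the target's indexes or constraints cause the session to fall back to a normal load, or whether the bottleneck is on the read side rather than the write side; both are common culprits in my experience, though I can't say which applies to your setup.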
From my personal perspective I would opt for approach #2, i.e. separating the staging process from the analysis process, for one simple reason: debugging and extending the system is far easier with cleanly separated functionalities.
Other than that I'm the wrong person to respond any further. I've never worked with IDQ myself, so I don't have any real experience with it. Let's hope that people more knowledgeable in IDQ than me will respond here.