"I want to install Informatica BDM in a node of a CDH cluster."
According to Informatica, BDM is installed in either a single-node environment or a cluster environment. Since you have a cluster, you need to install the BDM binaries on all the nodes, not just on one node of the cluster.
"I am not certain about how many nodes cluster should I create..."
Is this a new Hadoop implementation project, or an existing one to which you want to add an Informatica Big Data implementation?
To make a hardware capacity plan for a Hadoop cluster, you need to know the following:
- Size and Budget
- Business Services Requirements
- Technical Services Requirements
- Utilization and Optimization Plan
I presume the sizing of the Hadoop cluster is usually done by the respective Big Data team. When sizing the cluster, one should also consider the data volume that the end users will process on it. That volume determines how many machines (nodes) you need to process the input data efficiently, and the disk and memory capacity each one requires. The calculations use the various formulae available for the number of nodes, the storage per node, and the memory for the DataNode and TaskTracker processes, in addition to the memory needed by the OS and HDFS.
Reach out to the Hadoop team to find out the number of nodes in the Hadoop cluster, whether it is a new cluster or an existing one.
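As a rough illustration of the kind of node-count formula mentioned above, here is a minimal sketch in Python. It is not Informatica's or Cloudera's official method. It uses a common rule of thumb: multiply the raw data volume by the HDFS replication factor, reserve a fraction of disk for temporary/intermediate data, and divide by the usable disk per node. The function name, the 25% overhead, and the 24 TB per node are assumptions chosen for illustration only; your Hadoop team will have the real figures.

```python
import math


def estimate_data_nodes(raw_data_tb, replication_factor=3,
                        temp_overhead=0.25, usable_disk_per_node_tb=24.0):
    """Rule-of-thumb estimate of the number of data nodes (hypothetical helper).

    raw_data_tb             -- data volume to store, in TB
    replication_factor      -- HDFS replication factor (HDFS default is 3)
    temp_overhead           -- assumed fraction of disk kept free for
                               intermediate/temporary data
    usable_disk_per_node_tb -- assumed usable disk capacity per node, in TB
    """
    if not 0 <= temp_overhead < 1:
        raise ValueError("temp_overhead must be in [0, 1)")
    if usable_disk_per_node_tb <= 0:
        raise ValueError("usable_disk_per_node_tb must be positive")

    # Total disk needed once replication and scratch space are accounted for.
    required_tb = raw_data_tb * replication_factor / (1 - temp_overhead)

    # Round up: a fractional node still means one more machine.
    return math.ceil(required_tb / usable_disk_per_node_tb)


# Example: 100 TB of raw data with the assumed defaults.
print(estimate_data_nodes(100))
```

This only covers storage. Memory and CPU per node need their own estimates based on the workloads that will run on the cluster.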