8 Replies Latest reply on Aug 17, 2020 11:32 AM by niti rawat

    Load Balancing and High Availability in IDQ

    niti rawat Seasoned Veteran

      Hi,

       

      Wanted to understand the concept of High Availability and load balancing with Informatica Data Quality.

      As I understand HA: in a domain with more than one node, if a node fails, the DIS fails over to node 2 so that jobs can continue to run. This keeps the system and services available most of the time. We need to purchase a separate license to enable HA for IDQ.

       

      Does load balancing mean the same thing, or does it just mean routing the records randomly (or by an internal DQ algorithm) to different nodes, so as to reduce the load on a single node and also reduce the processing time? Do we need to purchase a separate license for load balancing as well?

       

      Regards,

      Niti

        • 1. Re: Load Balancing and High Availability in IDQ
          sunilsa Guru

          Yes, for high availability of the Data Quality services you need the following license options:

           

          • HA-Failover for Data Integration Service
          • HA-Failover for Model Repository Service
          • HA-Resiliency for Core Platform
          • HA-Workflow Automatic Recovery for Data Integration Service
          • HA-Workflow Recovery for Data Integration Service

           

          Your understanding of the service availability with HA license option is correct.

           

          For load balancing, you need the following license option:

          • Grid for Data Integration Service

           

          When you enable a Data Integration Service assigned to a grid, a Data Integration Service process runs on each node in the grid that has the service role. If a service process shuts down unexpectedly, the Data Integration Service remains available as long as another service process runs on another node. Jobs can run on each node in the grid that has the compute role. The Data Integration Service balances the workload among the nodes based on the type of job and based on how the grid is configured.
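To make the difference between failover and load balancing concrete, here is a minimal toy model in Python. It is only a sketch: the `Node` and `ToyGrid` names and the "least-loaded node" policy are assumptions for illustration, not the Data Integration Service's actual dispatch logic, which depends on the job type and the grid configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a grid node; not an Informatica object.
    name: str
    has_service_role: bool = True
    has_compute_role: bool = True
    up: bool = True
    running_jobs: list = field(default_factory=list)


class ToyGrid:
    """Toy model of a grid; NOT the real Data Integration Service scheduler."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def service_available(self):
        # The service stays available while any service-role node is up.
        return any(n.up and n.has_service_role for n in self.nodes)

    def dispatch(self, job):
        # Assumed policy: pick the least-loaded compute-role node that is up.
        candidates = [n for n in self.nodes if n.up and n.has_compute_role]
        if not self.service_available() or not candidates:
            raise RuntimeError("no node available to run the job")
        target = min(candidates, key=lambda n: len(n.running_jobs))
        target.running_jobs.append(job)
        return target.name

    def fail(self, name):
        # Simulate an unexpected shutdown; this toy model simply drops
        # the jobs that were running on the failed node.
        for n in self.nodes:
            if n.name == name:
                n.up = False
                n.running_jobs.clear()


grid = ToyGrid([Node("node1"), Node("node2")])
placements = [grid.dispatch(f"job{i}") for i in range(4)]  # load balancing
grid.fail("node1")                                          # node failure
after_failure = grid.dispatch("job4")                       # failover
print(placements, after_failure, grid.service_available())
```

In this sketch, load balancing is what spreads the first four jobs across both nodes, while high availability is what keeps the service usable after `node1` fails.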

           

          More details can be found in "Data Integration Service Grid Overview".

          • 2. Re: Load Balancing and High Availability in IDQ
            niti rawat Seasoned Veteran

            Thank you Sunil.

             

            We just got to know that our product version also includes the HA license.

            Does that mean we have all five options that you mentioned in your response?

            Also, as part of DR, we are trying to build a replica of the PROD environment and are looking at possible solutions for domain DR. Is it a good idea to have two domains at two different physical locations, so that if one domain goes down we can switch over to the other? I have yet to understand whether this switch would be manual or could be automated. Or is there a better solution than having two domains?

            • 3. Re: Load Balancing and High Availability in IDQ
              sunilsa Guru
              • 4. Re: Load Balancing and High Availability in IDQ
                niti rawat Seasoned Veteran

                Thank you Sunil.

                 

                When planning for the DR for our requirement, we came up with two approaches:

                1. Single domain: 2 servers (1 PROD and 1 DR) with 2 nodes each.

                2. Two domains: one PROD domain (server with 2 nodes) and one DR domain (server with 2 nodes).

                 

                The DB used is Oracle.

                 

                For the first approach, is it possible to point the PROD and DR servers to separate databases (one to the PROD DB and the other to the DR DB, kept in sync using Data Guard), or should both servers point to the same Oracle DB?

                • 5. Re: Load Balancing and High Availability in IDQ
                  sunilsa Guru

                  A domain can point to only one database at a time. Using DB technology, you might be able to maintain another database that is always in sync with the PROD DB. But before you bring up the DR nodes, you might have to run the "infasetup updategatewaynode" command with the DR DB details, so that the domain DB is also updated with the appropriate information.
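As a rough sketch of that step, the Python snippet below only assembles the command line; it does not run anything. The option names (`-da`, `-du`, `-dp`, `-dt`, `-ds`) are given as I recall them from the Informatica Command Reference and should be checked against the reference for your version. The host, user, and service name are hypothetical placeholders, and the `infasetup.sh` script name assumes a Linux/UNIX install.

```python
import shlex


def build_update_gateway_cmd(db_address, db_user, db_service, db_type="oracle"):
    """Assemble an infasetup updateGatewayNode call for the DR database.

    Option names are assumptions to verify against the Command Reference
    for your Informatica version; the password is left as a placeholder.
    """
    return [
        "infasetup.sh", "updateGatewayNode",
        "-da", db_address,    # database host:port (hypothetical value below)
        "-du", db_user,       # domain database user (hypothetical)
        "-dp", "<password>",  # placeholder; never hard-code a real password
        "-dt", db_type,       # database type
        "-ds", db_service,    # Oracle service name (hypothetical)
    ]


cmd = build_update_gateway_cmd("dr-db.example.com:1521", "infa_domain", "DRSVC")
# Print the command for review before running it by hand on the DR node.
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the actual call would be made on each DR gateway node while Informatica services on that node are stopped.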

                  • 6. Re: Load Balancing and High Availability in IDQ
                    niti rawat Seasoned Veteran

                    Okay.

                    So, to conclude: in the single-domain approach, the domain is configured on its original DB, and that same DB has to serve both the PROD and the DR servers. This rules out having a separate DB for the DR server.

                    A separate DB for the DR server is an option when we go for the two-domain approach.

                    Please correct me if I am wrong.

                     

                    Regards,

                    Niti

                    • 7. Re: Load Balancing and High Availability in IDQ
                      sunilsa Guru

                      Yes, the same DB can be used, with technology such as Oracle RAC. This way all 4 nodes in the domain point to the same Oracle DB, but only 2 nodes are used as PROD; the other 2 nodes are kept disabled and reserved for DR. If there are issues with the PROD nodes, the DR nodes are brought up.

                       

                      Since the Oracle DB is RAC-enabled, you have high availability at the DB level as well, and it becomes a concern only if the entire DB goes down on all the DB nodes.

                       

                      I hope you have referred to the document https://kb.informatica.com/h2l/HowTo%20Library/1/0450-SettingUpDisasterRecoveryforPowerCenter-H2L.pdf, which covers Disaster Recovery. Although the document mentions PowerCenter, the domain concept is the same for PowerCenter and Data Quality.

                       

                      Thanks,

                      Sunil

                      • 8. Re: Load Balancing and High Availability in IDQ
                        niti rawat Seasoned Veteran

                        Ok.

                        The only reason we preferred the two-domain approach was to cover scenarios where the domain or the data center (DC) itself goes down. One way of dealing with this would be to back up the domain periodically and restore it from the backup if the DC becomes unavailable, but this would be a manual process and would involve an outage. Do you think there is a better solution for DC unavailability?

                         

                        Regards,

                        Niti