Data Engineering Integration : 2018 : December

Introduction

Informatica® Big Data Management (BDM) allows users to build big data pipelines that can be seamlessly ported to big data ecosystems such as Amazon AWS, Azure HDInsight, and so on. A pipeline built in BDM is known as a mapping and typically defines a data flow from one or more sources to one or more targets, with optional transformations in between. Mappings and other associated data objects are stored in a Model Repository via a Model Repository Service (MRS). In the design-time environment, mappings are often organized into folders within projects, and a mapping can refer to objects across projects and folders. Mappings can be grouped into a workflow for orchestration; a workflow defines the sequence in which the objects it contains, including mappings, are executed.

Deployment overview

Mappings, workflows, and other objects developed by Informatica developers are stored in the model repository that the MRS is integrated with. These design-time objects are deployed to a run-time Data Integration Service (DIS) for execution. In a typical enterprise there is more than one Informatica environment, and code developed in the Development domain is deployed to several non-production environments, such as QA and UAT, before being deployed to Production. While the Development environment contains both design-time and run-time services, subsequent environments do not need to be configured with both. To deploy objects from one environment to another, the objects must be added to containers called applications. An application can be deployed to a run-time DIS or to an application archive (.iar) file. The application archive file can subsequently be deployed to Data Integration Services in the same or a different domain, as depicted below.

BDM Deployment Process

 

There are two recommended deployment models: the classic deployment model and the CI/CD (agile) deployment model. Both are described below.

Classic deployment

In the classic deployment model, the following process is followed:

  1. The objects (metadata) that need to be deployed are first deployed into the run-time DIS of the Development environment
  2. Once unit testing is complete, the objects are migrated to the MRS of the next environment (such as QA) via XML export/import or via application export
  3. From the QA MRS, the application is rebuilt and deployed to the QA DIS
  4. Once functional testing is complete, the objects are migrated from the QA MRS to the Production MRS via XML export/import or via application export
  5. From the Production MRS, the application is rebuilt and deployed to the Production DIS

 

Classic deployment model in BDM

 

In this approach, a design-time copy of the mappings and workflows is maintained in the MRS of every environment. The application is rebuilt in each environment and deployed to the corresponding DIS. When migrating objects from one MRS to another, one of the available replacement strategies can be selected; replacement strategies include replacing objects in the target upon conflict, reusing the objects already in the target repository, and so on. If, upon conflict, the objects in the target repository are not replaced from the source, the application built in each environment may not match the others, because dependency resolution can happen against different versions of the objects, or against different objects altogether.

Agile deployment

In the agile deployment model, the following process is followed:

  1. An application archive is built from the Development repository
  2. The application archive (.iar) file is uploaded to a version control system such as Git
  3. The application archive (.iar) file is then downloaded from version control and deployed to the Development DIS using the infacmd CLI
  4. Once unit testing is complete, the same step is repeated to deploy the application to the QA DIS
  5. Once functional testing is complete, the same step is repeated to deploy the application to the Production DIS
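The steps above can be sketched as a small shell script. This is an illustrative outline, not an Informatica-supplied script: all domain, service, repository, and path names are placeholders, the -pd option is omitted on the assumption that an encrypted password is stored in INFA_DEFAULT_DOMAIN_PASSWORD, and the RUN=echo default makes it a dry run that only prints the commands.

```shell
#!/bin/sh
# Illustrative sketch of the agile flow: build one .iar file in Development,
# version it in Git, then deploy that same file to each environment's DIS.
# All names are placeholders; RUN=echo (the default) only prints the commands.
RUN="${RUN:-echo}"

APP="MyApp"           # placeholder application name
IAR_DIR="/tmp/iar"    # placeholder output directory for the archive

agile_deploy() {
  # 1. Build the application archive from the Development repository
  $RUN infacmd.sh oie deployApplication -dn Dev_Domain -un Administrator \
       -rs Dev_MRS -ap "MyProject/$APP" -od "$IAR_DIR"

  # 2. Upload the archive to version control for audit and reuse
  $RUN git -C "$IAR_DIR" add "$APP.iar"
  $RUN git -C "$IAR_DIR" commit -m "Build $APP"

  # 3. Deploy the same archive to each environment's DIS (in practice,
  #    with testing between environments; each environment may be its
  #    own Informatica domain)
  for target in "Dev_Domain Dev_DIS" "QA_Domain QA_DIS" "Prod_Domain Prod_DIS"; do
    set -- $target
    $RUN infacmd.sh dis deployApplication -dn "$1" -un Administrator \
         -sn "$2" -a "$APP" -f "$IAR_DIR/$APP.iar"
  done
}

agile_deploy
```

Because a single archive file flows through every step, the dry run makes it easy to confirm that the very same .iar path appears in each dis deployApplication command before running it for real.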

 

Agile deployment in BDM

In this approach, a single application archive file is used across all environments, so consistency is assured. Though not common, the application archive can optionally be imported into the MRS of an environment to maintain a design-time copy of the objects.

Automation

The infacmd CLI can be used to perform deployment in an automated manner. Both of the deployment models described above can be automated using the CLI. Automation servers such as Jenkins can be used to automate the overall deployment process, as described in the blog post: Continuous delivery with Informatica BDM.

Summary

In Big Data Management, there are several ways to migrate and deploy objects from one environment to another. Customers can choose the approach that best suits their needs. All approaches can be automated using the infacmd CLI and automation tools such as Jenkins.


 

Deployment process overview

For mappings and workflows to be deployed and executed at run time, they are grouped into applications. An application is a container that holds executable objects such as mappings and workflows. Applications are defined in the Developer tool and deployed to a Data Integration Service for execution. Once an application is deployed, the Data Integration Service persists a copy of it. An application can also be deployed to a file known as an Informatica application archive (.iar) file, which can subsequently be deployed to a Data Integration Service in the same or a different domain. The overall deployment process flow in BDM is shown here:

BDM Deployment Process

Automation

The process of deploying a design-time application to an Informatica application archive (.iar) file can be executed via the infacmd CLI with the Object Import Export (oie) plugin. A sample deploy application command is as follows:

infacmd.sh oie deployApplication -dn $infaDomainName -un $infaUserName -pd $infaPassword -sdn $infaSecurityDomain -rs $designTimeMRSName -ap $applicationPath -od $Output_Directory

 

The above example uses several user-defined environment variables, which can be named per individual organization standards. The password provided is case sensitive. Alternatively, an encrypted password string can be stored in the predefined environment variable INFA_DEFAULT_DOMAIN_PASSWORD; when an encrypted password is used this way, the -pd option is not required. This command is documented in detail in the Informatica documentation at Command Reference Guide → infacmd OIE Command Reference → Deploy Application.
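To illustrate the password handling just described, the following sketch builds the command line and appends -pd only when INFA_DEFAULT_DOMAIN_PASSWORD is not set. Every variable value below is a placeholder chosen for the example, not an Informatica default; the script only prints the command rather than executing it.

```shell
#!/bin/sh
# Sketch: build the oie deployApplication command line, omitting -pd when an
# encrypted password is available in INFA_DEFAULT_DOMAIN_PASSWORD.
# All values below are placeholders.

build_deploy_cmd() {
  cmd="infacmd.sh oie deployApplication -dn $infaDomainName -un $infaUserName"
  cmd="$cmd -sdn $infaSecurityDomain -rs $designTimeMRSName"
  cmd="$cmd -ap $applicationPath -od $Output_Directory"
  # -pd is only needed when no encrypted password is stored in the
  # predefined INFA_DEFAULT_DOMAIN_PASSWORD environment variable
  if [ -z "${INFA_DEFAULT_DOMAIN_PASSWORD:-}" ]; then
    cmd="$cmd -pd $infaPassword"
  fi
  printf '%s\n' "$cmd"
}

# Example with placeholder values:
infaDomainName=INFA_DOM infaUserName=Administrator infaSecurityDomain=Native
designTimeMRSName=Dev_MRS applicationPath=MyProject/MyApp Output_Directory=/tmp/iar
INFA_DEFAULT_DOMAIN_PASSWORD=encrypted_password_string
build_deploy_cmd
```

Keeping the password out of the printed command line also keeps it out of shell history and automation logs, which is the main reason to prefer the environment-variable approach in scripted deployments.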

 

Once the application archive file is created, it can optionally be checked into Git or another version control system for audit and tracking purposes.

 

Subsequently, the application archive file can be deployed to a Data Integration Service in the same or a different domain. Typically, the application archive file is created in a Development domain and is eventually deployed to the QA, UAT, and Production domains. This can be achieved via the infacmd CLI with the Data Integration Service (dis) plugin. A sample deployment command is as follows:

infacmd.sh dis deployApplication -dn $infaDomainName -un $infaUserName -pd $infaPassword -sdn $infaSecurityDomain -sn $dataIntegrationServiceName -a $applicationName -f $applicationArchiveFileName

 

This command is documented in detail in the Informatica documentation at Command Reference Guide → infacmd DIS Command Reference → Deploy Application. Once deployment is successful, the listApplications and listApplicationObjects commands in the dis plugin can be used to list the deployed applications and their contents, respectively. This information can be used for post-deployment verification and sanity checks.
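A post-deployment sanity check along those lines might look like the sketch below. The list_applications function is a stub standing in for the real infacmd dis listApplications call, which needs a live domain to run, and the one-application-name-per-line format is an assumption of this example, so the parsing should be adjusted to the actual command output.

```shell
#!/bin/sh
# Sketch: sanity-check a deployment by confirming the application name
# appears in the DIS application list.
# list_applications is a stub; in a real environment it would run:
#   infacmd.sh dis listApplications -dn "$infaDomainName" -un "$infaUserName" \
#       -pd "$infaPassword" -sdn "$infaSecurityDomain" -sn "$dataIntegrationServiceName"
# The one-name-per-line listing below is an assumption about that output.
list_applications() {
  printf '%s\n' "MyApp" "OtherApp"   # stubbed listing for illustration
}

verify_deployed() {
  list_applications | grep -qx "$1"
}

if verify_deployed "MyApp"; then
  echo "MyApp: deployed"
else
  echo "MyApp: NOT found" >&2
fi
```

In an automated pipeline, the non-zero exit status of such a check is what lets a Jenkins job fail fast when a deployment silently did not land.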

 

Integration with Jenkins

The CLI commands described above can be used to initiate the deployment process from within a Jenkins job. A "Build Step" of type "Execute Shell" can be added to the job and configured to execute one of the infacmd commands, as shown in the example below.

 

BDM deployment in Jenkins

 

A sample template file for Jenkins is attached (Jenkins-Template-App-Deployment). The template contains the commands to:

  1. Create an Informatica Application Archive (.iar) file
  2. Commit the application archive file to Git
  3. Deploy the application into DIS

 

Summary

Informatica BDM jobs can be deployed using Jenkins without any need for third-party plugins. infacmd CLI commands can be used directly in Jenkins, just as they can be used in an enterprise scheduling tool.

 

Contributors

  • Keshav Vadrevu, Principal Product Manager
  • Paul Siddal, Big Data Presales Specialist

 

 

 
