Informatica® Big Data Management™ allows users to build big data pipelines that can be ported to big data ecosystems such as Amazon AWS and Azure HDInsight. A pipeline built in Big Data Management (BDM) is known as a mapping. It typically defines a data flow from one or more sources to one or more targets, with optional transformations in between. Mappings and their associated data objects are stored in a Model repository through a Model Repository Service (MRS). In the design-time environment, mappings are usually organized into folders within projects, and a mapping can refer to objects in other projects and folders. Mappings can be grouped into a workflow for orchestration. A workflow defines the execution sequence of its objects, including mappings.
Mappings, workflows, and other objects created by Informatica developers are stored in the Model repository managed by the MRS. These design-time objects are deployed to the run-time Data Integration Service (DIS) for execution. A typical enterprise has more than one Informatica environment: code developed in the Development domain is deployed to several non-production environments, such as QA and UAT, before being deployed to Production. While a Development environment contains both design-time and run-time services, the subsequent environments do not need to be configured with both. To deploy objects from one environment to another, the objects must first be added to containers called applications. An application can be deployed directly to a run-time DIS or to an application archive (.iar) file. The application archive file can subsequently be deployed to Data Integration Services in the same domain or a different domain, as depicted below.
There are two recommended ways of deployment: the classic deployment model and the CI/CD (Agile) deployment model. In the example below, the migration and deployment of objects
In the classic deployment model, the following process is followed:
- The objects that need to be deployed are added to an application and deployed to the run-time DIS of the Development environment
- Once unit testing is complete, the objects are migrated to the MRS of the next environment (such as QA) via XML export/import or via application export
- From the MRS of the QA environment, the application is rebuilt and deployed to the QA DIS
- Once functional testing is complete, the objects are migrated from the QA MRS to the Production MRS via XML export/import or via application export
- From the Production MRS, the application is rebuilt and deployed to the Production DIS
In this approach, a design-time copy of the mappings and workflows is maintained in the MRS of every environment. The application is rebuilt in each environment and deployed to the corresponding DIS. When objects are migrated from one MRS to another, one of the available replacement strategies can be selected, for example replacing the target object with the one from the source upon conflict, or reusing the object that already exists in the target repository. If conflicting objects in the target repository are not replaced from the source, the application built in one environment may not match the application built in another, because dependencies can be resolved against different versions of the objects or against different objects altogether.
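The effect of the replacement strategy can be illustrated with a small simulation. This is only a sketch: the repositories are modeled as plain dictionaries of object name to version, and the names `Strategy`, `migrate`, `m_load_orders`, and `lookup_customers` are hypothetical and not part of any Informatica API.

```python
from enum import Enum


class Strategy(Enum):
    REPLACE = "replace"  # on conflict, take the object from the source repository
    REUSE = "reuse"      # on conflict, keep the object already in the target


def migrate(source: dict, target: dict, strategy: Strategy) -> dict:
    """Return the target repository contents after importing `source`."""
    result = dict(target)
    for name, version in source.items():
        if name in result and strategy is Strategy.REUSE:
            continue  # conflict: leave the target's existing object untouched
        result[name] = version
    return result


# Hypothetical repository contents: object name -> version label.
dev_mrs = {"m_load_orders": "v3", "lookup_customers": "v2"}
qa_mrs = {"lookup_customers": "v1"}

qa_after_replace = migrate(dev_mrs, qa_mrs, Strategy.REPLACE)
qa_after_reuse = migrate(dev_mrs, qa_mrs, Strategy.REUSE)

# With REPLACE, QA ends up with the same objects as Development.
print(qa_after_replace == dev_mrs)
# With REUSE, the mapping in QA resolves against the older lookup object.
print(qa_after_reuse["lookup_customers"])
```

With the reuse strategy, an application rebuilt in QA would bundle `m_load_orders` v3 together with `lookup_customers` v1, a combination that was never tested in Development.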
In the Agile (CI/CD) deployment model, the following process is followed:
- An application archive is built in the Development repository
- This application archive (.iar) file is uploaded to a version control system such as Git
- The application archive (.iar) file is then downloaded from the version control system and deployed to the Development DIS using the infacmd CLI
- Once unit testing is complete, the same step is repeated to deploy the application to the QA DIS
- Once functional testing is complete, the same step is repeated to deploy the application to the Production DIS
In this approach, a single application archive file is used across all environments, so every environment runs exactly the same objects. Though not common, the application archive can optionally be imported into the MRS of an environment to maintain a design-time copy of the objects.
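One way to make the "same archive everywhere" guarantee checkable is to record a checksum of the .iar file when it is built and to verify it before each deployment. The following is a minimal sketch using only the Python standard library; the function names `archive_digest` and `verify_archive` are hypothetical, and where the expected digest is stored (for example, next to the archive in version control) is left as an assumption.

```python
import hashlib
from pathlib import Path


def archive_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of an application archive file."""
    sha = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha.update(chunk)
    return sha.hexdigest()


def verify_archive(path: Path, expected_digest: str) -> bool:
    """Check that the archive about to be deployed matches the recorded digest."""
    return archive_digest(path) == expected_digest
```

A pipeline could call `verify_archive` at the start of the QA and Production stages and stop if the result is `False`, which would indicate that the archive differs from the one tested in Development.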
The infacmd CLI can be used to perform deployments in an automated manner, and both of the deployment models described above can be automated with it. Automation servers such as Jenkins can be used to automate the overall deployment process, as described in the blog: Continuous delivery with Informatica BDM.
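As a rough illustration of what such automation might look like, the sketch below builds an `infacmd dis DeployApplication` command line for each environment from the same archive file. It rests on several assumptions: the option names (`-dn`, `-un`, `-sn`, `-a`, `-f`) are given as recalled from the infacmd command reference and should be verified against the installed version; the password is assumed to be supplied through the `INFA_DEFAULT_DOMAIN_PASSWORD` environment variable rather than on the command line; and all domain, user, service, and application names are hypothetical.

```python
import subprocess


def build_deploy_command(domain, user, service, application, archive,
                         infacmd="infacmd.sh"):
    """Build the argument list for deploying an .iar file to a DIS.

    Option names are assumptions to check against the infacmd reference.
    The password is expected in INFA_DEFAULT_DOMAIN_PASSWORD, not in argv.
    """
    return [
        infacmd, "dis", "DeployApplication",
        "-dn", domain,       # domain name
        "-un", user,         # user name
        "-sn", service,      # Data Integration Service name
        "-a", application,   # application name
        "-f", archive,       # path to the application archive (.iar) file
    ]


def deploy(command):
    """Run the deployment; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


# Hypothetical environments: (domain, Data Integration Service).
ENVIRONMENTS = {
    "DEV": ("Domain_DEV", "DIS_DEV"),
    "QA": ("Domain_QA", "DIS_QA"),
    "PROD": ("Domain_PROD", "DIS_PROD"),
}

for env, (domain, service) in ENVIRONMENTS.items():
    cmd = build_deploy_command(domain, "deployer", service,
                               "app_orders", "app_orders.iar")
    print(env, " ".join(cmd))  # dry run: print instead of calling deploy(cmd)
```

In a Jenkins job, each environment would typically be its own stage that calls `deploy(cmd)` only after the tests of the previous stage have passed.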
In Big Data Management, there are several ways to migrate and deploy objects from one environment to another, and customers can choose the approach that best suits their needs. All approaches can be automated using the infacmd CLI and automation tools such as Jenkins.