In this article we show how to define a new model for importing metadata into the catalog to represent programs (and their logic) that move data across systems in the enterprise. Programs should be searchable objects in the catalog, with an instance of each object in the lineage diagram so that its role in the data lineage is easy to understand. Programs are classified into categories, because we want to represent “Mapping” programs as well as other types of programs such as reporting programs, archiving programs, etc.
Create a model
Follow this document for more information on models and how to manage them.
Model for program logic representation
We want to represent the program as an object in the catalog that can contain members (incoming parameters, outgoing outputs). The model that represents the program logic can therefore be derived from the core model using core.DataSet and core.DataElement. The program should be part of a category. We can represent the model this way:
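In code terms, the same structure could be sketched as follows. This is only an illustration: the class names ProgramCategory, Program and ProgramPort, the mapping of Program to core.DataSet and of ports to core.DataElement, and the "/"-separated paths are assumptions drawn from the description above, not the contents of the model XML.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed mapping of the custom classes to the core classes they derive from.
SUPER_CLASS = {
    "Program": "core.DataSet",          # the program itself
    "ProgramPort": "core.DataElement",  # its incoming parameters and outputs
}


@dataclass
class ProgramPort:
    name: str
    direction: str  # "in" for an incoming parameter, "out" for an output


@dataclass
class Program:
    name: str
    ports: List[ProgramPort] = field(default_factory=list)


@dataclass
class ProgramCategory:
    name: str  # e.g. "Mapping", "Reporting", "Archiving"
    programs: List[Program] = field(default_factory=list)


def flatten(category: ProgramCategory) -> List[Tuple[str, str]]:
    """Return (class name, hierarchical path) pairs, parents before children."""
    rows = [("ProgramCategory", category.name)]
    for program in category.programs:
        program_path = f"{category.name}/{program.name}"
        rows.append(("Program", program_path))
        for port in program.ports:
            rows.append(("ProgramPort", f"{program_path}/{port.name}"))
    return rows
```

Flattening the hierarchy with parents first mirrors how the objects are later listed for loading: a category, then its programs, then each program's ports.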
Here is the representation of the model in the EDC administration UI:
The model XML file can be viewed here:
Create a resource type
To allow the creation of a resource that loads the metadata, we then need to create a resource type. The resource type references the models used to import the metadata, as well as the class(es) used to create the connection endpoints that link to other resources. A resource type can point to multiple models, which means that all classes and attributes of those models are available for loading. In this case we can create a resource type that uses the Proglogic model we created and uses ProgramType for connection types. The connection type determines the level at which automatic connection assignment is performed, following the object hierarchy to match the other resource (e.g. Database/Schema/Table).
Create a resource and load the metadata
To load the metadata, we need to create a resource that allows us to provide the metadata to be loaded as CSV files. The metadata is separated into two types of files:
- Object files: the filenames can be anything, and the files contain the list of objects (class instances) to create in the catalog.
- Link file: the filename must be links.csv, and the file contains the links between the objects loaded as part of the resource.
When creating the resource, you can provide the files as a zip file. The CSV files should be at the root of the zip file, and their header rows should match the headers in the sample file that you can download for the resource type.
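The packaging rules above (any name for object files, a mandatory links.csv, everything at the root of the zip) can be sketched in Python as below. The header rows, the `com.example.proglogic.*` class and association names, and the helper names are assumptions for illustration; take the real headers from the sample file of your resource type.

```python
import csv
import io
import zipfile

# Assumed header rows: replace them with the headers from the sample file
# downloaded for your resource type.
OBJECT_HEADER = ["class", "identity", "core.name", "core.description"]
LINK_HEADER = ["association", "fromObjectIdentity", "toObjectIdentity"]


def to_csv(header, rows):
    """Serialize a header and data rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


def build_resource_zip(object_files, link_rows):
    """Build the zip content in memory.

    object_files maps a file name to its object rows; link_rows are the
    rows written to links.csv. Every file is written at the zip root.
    """
    for name in object_files:
        if "/" in name or "\\" in name:
            raise ValueError(f"{name}: CSV files must sit at the root of the zip")
        if name == "links.csv":
            raise ValueError("links.csv is reserved for the link file")
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, rows in object_files.items():
            zf.writestr(name, to_csv(OBJECT_HEADER, rows))
        zf.writestr("links.csv", to_csv(LINK_HEADER, link_rows))
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical class and association names, for illustration only.
    objects = [
        ["com.example.proglogic.ProgramCategory", "Mapping", "Mapping", ""],
        ["com.example.proglogic.Program", "Mapping/load_customers", "load_customers", ""],
    ]
    links = [
        ["com.example.proglogic.CategoryProgram", "Mapping", "Mapping/load_customers"],
    ]
    with open("proglogic_resource.zip", "wb") as fh:
        fh.write(build_resource_zip({"objects.csv": objects}, links))
```

Building the archive in memory keeps the root-level layout explicit: each entry name is written exactly as given, so no folder prefix can slip in from the local file system.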
The content file can be found here.
- The object file looks like the file below and creates the ProgramCategory, the Program and its ports
The result looks like the following in the program overview:
The links file contains the parent-child associations
The Program Port overview looks like this:
Create lineage with external objects in the catalog
To link the program we have created, we now need to import lineage information that connects it to other metadata sources. For this, you can use a custom lineage resource, which lets you import the lineage from a CSV file.
An example of the custom lineage file can be found here.
Summary lineage at the program level is provided by the DataSetDataFlow associations from the core model.
Field-level lineage is provided by the DirectionalDataFlow associations from the core model.
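To make the two association levels concrete, here is a minimal sketch that derives both kinds of rows from a list of port mappings. The column header, the `core.`-prefixed association names, the "/"-separated object paths and the blank connection columns are assumptions for this illustration; check them against the example custom lineage file.

```python
import csv
import io

# Assumed header of the custom lineage CSV file.
LINEAGE_HEADER = ["Association", "From Connection", "To Connection",
                  "From Object", "To Object"]
DATASET_FLOW = "core.DataSetDataFlow"    # summary lineage, program level
FIELD_FLOW = "core.DirectionalDataFlow"  # detailed lineage, field level


def parent(path):
    """Return the container path of a field or port path."""
    return path.rsplit("/", 1)[0]


def build_lineage_rows(program, inputs, outputs):
    """Build lineage rows for one program.

    inputs:  list of (source field path, input port name)
    outputs: list of (output port name, target field path)
    """
    field_links = [(src, f"{program}/{port}") for src, port in inputs]
    field_links += [(f"{program}/{port}", tgt) for port, tgt in outputs]

    # One dataset-level link per distinct (source container, target container).
    dataset_links = []
    for src, tgt in field_links:
        pair = (parent(src), parent(tgt))
        if pair not in dataset_links:
            dataset_links.append(pair)

    # Connection columns are left blank in this sketch.
    rows = [[DATASET_FLOW, "", "", s, t] for s, t in dataset_links]
    rows += [[FIELD_FLOW, "", "", s, t] for s, t in field_links]
    return rows


def lineage_csv(rows):
    """Serialize lineage rows, header first, to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(LINEAGE_HEADER)
    writer.writerows(rows)
    return buf.getvalue()
```

Deriving the dataset-level rows from the field-level ones keeps the two views consistent: every table-to-program or program-to-table link in the summary is backed by at least one field link.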
Here is the lineage view at the program level:
Here is the lineage view at the field level:
Validate the custom metadata content
Before loading custom metadata, you can validate the content of the zip file you created to make sure it will be imported properly. The following knowledge base article details how to use the validation utility.
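Independently of that utility, a quick local check can catch structural mistakes in the zip file before you run the real validation. The sketch below is not the validation utility and does not replace it; the function name `precheck_zip` and the specific checks are assumptions based only on the packaging rules described in this article.

```python
import csv
import io
import zipfile


def precheck_zip(data):
    """Return a list of structural problems found in the zip bytes."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Ignore pure directory entries, which end with "/".
        names = [n for n in zf.namelist() if not n.endswith("/")]
        if "links.csv" not in names:
            problems.append("links.csv is missing from the root of the zip")
        for name in names:
            if "/" in name:
                problems.append(f"{name}: not at the root of the zip")
                continue
            if not name.lower().endswith(".csv"):
                problems.append(f"{name}: not a CSV file")
                continue
            text = zf.read(name).decode("utf-8-sig")
            rows = list(csv.reader(io.StringIO(text)))
            if not rows:
                problems.append(f"{name}: empty file, no header row")
                continue
            width = len(rows[0])
            for number, row in enumerate(rows[1:], start=2):
                # Blank rows are skipped; other rows must match the header width.
                if row and len(row) != width:
                    problems.append(
                        f"{name}: row {number} has {len(row)} columns, "
                        f"header has {width}"
                    )
    return problems
```

An empty list means only that the archive layout and column counts look plausible; class names, identities and associations still need to be checked against the model by the validation utility.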