Proximity data domains are used to narrow down the inferred results to identify close-to-identical columns or fields for a data domain.
You can add proximity data domains to the existing data domains by editing the existing data domains from ldmadmin.
Example: for the “Address” data domain, Zipcode, Street, and City could be proximity data domains.
When you create or edit a resource and enable data domain discovery, you can add the proximity data domains.
Once you run the resource, the profiling scanner scans the data source for the data domain and the proximity data domains in the resource, and displays a match score in the EDC UI.
The match score is the ratio of the number of proximity data domains discovered in the data source to the number of proximity data domains configured for an inferred data domain.
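To make that ratio concrete, here is a minimal Python sketch of the arithmetic as described above. It is only an illustration, not EDC code: the function name `match_score` and the sample domain names are hypothetical, and how EDC rounds or displays the value is not covered here.

```python
def match_score(configured, discovered):
    """Ratio of configured proximity domains that were actually discovered.

    configured: proximity data domains set up for the inferred domain
    discovered: data domains the scan found in the same data source
    (Both are hypothetical inputs for illustration only.)
    """
    configured = set(configured)
    if not configured:
        # No proximity domains configured, so there is nothing to score.
        return 0.0
    found = configured & set(discovered)
    return len(found) / len(configured)


# "Address" is configured with three proximity domains; the scan found two of them.
score = match_score(
    configured={"Zipcode", "Street", "City"},
    discovered={"Zipcode", "City", "Phone"},
)
print(round(score, 2))  # 0.67
```

So with Zipcode and City found but Street missing, two of the three configured proximity domains are present and the score is 2/3.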
Thanks. That last part is copied directly from the documentation which wasn't intuitively helpful.
I think I get the point of them, I just wasn't really sure how they worked in practice.
One example: you might have a lot of fields named ID, and all of the IDs are auto-generated numbers.
Suppose you want to create a data domain that represents a Product_Id.
To do this, you could use a metadata rule or a data rule (or both) to find an ID, and set Product_Name as a proximal domain.
Product_Id would then only be connected to an ID field that is in the same dataset (or a joined dataset) that also has a Product_Name domain field.
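A rough Python sketch of that idea, in case it helps. This is not how EDC is implemented; the function `infer_product_id` and the sample datasets are made up, and it treats the proximal domain as a simple yes/no filter within one dataset (it ignores joined datasets and the match score).

```python
def infer_product_id(datasets):
    """Return (dataset, column) pairs where an 'id' column sits next to Product_Name.

    datasets: hypothetical mapping of dataset name -> {column name: inferred domain or None}
    """
    hits = []
    for dataset, columns in datasets.items():
        # Proximity check: does any column in this dataset carry the Product_Name domain?
        has_product_name = "Product_Name" in columns.values()
        for column in columns:
            # Stand-in for the metadata rule: the column is simply named "id".
            if column.lower() == "id" and has_product_name:
                hits.append((dataset, column))
    return hits


datasets = {
    "products": {"id": None, "name": "Product_Name"},
    "customers": {"id": None, "full_name": "Customer_Name"},
}
print(infer_product_id(datasets))  # [('products', 'id')]
```

Both tables have an `id` column, but only the one in `products` is tagged, because that dataset also has a field in the Product_Name domain.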
Thanks Darren, that makes more sense.