4 Replies Latest reply on Apr 7, 2021 9:12 PM by Garry Ure

    Using proximity data domains

    Garry Ure New Member

      Does anyone know how proximity data domains should be used properly in EDC? Ideally with examples.

       

      The documentation isn't very clear on their purpose or how to best use them.

        • 1. Re: Using proximity data domains
          Venkatesh Srinivasan Seasoned Veteran

          Proximity data domains are used to narrow down inferred results: they help confirm that a column or field really belongs to a data domain by checking for related data domains nearby.

           

          You can add proximity data domains to an existing data domain by editing that data domain in ldmadmin.

          Example: for the “Address” data domain, Zipcode, Street, and City could be proximity data domains.

          When you create or edit a resource and enable data domain discovery, you can add the proximity data domains.

          Once you run the resource, the profiling scanner scans the data source for the data domain and its proximity data domains, and displays a match score in the EDC UI.

           

          The match score is the ratio of the number of proximal data domains discovered in the data source to the number of proximal data domains configured for an inferred data domain.
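
          To make the ratio concrete, here is a minimal Python sketch of the calculation as described above. It is not EDC code: the function name proximity_match_score and the sample domain lists are hypothetical, and EDC may scale or display the score differently.

```python
def proximity_match_score(configured, discovered):
    """Fraction of the configured proximal domains that were discovered.

    Hypothetical helper for illustration only; not part of any EDC API.
    """
    configured = set(configured)
    if not configured:
        # Assumption: with no proximal domains configured, report 0.0.
        return 0.0
    found = configured & set(discovered)
    return len(found) / len(configured)


# "Address" is configured with three proximal domains; the scan of this
# (made-up) data source finds two of them, plus an unrelated domain.
score = proximity_match_score(
    configured=["Zipcode", "Street", "City"],
    discovered=["Zipcode", "City", "Email"],
)
print(f"match score: {score:.2f}")
```

          In this sketch, two of the three configured proximal domains are found, so the ratio is 2/3.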

          • 2. Re: Using proximity data domains
            Garry Ure New Member

            Thanks. That last part is copied directly from the documentation which wasn't intuitively helpful.

             

            I think I get the point of them; I just wasn't really sure how they work in practice.

            • 3. Re: Using proximity data domains
              Darren Wrigley Guru

              One example: you might have a lot of fields named ID, all of which hold auto-generated numbers, and you want to create a data domain that represents a Product_Id.

              To do this, you could use a metadata rule or a data rule (or both) to find an id field, and set Product_Name as a proximal domain.

              Product_Id would then only be associated with an id field that is in the same dataset (or a joined dataset) as a field matching the Product_Name domain.
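
              The idea can be sketched in a few lines of Python. This only illustrates the logic; it is not how EDC implements discovery. The function infer_product_id, the dataset layout, and the Customer_Name domain are all hypothetical, and the "joined dataset" case is not modelled.

```python
def infer_product_id(datasets):
    """Return (dataset, column) pairs that would qualify as Product_Id.

    `datasets` maps a dataset name to {column name: already-inferred domain
    or None}. A column named "id" qualifies only when the same dataset also
    has a column inferred as Product_Name (the proximal domain).
    Hypothetical illustration; not an EDC API.
    """
    matches = []
    for dataset_name, column_domains in datasets.items():
        has_product_name = "Product_Name" in column_domains.values()
        for column in column_domains:
            if column.lower() == "id" and has_product_name:
                matches.append((dataset_name, column))
    return matches


# Both tables have an auto-generated "id" column, but only "products"
# also contains a Product_Name field.
datasets = {
    "products": {"id": None, "name": "Product_Name"},
    "customers": {"id": None, "full_name": "Customer_Name"},
}
result = infer_product_id(datasets)
print(result)
```

              Here only the id column in the products table qualifies, because customers has no Product_Name field next to its id.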

              • 4. Re: Using proximity data domains
                Garry Ure New Member

                Thanks Darren, that makes more sense.