Home page  

Help > Tutorial >

Scenario 3: Building Master Data

Version 8.0.0.490

Tutorial Scenario: Building Master Data.. 1

Step 1: Configuring Master Data Matching and Merge Rules. 2

Editing Entity Data. 2

Specifying Matching and Merge Rules. 4

Step 2: Map Master Data Columns for Querying. 16

Step 3: Execute Sample Queries. 22

Tutorial Scenario: Building Master Data

A major benefit of SDF is the ability to create and maintain master data from data across multiple data sources in a distributed manner. SDF does this without making copies of data in a central location while easily and seamlessly integrating master data with operational data without putting a query load on the underlying data sources. This is accomplished by taking advantage of built-in data federation and record linking features in SDF.

 

The master data generation process is based on finding matching attribute values for master data entity attributes. Master data entities can be simple or complex. Simple entities may have one or two attributes, whereas a complex entity may have other entities as attributes. An example of a simple entity in a patient management system can be a contact email address or phone number. An example of a complex entity is a patient having a name, address, email, and phone number as attributes.

 

Internally, the master data build process follows these steps for each entity type configured for master data generation: 

 

1.    Binning entities based on a selected subset of entity attributes using exact matches of raw or fuzzy versions of the attribute values.

2.    Linking records within each bin.

3.    A detailed scoring of entities in a bin based on closeness of attribute values using edit distance algorithms.

4.    Grouping and filtering matches based on a given cutoff threshold (records in the same group are considered as belonging to the same entity).

5.    Merging matching entities based on given merge rules to generate a master record for that entity.

 

This documentation serves as an example scenario for master data management and integration using sample data sources included in the SDF media. .

 

1.    Configure master data matching and merge rules to build master data.

2.    Map master data columns for querying.

3.    Execute sample queries on master data.

 

Step 1: Configuring Master Data Matching and Merge Rules

Editing Entity Data

The initial step in configuring MDM is defining the MDM entity and its essential entity attributes in terms of business mapped dictionary column names. These entity attribute columns will be available later as binning attributes for use during the MDM matching configuration.

 

In order to properly link records for master data generation, you need to make sure the entity column information is accurate for the entity and has all of the columns you need. This helps creating links using all the necessary attribute data.

 

·         Go to the ‘Dictionary Mapping’ tab and click ‘Edit Entity Data’ at the bottom of the window.

 

 

To define an entity, you need to make sure that the attributes that define a unique field are all in the center pane under the proper entity. For example, if the combination of the LastName, FirstName and DOB fields define a unique PERSON entity, those three fields become attributes for the PERSON entity and should be added to the center pane.

 

The following table contains the entity you will need for this scenario and the attributes that populate that entity.

 

Entity Name

Attributes

PERSON

FirstName, LastName, DOB

 

To create an entity, click ‘New Entity’. Enter a name for the entity and, if necessary, a description. Entities will auto-populate with one attribute when created.

 

 

The PERSON entity included with the installation has the wrong elements defining the entity based on your mapping. You need to replace FIRST_NAME and FAMILY_NAME with the FirstName and LastName elements we mapped.

 

·         Using the ‘Dictionary Elements box, search for the attributes that define the entity and add them to the ‘Attributes’ box by selecting them and clicking the left arrow. Remove any incorrect or unnecessary attributes by selecting them and clicking the right arrow.

 

Your resulting entity should look like the PERSON entity in the first screenshot shown above. You can close this window after you have ensured the necessary entities are there.

 

Specifying Matching and Merge Rules

With the EIQ Indexed Adapters configured, the EIQ Federation Server created, and the entities defined, you can now create a Link Index for the master data. Within the ‘Dictionary Mapping’ tab:

 

·         Expand the server node in the left pane and find ADW_FED.

 

 

·         Select ‘Show entity mapping’ (below the advanced settings and management tabs).

 

This menu shows the entities defined, the number of attributes defining those entities, and the remote servers under the EIQ Federation Server.

 

 

·         Click ‘Build Links/MDM’ in the bottom left.

 

This shows the available entities on the EIQ Federation Server. The only entities that appear here are entities that have mapped attributes on the adapters. For instance, the screenshot above shows more than three entities, however those entities have no mapped attributes on the adapters, so they are not selected for linking and master data building. All matching rules are set to ‘Exact’ by default.

 

 

You will need to apply fuzzy matching to the PERSON entity. Fuzzy matching is a technique that assists in record linkage. It works with matches that may be less than 100% perfect when finding correspondences between segments of a text and entries in a database of previous translations. Rather than having multiple files for one person, use it to find similarities between data sources and match together records into one master person index.

 

·         Select the entity where you need to apply Fuzzy Matching, for example, the PERSON entity.

 

 

·         Right Click on the Entity Name

·         Select “Configure matching rule on ‘PERSON’”.

 

 

The next window will ask which type of fuzzy matching you want to use - Probabilistic or Non-Probabilistic.

 

·         Select "Non-Probabilistic".

·         Then, select CUSTOMER from the dropdown menu. This tells the process to use the attributes from the CUSTOMER business object we created.

 

 

Note: Probabilistic Match is currently in a beta stage. This feature will be available shortly and is a more advanced way of matching. Ignore it for now.

 

The next window is for configuring the matching rule.

 

·         Select “Fuzzy Match” in the ‘Select Option’ box.

 

 

The tool automatically populates entity attributes as Binning Attributes. You can select the Binning Attributes and modify them as per requirements. For this scenario, we will keep it simple and leave them as default with the exception of the Matching Tokens. Now:

 

·         Change the Matching Token for FirstName and LastName from 'Original' to 'Metaphone'.

·         Do this by double-clicking 'Original' and selecting 'Metaphone' from the drop-down menu.

 

 

Note: Metaphone3 requires manually installing and enabling the Metaphone3 extension. This will be enabled automatically in future releases.

 

·         Add the Matching Attributes for the MDM Table. This should be all the columns in your master data set you want to merge. Add the following by selecting them in the ‘Mapped Columns’ box and clicking the right arrow. To remove an attribute, select it in the ‘Matching Attribute for MDM table’ box and click the opposite arrow.

·         Add Weight Max for each attribute and Weight Blank as per the requirement. Max Weight is the maximum value possible for that attribute, exact match gets the full weight and fuzzy logic gets a fraction based on the calculated distance.

 

Name

Weight Max

Weight Blank

Binning Plus

DOB

15

10

No

FirstName

20

10

No

LastName

13

10

No

City

6

3

No

Email

15

8

No

PhoneNumber

10

5

No

State

7

4

No

Street1

5

3

No

Street2

2

1

No

Zip

3

2

No

 

·         Leave Weight Threshold for link as the default of '50'. (This is the threshold required for a link to be created between the records).

·         Select “Generate master Data”.

o   Leave the Weight Threshold for master data as '75'. (This is the minimum total weight required for a record to be merged).

o   This uses the raw value not a percentage.

o   Merge rule: Most recent timestamp -- Tells it to get the most recent record when merging records for MDM.

o   Select “Update master data schema automatically”.

§  It automatically updates the schema if new attributes are added after initial creation.

 

 

·         Click “OK” to close the window.

 

Once matching rule configuration is complete, you can start the build process. Before that, however, any additional entities you may want to select to skip because they are unnecessary.

 

·         Right-click an entity and select 'Skip building links/MDM for entity....'

 

 

·         Click 'Start' to generate master data.

·         A successful build will look like the screenshot below.

 

 

 

Step 2: Map Master Data Columns for Querying

Now that the index has been built successfully, the generated master data needs to be mapped to a business data view.

 

·         Click ‘Dictionary Mapping’ in the bottom-left to return to the mapping screen.

·         Go to the ADW_ODS_FED_VDS or ADW_DW_FED_VDS, expand the tree, and click on the grey node.

·         Make sure ‘show indexed tables’ is selected.

 

 

 

         

The “D#_MDM” schema has been added and a PERSON table has been created with the selected matching attributes. The master data generation process created an entirely new schema for the master data. The same schema is available in the ADW_ODS_FED_VDS and ADW_DW_FED_VDS. In real use scenarios, every VDS that was connected to the EIQ Federation Server with the same entities when master data is generated will have the new Master Data schema.

 

Now users can add Dictionary Attributes to their master data columns to create a business data view. Follow the procedure below to add Dictionary Names to the ADW_DW_FED_VDS Master Data:

 

·         Select Column MDM_City -> right-click and select 'Edit'.

·         The Mapped Column Definition window opens.

·         Now enter the Dictionary Name: “MPI_City”. MPI is used to specify that it is the Master Person Index. Depending on a particular industry, users may want to define this differently. MCI could indicate a Master Customer Index, or MPI could be defined as a Master Patient Index. For the tutorial, we are just calling it a Master Person Index.

 

 

 

Master Data Column Name

Dictionary Name

MDM_City

MPI_City

MDM_DOB

MPI_DOB

MDM_Email

MPI_Email

MDM_FirstName

MPI_FirstName

MDM_LastName

MPI_LastName

MDM_ID

MPI_MDM_ID

MDM_PhoneNumber

MPI_PhoneNumber

MDM_State

MPI_ State

MDM_Street1

MPI_ Street1

MDM_Street2

MPI_ Street2

MDM_Zip

MPI_Zip

dim.Customer MDM_ID_PERSON

MDM_ID_Customer

 

You may notice from the mapping process that a new column was also inserted into the dim.Customer table. This MDM_ID column is a derived column inserted by the process in order to keep track of records that have been matched and merged during the MDM build process.

 

·         Click “Save Now” to save the business data view mapping.

 

 

The final step is creating a new business view for the MDM. On each adapter:

 

·         Go to the Business View creation tab.

·         Create a new Business View as follows:

o   Name the Business Object.

o   Select "D#_MDM"."PERSON" from the list of tables.

o   Add it to the Business View clause.

o   Verify the correct mapped columns are available.

o   Click "Apply" and then "OK" to close the window.

 

 

The new Business View will appear in the list on the main tab.

 

 

Make sure to perform the same setup on the ADW_ODS_FED_VDS. You'll need to map the MDM columns to the business data view, and then create the business view using the exact same table in the above process.

 

Note: The MDM_ID_PERSON column inserted by MDM in ADW_ODS_FED_VDS is in the vCustomerDemographics table. That will need to be mapped to MDM_ID_Customer.

 

This completes the MDM setup and creation process. Using the PGAdmin tool, connect to the EIQ Federation Server and make queries as usual.

 

Step 3: Execute Sample Queries

To query on the master data, connect to the EIQ Federation Server where the master data was created. Open PGAdmin and connect to the ADW_FED. You should notice the new MASTER_CUSTOMER table is available. That's where our MDM will be.

 

Note: The following screenshots show the query connection as ‘AdminSuper Connection’. For clarification, this should be the ‘ConfigAdmin Connection’ created in Scenario 1.

 

 

Try the sample query SELECT * FROM "SDF#_ADW_FED"."MASTER_CUSTOMER" .

 

 

This query shows all of the master data columns mapped to the business data view.

 

Now try the query SELECT * FROM "SDF#_ADW_FED"."MASTER_CUSTOMER" WHERE "MPI_FirstName" = 'Nichole'

 

 

The MDM shows a limited number of results. Due to the MPI_MDM_ID column, in instances like Nichole Nara we can see that two records have been marked as the same record. We can query the original Federation Server data from the same window to see how much changed.

 

Execute SELECT * FROM "SDF#_ADW_FED"."CUSTOMER" WHERE "FirstName" = 'Nichole' AND "MDM_ID_Customer" = 1829

 

 

In this screen we see that there were multiple records for Nichole Nara. The master data was able to link these records together and merge them as the same person. You can see in the MDM_ID_Customer column which records the master data generation process flagged as the same record.

 

That concludes Scenario 3. Feel free to continue running queries on the Master Data, and even try combining Master Data queries with the transactional base table queries to get the full power of federated EIQ Hybrid Adapters with MDM built on them.

 

Copyright © 2023 , WhamTech, Inc.  All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. Names may be trademarks of their respective owners.