The ‘Deduplication’ function allows you to limit the number of relations in a selection that contain exactly the same data.
Drag a deduplicate block onto the selection screen in order to deduplicate records.
Connect an input (selection) block with the deduplicate block.
Double-click the deduplicate block to open the screen described below.
1. Source / Entity
First, select the source within which the deduplication needs to take place.
More details: Source / Entity / Key.
2. Maximum number of relations
Then define the ‘Maximum number of relations’. This is the number of relations that may remain after the deduplication has been executed. In other words, this value determines how often a relation may occur per deduplication attribute after the deduplication.
The relations to retain are selected at random.
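The behaviour described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name `deduplicate`, its parameters and the sample records are hypothetical, and it only assumes that relations are grouped by the deduplication attribute and that the relations to retain in each group are picked at random.

```python
import random
from collections import defaultdict


def deduplicate(relations, key, max_relations=1, seed=None):
    """Keep at most `max_relations` randomly chosen relations per key value.

    Hypothetical helper for illustration only; `seed` makes the random
    choice repeatable.
    """
    rng = random.Random(seed)

    # Group the relations by the value of the deduplication attribute.
    groups = defaultdict(list)
    for relation in relations:
        groups[relation[key]].append(relation)

    kept = []
    for group in groups.values():
        # Only groups that exceed the maximum are reduced, at random.
        if len(group) > max_relations:
            group = rng.sample(group, max_relations)
        kept.extend(group)
    return kept


# Hypothetical sample selection: three relations share one email address.
selection = [
    {"id": 1, "email": "anna@example.com"},
    {"id": 2, "email": "anna@example.com"},
    {"id": 3, "email": "anna@example.com"},
    {"id": 4, "email": "ben@example.com"},
]

result = deduplicate(selection, key="email", max_relations=1, seed=42)
# Two relations remain: one per distinct email address.
```

Which of the three relations with the shared email address is kept depends on the random choice, so only the number of remaining relations per address is predictable.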
3. Search / Select Entities
This part of the screen shows the different entities that are available to search and select. You can search for entities by scrolling through the list, expanding the entity tree or via the search box.
More details: Search / Select entities.
4. Define Selection / Split criteria
In this window you can define the condition(s) on which the deduplication takes place.
More details: Define selection definition(s).
5. Limit
Indicate in every deduplicate block whether the number of records should be limited or not by using the ‘Limit’ function.
First, select ‘Off’, ‘Absolute’ or ‘Percentage’.
Then provide the absolute number or percentage per segment.
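The three limit options can be sketched in Python as follows. This is an illustration under assumptions, not the product's logic: the function name `apply_limit` is hypothetical, and the text above does not state which records are kept or how a percentage is rounded, so the sketch simply keeps the first records of a segment and rounds down.

```python
import math


def apply_limit(records, mode="Off", value=None):
    """Limit the records of one segment (hypothetical helper).

    mode: 'Off', 'Absolute' or 'Percentage', as on the screen.
    value: the absolute number or the percentage for this segment.
    """
    if mode == "Off":
        # No limit: every record of the segment is kept.
        return list(records)
    if mode == "Absolute":
        # Assumption: the first `value` records are kept.
        return list(records[:value])
    if mode == "Percentage":
        # Assumption: the resulting count is rounded down.
        count = math.floor(len(records) * value / 100)
        return list(records[:count])
    raise ValueError(f"Unknown limit mode: {mode!r}")


segment = list(range(10))  # ten hypothetical record ids

unlimited = apply_limit(segment, "Off")
absolute = apply_limit(segment, "Absolute", 3)
percentage = apply_limit(segment, "Percentage", 25)
```

With these assumptions, ‘Absolute’ with the value 3 keeps three of the ten records, and ‘Percentage’ with the value 25 keeps two of them.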
6. Block template
More details: Block template.
7. Description
More details: Description.
8. Apply / Cancel / Save / Calculate
More details: Apply / Cancel / Save / Calculate.
Deduplicate a selection based on email address, with the maximum number of relations set to 1. When the selection contains 3 relations with the email address email@example.com, the selection result is limited to one relation with that email address.