What is the use of Merge Rows (Diff)?

Merge Rows (Diff) is used to compare two sorted datasets and identify the differences between them.

It takes two input streams, Reference and Compare, both sorted on the same key fields. Rows are matched on the key fields, and the selected value fields of each matched pair are then compared. The step flags each row as identical, changed, new, or deleted.

Purpose

The main purpose of Merge Rows (Diff) is to detect data changes between two datasets, which is commonly used in data synchronization, auditing, or incremental updates.

Possible Results

The step adds a flag field indicating the comparison result:

  • Identical – Record exists in both datasets and values are the same.

  • Changed – Record exists in both datasets but at least one compared field value is different.

  • New – Record exists only in the compare dataset.

  • Deleted – Record exists only in the reference dataset.
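
The comparison logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the step's actual implementation: the function name merge_rows_diff, the parameter names, and the default flag field name "flagfield" are all assumptions made for the example. Which row is emitted for a matched pair is also a choice of this sketch (it emits the Compare row). Both inputs are assumed to be already sorted ascending on the key fields, and keys are assumed to be unique within each input.

```python
def merge_rows_diff(reference, compare, key_fields, value_fields,
                    flag_field="flagfield"):
    """Flag rows as identical, changed, new, or deleted.

    Illustrative sketch only. Both lists of dicts must already be
    sorted ascending on key_fields, with unique keys in each list.
    """
    def key(row):
        return tuple(row[f] for f in key_fields)

    def values(row):
        return tuple(row[f] for f in value_fields)

    result = []
    i = j = 0
    # Walk both sorted inputs in step, like the merge phase of merge sort.
    while i < len(reference) and j < len(compare):
        ref_row, cmp_row = reference[i], compare[j]
        if key(ref_row) == key(cmp_row):
            # Same key on both sides: compare the value fields.
            same = values(ref_row) == values(cmp_row)
            flag = "identical" if same else "changed"
            result.append({**cmp_row, flag_field: flag})
            i += 1
            j += 1
        elif key(ref_row) < key(cmp_row):
            # Key exists only in the reference input.
            result.append({**ref_row, flag_field: "deleted"})
            i += 1
        else:
            # Key exists only in the compare input.
            result.append({**cmp_row, flag_field: "new"})
            j += 1

    # Whatever is left on either side has no partner on the other side.
    for ref_row in reference[i:]:
        result.append({**ref_row, flag_field: "deleted"})
    for cmp_row in compare[j:]:
        result.append({**cmp_row, flag_field: "new"})
    return result


if __name__ == "__main__":
    # Hypothetical sample data, sorted on "id".
    old = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}, {"id": 3, "name": "Cy"}]
    new = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bobby"}, {"id": 4, "name": "Dee"}]
    for row in merge_rows_diff(old, new, ["id"], ["name"]):
        print(row)
```

Because the walk advances through both inputs in a single pass, it gives correct results only when both are sorted on the key fields, which is why the step needs sorted input. In an incremental update, the flag can then drive the next action: insert the new rows, update the changed ones, remove the deleted ones, and skip the identical ones.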