Non Matching Rows in ADF Data Flow Lookup Transformation

If you have been coming from traditional SSIS background then definitely you must have used Lookup transformation there which is the most common transformation in any ETL workload. And one of the native capability of this transformation there, was to route matching and non matching records to different paths…for a quick refresher, this is what you used to see in SSIS –

As you transition to Azure Data Factory’s Data Flow transformations, at the first glance you might be expecting something similar here also. But this is not available here as a configuration. So, let’s find out how you can handle matching and non matching rows coming from Lookup transformation in ADF Data Flow

Conceptually, as we know Lookup transformation is a simulation of Left Join and this is what Data Flow preserves it as a bottom line and hence gives you the output from Lookup transformation accordingly i.e. you get all the rows from first table and data in columns from second table only for the matching rows.

However, you have the option of using inequality operators for joining condition here.

Now, the question comes, how do you classify the rows which matched and which didn’t match so that you can process them accordingly in the downstream flow. There is NO configuration available to achieve this as part of Lookup transformation. You will have to use Conditional Split transformation to achieve this segregation with the following configuration –

isMatch() is the function that helps you to identify the rows that matched from its input Lookup transformation. Since, Conditional Split is a separate transformation so it offers its native capabilities too instead of just doing a classification of matching vs. not matching rows from Lookup transformation which can give you further lot of benefits depending on your use case. For example, you can have several conditions there to direct the incoming rows to several output streams.

For simplicity, consider Conditional Split as SQL Server CASE statement where isMatch() function is helping one scenario of identifying matched rows from Lookup transformation. Putting no condition serves as equivalent to ELSE part of CASE statement in SQL Server.

If you have any feedback, questions or further thoughts, you may please write into the comment section of this post.

Leave a Reply