-
Enhancement
-
Resolution: Done
-
Major
-
None
-
None
-
False
-
None
-
False
Feature request or enhancement
For feature requests or enhancements, provide this information, please:
Which use case/requirement will be addressed by the proposed feature?
Currently, sqlserver connector running with data.query.mode=direct requires a custom database index in order to get good performance. The default/clustered index couldn’t be used because it has __$command_id in between other columns. According to the documentation its purpose is to sort rows in a transaction similar to __$seqval. Moreover, according to the doc __$command_id should be used for sorting, not __$seqval:
__$seqval binary(10) Sequence of the operation as represented in the transaction log. Should not be used for ordering. Instead, use the __$command_id column.
Given the fact that sorting by __$command_id and by __$seqval produces the same ordering, it’s safe to add __$command_id to the ORDER BY clause of the query and this should allow the database to use the clustered index.
Looking at the execution plans confirms that. Here is a query without __$command_id:
select top 1000 __$start_lsn, __$seqval, __$operation from CXPPerformanceTest.cdc.DebeziumDataCapture_dbo_Accounts_20240905070510500_CT ddcdac where __$start_lsn >= CONVERT(binary,'0x000994E4000021D0003C',1) and __$start_lsn <= CONVERT(binary,'0x000994E4000021D0003C',1) and [__$seqval] >= CONVERT(binary, '0x000994CF000173780043', 1) order by __$start_lsn asc, __$seqval asc, __$operation asc
According to its execution plan TotalSubtreeCost is 69.386566.
The cost of the same query but with __$command_id added to the ORDER BY clause is 0.09784419.
Implementation ideas (optional)
Adding [__$command_id] ASC to the ORDER BY clause eliminates the need for a custom index.
The proposed patch is here: https://github.com/debezium/debezium/pull/6279