Change data capture utilizes the SQL Server Agent to log insertions, updates, and deletions occurring in a table. So, it makes these data changes accessible to be easily consumed using a relational format. - learn.microsoft.com
This is useful when data from different databases is sucked up into a data lake, this way we can easily keep the data lake current with changes we care about.
Simple POC
You can use sample databases from Microsoft like AdventureWorks or Northwind or you can just invent your own. For simplicity and reasons of being a nerd I did the latter.
A. Create a container for the SQL Server with settings ready for CDC
Local Docker SQL Container
- Spin up a Docker SQL Container. This will start the container with Product ID (PID) as Developer Edition.
| 1 | docker run --name cdc-container -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Password!" -e MSSQL_AGENT_ENABLED=true -p 5901:1433 -d mcr.microsoft.com/mssql/server:2019-latest | 
- --name cdc-containernames the container, else you will get a random name like- evil_ptolemy
- -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Password!" -e MSSQL_AGENT_ENABLED=trueare environmental variables,- MSSQL_AGENT_ENABLEDis important as this agent runs the CDC jobs
- -p 5901:1433is the port, so 5001 is your machine which maps to 1433 in the container
-  -dmeans run detached from the console
- Connect to the instance from SQL Server Management Studio
| 1 | Server Name: localhost,5901 | 
- Create the sample database and table, I adapted these from w3schools. The table needs a primary key for CDC to track on.
| 1 | CREATE DATABASE databasename; | 
You should now see the database in the object explorer:

B. Confirgure the SQL database to track CDC changes
Setup CDC
- The ownership of the table should be sa, if you restored a sample database you will need to run ALTER AUTHORIZATION before verifying the owner with the SQL below
| 1 | SELECT d.name, d.owner_sid, sl.name | 

- Enable tracking changes, you can also disable with the commented out procedure below it.
| 1 | EXEC sys.sp_cdc_enable_db | 
- Verify table changes now exists, it will be blank but this means the above was successful. The cdcschema tables are surfaced underSystem Tables
| 1 | SELECT * FROM cdc.change_tables | 

- Enable changes for the table person.
| 1 | EXEC sys.sp_cdc_enable_table | 
- capture_instanceis the new table name. NULL will set the default based on the params example- cdc.dbo_person_CT. You could set this to be- foo_person_changes_CT(the- _CTis always suffixed by default)
- supports_net_changesis for when the column changes frequently, so only capture the net change (ie the last one)
- captured_column_listthese are column you want to track, you have to include the primary key
This will take a few seconds to complete as its creating jobs. The capture job captures the data as it changes from the log file. The cleanup job purges the captured data, it will only keep it for 3 days by default. You can change this with @retention= which needs to be in minutes.

If you want to disable the table you can use this SQL
| 1 | EXEC sys.sp_cdc_disable_table | 
- Checking the table changes should now surface that personis being tracked
| 1 | SELECT * FROM cdc.change_tables | 

- Now query the change table, it should be empty
| 1 | SELECT * FROM cdc.dbo_person_CT | 

- Knowing what each change table column is used for helps understand the data when quering it
- _$start_lsncommit log sequence number (LSN) within the same Transaction
- _$seqvalorder changes within a transaction
- _$operation1=delete, 2=insert,3=updatebefore, 4=updateafter
- _$update_maskfor insert and delete all bits are set, for update bits set correspond to columns changed
C. Test and view the CDC changes
Test CDC
Now we can finally test the tracked changed
- INSERT some data, this is _$operation2
| 1 | INSERT INTO databasename.dbo.person | 
- UPDATE the data, this is _$operation3 and 4
| 1 | UPDATE databasename.dbo.person | 
- DELETE the data, this is _$operation1
| 1 | DELETE FROM databasename.dbo.person | 
- Query the changes
| 1 | -- INSERTS | 
Here you can see for __$operation = 2 new data was created

For __$operation IN (3,4) we have the update before (3) and update after (4) and for both __$seqval is the same meaning it was in the same transaction

For __$operation = 1 new data was deleted
