This is a Rosetta Code post.
Story
Consider the following relational database: FooDatabase
In the context of this database a tenant
can only be one of App One
, App Two
or User
. Each User
has their own unique GUID and the applications have the following GUIDs:
Application Name | GUID |
---|---|
App One | 3ba2e2e8-f093-41d8-8daf-64d5385604ac |
App Two | efec5b17-47fc-4d79-ae74-6e8b023023d5 |
13 333 records in dbo.item_line
have been persisted incorrectly. The column dbo.item_line.tenant_id
has been populated with the GUID for App One
for the period 01 Jan 2018
to 31 Dec 2018
when it should have been populated with the GUID for the User
tenant. This date is stored in dbo.item_line.created_date_uct
.
Thankfully dbo.item_line.created_by_user_id
is a mandatory field and will contain valid user GUIDs. Note that not all users will exist in dbo.tenant
.
The cleanup must be idempotent, this means it can be run multiple times without changing the result beyond the initial story/task.
Task
Cleanup the data using a SQL script in a safe manner.
- Identify all records in
dbo.item_line
that need to be updated, print the count. - Create records in
dbo.tenant
that dont exist when linking todbo.item_line.created_by_user_id
, print this count too. - Update
dbo.item_line.tenant_id
to be the newly createddbo.tenant.id
when it did not exist, else update to bedbo.item_line.created_by_user_id
. Print the total count of updates. This should match the first count printed.
Using the supplied data the prints and counts should be:
1 | total dbo.item_line to be updated : 13333 |
Setup
Spin up a MS SQL Instance, you can then connect to it from Microsoft SQL Server Management Studio on localhost,1500
1 | docker run --name=rosetta-df -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=Password123" -p 1500:1433 -d mcr.microsoft.com/mssql/server:2017-latest |
1 | 001 CREATE DATABASE.sql |
This should get the data into a state that matches the counts above.
Solutions
set operations
A set is an exactly defined collection of objects, this can be accomplished by selecting into a #temptable.
cursor
Use a STATIC cursor to iterate over the data.
WITH common_table_expression
Specifies a temporary named result set, known as a common table expression (CTE).
References
- https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?redirectedfrom=MSDN&view=sql-server-ver15
- https://docs.microsoft.com/en-us/sql/t-sql/language-elements/declare-cursor-transact-sql?view=sql-server-ver15
- https://www.sqlshack.com/understanding-the-interaction-between-set-theory-and-set-operators-in-sql-server/
- https://sqlblog.org/2012/01/26/bad-habits-to-kick-thinking-a-while-loop-isnt-a-cursor