Skip to content

Commit

Permalink
Merge pull request #1080 from redis/DOC-4744-rdi-sql-server-prep
Browse files Browse the repository at this point in the history
DOC-4744 added updated SQL Server prep instructions
  • Loading branch information
andy-stark-redis authored Jan 21, 2025
2 parents 0bf0f69 + 62aba4a commit fc3905b
Showing 1 changed file with 114 additions and 69 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -16,66 +16,82 @@ type: integration
weight: 2
---

To prepare your SQL Server database for Debezium, you must first run a query to
enable CDC globally and then separately enable CDC for each table you want to
To prepare your SQL Server database for Debezium, you must first create a dedicated Debezium user,
run a script to enable CDC globally, and then separately enable CDC for each table you want to
capture. You need administrator privileges to do this.

Once you enable CDC, it captures all of the INSERT, UPDATE, and DELETE operations
on your chosen tables. The Debezium connector can then emit these events to
[Kafka topics](https://kafka.apache.org/intro#intro_concepts_and_terms).
on your chosen tables. The Debezium connector can then emit these events to RDI.

## 1. Enable CDC on the database
## 1. Create a Debezium user

There are two system stored procedures to enable CDC (you need
administrator privileges to run these). Use `sys.sp_cdc_enable_db`
to enable CDC for the whole database and then
You can run the procedure with SQL Server Management Studio or with
Transact-SQL.

Before running the procedure, ensure that:

- You are a member of the `sysadmin` fixed server role for the SQL Server.
- You are a `db_owner` of the database.
- The SQL Server Agent is running.

Then, follow the steps below to enable CDC:
It is strongly recommended to create a dedicated Debezium user for the connection between RDI
and the source database. When using an existing user, ensure that the required
permissions are granted and that the user is added to the CDC role.

1. From the **View** menu in SQL Server Management Studio, click **Template Explorer**.
1. Create a user with the Transact-SQL below:

1. In the Template Browser, expand **SQL Server Templates**.
```sql
USE master
GO
CREATE LOGIN MyUser WITH PASSWORD = 'My_Password'
GO
USE MyDB
GO
CREATE USER MyUser FOR LOGIN MyUser
GO
```

1. Expand **Change Data Capture > Configuration** and then click **Enable Database for CDC**.
Replace `MyUser`, `My_Password` and `MyDB` with your chosen values.

1. In the template, replace the database name in the `USE` statement with the name of the
database where you want to enable CDC. For example, if your database was called
`myDB`, the template would be:
1. Grant required permissions:

```sql
USE master
GO
GRANT VIEW SERVER STATE TO MyUser
GO
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
EXEC sp_addrolemember N'db_datareader', N'MyUser'
GO
```

1. Run the stored procedure `sys.sp_cdc_enable_db` to enable CDC for the database.
## 2. Enable CDC on the database

When you enable CDC for the database, it creates a schema called `cdc` and also
a CDC user, metadata tables, and other system objects.
There are two system stored procedures to enable CDC (you need
administrator privileges to run these). Use `sys.sp_cdc_enable_db`
to enable CDC for the whole database and then `sys.sp_cdc_enable_table` to enable CDC for individual tables.

Before running the procedures, ensure that:

Keep the **Change Data Capture > Configuration** foldout open in the Template Explorer
because you will need it to enable CDC on the individual tables next.
- You are a member of the `sysadmin` fixed server role for the SQL Server.
- You are a `db_owner` of the database.
- The SQL Server Agent is running.

Then, assuming your database is called `MyDB`, run the script below to enable CDC:

## 2. Enable CDC for the tables you want to capture
```sql
USE MyDB
GO
EXEC sys.sp_cdc_enable_db
GO
```

You must also enable CDC on the tables you want Debezium to capture using the
following steps (again, you need administrator privileges for this):
{{< note >}}For SQL Server on AWS RDS, you must use a different stored procedure:
```sql
EXEC msdb.dbo.rds_cdc_enable_db 'Chinook'
GO
```
{{< /note >}}

When you enable CDC for the database, it creates a schema called `cdc` and also
a CDC user, metadata tables, and other system objects.

1. With the **Change Data Capture > Configuration** foldout still open in the
Template Explorer, select **Enable Table Specifying Filegroup Option**.
## 3. Enable CDC for the tables you want to capture

1. In the template, replace the table name in the USE statement with the name of
the table you want to capture. For example, if your table was called `MyTable`
then the template would look like the following:
1. You must also enable CDC on the tables you want Debezium to capture using the
following commands (again, you need administrator privileges for this):

```sql
USE MyDB
Expand All @@ -85,38 +101,34 @@ following steps (again, you need administrator privileges for this):
@source_schema = N'dbo',
@source_name = N'MyTable',
@role_name = N'MyRole',
@filegroup_name = N'MyDB_CT',
@supports_net_changes = 0
GO
```

Repeat this for every table you want to capture.

{{< note >}}The value for `@role_name` can’t be a fixed database role, such as `db_datareader`.
Specifying a new name will create a corresponding database role that has full access to the
captured change data.
{{< /note >}}

1. Run the stored procedure `sys.sp_cdc_enable_table` to enable CDC for
the table.
1. Add the Debezium user to the CDC role:

1. Repeat steps 1 to 3 for every table you want to capture.
```sql
USE MyDB
GO
EXEC sp_addrolemember N'MyRole', N'MyUser'
GO
```

## 3. Check that you have access to the CDC table
## 4. Check that you have access to the CDC table

You can use another stored procedure `sys.sp_cdc_help_change_data_capture`
to query the CDC information for the database and check you have enabled
it correctly. Before doing this, check that:

* You have `SELECT` permission on all of the captured columns of the capture instance.
If you are a member of the `db_owner` database role then you can view information for
all of the defined capture instances.
* You are a member of any gating roles that are defined for the table that the query includes.

Follow the steps below to run `sys.sp_cdc_help_change_data_capture`:

1. From the **View** menu in SQL Server Management Studio, click **Object Explorer**.

1. From the Object Explorer, expand **Databases**, and then expand your database
object, for example, `MyDB`.

1. Expand **Programmability > Stored Procedures > System Stored Procedures**.
it correctly. To do this, connect as the Debezium user you created previously (`MyUser`).

1. Run the `sys.sp_cdc_help_change_data_capture` stored procedure to query
the table. For example, if your database was called `MyDB` then you would
the CDC configuration. For example, if your database was called `MyDB` then you would
run the following:

```sql
Expand All @@ -131,14 +143,31 @@ Follow the steps below to run `sys.sp_cdc_help_change_data_capture`:
access. If the result is empty then you should check that you have privileges
to access both the capture instance and the CDC tables.

## SQL Server on Azure
### Troubleshooting

You can also use the Debezium SQL Server connector with SQL Server on Azure.
See Microsoft's guide to
[configuring SQL Server on Azure for CDC with Debezium](https://learn.microsoft.com/en-us/samples/azure-samples/azure-sql-db-change-stream-debezium/azure-sql%2D%2Dsql-server-change-stream-with-debezium/)
for more information.
If no CDC is happening then it might mean that SQL Server Agent is down. You can check for this using the SQL query shown below:

```sql
IF EXISTS (SELECT 1
FROM master.dbo.sysprocesses
WHERE program_name = N'SQLAgent - Generic Refresher')
BEGIN
SELECT @@SERVERNAME AS 'InstanceName', 1 AS 'SQLServerAgentRunning'
END
ELSE
BEGIN
SELECT @@SERVERNAME AS 'InstanceName', 0 AS 'SQLServerAgentRunning'
END
```

If the query returns a result of 0, you need to need to start SQL Server Agent using the following commands:

### SQL Server capture job agent configuration parameters
```sql
EXEC xp_servicecontrol N'START',N'SQLServerAGENT';
GO
```

## SQL Server capture job agent configuration parameters

In SQL Server, the parameters that control the behavior of the capture job agent
are defined in the SQL Server table `msdb.dbo.cdc_jobs`. If you experience performance
Expand Down Expand Up @@ -169,6 +198,13 @@ of the Debezium SQL Server connector:

See the SQL Server documentation for more information about capture agent parameters.

## SQL Server on Azure

You can also use the Debezium SQL Server connector with SQL Server on Azure.
See Microsoft's guide to
[configuring SQL Server on Azure for CDC with Debezium](https://learn.microsoft.com/en-us/samples/azure-samples/azure-sql-db-change-stream-debezium/azure-sql%2D%2Dsql-server-change-stream-with-debezium/)
for more information.
## Handling changes to the schema
RDI can't adapt automatically when you change the schema of a CDC table in SQL Server. For example,
Expand All @@ -186,19 +222,28 @@ documentation for further details.
1. Create a new capture table for the updated source table by running the `sys.sp_cdc_enable_table` stored
procedure with a new, unique value for the parameter `@capture_instance`. For example, if the old value
was `dbo_customers`, you could replace it with `dbo_customers_v2`:
was `dbo_MyTable`, you could replace it with `dbo_MyTable_v2` (you can see the existing values by running
stored procedure `sys.sp_cdc_help_change_data_capture`):
```sql
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'customers', @role_name = NULL, @supports_net_changes = 0, @capture_instance = 'dbo_customers_v2';
EXEC sys.sp_cdc_enable_table
@source_schema = N'dbo',
@source_name = N'MyTable',
@role_name = N'MyRole',
@capture_instance = N'dbo_MyTable_v2',
@supports_net_changes = 0
GO
```
1. When Debezium starts streaming from the new capture table, drop the old capture table by running
the `sys.sp_cdc_disable_table` stored procedure with the parameter `@capture_instance` set to the old
capture instance name, `dbo_customers`:
capture instance name, `dbo_MyTable`:
```sql
EXEC sys.sp_cdc_disable_table @source_schema = 'dbo', @source_name = 'dbo_customers', @capture_instance = 'dbo_customers';
EXEC sys.sp_cdc_disable_table
@source_schema = N'dbo',
@source_name = N'MyTable',
@capture_instance = N'dbo_MyTable'
GO
```
Expand Down

0 comments on commit fc3905b

Please sign in to comment.