Azure Synapse Analytics Dedicated SQL Pool (Data Warehouse) SQL Database

Azure Link For SQL

Azure Link For SQL enables near real time analytics over operational data in an external SQL Database.

Azure Link can now be used with SQL Server 2022 and Azure Database, and it works by replicating the specified tables into a Synapse SQL Pool automatically, in an incremental way. This allows to avoid creating a (possibly) complex ETL process to keep the Datawarehouse in sync.

This integration can be done with a SQL Server and with an Azure SQL Database. For SQL Server a Master key needs to be created both in the Datawarehouse and the SQL Server, and it is required to use a Self-Hosted integration runtime hosted in a machine with access to the server.

For Azure SQL Database the process is much simpler, but there’s a pricing limitation as Synapse Link can only be used in Databases with at least Standard tier (so no Basic tier) and 100 DTUs. The cost of this is 147.19 USD every month, so this limitation is something to keep in mind:

Minimum specification to be able to use Synapse link. Lowering the Data max size doesn’t have an impact on the cost.

Creating a Synapse Link for Azure SQL Database

In this case I’m going to do this example using the “Azure SQL Database” path as it’s simpler. So the first step is to create a Linked Service connecting to the Azure SQL Database as usual (in this case I’ve used System Managed Identity for authentication). As stated previously, the Database needs to be at least Standard Tier and have 100 DTUs and Managed Identity enabled.

After that we can create the Link:

During the process you can select in which SQL Pool do you want to copy, and which tables are going to replicated. It is important to keep in mind that only tables with a primary key can be used.

With the link connection created, you have the chance to further customize the process, renaming the tables, changing the distribution type and the structure type (which need to be set to ‘Heap’ whenever your table contains nvarchar(max) columns), and deciding whether you want to drop already existing tables or not. Once everything is set up the link connection can be published and started. After this, no further steps needs to be taken in Synapse. You can stop the link at any point and monitor the connection:

After this, every insertion, update or deletion in the source will be reflected in the SQL Pool in less than a minute.