In contrast to Oracle where parallel DML is possible since a long time in SQL Server it is only until the 2016 last release where a parallel insert-select has been made possible. This article aims to illustrate this parallel operation in SQL Server and show how the resemblance with Oracle is very accurate.
I am going first to create the source and the target tables necessary for the demonstration
drop table t1; drop table t2; create table t1 (id INT, Nom VARCHAR(100), Prenom VARCHAR(100), Ville VARCHAR(100) ); insert into t1(id, Nom, Prenom, Ville) select TOP 1000000 ROW_NUMBER() over (order by a.name) ROWID, 'Bruno', CASE WHEN ROW_NUMBER() over (order by a.name)%2 =1 THEN 'Smith' ELSE 'Mohamed' END, CASE WHEN ROW_NUMBER() over (order by a.name)%10 =1 THEN 'Paris' WHEN ROW_NUMBER() over (order by a.name)%10 =5 THEN 'Lille' WHEN ROW_NUMBER() over (order by a.name)%10 =3 THEN 'Marseille' ELSE 'Londres' END FROM sys.all_objects a cross join sys.all_objects b; create table t2 (id INT, Nom VARCHAR(100), Prenom VARCHAR(100), Ville VARCHAR(100) );
I have created table t1 with 1,000,000 rows and have cloned it into an empty t2 table. I am going below to insert the content of t1 into t2 table hoping that this insert will operate in parallel.
Here’s below the SQL Server version I am using:
select @@version; Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64) Copyright (c) Microsoft Corporation Developer Edition
One of the parallel DML pre-requisites in SQL Server is the compatibility_level parameter which should have a value set to 130. Let’s verify this parameter value before trying the insert/select operation:
select name, compatibility_level from sys.databases where name = 'master'; name compatibility_level master 130
If the value of this parameter is not equal to 130 you can set it using the following command (I don’t know the impact of this change on your application so don’t change it without measuring its possible side effects ):
alter database master set compatibility_level = 130;
Finally I am now ready to launch the insert/select operation and gather its corresponding execution plan:
insert into t2 WITH (TABLOCK) select * from t1;
Notice that I have added a WITH (TABLOCK) clause to this insert so that it can operates in parallel. In close comparison with Oracle direct path load where a lock is automatically acquired on the inserted table (until the transaction is committed) it seems that in SQL Server we need to explicitly acquire this lock with the (TABLOCK) clause.
You might ask how I have figured out that the insert/select operation has been done in parallel. In fact the two yellow arrows in the above execution plan indicate the parallel nature of the execution. However the two yellow arrows do not indicate that the DOP (Degree Of Parallelism) is 2. If you want to know the actual DOP used by this operation you have to hover over the first operation (towards the left : INSERT) to see a tooltip showing that degree of parallelism if 4. However I still have not found a convenient way to capture a mouse tooltip using greenshot. When dealing with SQL Server execution plan I very often prefer the free version of the SQL Sentry plan explorer from which I have captured the following part of the above execution plan where you can see that the number of executions (Actual Executions or Starts in Oracle terms) equals 4 indicating that the DOP is 4:
Notice by the way that in contrast to the graphical execution plan where the INSERT operation doesn’t seem to be done in parallel the SQL Sentry Plan explorer is clearly indicating that the insert has been operated in parallel.
If you know how parallel process is handled in Oracle you will certainly not be disappointed when you start dealing with parallel processing in SQL Server. Almost all the parallel concepts are identical. That is:
- The maximum number of 2 concurrent DFO active per DFO tree
- The number of parallel servers (thread) which is 2 * DOP
- The different data parallel distribution between servers
- etc…
I have mentioned above that, one of the pre-requisites for a parallel insert/select operation to be successfully accomplished is the explicit lock of the inserted table. This has paved for me the way to check whether the Oracle direct path impeaching reasons: triggers and foreign keys can also restrict the parallel insert in SQL Server. Here’s then the demo; first with a trigger and second with a foreign key implemented on the t2 table:
CREATE TRIGGER t2_A_Ins_ ON t2 FOR INSERT AS begin declare @Firstname nvarchar(50) set @Firstname = 'Unknown' END;
truncate table t2; insert into t2 WITH (TABLOCK) select * from t1;
As you can point it out the insert trigger annihilates the parallel insert operations in SQL Server.
drop trigger master.t2_A_Ins; ALTER TABLE t1 ADD CONSTRAINT t1_pk UNIQUE (id); ALTER TABLE t2 ADD CONSTRAINT t2_t1_fk FOREIGN KEY (id) references t1(id);
And now an insert on table t2 having a foreign key
truncate table t2; insert into t2 WITH (TABLOCK) select * from t1;
Again the existence of a Foreign key on the t2 table pre-empted a parallel insert/select operation.
Funny enough the resemblance with Oracle.
Before pushing the “Publish” button I have played again with the model and from time to time I was unable to reproduce exactly the conclusions made in this article about the parallel insert impeachment reasons. I will certainly be back to this article when my investigations will be finished
