When Loading Data, Should I Drop Indexes or Not?

I just ran a few simple tests in Azure SQL DB to see how each would perform.

I have a target table with an identity column that is the clustered primary key, and two other indexes that are the same except for the field order. (Whether having both is useful is a question for another day.) Here’s the DDL for the target table:

CREATE TABLE [ETL1](
	Field1 [numeric](12, 0) NULL,
	Field2 [smallint] NULL,
	Field3 [int] NULL,
	Field4 [smallint] NULL,
	Field5 [nvarchar](1) NULL,
	Field6 [nvarchar](2) NULL,
	Field7 [numeric](4, 0) NULL,
	Field8 [numeric](2, 0) NULL,
	Field9 [nvarchar](2) NULL,
	Field10 [nvarchar](8) NULL,
	Field11 [datetime2](7) NULL,
	Field12 [nvarchar](8) NULL,
	Field13 [datetime2](7) NULL,
	[UpdateType] [nchar](1) NOT NULL,
	[RowCreated] [datetime2](7) NOT NULL,
	Id BIGINT IDENTITY NOT NULL
) ON [PRIMARY]
GO

CREATE UNIQUE CLUSTERED INDEX iscdpk ON ETL1 (Id) WITH (DATA_COMPRESSION = PAGE);
CREATE UNIQUE NONCLUSTERED  INDEX iscd1 ON ETL1 (UpdateType, RowCreated, Field1, Field2) WITH (DATA_COMPRESSION = PAGE);
CREATE UNIQUE NONCLUSTERED  INDEX iscd2 ON ETL1 (Field1, Field2, UpdateType, RowCreated) WITH (DATA_COMPRESSION = PAGE);
GO

Test 1: Truncate the Target, Drop the Indexes, Insert the Data, Recreate the Indexes.

Test 2: Drop the Indexes, Truncate the Target, Insert the Data, Recreate the Indexes

Test 3: Just Truncate the Target and Insert the Data

Test 4: Truncate the Target, Drop the non-clustered Indexes (leaving the the clustered index on the identity column), Insert the Data, Recreate the non-clustered Indexes.

Here are the results. All timings are in milliseconds. These were run on a PRS1 instance of Azure SQL Database.

Test 1:
Trunc then
Drop Idxs
Test 2:
Drop before
Trunc
Test 3:
No Drop/
Create
Test 4:
Trunc but don’t
drop clustered
index
Truncate4 2 04
Drop PK8 4  n/a  n/a 
Drop Index 15 23,630 n/a 2
Drop Index 26 2  n/a 2
Insert 1.84 M rows83,033 82,315 161,706 83,205
Create PK20,454 21,205  n/a  n/a 
Create Index 112,149 12,264  n/a 12,265
Create Index 211,142 11,313  n/a 11,247
Total Time (ms)126,801 150,735 161,706 106,725
Total Time (mins)2.11 2.51 2.70 1.78
Delta (ms)0 23,934 34,905 (20,076)

Test 4 was the clear winner as it avoided the cost of recreating the clustered index. Which makes sense as the clustered index was being filled in order by the identity column as rows were added. Test 1 came in second, so if your clustered index is not on an identity column, or you have no clustered index, you are still better off dropping and recreating the indexes.

Conclusion: When inserting larger data sets into an empty table, drop the indexes before inserting the data, unless the index is clustered on an identity column.

2 thoughts on “When Loading Data, Should I Drop Indexes or Not?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.