When Loading Data, Should I Drop Indexes or Not?

I just ran a few simple tests in Azure SQL DB to see how each would perform.

I have a target table with an identity column that is the clustered primary key, and two other indexes that are the same except for the field order. (Whether having both is useful is a question for another day.) Here’s the DDL for the target table:

CREATE TABLE [ETL1](
	Field1 [numeric](12, 0) NULL,
	Field2 [smallint] NULL,
	Field3 [int] NULL,
	Field4 [smallint] NULL,
	Field5 [nvarchar](1) NULL,
	Field6 [nvarchar](2) NULL,
	Field7 [numeric](4, 0) NULL,
	Field8 [numeric](2, 0) NULL,
	Field9 [nvarchar](2) NULL,
	Field10 [nvarchar](8) NULL,
	Field11 [datetime2](7) NULL,
	Field12 [nvarchar](8) NULL,
	Field13 [datetime2](7) NULL,
	[UpdateType] [nchar](1) NOT NULL,
	[RowCreated] [datetime2](7) NOT NULL,
	Id BIGINT IDENTITY NOT NULL
) ON [PRIMARY]
GO

CREATE UNIQUE CLUSTERED INDEX iscdpk ON ETL1 (Id) WITH (DATA_COMPRESSION = PAGE);
CREATE UNIQUE NONCLUSTERED  INDEX iscd1 ON ETL1 (UpdateType, RowCreated, Field1, Field2) WITH (DATA_COMPRESSION = PAGE);
CREATE UNIQUE NONCLUSTERED  INDEX iscd2 ON ETL1 (Field1, Field2, UpdateType, RowCreated) WITH (DATA_COMPRESSION = PAGE);
GO

Test 1: Truncate the Target, Drop the Indexes, Insert the Data, Recreate the Indexes.

Test 2: Drop the Indexes, Truncate the Target, Insert the Data, Recreate the Indexes

Test 3: Just Truncate the Target and Insert the Data

Test 4: Truncate the Target, Drop the non-clustered Indexes (leaving the the clustered index on the identity column), Insert the Data, Recreate the non-clustered Indexes.

Here are the results. All timings are in milliseconds. These were run on a PRS1 instance of Azure SQL Database.

	Test 1: Trunc then Drop Idxs	Test 2: Drop before Trunc	Test 3: No Drop/ Create	Test 4: Trunc but don’t drop clustered index
Truncate	4	2	0	4
Drop PK	8	4	n/a	n/a
Drop Index 1	5	23,630	n/a	2
Drop Index 2	6	2	n/a	2
Insert 1.84 M rows	83,033	82,315	161,706	83,205
Create PK	20,454	21,205	n/a	n/a
Create Index 1	12,149	12,264	n/a	12,265
Create Index 2	11,142	11,313	n/a	11,247
Total Time (ms)	126,801	150,735	161,706	106,725
Total Time (mins)	2.11	2.51	2.70	1.78
Delta (ms)	0	23,934	34,905	(20,076)

Test 4 was the clear winner as it avoided the cost of recreating the clustered index. Which makes sense as the clustered index was being filled in order by the identity column as rows were added. Test 1 came in second, so if your clustered index is not on an identity column, or you have no clustered index, you are still better off dropping and recreating the indexes.

Conclusion: When inserting larger data sets into an empty table, drop the indexes before inserting the data, unless the index is clustered on an identity column.

2 thoughts on “When Loading Data, Should I Drop Indexes or Not?”

One Man went to Mow…

…went to mow a meadow. One man's journey into uncharted territory.

When Loading Data, Should I Drop Indexes or Not?

2 thoughts on “When Loading Data, Should I Drop Indexes or Not?”

Leave a comment Cancel reply

Share this:

Related

2 thoughts on “When Loading Data, Should I Drop Indexes or Not?”

Leave a comment Cancel reply