Friday, July 10, 2015

Alternative to cursor-based code for maintenance operations

Introduction

Many maintenance operations in SQL Server require us to iterate over some list and perform one or more operations against each item in the list. For example, perhaps you want to backup a number of databases or rebuild a number of indexes. Since the basic BACKUP and ALTER INDEX commands cannot take variables for database or index names, we often see a pattern like this:

DECLARE mycursor as CURSOR FOR SELECT ... OPEN mycursor DECLARE @var1 ... FETCH NEXT FROM mycursor into @var1 ... WHILE @@FETCH_STATUS = 0 BEGIN -- do something with @var1 FETCH NEXT mycursor into @var1 END CLOSE mycursor DEALLOCATE mycursor

In Real Life

For a real-life example, consider ETL (Extract, Transform and Load).  A typical sequence runs like this:

  1. Extract rows from some source
  2. Change/Add (transform) the data
  3. Load the rows into some target table
Often, prior to step 3, it is helpful to disable the non-clustered indexes (NCIs), then re-build them when the load is done.  (As a side note, if your table has a clustered index, and you ensure proper ordering of your data to be loaded, you want the clustered index.  You can't disable it, since that effectively disables access to the table (since the clustered index is the table)).

Using a Cursor

Here is a script that will disable the NCIs for a table in the AdventureWorks database:


DECLARE cur CURSOR READ_ONLY FAST_FORWARD FOR SELECT i.NAME ,o.NAME FROM sys.indexes i INNER JOIN sys.objects o ON i.object_id = o.object_id INNER JOIN sys.schemas s ON s.schema_id = o.schema_id WHERE i.type_desc = 'NONCLUSTERED' AND o.type = 'U' AND o.NAME = 'EmployeeDepartmentHistory' AND s.NAME = 'HumanResources'; OPEN cur; DECLARE @ix_name SYSNAME, @db_name SYSNAME; FETCH NEXT FROM cur INTO @ix_name, @db_name; WHILE @@FETCH_STATUS = 0 BEGIN PRINT 'ALTER INDEX ' + @ix_name + ' ON ' + @db_name + ' DISABLE;' + CHAR(13) + CHAR(10); FETCH NEXT FROM cur INTO @ix_name, @db_name; END; CLOSE cur; DEALLOCATE cur;


This does the job quite handily.  We can even use similar logic to rebuild the indexes when the load is done (or simply REBUILD ALL).  What I'd like to show is a method to do the same without using a cursor.

Using a set-based operation

The basic idea is to build up the ALTER INDEX commands as a single NVARCHAR string, then use sp_executesql to run it.  The code looks like this:


DECLARE @sql AS NVARCHAR(MAX) = ( SELECT 'ALTER INDEX ' + i.NAME + ' ON ' + o.NAME + ' DISABLE;' + CHAR(13) + CHAR(10) FROM sys.indexes i INNER JOIN sys.objects o ON i.object_id = o.object_id INNER JOIN sys.schemas s ON s.schema_id = o.schema_id WHERE i.type_desc = 'NONCLUSTERED' AND o.type = 'U' AND o.NAME = 'EmployeeDepartmentHistory' AND s.NAME = 'HumanResources' FOR XML PATH(''), TYPE ).value('.', 'NVARCHAR(MAX)'); PRINT @sql;


As you can see, there is less code to write and debug,  I also find it less visually distracting.  The net results are exactly the same.  The performance is also the same, except for the front-end job of building the query string.  As is usually the case, the set-based code from the second example out-performs the first.  However, since we are doing this for maintenance-type jobs, there are not millions of rows (perhaps only hundreds or thousands) so the run time of the front-end code is overwhelmed by the run time of the executed code.

Summary

For many small jobs where a cursor seems a natural way to go, there is an alternative using a set-based operation to construct a list of commands and running the list in one execution.  Run times are essentially the same but the code can be more compact and done in just two statements (build the list, run the list) instead of the several statements required by a cursor-based approach.


No comments:

Post a Comment