Lesson 58. The Only Bloody Good Reason To Use Cursors

(AKA Moving Large Amounts Of Data Between Tables)

In general, cursors are the devil. They are slow. They are from a processing paradigm of a bygone era. You should be focusing on developing set-based solutions and not loops.

However, cursors can be useful for loading large amounts of data. I am talking about loads that take hours where, if the load fails, it will take even more hours to unwind.

In this case, you can use cursors to commit batches of data and create checkpoints in your load process. Below is an example of how to do this.

Examples

Comprehensive Cursor Example

Below is a comprehensive example of how to load a lot of data with a cursor. We are going to use the sample dataset "Flights Table From the nycflights13 Dataset".

The file is about 30MB. Not large by today's standard, but large enough that you get the idea.

In [ ]:

USE demo

DECLARE @Year INT
DECLARE @Month INT
DECLARE @i INT = 1

DROP TABLE IF EXISTS FlightsStaging
DROP TABLE IF EXISTS SelectFlightData

CREATE TABLE FlightsStaging(
year NVARCHAR(255) NULL,
month NVARCHAR(255) NULL,
day NVARCHAR(255) NULL,
dep_time NVARCHAR(255) NULL,
sched_dep_time NVARCHAR(255) NULL,
dep_delay NVARCHAR(255) NULL,
arr_time NVARCHAR(255) NULL,
sched_arr_time NVARCHAR(255) NULL,
arr_delay NVARCHAR(255) NULL,
carrier NVARCHAR(255) NULL,
flight NVARCHAR(255) NULL,
tailnum NVARCHAR(255) NULL,
origin NVARCHAR(255) NULL,
dest NVARCHAR(255) NULL,
air_time NVARCHAR(255) NULL,
distance NVARCHAR(255) NULL,
hour NVARCHAR(255) NULL,
minute NVARCHAR(255) NULL,
time_hour NVARCHAR(255) NULL,
)

CREATE TABLE SelectFlightData(
carrier NVARCHAR(255) NULL,
flight NVARCHAR(255) NULL,
tailnum NVARCHAR(255) NULL,
BatchLoadNumber TINYINT NULL,
)

BULK INSERT FlightsStaging
FROM 'E:\flights.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a',
FIRSTROW = 2
);

DECLARE BatchingCursor CURSOR FOR
SELECT DISTINCT year, month
FROM FlightsStaging

OPEN BatchingCursor;
FETCH NEXT FROM BatchingCursor INTO @Year, @Month;
WHILE @@FETCH_STATUS = 0
BEGIN

BEGIN TRANSACTION
INSERT INTO SelectFlightData(carrier, flight, tailnum, BatchLoadNumber)
SELECT carrier, flight, tailnum, @i
FROM FlightsStaging
WHERE year = @Year AND month = @Month     
COMMIT TRANSACTION

SET @i = @i + 1

FETCH NEXT FROM BatchingCursor INTO @Year, @Month;
END;
CLOSE BatchingCursor;
DEALLOCATE BatchingCursor;
GO

SELECT *
FROM SelectFlightData

SELECT BatchLoadNumber, COUNT(BatchLoadNumber) AS NumberOfRecordsLoadedInBatch
FROM SelectFlightData
GROUP BY BatchLoadNumber
ORDER BY BatchLoadNumber

DROP TABLE FlightsStaging
DROP TABLE SelectFlightData

Last updated