Lesson 67. How To Pivot Data With T-SQL When Columns Are Not Predefined
The PIVOT function is great. Until you find out that the function demands to know the column names prior to running the query.
WACK! That's wack.
When does that even happen in the real world?! Never. That's not how this works. That's not how any of this works.
Fortunately, there is a solution that utilizes some dynamic SQL and the deep magic.
So, picture this exchange.
Accountant: Hey Bob, I need some data.
Me: Yo dawg! I'm your hook up! I'm about to HOOK you UP! What you want what you need dawg?
Accountant: I need to see the sales of all salespeople by month from the beginning of time.
Me: (sucks teeth) Man dawg that ain't nothing! I got you fam!
Accountant: Ok but I need to see it with people in rows and months as column headers.
Me: .....For real dawg?!
Unfortunately, this story is only partially made up. I don't really talk like this, but the request for the shape of his data totally happened.
But hey. This is business analytics. We're ALL about some customer service. So, we gives the people what they wants!
Some of this code is magic that I can't explain and that’s fine. I can use it, that’s what matters. I just keep this query handy and alter it every time I need to do this. Most of it is straight forward SQL. It’s the end that gets weird. We’re going to take all the possible columns and STUFF them into a variable using XML PATH. Then we are going to build our dynamic statement. Most of the code there should be familiar from the standard PIVOT example. The only difference is the columns variable is providing the columns instead of a hard coded value.
How To Use This Example
Create a SQL statement with columns in the form of y, x, z where:
y = the values that you want to see in rows x = the values that you want to see in columns z = the value that you want to see at the intersection of x and y coordinates
Alter the temp table schema to fit the column name and data type of your SQL statement.
Alter the rest of the query so column and table names are correct.
Alter the rest of the query to suit your use case.
In [ ]:
DECLARE @columns AS NVARCHAR(MAX)
DECLARE @sql AS NVARCHAR(MAX)
DROP TABLE IF EXISTS #SalesReport
--We turn month of sale into an string value so we can add a leading zero later.
CREATE TABLE #SalesReport (SalesPerson NVARCHAR(50), MonthOfSale NVARCHAR(2), TotalSales MONEY)
INSERT INTO #SalesReport (SalesPerson, MonthOfSale, TotalSales)
CONCAT(p.FirstName, ' ',p.LastName) AS SalesPerson,
DATEPART(month,soh.OrderDate) AS MonthOfSale,
SUM(sod.LineTotal) AS TotalSales
FROM Person.Person p
JOIN Sales.SalesPerson sp
ON p.BusinessEntityID = sp.BusinessEntityID
JOIN Sales.SalesOrderHeader soh
ON sp.BusinessEntityID = soh.SalesPersonID
JOIN Sales.SalesOrderDetail sod
ON soh.SalesOrderID = sod.SalesOrderDetailID
GROUP BY p.LastName, p.FirstName, DATEPART(month,soh.OrderDate)
ORDER BY p.LastName, p.FirstName, DATEPART(month,soh.OrderDate)
--Add the leading zero so the output goes from Jan to December.
SET MonthOfSale =
WHEN LEN(MonthOfSale) = 1 THEN '0'+ MonthOfSale
SELECT @columns = STUFF(
SELECT DISTINCT ',' + QUOTENAME(MonthOfSale)
FOR XML PATH(''), TYPE
SET @sql = 'SELECT SalesPerson, ' + @columns + ' FROM
SELECT SalesPerson, MonthOfSale, TotalSales
FOR MonthOfSale IN (' + @columns + ')
) y '
DROP TABLE #SalesReport