Window functions in MySQL allow you to perform calculations across a specific set of rows related to the current row, without collapsing the result into a single aggregate. These are useful for ranking, running totals, moving averages, and more.
Types of MySQL Window Functions
MySQL supports several categories of window functions:
- Ranking Functions:
RANK()
,DENSE_RANK()
,ROW_NUMBER()
- Aggregate Functions:
SUM()
,AVG()
,COUNT()
,MAX()
,MIN()
- Analytical Functions:
LEAD()
,LAG()
,NTILE()
- Running Total and Moving Average: Using
SUM()
orAVG()
Example 1: Ranking Orders by Order Date
Goal: Rank orders by order date for each customer.
SELECT
OrderID,
CustomerID,
OrderDate,
RANK() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderRank
FROM Orders;
Explanation
PARTITION BY CustomerID
groups the ranking within each customer.ORDER BY OrderDate
sorts the orders for ranking.RANK()
assigns ranks, with ties sharing the same rank and skipping numbers.
Example 2: Assigning Row Numbers to Orders
Goal: Assign a unique row number to each order for each customer.
SELECT
OrderID,
CustomerID,
OrderDate,
ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum
FROM Orders;
Difference Between RANK()
and ROW_NUMBER()
ROW_NUMBER()
assigns a unique number to each row, even if there are ties.RANK()
assigns the same rank to tied values, skipping numbers.
Example 3: Cumulative Order Amount Per Customer
Goal: Calculate a running total of order amounts per customer.
SELECT
CustomerID,
OrderID,
OrderDate,
SUM(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal
FROM Orders;
Explanation
SUM(Freight) OVER (...)
calculates a running total of shipping costs (Freight
).PARTITION BY CustomerID
resets the total for each customer.ORDER BY OrderDate
ensures the cumulative sum follows the chronological order.
Example 4: Moving Average of Freight Costs
Goal: Calculate a 3-order moving average of freight costs.
SELECT
OrderID,
CustomerID,
OrderDate,
Freight,
AVG(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Orders;
Explanation
AVG(Freight) OVER (...)
computes the moving average.ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
ensures only the last 3 orders (including the current) are considered.
Example 5: Finding Previous Order Date (LAG)
Goal: Find the previous order date for each order per customer.
SELECT
OrderID,
CustomerID,
OrderDate,
LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS PrevOrderDate
FROM Orders;
Explanation
LAG(OrderDate) OVER (...)
gets the previous order date.PARTITION BY CustomerID
ensures we compare only within the same customer.
Example 6: Finding Next Order Date (LEAD)
Goal: Find the next order date for each order per customer.
SELECT
OrderID,
CustomerID,
OrderDate,
LEAD(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS NextOrderDate
FROM Orders;
Explanation
LEAD(OrderDate) OVER (...)
retrieves the next order’s date.
Example 7: Grouping Orders into Buckets (NTILE
)
Goal: Divide orders into 4 equal-sized buckets for each customer.
SELECT
OrderID,
CustomerID,
OrderDate,
NTILE(4) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderGroup
FROM Orders;
Explanation
NTILE(4) OVER (...)
assigns orders into 4 equal-sized groups.- Useful for ranking customers into quartiles or performance segments.
Key Takeaways
- Window functions allow calculations over partitions of rows while keeping all records.
- Ranking functions (
RANK()
,ROW_NUMBER()
) help in ordered listings. - Analytical functions (
LAG()
,LEAD()
) retrieve previous/next row values. - Aggregate functions (
SUM()
,AVG()
) help in cumulative calculations.
Would you like a real dataset example, or a visualization for better understanding? 🚀