Window functions in MySQL allow you to perform calculations across a specific set of rows related to the current row, without collapsing the result into a single aggregate. These are useful for ranking, running totals, moving averages, and more.


Types of MySQL Window Functions

MySQL supports several categories of window functions:

  1. Ranking Functions: RANK(), DENSE_RANK(), ROW_NUMBER()
  2. Aggregate Functions: SUM(), AVG(), COUNT(), MAX(), MIN()
  3. Analytical Functions: LEAD(), LAG(), NTILE()
  4. Running Total and Moving Average: Using SUM() or AVG()

Example 1: Ranking Orders by Order Date

Goal: Rank orders by order date for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    RANK() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderRank
FROM Orders;

Explanation

  • PARTITION BY CustomerID groups the ranking within each customer.
  • ORDER BY OrderDate sorts the orders for ranking.
  • RANK() assigns ranks, with ties sharing the same rank and skipping numbers.

Example 2: Assigning Row Numbers to Orders

Goal: Assign a unique row number to each order for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum
FROM Orders;

Difference Between RANK() and ROW_NUMBER()

  • ROW_NUMBER() assigns a unique number to each row, even if there are ties.
  • RANK() assigns the same rank to tied values, skipping numbers.

Example 3: Cumulative Order Amount Per Customer

Goal: Calculate a running total of order amounts per customer.

SELECT 
    CustomerID,
    OrderID,
    OrderDate,
    SUM(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal
FROM Orders;

Explanation

  • SUM(Freight) OVER (...) calculates a running total of shipping costs (Freight).
  • PARTITION BY CustomerID resets the total for each customer.
  • ORDER BY OrderDate ensures the cumulative sum follows the chronological order.

Example 4: Moving Average of Freight Costs

Goal: Calculate a 3-order moving average of freight costs.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    Freight,
    AVG(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Orders;

Explanation

  • AVG(Freight) OVER (...) computes the moving average.
  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ensures only the last 3 orders (including the current) are considered.

Example 5: Finding Previous Order Date (LAG)

Goal: Find the previous order date for each order per customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS PrevOrderDate
FROM Orders;

Explanation

  • LAG(OrderDate) OVER (...) gets the previous order date.
  • PARTITION BY CustomerID ensures we compare only within the same customer.

Example 6: Finding Next Order Date (LEAD)

Goal: Find the next order date for each order per customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    LEAD(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS NextOrderDate
FROM Orders;

Explanation

  • LEAD(OrderDate) OVER (...) retrieves the next order’s date.

Example 7: Grouping Orders into Buckets (NTILE)

Goal: Divide orders into 4 equal-sized buckets for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    NTILE(4) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderGroup
FROM Orders;

Explanation

  • NTILE(4) OVER (...) assigns orders into 4 equal-sized groups.
  • Useful for ranking customers into quartiles or performance segments.

Key Takeaways

  • Window functions allow calculations over partitions of rows while keeping all records.
  • Ranking functions (RANK(), ROW_NUMBER()) help in ordered listings.
  • Analytical functions (LAG(), LEAD()) retrieve previous/next row values.
  • Aggregate functions (SUM(), AVG()) help in cumulative calculations.

Would you like a real dataset example, or a visualization for better understanding? 🚀

Window functions in MySQL allow you to perform calculations across a specific set of rows related to the current row, without collapsing the result into a single aggregate. These are useful for ranking, running totals, moving averages, and more.


Types of MySQL Window Functions

MySQL supports several categories of window functions:

  1. Ranking Functions: RANK(), DENSE_RANK(), ROW_NUMBER()
  2. Aggregate Functions: SUM(), AVG(), COUNT(), MAX(), MIN()
  3. Analytical Functions: LEAD(), LAG(), NTILE()
  4. Running Total and Moving Average: Using SUM() or AVG()

Example 1: Ranking Orders by Order Date

Goal: Rank orders by order date for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    RANK() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderRank
FROM Orders;

Explanation

  • PARTITION BY CustomerID groups the ranking within each customer.
  • ORDER BY OrderDate sorts the orders for ranking.
  • RANK() assigns ranks, with ties sharing the same rank and skipping numbers.

Example 2: Assigning Row Numbers to Orders

Goal: Assign a unique row number to each order for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RowNum
FROM Orders;

Difference Between RANK() and ROW_NUMBER()

  • ROW_NUMBER() assigns a unique number to each row, even if there are ties.
  • RANK() assigns the same rank to tied values, skipping numbers.

Example 3: Cumulative Order Amount Per Customer

Goal: Calculate a running total of order amounts per customer.

SELECT 
    CustomerID,
    OrderID,
    OrderDate,
    SUM(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS RunningTotal
FROM Orders;

Explanation

  • SUM(Freight) OVER (...) calculates a running total of shipping costs (Freight).
  • PARTITION BY CustomerID resets the total for each customer.
  • ORDER BY OrderDate ensures the cumulative sum follows the chronological order.

Example 4: Moving Average of Freight Costs

Goal: Calculate a 3-order moving average of freight costs.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    Freight,
    AVG(Freight) OVER (PARTITION BY CustomerID ORDER BY OrderDate ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS MovingAvg
FROM Orders;

Explanation

  • AVG(Freight) OVER (...) computes the moving average.
  • ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ensures only the last 3 orders (including the current) are considered.

Example 5: Finding Previous Order Date (LAG)

Goal: Find the previous order date for each order per customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    LAG(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS PrevOrderDate
FROM Orders;

Explanation

  • LAG(OrderDate) OVER (...) gets the previous order date.
  • PARTITION BY CustomerID ensures we compare only within the same customer.

Example 6: Finding Next Order Date (LEAD)

Goal: Find the next order date for each order per customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    LEAD(OrderDate) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS NextOrderDate
FROM Orders;

Explanation

  • LEAD(OrderDate) OVER (...) retrieves the next order’s date.

Example 7: Grouping Orders into Buckets (NTILE)

Goal: Divide orders into 4 equal-sized buckets for each customer.

SELECT 
    OrderID,
    CustomerID,
    OrderDate,
    NTILE(4) OVER (PARTITION BY CustomerID ORDER BY OrderDate) AS OrderGroup
FROM Orders;

Explanation

  • NTILE(4) OVER (...) assigns orders into 4 equal-sized groups.
  • Useful for ranking customers into quartiles or performance segments.

Key Takeaways

  • Window functions allow calculations over partitions of rows while keeping all records.
  • Ranking functions (RANK(), ROW_NUMBER()) help in ordered listings.
  • Analytical functions (LAG(), LEAD()) retrieve previous/next row values.
  • Aggregate functions (SUM(), AVG()) help in cumulative calculations.

Would you like a real dataset example, or a visualization for better understanding? 🚀