Introduction

SQL Window Functions are powerful tools that allow you to perform calculations across a set of table rows related to the current row without collapsing the result into a single group. Unlike aggregate functions, which summarize data into a single row per group, window functions retain individual rows while applying calculations over a “window” of rows.

This guide covers:

  • Different types of window functions
  • PARTITION BY and ORDER BY
  • Real-world use cases

What are Window Functions?

A window function performs a calculation across a subset (window) of rows related to the current row. It uses the OVER() clause, which defines how the window is partitioned and ordered.

Syntax:

SELECT column_name, 
       window_function() OVER (
           PARTITION BY column_to_partition 
           ORDER BY column_to_order
       ) AS alias
FROM table_name;

Types of Window Functions and When to Use Them

1. ROW_NUMBER() - Assigns a unique sequential number to each row.

When to Use:

  • When you need a unique rank for each row in an ordered dataset.
  • To remove duplicates while keeping the first occurrence.

Example: Get the first purchase of each customer

SELECT customer_id, order_id, order_date,
       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS order_rank
FROM orders;

2. RANK() - Assigns a rank to each row, with gaps when there are ties.

When to Use:

  • When you want to rank items but allow tied values to share the same rank.

Example: Rank products by total sales

SELECT product_id, SUM(quantity) AS total_sold,
       RANK() OVER (ORDER BY SUM(quantity) DESC) AS sales_rank
FROM sales
GROUP BY product_id;

3. DENSE_RANK() - Similar to RANK(), but without gaps.

When to Use:

  • When ranking should be sequential even if values are the same.

Example: Rank employees by salary

SELECT employee_id, department, salary,
       DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;

4. NTILE(N) - Divides rows into N equal buckets.

When to Use:

  • When you need to split data into quartiles, deciles, or percentiles.

Example: Divide students into quartiles based on marks

SELECT student_id, marks,
       NTILE(4) OVER (ORDER BY marks DESC) AS quartile
FROM students;

5. LAG() - Fetches the previous row’s value.

When to Use:

  • When you need to compare a value with the previous row’s value.

Example: Compare this month’s vs. last month’s sales

SELECT month, total_sales,
       LAG(total_sales, 1) OVER (ORDER BY month) AS previous_month_sales,
       total_sales - LAG(total_sales, 1) OVER (ORDER BY month) AS sales_change
FROM monthly_sales;

6. LEAD() - Fetches the next row’s value.

When to Use:

  • When you want to compare a value with the next row’s value.

Example: Find the difference between an employee’s salary and the next highest salary

SELECT employee_id, salary,
       LEAD(salary) OVER (ORDER BY salary DESC) AS next_salary,
       LEAD(salary) OVER (ORDER BY salary DESC) - salary AS salary_difference
FROM employees;

7. SUM() OVER() & AVG() OVER() - Running totals and moving averages.

When to Use:

  • SUM() OVER() for cumulative totals.
  • AVG() OVER() for moving averages.

Example: Calculate cumulative sales per region

SELECT region, month, total_sales,
       SUM(total_sales) OVER (PARTITION BY region ORDER BY month) AS cumulative_sales
FROM regional_sales;

Summary: Which Function to Use When?

Use CaseFunction to Use
Assigning unique numbersROW_NUMBER()
Ranking with gapsRANK()
Ranking without gapsDENSE_RANK()
Dividing data into N groupsNTILE(N)
Comparing previous valuesLAG()
Comparing next valuesLEAD()
Cumulative totalSUM() OVER()
Moving averageAVG() OVER()

Image credit: Vedant Mehta on LinkedIn

By using SQL Window Functions, you can efficiently analyze trends, rankings, and comparisons in datasets. Start experimenting with these functions to improve your SQL queries!