A SELF JOIN is a join where a table is joined with itself. It is particularly useful when dealing with hierarchical data, such as employees reporting to managers. In this case, we treat one instance of the table as the “employees” and another instance as the “managers.”
Use Case
A common use of a SELF JOIN is to find relationships within the same table. For example, in an employees table, you might want to see which employees report to which managers.
Employees Table Structure
We will use an Employees table from the Northwind database as an example. The relevant columns are:
EmployeeID | LastName | FirstName | ReportsTo |
---|---|---|---|
1 | Davolio | Nancy | NULL |
2 | Fuller | Andrew | 1 |
3 | Leverling | Janet | 2 |
4 | Peacock | Margaret | 2 |
5 | Buchanan | Steven | 2 |
6 | Suyama | Michael | 5 |
7 | King | Robert | 5 |
- EmployeeID: Unique identifier for employees.
- ReportsTo: Refers to the
EmployeeID
of the employee’s manager.
Goal
Create a report showing:
- EmployeeID
- Employee’s LastName and FirstName
- Manager’s LastName and FirstName
Self Join Query
To achieve this, we can perform a SELF JOIN on the Employees
table by joining it with itself. We will alias one instance as e
(employee) and the other as m
(manager).
SELECT
e.EmployeeID AS EmployeeID,
CONCAT(e.LastName, ', ', e.FirstName) AS Employee,
CONCAT(m.LastName, ', ', m.FirstName) AS Manager
FROM
Employees e
LEFT JOIN
Employees m ON e.ReportsTo = m.EmployeeID
ORDER BY
e.EmployeeID;
Explanation of the Query
Employees e
: The first instance of the table represents the employees.Employees m
: The second instance of the table represents the managers.e.ReportsTo = m.EmployeeID
: TheReportsTo
column in the employee table matches theEmployeeID
of the manager.LEFT JOIN
: Ensures that employees without a manager (e.g., the CEO) are still included in the result.ORDER BY e.EmployeeID
: Sorts the result by the employee’s ID.
Result
EmployeeID | Employee | Manager |
---|---|---|
1 | Davolio, Nancy | NULL |
2 | Fuller, Andrew | Davolio, Nancy |
3 | Leverling, Janet | Fuller, Andrew |
4 | Peacock, Margaret | Fuller, Andrew |
5 | Buchanan, Steven | Fuller, Andrew |
6 | Suyama, Michael | Buchanan, Steven |
7 | King, Robert | Buchanan, Steven |
Key Points
- Hierarchical Data: A SELF JOIN is ideal for exploring hierarchical relationships, like employees reporting to managers.
- Alias Usage: Aliases (
e
andm
) make it clear which instance of the table refers to employees and which to managers. - Handling Null Values: The
LEFT JOIN
ensures that employees without a manager (e.g., the CEO) are displayed with aNULL
manager.
Conclusion
A SELF JOIN is a powerful way to query hierarchical data in the same table. By using aliases and careful join conditions, you can extract meaningful relationships, such as employees and their managers, as shown in this example.