A SELF JOIN is a join where a table is joined with itself. It is particularly useful when dealing with hierarchical data, such as employees reporting to managers. In this case, we treat one instance of the table as the “employees” and another instance as the “managers.”


Use Case

A common use of a SELF JOIN is to find relationships within the same table. For example, in an employees table, you might want to see which employees report to which managers.


Employees Table Structure

We will use an Employees table from the Northwind database as an example. The relevant columns are:

EmployeeIDLastNameFirstNameReportsTo
1DavolioNancyNULL
2FullerAndrew1
3LeverlingJanet2
4PeacockMargaret2
5BuchananSteven2
6SuyamaMichael5
7KingRobert5
  • EmployeeID: Unique identifier for employees.
  • ReportsTo: Refers to the EmployeeID of the employee’s manager.

Goal

Create a report showing:

  1. EmployeeID
  2. Employee’s LastName and FirstName
  3. Manager’s LastName and FirstName

Self Join Query

To achieve this, we can perform a SELF JOIN on the Employees table by joining it with itself. We will alias one instance as e (employee) and the other as m (manager).

SELECT 
    e.EmployeeID AS EmployeeID,
    CONCAT(e.LastName, ', ', e.FirstName) AS Employee,
    CONCAT(m.LastName, ', ', m.FirstName) AS Manager
FROM 
    Employees e
LEFT JOIN 
    Employees m ON e.ReportsTo = m.EmployeeID
ORDER BY 
    e.EmployeeID;

Explanation of the Query

  1. Employees e: The first instance of the table represents the employees.
  2. Employees m: The second instance of the table represents the managers.
  3. e.ReportsTo = m.EmployeeID: The ReportsTo column in the employee table matches the EmployeeID of the manager.
  4. LEFT JOIN: Ensures that employees without a manager (e.g., the CEO) are still included in the result.
  5. ORDER BY e.EmployeeID: Sorts the result by the employee’s ID.

Result

EmployeeIDEmployeeManager
1Davolio, NancyNULL
2Fuller, AndrewDavolio, Nancy
3Leverling, JanetFuller, Andrew
4Peacock, MargaretFuller, Andrew
5Buchanan, StevenFuller, Andrew
6Suyama, MichaelBuchanan, Steven
7King, RobertBuchanan, Steven

Key Points

  1. Hierarchical Data: A SELF JOIN is ideal for exploring hierarchical relationships, like employees reporting to managers.
  2. Alias Usage: Aliases (e and m) make it clear which instance of the table refers to employees and which to managers.
  3. Handling Null Values: The LEFT JOIN ensures that employees without a manager (e.g., the CEO) are displayed with a NULL manager.

Conclusion

A SELF JOIN is a powerful way to query hierarchical data in the same table. By using aliases and careful join conditions, you can extract meaningful relationships, such as employees and their managers, as shown in this example.

A SELF JOIN is a join where a table is joined with itself. It is particularly useful when dealing with hierarchical data, such as employees reporting to managers. In this case, we treat one instance of the table as the “employees” and another instance as the “managers.”


Use Case

A common use of a SELF JOIN is to find relationships within the same table. For example, in an employees table, you might want to see which employees report to which managers.


Employees Table Structure

We will use an Employees table from the Northwind database as an example. The relevant columns are:

EmployeeIDLastNameFirstNameReportsTo
1DavolioNancyNULL
2FullerAndrew1
3LeverlingJanet2
4PeacockMargaret2
5BuchananSteven2
6SuyamaMichael5
7KingRobert5
  • EmployeeID: Unique identifier for employees.
  • ReportsTo: Refers to the EmployeeID of the employee’s manager.

Goal

Create a report showing:

  1. EmployeeID
  2. Employee’s LastName and FirstName
  3. Manager’s LastName and FirstName

Self Join Query

To achieve this, we can perform a SELF JOIN on the Employees table by joining it with itself. We will alias one instance as e (employee) and the other as m (manager).

SELECT 
    e.EmployeeID AS EmployeeID,
    CONCAT(e.LastName, ', ', e.FirstName) AS Employee,
    CONCAT(m.LastName, ', ', m.FirstName) AS Manager
FROM 
    Employees e
LEFT JOIN 
    Employees m ON e.ReportsTo = m.EmployeeID
ORDER BY 
    e.EmployeeID;

Explanation of the Query

  1. Employees e: The first instance of the table represents the employees.
  2. Employees m: The second instance of the table represents the managers.
  3. e.ReportsTo = m.EmployeeID: The ReportsTo column in the employee table matches the EmployeeID of the manager.
  4. LEFT JOIN: Ensures that employees without a manager (e.g., the CEO) are still included in the result.
  5. ORDER BY e.EmployeeID: Sorts the result by the employee’s ID.

Result

EmployeeIDEmployeeManager
1Davolio, NancyNULL
2Fuller, AndrewDavolio, Nancy
3Leverling, JanetFuller, Andrew
4Peacock, MargaretFuller, Andrew
5Buchanan, StevenFuller, Andrew
6Suyama, MichaelBuchanan, Steven
7King, RobertBuchanan, Steven

Key Points

  1. Hierarchical Data: A SELF JOIN is ideal for exploring hierarchical relationships, like employees reporting to managers.
  2. Alias Usage: Aliases (e and m) make it clear which instance of the table refers to employees and which to managers.
  3. Handling Null Values: The LEFT JOIN ensures that employees without a manager (e.g., the CEO) are displayed with a NULL manager.

Conclusion

A SELF JOIN is a powerful way to query hierarchical data in the same table. By using aliases and careful join conditions, you can extract meaningful relationships, such as employees and their managers, as shown in this example.