Advanced SQL Join Techniques
1. Nested Loop Join
The nested loops join, also called nested iteration, uses one join input as the outer input table (shown as the top input in the graphical execution plan) and one as the inner (bottom) input table. The outer loop consumes the outer input table row by row. The inner loop, executed for each outer row, searches for matching rows in the inner input table.
Information sources:
- Understanding Nested Loops Joins – Microsoft
- Nested Loop Join – Wikipedia
- Nested Loop Join Algorithms – MySQL
Examples:
In Microsoft SQL Server the Nested Loop Join will be chosen if one of the tables is small and the other table has an index on the column that joins the tables. The join type can be enforced by using the OPTION clause.
SELECT a.field1 FROM table1 a JOIN table2 b ON a.table1_id = b.table2_id
There are three types of Nested-Loop join:
- Naive Nested-loop join – scans an entire table or index.
- Index Nested-loop join – performs lookups in an index to fetch rows
- Temporary index nested-loop join – uses temporary index
2. Hash Join
Hash joins are used for many types of set-matching operations: inner join; left, right, and full outer join; left and right semi-join; intersection; union; and difference. Moreover, a variant of the hash join can do duplicate removal and grouping (such as SUM(salary) GROUP BY department). These modifications use only one input for both the build and probe roles.
Information sources:
Currently, MySQL doesn’t support this.
3. Merge Join
The merge join requires that both inputs be sorted on the merge columns, which are defined by the equality (WHERE) clauses of the join predicate. The query optimizer typically scans an index, if one exists on the proper set of columns, or places a sort operator below the merge join. In rare cases, there may be multiple equality clauses, but the merge columns are taken from only some of the available equality clauses.
Information sources:
Further reading:
