- 由 zhongluqiang创建, 最后修改于2月 11, 2022
使用学生表和成绩表来演示连接,SQL架构如下:
Create table If Not Exists student_info (id int, name varchar(255)); Create table If Not Exists student_score (id int, score int); Truncate table student_info; insert into student_info (id, name) values ('1', 'Wang'); insert into student_info (id, name) values ('2', 'Alice'); insert into student_info (id, name) values ('3', 'Allen'); Truncate table student_score; insert into student_score (id, score) values ('1', '65'); insert into student_score (id, score) values ('2', '70'); insert into student_score (id, score) values ('3', '75');
执行连接查询的语句如下:
mysql> SELECT * FROM student_score, student_info; +------+-------+------+-------+ | id | score | id | name | +------+-------+------+-------+ | 3 | 75 | 1 | Wang | | 2 | 70 | 1 | Wang | | 1 | 65 | 1 | Wang | | 3 | 75 | 2 | Alice | | 2 | 70 | 2 | Alice | | 1 | 65 | 2 | Alice | | 3 | 75 | 3 | Allen | | 2 | 70 | 3 | Allen | | 1 | 65 | 3 | Allen | +------+-------+------+-------+ 9 rows in set (0.00 sec)
上面的连接查询直接把多表放在了FROM后面,用逗号进行间隔,所以是内连接,与下面的语句是等效的:
mysql> SELECT * FROM student_info JOIN student_score; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 3 | Allen | 1 | 65 | | 2 | Alice | 1 | 65 | | 1 | Wang | 1 | 65 | | 3 | Allen | 2 | 70 | | 2 | Alice | 2 | 70 | | 1 | Wang | 2 | 70 | | 3 | Allen | 3 | 75 | | 2 | Alice | 3 | 75 | | 1 | Wang | 3 | 75 | +------+-------+------+-------+ 9 rows in set (0.00 sec)
从结果来推导一下内连接的行为。
首先,连接的关键字是join
,join
是一个及物动词,比如A join B
,字面上的意思就是A加入B,或者说A成为B的一员,这里B是主导,而A是次要。
回到SQL中的join
。SQL在生成连接时,默认生成的是笛卡儿积,也就是从一个表中取出一条记录,然后和另一个表的所有记录进行组合。这里到底是先取哪个表,就是根据join
来定的。比如student_info JOIN student_score
,这种写法下,student_score
表是主导,student_info
表是次要,student_info
表负责“加入”student_score
表,所以先取的记录来自于student_score
表中,这也可以通过上面的结果得到验证。来自student_score
表的列虽然位于结果集的右侧,但是结果集的顺序是按顺序先取student_score
表的记录,再组合全部的student_info
表记录。
虽然student_info JOIN student_score
写法下student_score
表是主导而student_info
表是次要,但是结果集将student_info
表的列放在了前面,这应该是为了符合阅读习惯而设计的。至于为什么来自student_info
表的顺序是倒序的,暂不清楚。
接下来讨论一下关于连接的过滤条件,还是先以内连接为例。
对于连接的过滤,有两个层次,第一层是在生成临时表时起作用,第二层是在临时表生成后再起作用。这两层对应关键字ON
和WHERE
。
ON
作用于临时表的生成。两表在连接时默认生成笛卡尔积,这个结果有可能规模巨大。加入ON条件进行过滤后,可以减小临时表的规模。
WHERE
在临时表生成之后起作用,这时已经没有连接的概念了。
虽然作用机制不同,但在内连接中,ON和WHERE实现的效果是一样的,以下是ON
和WHERE
的示例:
mysql> SELECT * FROM student_info JOIN student_score ON student_info.id = student_score.id; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 1 | Wang | 1 | 65 | | 2 | Alice | 2 | 70 | | 3 | Allen | 3 | 75 | +------+-------+------+-------+ 3 rows in set (0.00 sec) mysql> SELECT * FROM student_info JOIN student_score WHERE student_info.id = student_score.id; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 1 | Wang | 1 | 65 | | 2 | Alice | 2 | 70 | | 3 | Allen | 3 | 75 | +------+-------+------+-------+ 3 rows in set (0.00 sec)
接下来演示外连接。为了演示外连接,这里先删除student_score表中id为3的记录,也就是编号为3的学生没有成绩记录,如下:
mysql> DELETE FROM student_score WHERE id = 3; Query OK, 1 row affected (0.03 sec) mysql> SELECT * FROM student_score; +------+-------+ | id | score | +------+-------+ | 1 | 65 | | 2 | 70 | +------+-------+ 2 rows in set (0.00 sec)
先查询一下内连接的结果,如下:
mysql> SELECT * FROM student_info JOIN student_score; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 1 | Wang | 2 | 70 | | 1 | Wang | 1 | 65 | | 2 | Alice | 2 | 70 | | 2 | Alice | 1 | 65 | | 3 | Allen | 2 | 70 | | 3 | Allen | 1 | 65 | +------+-------+------+-------+ 6 rows in set (0.00 sec) mysql> SELECT * FROM student_info JOIN student_score ON student_info.id = student_score.id; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 1 | Wang | 1 | 65 | | 2 | Alice | 2 | 70 | +------+-------+------+-------+ 2 rows in set (0.00 sec) mysql> SELECT * FROM student_info JOIN student_score WHERE student_info.id = student_score.id; +------+-------+------+-------+ | id | name | id | score | +------+-------+------+-------+ | 1 | Wang | 1 | 65 | | 2 | Alice | 2 | 70 | +------+-------+------+-------+ 2 rows in set (0.00 sec)
这里可以看到,由于student_score表中id为3的记录已经删除,所以内连接生成的结果集中就没有id为3的成绩记录了。
外连接一定要on,参考:https://cloud.tencent.com/developer/ask/206061
参考:
- 无标签