数据库知识点总结(未完待续)

​ 本文内容记载在我在学习数据库这门课程中的知识点,内容涵盖对sql关键字的讲解,sql语句的具体案例及分析。

SQL中的关键字讲解

distinct用法

作用于单列

1
2
> select distinct name from A
>

>

作用于多列

1
2
> select distinct name, id from A //实际上是根据name和id两个字段来去重的,这种方式Access和SQL Server同时支持。
>

COUNT统计

1
2
3
4
5
6
> select count(distinct name) from A; --表中name去重后的数目, SQL Server支持,而Access不支持
> count是不能统计多个字段的,下面的SQL在SQL Server和Access中都无法运行。
> select count(distinct name, id) from A;
> 若想使用,请使用嵌套查询,如下:
> select count(*) from (select distinct xing, name from B) AS M;
>

distinct必须放在开头

1
2
> select id, distinct name from A; --会提示错误,因为distinct必须放在开头
>

Union&&Union ALL的用法

UNION 操作符用于合并两个或多个 SELECT 语句的结果集。

请注意,UNION 内部的 SELECT 语句必须拥有相同数量的列。列也必须拥有相似的数据类型。同时,每条 SELECT 语句中的列的顺序必须相同。

Union因为要进行重复值扫描,所以效率低。如果合并没有刻意要删除重复行,那么就使用Union All

两个要联合的SQL语句 字段个数必须一样,而且字段类型要“相容”(一致)

如果我们需要将两个select语句的结果作为一个整体显示出来,我们就需要用到union或者union all关键字。union(或称为联合)的作用是将多个结果合并在一起显示出来。

union和union all的区别是,union会自动压缩多个结果集合中的重复结果,而union all则将所有的结果全部显示出来,不管是不是重复。

Union:对两个结果集进行并集操作,不包括重复行,同时进行默认规则的排序;

Union All:对两个结果集进行并集操作,包括重复行,不进行排序;

Intersect:对两个结果集进行交集操作,不包括重复行,同时进行默认规则的排序;

Minus:对两个结果集进行差操作,不包括重复行,同时进行默认规则的排序。

可以在最后一个结果集中指定Order by子句改变排序方式。

1
2
3
4
5
6
7
> select * from emp where deptno >= 20
> union all
> select * from emp where deptno <= 30
> select empno,ename from emp
> union
> select deptno,dname from dept
>

案例表结构

c、o、p、a表结构

表结构如下:

1
2
3
4
5
6
7
> CREATE TABLE customers (//顾客表
> cid CHAR(4) NOT NULL,
> cname VARCHAR(13),
> city VARCHAR(20),
> discnt REAL,
> PRIMARY KEY(cid) );
>

>

1
2
3
4
5
6
7
> CREATE TABLE agents (//代理商表
> aid CHAR(3) NOT NULL,
> aname VARCHAR(13),
> city VARCHAR(20),
> percent SMALLINT,
> PRIMARY KEY (aid) );
>

>

1
2
3
4
5
6
7
8
> CREATE TABLE products (//产品表
> pid CHAR(3) NOT NULL,
> pname VARCHAR(13),
> city VARCHAR(20),
> quantity INTEGER,
> price DOUBLE PRECISION,
> PRIMARY KEY(pid) );
>

>

1
2
3
4
5
6
7
8
9
10
> CREATE TABLE orders (//订单表
> ordno INTEGER NOT NULL,
> month CHAR(3),
> cid CHAR(4),
> aid CHAR(3),
> pid CHAR(3),
> qty INTEGER,
> dollars DOUBLE PRECISION,
> PRIMARY KEY(ordno) );
>

>

关系代数

1
2
3
4
5
6
> query in relational algebra (Theta-Join)
> ( R where Condition ) [ A1, A2, ..., Am ]
> SELECT A1, A2, ..., Am
> FROM R
> WHERE Condition ;
>

>

1
2
3
4
5
6
7
8
> query in relational algebra (PRODUCT)
> ((R1R2...Rn) where Condition) [A1,A2,...,Am]
> query in SQL
> SELECT A1, A2, ..., Am
> FROM R1, R2, …, Rn
> WHERE Condition ;
>
>

>

1
2
3
4
5
6
7
> query in relational algebra (Theta-Join)
> ( R Condition S ) [ A1, A2, ..., Am ]
> query in SQL
> SELECT A1, A2, ..., Am
> FROM R, S
> WHERE Condition ;
>

>

1
2
3
4
5
6
7
8
> Exp 4.3.1 Find aid and names of agents that are based in New York.
> Relational Algebra
> (AGENTS where city=‘New York’) [ aid, aname ]
> SQL
> SELECT aid, aname
> FROM agents
> WHERE city = 'New York’ ;
>

>

1
2
3
4
> Exp 4.3.2 Display all values of customers in table CUSTOMERS.
> Relational Algebra
> CUSTOMERS [ cid, cname, city, discnt ]
>
1
2
3
4
5
> Exp 4.3.4 Retrieve all (cname, aname) pairs where the customer places an order through the agent.
> Relational Algebra
> ( C[cid, cname] oo O ) oo A ) [ cname, aname ]//join 连接
> ((C x O x A) where C.cid=O.cid and O.aid=A.aid) [ C.cname, A.aname ]//乘积
>
1
2
3
> Exp 4.5.5: Find cids for customers who order all products ordered by customer c006.
> ORDERS [ cid, pid ] 除以 ( ORDERS where cid = ‘c006’ ) [ pid ]
>

>

子查询

子查询的定义

定义:子查询本质上是嵌套进其他SELECT,UPDATE,INSERT,DELETE语句的一个被限制的SELECT语句,在子查询中,只有下面几个子句可以使用

  1. SELECT子句(必须)
  2. FROM子句(必选)
  3. WHERE子句(可选)
  4. GROUP BY(可选)
  5. HAVING(可选)
  6. ORDER BY(只有在TOP关键字被使用时才可用)

子查询也可以嵌套在其他子查询中,这个嵌套最多可达32层。子查询也叫内部查询(Inner query)或者内部选择(Inner Select),而包含子查询的查询语句也叫做外部查询

子查询的谓词

in 谓词(predicate)expr [NOT] IN ( subquery )

The Quantified Comparison Predicate (量化比较谓词) expr q SOME|ANY|ALL( subquery )

The EXISTS Predicate [NOT] EXISTS ( subquery )

The BETWEEN Predicate expr [NOT] BETWEEN expr1 AND expr2

The IS NULL Predicate column IS [NOT] NULL

The LIKE Predicate column [NOT] LIKE val1 [ ESCAPE val2 ]

§underscore ( _ ): any single character

§percent ( % ): any sequence of zero or morecharacters

子查询的执行分析

​ 原文链接:点击

子查询的简单描述

通常来讲,子查询按照子查询所返回数据的类型,可以分为三种,分别为:

  1. 返回一张数据表(Table)
  2. 返回一列值(Column)
  3. 返回单个值(Scalar)

子查询做数据源的使用

当子查询在外部查询的FROM子句之后使用时,子查询被当作一个数据源使用,即使这时子查询只返回一个单一值(Scalar)或是一列值(Column),在这里依然可以看作一个特殊的数据源,即一个二维数据表(Table).作为数据源使用的子查询很像一个View(视图),只是这个子查询只是临时存在,并不包含在数据库中。

1
2
3
4
5
6
> SELECT P.ProductID, P.Name, P.ProductNumber, M.Name AS ProductModelName
> FROM Production.Product AS P INNER JOIN
> (SELECT Name, ProductModelID
> FROM Production.ProductModel) AS M
> ON P.ProductModelID = M.ProductModelID
>

>

子查询作为选择条件的使用

​ 作为选择条件的子查询也是子查询相对最复杂的应用.

​ 作为选择条件的子查询是那些只返回一列(Column)的子查询,如果作为选择条件使用,即使只返回单个值,也可以看作是只有一行一列.

1
2
3
4
5
6
7
> select distinct cid
> from orders
> where aid IN (
> select aid --uncorrelated Subquery:(a05,a06)非相关的子查询,更快
> from agents
> where city= ‘Duluth’ or city=‘Dallas’)
>

>

但是要强调的是,不要用IN和NOT IN关键字,这会引起很多潜在的问题,这篇文章对这个问题有着很好的阐述:http://wiki.lessthandot.com/index.php/Subquery_typo_with_using_in。这篇文章的观点是永远不要再用IN和NOT IN关键字,我的观点是存在即合理,我认为只有在IN里面是固定值的时候才可以用IN和NOT IN,比如: IN (25,33)

只有在上面这种情况下,使用IN和NOT IN关键字才是安全的,其他情况下,最好使用EXISTS,NOT EXISTS,JOIN关键字来进行替代. 除了IN之外,用于选择条件的关键字还有ANYALL,这两个关键字和其字面意思一样. 和”<”,”>”,”=”连接使用

在作为ANY和ALL关键字在子查询中使用时,所实现的效果如下

=ANY 和IN等价
<>ALL 和NOT IN等价
>ANY 大于最小的(>MIN)
<ANY 小于最大的(<MAX)
>ALL 大于最大的(>MAX)
<ALL 小于最小的(<MIN)
=ALL 下面说

=ALL关键字很少使用,这个的效果在子查询中为如果只有一个返回值,则和“=”相等,而如果有多个返回值,结果为空。

这里要注意,SQL是一种很灵活的语言,就像子查询所实现的效果可以使用JOIN来实现一样(效果一样,实现思路不同),ANY和ALL所实现的效果也完全可以使用其他方式来替代,按照上面表格所示,>ANY和>MIN完全等价,比如下面两个查询语句完全等价

IN is =SOME

NOT IN is <>ALL

相关子查询和EXISTS关键字

前面所说的查询都是无关子查询(Uncorrelated subquery),子查询中还有一类很重要的查询是相关子查询(Correlated subquery),也叫重复子查询比如,还是上面那个查询,用相关子查询来写:

1
2
3
4
5
6
7
8
9
> SELECT distinct cname FROM customers c
> WHERE ‘p05’ IN ( select pid --correlated Subquery 相关的子查询
> from orders o
> where o.cid=c.cid );
> SELECT [FirstName],[MiddleName],[LastName] FROM [AdventureWorks].[Person].[Contact] c
> WHERE EXISTS(SELECT *
> FROM [AdventureWorks].[HumanResources].[Employee] e
> WHERE c.ContactID=e.ContactID AND e.SickLeaveHours>68)
>

>

如何区别相关子查询无关子查询呢?最简单的办法就是直接看子查询本身能否执行。

上面的无关子查询,整个查询过程可以看作是子查询首先返回SQLResult(SQL结果集),然后交给外部查询使用,整个过程子查询只执行一次

而相反,作为相关子查询,子查询的执行的次数依赖于外部查询,外部查询每执行一行,子查询执行一次。

如上面代码所示。上面的相关子查询实际上会执行N次(N取决与外部查询的行数),外部查询每执行一行,都会将对应行所用的参数传到子查询中,如果子查询有对应值,则返回TRUE(既当前行被选中并在结果中显示),如果没有,则返回FALSE。然后重复执行下一行。

子查询作为计算列使用

子查询作为计算列使用时,只返回单个值(Scalar) 。用在SELECT语句之后,作为计算列使用。同样分为相关子查询无关子查询

相关子查询的例子比如:我想取得每件产品的名称和总共的销量。

表t_product id ,productName;

表t_order id ,orderNo,productId

1
2
3
4
5
> select productName , (select count(*) from t_order o where p.id=o.productId ) as totalNum
>
> from t_product p ;//相关的子查询
>
>

>

子查询作为计算列使用时,会针对外部查询的每一行,返回唯一的值。

同样的,SQL子查询都可以使用其他语句达到同样的效果,上面的语句和如下语句达到同样的效果:

1
2
3
4
> select productName ,count(*) as total from t_product p,t_order o where p.id = o.productId group by productName;
>
> select productName ,count(*) as total from t_product p join t_order o on p.id = o.productId group by productName;
>

>

子查询作为计算列且作为无关子查询时使用,只会一次性返回但一值,这里就不再阐述了。

子查询案例

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Exp 4.4.1 Retrieve cids of customers who place orders with agents in Duluth or Dallas
SQL 1
select distinct cid
from orders o, agents a
where a.aid=o.aid and
(a.city=‘Duluth’ or a.city=‘Dallas’);
SQL 2 (FAST Why?)
select distinct cid
from orders
where aid IN (
select aid --uncorrelated Subquery:(a05,a06)非相关的子查询,更快
from agents
where city= ‘Duluth’ or city=‘Dallas’)
1
2
3
4
Exp 4.4.2 Get all information concerning agents based in Duluth or Dallas.
SELECT *
FROM agents
WHERE city IN ( ‘Duluth’, ‘Dallas’ ) ;
1
2
3
4
5
6
7
8
9
Exp 4.4.3 Get the names and discounts of all customers who place orders through agents in Duluth or Dallas.
SELECT cname, discnt FROM customers
WHERE cid IN (
SELECT o.cid
FROM orders o
WHERE o.aid IN (
SELECT a.aid
FROM agents a
WHERE a.city IN (‘Duluth’, ‘Dallas’))) ;
1
2
3
4
5
6
7
8
9
Exp 4.4.4 Find the names of customers who order product p05. (use uncorrelated Subquery?)相关的子查询
SQL(1)
SELECT distinct cname FROM customers c, orders o WHERE c.cid=o.cid and o.pid=‘p05’ ;
SQL(2)
SELECT distinct cname FROM customers c
WHERE ‘p05’ IN ( select pid --correlated Subquery 相关的子查询
from orders o
where o.cid=c.cid );
1
2
3
4
5
6
7
8
9
Exp 4.4.6 Find ordno values for all orders placed by customers in Duluth through agents in New York.
SELECT ordno
FROM orders
WHERE (cid, aid) IN
(select cid, aid
from customers c, agents a
where c.city=‘Duluth’ and a.city=‘New York’) ;
//这是两个表的笛卡儿积,先查询c表符合的条件,再查询a表符合的条件,做笛卡尔的乘积,让子查询作为选择条件。
select p.id,o.id from t_product p ,t_order o;
1
2
3
4
5
6
7
Exp 4.4.7 Find aid values of agents with a minimum percent commission(佣金).
SELECT aid
FROM agents
WHERE percent <= ALL (
SELECT percent
FROM agents
) ;
1
2
3
4
5
6
7
8
//找到所有与达拉斯或波士顿的客户相同的折扣客户
Exp 4.4.8 Find all customers who have the same discount as that of any of the customers in Dallas or Boston.
SELECT cid, cname
FROM customers
WHERE discnt = SOME (
select discnt
from customers
where city=‘Dallas’ or city=‘Boston’ ) ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Exp 4.4.9 Find cid values of customers with discnt smaller than those of any customers who live in Duluth.
SELECT cid
FROM customers
WHERE discnt < ALL (
SELECT discnt
FROM customers
WHERE city = ‘Duluth’ ) ;
SELECT cid
FROM customers c1
WHERE NOT EXISTS (
SELECT *
FROM customers c2
WHERE c2.city = ‘Duluth’ and
c1.discnt >= c2. discnt ) ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Retrieve all customer names where the customer places an order through agent a05.
SELECT distinct cname FROM customers c, orders o WHERE c.cid = o.cid and o.aid = ‘a05’ ;
SELECT distinct cname
FROM customers
WHERE cid IN ( SELECT cid
FROM orders
WHERE aid = ‘a05’ ) ;
SELECT distinct cname
FROM customers c
WHERE EXISTS (
SELECT *
FROM orders o
WHERE o.cid=c.cid and o.aid=‘a05’ ) ;
1
2
3
4
5
6
7
8
9
Find all customer names where the customer does not place an order through agent a05.
((C[cid] – (O where aid = ‘a05’) [cid])  C) [cname]
SELECT cname FROM customers
WHERE cid (
SELECT o.cid FROM orders o WHERE o.aid = ‘a05’ ) ;
SELECT cname FROM customers c
WHERE NOT EXISTS (
SELECT * FROM orders o
WHERE o.cid = c.cid and o.aid = ‘a05’ ) ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
[Example] Find all cid, aid pairs where the customer does not place an order through the agent.
SELECT cid, aid
FROM customers c, agents a
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = a.aid );
SELECT cid
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = ‘a03’ ) ;
1
2
3
4
5
6
7
8
9
10
Exp 2.9.4: Find products that have never been ordered by a customer based in New York through an agent based in Boston.
T1 := (C where city = ‘New York’)[cid]
T2 := (((T1  O)  A) where city = ‘Boston’) [pid]
T3 := P[pid] – T2
SQL:
SELECT p.pid
FROM products p
WHERE p.pid NOT IN (
SELECT o.pid FROM customers c, agents a, orders o WHERE c.city=‘New York’ and
a.city=‘Boston’ and c.cid=o.cid and o.aid=a.aid ) ;
1
[Exp 2.9.11] List pids of products that are ordered through agents who place orders for (possibly different) customers who order at least one product from an agent who has placed an order for customer c001.

除法运算的理解

除法运算的定义

投影运算的含义简单点就是:从表中选择需要的属性列。

给定关系R(X,Y)和S(Y,Z),其中X,Y,Z为属性组。R中Y与S中的Y可以有不同的属性名,但必须出自相同的域集。R与S的除运算可以得到一个新的关系P(X),P是R中满足下列条件的元组在X 属性列上的投影: 元组在X上的分量值x的像集Y(x)包含S在Y上的投影的集合。
求解步骤过程:
第一步:找出关系R和关系S中相同的属性,即Y属性。在关系S中对Y做投影(即将Y列取出);
第二步:被除关系R中与S中不相同的属性列是X,关系R在属性X上做取消重复值的投影;
第三步:求关系R中X属性对应的像集Y;
第四步:判断包含关系,R÷S其实就是判断关系R中X各个值的像集Y是否包含关系S中属性Y的所有值。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
> 列是属性,行是元组..
> 而且作投影之后可能会出现重复项,比如:
> A B C
> a1 b1 c1
> a1 b2 c2
> a2 b2 c3
> 作A的投影就是a1, a2; 减少了一行
> 总结:
> 并:属性不变,元组可能增加(集合相等时不增加)
> 交:属性不变,元组可能减少(集合相等时不减少)
> 投影:属性可能减少(全投影时不减少),元组可能减少(投影后无重复项时不减少)
> 笛卡尔积:属性增加,元组可能增加(只有1个元组时不增加)
> 除运算:
>
1
2
3
> 除运算:
>
>

>

R:

A B C
a1 b1 c2
a2 b3 c7
a3 b4 c6
a1 b2 c3
a4 b6 c6
a2 b2 c3
a1 b2 c1

S:

B C D
b1 c2 d1
b2 c1 d1
b2 c3 d2

R÷S

A
a1

(1) 找S与R的共同属性,其元组看做整体 k
(2)选择R中包含k的 非S与R相同属性的 属性 即为R÷S
*/
解答如下:
在关系R中,A可以取四个值{a1,a2,a3,a4},其中:
a1的象集为:{(b1,c2),(b2,c3),(b2,c1)}就是a1 对应bc属性上的值
a2的象集为:{(b3,c7),(b2,c3)}
a3的象集为:{(b4,c6)}
a4的象集为:{(b6,c6)}
S在(B,C)上的投影为{(b1,c2),(b2,c3),(b2,c1)}。,只取BC两列
显然只有a1的象集(B,C)a1包含S在(B,C)属性组上的投影,全部包含,所以R÷S={a1}

SQL中除法运算的实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SQL中除法运算的实现
R(X,Y)÷S(Y,Z)的运算用结构化语言SQL 语句可表达为下列形式:
select distinct R.X from R R1
where not exists
( select S.Y from S
where not exists
( select * from R R2
where R2.X=R1.X and R2.Y=S.Y ) )
同理, 基于多属性关系除法的SQL 表达形式
select distinct R1.X1, R1.X2, ⋯, R1.Xn
from R R1
where not exists
( select S.Y1, S.Y2, ⋯, S.Ym from S
where not exists
( select * from R R2
where R2.X1=R1.X1 and R2.X2=R1.X2 and……and R2.Xn=R1.Xn and
R2.Y1=S.Y1 and R2.Y2=S.Y2 and……and R2.Ym=S.Ym) )

除法运算的案例

1
2
3
4
5
6
7
8
9
10
11
The division operation in Relational Algebra
[Example 4.5.2] Find cids of customers who place orders with ALL agents based in New York.
o[cid, aid]  (a where city=‘New York’)[aid]
SELECT c.cid FROM customers c
WHERE NOT EXISTS (
SELECT * FROM agents a
WHERE a.city = ‘New York’ and NOT EXISTS (
SELECT * FROM orders o
WHERE o.cid = c.cid and o.aid = a.aid ) )

ALL是除法运算的一个关键词

1
2
3
4
5
6
7
8
9
10
11
Exp 4.5.3: Get the aid values of agents in New York or Duluth who place orders for all products costing more than a dollar.
SELECT aid
FROM agents a
WHERE (city=‘New York’ or city=‘Duluth’) and
NOT EXISTS (
SELECT *
FROM products p
WHERE p.price > 1 and NOT EXISTS (
SELECT *
FROM orders o
WHERE o.aid=a.aid and o.pid=p.pid));
1
2
3
4
5
6
7
8
9
Exp 4.5.4: Find aid values of agents who place orders for product p01 as well as for all products costing more than a dollar.
SELECT aid FROM agents a
WHERE aid IN ( select aid from orders
where pid=‘p01’ )
and NOT EXISTS (
SELECT * FROM products p
WHERE p.price > 1 and NOT EXISTS (
SELECT * FROM orders o
WHERE o.aid=a.aid and o.pid=p.pid))
1
2
Exp 4.5.5: Find cids for customers who order all products ordered by customer c006.
ORDERS [ cid, pid ] 除以 ( ORDERS where cid = ‘c006’ ) [ pid ]

×

纯属好玩

扫码支持
扫码打赏,你说多少就多少

打开支付宝扫一扫,即可进行扫码打赏哦

文章目录
  1. 1. SQL中的关键字讲解
    1. 1.1. distinct用法
    2. 1.2. Union&&Union ALL的用法
  2. 2. 案例表结构
    1. 2.1. c、o、p、a表结构
  3. 3. 关系代数
  4. 4. 子查询
    1. 4.1. 子查询的定义
    2. 4.2. 子查询的谓词
    3. 4.3. 子查询的执行分析
      1. 4.3.1. 子查询做数据源的使用
      2. 4.3.2. 子查询作为选择条件的使用
      3. 4.3.3. 相关子查询和EXISTS关键字
      4. 4.3.4. 子查询作为计算列使用
    4. 4.4. 子查询案例
  5. 5. 除法运算的理解
    1. 5.1. 除法运算的定义
    2. 5.2. SQL中除法运算的实现
    3. 5.3. 除法运算的案例
Fork me on GitHub