1. Log in: mysql -u root -p
2. Create the database in phpMyAdmin and import the .sql file.
3. Enable Chinese support: set names utf8;
Reference: Interview Handbook: Databases [some classic questions]
See: http://www.runoob.com/sql/sql-select.html
SELECT name,country FROM Websites;
SELECT DISTINCT country FROM Websites;
SELECT * FROM websites ORDER BY country,alexa DESC; # when neither ASC nor DESC is given, the default is ASC
SELECT * FROM Websites LIMIT 2; # SQL Server uses SELECT TOP, Oracle uses ROWNUM
SELECT * FROM Websites WHERE id=1;
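Operator | Description |
---|---|
= | Equal to |
<> | Not equal to. Note: in some versions of SQL this operator may be written as != |
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
BETWEEN | Within a range |
LIKE | Searches for a pattern |
IN | Specifies multiple possible values for a column |
IS NULL | Tests for NULL, e.g. where comm is null. See: SQL NULL functions |
REGEXP | Matches against a regular expression |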
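Supplement: logical operators
Select * from emp where sal > 2000 and sal < 3000;
Select * from emp where sal > 2000 or comm > 500;
select * from emp where not sal > 1500;
Select * from emp where ename like 'M%';
Select * from emp where ename like '[CK]ars[eo]n'; # matches Carsen, Karsen, Carson and Karson. The [] character list is SQL Server syntax; in MySQL use REGEXP '^[CK]ars[eo]n$'
The LIKE queries search the Ename column of the EMP table, with M as the known fragment of the value; 'M%' matches names that start with M.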
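'%a' | Values ending with a |
'a%' | Values starting with a |
'%a%' | Values containing a |
'_a_' | Exactly three characters with a in the middle |
'_a' | Exactly two characters ending with a |
'a_' | Exactly two characters starting with a |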
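The pattern table above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (both treat `%` and `_` the same way, and both match ASCII letters case-insensitively by default); the `emp` rows are made-up sample data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the names below are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT)")
conn.executemany(
    "INSERT INTO emp (ename) VALUES (?)",
    [("MARTIN",), ("ADAMS",), ("bab",), ("ba",), ("ab",)],
)


def like(pattern):
    """Return the names matching a LIKE pattern, sorted for stable output."""
    rows = conn.execute(
        "SELECT ename FROM emp WHERE ename LIKE ? ORDER BY ename", (pattern,)
    )
    return [name for (name,) in rows]


starts_with_m = like("M%")   # any length, starts with M
three_mid_a = like("_a_")    # exactly three characters, a in the middle
two_end_a = like("_a")       # exactly two characters, ending with a
two_start_a = like("a_")     # exactly two characters, starting with a

print(starts_with_m, three_mid_a, two_end_a, two_start_a)
```

Each `_` stands for exactly one character, which is why `'_a'` does not match the three-character value `bab`.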
SELECT * FROM websites WHERE name REGEXP '^[GFs]';
SELECT * FROM Websites WHERE name REGEXP '^[^A-H]'; # selects sites whose name does not start with a letter from A to H. The first ^ anchors the start of the string; the second ^ negates the character class.
SELECT * FROM Websites WHERE name BETWEEN 'A' AND 'H';
SELECT * FROM Websites WHERE name NOT BETWEEN 'A' AND 'H';
SELECT * FROM access_log WHERE date BETWEEN '2016-05-10' AND '2016-05-14';
For details see: SQL Date functions
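One detail of BETWEEN on strings is easy to miss: the upper bound 'H' includes the value 'H' itself but not longer names that start with H. A small sketch, assuming sqlite3 in place of MySQL and invented site names:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the site names are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Websites (name TEXT)")
conn.executemany(
    "INSERT INTO Websites VALUES (?)",
    [("Apple",), ("Google",), ("H",), ("Huawei",), ("Stack",)],
)

inside = [n for (n,) in conn.execute(
    "SELECT name FROM Websites WHERE name BETWEEN 'A' AND 'H' ORDER BY name"
)]
outside = [n for (n,) in conn.execute(
    "SELECT name FROM Websites WHERE name NOT BETWEEN 'A' AND 'H' ORDER BY name"
)]

print(inside)   # 'Huawei' sorts after 'H', so it is not in this list
print(outside)
```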
INSERT INTO websites (name, url, alexa, country) VALUES ('百度', 'https://www.baidu.com/', '4', 'CN');
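# The column list and the value list just need to match one-to-one, in order.
UPDATE websites SET alexa='5000', country='USA' WHERE name='菜鳥教程';
# With safe-update mode switched on (set sql_safe_updates=1;), MySQL rejects an UPDATE or DELETE that has neither a WHERE clause on a key column nor a LIMIT. Without that mode, an UPDATE with no WHERE silently changes every row.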
DELETE FROM websites WHERE name='百度' AND country='CN';
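Statement | Example | Effect |
---|---|---|
DELETE | DELETE FROM test | Removes only the rows; keeps the table definition; does not release the space |
TRUNCATE | TRUNCATE TABLE test; | Removes all rows; keeps the table definition; releases the space |
DROP | DROP TABLE test; | Removes all rows and the table definition; releases the space. General form: DROP INDEX / TABLE / DATABASE <name> |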
CREATE DATABASE my_db;
CREATE TABLE Persons
(
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
For details see: SQL general data types; SQL data types for various databases
SELECT name AS n, country AS c FROM Websites;
SELECT name, CONCAT(url, ', ', alexa, ', ', country) AS site_info FROM Websites;
SELECT w.name, w.url, a.count, a.date FROM Websites AS w, access_log AS a WHERE a.site_id=w.id and w.name="菜鳥教程";
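INNER JOIN | Returns rows when there is at least one match in both tables |
LEFT JOIN | Returns all rows from the left table, even when the right table has no match (no NULLs on the left side) |
RIGHT JOIN | Returns all rows from the right table, even when the left table has no match (no NULLs on the right side) |
FULL JOIN | Returns rows when there is a match in either table (both sides may contain NULLs) |
Ref: A detailed explanation of the various SQL joins
select * from A inner join B on A.id=B.id
select * from A left join B on A.id=B.id
select * from A right join B on A.id=B.id
select * from A full outer join B on A.id=B.id # not supported by MySQL; combine a LEFT JOIN and a RIGHT JOIN with UNION instead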
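The difference between INNER JOIN and LEFT JOIN shows up as soon as one table has a row without a partner. A minimal sketch, assuming sqlite3 as a stand-in for MySQL; the tables `A` and `B` and their value columns are hypothetical:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; tables and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (id INTEGER, a_val TEXT);
CREATE TABLE B (id INTEGER, b_val TEXT);
INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
INSERT INTO B VALUES (2, 'b2'), (3, 'b3');
""")

# INNER JOIN keeps only ids present in both tables.
inner_rows = conn.execute(
    "SELECT A.id, a_val, b_val FROM A INNER JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

# LEFT JOIN keeps every row of A; unmatched rows get NULL (None) for B's columns.
left_rows = conn.execute(
    "SELECT A.id, a_val, b_val FROM A LEFT JOIN B ON A.id = B.id ORDER BY A.id"
).fetchall()

print(inner_rows)
print(left_rows)
```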
By default, the UNION operator selects distinct values. To allow duplicate values, use UNION ALL.
SELECT country, name FROM Websites WHERE country='CN' UNION ALL SELECT country, app_name FROM apps WHERE country='CN' ORDER BY country;
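To see the duplicate handling in practice, here is a small sketch assuming sqlite3 in place of MySQL. The two tables reuse the names from the query above, but their rows are made up so that one value appears in both:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Websites (name TEXT, country TEXT);
CREATE TABLE apps (app_name TEXT, country TEXT);
INSERT INTO Websites VALUES ('Taobao', 'CN'), ('Google', 'USA');
INSERT INTO apps VALUES ('Taobao', 'CN'), ('QQ', 'CN');
""")

template = (
    "SELECT country, name FROM Websites WHERE country='CN' "
    "{op} "
    "SELECT country, app_name FROM apps WHERE country='CN' "
    "ORDER BY name"
)

union_rows = conn.execute(template.format(op="UNION")).fetchall()
union_all_rows = conn.execute(template.format(op="UNION ALL")).fetchall()

print(union_rows)       # 'Taobao' appears once
print(union_all_rows)   # 'Taobao' appears twice
```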
MySQL does not support the SELECT ... INTO statement for copying rows into a table, but it does support INSERT INTO ... SELECT.
INSERT INTO websites (name, country) SELECT app_name, country FROM apps;
Back up table data: CREATE TABLE emp AS SELECT * FROM scott.emp
Restore table data: INSERT INTO emp SELECT * FROM scott.emp
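The backup and restore pair can be walked through end to end. This is only a sketch under stated assumptions: sqlite3 replaces MySQL, `emp_backup` is a hypothetical backup table (the note above copies from scott.emp), and the two employees are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; emp_backup is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (ename TEXT, sal INTEGER);
INSERT INTO emp VALUES ('SMITH', 800), ('ALLEN', 1600);

CREATE TABLE emp_backup AS SELECT * FROM emp;  -- back up: new table from a query
DELETE FROM emp;                               -- simulate losing the data
INSERT INTO emp SELECT * FROM emp_backup;      -- restore: copy the rows back
""")

restored = conn.execute("SELECT ename, sal FROM emp ORDER BY ename").fetchall()
print(restored)
```

CREATE TABLE ... AS SELECT copies the data and column names but not constraints or indexes, so the restore direction relies on the original table definition still existing.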
CREATE TABLE Persons ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255) )
CREATE TABLE Persons ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255) DEFAULT 'Sandnes',
OrderDate date DEFAULT GETDATE() ) # GETDATE() is SQL Server syntax
CREATE TABLE Persons ( ID int NOT NULL AUTO_INCREMENT, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255),
PRIMARY KEY (ID) )
By default, AUTO_INCREMENT starts at 1 and increases by 1 for each new record. To set a custom starting value, do the following.
ALTER TABLE Persons AUTO_INCREMENT=100
[1] Unique identifier
CREATE TABLE Persons ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255),
CONSTRAINT uc_PersonID UNIQUE (P_Id,LastName) )
Notes:
(a) names the UNIQUE constraint,
(b) defines a UNIQUE constraint over multiple columns
[2] Drop the constraint
ALTER TABLE Persons DROP INDEX uc_PersonID # drops the UNIQUE constraint
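A multi-column UNIQUE constraint only rejects a row when the whole combination repeats. The sketch below assumes sqlite3 as a stand-in for MySQL and uses a shortened Persons table with sample rows:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is a trimmed-down Persons.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Persons (
    P_Id int NOT NULL,
    LastName varchar(255) NOT NULL,
    FirstName varchar(255),
    CONSTRAINT uc_PersonID UNIQUE (P_Id, LastName)
)
""")

conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 'Ola')")
# Same P_Id but a different LastName: the pair is still unique, so this works.
conn.execute("INSERT INTO Persons VALUES (1, 'Svendson', 'Tove')")

try:
    # The pair (1, 'Hansen') already exists, so this insert must fail.
    conn.execute("INSERT INTO Persons VALUES (1, 'Hansen', 'Kari')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]
print(duplicate_rejected, row_count)
```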
[1] PRIMARY KEY - similar to UNIQUE
CREATE TABLE Persons ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255),
CONSTRAINT pk_PersonID PRIMARY KEY (P_Id,LastName) )
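[2] Drop the constraint
ALTER TABLE Persons
DROP PRIMARY KEY # drops the PRIMARY KEY in MySQL; SQL Server and Oracle use DROP CONSTRAINT pk_PersonID
[0] What is a foreign key
A FOREIGN KEY in one table points to a PRIMARY KEY in another table.
The "P_Id" column in the "Orders" table points to the "P_Id" column in the "Persons" table.
["Persons" table]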
P_Id | LastName | FirstName | Address | City |
---|---|---|---|---|
1 | Hansen | Ola | Timoteivn 10 | Sandnes |
2 | Svendson | Tove | Borgvn 23 | Sandnes |
3 | Pettersen | Kari | Storgt 20 | Stavanger |
["Orders" table]
O_Id | OrderNo | P_Id |
---|---|---|
1 | 77895 | 3 |
2 | 44678 | 3 |
3 | 22456 | 2 |
4 | 24562 | 1 |
[1] Create the foreign key
CREATE TABLE Orders ( O_Id int NOT NULL, OrderNo int NOT NULL, P_Id int,
PRIMARY KEY (O_Id), # this constraint is left unnamed
CONSTRAINT fk_PerOrders FOREIGN KEY (P_Id) REFERENCES Persons(P_Id) )
[2] Drop the foreign key
ALTER TABLE Orders DROP FOREIGN KEY fk_PerOrders
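What the foreign key buys you is that an order cannot point at a person who does not exist. A minimal sketch, assuming sqlite3 in place of MySQL (SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is specific to SQLite); the order with P_Id 99 is an invented bad row:

```python
import sqlite3

# In-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default

conn.execute("CREATE TABLE Persons (P_Id int NOT NULL PRIMARY KEY, LastName varchar(255))")
conn.execute("""
CREATE TABLE Orders (
    O_Id int NOT NULL,
    OrderNo int NOT NULL,
    P_Id int,
    PRIMARY KEY (O_Id),
    CONSTRAINT fk_PerOrders FOREIGN KEY (P_Id) REFERENCES Persons(P_Id)
)
""")

conn.execute("INSERT INTO Persons VALUES (3, 'Pettersen')")
conn.execute("INSERT INTO Orders VALUES (1, 77895, 3)")  # person 3 exists: accepted

try:
    conn.execute("INSERT INTO Orders VALUES (5, 11111, 99)")  # no person 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(orphan_rejected, order_count)
```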
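Switch to table Persons, then add a date column.
ALTER TABLE Persons
ADD DateOfBirth date
Change the column's data type:
ALTER TABLE Persons
MODIFY COLUMN DateOfBirth year # MySQL syntax; SQL Server uses ALTER COLUMN
[1] Create the constraint
(a) names the CHECK constraint,
(b) and defines a CHECK constraint over multiple columns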
CREATE TABLE Persons ( P_Id int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255), Address varchar(255), City varchar(255),
CONSTRAINT chk_Person CHECK (P_Id>0 AND City='Sandnes') )
[2] Drop the constraint
ALTER TABLE Persons DROP CHECK chk_Person
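The effect of the multi-column CHECK can be observed by trying a few inserts. This sketch assumes sqlite3 as a stand-in for MySQL and a shortened Persons table; `try_insert` is a hypothetical helper written for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table is a trimmed-down Persons.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Persons (
    P_Id int NOT NULL,
    LastName varchar(255) NOT NULL,
    City varchar(255),
    CONSTRAINT chk_Person CHECK (P_Id>0 AND City='Sandnes')
)
""")


def try_insert(p_id, last_name, city):
    """Return True if the row passes the CHECK constraint, False otherwise."""
    try:
        conn.execute("INSERT INTO Persons VALUES (?, ?, ?)", (p_id, last_name, city))
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = try_insert(1, "Hansen", "Sandnes")           # both conditions hold
bad_id = try_insert(0, "Svendson", "Sandnes")         # P_Id>0 fails
bad_city = try_insert(2, "Pettersen", "Stavanger")    # City='Sandnes' fails
print(ok_row, bad_id, bad_city)
```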
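An index lets a database application find data faster, without reading the whole table.
Users cannot see indexes; they are only used to speed up searches and queries.
Create an index on the LastName and FirstName columns of the Persons table.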
CREATE INDEX PIndex ON Persons (LastName, FirstName)
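Although the index itself is invisible in query results, most databases can report whether a query uses it. A sketch assuming sqlite3 in place of MySQL: SQLite's `EXPLAIN QUERY PLAN` plays the role that `EXPLAIN` plays in MySQL, and the exact wording of the plan text varies by SQLite version, so the code only looks for the index name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (P_Id int, LastName varchar(255), "
    "FirstName varchar(255), City varchar(255))"
)
conn.execute("CREATE INDEX PIndex ON Persons (LastName, FirstName)")

# Ask the planner how it would run a lookup on the leading index column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Persons WHERE LastName = 'Hansen'"
).fetchall()

# The last column of each plan row is the human-readable detail string.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "PIndex" in plan_text
print(plan_text)
```

A filter on FirstName alone would not benefit in the same way, because LastName is the leading column of the index.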
AVG() - returns the average value of a column
SELECT site_id, count FROM access_log WHERE count > (SELECT AVG(count) FROM access_log);
SUM() - returns the total
SELECT SUM(count) AS nums FROM access_log;
COUNT() - returns the number of rows; because the result is a new column, give it a name with AS.
SELECT COUNT(count) AS nums FROM access_log WHERE site_id=3;
SELECT COUNT(DISTINCT site_id) AS nums FROM access_log;
FIRST() - returns the value of the first record; in MySQL use LIMIT 1
LAST() - returns the value of the last record; in MySQL use ORDER BY id DESC together with LIMIT 1
MAX() - returns the largest value, MIN() - returns the smallest value
SELECT MAX(alexa) AS max_alexa FROM Websites;
SELECT MIN(alexa) AS min_alexa FROM Websites;
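The aggregate functions above are easiest to check on a table small enough to add up by hand. A minimal sketch, assuming sqlite3 as a stand-in for MySQL and four invented access_log rows whose counts total 576 (average 144):

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the four log rows are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (aid INTEGER, site_id INTEGER, count INTEGER)")
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?)",
    [(1, 1, 45), (2, 3, 100), (3, 1, 230), (4, 3, 201)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Rows whose count is above the overall average (144).
above_avg = conn.execute(
    "SELECT site_id, count FROM access_log "
    "WHERE count > (SELECT AVG(count) FROM access_log) ORDER BY aid"
).fetchall()

total = scalar("SELECT SUM(count) AS nums FROM access_log")
rows_site3 = scalar("SELECT COUNT(count) AS nums FROM access_log WHERE site_id=3")
distinct_sites = scalar("SELECT COUNT(DISTINCT site_id) AS nums FROM access_log")

# FIRST()/LAST() equivalents: order by the id column and take one row.
first_count = scalar("SELECT count FROM access_log ORDER BY aid LIMIT 1")
last_count = scalar("SELECT count FROM access_log ORDER BY aid DESC LIMIT 1")

print(above_avg, total, rows_site3, distinct_sites, first_count, last_count)
```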
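Ref: Using GROUP BY in SQL [well written]
Used together with aggregate functions, GROUP BY groups the result set by one or more columns.
[1] Task: total the '數量' (quantity) for each '類別' (category).
[2] Approach: note that the 摘要 (summary) column is left aside here.
select 類別, sum(數量) as 數量之和 from A group by 類別
[3] Result: this is effectively a subtotal per category.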
Ref: http://www.runoob.com/sql/sql-groupby.html
[1] Task: count the access records for every website.
[2] Approach: we want to show name rather than id, so a LEFT JOIN with the websites table is needed.
SELECT websites.name, COUNT(access_log.aid) AS nums FROM access_log # conceptually GROUP BY site_id
LEFT JOIN websites # websites is only a lookup table giving the name for each site_id
ON access_log.site_id=websites.id
GROUP BY websites.name;
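The join-then-group query can be run on a few rows to confirm that the counts are per site name. A sketch assuming sqlite3 in place of MySQL; the two sites and three log rows are sample data, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE websites (id INTEGER, name TEXT);
CREATE TABLE access_log (aid INTEGER, site_id INTEGER, count INTEGER);
INSERT INTO websites VALUES (1, 'Google'), (3, 'runoob');
INSERT INTO access_log VALUES (1, 1, 45), (2, 3, 100), (3, 1, 230);
""")

# The join attaches a name to every log row; GROUP BY then counts rows per name.
visits = conn.execute("""
SELECT websites.name, COUNT(access_log.aid) AS nums
FROM access_log
LEFT JOIN websites ON access_log.site_id = websites.id
GROUP BY websites.name
ORDER BY websites.name
""").fetchall()

print(visits)
```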
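WHERE clause: removes the rows that do not satisfy the condition before the query result is grouped, that is, it filters data before grouping. A WHERE condition cannot contain aggregate functions; it selects specific rows.
HAVING clause: keeps only the groups that satisfy the condition, that is, it filters data after grouping. The condition usually contains aggregate functions; it selects specific groups, and the grouping itself may use several grouping columns.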
select 類別, sum(數量) as 數量之和 from A group by 類別 having sum(數量) > 18
From a single table, keep only the groups whose sum > 18.
Filter once before grouping, and once again after grouping.
select 類別, SUM(數量) from A where 數量 > 8 group by 類別 having SUM(數量) > 10
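The two filtering stages give different answers on the same data, which a tiny table makes visible. This sketch assumes sqlite3 as a stand-in for MySQL and uses the hypothetical English column names `category` and `quantity` in place of 類別 and 數量; the rows are invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (category TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("a", 5), ("a", 2), ("a", 11), ("b", 9), ("b", 6), ("c", 10), ("c", 9)],
)

# HAVING only: sums are a=18, b=15, c=19, and only c exceeds 18.
having_only = conn.execute(
    "SELECT category, SUM(quantity) FROM A "
    "GROUP BY category HAVING SUM(quantity) > 18 ORDER BY category"
).fetchall()

# WHERE first drops rows with quantity <= 8 (leaving a=11, b=9, c=19),
# then HAVING drops groups whose remaining sum is <= 10.
where_then_having = conn.execute(
    "SELECT category, SUM(quantity) FROM A WHERE quantity > 8 "
    "GROUP BY category HAVING SUM(quantity) > 10 ORDER BY category"
).fetchall()

print(having_only)
print(where_then_having)
```

Group a passes the second query but not the first because WHERE changes which rows feed the sum before HAVING ever sees it.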
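Ref: Using GROUP BY in SQL [well written]
The compute clause (legacy SQL Server syntax, not available in MySQL) lets you inspect the detail rows of a query result or summarise its columns (such as max, min and avg in the examples here); the returned result consists of the select list plus the compute summaries.
It runs a further statistical calculation over an intermediate result table.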
select * from A where 數量>8
compute max(數量),min(數量),avg(數量)
select * from A where 數量>8
order by 類別
compute max(數量),min(數量),avg(數量) by 類別
Execution result: (output image not preserved)
UCASE() - converts a field to upper case; LCASE() - converts a field to lower case
SELECT UCASE(name) AS site_title, url FROM Websites;
SELECT LCASE(name) AS site_title, url FROM Websites;
MID() - extracts characters from a text field (MySQL)
SELECT MID(name,1,4) AS ShortTitle FROM Websites;
LENGTH() - returns the length of a text field
SELECT name, LENGTH(url) as LengthOfURL FROM Websites;
ROUND() - rounds a numeric field to the given number of decimal places
Returns the argument X rounded to D decimal places. If D is 0, the result has no decimal point or fractional part.
mysql> select ROUND(1.298, 1); -> 1.3
mysql> select ROUND(1.298, 0); -> 1
NOW() - returns the current system date and time, in the full date-plus-time format.
SELECT name, url, Now() AS date FROM Websites;
FORMAT() - formats how a field is displayed; here DATE_FORMAT() shows the time in a custom format
SELECT name, url, DATE_FORMAT(Now(),'%Y-%m-%d') AS date FROM Websites;
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import csv
from collections import Counter
from decimal import Decimal

import pandas as pd

# Please update .csv path here before running this .py.
str_path = './data_analyst_sample_data.csv'
cols = ["week_sold", 'price', 'num_sold', 'store_id', 'product_code', 'department_name']
dataset = pd.read_csv(str_path, header=None, sep=',', names=cols)

#######
# Q1
#######
# The loops start at index 1 and skip row 0, as in the original script
# (presumably the file's own header line, read as data because header=None).
total_price = 0.0
for i in range(1, len(dataset)):
    if dataset['department_name'][i] == 'BEVERAGE':
        each_price = float(dataset['price'][i]) * float(dataset['num_sold'][i])
        each_price = round(each_price, 2)
        total_price += each_price
print("Total price is %.2f" % total_price)
print("")
###############################################################################
# SELECT SUM(price*num_sold) AS sales FROM <table name> where department_name='BEVERAGE'
###############################################################################

# Count how many rows each product code has; the number of keys is the
# number of unique products.
total_counts = Counter()
for i in range(1, len(dataset)):
    product_code = dataset['product_code'][i]
    total_counts[product_code] += 1
#print(total_counts)
print("There are %d unique products in the store." % len(total_counts))
print("")
###############################################################################
# SELECT product_code, COUNT(*) FROM <table name> GROUP BY product_code
###############################################################################

#######
# Q2
#######
#from datetime import datetime
#def convert_to_month(week_sold):
#    time_str = week_sold
#    time = datetime.strptime(time_str, '%Y-%m-%d')
#    return time.strftime('%Y-%m')
# Average monthly revenue per (store, product): each sale is divided by 3
# (months), rounded to two decimals, and accumulated.
total_counts = Counter()
for i in range(1, len(dataset)):
    # month_sold = convert_to_month(dataset['week_sold'][i])
    store_id = dataset['store_id'][i]
    product_code = dataset['product_code'][i]
    each_price = float(dataset['price'][i]) * float(dataset['num_sold'][i])
    total_counts[store_id, product_code] += Decimal(each_price/3).quantize(Decimal('0.00'))
print(total_counts)
###############################################################################
# SELECT store_id, product_code, round(SUM(price*num_sold)/3.0, 2) FROM <table name> GROUP BY store_id, product_code
###############################################################################

# Save: open the output file once and close it when done.
with open("result.csv", "a", newline="") as out:
    csv_writer = csv.writer(out, dialect="excel")
    csv_writer.writerow(['store_id', 'product_code', 'average_monthly_revenue'])
    for k, v in total_counts.items():
        csv_writer.writerow([k[0], k[1], v])
SELECT CASE
    WHEN A + B > C AND A + C > B AND B + C > A THEN
        CASE
            WHEN A = B AND B = C THEN 'Equilateral'
            WHEN A = B OR B = C OR A = C THEN 'Isosceles'
            ELSE 'Scalene'
        END
    ELSE 'Not A Triangle'
END
FROM TRIANGLES;
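A nested CASE like this is easy to get subtly wrong, so it helps to run it on one row per expected label. The sketch below assumes sqlite3 as a stand-in for MySQL and checks all three triangle inequalities; the side lengths are made up, and the last row (30, 13, 14) is there because it satisfies A + B > C alone while still not being a triangle.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; side lengths are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TRIANGLES (A INTEGER, B INTEGER, C INTEGER)")
conn.executemany(
    "INSERT INTO TRIANGLES VALUES (?, ?, ?)",
    [(20, 20, 20), (20, 20, 23), (20, 21, 22), (13, 14, 30), (30, 13, 14)],
)

labels = [label for (label,) in conn.execute("""
SELECT CASE
    WHEN A + B > C AND A + C > B AND B + C > A THEN
        CASE
            WHEN A = B AND B = C THEN 'Equilateral'
            WHEN A = B OR B = C OR A = C THEN 'Isosceles'
            ELSE 'Scalene'
        END
    ELSE 'Not A Triangle'
END
FROM TRIANGLES
ORDER BY rowid
""")]

print(labels)
```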
SELECT id, (case
    when sex = ' ' then 'bbbbb'
    when sex is null then 'aaaaa' # a simple "case sex when null" never matches, so test with IS NULL
    else sex
end) as sex
FROM aa;
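NULL handling inside CASE deserves a quick experiment: the simple form `CASE sex WHEN NULL` compares with `=`, and `sex = NULL` is never true, so the NULL branch can only be reached through the searched form with `IS NULL`. A sketch assuming sqlite3 in place of MySQL, with three invented rows:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the three rows are sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE aa (id INTEGER, sex TEXT)")
conn.executemany("INSERT INTO aa VALUES (?, ?)", [(1, " "), (2, None), (3, "M")])

# Simple CASE: "WHEN NULL" is evaluated as sex = NULL, which never matches,
# so the NULL row falls through to ELSE and stays NULL.
simple_case = conn.execute("""
SELECT id, (CASE sex WHEN ' ' THEN 'bbbbb' WHEN NULL THEN 'aaaaa' ELSE sex END)
FROM aa ORDER BY id
""").fetchall()

# Searched CASE: IS NULL is a real NULL test, so the row is relabelled.
searched_case = conn.execute("""
SELECT id, (CASE WHEN sex = ' ' THEN 'bbbbb' WHEN sex IS NULL THEN 'aaaaa' ELSE sex END)
FROM aa ORDER BY id
""").fetchall()

print(simple_case)
print(searched_case)
```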
SELECT concat( NAME, concat("(",concat( substr(OCCUPATION,1,1), ")")) ) FROM OCCUPATIONS ORDER BY NAME ASC;
SELECT "There are a total of ", count(OCCUPATION), concat(lower(occupation),"s.") FROM OCCUPATIONS GROUP BY OCCUPATION ORDER BY count(OCCUPATION), OCCUPATION ASC
set @r1=0, @r2=0, @r3=0, @r4=0;
select min(Doctor), min(Professor), min(Singer), min(Actor)
from (
    select
        case
            when Occupation='Doctor' then (@r1:=@r1+1)
            when Occupation='Professor' then (@r2:=@r2+1)
            when Occupation='Singer' then (@r3:=@r3+1)
            when Occupation='Actor' then (@r4:=@r4+1)
        end as RowNumber,
        case when Occupation='Doctor' then Name end as Doctor,
        case when Occupation='Professor' then Name end as Professor,
        case when Occupation='Singer' then Name end as Singer,
        case when Occupation='Actor' then Name end as Actor
    from OCCUPATIONS
    order by Name
) Temp
group by RowNumber;
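In SQL Server (T-SQL), an identifier that starts with an at sign (@) denotes a local variable or parameter. In MySQL, as in the query above, @name is a user-defined session variable.
An identifier that starts with a number sign (#) denotes a temporary table or procedure.
An identifier that starts with two number signs (##) denotes a global temporary object.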
SET sql_mode = '';
SELECT Start_Date, End_Date
FROM
    (SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) a,
    (SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) b
WHERE Start_Date < End_Date
GROUP BY Start_Date
ORDER BY DATEDIFF(End_Date, Start_Date), Start_Date
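sql_mode="" means that no MySQL SQL mode is set (no input validation, error reporting, strict syntax checks and so on). This should improve performance, but it has the following problem:
If unsuitable data is inserted (wrong type or out of range), MySQL sets the data to the "best possible value" without raising an error, for example:
Number | 0 / smallest possible value / largest possible value |
String | Empty string / the longest string the column can store |
Expression | Returns a usable value (1/0 gives NULL) |
So the fix is to give every column a default value, which is also good for performance.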