===============================================
2019/7/16_Revision 1 ccb_warlock
===============================================
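Following on from the previous topic (https://www.cnblogs.com/straycats/p/11198340.html): with the table structure and table contents backed up, the next step is to delete data.
While deleting, however, I found that the same business data had been written to the database more than once (that is, apart from the index field, every other field is completely identical). This makes the table's linear growth even steeper; the dirty data not only wastes space but also hurts query efficiency.
So, once again, I used SQL statements to delete the logically duplicated rows.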
The structure of the daily_t table is as follows:
Field | Description |
TID | Index id |
USER_ID | User id |
STATS_DATE | Date |
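To try the statements in this post on a scratch database, a minimal sketch of the table could look like the following. Only the three column names come from the table above; the column types, the AUTO_INCREMENT primary key and the sample rows are assumptions made up for illustration.

```sql
-- Hypothetical definition: only the column names are from the post;
-- the types and the AUTO_INCREMENT primary key are assumptions.
CREATE TABLE daily_t (
    TID        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    USER_ID    INT UNSIGNED NOT NULL,
    STATS_DATE DATE         NOT NULL,
    PRIMARY KEY (TID)
);

-- Made-up sample rows: user 1 has three copies for 2019-07-01,
-- user 2 has a single row for the same date.
INSERT INTO daily_t (USER_ID, STATS_DATE) VALUES
    (1, '2019-07-01'),
    (1, '2019-07-01'),
    (1, '2019-07-01'),
    (2, '2019-07-01');
```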
-- Preview the rows that will be deleted: every copy of a duplicated
-- (USER_ID, STATS_DATE) pair except the one with the smallest TID.
SELECT *
FROM daily_t
WHERE (USER_ID, STATS_DATE) IN (
        SELECT * FROM (
            SELECT USER_ID, STATS_DATE
            FROM daily_t
            GROUP BY USER_ID, STATS_DATE
            HAVING COUNT(*) > 1
        ) A
      )
  AND TID NOT IN (
        SELECT * FROM (
            SELECT MIN(TID)
            FROM daily_t
            GROUP BY USER_ID, STATS_DATE
            HAVING COUNT(*) > 1
        ) B
      )
ORDER BY USER_ID, STATS_DATE;
-- Delete the duplicates, keeping the row with the smallest TID
-- for each duplicated (USER_ID, STATS_DATE) pair.
DELETE FROM daily_t
WHERE (USER_ID, STATS_DATE) IN (
        SELECT * FROM (
            SELECT USER_ID, STATS_DATE
            FROM daily_t
            GROUP BY USER_ID, STATS_DATE
            HAVING COUNT(*) > 1
        ) A
      )
  AND TID NOT IN (
        SELECT * FROM (
            SELECT MIN(TID)
            FROM daily_t
            GROUP BY USER_ID, STATS_DATE
            HAVING COUNT(*) > 1
        ) B
      );
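Once the DELETE has run, a quick sanity check is to re-run the grouping on its own. This is a minimal sketch and not part of the original procedure; the alias copies is made up for illustration. If nothing else is writing to the table in the meantime, it should return no rows, because every (USER_ID, STATS_DATE) pair now appears only once.

```sql
-- List any (USER_ID, STATS_DATE) pairs that still occur more than once.
-- "copies" is a hypothetical alias; an empty result means no duplicates remain.
SELECT USER_ID, STATS_DATE, COUNT(*) AS copies
FROM daily_t
GROUP BY USER_ID, STATS_DATE
HAVING COUNT(*) > 1;
```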
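PS. The subqueries in the SQL statements are wrapped in an extra layer (SELECT *) to work around a MySQL restriction: a subquery in the WHERE clause may not read from the table that the statement is deleting from (1093 - You can't specify target table for update in FROM clause). With the extra (SELECT *) layer, the WHERE clause reads from the two temporary tables A and B rather than from daily_t directly.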
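The difference can be sketched with a stripped-down variant that keeps only the TID NOT IN condition. This is an illustration under assumptions, not the statement used above: the alias keep_rows is made up, and the first statement is left commented out because MySQL rejects it with error 1093.

```sql
-- Rejected by MySQL with error 1093: the subquery reads daily_t directly,
-- which is the same table the DELETE is modifying.
-- DELETE FROM daily_t
-- WHERE TID NOT IN (
--     SELECT MIN(TID) FROM daily_t GROUP BY USER_ID, STATS_DATE
-- );

-- Accepted: the extra SELECT * turns the inner query into a derived table
-- (hypothetical alias keep_rows), so the WHERE clause no longer reads
-- daily_t directly.
DELETE FROM daily_t
WHERE TID NOT IN (
    SELECT * FROM (
        SELECT MIN(TID) FROM daily_t GROUP BY USER_ID, STATS_DATE
    ) AS keep_rows
);
```

As far as I understand, the workaround depends on the derived table being materialized rather than merged into the outer statement; the GROUP BY and MIN() inside it should prevent merging, which is why the wrapped form gets past the 1093 check.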