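Our production MySQL database has a table that stores daily statistics, and it grows by more than 10 million rows every day, far more than we ever expected statistical results to produce. Ops came to us because the table was taking up 200 GB of disk. After checking with the operations team, we learned that only the most recent 3 days need to be kept, so everything older had to be deleted. But how?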
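Because this is a production database that holds many other tables, deleting this table's data in one go was out of the question, since it could affect the other tables. We tried deleting just one day of data at a time, but the database still stalled badly. With no better option, we wrote a Python script to delete the data in batches.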
The approach is as follows:
# -*- coding: utf-8 -*-
import sys

# Our internally packaged Python modules
sys.path.append('/var/lib/hadoop-hdfs/scripts/python_module2')

import keguang.commons as commons
import keguang.timedef as timedef
import keguang.sql.mysqlclient as mysql


def run(starttime, endtime, regx):
    tb_name = 'statistic_ad_image_final_count'
    days = timedef.getDays(starttime, endtime, regx)
    # Delete the data for every day in the range
    for day in days:
        print('%s: deletion started' % day)
        mclient = getConn()
        sql = "select 1 from %s where date = '%s' limit 1" % (tb_name, day)
        print(sql)
        result = mclient.query(sql)
        # While rows for this day remain, keep deleting in batches.
        # A truthiness test replaces the original identity check `is not ()`.
        while result:
            sql = 'delete from %s where date = "%s" limit 50000' % (tb_name, day)
            print(sql)
            mclient.execute(sql)
            sql = "select 1 from %s where date = '%s' limit 1" % (tb_name, day)
            print(sql)
            result = mclient.query(sql)
        print('%s: deletion finished' % day)
        mclient.close()


# Return a MySQL connection
def getConn():
    return mysql.MysqlClient(host='0.0.0.0', user='test', passwd='test', db='statistic')


if __name__ == '__main__':
    regx = '%Y-%m-%d'
    yesday = timedef.getYes(regx, -1)
    starttime = '2019-08-17'
    endtime = '2019-08-30'
    run(starttime, endtime, regx)
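The loop checks whether any rows remain for the current day. If so, it deletes another 50,000 rows for that day; otherwise, it moves on to the next day. It took about half an hour, but the data was finally all deleted.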
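For readers who want to try the batching idea without our internal modules, here is a minimal, self-contained sketch. It is not the script we ran: it assumes SQLite (from the Python standard library) in place of MySQL, `get_days` is a hypothetical stand-in for what `timedef.getDays` presumably does, and the table name `stats` is made up for the demo. Default SQLite builds do not support `DELETE ... LIMIT`, so the sketch limits each batch through a subquery on `rowid`. It also stops when a batch deletes zero rows instead of issuing a separate `select 1` probe.

```python
import sqlite3
import time
from datetime import date, timedelta


def get_days(start, end):
    """Return every date from start to end (inclusive) as 'YYYY-MM-DD' strings.

    Hypothetical stand-in for the internal timedef.getDays helper.
    """
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)
    return [(first + timedelta(days=i)).isoformat()
            for i in range((last - first).days + 1)]


def delete_day_in_batches(conn, table, day, batch_size=50000, pause=0.0):
    """Delete all rows for one day, at most batch_size rows per transaction.

    Returns the total number of rows deleted.
    """
    # A table name cannot be bound as a parameter, so it must come from
    # trusted code, never from user input.
    sql = (
        "DELETE FROM {t} WHERE rowid IN "
        "(SELECT rowid FROM {t} WHERE date = ? LIMIT ?)".format(t=table)
    )
    total = 0
    while True:
        cur = conn.execute(sql, (day, batch_size))
        conn.commit()          # each batch gets its own short transaction
        if cur.rowcount == 0:  # nothing left for this day
            return total
        total += cur.rowcount
        time.sleep(pause)      # optional pause between batches


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (date TEXT, value INTEGER)")
    days = get_days("2019-08-17", "2019-08-19")
    conn.executemany(
        "INSERT INTO stats VALUES (?, ?)",
        [(d, i) for d in days for i in range(25)],
    )
    conn.commit()
    # Delete the two oldest days, 10 rows at a time, and keep the newest day.
    for day in days[:2]:
        deleted = delete_day_in_batches(conn, "stats", day, batch_size=10)
        print("%s: deleted %d rows" % (day, deleted))
    conn.close()
```

Committing after every batch keeps each transaction small, which is the point of batching. The `pause` argument is an extra knob that our original script did not have; it may help if other queries need room between batches.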