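Background: yet another ugly round of data migration and data cleaning. Every time a requirement like this lands on me, I silently curse the PM's whole family tree. The usual options are to write a pile of supporting code in Java or to hack together clumsy shell scripts on a VM, and neither is pleasant. Recently I tried solving this kind of problem with Python and it felt pretty good. I was quietly pleased: "Baby, I'm no longer afraid of the PM's brain-dead requirements."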
Environment setup: Ubuntu 13.04.
# python
At least Python 2.7 or 3.2 is recommended, since Python 2.x and 3.x still differ in places.
sudo apt-get install python2.7 python2.7-dev
# Install the build environment (libssl, libevent and related headers)
sudo apt-get install build-essential libssl-dev libevent-dev libjpeg-dev libxml2-dev libxslt-dev
# Install MySQLdb
sudo easy_install mysql-python
# Verify the Python installation
whereis python && python -V
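Getting started with Python: with the environment above in place, you can start writing Python. Create a Python file with touch firstPython.py. The file name is firstPython and the extension is py. Edit this file. Like PHP and Java, Python comes with its own standard library.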
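The main block is the entry point that runs when a Python file is executed.
Read the arguments passed to the script from sys.argv. It is a list of arguments, and each one is accessed as sys.argv[index]; sys.argv[0] is the script name itself.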
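A minimal sketch of those two ideas together: the `if __name__ == '__main__':` entry point and reading `sys.argv`. The helper name `parse_profile` is my own, made up for illustration; the messages mirror the sample script at the end of this post.

```python
import sys


def parse_profile(argv):
    """Return the profile name from an argv-style list, or None if it is missing."""
    # argv[0] is the script name; the real arguments start at index 1.
    if len(argv) != 2:
        return None
    return argv[1]


if __name__ == '__main__':
    # This block only runs when the file is executed directly,
    # not when it is imported from another module.
    profile = parse_profile(sys.argv)
    if profile is None:
        print("please input parameter : (beta | product | productb)")
    else:
        print("current DB profile is %s" % profile)
```

Run it as `python firstPython.py beta` to see the profile echoed back.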
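Define functions and call them from the main block.
profile is the function's parameter. Python is dynamically typed, so variables need no type declarations, which sets it apart from statically typed languages such as C++ and Java.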
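To make the point about undeclared types concrete, here is a small sketch. The function names `describe` and `mixing_types_fails` are hypothetical, chosen only for this example. Note that although no types are declared, Python still refuses to mix a string and a number silently.

```python
def describe(value):
    """Return the runtime type name of value together with its text form."""
    # No type is declared for `value`; the type travels with the object.
    return "%s: %s" % (type(value).__name__, value)


def mixing_types_fails():
    """Return True if adding a str and an int raises TypeError."""
    try:
        "3306" + 1
    except TypeError:
        return True
    return False


if __name__ == '__main__':
    # The same function accepts a str, an int, or a list.
    for item in ("beta", 3306, [1, 2]):
        print(describe(item))
    print(mixing_types_fails())
```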
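Since my use case is data migration, efficient database access from Python matters a lot.
Opening a connection and fetching rows takes only a few calls, which feels convenient and concise, much like the library style in PHP.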
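The connect, cursor, execute, fetch pattern can be sketched without a MySQL server. The example below swaps in the standard-library `sqlite3` module, which follows the same Python DB-API as MySQLdb, so it is a stand-in rather than the MySQLdb code itself. The table name comes from the sample script; the two rows and the helper `fetch_accounts` are invented for illustration. One difference to keep in mind: sqlite3 uses `?` placeholders where MySQLdb uses `%s`.

```python
import sqlite3


def fetch_accounts(conn):
    """Run a query on an open connection and return all rows as tuples."""
    cursor = conn.cursor()
    try:
        cursor.execute(
            "select user_id, hotel_id from hotel_sub_account order by user_id")
        return cursor.fetchall()
    finally:
        cursor.close()


# An in-memory database stands in for the real MySQL server.
conn = sqlite3.connect(":memory:")
try:
    conn.execute(
        "create table hotel_sub_account (user_id text, hotel_id integer)")
    # Made-up sample rows; `?` is sqlite3's parameter placeholder.
    conn.executemany(
        "insert into hotel_sub_account values (?, ?)",
        [("alice", 10), ("bob", 20)])
    rows = fetch_accounts(conn)
finally:
    # Always release the connection, even if a query fails.
    conn.close()

for row in rows:
    print(row)
```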
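Python data structures: built-in types
1. list: a list (a dynamic array, like vector in the C++ standard library, except that a single list can hold elements of different types)
List indices start at 0, and -1 refers to the last element. Get the number of elements with len(list).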
Creating a list of consecutive integers, for example with range()
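A quick sketch of building consecutive lists with the built-in `range`, plus the indexing rules mentioned above. The variable names are mine. Wrapping the call in `list()` keeps the code working on both Python 2 and Python 3, where `range` no longer returns a list.

```python
# range(start, stop) excludes the stop value.
L = list(range(1, 5))          # [1, 2, 3, 4]

# An optional third argument is the step.
odds = list(range(1, 10, 2))   # [1, 3, 5, 7, 9]

first = L[0]     # indices start at 0
last = L[-1]     # -1 is the last element
size = len(L)    # number of elements

print(L)
print(odds)
```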
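L.append(var) # append an element
L.insert(index, var) # insert var before position index
L.pop() # remove and return the last element; L.pop(index) removes the element at index
L.remove(var) # remove the first occurrence of var
L.count(var) # number of times var occurs in the list
L.index(var) # position of the first occurrence of var; raises ValueError if absent
L.extend(list) # append all elements of another list, i.e. merge it into L
L.sort() # sort in place
L.reverse() # reverse in place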
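The list methods above, run in sequence on a small made-up list so the effect of each call is visible. The comments show the state of `L` after each step.

```python
L = [3, 1, 2]

L.append(1)          # L is now [3, 1, 2, 1]
L.insert(1, 'x')     # L is now [3, 'x', 1, 2, 1]
popped = L.pop()     # returns 1; L is now [3, 'x', 1, 2]
L.remove('x')        # L is now [3, 1, 2]
L.extend([1, 5])     # L is now [3, 1, 2, 1, 5]

ones = L.count(1)    # 2, because 1 appears twice
position = L.index(2)  # 2, the position of the value 2

# index() raises ValueError when the value is absent.
try:
    L.index(99)
    missing_raises = False
except ValueError:
    missing_raises = True

L.sort()             # L is now [1, 1, 2, 3, 5]
L.reverse()          # L is now [5, 3, 2, 1, 1]
print(L)
```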
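2. dictionary: a dictionary (like map in the C++ standard library)
Each element is a key-value pair. Keys are usually integers or strings (any hashable type works), and values can be of any type.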
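dictionary methods:
D.get(key, 0) # like D[key], but returns the default (here 0) when the key is missing; D[key] raises KeyError instead
D.has_key(key) # True if the key exists, otherwise False (Python 2 only; use key in D on Python 3)
D.keys() # list of the dictionary's keys (a view object on Python 3)
D.values() # the dictionary's values as a list; the result may contain duplicates
D.items() # all entries as a list of (key, value) tuples, returned in no particular sorted order
D.update(dict2) # merge dict2 into D, overwriting keys that already exist
D.popitem() # remove and return one (key, value) pair; raises KeyError if the dictionary is empty
D.clear() # empty the dictionary in place (unlike del D, which removes the name D itself)
D.copy() # return a shallow copy of the dictionary
cmp(dict1, dict2) # Python 2 built-in function, not a method: compares dictionaries (by number of elements, then keys, then values); returns 1 if the first is larger, -1 if smaller, 0 if equal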
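A short sketch of the dictionary methods in action. The keys mimic the connection settings used later in the sample script, and the values (such as 'localhost') are placeholders I made up. To stay runnable on Python 3 as well, it uses `key in D` in place of `has_key` and leaves out `cmp`.

```python
D = {'host': 'localhost', 'port': 3306}

port = D.get('port', 0)     # 3306, the key exists
user = D.get('user', 0)     # 0, the default for a missing key
has_host = 'host' in D      # True; replaces D.has_key('host')

# D[key] raises KeyError for a missing key, unlike get().
try:
    D['user']
    missing_raises = False
except KeyError:
    missing_raises = True

keys = sorted(D.keys())     # sorted, because dict order is not guaranteed

D.update({'db': 'hms', 'port': 3307})   # adds 'db', overwrites 'port'
items = sorted(D.items())

pair = D.popitem()          # removes and returns one (key, value) pair
remaining = len(D)

D.clear()                   # D is now empty, but the name D still exists
print(D)
```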
Copying a dictionary
dict1 = dict # alias: both names refer to the same dictionary
dict2 = dict.copy() # clone: a separate (shallow) copy
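The difference between an alias and a copy is easiest to see by modifying one and checking the other. This sketch uses made-up values; it also shows that `copy()` is shallow, so nested objects are still shared, and that `copy.deepcopy` from the standard library is one way around that.

```python
import copy

original = {'host': 'localhost', 'ports': [3306]}

alias = original            # same dictionary under a second name
clone = original.copy()     # new dictionary with the same top-level entries

alias['host'] = 'example'   # visible through `original`, not through `clone`

# copy() is shallow: the nested list is shared between original and clone.
clone['ports'].append(3307)

# deepcopy duplicates nested objects too.
deep = copy.deepcopy(original)
deep['ports'].append(3308)

print(original)
print(clone)
print(deep)
```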
Sample code:
import sys

import MySQLdb

# Connection settings, filled in by configDbProfile().
hms_connections = {}
transfer_connections = {}
# Rows read from hms; queryFromTransfer() adds qta_id to each one.
totalResult = []


def queryFromHms():
    print("query from hms beginning...")
    db = MySQLdb.connect(host=hms_connections.get('host'),
                         user=hms_connections.get('user'),
                         passwd=hms_connections.get('passwd'),
                         db=hms_connections.get('db'),
                         port=hms_connections.get('port'))
    try:
        cursor = db.cursor()
        sql = "select a.user_id,a.hotel_id,a.parent_group_id from hotel_sub_account a inner join lm_transfer_hotel b on a.hotel_id = b.hotel_id and b.QTA_STATUS=1"
        print(sql)
        cursor.execute(sql)
        results = cursor.fetchall()
        for row in results:
            print(row)
            totalResult.append({"user_id": row[0],
                                "hotel_id": row[1],
                                "parent_group_id": row[2]})
        cursor.close()
    finally:
        db.close()
        print("function queryFromHms to close db connection...")


def queryFromTransfer():
    print("query from transfer beginning...")
    db = MySQLdb.connect(host=transfer_connections.get('host'),
                         user=transfer_connections.get('user'),
                         passwd=transfer_connections.get('passwd'),
                         db=transfer_connections.get('db'),
                         port=transfer_connections.get('port'))
    # %s is a query parameter here; the driver does the quoting.
    sql = "select qta_id,hms_id from mapping_hms_qta_price where hms_level=1 and qta_level=1 and hms_id = %s"
    try:
        for row in totalResult:
            cursor = db.cursor()
            params = (row.get('parent_group_id'),)
            print(sql % params)
            cursor.execute(sql, params)
            results = cursor.fetchall()
            # Rows without a mapping get no qta_id and will raise
            # KeyError in the output functions below.
            for subrow in results:
                print(subrow)
                row["qta_id"] = subrow[0]
            cursor.close()
    finally:
        db.close()
        print("function queryFromTransfer to close db connection...")


def outputSupplierAccount():
    print("output sql to supplier_account...")
    upgradeSql = "insert into supplier_account (`supplier_id`, `account`, `create_time`, `is_delete`) values(%(qta_id)s, '%(user_id)s', now(), 0); \n"
    callbackSql = "delete from supplier_account where supplier_id = %(qta_id)s and account = '%(user_id)s'; \n"
    # The with statement closes both files, even on error.
    with open("qta_upgrade.sql", "w") as upgradeFile, open("qta_callback.sql", "w") as callbackFile:
        for row in totalResult:
            upgradeFile.write(upgradeSql % row)
            callbackFile.write(callbackSql % row)


def outputUserHotelMapping():
    print("output sql to eb_auth_user_hotel_mapping...")
    upgradeSql = "insert into eb_auth_user_hotel_mapping (`user_name`, `hotel_id`, `create_time`, `hotel_seq`, `supplier_id`, `group_id`) values('%(user_id)s', '', now(), '', %(qta_id)s, %(parent_group_id)s); \n"
    callbackSql = "delete from eb_auth_user_hotel_mapping where `user_name`='%(user_id)s' and `supplier_id`=%(qta_id)s and `group_id`=%(parent_group_id)s; \n"
    with open("hms_upgrade.sql", "w") as upgradeFile, open("hms_callback.sql", "w") as callbackFile:
        for row in totalResult:
            upgradeFile.write(upgradeSql % row)
            callbackFile.write(callbackSql % row)


def outputUserUriMapping():
    print("output sql to eb_auth_user_uri_mapping...")
    uris = [1,2,3,5,6,7,8,9,10,21,22,24,34,35,36,37,40,41,42,43,44,46,47,49,50,54,55,56,57,58,59,60,61,62,63,76,77,78,79]
    # Append to the files written by outputUserHotelMapping().
    with open("hms_upgrade.sql", "a") as upgradeFile, open("hms_callback.sql", "a") as callbackFile:
        for row in totalResult:
            for uri in uris:
                upgradeSql = "insert into eb_auth_user_uri_mapping(`user_name`, `uri_id`, `create_time`) values('%s', %s, now()); \n" % (row['user_id'], uri)
                callbackSql = "delete from eb_auth_user_uri_mapping where user_name='%s' and uri_id=%s; \n" % (row['user_id'], uri)
                upgradeFile.write(upgradeSql)
                callbackFile.write(callbackSql)


def configDbProfile(profile):
    print("current DB profile is %s" % profile)
    # Host, user, password and some database names are left blank here.
    if profile == "beta":
        hms_connections.update(host="", user="", passwd="", db="", port=3306)
        transfer_connections.update(host="", user="", passwd="", db="data_transfer", port=3306)
    elif profile == "product":
        hms_connections.update(host="", user="", passwd="", db="hms", port=3307)
        transfer_connections.update(host="", user="", passwd="", db="", port=3307)
    elif profile == "productb":
        hms_connections.update(host="", user="", passwd="", db="hms", port=3307)
        transfer_connections.update(host="", user="", passwd="", db="data_transfer", port=3308)
    else:
        print("input parameter invalid, choose (beta | product | productb)")
        sys.exit(1)


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print("please input parameter : (beta | product | productb)")
        sys.exit(1)
    profile = sys.argv[1]
    configDbProfile(profile)
    queryFromHms()
    queryFromTransfer()
    outputSupplierAccount()
    outputUserHotelMapping()
    outputUserUriMapping()
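The script generates its SQL files by filling `%(name)s` templates from a dictionary. Here is that one technique in isolation, using the supplier_account template from the script and a made-up row (the values 'alice', 7 and 42 are purely illustrative). Treat it as a sketch for trusted data only: this is plain string substitution, so nothing is escaped.

```python
upgradeSql = "insert into supplier_account (`supplier_id`, `account`, `create_time`, `is_delete`) values(%(qta_id)s, '%(user_id)s', now(), 0); \n"

# A hypothetical row in the shape that totalResult holds.
row = {"user_id": "alice", "qta_id": 7, "parent_group_id": 42}

# Each %(key)s is replaced by row[key]; unused keys are ignored.
statement = upgradeSql % row

# A row that lacks a referenced key raises KeyError.
try:
    upgradeSql % {"user_id": "bob"}
    missing_raises = False
except KeyError:
    missing_raises = True

print(statement)
```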