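Data representation -> data cleaning -> data statistics -> data visualization -> data mining -> artificial intelligence
Artificial intelligence: in-depth analysis and decision-making over data, language, images, vision, and related areas
Python libraries for machine learning
NumPy: the most fundamental library for representing N-dimensional arrays, http://www.numpy.org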
import numpy as np

def np_sum():
    a = np.array([0, 1, 2, 3, 4])
    b = np.array([9, 8, 7, 6, 5])
    # Arithmetic on arrays is element-wise, so no explicit loop is needed
    c = a**2 + b**3
    return c

print(np_sum())
[729 513 347 225 141]
def py_sum():
    a = [0, 1, 2, 3, 4]
    b = [9, 8, 7, 6, 5]
    c = []
    # Plain Python lists need an explicit loop for the same computation
    for i in range(len(a)):
        c.append(a[i]**2 + b[i]**3)
    return c

print(py_sum())
[729, 513, 347, 225, 141]
Pandas: a high-level library for data analysis in Python, http://pandas.pydata.org
It can read and write formats such as SQL, JSON, pickle, CSV, and Excel
DataFrame = row and column indexes + two-dimensional data
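To make the "indexes + two-dimensional data" idea concrete, here is a minimal sketch. The names and scores are hypothetical sample data, not taken from the original post.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
df = pd.DataFrame(
    [[90, 85], [72, 64], [88, 95]],   # the two-dimensional data
    index=["alice", "bob", "carol"],  # row index
    columns=["math", "physics"],      # column index
)

print(df.shape)                  # (number of rows, number of columns)
print(df.loc["bob", "physics"])  # look up one cell by row and column label
print(df["math"].mean())         # a column-wise statistic
```

Label-based lookup with `df.loc` is what separates a DataFrame from a bare 2D array, which only supports positional indexing.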
SciPy: a library of mathematical, scientific, and engineering computing routines, http://www.scipy.org
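As a rough illustration of the kind of routine SciPy provides, the sketch below integrates a function numerically and minimizes a simple one-variable function. The quadratic being minimized is a made-up example.

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) over [0, pi]; the exact value is 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Minimize a hypothetical quadratic whose minimum lies at x = 3.
res = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2)

print(area)   # close to 2
print(res.x)  # close to 3
```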
Matplotlib: a library for high-quality two-dimensional data visualization, http://matplotlib.org
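A minimal plotting sketch, assuming a headless environment (hence the non-interactive Agg backend). The output file name sine.png is an arbitrary choice for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")  # draw one curve
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.savefig("sine.png")  # hypothetical output file name
```

In an interactive session, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` would open a window instead of writing a file.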
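Seaborn: a library for statistical data visualization, http://seaborn.pydata.org/
Mayavi: a library for three-dimensional scientific data visualization, http://docs.enthought.com/mayavi/mayavi/
PyPDF2: a toolkit for working with PDF files, http://mstamy2.github.io/PyPDF2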
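from PyPDF2 import PdfFileMerger  # newer PyPDF2 releases name this class PdfMerger

merger = PdfFileMerger()
input1 = open("document1.pdf", "rb")
input2 = open("document2.pdf", "rb")

# Append the first three pages of document1.pdf
merger.append(fileobj=input1, pages=(0, 3))
# Insert the first page of document2.pdf at position 2 of the output
merger.merge(position=2, fileobj=input2, pages=(0, 1))

with open("document-output.pdf", "wb") as output:
    merger.write(output)

# Release the merger and the input files
merger.close()
input1.close()
input2.close()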
NLTK: a third-party library for natural language text processing, http://www.nltk.org/
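from nltk.corpus import treebank

# Requires the treebank corpus, e.g. via nltk.download('treebank')
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()  # opens a window showing the parse tree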
Python-docx: a third-party library for creating or updating Microsoft Word files, http://python-docx.readthedocs.io/en/latest/index.html
from docx import Document

document = Document()
document.add_heading('Document Title', 0)  # level 0 uses the title style
p = document.add_paragraph('A plain paragraph having some ')
document.add_page_break()
document.save('demo.docx')
Scikit-learn: a toolkit of machine learning methods, a third-party library closely tied to data processing, http://scikit-learn.org/
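A short sketch of the usual scikit-learn workflow (split, fit, score), assuming the bundled iris dataset and a logistic regression classifier. Both choices are illustrative and do not come from the original post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = LogisticRegression(max_iter=1000)  # illustrative model choice
clf.fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions
print(accuracy)
```

Most scikit-learn estimators share the same `fit` and `score` interface, so swapping in a different model usually changes only the line that constructs `clf`.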
TensorFlow: the machine learning computation framework behind AlphaGo, https://www.tensorflow.org/
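import tensorflow as tf

# TensorFlow 1.x graph-mode API; under TensorFlow 2.x the same calls live in tf.compat.v1.
# The original snippet never defined `result`; a simple constant sum is assumed here.
result = tf.add(tf.constant(1), tf.constant(2))

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
res = sess.run(result)  # evaluate the graph node
print('result:', res)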
MXNet: a deep learning computation framework based on neural networks, https://mxnet.incubator.apache.org/