Machine Learning Study Notes 1: A First Look at Machine Learning

Introduction:

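I have built a few simple machine-learning-related toys before, such as OCR and license plate recognition, but I never studied machine learning systematically. I now plan to work through it properly. These are my notes from Andrew Ng's Machine Learning course.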

What Is Machine Learning:

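The video presents definitions of machine learning from two experts in the field:

1. Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.
In other words, the computer picks up a skill without anyone writing explicit rules for it.

2. Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
In other words, if a program's performance on task T, as measured by P, gets better as its experience E grows, then the program is learning from E.

P.S.: The two statements look different, but they describe the same thing. The second is more concrete. My own reading is that the computer is trained until it can carry out a task in some specific setting.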

Supervised Learning:

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output. Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

P.S.: The course uses house-price prediction as an example, which is fairly easy to follow. The idea is that some things obey an underlying rule. Suppose, for instance, that house price is a quadratic function of one variable, where x is the floor area and y is the price. Feeding in more and more data lets us find the coefficients (and the exponent) on x, and once the formula is pinned down it can be used to predict the price for any other area x. It is called supervised learning because a human supplies the computer with the correct answers, steering it toward the result we want.
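The house-price idea above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the course's method: it assumes the price really is a quadratic in the area, the training data is made up, and the helper names (`solve_linear_system`, `fit_quadratic`, `predict`) are hypothetical. The fit uses ordinary least squares via the normal equations.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    # Each row of the design matrix is [x^2, x, 1].
    rows = [[x * x, x, 1.0] for x in xs]
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


def predict(coeffs, x):
    """Evaluate the fitted quadratic at a new area x."""
    a, b, c = coeffs
    return a * x * x + b * x + c


# Made-up training data (the "correct answers" a human supplies):
# area in units of 100 m^2, price in arbitrary units.
areas = [0.5, 0.8, 1.0, 1.2, 1.5, 2.0]
prices = [2.0 * x * x + 3.0 * x + 1.0 for x in areas]

coeffs = fit_quadratic(areas, prices)
print("fitted (a, b, c):", coeffs)
print("predicted price for area 1.8:", predict(coeffs, 1.8))
```

Because the synthetic prices are generated from an exact quadratic, the fit should recover its coefficients almost exactly. Real price data is noisy, so the fitted curve would only approximate it.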

Unsupervised Learning:

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables. We can derive this structure by clustering the data based on relationships among the variables in the data. With unsupervised learning there is no feedback based on the prediction results.

P.S.: Unsupervised learning is a process in which the computer learns on its own. No human supplies right or wrong answers. The program analyzes the similarities and differences in the data and separates it into groups. Clustering algorithms are the most common example.
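To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. The notes do not name a specific clustering algorithm, so k-means is my own choice for illustration. The points are made up, the function names are hypothetical, and the farthest-point initialization is a simplification chosen to keep the run deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=20):
    """Group points into k clusters; returns (labels, centroids)."""
    # Deterministic start: first point, then repeatedly the point that is
    # farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(farthest)

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # Converged: nothing moved.
        centroids = new_centroids
    return labels, centroids


# Made-up, unlabeled data: two visibly separate groups of 2-D points.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
]
labels, centroids = k_means(points, k=2)
print("labels:", labels)
print("centroids:", centroids)
```

Note that no correct answers are passed in anywhere. The grouping comes only from the distances between the points, which is the sense in which there is no feedback on the results.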

Summary:

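Andrew Ng's course is in English but comes with bilingual subtitles, so it is not hard to follow, and the slides can be downloaded. It should be friendly to beginners like me. Reading it alongside Zhou Zhihua's "watermelon book" is also very helpful. I hope that by the time I finish studying and writing up this series, I will have a fresh understanding of artificial intelligence and be able to build a few small projects. China and the US are bound to compete over the coming years, and the US would like China to give up its high-tech industries, AI included. If anything, that has fired up my enthusiasm for learning AI. It is not what I do in my day job, but learning more never hurts.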