Original text: http://www.deeplearningbook.org/contents/intro.html
Inventors have long dreamed of creating machines that think. Ancient Greek myths tell of intelligent objects, such as animated statues of human beings and tables that arrive full of food and drink when called.
When programmable computers were first conceived, people wondered whether they might become intelligent, over a hundred years before one was built (Lovelace, 1842). Today, artificial intelligence (AI) is a thriving field with many practical applications and active research topics. We look to intelligent software to automate routine labor, understand speech or images, make diagnoses in medicine and support basic scientific research.
In the early days of artificial intelligence, the field rapidly tackled and solved problems that are intellectually difficult for human beings but relatively straightforward for computers: problems that can be described by a list of formal, mathematical rules. The true challenge to artificial intelligence proved to be solving the tasks that are easy for people to perform but hard for people to describe formally, problems that we solve intuitively, that feel automatic, like recognizing spoken words or faces in images.
This book is about a solution to these more intuitive problems. This solution is to allow computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined in terms of its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all of the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers. For this reason, we call this approach to AI deep learning.
Many of the early successes of AI took place in relatively sterile and formal environments and did not require computers to have much knowledge about the world. For example, IBM's Deep Blue chess-playing system defeated world champion Garry Kasparov in 1997 (Hsu, 2002). Chess is of course a very simple world, containing only sixty-four locations and thirty-two pieces that can move in only rigidly circumscribed ways. Devising a successful chess strategy is a tremendous accomplishment, but the challenge is not due to the difficulty of describing the set of chess pieces and allowable moves to the computer. Chess can be completely described by a very brief list of completely formal rules, easily provided ahead of time by the programmer.
Ironically, abstract and formal tasks that are among the most difficult mental undertakings for a human being are among the easiest for a computer. Computers have long been able to defeat even the best human chess player, but are only recently matching some of the abilities of average human beings to recognize objects or speech. A person's everyday life requires an immense amount of knowledge about the world. Much of this knowledge is subjective and intuitive, and therefore difficult to articulate in a formal way. Computers need to capture this same knowledge in order to behave in an intelligent way. One of the key challenges in artificial intelligence is how to get this informal knowledge into a computer.
Several artificial intelligence projects have sought to hard-code knowledge about the world in formal languages. A computer can reason about statements in these formal languages automatically using logical inference rules. This is known as the knowledge base approach to artificial intelligence. None of these projects has led to a major success. One of the most famous such projects is Cyc (Lenat and Guha, 1989). Cyc is an inference engine and a database of statements in a language called CycL. These statements are entered by a staff of human supervisors. It is an unwieldy process. People struggle to devise formal rules with enough complexity to accurately describe the world. For example, Cyc failed to understand a story about a person named Fred shaving in the morning (Linde, 1992). Its inference engine detected an inconsistency in the story: it knew that people do not have electrical parts, but because Fred was holding an electric razor, it believed the entity "FredWhileShaving" contained electrical parts. It therefore asked whether Fred was still a person while he was shaving.
The difficulties faced by systems relying on hard-coded knowledge suggest that AI systems need the ability to acquire their own knowledge, by extracting patterns from raw data. This capability is known as machine learning. The introduction of machine learning allowed computers to tackle problems involving knowledge of the real world and make decisions that appear subjective. A simple machine learning algorithm called logistic regression can determine whether to recommend cesarean delivery (Mor-Yosef et al., 1990). A simple machine learning algorithm called naive Bayes can separate legitimate e-mail from spam e-mail.
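As a rough illustration (not from the book), a naive Bayes spam filter can be prototyped in a few lines with scikit-learn. The tiny corpus, the labels, and the use of scikit-learn itself are assumptions made only for this sketch.

# A minimal naive Bayes spam-filter sketch; the corpus and labels are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting agenda for monday",
          "cheap pills free offer", "please review the attached report"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = legitimate

vectorizer = CountVectorizer()            # bag-of-words representation of each e-mail
X = vectorizer.fit_transform(emails)      # each e-mail becomes a word-count vector
classifier = MultinomialNB().fit(X, labels)

print(classifier.predict(vectorizer.transform(["free prize offer"])))  # likely [1], i.e. spam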
The performance of these simple machine learning algorithms depends heavily on the representation of the data they are given. For example, when logistic regression is used to recommend cesarean delivery, the AI system does not examine the patient directly. Instead, the doctor tells the system several pieces of relevant information, such as the presence or absence of a uterine scar. Each piece of information included in the representation of the patient is known as a feature. Logistic regression learns how each of these features of the patient correlates with various outcomes. However, it cannot influence the way that the features are defined in any way. If logistic regression were given an MRI scan of the patient, rather than the doctor's formalized report, it would not be able to make useful predictions. Individual pixels in an MRI scan have negligible correlation with any complications that might occur during delivery.
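To make the notion of a feature concrete, the sketch below hands logistic regression a tiny feature matrix in which each column is one piece of information supplied by a human. The feature names, values, and labels are hypothetical; the point is only that the algorithm works with whatever representation it is given.

# Hypothetical hand-specified features: [uterine scar present, maternal age in
# decades, number of previous deliveries]. Values and labels are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 3.1, 0],
              [0, 2.6, 2],
              [1, 3.8, 1],
              [0, 2.9, 3]])
y = np.array([1, 0, 1, 0])   # 1 = cesarean recommended, 0 = not

model = LogisticRegression().fit(X, y)
# The model weights each hand-chosen feature; it never sees the patient or raw pixels.
print(model.predict([[1, 3.0, 1]]))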
This dependence on representations is a general phenomenon that appears throughout computer science and even daily life. In computer science, operations such as searching a collection of data can proceed exponentially faster if the collection is structured and indexed intelligently. People can easily perform arithmetic on Arabic numerals, but find arithmetic on Roman numerals much more time-consuming. It is not surprising that the choice of representation has an enormous effect on the performance of machine learning algorithms. For a simple visual example, see Fig. 1.1.
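A small, non-book illustration of how representation affects computation: the same numbers can be stored as an unsorted list, which forces a linear scan, or as a sorted list, which supports binary search. The data sizes here are arbitrary choices for the demonstration.

# Same data, two representations: unsorted list (linear scan) vs. sorted list (binary search).
import bisect
import random

data = random.sample(range(10_000_000), 1_000_000)
target = data[123_456]

found_linear = target in data                      # O(n) scan over the unsorted representation

sorted_data = sorted(data)                         # restructure the representation once
i = bisect.bisect_left(sorted_data, target)        # O(log n) lookup afterwards
found_binary = i < len(sorted_data) and sorted_data[i] == target

print(found_linear, found_binary)                  # True True, but the second lookup is far cheaper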
Many artificial intelligence tasks can be solved by designing the right set of features to extract for that task, then providing these features to a simple machine learning algorithm. For example, a useful feature for speaker identification from sound is an estimate of the size of the speaker's vocal tract. It therefore gives a strong clue as to whether the speaker is a man, woman, or child.
However, for many tasks, it is difficult to know what features should be extracted. For example, suppose that we would like to write a program to detect cars in photographs. We know that cars have wheels, so we might like to use the presence of a wheel as a feature. Unfortunately, it is difficult to describe exactly what a wheel looks like in terms of pixel values. A wheel has a simple geometric shape, but its image may be complicated by shadows falling on the wheel, the sun glaring off the metal parts of the wheel, the fender of the car or an object in the foreground obscuring part of the wheel, and so on.
One solution to this problem is to use machine learning to discover not only the mapping from representation to output but also the representation itself. This approach is known as representation learning. Learned representations often result in much better performance than can be obtained with hand-designed representations. They also allow AI systems to rapidly adapt to new tasks, with minimal human intervention. A representation learning algorithm can discover a good set of features for a simple task in minutes, or for a complex task in hours to months. Manually designing features for a complex task requires a great deal of human time and effort; it can take decades for an entire community of researchers.
The quintessential example of a representation learning algorithm is the autoencoder. An autoencoder is the combination of an encoder function that converts the input data into a different representation, and a decoder function that converts the new representation back into the original format. Autoencoders are trained to preserve as much information as possible when an input is run through the encoder and then the decoder, but are also trained to make the new representation have various nice properties. Different kinds of autoencoders aim to achieve different kinds of properties.
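The following sketch shows the encoder/decoder structure and the reconstruction objective described above. It assumes PyTorch is available and picks arbitrary layer sizes and fake data; it is an illustrative prototype, not the book's implementation.

# Minimal autoencoder sketch (assumes PyTorch; dimensions and data are arbitrary).
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())   # input -> new representation
decoder = nn.Sequential(nn.Linear(32, 784))              # new representation -> original format

optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(64, 784)                  # a fake batch standing in for real inputs
for step in range(100):
    code = encoder(x)                    # the learned representation
    reconstruction = decoder(code)
    loss = loss_fn(reconstruction, x)    # train to preserve as much information as possible
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()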
When designing features or algorithms for learning features, our goal is usually to separate the factors of variation that explain the observed data. In this context, we use the word "factors" simply to refer to separate sources of influence; the factors are usually not combined by multiplication. Such factors are often not quantities that are directly observed. Instead, they may exist either as unobserved objects or unobserved forces in the physical world that affect observable quantities. They may also exist as constructs in the human mind that provide useful simplifying explanations or inferred causes of the observed data. They can be thought of as concepts or abstractions that help us make sense of the rich variability in the data.
When analyzing a speech recording, the factors of variation include the speaker's age, their sex, their accent and the words that they are speaking. When analyzing an image of a car, the factors of variation include the position of the car, its color, and the angle and brightness of the sun.
A major source of difficulty in many real-world artificial intelligence applications is that many of the factors of variation influence every single piece of data we are able to observe. The individual pixels in an image of a red car might be very close to black at night. The shape of the car's silhouette depends on the viewing angle. Most applications require us to disentangle the factors of variation and discard the ones that we do not care about.
Of course, it can be very difficult to extract such high-level, abstract features from raw data. Many of these factors of variation, such as a speaker's accent, can be identified only using sophisticated, nearly human-level understanding of the data. When it is nearly as difficult to obtain a representation as to solve the original problem, representation learning does not, at first glance, seem to help us.
Deep learning solves this central problem in representation learning by introducing representations that are expressed in terms of other, simpler representations. Deep learning allows the computer to build complex concepts out of simpler concepts. Fig. 1.2 shows how a deep learning system can represent the concept of an image of a person by combining simpler concepts, such as corners and contours, which are in turn defined in terms of edges.
The quintessential example of a deep learning model is the feedforward deep network, or multilayer perceptron (MLP). A multilayer perceptron is just a mathematical function mapping some set of input values to output values. The function is formed by composing many simpler functions. We can think of each application of a different mathematical function as providing a new representation of the input.
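As a small illustration (again assuming PyTorch and arbitrary layer widths), an MLP is literally a composition of simple functions, f(x) = f3(f2(f1(x))), and each intermediate activation can be read as a new representation of the input.

# An MLP as a composition of simple functions; PyTorch and the widths are assumptions.
import torch
import torch.nn as nn

f1 = nn.Sequential(nn.Linear(10, 64), nn.ReLU())   # first representation
f2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())   # second representation
f3 = nn.Linear(64, 1)                               # output layer

x = torch.rand(5, 10)
h1 = f1(x)           # each application of a function yields a new representation of x
h2 = f2(h1)
y = f3(h2)
print(h1.shape, h2.shape, y.shape)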
The idea of learning the right representation for the data provides one perspective on deep learning. Another perspective on deep learning is that it allows the computer to learn a multi-step computer program. Each layer of the representation can be thought of as the state of the computer's memory after executing another set of instructions in parallel. Networks with greater depth can execute more instructions in sequence. Being able to execute instructions sequentially offers great power because later instructions can refer back to the results of earlier instructions. According to this view of deep learning, not all of the information in a layer's representation of the input necessarily encodes factors of variation that explain the input. The representation is also used to store state information that helps to execute a program that can make sense of the input. This state information could be analogous to a counter or pointer in a traditional computer program. It has nothing to do with the content of the input specifically, but it helps the model to organize its processing.
There are two main ways of measuring the depth of a model. The first view is based on the number of sequential instructions that must be executed to evaluate the architecture. We can think of this as the length of the longest path through a flow chart that describes how to compute each of the model's outputs given its inputs. Just as two equivalent computer programs will have different lengths depending on which language the program is written in, the same function may be drawn as a flowchart with different depths depending on which functions we allow to be used as individual steps in the flowchart. Fig. 1.3 illustrates how this choice of language can give two different measurements for the same architecture.
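A hedged, non-book example of why depth depends on what counts as a single step: the logistic-regression output sigma(w . x + b) is one node if the logistic function and the dot product are treated as primitives, but several nodes if only elementary arithmetic is allowed. The weights and input below are invented.

# The same function measured at two "depths", depending on the allowed primitive operations.
import math

w, b = [0.5, -1.2, 0.3], 0.1
x = [1.0, 2.0, 3.0]

# Depth 1: logistic(w . x + b) treated as a single primitive step.
def logistic_regression(w, x, b):
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

# Depth 3: the same computation spelled out as elementary operations.
products = [wi * xi for wi, xi in zip(w, x)]   # step 1: multiplications
z = sum(products) + b                          # step 2: additions
y = 1.0 / (1.0 + math.exp(-z))                 # step 3: exponential and division

assert abs(y - logistic_regression(w, x, b)) < 1e-12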
Another approach, used by deep probabilistic models, regards the depth of a model as being not the depth of the computational graph but the depth of the graph describing how concepts are related to each other. In this case, the depth of the flowchart of the computations needed to compute the representation of each concept may be much deeper than the graph of the concepts themselves. This is because the system's understanding of the simpler concepts can be refined given information about the more complex concepts. For example, an AI system observing an image of a face with one eye in shadow may initially only see one eye. After detecting that a face is present, it can then infer that a second eye is probably present as well. In this case, the graph of concepts only includes two layers, a layer for eyes and a layer for faces, but the graph of computations includes 2n layers if we refine our estimate of each concept given the other n times.
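A toy sketch of that counting argument, with entirely invented update rules rather than a real model: the concept graph has only two levels, eyes and face, yet n rounds of alternating mutual refinement unroll into 2n computation steps.

# Toy mutual refinement between an "eye" belief and a "face" belief.
# The update rules are invented; only the 2n-step counting is the point.
eye_belief, face_belief = 0.4, 0.0   # one eye partly in shadow, no face detected yet
n = 3
steps = 0
for _ in range(n):
    face_belief = 0.9 * eye_belief                    # step: a face is inferred from eye evidence
    steps += 1
    eye_belief = max(eye_belief, 0.8 * face_belief)   # step: a face suggests a second eye
    steps += 1

print(steps)   # 2 * n computation steps, even though the concept graph has only 2 levels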
Because it is not always clear which of these two views (the depth of the computational graph, or the depth of the probabilistic modeling graph) is most relevant, and because different people choose different sets of smallest elements from which to construct their graphs, there is no single correct value for the depth of an architecture, just as there is no single correct value for the length of a computer program. Nor is there a consensus about how much depth a model requires to qualify as "deep." However, deep learning can safely be regarded as the study of models that involve a greater amount of composition of either learned functions or learned concepts than traditional machine learning does.
To summarize, deep learning, the subject of this book, is an approach to AI. Specifically, it is a type of machine learning, a technique that allows computer systems to improve with experience and data. According to the authors of this book, machine learning is the only viable approach to building AI systems that can operate in complicated, real-world environments. Deep learning is a particular kind of machine learning that achieves great power and flexibility by learning to represent the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts, and more abstract representations computed in terms of less abstract ones. Fig. 1.4 illustrates the relationship between these different AI disciplines. Fig. 1.5 gives a high-level schematic of how each works.
Summary: People have long dreamed of artificial intelligence. Machines can quickly solve problems that are hard for humans (for example, multiplying numbers larger than 1000, because such problems can be formalized as formulas), yet they fail at problems that are obvious to people, such as recognizing a face. Deep learning is what gives machines this ability: it is a layered, concept-based graph model that builds from shallow to deep and learns from data and experience.
Early AI could only solve problems in narrow, fixed environments, such as chess, whose rules are fixed and whose variations are limited. One such project was Cyc, an inference system whose statements are entered by human supervisors and combined using logical inference rules; systems of this kind are too limited and often make mistakes. Machine learning methods such as logistic regression and naive Bayes came next; they can solve simple classification and regression problems such as spam filtering and house-price prediction. As problems grow more complex, however, classification and regression performance depends on the data representation, so for a time researchers concentrated on carefully hand-designed representations. Designing a representation for each different task is complicated and time-consuming, so representation learning arose naturally: it lets the machine learn suitable features by itself, exposing the factors of variation that distinguish different samples, with the autoencoder as the classic example. Of course, learning such feature representations is itself difficult, sometimes nearly as difficult as the original problem, and that is where deep learning comes in.
The quintessential deep learning model is the feedforward deep network, that is, the multilayer perceptron (MLP). One view holds that deep learning is again learning a new representation of the data. A more engineering-minded view treats a deep model as a multi-step computer program in which each layer is a state of the computer's memory; under this view, not every layer is a representation of the data, and some layers only store state information. Opinions also differ on how deep "deep" must be: some count the layers of the computational graph, others the layers of the concept graph. Either way, deep learning is an approach to artificial intelligence; specifically, it is a kind of machine learning, a technique that allows computer systems to improve with experience and data.