Introducing Recurrent Neural Networks

Artificial intelligence (AI) is bridging the gap between technology and humans by allowing machines to learn automatically from data and become more ‘human-like’, and thus more ‘intelligent’. Here, intelligence can be considered the ability to process information in a way that informs future decisions. This is powerful because humans can spontaneously put information together: recognizing old patterns, developing new connections, and seeing what they have learnt in a new light to develop new and effective processes. When combined with a machine’s computational power, tremendous results can be achieved.

The combination of automatic learning and computational efficiency is best described by deep learning. This is a subset of AI and machine learning (ML) in which algorithms learn to detect patterns in data and develop a target function which best maps an input variable, x, to a target variable, y. The goal is to automatically extract the most useful pieces of information needed to inform future decisions. Deep learning models are very powerful and can be used to tackle a wide variety of problems; from predicting the likelihood that a student will pass a course, to recognizing an individual’s face to unlock their iPhone using Face ID.

[Image by Author]

Deep learning models are built on the idea of ‘neural networks’, and this is what allows the models to learn from raw data. Simply put, a deep neural network is created by stacking perceptrons, where a perceptron is a single neuron. Information is propagated forward through this system via a set of inputs, x, where each input has a corresponding weight, w. The input should also include a ‘bias term’ which is independent of x; the bias term shifts the function being used as appropriate for the problem at hand. Each corresponding input and weight are multiplied, and the sum of the products is calculated. The sum then passes through a non-linear activation function, and an output, y, is generated.
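The forward pass just described can be sketched in a few lines of NumPy. This is a minimal illustration, not a full implementation: the sigmoid activation and the specific input, weight, and bias values are assumptions chosen for the example.

```python
import numpy as np

def perceptron_forward(x, w, b):
    """Forward pass of a single perceptron: multiply each input by its
    weight, add the bias term, and pass the sum through a non-linear
    activation (sigmoid here)."""
    z = np.dot(w, x) + b             # sum of input-weight products, shifted by the bias
    return 1.0 / (1.0 + np.exp(-z))  # non-linear activation squashes the sum

# Example: two inputs with their corresponding weights and a bias term
x = np.array([0.5, -1.0])
w = np.array([0.8, 0.2])
b = 0.1
y = perceptron_forward(x, w, b)      # z = 0.3, so y = sigmoid(0.3)
```

Stacking many of these neurons into layers, with each layer's outputs feeding the next layer's inputs, gives the deep network described above.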

However, this ‘feed-forward’ type of model is not always applicable, and its fundamental architecture makes it difficult to apply to certain scenarios. For example, consider a model that is designed to predict where a flying object will go next, given a snapshot of that flying object. This is a sequential problem because the object covers some distance over time, and its current position depends on where it was previously. If no information about the object’s previous position is given, then predicting where the object will go next is no better than a random guess.

[Image by Author]

Let us consider another simple, yet important problem: predicting the next word. Models which do this are common now as they are used in applications such as autofill and autocorrect, and they are often taken for granted. This is a sequential task since the most appropriate ‘next word’ depends on the words which came before it. A feed-forward network would not be appropriate for this task because it would require a sentence with a particular length as an input to then predict the next word. However, this is an issue because we cannot guarantee an input of the same length each time, and the model’s performance would then be negatively affected.

A potential way to combat this issue is to look only at a subsection of the input sentence, such as the last two words. This addresses the issue of variable-length inputs because, regardless of the total input length, the model will only use the last two words of the sentence to predict the next word. But this is still not ideal because the model now cannot account for long-term dependencies. For example, consider the sentence “I grew up in Berlin and only moved to New York a year ago. I can speak fluent …”. Considering only the last two words, every language would be equally likely; but when the entire sentence is considered, German would be most likely.
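The failure mode is easy to demonstrate: two sentences whose long-range context differs completely still produce the exact same input for a fixed two-word window. The sentences below are illustrative, not a real model.

```python
# Two sentences with different long-range context ("Berlin" vs "Paris")
# but identical final two words.
s1 = "I grew up in Berlin and moved to New York a year ago . I can speak fluent"
s2 = "I grew up in Paris and moved to New York a year ago . I can speak fluent"

ctx1 = s1.split()[-2:]   # the only context a 2-word-window model sees
ctx2 = s2.split()[-2:]

# The fixed-window model receives the same input for both sentences,
# so it cannot prefer 'German' for s1 or 'French' for s2.
assert ctx1 == ctx2 == ["speak", "fluent"]
```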

[Image by Author]

The best way to overcome these issues is to have an entirely new network structure: one that can update information over time. This is a Recurrent Neural Network (RNN). It is similar to a perceptron in that information is fed forward through the system by a set of inputs, x, where each input has a weight, w. Each corresponding input and weight are multiplied, and the sum of the products is calculated. The sum then passes through a non-linear activation function, and an output, y, is generated.

The difference is that, in addition to the output, the network is also generating an internal state update, u. This update is then used when analyzing the next set of input information and provides a different output that is also dependent on the previous information. This is ideal because information persists throughout the network over time. As the name suggests, this update function is essentially a recurrence relation that happens at every step of the sequential process, where u is a function of the previous u and the current input, x.
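A single step of this recurrence can be sketched as follows. The weight names (W_x, W_u, W_y), the tanh activation, and the layer sizes are assumptions for illustration; the essential point is that the new state u depends on both the previous state and the current input.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_state, n_out = 3, 4, 2

# Small random parameters (illustrative; a real model would learn these)
W_x = 0.1 * rng.normal(size=(n_state, n_in))     # input-to-state weights
W_u = 0.1 * rng.normal(size=(n_state, n_state))  # state-to-state (recurrent) weights
W_y = 0.1 * rng.normal(size=(n_out, n_state))    # state-to-output weights
b = np.zeros(n_state)

def rnn_step(u_prev, x):
    """One step of the recurrence u = f(u_prev, x): the internal state
    update mixes the previous state with the current input, and the
    output is read from the new state."""
    u = np.tanh(W_u @ u_prev + W_x @ x + b)  # internal state update
    y = W_y @ u                              # output at this step
    return u, y

u0 = np.zeros(n_state)        # initial state before any input is seen
x_t = np.ones(n_in)
u1, y1 = rnn_step(u0, x_t)    # state now carries information about x_t
```

Feeding the returned state back in as `u_prev` on the next call is what lets information persist across the sequence.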

[Image by Author]

The concept of looping through the RNN’s system over time might be a bit abstract and difficult to grasp. Another way to think of an RNN is to unfold the system over time. That is, think of the RNN as a set of individual feed-forward models, where each model is linked to the next by the internal state update. Viewing the RNN like this provides real insight into why this structure is suitable for sequential tasks. At each step of the sequence, there is an input, some process being performed on that input, and a related output. The step before must have some influence on the next step of the sequence: it does not change that step’s input, but it does affect the related output.
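Unfolding can be written directly as a loop: one feed-forward step per sequence element, linked by the state. As before, the parameter names, sizes, and tanh activation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_state, n_out = 3, 4, 2
W_x = 0.1 * rng.normal(size=(n_state, n_in))     # input-to-state weights
W_u = 0.1 * rng.normal(size=(n_state, n_state))  # recurrent weights
W_y = 0.1 * rng.normal(size=(n_out, n_state))    # state-to-output weights

def rnn_unrolled(xs):
    """Run the RNN as a chain of feed-forward steps over the sequence xs.
    Each 'copy' of the network shares the same weights; only the state
    u is passed from one copy to the next."""
    u = np.zeros(n_state)               # initial internal state
    ys = []
    for x in xs:                        # one unrolled step per sequence element
        u = np.tanh(W_u @ u + W_x @ x)  # state update links the copies
        ys.append(W_y @ u)              # output depends on all inputs so far
    return ys

xs = [rng.normal(size=n_in) for _ in range(5)]   # a length-5 input sequence
ys = rnn_unrolled(xs)                            # one output per step
```

Because the same weights are reused at every step, the unrolled view and the looping view describe the same model.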

If we go back to either the flying object scenario or the word prediction scenario, and we consider them using the unfolded RNN, we can understand the solutions better. At each previous position of the flying object, we can predict a possible path. The predicted path updates as the model receives more information about where the object was previously, and this information feeds into the future steps of the model. Similarly, as each new word from the sentence scenario is fed into the model, a new set of likely next words is generated.

[Image by Author]

Neural networks are an essential part of AI and ML as they allow models to automatically learn from data, and they combine a version of human learning with great computational ability. However, applying a non-sequential structure to a sequential task will result in poor model performance, and the true power of neural networks would not be harnessed. RNNs are artificial learning systems which internally update themselves based on previous information, in order to predict the most accurate results over time.

dspace.mit.edu/bitstream/handle/1721.1/113146/1018306404-MIT.pdf?sequence=1

stanford.edu/~shervine/teaching/cs-230/cheatsheet-recurrent-neural-networks

wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/

karpathy.github.io/2015/05/21/rnn-effectiveness/

Other Useful Material:

deeplearning.mit.edu/

neuralnetworksanddeeplearning.com/

towardsdatascience.com/what-is-deep-learning-adf5d4de9afc

towardsdatascience.com/the-mathematics-behind-deep-learning-f6c35a0fe077

Translated from: https://towardsdatascience.com/introducing-recurrent-neural-networks-f359653d7020
