What Is Deep Recognition?
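In practical applications such as object classification (where the objects may be documents, images, audio and so on), one problem we cannot avoid is how to represent an object as data. The data in question are not the raw pixels or characters; they carry a higher-level meaning than the raw input, and they usually refer to the features of the object. For example, documents and web pages are commonly represented as sets of words, which are mapped into a vector space of terms and phrases (the vector space model, VSM); only then can a suitable classifier be designed with one learning method or another to classify the target objects. Likewise, in image processing the set of pixel intensities is the most basic representation of an image, and it is the image in the visual sense, yet for various reasons higher-level semantic features have been proposed: geometric features typified by SIFT, texture features typified by LBP, and statistical features typified by eigenfaces. Features such as SIFT have shown clear advantages in many image-processing applications, so the quality of the chosen features has a profound effect on practical results.

Choosing which features to use to represent an object is therefore critical to solving a practical problem. Selecting features by hand, however, is expensive in both time and labor, and so-called heuristic algorithms often give unstable results whose quality depends on experience and luck. It is natural, then, to let automatic learning take over the task of feature extraction. Deep Learning arose from this task. It is also referred to as Unsupervised Feature Learning, and as the name suggests, the features are selected without human involvement.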
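The vector space model mentioned above can be illustrated with a minimal bag-of-words sketch. The corpus, the function names `build_vocabulary` and `to_term_vector`, and the use of raw word counts are assumptions made for illustration; practical systems usually add weighting schemes such as TF-IDF.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect the distinct words of all documents, in sorted order."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def to_term_vector(document, vocabulary):
    """Map one document to a vector of raw word counts over the vocabulary."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]


# Toy corpus (hypothetical, for illustration only).
docs = ["deep learning learns features", "hand crafted features"]
vocab = build_vocabulary(docs)
vectors = [to_term_vector(doc, vocab) for doc in docs]
```

Each document becomes a fixed-length vector with one dimension per vocabulary word, which is the form a classifier can consume.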
Origins
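The concept of deep learning was put forward around 2006 by Geoffrey Hinton and colleagues in the Science article "Reducing the dimensionality of data with neural networks" [26]. It uses neural networks (NN) to imitate the learning process of the human brain, borrowing the brain's multi-layer abstraction mechanism to build abstract representations of real-world objects or data (images, speech, text and so on). Feature extraction and the classifier are integrated into a single learning framework, and human intervention in the feature-extraction process is kept to a minimum.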
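A deep learning network is built from a large number of simple neurons. The neurons in each layer receive the outputs of the neurons in the layer below and, through a nonlinear relationship between input and output, combine low-level features into higher-level abstract representations, discovering distributed feature representations of the observed data. Bottom-up learning produces multiple layers of abstract representation, and these multi-level features are learned automatically, without manual intervention. Using the learned network structure, the system maps an input sample to features at the various levels and applies a classifier or a matching algorithm to the top-level output units to perform classification or recognition.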
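The layer-by-layer mapping described above can be sketched as a forward pass through a small fully connected network. This is only an illustration under stated assumptions: the layer sizes, the sigmoid nonlinearity, and the randomly initialized (untrained) weights are hypothetical, and taking the largest top-layer activation stands in for the classifier or matching step.

```python
import math
import random


def sigmoid(x):
    """Logistic nonlinearity applied by each unit."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: each unit applies the nonlinearity to a weighted sum of the layer below."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases; a trained network would learn these values."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(sample, layers):
    """Return the activations of every layer, from the lowest to the highest."""
    activations = []
    current = sample
    for weights, biases in layers:
        current = layer_forward(current, weights, biases)
        activations.append(current)
    return activations


rng = random.Random(0)
layer_sizes = [8, 5, 3, 2]  # hypothetical: 8 inputs, two hidden layers, 2 output units
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
sample = [rng.random() for _ in range(layer_sizes[0])]

features = forward(sample, layers)  # one feature vector per layer
top = features[-1]
predicted_class = max(range(len(top)), key=top.__getitem__)
```

Here `features` holds one representation per layer, so the same input is available at every level of abstraction, and only the top-level units are used for the final decision.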
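Research shows that the mammalian cerebral cortex processes incoming information hierarchically. After information enters through the sensory organs it passes through multiple layers of neurons. At each layer the neurons extract the features that capture the essence of the object and pass them on to the next layer; the subsequent layers process and relay the information in the same way until it finally reaches the brain [28]. Deep artificial neural networks were inspired to a large extent by this finding: the aim is to build a neural network with multiple layers of nodes in which information is processed and abstracted layer by layer. In other words, the mammalian brain is organized in a deep fashion [29], and each layer of such a deep structure processes the input or represents it abstractly at a different level. This is also why hierarchical methods are often used in practice to represent abstract semantic concepts. Like other mammals, humans process information by transmitting and representing it layer by layer. The primary visual system of the human brain first uses certain neurons to detect object boundaries and primitive shapes, and then uses other groups of neurons to build progressively more complex visual shapes [29]. The human brain recognizes an object as follows: the object in the external world is first projected onto the retina, the visual cortex then decomposes the projection gathered on the retina, and finally the decomposed information is used to recognize the object.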
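The function of the visual cortex is therefore not limited to reproducing the retinal image; it extracts and computes perceptual signals [30]. The volume of data entering the visual system is reduced in dimensionality as it passes through the hierarchy of the human perceptual system, and information unrelated to the identity of the object is discarded. For rich data with complex latent structure (such as images, video and speech), deep learning should likewise be able to capture the essential features of an object accurately, as the human visual system does. The idea behind deep learning is to borrow the brain's hierarchical organization: through bottom-up feature learning that abstracts layer by layer from simple to advanced, researchers hope that deep network structures can solve difficult pattern-recognition problems by imitating the brain. A deep artificial neural network is thus an artificially defined multi-layer neural network modeled on the organization of the human brain.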
References
[1] BENGIO Y. Learning deep architectures for AI [J]. Foundations and Trends in Machine Learning, 2009, 2(1): 1-127.
[2] BENGIO Y, DELALLEAU O. On the expressive power of deep architectures [C]. Algorithmic Learning Theory, Berlin Heidelberg, 2011: 18-36.
[3] BENGIO Y, LECUN Y. Scaling learning algorithms towards AI [J]. Large-Scale Kernel Machines, 2007, 1-34.
[4] PARKE F I. Computer generated animation of faces [C]. Proceedings of the ACM Annual Conference, Boston, 1972(1): 451-457.
[5] YANG Jian. Research on the theory and algorithms of linear projection analysis and their application to feature extraction [D]. Nanjing: Nanjing University of Science and Technology, 2002. (in Chinese)
[6] CHAN H, BLEDSOE W. A man-machine facial recognition system: some preliminary results [R]. Tech. Rep., Panoramic Research Inc., Palo Alto, 1965.
[7] KC G S, KARGER P A. Cryptology ePrint Archive [R]. Report 2005.
[8] HEITMEYER R. Biometric identification promises fast and secure processing of airline passengers [J]. ICAO Journal, 2000, 55(9): 10-11.
[9] SUN Dongmei, QIU Zhengding. A survey of biometric recognition technology [J]. Acta Electronica Sinica, 2001, 29(12): 1744-1748. (in Chinese)
[10] HUANG T, XIONG Z, ZHANG Z. Face recognition applications [M]. Handbook of Face Recognition. London: Springer, 2011.
[11] ZHAO W, CHELLAPPA R, PHILLIPS P J, et al. Face recognition: A literature survey [J]. ACM Computing Surveys (CSUR), 2003, 35(4): 399-458.
[12] PRINCE S J, WARRELL J, ELDER J, et al. Tied factor analysis for face recognition across large pose differences [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 30(6): 970-984.
[13] BLANZ V, VETTER T. Face recognition based on fitting a 3D morphable model [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2003, 25(9): 1063-1074.
[14] GROSS R, MATTHEWS I, BAKER S. Appearance-based face recognition and light-fields [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(4): 449-465.
[15] GEORGHIADES A S, BELHUMEUR P N, KRIEGMAN D J. From few to many: Illumination cone models for face recognition under variable lighting and pose [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6): 643-660.
[16] LAWRENCE S, GILES C L, TSOI A C, et al. Face recognition: A convolutional neural-network approach [J]. IEEE Transactions on Neural Networks, 1997, 8(1): 98-113.
Master's thesis, Dalian University of Technology
[64] JIA K, GONG S. Multi-modal tensor face for simultaneous super-resolution and recognition [C]. ICCV, Beijing, China, 2005: 1683-1690.
[65] JIA K, GONG S. Multi-modal face image super-resolutions in tensor space [C]. Advanced Video and Signal Based Surveillance, Como, Italy, 2005: 121-128.
[66] JIA K, GONG S. Generalized face super-resolution [J]. IEEE Transactions on Image Processing, 2008, 17(6): 873-886.
[67] MA X, HUANG H, WANG S, et al. A simple approach to multiview face hallucination [J]. Signal Processing Letters, 2010, 17(6): 579-582.
[68] MA X, ZHANG J, QI C. Hallucinating face by position-patch [J]. Pattern Recognition, 2010, 43(6): 2224-2236.
[69] LIN F, COOK J, CHANDRAN V, et al. Face recognition from super-resolved images [C]. ISSPA, Nanjing, China, 2005: 2217-2229.
[70] AL-AZZEH M, ELEYAN A, DEMIREL H. PCA-based face recognition from video using super resolution [C]. ICSIS, Istanbul, 2008: 1-4.
[71] ZHOU S, KRUEGER V, CHELLAPPA R. Probabilistic recognition of human faces from video [J]. Computer Vision and Image Understanding, 2003, 91(1): 214-245.
[72] HUANG H, HE H. Super-resolution method for face recognition using nonlinear mappings on coherent features [J]. IEEE Transactions on Neural Networks, 2010, 99: 1-10.
[73] HUANG H, ZENG X. Super-resolution method for multi-view face recognition from a single image per person using nonlinear mappings on coherent features [J]. Signal Processing Letters, 2012, 19(4): 195-198.
Research on Face Recognition Based on Deep Learning
[49] BAKER S, KANADE T. Hallucinating faces [C]. Automatic Face and Gesture Recognition, Boston, 2000: 83-88.
[50] LIN F, FOOKES C, CHANDRAN V, et al. Super-resolved faces for improved face recognition from surveillance video [M]//Advances in Biometrics. Springer, 2007: 1-10.
[51] WANG X, TANG X. Hallucinating face by eigen-transformation [J]. IEEE Transactions on Systems, Man, and Cybernetics, 2005, 35(3): 425-434.
[52] WHEELER F W, LIU X, TU P H. Multi-frame super-resolution for face recognition [C]. IEEE International Conference on Biometrics: Theory, Applications and Systems, Washington, 2007: 1-6.
[53] SINHA P, BALAS B, OSTROVSKY Y, et al. Face recognition by humans: Nineteen results all computer vision researchers should know about [J]. Proceedings of the IEEE, 2006, 94(11): 1948-1962.
[54] ZHANG H, ZHANG B, HUANG W, et al. Gabor wavelet associative memory for face recognition [J]. IEEE Transactions on Neural Networks, 2005, 16(1): 275-278.
[55] SEZER O G, ALTUNBASAK Y, ERCIL A. Face recognition with independent component-based super-resolution [C]. Electronic Imaging in International Society for Optics and Photonics, San Diego, 2006: 607705-607715.
[56] GUNTURK B K, BATUR A U, ALTUNBASAK Y, et al. Eigenface-domain super-resolution for face recognition [J]. IEEE Transactions on Image Processing, 2003, 12(5): 597-606.
[57] LEE S-W, PARK J, LEE S-W. Low resolution face recognition based on support vector data description [J]. Pattern Recognition, 2006, 39(9): 1809-1812.
[58] ARANDJELOVIC O, CIPOLLA R. A manifold approach to face recognition from low quality video across illumination and pose using implicit super-resolution [C]. ICCV, Rio de Janeiro, Brazil, 2007, 1.
[59] BURTON A M, WILSON S, COWAN M, et al. Face recognition in poor-quality video: Evidence from security surveillance [J]. Psychological Science, 1999, 10(3): 243-248.
[60] ZHUANG Y, ZHANG J, WU F. Hallucinating faces: LPH super-resolution and neighbor reconstruction for residue compensation [J]. Pattern Recognition, 2007, 40(11): 3178-3194.
[61] LI B, CHANG H, SHAN S, et al. Low-resolution face recognition via coupled locality preserving mappings [J]. Signal Processing Letters, 2010, 17(1): 20-23.
[62] HENNINGS-YEOMANS P H, BAKER S, KUMAR B V. Simultaneous super-resolution and feature extraction for recognition of low-resolution faces [C]. Computer Vision and Pattern Recognition, Anchorage, Alaska, 2008: 1467-1475.
[63] LI Y, LIN X. Face hallucination with pose variation [C]. Proceedings of the Automatic Face and Gesture Recognition, Seoul, Korea, 2004: 723-728.
[32] LE ROUX N, BENGIO Y. Representational power of restricted Boltzmann machines and deep belief networks [J]. Neural Computation, 2008, 20(6): 1631-1649.
[33] WELLING M, ROSEN-ZVI M, HINTON G. Exponential family harmoniums with an application to information retrieval [J]. Advances in Neural Information Processing Systems, 2005, 17: 1481-1488.
[34] HINTON G. A practical guide to training restricted Boltzmann machines [R]. Report of Momentum, 2010, 9(1): 1-20.
[35] LIU J S. Monte Carlo strategies in scientific computing [M]. Berlin Heidelberg: Springer Verlag, 2008.
[36] HINTON G E. Training products of experts by minimizing contrastive divergence [J]. Neural Computation, 2002, 14(8): 1771-1800.
[37] HINTON G E. Distributed representations [R]. Tech. Report, University of Toronto, 1984.
[38] BENGIO Y, LAMBLIN P, POPOVICI D, et al. Greedy layer-wise training of deep networks [J]. Advances in Neural Information Processing Systems, 2007, 19: 153.
[39] SALAKHUTDINOV R, HINTON G. Learning a nonlinear embedding by preserving class neighborhood structure [J]. International Journal of Computer Mathematics, 2007, 84(7): 1265-1276.
[40] MANJUNATH B, CHELLAPPA R, VON DER MALSBURG C. A feature based approach to face recognition [C]. Computer Vision and Pattern Recognition, Champaign, 1992: 663-671.
[41] LADES M, VORBRUGGEN J C, BUHMANN J, et al. Distortion invariant object recognition in the dynamic link architecture [J]. IEEE Transactions on Computers, 1993, 42(3): 300-311.
[42] BEYMER D, POGGIO T. Face recognition from one example view [C]. Computer Vision, Massachusetts, 1995: 500-507.
[43] VETTER T, POGGIO T. Image synthesis from a single example image [C]. ECCV, Cambridge, England, 1996: 652-659.
[44] VETTER T. Synthesis of novel views from a single face image [J]. International Journal of Computer Vision, 1998, 28(2): 103-116.
[45] CHAI X, SHAN S, CHEN X, et al. Locally linear regression for pose-invariant face recognition [J]. IEEE Transactions on Image Processing, 2007, 16(7): 1716-1725.
[46] ROHBAN M H, RABIEE H R, VAHDAT A. Face virtual pose generation using aligned locally linear regression for face recognition [C]. IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 2009: 4121-4124.
[47] VAN OUWERKERK J. Image super-resolution survey [J]. Image and Vision Computing, 2006, 24(10): 1039-1052.
[48] PARK S C, PARK M K, KANG M G. Super-resolution image reconstruction: a technical overview [J]. Signal Processing Magazine, 2003, 20(3): 21-36.
[17] ZHANG X, GAO Y, LEUNG M. Recognizing rotated faces from frontal and side views: An approach toward effective use of mugshot databases [J]. IEEE Transactions on Information Forensics and Security, 2008, 3(4): 684-697.
[18] TAN X, CHEN S, ZHOU Z H, et al. Face recognition from a single image per person: A survey [J]. Pattern Recognition, 2006, 39(9): 1725-1745.
[19] PENTLAND A, MOGHADDAM B, STARNER T. View-based and modular eigenspaces for face recognition [C]. Computer Vision and Pattern Recognition, Seattle, WA, USA, 1994: 84-91.
[20] TURK M A, PENTLAND A P. Face recognition using eigenfaces [C]. Computer Vision and Pattern Recognition, San Diego, 1991: 22-28.
[21] ZHAO W, KRISHNASWAMY A, CHELLAPPA R, et al. Discriminant analysis of principal components for face recognition [M]. Face Recognition. Berlin Heidelberg: Springer, 1998.
[22] GAO Y, LEUNG M K H. Face recognition using line edge map [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(6): 764-779.
[23] WISKOTT L, FELLOUS J-M, KRUGER N, et al. Face recognition by elastic bunch graph matching [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7): 775-779.
[24] SHIN H-C, PARK J H, KIM S-D. Combination of warping robust elastic graph matching and kernel-based projection discriminant analysis for face recognition [J]. IEEE Transactions on Multimedia, 2007, 9(6): 1125-1136.
[25] OJALA T, PIETIKAINEN M, MAENPAA T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971-987.
[26] HINTON G E, SALAKHUTDINOV R R. Reducing the dimensionality of data with neural networks [J]. Science, 2006, 313(5786): 504.
[27] LEE T S, MUMFORD D. Hierarchical Bayesian inference in the visual cortex [J]. JOSA A, 2003, 20(7): 1434-1448.
[28] LEE T S, MUMFORD D, ROMERO R, et al. The role of the primary visual cortex in higher level vision [J]. Vision Research, 1998, 38(15): 2429-2454.
[29] SERRE T, KREIMAN G, KOUH M, et al. A quantitative theory of immediate visual recognition [J]. Progress in Brain Research, 2007, 165: 33-56.
[30] ROSSI A F, DESIMONE R, UNGERLEIDER L G. Contextual modulation in primary visual cortex of macaques [J]. The Journal of Neuroscience, 2001, 21(5): 1698-1709.
[31] BRAVERMAN M. Poly-logarithmic independence fools bounded-depth boolean circuits [J]. Communications of the ACM, 2011, 54(4): 108-115.