How to read video frames in Hadoop?

To process specialized file formats (such as video) in Hadoop, you'd have to write a custom InputFormat and RecordReader that understand how to turn a video file into splits (the InputFormat) and then read the splits into values (the RecordReader). This is a non-trivial task and requires some intermediate knowledge of how Hadoop handles the splitting of data. I highly recommend Tom White's Hadoop: The Definitive Guide (O'Reilly) as well as the videos on http://www.cloudera.com . (Full disclosure: I work for Cloudera.)

Keep in mind that video formats are generally compressed, which makes things even more complicated, because InputSplits (created by an InputFormat) are (normally) simple byte offsets into the file. Start with http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/InputFormat.html
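To make the "simple byte offsets" point concrete, here is a minimal plain-Java sketch (no Hadoop dependencies; the 300 MB file length and 128 MB split size are made-up numbers) of how FileInputFormat-style splitting reduces a file to byte ranges with no awareness of frame boundaries:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // One (offset, length) pair -- a simplified stand-in for a FileSplit.
    record Split(long offset, long length) {}

    // Chop a file of fileLen bytes into splits of at most splitSize bytes.
    // These are pure byte ranges: frame boundaries are ignored entirely.
    static List<Split> computeSplits(long fileLen, long splitSize) {
        List<Split> splits = new ArrayList<>();
        for (long off = 0; off < fileLen; off += splitSize) {
            splits.add(new Split(off, Math.min(splitSize, fileLen - off)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // A hypothetical 300 MB video file with a 128 MB split size.
        long mb = 1024L * 1024L;
        for (Split s : computeSplits(300 * mb, 128 * mb)) {
            System.out.println("offset=" + s.offset() + " length=" + s.length());
        }
    }
}
```

Note that the last split is simply whatever bytes remain; nothing guarantees any split starts or ends on a frame boundary, which is exactly the problem a custom RecordReader has to solve.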

To summarize: the InputFormat knows how to generate a list of InputSplit objects that are (usually) between 64MB and 128MB and do NOT respect the notion of frames. The RecordReader is then used to read frames out of an InputSplit to create the value objects that the MapReduce job can process. If you want to generate video output, you'll also need to write a custom OutputFormat.
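The RecordReader half can be sketched as follows. This is a simplified, self-contained illustration, not a real Hadoop RecordReader, and it assumes fixed-size uncompressed frames (`frameSize` is hypothetical; real compressed video has variable-size frames, which is precisely what makes this hard in practice). It follows the usual Hadoop record-reading convention, as in LineRecordReader: each reader skips the partial record at the start of its split and reads past the split's end to finish the last record that starts inside it, so every frame is read exactly once across all splits:

```java
import java.util.ArrayList;
import java.util.List;

public class FrameReaderSketch {
    // Read whole frames out of one byte-range split.
    // Convention: a frame belongs to this split if it STARTS before the
    // split ends; the partial frame at the split start belongs to the
    // previous split's reader.
    static List<byte[]> readFrames(byte[] file, long splitOff, long splitLen, int frameSize) {
        List<byte[]> frames = new ArrayList<>();
        // First frame boundary at or after the split start.
        long pos = ((splitOff + frameSize - 1) / frameSize) * frameSize;
        long end = splitOff + splitLen;
        while (pos < end && pos + frameSize <= file.length) {
            byte[] frame = new byte[frameSize];
            System.arraycopy(file, (int) pos, frame, 0, frameSize);
            frames.add(frame);
            pos += frameSize;
        }
        return frames;
    }

    public static void main(String[] args) {
        byte[] file = new byte[100];               // a toy "video": 25 frames of 4 bytes
        for (int i = 0; i < 100; i++) file[i] = (byte) i;
        // A split covering bytes [10, 50): frames starting at 12, 16, ..., 48.
        System.out.println(readFrames(file, 10, 40, 4).size()); // prints 10
    }
}
```

A real implementation would subclass org.apache.hadoop.mapreduce.RecordReader and pull bytes from the distributed filesystem rather than an in-memory array, but the boundary-handling logic is the essential part.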

Hope this helps.
