A roundup of machine vision libraries

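VLFeat: arguably the best open-source SIFT implementation currently available. It also includes KD-tree, KD-forest, and BoW implementations.
VLFeat: well known and widely used
Project site: http://www.vlfeat.org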

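A well-known open-source computer vision and image processing project whose reputation is probably not far behind OpenCV's. It won first prize in the ACM Open Source Software Competition 2010. It is written in C and provides C and Matlab interfaces. It implements a large number of computer vision algorithms, including:
- Common image processing functions, including color space conversions and geometric transformations (as a supplement to Matlab); common machine learning algorithms, including GMM, SVM, and KMeans; and common plotting tools for image processing.
- Feature extraction, including covariant detectors, HOG, SIFT, and MSER. VLFeat provides the vl_covdet() function as a framework that unifies the so-called "co-variant feature detectors", including DoG, Harris-Affine, and Harris-Laplace, and it can extract SIFT or raw-patch descriptors.
- Superpixel segmentation, including the commonly used Quick shift and SLIC algorithms.
- Advanced clustering algorithms, such as integer k-means (IKM), the hierarchical version of integer k-means (HIKM), and the Agglomerative Information Bottleneck (AIB) algorithm, which determines the number of clusters automatically based on mutual information.
- High-dimensional feature matching with randomized kd-trees.
The full list of VLFeat functions is available here: http://www.vlfeat.org/matlab/matlab.html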

Ferns: keypoint recognition based on a Naive Bayesian bundle. Fast, but memory-hungry.

SIFT By Rob Hess: a SIFT implementation built on OpenCV.

Object Detection

AdaBoost By JianXin.Wu: yet another AdaBoost implementation. Training is fast.

Pedestrian detection By JianXin.Wu: fast pedestrian detection based on CENTRIST and a linear SVM.

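(Approximate) Nearest Neighbor / ANN

FLANN: currently the most complete open-source (approximate) nearest-neighbor library. It implements a range of search algorithms and also includes a mechanism for automatically selecting the fastest one.

ANN: another approximate nearest-neighbor library.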
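
To make the idea behind these libraries concrete, here is a minimal pure-Python sketch of a single kd-tree with exact nearest-neighbor search. It is not the FLANN or ANN API: the names build_kdtree and nearest are hypothetical, and real libraries add approximation (for example several randomized trees with a bounded number of leaf visits) to trade accuracy for speed.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a kd-tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])              # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                     # median split keeps the tree balanced
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, query, depth=0, best=None):
    """Return (distance, point) for the stored point closest to query."""
    if node is None:
        return best
    point, left, right = node
    dist = math.dist(point, query)
    if best is None or dist < best[0]:
        best = (dist, point)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if abs(diff) < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))
```

The pruning test on the splitting plane is what lets the search skip whole subtrees; approximate methods simply stop exploring earlier than this exact version does.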

SLAM & SFM

SceneLib [LGPL]: a monoSLAM library developed by Andrew Davison.

Image Segmentation

SLIC Super Pixel: uses Simple Linear Iterative Clustering to produce a specified number of approximately uniformly distributed superpixels.

Object Tracking

TLD: an object tracking algorithm based on online random forests.

KLT: the Kanade-Lucas-Tomasi feature tracker.

Online Boosting Trackers

Line Detection

DSCC: a line detection algorithm based on linking connected components.

LSD [GPL]: a gradient-based local line segment detector.

Fingerprinting

pHash [GPL]: a perceptual hashing algorithm for multimedia files (extracts and compares fingerprints of images, video, and audio).
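
As a rough illustration of how perceptual fingerprints are extracted and compared, the sketch below uses the simpler "average hash" technique rather than pHash's own algorithms, and it does not use the pHash API. The function names and the tiny 2x2 "images" are made up for the example; a real pipeline would first shrink the image to a small fixed-size grayscale grid.

```python
def average_hash(pixels):
    """Hash a small grayscale image (rows of 0-255 ints): one bit per pixel."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        # A pixel brighter than the mean contributes a 1 bit, otherwise a 0 bit.
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]     # same picture with a uniform brightness shift
inverted = [[200, 10], [30, 220]]     # a different picture

print(hamming(average_hash(original), average_hash(brighter)))
print(hamming(average_hash(original), average_hash(inverted)))
```

Two images are treated as near-duplicates when the Hamming distance between their hashes falls below a threshold chosen for the application.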

Visual Saliency

Global Contrast Based Salient Region Detection: Ming-Ming Cheng's visual saliency algorithm.

FFT/DWT

FFTW [GPL]: the fastest and best open-source FFT.

FFTReal [WTFPL]: a lightweight FFT implementation. The license is the highlight.

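Audio Processing

STK [Free]: audio processing and audio synthesis.

libsndfile [LGPL]: audio file I/O.

libsamplerate [GPL]: audio resampling.

Wavelet transform: fast wavelet transform (FWT)

BRIEF: Binary Robust Independent Elementary Features. A very good local feature descriptor. The page includes a demo of keypoint matching with FAST corners + BRIEF: http://cvlab.epfl.ch/software/brief/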

http://code.google.com/p/javacv

Java wrappers for OpenCV, FFmpeg, libdc1394, PGR FlyCapture, OpenKinect, videoInput, and ARToolKitPlus. They can also be used on Android.

libHIK, HIK SVM: a library for computing HIK SVM and CENTRIST. http://c2inet.sce.ntu.edu.sg/Jianxin/projects/libHIK/libHIK.htm

A set of links to visual saliency detection code: http://cg.cs.tsinghua.edu.cn/people/~cmm/saliency/

Peter Kovesi's toolbox: lightweight and handy, focused on image processing http://www.peterkovesi.com/matlabfns/

Project site: http://www.csse.uwa.edu.au/~pk/research/matlabfns/

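Peter currently works at The University of Western Australia. He wrote his own set of Matlab computer vision algorithms. The so-called toolbox is really just a collection of many m-files, all implemented in Matlab, with no compilation or installation required and with Octave support (so if you do not have Matlab, you can still do image processing in Octave with this toolbox). Although he works single-handedly, his toolbox is quite well known: whenever you need some small Matlab computer vision function for research, just download a few m-files from his homepage and drop them into your own folder. The toolbox focuses on image processing algorithms, with some basic 3D vision algorithms on the side. Some of the included functionality: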

Feature Detection via Phase Congruency: detecting image features through phase congruency
Spatial Feature Detection: feature detectors such as Harris and Canny
Edge Linking and Line Segment Fitting: various operations on edge and line features
Image Denoising
Surface Normals to Surfaces: integrating a surface from its normal vectors
Scalogram Calculation
Anisotropic diffusion: the well-known edge-preserving smoothing algorithm
Frequency Domain Transformations: Fourier transforms
Functions Supporting Projective Geometry: some algorithms for projective geometry and 3D vision
Feature Matching
Model Fitting and Robust Estimation: RANSAC
Fingerprint Enhancement
Interesting Synthetic Images: some fun image generation algorithms
Image Blending
Colourmaps and colour conversions: colour space algorithms
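
Of the items above, RANSAC is the easiest to show in a few lines. The following is a minimal Python sketch of RANSAC line fitting, not a port of Kovesi's Matlab functions: the function names, the threshold, the iteration count, and the sample data are all assumptions made for illustration.

```python
import random

def fit_line(p, q):
    """Line through p and q as (a, b, c) with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    """Return (line, inliers) for the line supported by the most points."""
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    best_line, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)   # minimal sample: two points define a line
        if p == q:
            continue                   # duplicate points cannot define a line
        a, b, c = fit_line(p, q)
        # Because (a, b) is a unit normal, |a*x + b*y + c| is the point-to-line distance.
        inliers = [(x, y) for x, y in points if abs(a * x + b * y + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers

on_line = [(x, 2 * x + 1) for x in range(10)]      # points on y = 2x + 1
outliers = [(3, 20), (7, -5), (1, 12)]
line, inliers = ransac_line(on_line + outliers)
print(len(inliers), "inliers found")
```

A common follow-up step, omitted here, is to refit the model to all inliers with least squares once the best consensus set has been found.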

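MexOpenCV: calling OpenCV from Matlab

Project site: http://www.cs.sunysb.edu/~kyamagu/mexopencv/

The author, Kota Yamaguchi, is a PhD at Stony Brook University. Some time ago he built a toolkit that compiles OpenCV code into mex interfaces usable from Matlab, and it quickly became popular. This summer the project was absorbed into OpenCV as a module, apparently through a Google Summer of Code (GSoC) project, and recently (around September or October) it was merged into the main OpenCV package. If you are interested, try it under module/matlab in the OpenCV repository on GitHub. It should be officially released in the OpenCV 3 alpha in October. OpenCV now has both Python and Matlab bindings, which is impressive. I will not go into the specific features: since it is an OpenCV binding, most of OpenCV's algorithms are of course available.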

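An introduction to n computer vision libraries, open-source face recognition libraries, and related software

Computer vision library: OpenCV

OpenCV is Intel®'s open-source computer vision library. It consists of a set of C functions and a small number of C++ classes, and implements many general-purpose algorithms in image processing and computer vision. OpenCV has a cross-platform mid-level and high-level API comprising more than 300 C functions. It does not depend on other external libraries, although it can use some. OpenCV, for non-commercial...

Face recognition: faceservice.cgi

faceservice.cgi is a CGI program for face recognition: you upload an image and the program tells you the approximate coordinates of the face. faceservice is developed with the OpenCV library.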

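OpenCV for .NET: OpenCVDotNet

OpenCVDotNet is a .NET wrapper for the OpenCV package.

Face detection algorithm: jViolajones

jViolajones is a Java implementation of the Viola-Jones face detection algorithm and can load OpenCV XML files. Sample code: http://www.oschina.net/code/snippet_12_2033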
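
The Viola-Jones detector owes much of its speed to the integral image, which lets any rectangular sum be computed with four lookups. The sketch below illustrates that building block in plain Python. It is not code from jViolajones or OpenCV, and the function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature: left half minus right half of a window (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 5 + 6 + 8 + 9 = 28
print(two_rect_feature(ii, 0, 0, 2, 3))  # (1 + 4 + 7) - (2 + 5 + 8) = -3
```

A full detector evaluates thousands of such features per window and combines them in a boosted cascade, which is what the OpenCV XML files mentioned above describe.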

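Java vision library: JavaCV

JavaCV provides wrappers for libraries in the computer vision field, including OpenCV, ARToolKitPlus, libdc1394 2.x, PGR FlyCapture, and FFmpeg. The tool also makes it easy to use the features of the Java platform. JavaCV also comes with hardware-accelerated full-screen image display (CanvasFrame), easy execution of parallel code on multiple cores...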

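Motion detection program: QMotion

QMotion is a motion detection program built with OpenCV and based on Qt.

Video surveillance system: OpenVSS

OpenVSS (Open Platform Video Surveillance System) is a system-level video surveillance package built around a video analytics framework (VAF), offering video analysis, retrieval, and playback services along with recording and indexing. It has a plugin design that supports multiple camera platforms, multiple analyzer modules (with OpenCV integration), and multi-core architectures.

Gesture recognition: hand-gesture-detection

Hand gesture recognition implemented with OpenCV.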

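Face detection and recognition: mcvai-tracking

Provides face detection, face recognition, and detection of specific faces. Sample code: cvReleaseImage( &gray ); cvReleaseMemStorage(&storage); cvReleaseHaarClassifierCascade(&cascade);...

Face detection and tracking library: asmlibrary

Active Shape Model Library (ASMLibrary©) SDK, developed with OpenCV, for face detection and tracking.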

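Lua vision library: libecv

ECV is a computer vision library for Lua (currently Linux only).

OpenCV .NET wrapper: OpenCVSharp

OpenCVSharp is a .NET wrapper for OpenCV, developed against the latest OpenCV library. Its usage is closer to the original OpenCV than EmguCV's, and detailed usage examples are provided for reference.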

3D vision library: fvision2010

An image processing and 3D vision library built on OpenCV. Sample code: ImageSequenceReaderFactory factory; ImageSequenceReader* reader = factory.pathRegex("c:/a/im_%03d.jpg", 0, 20); //ImageSequenceReader* reader = factory.avi("a.avi"); if (reader == NULL) { ...

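Qt-based computer vision library: QVision

An object-oriented, cross-platform computer vision library based on Qt. It makes it easy to create graphical applications. Its algorithms are mainly drawn from high-performance libraries such as OpenCV, GSL, CGAL, IPP, and Octave.

Image feature extraction: cvBlob

cvBlob is a library for finding connected components in binary images in computer vision applications. It can perform connected component analysis and feature extraction.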
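
Connected component analysis can be sketched in a few lines of plain Python. The example below labels 4-connected regions with a breadth-first flood fill and computes each blob's area and centroid. It does not reflect cvBlob's actual API or its labeling algorithm, and the function name label_blobs is made up for the illustration.

```python
from collections import deque

def label_blobs(binary):
    """Label 4-connected foreground regions of a binary image (rows of 0/1).

    Returns (labels, blobs): a label image and one feature dict per blob.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    blobs = []
    for sy in range(h):
        for sx in range(w):
            if not binary[sy][sx] or labels[sy][sx]:
                continue                      # background, or already labeled
            label = len(blobs) + 1
            labels[sy][sx] = label
            queue = deque([(sx, sy)])
            area = sum_x = sum_y = 0
            while queue:                      # breadth-first flood fill
                x, y = queue.popleft()
                area += 1
                sum_x += x
                sum_y += y
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = label
                        queue.append((nx, ny))
            blobs.append({"label": label, "area": area,
                          "centroid": (sum_x / area, sum_y / area)})
    return labels, blobs

image = [[1, 1, 0, 0],
         [1, 0, 0, 1],
         [0, 0, 1, 1]]
labels, blobs = label_blobs(image)
for blob in blobs:
    print(blob)
```

Area and centroid are only the simplest blob features; bounding boxes, moments, and contours can be accumulated in the same loop.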

Real-time image/video processing filter development kit: GShow

GShow is a real-time image/video processing filter development kit. It successfully integrates DirectX 11 with the DirectShow framework, so it has the following features: ...

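Video capture API: VideoMan

VideoMan provides a set of video capture APIs. It supports simultaneous input from multiple video streams (video cables, USB cameras, video files, and so on). It can process the input with OpenGL and integrates easily with OpenCV, CUDA, and others for developing computer vision systems.

Open pattern recognition project: OpenPR

Pattern Recognition project (an open pattern recognition project), dedicated to developing a function library covering image processing, computer vision, natural language processing, pattern recognition, machine learning, and related fields.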

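Python wrapper for OpenCV: pyopencv

A Python wrapper for OpenCV. Main features: a Python interface very similar to the latest C++ interface in OpenCV 2.x, including C interfaces that the C++ interface does not cover; bindings for all major components of OpenCV 2.x: CxCORE (almost complete), CxFLANN (complete), Cv (complete), CvAux (C++ part almost...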

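Rapid vision development platform: qcv

A rapid development platform for computer vision that provides a test framework so developers can focus on algorithm research.

Image capture: libv4l2cam

A wrapper around the v4l2 library that grabs image data from webcams and similar hardware, supporting raw YUYV output and BGR24 output as an OpenCV IplImage.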
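
The YUYV-to-BGR24 step can be sketched as follows. This is an illustration only, not libv4l2cam's code: the function names are hypothetical, and the coefficients assume full-range BT.601 (JFIF-style) YCbCr, whereas a given camera may deliver limited-range values that need a different scale and offset.

```python
def clamp(value):
    """Round to the nearest integer and clip to the 0-255 byte range."""
    return max(0, min(255, int(round(value))))

def yuyv_to_bgr24(data):
    """Convert packed YUYV bytes to packed BGR24 bytes.

    YUYV stores two pixels in four bytes (Y0 U Y1 V): each pixel has its own
    luma value, while the pair shares one U and one V chroma sample.
    """
    out = bytearray()
    for i in range(0, len(data) - 3, 4):
        y0, u, y1, v = data[i:i + 4]
        for y in (y0, y1):
            r = y + 1.402 * (v - 128)
            g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
            b = y + 1.772 * (u - 128)
            out += bytes((clamp(b), clamp(g), clamp(r)))   # OpenCV byte order: B, G, R
    return bytes(out)

# One pixel pair with neutral chroma: a black pixel followed by a white pixel.
print(list(yuyv_to_bgr24(bytes([0, 128, 255, 128]))))
```

Production code would normally do this conversion in C or with vectorized array operations, since a per-pixel Python loop is far too slow for live video.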

Computer vision algorithms: OpenVIDIA

OpenVIDIA projects implement computer vision algorithms running on graphics hardware such as single or multiple graphics processing units (GPUs) using
OpenGL, Cg and CUDA-C. Some samples will soon support OpenCL and Direct Compute API'...

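Gaussian mixture point set registration: gmmreg

Implements a point set registration algorithm based on Gaussian mixture models, described in the paper: A Robust Algorithm for Point Set Registration Using Mixture of Gaussians, Bing Jian and Baba C. Vemuri. It provides C++/Matlab/Python interfaces...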

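Pattern recognition and vision library: RAVL

Recognition And Vision Library (RAVL) is a general-purpose C++ library with modules for computer vision, pattern recognition, and more.

Library of common image processing and computer vision algorithms: LTI-Lib

LTI-Lib is an object-oriented library of common algorithms and data structures for image processing and computer vision. It is available as a VC version for Windows and a gcc version for Linux, and mainly covers: 1. linear algebra; 2. cluster analysis; 3. image processing; 4. visualization and plotting tools.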

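OpenCV optimization: opencv-dsp-acceleration

Optimizes the speed of the OpenCV library on DSPs.

C++ computer vision library: Integrating Vision Toolkit

Integrating Vision Toolkit (IVT) is a powerful and fast C++ computer vision library with an easy-to-use interface and an object-oriented architecture. It ships with its own set of cross-platform GUI components and can optionally integrate OpenCV.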

Computer vision and robotics toolkit: EGT

The Epipolar Geometry Toolbox (EGT) is a toolbox designed for Matlab (by Mathworks Inc.). EGT provides a wide set of functions to approach computer vision
and robotics problems with single and multiple views, and with different vision se...

OpenCV extension library: ImageNets

ImageNets is an extension to OpenCV that provides friendly support for robot vision algorithms. Its interface is written with Nokia's Qt.

libvideogfx

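A rapid development library for video processing, computer vision, and computer graphics.

Matlab computer vision package: mVision

A computer vision package for Matlab that includes GUI components for viewing results. It also appears to be no longer developed, but it is quite good for learning.

Computer vision library for Scilab: SIP

SIP is an image processing and computer vision library for Scilab (a free Matlab-like programming environment). SIP can read and write JPEG/PNG/BMP images and provides image filtering, segmentation, edge detection, morphological processing, and shape analysis.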

STAIR Vision Library

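STAIR Vision Library (SVL) was originally designed to support Stanford's intelligent robot and provides support for computer vision, machine learning, and probabilistic and statistical models.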

Vision-related websites

Today's main task is to share some web pages I have bookmarked that I consider to be of real research value:

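Oxford's Andrew Zisserman: http://www.robots.ox.ac.uk/~vgg/hzbook/code/. He mainly works on multiple view geometry. The site provides some tools that are quite practical, along with examples.

Peter Kovesi at the University of Western Australia: http://www.csse.uwa.edu.au/~pk/research/matlabfns/. Provides some basic Matlab tools, mainly covering Computer Vision and Image Processing.

CMU: http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/vision.html. This site is my favorite, especially this address: http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/v-groups.html, which lists what institutions and universities around the world are doing in the various areas of Computer Vision, including Image Processing and Machine Vision. It is also how I later found my way to many sites abroad.

Cambridge: http://mi.eng.cam.ac.uk/milab.html. This is the University of Cambridge's Machine Intelligence Laboratory, which has three groups: Computer Vision & Robotics, Machine Intelligence, and Speech. So far, some of the research results from Computer Vision & Robotics look likely to help me most in the future, hence the mention.

A large collection of original computer vision e-books: http://homepages.inf.ed.ac.uk/rbf/CVonline/books.htm. Today I downloaded Zisserman's book first. These are original foreign editions, and although they are fairly old, they are still very helpful for learning the fundamentals. For the current state of research, you have to rely on papers or research groups' websites.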

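Stanford: http://ai.stanford.edu/~asaxena/reconstruction3d/. This site is maintained by Andrew Ng and an Indian PhD student. It mainly deals with 3D reconstruction from a single photo. In particular, the page make3d.stanford.edu lets you upload your own photos and reconstructs a 3D model through the website. For someone like me who was just starting out in Computer Vision, this site was a treasure. The fatal problem is that make3d no longer accepts registrations. I have emailed Andrew and the student several times with no reply so far, which is frustrating. An account would be great to have, and I wonder whether their mail servers filtered my emails as spam. Still, the main contribution of this Stanford site is its code: it has many basic computer vision tools, about 40 MB in all, entirely Matlab-based.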

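Caltech: http://www.vision.caltech.edu/bouguetj/calib_doc/. This link comes from our Computer Vision lecturer's course slides. It is mainly a toolbox for camera calibration, and it also covers the preprocessing of calibration images for 3D reconstruction.

JP Tarel: http://perso.lcpc.fr/tarel.jean-philippe/. This is his personal homepage, and so far he is the only foreign researcher who has replied to any of my emails. The image set I need for reconstruction practice is his. I have read his paper, but it does not cover code, and since it dates from before 1994, I could not download many of the references it cites. After my repeated questions, Professor Tarel only told me that I could reconstruct the football by following that paper. The trouble is that many of its image processing references can no longer be downloaded: I only know that he used that paper for image preprocessing, with no idea of the actual procedure. I was lucky enough to find a paper from around 1990 on region-based segmentation, but all of the references in that paper are unobtainable too... a tragic life.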

Open-source software site: www.sourceforge.net

A collection of open-source machine vision code

1. Feature Extraction:

SIFT [1] [Demo program][SIFT Library] [VLFeat]
PCA-SIFT [2] [Project]
Affine-SIFT [3] [Project]
SURF [4] [OpenSURF] [Matlab Wrapper]
Affine Covariant Features [5] [Oxford project]
MSER [6] [Oxford project] [VLFeat]
Geometric Blur [7] [Code]
Local Self-Similarity Descriptor [8] [Oxford implementation]
Global and Efficient Self-Similarity [9] [Code]
Histogram of Oriented Gradients [10] [INRIA Object Localization Toolkit] [OLT toolkit for Windows]
GIST [11] [Project]
Shape Context [12] [Project]
Color Descriptor [13] [Project]
Pyramids of Histograms of Oriented Gradients [Code]
Space-Time Interest Points (STIP) [14][Project] [Code]
Boundary Preserving Dense Local Regions [15][Project]
Weighted Histogram[Code]
Histogram-based Interest Points Detectors[Paper][Code]
An OpenCV - C++ implementation of Local Self Similarity Descriptors [Project]
Fast Sparse Representation with Prototypes[Project]
Corner Detection [Project]
AGAST Corner Detector: faster than FAST and even FAST-ER[Project]
Real-time Facial Feature Detection using Conditional Regression Forests[Project]
Global and Efficient Self-Similarity for Object Classification and Detection[code]
WαSH: Weighted α-Shapes for Local Feature Detection[Project]
HOG[Project]
Online Selection of Discriminative Tracking Features[Project]

2. Image Segmentation:

Normalized Cut [1] [Matlab code]
Greg Mori's Superpixel code [2] [Matlab code]
Efficient Graph-based Image Segmentation [3] [C++ code] [Matlab wrapper]
Mean-Shift Image Segmentation [4] [EDISON C++ code] [Matlab wrapper]
OWT-UCM Hierarchical Segmentation [5] [Resources]
TurboPixels [6] [Matlab code 32bit] [Matlab code 64bit] [Updated code]
Quick-Shift [7] [VLFeat]
SLIC Superpixels [8] [Project]
Segmentation by Minimum Code Length [9] [Project]
Biased Normalized Cut [10] [Project]
Segmentation Tree [11-12] [Project]
Entropy Rate Superpixel Segmentation [13] [Code]
Fast Approximate Energy Minimization via Graph Cuts[Paper][Code]
Efficient Planar Graph Cuts with Applications in Computer Vision[Paper][Code]
Isoperimetric Graph Partitioning for Image Segmentation[Paper][Code]
Random Walks for Image Segmentation[Paper][Code]
Blossom V: A new implementation of a minimum cost perfect matching algorithm[Code]
An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Computer Vision[Paper][Code]
Geodesic Star Convexity for Interactive Image Segmentation[Project]
Contour Detection and Image Segmentation Resources[Project][Code]
Biased Normalized Cuts[Project]
Max-flow/min-cut[Project]
Chan-Vese Segmentation using Level Set[Project]
A Toolbox of Level Set Methods[Project]
Re-initialization Free Level Set Evolution via Reaction Diffusion[Project]
Improved C-V active contour model[Paper][Code]
A Variational Multiphase Level Set Approach to Simultaneous Segmentation and Bias Correction[Paper][Code]
Level Set Method Research by Chunming Li[Project]
ClassCut for Unsupervised Class Segmentation[code]
SEEDS: Superpixels Extracted via Energy-Driven Sampling [Project][other]

3. Object Detection:

A simple object detector with boosting [Project]
INRIA Object Detection and Localization Toolkit [1] [Project]
Discriminatively Trained Deformable Part Models [2] [Project]
Cascade Object Detection with Deformable Part Models [3] [Project]
Poselet [4] [Project]
Implicit Shape Model [5] [Project]
Viola and Jones’s Face Detection [6] [Project]
Bayesian Modelling of Dynamic Scenes for Object Detection[Paper][Code]
Hand detection using multiple proposals[Project]
Color Constancy, Intrinsic Images, and Shape Estimation[Paper][Code]
Discriminatively trained deformable part models[Project]
Gradient Response Maps for Real-Time Detection of Texture-Less Objects: LineMOD [Project]
Image Processing On Line[Project]
Robust Optical Flow Estimation[Project]
Where's Waldo: Matching People in Images of Crowds[Project]
Scalable Multi-class Object Detection[Project]
Class-Specific Hough Forests for Object Detection[Project]
Deformed Lattice Detection In Real-World Images[Project]
Discriminatively trained deformable part models[Project]

4. Saliency Detection:

Itti, Koch, and Niebur's saliency detection [1] [Matlab code]
Frequency-tuned salient region detection [2] [Project]
Saliency detection using maximum symmetric surround [3] [Project]
Attention via Information Maximization [4] [Matlab code]
Context-aware saliency detection [5] [Matlab code]
Graph-based visual saliency [6] [Matlab code]
Saliency detection: A spectral residual approach. [7] [Matlab code]
Segmenting salient objects from images and videos. [8] [Matlab code]
Saliency Using Natural statistics. [9] [Matlab code]
Discriminant Saliency for Visual Recognition from Cluttered Scenes. [10] [Code]
Learning to Predict Where Humans Look [11] [Project]
Global Contrast based Salient Region Detection [12] [Project]
Bayesian Saliency via Low and Mid Level Cues[Project]
Top-Down Visual Saliency via Joint CRF and Dictionary Learning[Paper][Code]
Saliency Detection: A Spectral Residual Approach[Code]

5. Image Classification, Clustering

Pyramid Match [1] [Project]
Spatial Pyramid Matching [2] [Code]
Locality-constrained Linear Coding [3] [Project] [Matlab code]
Sparse Coding [4] [Project] [Matlab code]
Texture Classification [5] [Project]
Multiple Kernels for Image Classification [6] [Project]
Feature Combination [7] [Project]
SuperParsing [Code]
Large Scale Correlation Clustering Optimization[Matlab code]
Detecting and Sketching the Common[Project]
Self-Tuning Spectral Clustering[Project][Code]
User Assisted Separation of Reflections from a Single Image Using a Sparsity Prior[Paper][Code]
Filters for Texture Classification[Project]
Multiple Kernel Learning for Image Classification[Project]
SLIC Superpixels[Project]

6. Image Matting

A Closed Form Solution to Natural Image Matting [Code]
Spectral Matting [Project]
Learning-based Matting [Code]

7. Object Tracking:

A Forest of Sensors - Tracking Adaptive Background Mixture Models [Project]
Object Tracking via Partial Least Squares Analysis[Paper][Code]
Robust Object Tracking with Online Multiple Instance Learning[Paper][Code]
Online Visual Tracking with Histograms and Articulating Blocks[Project]
Incremental Learning for Robust Visual Tracking[Project]
Real-time Compressive Tracking[Project]
Robust Object Tracking via Sparsity-based Collaborative Model[Project]
Visual Tracking via Adaptive Structural Local Sparse Appearance Model[Project]
Online Discriminative Object Tracking with Local Sparse Representation[Paper][Code]
Superpixel Tracking[Project]
Learning Hierarchical Image Representation with Sparsity, Saliency and Locality[Paper][Code]
Online Multiple Support Instance Tracking [Paper][Code]
Visual Tracking with Online Multiple Instance Learning[Project]
Object detection and recognition[Project]
Compressive Sensing Resources[Project]
Robust Real-Time Visual Tracking using Pixel-Wise Posteriors[Project]
Tracking-Learning-Detection[Project][OpenTLD/C++ Code]
HandVu: vision-based hand gesture interface[Project]
Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities[Project]

8. Kinect:

Kinect toolbox[Project]
OpenNI[Project]
zouxy09 CSDN Blog[Resource]
FingerTracker: finger tracking [code]

9. 3D-Related:

3D Reconstruction of a Moving Object[Paper] [Code]
Shape From Shading Using Linear Approximation[Code]
Combining Shape from Shading and Stereo Depth Maps[Project][Code]
Shape from Shading: A Survey[Paper][Code]
A Spatio-Temporal Descriptor based on 3D Gradients (HOG3D)[Project][Code]
Multi-camera Scene Reconstruction via Graph Cuts[Paper][Code]
A Fast Marching Formulation of Perspective Shape from Shading under Frontal Illumination[Paper][Code]
Reconstruction:3D Shape, Illumination, Shading, Reflectance, Texture[Project]
Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers[Code]
Learning 3-D Scene Structure from a Single Still Image[Project]

10. Machine Learning Algorithms:

Matlab class for computing Approximate Nearest Neighbor (ANN) [Matlab class providing interface to ANN library]
Random Sampling[code]
Probabilistic Latent Semantic Analysis (pLSA)[Code]
FASTANN and FASTCLUSTER for approximate k-means (AKM)[Project]
Fast Intersection / Additive Kernel SVMs[Project]
SVM[Code]
Ensemble learning[Project]
Deep Learning[Net]
Deep Learning Methods for Vision[Project]
Neural Network for Recognition of Handwritten Digits[Project]
Training a deep autoencoder or a classifier on MNIST digits[Project]
THE MNIST DATABASE of handwritten digits[Project]
Ersatz: deep neural networks in the cloud[Project]
Deep Learning [Project]
sparseLM : Sparse Levenberg-Marquardt nonlinear least squares in C/C++[Project]
Weka 3: Data Mining Software in Java[Project]
Invited talk "A Tutorial on Deep Learning" by Dr. Kai Yu (餘凱)[Video]
CNN - Convolutional neural network class[Matlab Tool]
Yann LeCun's Publications[Website]
LeNet-5, convolutional neural networks[Project]
Training a deep autoencoder or a classifier on MNIST digits[Project]
Deep Learning expert Geoffrey E. Hinton's HomePage[Website]
Multiple Instance Logistic Discriminant-based Metric Learning (MildML) and Logistic Discriminant-based Metric Learning (LDML)[Code]
Sparse coding simulation software[Project]
Visual Recognition and Machine Learning Summer School[Software]

11. Object, Action Recognition:

Action Recognition by Dense Trajectories[Project][Code]
Action Recognition Using a Distributed Representation of Pose and Appearance[Project]
Recognition Using Regions[Paper][Code]
2D Articulated Human Pose Estimation[Project]
Fast Human Pose Estimation Using Appearance and Motion via Multi-Dimensional Boosting Regression[Paper][Code]
Estimating Human Pose from Occluded Images[Paper][Code]
Quasi-dense wide baseline matching[Project]
ChaLearn Gesture Challenge: Principal motion: PCA-based reconstruction of motion histograms[Project]
Real Time Head Pose Estimation with Random Regression Forests[Project]
2D Action Recognition Serves 3D Human Pose Estimation[Project]
A Hough Transform-Based Voting Framework for Action Recognition[Project]
Motion Interchange Patterns for Action Recognition in Unconstrained Videos[Project]
2D articulated human pose estimation software[Project]
Learning and detecting shape models [code]
Progressive Search Space Reduction for Human Pose Estimation[Project]
Learning Non-Rigid 3D Shape from 2D Motion[Project]

12. Image Processing:

Distance Transforms of Sampled Functions[Project]
The Computer Vision Homepage[Project]
Efficient appearance distances between windows[code]
Image Exploration algorithm[code]
Motion Magnification [Project]
Bilateral Filtering for Gray and Color Images [Project]
A Fast Approximation of the Bilateral Filter using a Signal Processing Approach [Project]

13. Some Practical Tools:

EGT: a Toolbox for Multiple View Geometry and Visual Servoing[Project] [Code]
a development kit of matlab mex functions for OpenCV library[Project]
Fast Artificial Neural Network Library[Project]

14. Hand and Fingertip Detection and Recognition:

finger-detection-and-gesture-recognition [Code]
Hand and Finger Detection using JavaCV[Project]
Hand and fingers detection[Code]

15. Scene Interpretation:

Nonparametric Scene Parsing via Label Transfer [Project]

16. Optical Flow:

High accuracy optical flow using a theory for warping [Project]
Dense Trajectories Video Description [Project]
SIFT Flow: Dense Correspondence across Scenes and its Applications[Project]
KLT: An Implementation of the Kanade-Lucas-Tomasi Feature Tracker [Project]
Tracking Cars Using Optical Flow[Project]
Secrets of optical flow estimation and their principles[Project]
Implementation of the Black and Anandan dense optical flow method[Project]
Optical Flow Computation[Project]
Beyond Pixels: Exploring New Representations and Applications for Motion Analysis[Project]
A Database and Evaluation Methodology for Optical Flow[Project]
optical flow relative[Project]
Robust Optical Flow Estimation [Project]
optical flow[Project]

17. Image Retrieval:

Semi-Supervised Distance Metric Learning for Collaborative Image Retrieval [Paper][code]

18. Markov Random Fields:

Markov Random Fields for Super-Resolution [Project]
A Comparative Study of Energy Minimization Methods for Markov Random Fields with Smoothness-Based Priors [Project]

19. Motion Detection:

Moving Object Extraction, Using Models or Analysis of Regions [Project]
Background Subtraction: Experiments and Improvements for ViBe [Project]
A Self-Organizing Approach to Background Subtraction for Visual Surveillance Applications [Project]
changedetection.net: A new change detection benchmark dataset[Project]
ViBe - a powerful technique for background detection and subtraction in video sequences[Project]
Background Subtraction Program[Project]
Motion Detection Algorithms[Project]
Stuttgart Artificial Background Subtraction Dataset[Project]
Object Detection, Motion Estimation, and Tracking[Project]

Feature Detection and Description

General Libraries:

VLFeat – Implementation of various feature descriptors (including SIFT, HOG, and LBP) and covariant feature detectors (including DoG, Hessian, Harris Laplace, Hessian Laplace, Multiscale Hessian, Multiscale Harris). Easy-to-use Matlab interface. See Modern features: Software – Slides providing a demonstration of VLFeat and also links to other software. Check also VLFeat hands-on session training
OpenCV – Various implementations of modern feature detectors and descriptors (SIFT, SURF, FAST, BRIEF, ORB, FREAK, etc.)

Fast Keypoint Detectors for Real-time Applications:

FAST – High-speed corner detector implementation for a wide variety of platforms
AGAST – Even faster than the FAST corner detector. A multi-scale version of this method is used for the BRISK descriptor (ECCV 2010).

Binary Descriptors for Real-Time Applications:

BRIEF – C++ code for a fast and accurate interest point descriptor (not invariant to rotations and scale) (ECCV 2010)
ORB – OpenCV implementation of the Oriented-Brief (ORB) descriptor (invariant to rotations, but not scale)
BRISK – Efficient Binary descriptor invariant to rotations and scale. It includes a Matlab mex interface. (ICCV 2011)
FREAK – Faster than BRISK (invariant to rotations and scale) (CVPR 2012)
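
Binary descriptors such as these are compared with the Hamming distance, which is what makes them cheap to match in real time. The sketch below shows brute-force matching in plain Python, with descriptors represented as integers. The function names, the 8-bit toy descriptors, and the distance threshold are illustrative assumptions, not part of any of the libraries listed.

```python
def hamming(a, b):
    """Number of bit positions in which two binary descriptors differ."""
    return bin(a ^ b).count("1")

def match_descriptors(query, train, max_distance=64):
    """For each query descriptor, find the closest train descriptor.

    Returns a list of (query_index, train_index, distance) tuples, keeping
    only matches whose distance does not exceed max_distance.
    """
    matches = []
    for qi, q in enumerate(query):
        ti, dist = min(((ti, hamming(q, t)) for ti, t in enumerate(train)),
                       key=lambda m: m[1])
        if dist <= max_distance:
            matches.append((qi, ti, dist))
    return matches

# Toy 8-bit descriptors; real BRIEF/ORB descriptors are typically 256 bits.
query = [0b11110000, 0b00001111]
train = [0b00001110, 0b11110001, 0b10101010]
print(match_descriptors(query, train, max_distance=2))
```

On real data, a ratio test or cross-check between the two images is usually added to reject ambiguous matches.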

SIFT and SURF Implementations:

SIFT: VLFeat, OpenCV, Original code by David Lowe, GPU implementation, OpenSIFT
SURF: Herbert Bay’s code, OpenCV, GPU-SURF

Other Local Feature Detectors and Descriptors:

VGG Affine Covariant features – Oxford code for various affine covariant feature detectors and descriptors.
LIOP descriptor – Source code for the Local Intensity Order Pattern (LIOP) descriptor (ICCV 2011).
Local Symmetry Features – Source code for matching of local symmetry features under large variations in lighting, age, and rendering style (CVPR 2012).

Global Image Descriptors:

GIST – Matlab code for the GIST descriptor
CENTRIST – Global visual descriptor for scene categorization and object detection (PAMI 2011)

Feature Coding and Pooling

VGG Feature Encoding Toolkit – Source code for various state-of-the-art feature encoding methods – including Standard hard encoding, Kernel codebook encoding, Locality-constrained linear encoding, and Fisher kernel encoding.
Spatial Pyramid Matching – Source code for feature pooling based on spatial pyramid matching (widely used for image classification)

Convolutional Nets and Deep Learning

EBLearn – C++ Library for Energy-Based Learning. It includes several demos and step-by-step instructions to train classifiers based on convolutional neural networks.
Torch7 – Provides a matlab-like environment for state-of-the-art machine learning algorithms, including a fast implementation of convolutional neural networks.
Deep Learning - Various links for deep learning software.

Part-Based Models

Deformable Part-based Detector – Library provided by the authors of the original paper (state-of-the-art in PASCAL VOC detection task)
Efficient Deformable Part-Based Detector – Branch-and-Bound implementation for a deformable part-based detector.
Accelerated Deformable Part Model – Efficient implementation of a method that achieves the exact same performance of deformable part-based detectors but with significant acceleration (ECCV 2012).
Coarse-to-Fine Deformable Part Model – Fast approach for deformable object detection (CVPR 2011).
Poselets – C++ and Matlab versions for object detection based on poselets.
Part-based Face Detector and Pose Estimation – Implementation of a unified approach for face detection, pose estimation, and landmark localization (CVPR 2012).

Attributes and Semantic Features

Relative Attributes – Modified implementation of RankSVM to train Relative Attributes (ICCV 2011).
Object Bank – Implementation of object bank semantic features (NIPS 2010). See also ActionBank
Classemes, Picodes, and Meta-class features – Software for extracting high-level image descriptors (ECCV 2010, NIPS 2011, CVPR 2012).

Large-Scale Learning

Additive Kernels – Source code for fast additive kernel SVM classifiers (PAMI 2013).
LIBLINEAR – Library for large-scale linear SVM classification.
VLFeat – Implementation for Pegasos SVM and Homogeneous Kernel map.

Fast Indexing and Image Retrieval

FLANN – Library for performing fast approximate nearest neighbor.
Kernelized LSH – Source code for Kernelized Locality-Sensitive Hashing (ICCV 2009).
ITQ Binary codes – Code for generation of small binary codes using Iterative Quantization and other baselines such as Locality-Sensitive-Hashing (CVPR 2011).
INRIA Image Retrieval – Efficient code for state-of-the-art large-scale image retrieval (CVPR 2011).

Object Detection

See Part-based Models and Convolutional Nets above.
Pedestrian Detection at 100fps – Very fast and accurate pedestrian detector (CVPR 2012).
Caltech Pedestrian Detection Benchmark – Excellent resource for pedestrian detection, with various links for state-of-the-art implementations.
OpenCV – Enhanced implementation of Viola&Jones real-time object detector, with trained models for face detection.
Efficient Subwindow Search – Source code for branch-and-bound optimization for efficient object localization (CVPR 2008).

3D Recognition

Point-Cloud Library – Library for 3D image and point cloud processing.

Action Recognition

ActionBank – Source code for action recognition based on the ActionBank representation (CVPR 2012).
STIP Features – software for computing space-time interest point descriptors
Independent Subspace Analysis – Look for Stacked ISA for Videos (CVPR 2011)
Velocity Histories of Tracked Keypoints - C++ code for activity recognition using the velocity histories of tracked keypoints (ICCV 2009)

Datasets

Attributes

Animals with Attributes – 30,475 images of 50 animals classes with 6 pre-extracted feature representations for each image.
aYahoo and aPascal – Attribute annotations for images collected from Yahoo and Pascal VOC 2008.
FaceTracer – 15,000 faces annotated with 10 attributes and fiducial points.
PubFig – 58,797 face images of 200 people with 73 attribute classifier outputs.
LFW – 13,233 face images of 5,749 people with 73 attribute classifier outputs.
Human Attributes – 8,000 people with annotated attributes. Check also this link for another dataset of human attributes.
SUN Attribute Database – Large-scale scene attribute database with a taxonomy of 102 attributes.
ImageNet Attributes – Variety of attribute labels for the ImageNet dataset.
Relative attributes – Data for OSR and a subset of PubFig datasets. Check also this link for the WhittleSearch data.
Attribute Discovery Dataset – Images of shopping categories associated with textual descriptions.

Fine-grained Visual Categorization

Caltech-UCSD Birds Dataset – Hundreds of bird categories with annotated parts and attributes.
Stanford Dogs Dataset – 20,000 images of 120 breeds of dogs from around the world.
Oxford-IIIT Pet Dataset – 37 category pet dataset with roughly 200 images for each class. Pixel level trimap segmentation is included.
Leeds Butterfly Dataset – 832 images of 10 species of butterflies.
Oxford Flower Dataset – Hundreds of flower categories.

Face Detection

FDDB – UMass face detection dataset and benchmark (5,000+ faces)
CMU/MIT – Classical face detection dataset.

Face Recognition

Face Recognition Homepage – Large collection of face recognition datasets.
LFW – UMass unconstrained face recognition dataset (13,000+ face images).
NIST Face Homepage – includes face recognition grand challenge (FRGC), vendor tests (FRVT) and others.
CMU Multi-PIE – contains more than 750,000 images of 337 people, with 15 different views and 19 lighting conditions.
FERET – Classical face recognition dataset.
Deng Cai’s face dataset in Matlab Format – Easy to use if you want to play with simple face datasets including Yale, ORL, PIE, and Extended Yale B.
SCFace – Low-resolution face dataset captured from surveillance cameras.

Handwritten Digits

MNIST – large dataset containing a training set of 60,000 examples, and a test set of 10,000 examples.

Pedestrian Detection

Caltech Pedestrian Detection Benchmark – 10 hours of video taken from a vehicle, 350K bounding boxes for about 2.3K unique pedestrians.
INRIA Person Dataset – Currently one of the most popular pedestrian detection datasets.
ETH Pedestrian Dataset – Urban dataset captured from a stereo rig mounted on a stroller.
TUD-Brussels Pedestrian Dataset – Dataset with image pairs recorded in a crowded urban setting with an onboard camera.
PASCAL Human Detection – One of 20 categories in PASCAL VOC detection challenges.
USC Pedestrian Dataset – Small dataset captured from surveillance cameras.

Generic Object Recognition

ImageNet – Currently the largest visual recognition dataset in terms of number of categories and images.
Tiny Images – 80 million 32x32 low resolution images.
Pascal VOC – One of the most influential visual recognition datasets.
Caltech 101 / Caltech 256 – Popular image datasets containing 101 and 256 object categories, respectively.
MIT LabelMe – Online annotation tool for building computer vision databases.

Scene Recognition

MIT SUN Dataset – MIT scene understanding dataset.
UIUC Fifteen Scene Categories – Dataset of 15 natural scene categories.

Feature Detection and Description

VGG Affine Dataset – Widely used dataset for measuring performance of feature detection and description. Check VLBenchmarks for an evaluation framework.

Action Recognition