光流是圖像亮度的運動信息描述。光流法計算最初是由Horn和Schunck於1981年提出的,創造性地將二維速度場與灰度相聯繫,引入光流約束方程,獲得光流計算的基本算法.光流計算基於物體移動的光學特性提出了2個假設:算法
①運動物體的灰度在很短的間隔時間內保持不變;
②給定鄰域內的速度向量場變化是緩慢的。
windows
假設圖像上一個像素點(x,y),在t時刻的亮度爲E(x+Δx,y+Δy,t+Δt),同時用u(x,y0和v(x,y)來表示該點光流在水平和垂直方向上的移動份量:
函數
u=dx/dtui
v=dy/dtthis
在通過一段時間間隔Δt後該點對應點亮度爲E(x+Δx,y+Δy,t+Δt),當Δt很小趨近於0時,咱們能夠認爲該點亮度不變,因此能夠有:
E(x,y,t)=E(x+Δx,y+Δy,t+Δt)
當該點的亮度有變化時,將移動後點的亮度由Taylor公式展幵,可得:
忽略其二階無窮小,因爲Δt趨近於0時,有:spa
式中w=(u,v),因此上式就是基本的光流約束方程。
其中令表示圖像中像素點灰度沿x,y,t方向的梯度,可將上式改寫成:
code
Lucas-Kanade是一種普遍使用的光流估計的差分方法,這個方法是由Bruce D. Lucas和Takeo Kanade發明的。它假設光流在像素點的鄰域是一個常數,而後使用最小二乘法對鄰域中的全部像素點求解基本的光流方程。
經過結合幾個鄰近像素點的信息,盧卡斯-金出方法(簡稱爲L-K方法)一般可以消除光流方程裏的多義性。並且,與逐點計算的方法相比,L-K方法對圖像噪聲不敏感。不過,因爲這是一種局部方法,因此在圖像的均勻區域內部,L-K方法沒法提供光流信息。
遞歸
令I0 = I 是第 0 層的圖像,它是金字塔圖像中分辨率最高的圖像,圖像的寬度和高度分別定義爲nx0 = nx 和 ny0 = ny 。以一種遞歸的方式創建金字塔:從I0中計算I1,從I1中計算I2 ,···。令L =1, 2,...表明金字塔的層數,L一般取2,3,4。IL−1 是第L−1層的圖像,nxL−1 和 nyL−1分別是圖像IL−1 的寬度和高度。圖像IL可按以下方式由IL−1 求得:
ip
即用一個[0.25 0.5 0.25]的低通濾波器對IL-1進行卷積。get
整體來說,金字塔特徵跟蹤算法描述以下:首先,光流和仿射變換矩陣在最高一層的圖像上計算出;將上一層的計算結果做爲初始值傳遞給下一層圖像,這一層的圖像在這個初始值的基礎上,計算這一層的光流和仿射變化矩陣;再將這一層的光流和仿射矩陣做爲初始值傳遞給下一層圖像,直到傳遞給最後一層,即原始圖像層,這一層計算出來的光流和仿射變換矩陣做爲最後的光流和仿射變換矩陣的結果。
對於L=0,1,2,…L,定義是圖像中像素點u在第L層對應點的座標。根據上一步中圖像金字塔的定義,能夠計算出
咱們用數學的思想從新描述在L層和L+1層迭代運算,假定在第L層有對被跟蹤目標的位置有個大體估計,而從第L+1層傳遞到L層的運動矢量,即光流計算初值爲(後面會對gL作一個解釋)而且對於最上層的變換矩陣猜想
爲了在L層上計算光流和仿射變換矩陣,須要從新定義在L層上的匹配偏差ξL:
其中圖像和是原始圖像在L層上採樣出來的圖像,基於這層中的光流和仿射矩陣初值gL和GL能夠計算出兩個對應圖像和:
這裏用L+1層獲得的最初估計gL對L層做預平移,L層在gL的基礎上求該層的光流dL,這樣求得的殘餘光流向量dL= [dLx, dLy]T就足夠小,所以可以經過標準的光流法來求出這個運動矢量。而後獲得的dL結合gL又能夠對L-1層的gL-1作估計。最終的光流和就是在全部層的分段光流d的疊加。使用金字塔圖像計算光流的一個明顯的好處是,對於一個有着較大的像素偏移的矢量d,能夠經過計算幾個比較小的殘餘光流來獲得。這裏就是金字塔跟蹤算法的核心。
接下來就是計算該層上的光流dL和變換矩陣AL,咱們將在下一步中談論。如今,假設在這一層上的光流和變換矩陣己經計算出來。接着將結果傳遞給下一層,計算出下一層的假設初值:
將gL-1和GL-1做爲初值,從新循環上面的步驟,直到最上一層,計算出光流d和仿射變換矩陣A。
因爲金字塔的縮放減少了光流值,最高層的光流估計值能夠設爲0,設頂層時的初始爲:
這種算法最明顯的優點在於對於每一層的光流都會保持很小,可是最終計算來的光流能夠進行累積,便於有效地跟蹤特徵點。
這一步是算法的核心步驟。在金字塔的每一層,目標是計算出光流dL和仿射變換矩陣AL從而使偏差ξL最小。因爲每一層的迭代過程是相同的,因此咱們就描述從一層到下一層的迭代過程。首先將上一層的光流u和A傳給這一層,計算這一幀圖像中像素點的光照,同時計算出圖像在該點x方向和y方向上的偏導
Ix=[I(x+1,y)-I(x-1,y)]/2
Iy=[I(x,y+1)-I(x,y-1)]/2
在此基礎上,計算出空間梯度矩陣:
更新光流v=2*v
迭代過程:計算後一幀圖像中對應像素點的灰度,計算兩
幀圖像間相同位置點的灰度值之差,在計算圖像之間的偏差
向量:
最後計算針對仿射光流
,
更新跟蹤結果
直到某個閾值,結束在這一層的迭代過程。
所以,可按照如下的步驟選擇特徵點:
一、計算圖像 I 中每個像素的矩陣G和最小特徵值λm。
二、尋找整副圖像中最小特徵值 λm 中的最大特徵值λmax。
三、保留最小特徵值 λm 大於給定閾值的像素點。閾值一般取5% λmax ~10% λmax 。
四、保留 λm 局部最大值的像素:像素特徵值 λm 大於其3*3 鄰域中其餘像素的特徵值 λm 。
五、剔除像素密集區域中的一些像素,確保圖像中相鄰像素的距離都大於給定的閾值(常取5~10 pixels)。
上述操做完成後,圖像 I 中剩下的像素即爲選擇的特徵點,並做爲跟蹤特徵點。特徵點選擇算法的步驟5 確保了特徵點間的最小距離。
沒有必要取一個大的綜合窗口選擇特徵點(或計算矩陣G)。大量實驗證實,wx = wy =1的 3*3 大小的綜合窗口可以取得滿意的效果。
在大多數的狀況下,超過4的金字塔圖像層次沒有太大的意義。
有時爲了簡化能夠將仿射變換矩陣G簡化爲單位矩陣。
下面是個人實現。
#ifndef _LKOF_ #define _LKOF_ /***********Lucas-Kanade optical flow track**********/ /*****************author Marshall********************/ /**********************2015.10.14********************/ /******************version 1.0***********************/ class LucasKanadeTracker { struct DBPoint { double x; double y; DBPoint(const double X, const double Y):x(X),y(Y){ } DBPoint(){} }; int*height; int*width; private: unsigned int max_pyramid_layer; unsigned int original_imgH; unsigned int original_imgW; unsigned int window_radius; BYTE**pre_pyr;//the pyramid of previous frame image,img1_pyr[0] is of max size BYTE**next_pyr;//the frame after img1_pyr bool isusepyramid; DBPoint*target,*endin; int numofpoint; private: void build_pyramid(BYTE**&original_gray); void lucaskanade(BYTE**&frame_pre, BYTE**&frame_cur, DBPoint*& start, DBPoint*& finish, unsigned int point_nums, char*state); void get_max_pyramid_layer(); void pyramid_down(BYTE*&src_gray_data, const int src_h, const int src_w, BYTE*& dst, int&dst_h, int&dst_w); void lowpass_filter(BYTE*&src, const int H, const int W, BYTE*&smoothed); double get_subpixel(BYTE*&src, int h, int w, const DBPoint& point); // caculate the inverse mareix of pMatrix,the result is put into _pMatrix void ContraryMatrix(double *pMatrix, double * _pMatrix, int dim); bool matrixMul(double *src1, int height1, int width1, double *src2, int height2, int width2, double *dst); public: LucasKanadeTracker(const int windowRadius, bool usePyr); ~LucasKanadeTracker(); void get_pre_frame(BYTE*&gray);//use only at the beginning void discard_pre_frame(); //set the next frame as pre_frame,must dicard pre_pyr in advance void get_pre_frame(); //use every time,must after using get_pre_frame(BYTE**pyr) void get_next_frame(BYTE*&gray); void get_info(const int nh, const int nw); void get_target(POINT*target, int n); void run_single_frame(); POINT get_result(); BYTE*&get_pyramid(int th); int get_pyrH(int th){ return height[th]; } int get_pyrW(int th){ return width[th]; } }; #endif
#include "stdafx.h" #include "LucasKanadeTracker.h" using namespace std; LucasKanadeTracker::LucasKanadeTracker(const int windowRadius, bool usePyr) :window_radius(windowRadius), isusepyramid(usePyr) { } LucasKanadeTracker::~LucasKanadeTracker() { for (int i = 0; i < max_pyramid_layer; i++) { if (pre_pyr[i]) delete[]pre_pyr[i]; if (next_pyr[i]) delete[]next_pyr[i]; } delete[]pre_pyr; delete[]next_pyr; if (height) delete[]height; if (width) delete[]width; } void LucasKanadeTracker::lowpass_filter(BYTE*&src, const int H, const int W, BYTE*&smoothed) { //tackle with border for (int i = 0; i < H; i++) { smoothed[i*W] = src[i*W]; smoothed[i*W + W - 1] = src[i*W + W - 1]; } for (int i = 0; i < W; i++) { smoothed[i] = src[i]; smoothed[(H - 1)*W + i] = src[(H - 1)*W + i]; } for (int i = 1; i < H - 1; i++) for (int j = 1; j < W - 1; j++) { double re = 0; re += src[i*W + j] * 0.25; re += src[(i - 1)*W + j] * 0.125; re += src[i*W + j + 1] * 0.125; re += src[i*W + j - 1] * 0.125; re += src[(i + 1)*W + j] * 0.125; re += src[(i - 1)*W + j - 1] * 0.0625; re += src[(i + 1)*W + j - 1] * 0.0625; re += src[(i - 1)*W + j + 1] * 0.0625; re += src[(i + 1)*W + j + 1] * 0.0625; smoothed[i*W + j] = BYTE(re); } delete[]src; src = smoothed; } void LucasKanadeTracker::get_info(const int nh, const int nw) { original_imgH = nh; original_imgW = nw; if (isusepyramid) get_max_pyramid_layer(); else max_pyramid_layer = 1; pre_pyr = new BYTE*[max_pyramid_layer]; next_pyr = new BYTE*[max_pyramid_layer]; height = new int[max_pyramid_layer]; width = new int[max_pyramid_layer]; height[0] = nh; width[0] = nw; } void LucasKanadeTracker::get_target(POINT*target, int n) { this->target = new DBPoint[n]; endin = new DBPoint[n]; for (int i = 0; i < n; i++) { this->target[i].x = target[i].x; this->target[i].y = target[i].y; } numofpoint = n; } BYTE*&LucasKanadeTracker::get_pyramid(int th) { return pre_pyr[th]; } POINT LucasKanadeTracker::get_result() { POINT pp; pp.x = target[0].x; pp.y = target[0].y; return pp; } void LucasKanadeTracker::get_pre_frame(BYTE*&gray)//use only at the beginning { pre_pyr[0] = gray; build_pyramid(pre_pyr); //save_gray("1.bmp", pre_pyr[1], height[1], width[1]); } void LucasKanadeTracker::discard_pre_frame() { //we don't new memory for original data,so we don't delete it here for (int i = 0; i < max_pyramid_layer; i++) delete[]pre_pyr[i]; } //set the next frame as pre_frame,must dicard pre_pyr in advance void LucasKanadeTracker::get_pre_frame() { for (int i = 0; i < max_pyramid_layer; i++) pre_pyr[i] = next_pyr[i]; } //use every time,must after using get_pre_frame(BYTE**pyr) void LucasKanadeTracker::get_next_frame(BYTE*&gray) { next_pyr[0] = gray; build_pyramid(next_pyr); //save_gray("1.bmp", next_pyr[1], height[1], width[1]); } void LucasKanadeTracker::pyramid_down(BYTE*&src_gray_data, const int src_h, const int src_w, BYTE*& dst, int&dst_h, int&dst_w) { dst_h = src_h / 2; dst_w = src_w / 2; int ii = height[1]; int hh = width[1]; assert(dst_w > 3 && dst_h > 3); //BYTE*smoothed = new BYTE[src_h*src_w]; dst = new BYTE[dst_h*dst_w]; //lowpass_filter(src_gray_data, src_h, src_w,smoothed); for (int i = 0; i < dst_h - 1; i++) for (int j = 0; j < dst_w - 1; j++) { int srcY = 2 * i + 1; int srcX = 2 * j + 1; double re = src_gray_data[srcY*src_w + srcX] * 0.25; re += src_gray_data[(srcY - 1)*src_w + srcX] * 0.125; re += src_gray_data[(srcY + 1)*src_w + srcX] * 0.125; re += src_gray_data[srcY*src_w + srcX - 1] * 0.125; re += src_gray_data[srcY*src_w + srcX + 1] * 0.125; re += src_gray_data[(srcY - 1)*src_w + srcX + 1] * 0.0625; re += src_gray_data[(srcY - 1)*src_w + srcX - 1] * 0.0625; re += src_gray_data[(srcY + 1)*src_w + srcX - 1] * 0.0625; re += src_gray_data[(srcY + 1)*src_w + srcX + 1] * 0.0625; dst[i*dst_w + j] = re; } for (int i = 0; i < dst_h; i++) dst[i*dst_w + dst_w - 1] = dst[i*dst_w + dst_w - 2]; for (int i = 0; i < dst_w; i++) dst[(dst_h - 1)*dst_w + i] = dst[(dst_h - 2)*dst_w + i]; } //bilinear interplotation double LucasKanadeTracker::get_subpixel(BYTE*&src, int h, int w, const DBPoint& point) { int floorX = floor(point.x); int floorY = floor(point.y); double fractX = point.x - floorX; double fractY = point.y - floorY; return ((1.0 - fractX) * (1.0 - fractY) * src[floorX + w* floorY]) + (fractX * (1.0 - fractY) * src[floorX + 1 + floorY*w]) + ((1.0 - fractX) * fractY * src[floorX + (floorY + 1)*w]) + (fractX * fractY * src[floorX + 1 + (floorY + 1)*w]); } void LucasKanadeTracker::get_max_pyramid_layer() { int layer = 0; int windowsize = 2 * window_radius + 1; int temp = original_imgH > original_imgW ? original_imgW : original_imgH; if (temp > ((1 << 4) * 2 * windowsize)) { max_pyramid_layer = 5; return; } temp = double(temp) / 2; while (temp > 2 * windowsize) { layer++; temp = double(temp) / 2; } max_pyramid_layer = layer; } void LucasKanadeTracker::build_pyramid(BYTE**&pyramid) { for (int i = 1; i < max_pyramid_layer; i++) { pyramid_down(pyramid[i - 1], height[i - 1], width[i - 1], pyramid[i], height[i], width[i]); } } void LucasKanadeTracker::run_single_frame() { char*state = NULL; lucaskanade(pre_pyr, next_pyr, target, endin, numofpoint, state); for (int i = 0; i < numofpoint; i++) { target[i].x = endin[i].x; target[i].y = endin[i].y; } } void LucasKanadeTracker::lucaskanade(BYTE**&frame_pre, BYTE**&frame_cur, DBPoint*& start, DBPoint*& finish, unsigned int point_nums, char*state) { double*derivativeXs = new double [(2 * window_radius + 1)*(2 * window_radius + 1)]; double*derivativeYs = new double [(2 * window_radius + 1)*(2 * window_radius + 1)]; for (int i = 0; i < point_nums; i++) { double g[2] = { 0 }; double finalopticalflow[2] = { 0 }; memset(derivativeXs, 0, sizeof(double)* (2 * window_radius + 1)*(2 * window_radius + 1)); memset(derivativeYs, 0, sizeof(double)* (2 * window_radius + 1)*(2 * window_radius + 1)); for (int j = max_pyramid_layer - 1; j >= 0; j--) { DBPoint curpoint; curpoint.x = start[i].x / pow(2.0, j); curpoint.y = start[i].y / pow(2.0, j); double Xleft = curpoint.x - window_radius; double Xright = curpoint.x + window_radius; double Yleft = curpoint.y - window_radius; double Yright = curpoint.y + window_radius; double gradient[4] = { 0 }; int cnt = 0; for (double xx = Xleft; xx < Xright + 0.01; xx += 1.0) for (double yy = Yleft; yy < Yright + 0.01; yy += 1.0) { assert(xx < 1000 && yy < 1000 && xx >= 0 && yy >= 0); double derivativeX = get_subpixel(frame_pre[j], height[j], width[j], DBPoint(xx + 1.0, yy)) - get_subpixel(frame_pre[j], height[j], width[j], DBPoint(xx - 1.0, yy)); derivativeX /= 2.0; double t1 = get_subpixel (frame_pre[j], height[j], width[j], DBPoint(xx, yy + 1.0)); double t2 = get_subpixel(frame_pre[j], height[j], width[j], DBPoint(xx, yy - 1.0)); double derivativeY = (t1 - t2) / 2.0; derivativeXs[cnt] = derivativeX; derivativeYs[cnt++] = derivativeY; gradient[0] += derivativeX * derivativeX; gradient[1] += derivativeX * derivativeY; gradient[2] += derivativeX * derivativeY; gradient[3] += derivativeY * derivativeY; } double gradient_inv[4] = { 0 }; ContraryMatrix(gradient, gradient_inv, 2); double opticalflow[2] = { 0 }; int max_iter = 50; double opticalflow_residual = 1; int iteration = 0; while (iteration<max_iter&&opticalflow_residual>0.00001) { iteration++; double mismatch[2] = { 0 }; cnt = 0; for (double xx = Xleft; xx < Xright + 0.001; xx += 1.0) for (double yy = Yleft; yy < Yright + 0.001; yy += 1.0) { assert(xx < 1000 && yy < 1000 && xx >= 0 && yy >= 0); double nextX = xx + g[0] + opticalflow[0]; double nextY = yy + g[1] + opticalflow[1]; assert(nextX < 1000 && nextY < 1000 && nextX >= 0 && nextY >= 0); double pixelDifference = (get_subpixel(frame_pre[j], height[j], width[j], DBPoint(xx, yy)) - get_subpixel(frame_cur[j], height[j], width[j], DBPoint(nextX, nextY))); mismatch[0] += pixelDifference*derivativeXs[cnt]; mismatch[1] += pixelDifference*derivativeYs[cnt++]; } double temp_of[2]; matrixMul(gradient_inv, 2, 2, mismatch, 2, 1, temp_of); opticalflow[0] += temp_of[0]; opticalflow[1] += temp_of[1]; opticalflow_residual = abs(temp_of[0]) + abs(temp_of[1]); } if (j == 0) { finalopticalflow[0] = opticalflow[0]; finalopticalflow[1] = opticalflow[1]; } else { g[0] = 2 * (g[0] + opticalflow[0]); g[1] = 2 * (g[1] + opticalflow[1]); } } finalopticalflow[0] += g[0]; finalopticalflow[1] += g[1]; finish[i].x = start[i].x + finalopticalflow[0]; finish[i].y = start[i].y + finalopticalflow[1]; } delete[]derivativeXs, derivativeYs; } //matrix inverse void LucasKanadeTracker::ContraryMatrix(double *pMatrix, double * _pMatrix, int dim) { double *tMatrix = new double[2 * dim*dim]; for (int i = 0; i < dim; i++){ for (int j = 0; j < dim; j++) tMatrix[i*dim * 2 + j] = pMatrix[i*dim + j]; } for (int i = 0; i < dim; i++){ for (int j = dim; j < dim * 2; j++) tMatrix[i*dim * 2 + j] = 0.0; tMatrix[i*dim * 2 + dim + i] = 1.0; } //Initialization over! for (int i = 0; i < dim; i++)//Process Cols { double base = tMatrix[i*dim * 2 + i]; if (fabs(base) < 1E-300){ // AfxMessageBox("求逆矩陣過程當中被零除,沒法求解!" ); _ASSERTE(-1);//throw exception exit(0); } for (int j = 0; j < dim; j++)//row { if (j == i) continue; double times = tMatrix[j*dim * 2 + i] / base; for (int k = 0; k < dim * 2; k++)//col { tMatrix[j*dim * 2 + k] = tMatrix[j*dim * 2 + k] - times*tMatrix[i*dim * 2 + k]; } } for (int k = 0; k < dim * 2; k++){ tMatrix[i*dim * 2 + k] /= base; } } for (int i = 0; i < dim; i++) { for (int j = 0; j < dim; j++) _pMatrix[i*dim + j] = tMatrix[i*dim * 2 + j + dim]; } delete[] tMatrix; } bool LucasKanadeTracker::matrixMul(double *src1, int height1, int width1, double *src2, int height2, int width2, double *dst) { int i, j, k; double sum = 0; double *first = src1; double *second = src2; double *dest = dst; int Step1 = width1; int Step2 = width2; if (src1 == NULL || src2 == NULL || dest == NULL || height2 != width1) return false; for (j = 0; j < height1; j++) { for (i = 0; i < width2; i++) { sum = 0; second = src2 + i; for (k = 0; k < width1; k++) { sum += first[k] * (*second); second += Step2; } dest[i] = sum; } first += Step1; dest += Step2; } return true; }
下面是兩幀圖像間特徵點的跟蹤
參考文獻:Pyramidal Implementation of the AÆne Lucas Kanade Feature Tracker Description of the algorithm
An Iterative Image Registration Technique with an Application to Stereo Vision
Detection and Tracking of Point Features
Good Features to Track
Derivation of Kanade-Lucas-Tomasi Tracking Equation