Four simple methods for extracting salient image regions: AC/HC/LC/FT.

Date: 2020-07-22


Category: Image processing. Posted 2014-08-03 12:40.

Tags: salient region detection, saliency detection

laviewpbt, edited 2014.8.3

Email: laviewpbt@sina.com QQ: 33184777

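I had some free time recently and read a few papers on saliency detection. I only skimmed them and did no deep research; below I share some of what I learned and my experience.

Let's start with the simplest algorithm, the one that is easiest to implement:

1. The LC algorithm

Reference paper: Visual Attention Detection in Video Sequences Using Spatiotemporal Cues. Yun Zhai and Mubarak Shah. Pages 4-5.

The principle of the algorithm is described on pages 4 and 5 of the paper.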

When viewers watch a video sequence, they are attracted not only by the interesting events, but also sometimes by the interesting objects in still images. This is referred as the spatial attention. Based on the psychological studies, human perception system is sensitive to the contrast of visual signals, such as color, intensity and texture. Taking this as the underlying assumption, we propose an efficient method for computing the spatial saliency maps using the color statistics of images. The algorithm is designed with a linear computational complexity with respect to the number of image pixels. The saliency map of an image is built upon the color contrast between image pixels. The saliency value of a pixel I_k in an image I is defined as,

$$SalS(I_k) = \sum_{\forall I_i \in I} \| I_k - I_i \| \qquad (7)$$

where the value of I_i is in the range of [0, 255], and || · || represents the color distance metric.

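To implement this algorithm, formula (7) alone is entirely sufficient. The saliency value of each pixel is the sum of some distance between that pixel and every other pixel in the image; this distance is usually the Euclidean distance.

If the definition is used directly, the time complexity is very high, since every pixel is compared with every other pixel. The obvious optimization is a histogram.

Note that the paper uses the gray value of a pixel as the basis for the saliency computation, so an image has at most 256 distinct pixel values.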
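To make the histogram shortcut concrete, here is a minimal sketch that compares the direct definition with a 256-entry lookup table. It assumes the image is already a flat array of 8-bit gray values; the function names `LcBruteForce` and `LcHistogram` are invented for this illustration and are not from the paper or its code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Direct definition: for each pixel, sum |gray_k - gray_i| over all pixels. O(N^2).
std::vector<long long> LcBruteForce(const std::vector<std::uint8_t> &gray)
{
    std::vector<long long> sal(gray.size(), 0);
    for (std::size_t k = 0; k < gray.size(); k++)
        for (std::size_t i = 0; i < gray.size(); i++)
        {
            int d = int(gray[k]) - int(gray[i]);
            sal[k] += (d < 0 ? -d : d);
        }
    return sal;
}

// Histogram version: one pass to count gray levels, a 256 x 256 table
// computation, then one lookup per pixel. O(N + 256 * 256).
std::vector<long long> LcHistogram(const std::vector<std::uint8_t> &gray)
{
    assert(!gray.empty());
    long long hist[256] = {0};
    for (std::uint8_t g : gray)
        hist[g]++;

    long long dist[256];
    for (int a = 0; a < 256; a++)
    {
        long long sum = 0;
        for (int b = 0; b < 256; b++)
            sum += (long long)(a > b ? a - b : b - a) * hist[b];
        dist[a] = sum;                      // saliency shared by every pixel with gray level a
    }

    std::vector<long long> sal(gray.size());
    for (std::size_t k = 0; k < gray.size(); k++)
        sal[k] = dist[gray[k]];
    return sal;
}
```

Both functions return the same values; the second only avoids repeating the same sum for pixels that share a gray level. The sketch uses 64-bit accumulators so that large images cannot overflow the sum.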

Code for this algorithm is included in the code that accompanies the HC paper; here is my own implementation:

extern void Normalize(float *DistMap, unsigned char *SaliencyMap, int Width, int Height, int Stride, int Method = 0);

/// <summary>
/// Purpose:   image saliency detection based on the SPATIAL ATTENTION MODEL
///    Reference: Visual Attention Detection in Video Sequences Using Spatiotemporal Cues. Yun Zhai and Mubarak Shah.  Pages 4-5.
///    Date:      2014.8.2
/// </summary>
/// <param name="Src">Image data to run detection on; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Scan-line size of the image.</param>
/// <remarks> Statistics are computed on pixel gray values.</remarks>

void __stdcall SalientRegionDetectionBasedonLC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, Index, CurIndex ,Value;
    unsigned char *Gray = (unsigned char*)malloc(Width * Height);
    int *Dist = (int *)malloc(256 * sizeof(int));
    int *HistGram = (int *)malloc(256 * sizeof(int));
    float *DistMap = (float *) malloc(Height * Width * sizeof(float));

    memset(HistGram, 0, 256 * sizeof(int));

    for (Y = 0; Y < Height; Y++)
    {
        Index = Y * Stride;
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            Value = (Src[Index] + Src[Index + 1] * 2 + Src[Index + 2]) / 4;        //    keep the gray value so it does not have to be recomputed
            HistGram[Value] ++;
            Gray[CurIndex] = Value;
            Index += 3;
            CurIndex ++;
        }
    }

    for (Y = 0; Y < 256; Y++)
    {
        Value = 0;
        for (X = 0; X < 256; X++) 
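            Value += abs(Y - X) * HistGram[X];                //    formula (9) in the paper; for gray levels the distance is just the absolute difference. This could be made faster, but the cost is small, so it is not worth it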
        Dist[Y] = Value;
    }
    
    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            DistMap[CurIndex] = Dist[Gray[CurIndex]];        //    saliency of every pixel in the image
            CurIndex ++;
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride);    //    normalize the image data

    free(Gray);
    free(Dist);
    free(HistGram);
    free(DistMap);
}

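Results:

The paper does not say whether the processing is done in Lab space; interested readers can also try it in Lab.

2. The HC algorithm

Reference paper: 2011 CVPR, Global Contrast based salient region detection, Ming-Ming Cheng

Code for this paper can be downloaded directly, but you need to ask the author for the archive password. Readers with a pudn account can download it from pudn instead, but that code is written against an old version of OpenCV, needs some configuration of your own before it will run, and it seems that only the first half (the saliency detection part) runs.

The paper proposes two saliency detection algorithms, HC and RC; I only implemented HC here.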

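In essence HC is no different from LC above, but HC takes color into account instead of using only the gray value of each pixel as LC does. Since a color image can contain up to 256*256*256 colors, a scheme built directly on a histogram is no longer practical. In practice, though, a color image never uses that many colors, so the author proposes reducing the number of colors: each of the R, G and B components is mapped to 12 levels, so the mapped image has at most 12*12*12 colors and a much smaller histogram can be built for acceleration. Over-quantization, however, introduces some artifacts into the result, so the author adds a smoothing step. Finally, unlike LC, the author's processing is done in Lab space; because Lab and RGB do not correspond exactly, the quantization itself is still done in RGB space.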
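As a rough illustration of the 12-levels-per-channel idea, the sketch below maps each 8-bit channel to one of 12 uniform bins and combines the three bins into a single histogram index. Uniform binning and the names `QuantizeChannel`, `ColorBin` and `BinHistogram` are assumptions made here; the paper's own code may quantize differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const int kLevels = 12;                       // levels per channel, as described in the text

// Map an 8-bit channel value to a bin in [0, kLevels - 1].
inline int QuantizeChannel(std::uint8_t v)
{
    return v * kLevels / 256;
}

// Combine the three channel bins into one index in [0, 12 * 12 * 12 - 1].
inline int ColorBin(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return (QuantizeChannel(r) * kLevels + QuantizeChannel(g)) * kLevels + QuantizeChannel(b);
}

// Histogram over the quantized colors of a tightly packed RGB buffer.
std::vector<int> BinHistogram(const std::vector<std::uint8_t> &rgb)
{
    assert(rgb.size() % 3 == 0);
    std::vector<int> hist(kLevels * kLevels * kLevels, 0);
    for (std::size_t i = 0; i + 2 < rgb.size(); i += 3)
        hist[ColorBin(rgb[i], rgb[i + 1], rgb[i + 2])]++;
    return hist;
}
```

The histogram has only 1728 entries, so an LC-style pairwise distance table over it stays small regardless of image size.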

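Let us take a quick look at this quantization. For a color image, the effect of reducing the number of values of each RGB component can be previewed directly with Photoshop's Posterize function, as shown below:

Original: 64330 colors in total. Posterized result: 1143 colors in total.

(The images above were saved as JPG, so the color counts you get after downloading and analyzing them will certainly differ.)

For the image above, the difference after quantization does not seem very large, but let us look at another example:

Original: 172373 colors. Result: 1143 colors in total.

Here the difference after conversion is fairly large; these are the artifacts the author mentions.

The author's accompanying code contains an implementation of this algorithm. I only glanced at it and found it rather complicated, so I worked out an approach of my own.

One thing is certain: to speed up processing, the color information in the image must be reduced, but I need to control how far it is reduced. That led me to something I am best at: bit-depth processing. In my Imageshop, a 24-bit true-color image can be reduced to an 8-bit indexed image with as little visual loss as possible. That is my approach here, except that the bit depth is not actually reduced.

The first step is to find the most representative colors in the color image, which can be done with an octree, by taking the high 4 bits of each channel, or in other ways. Second, the quantization must use some dithering technique, such as ordered dither or Floyd-Steinberg error diffusion. Going further, you can go beyond the 8-bit index and use a palette with more than 256 entries; 1024 or 4096 are both possible, at the cost of somewhat more computation and more complex code. I use 256 colors. The quantization result is shown below:

Original: 172373 colors. Result: 256 colors in total.

You can see that the 256-color result looks much better than the 1143-color posterized result above.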
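To show what the dithering step does, here is a simplified single-channel sketch of Floyd-Steinberg error diffusion onto evenly spaced gray levels. It is only an analogue of the palette-based color routine used in this post (whose source is not shown); the function name `DitherFloydSteinberg`, the even spacing of levels and the handling of borders are all assumptions of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Quantize an 8-bit gray image (w * h, tightly packed) to `levels` evenly
// spaced values, pushing each pixel's rounding error onto its unprocessed
// neighbors with the classic 7/16, 3/16, 5/16, 1/16 weights.
std::vector<std::uint8_t> DitherFloydSteinberg(const std::vector<std::uint8_t> &gray, int w, int h, int levels)
{
    assert(levels >= 2 && w > 0 && h > 0);
    assert(gray.size() == std::size_t(w) * h);

    std::vector<float> buf(gray.begin(), gray.end());   // working copy that accumulates diffused error
    std::vector<std::uint8_t> out(gray.size());
    const float step = 255.0f / (levels - 1);

    for (int y = 0; y < h; y++)
    {
        for (int x = 0; x < w; x++)
        {
            std::size_t i = std::size_t(y) * w + x;
            float oldValue = buf[i];
            int q = int(oldValue / step + 0.5f);
            q = std::min(std::max(q, 0), levels - 1);    // nearest available level
            float newValue = q * step;
            out[i] = std::uint8_t(newValue + 0.5f);

            float err = oldValue - newValue;
            if (x + 1 < w) buf[i + 1] += err * 7.0f / 16.0f;
            if (y + 1 < h)
            {
                if (x > 0)     buf[i + w - 1] += err * 3.0f / 16.0f;
                buf[i + w] += err * 5.0f / 16.0f;
                if (x + 1 < w) buf[i + w + 1] += err * 1.0f / 16.0f;
            }
        }
    }
    return out;
}
```

With only two levels, a flat mid-gray image comes out as a mix of black and white pixels rather than a single flat tone, which is why a dithered 256-color image can look better than a posterized one with more colors.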

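In terms of speed, building the palette with an octree is fairly time-consuming. One remedy is to build it from a small version of the original image; generally, the palette obtained from a 256*256 thumbnail is essentially the same as the one obtained from the original. The thumbnail is best produced with nearest-neighbor interpolation: first, it is fast; second, it does not create new colors.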
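A minimal sketch of such a nearest-neighbor downsample, assuming a tightly packed 24-bit buffer with no row padding. `ResampleNearest` is a name invented for this illustration; it is not the `Resample` routine called in the code below, whose source is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Nearest-neighbor resample of a packed 24-bit image (3 bytes per pixel).
// Every output pixel is a copy of one source pixel, so no new colors appear.
std::vector<std::uint8_t> ResampleNearest(const std::vector<std::uint8_t> &src, int srcW, int srcH, int dstW, int dstH)
{
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == std::size_t(srcW) * srcH * 3);

    std::vector<std::uint8_t> dst(std::size_t(dstW) * dstH * 3);
    for (int y = 0; y < dstH; y++)
    {
        int sy = int((long long)y * srcH / dstH);        // source row for this output row
        for (int x = 0; x < dstW; x++)
        {
            int sx = int((long long)x * srcW / dstW);    // source column for this output column
            const std::uint8_t *p = &src[(std::size_t(sy) * srcW + sx) * 3];
            std::uint8_t *q = &dst[(std::size_t(y) * dstW + x) * 3];
            q[0] = p[0];
            q[1] = p[1];
            q[2] = p[2];
        }
    }
    return dst;
}
```

By contrast, bilinear or bicubic interpolation blends neighboring pixels and can therefore produce colors that never occur in the original, which would then find their way into the palette.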

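Finally, since the processing still causes some visual loss and artifacts, my algorithm also applies a Gaussian blur with a radius of about 1 to the saliency map at the end.

Part of the code: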

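/// <summary>
/// Purpose:   image saliency detection based on global contrast
///    Reference: 2011 CVPR Global Contrast based salient region detection  Ming-Ming Cheng
///               http://mmcheng.net/salobj/
///    Date:      2014.8.3
/// </summary>
/// <param name="Src">Image data to run detection on; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Scan-line size of the image.</param>
///    <remarks> Processed in Lab space using an integer Lab conversion. Dithering reduces the total number of colors to 256, a saliency lookup table is then computed from the histogram, and a final Gaussian blur reduces the graininess left by quantization.</remarks>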

void __stdcall SalientRegionDetectionBasedonHC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, XX, YY, Index, Fast, CurIndex;
    int FitX, FitY, FitWidth, FitHeight;
    float Value;
    unsigned char *Lab = (unsigned char *) malloc(Height * Stride);
    unsigned char *Mask = (unsigned char *) malloc(Height * Width);
    float *DistMap = (float *) malloc(Height * Width * sizeof(float));
    float *Dist = (float *)malloc(256 * sizeof(float));
    int *HistGram = (int *)malloc(256 * sizeof(int));

    GetBestFitInfoEx(Width, Height, 256, 256, FitX, FitY, FitWidth, FitHeight);
    unsigned char *Sample = (unsigned char *) malloc(FitWidth * FitHeight * 3);

    InitRGBLAB();
    for (Y = 0; Y < Height; Y++)
        RGBToLAB(Src + Y * Stride, Lab + Y * Stride, Width);

    Resample (Lab, Width, Height, Stride, Sample, FitWidth, FitHeight, FitWidth * 3, 0);    //    nearest-neighbor interpolation

    RGBQUAD *Palette = ( RGBQUAD *)malloc( 256 * sizeof(RGBQUAD));
    
    GetOptimalPalette(Sample, FitWidth, FitHeight, FitWidth * 3, 256, Palette);

    ErrorDiffusionFloydSteinberg(Lab, Mask, Width, Height, Stride, Palette, true);            //    first quantize the image to a small set of colors, here 256

    memset(HistGram, 0, 256 * sizeof(int));

    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            HistGram[Mask[CurIndex]] ++;
            CurIndex ++;
        }
    }

    for (Y = 0; Y < 256; Y++)                                // compute saliency in the same way as LC
    {
        Value = 0;
        for (X = 0; X < 256; X++) 
            Value += sqrt((Palette[Y].rgbBlue - Palette[X].rgbBlue)*(Palette[Y].rgbBlue - Palette[X].rgbBlue) + (Palette[Y].rgbGreen- Palette[X].rgbGreen)*(Palette[Y].rgbGreen - Palette[X].rgbGreen) + (Palette[Y].rgbRed- Palette[X].rgbRed)*(Palette[Y].rgbRed - Palette[X].rgbRed)+ 0.0 )  * HistGram[X];
        Dist[Y] = Value;
    }

    for (Y = 0; Y < Height; Y++)
    {
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)
        {
            DistMap[CurIndex] = Dist[Mask[CurIndex]];
            CurIndex ++;
        }
    }

    Normalize(DistMap, SaliencyMap, Width, Height, Stride);                //    normalize the image data

    GuassBlur(SaliencyMap, Width, Height, Stride, 1);                    //    final blur to remove banding
    
    free(Dist);
    free(HistGram);
    free(Lab);
    free(Palette);
    free(Mask);
    free(DistMap);
    free(Sample);
    FreeRGBLAB();
}

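This approach is thousands of times faster than a direct brute-force implementation and also somewhat faster than the original author's code, with essentially the same results.

Original image; HC result, 20 ms; direct implementation: 150000 ms; original author's result

My HC result differs from the original author's. I have not read the code carefully; my initial suspicion is that the Lab-space handling differs, or possibly that the algorithm for quantizing the floating-point values to [0,255] differs.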
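For reference, one common way to do that last quantization step is a min-max stretch. The sketch below is only an assumption about what such a step might look like: the `Normalize` used in this post is declared `extern` and its body is not shown, so `NormalizeMinMax` is a hypothetical stand-in that works on a flat array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Linearly stretch a float distance map so that its minimum maps to 0 and
// its maximum maps to 255, rounding to the nearest integer.
std::vector<std::uint8_t> NormalizeMinMax(const std::vector<float> &dist)
{
    assert(!dist.empty());
    auto mm = std::minmax_element(dist.begin(), dist.end());
    const float lo = *mm.first;
    const float hi = *mm.second;

    std::vector<std::uint8_t> out(dist.size(), 0);
    if (hi <= lo)
        return out;                                   // flat map: nothing is more salient than anything else

    const float scale = 255.0f / (hi - lo);
    for (std::size_t i = 0; i < dist.size(); i++)
        out[i] = std::uint8_t((dist[i] - lo) * scale + 0.5f);
    return out;
}
```

Other choices, such as dividing by the maximum only or clipping outliers before stretching, give visibly different maps from the same distances, which is one way two implementations of the same method can end up looking different.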

3. The AC algorithm

Reference paper: Salient Region Detection and Segmentation. Radhakrishna Achanta, Francisco Estrada, Patricia Wils, and Sabine Süsstrunk. 2008, pages 4-5

The idea of the algorithm can be expressed in one sentence from the paper:

saliency is determined as the local contrast of an image region with respect to its neighborhood at various scales.

In terms of implementation, it is expressed by this formula:

and:

It is really very simple: the saliency computed at several blur scales is summed to obtain the final saliency. On the theoretical analysis of this algorithm, the FT paper has this passage:

Objects that are smaller than a filter size are detected completely, while objects larger than a filter size are only partially detected (closer to edges). Smaller objects that are well detected by the smallest filter are detected by all three filters, while larger objects are only detected by the larger filters. Since the final saliency map is an average of the three feature maps (corresponding to detections of the three filters), small objects will almost always be better highlighted.
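Written out, the multi-scale local contrast described above amounts to something like the following. This is a sketch in my own notation, inferred from the description and from the implementation that follows; it may not match the paper symbol for symbol.

```latex
c_{i,j} = D\!\left[\, \frac{1}{N_1}\sum_{p=1}^{N_1} v_p ,\;\; \frac{1}{N_2}\sum_{q=1}^{N_2} v_q \,\right],
\qquad
m_{i,j} = \sum_{S} c_{i,j}
```

Here $v_p$ are the Lab vectors of the $N_1$ pixels in the inner region R1 around pixel $(i,j)$, $v_q$ those of the $N_2$ pixels in the outer region R2, $D$ is the Euclidean distance, and the sum runs over the scales $S$ of R2.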

Coding this algorithm is also very simple:

/// <summary>
/// Purpose:   saliency is determined as the local contrast of an image region with respect to its neighborhood at various scales
/// Reference: Salient Region Detection and Segmentation   Radhakrishna Achanta, Francisco Estrada, Patricia Wils, and Sabine Süsstrunk   2008  , Pages 4-5
///    Date:      2014.8.2
/// </summary>
/// <param name="Src">Image data to run detection on; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Scan-line size of the image.</param>
/// <param name="R1">Inner region's radius R1.</param>
/// <param name="MinR2">Outer region's minimum radius.</param>
/// <param name="MaxR2">Outer region's maximum radius.</param>
/// <param name="Scale">Number of scales for the outer region.</param>
///    <remarks> Pixel saliency is obtained by summing the local contrast at different scales.</remarks>

void __stdcall SalientRegionDetectionBasedonAC(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride, int R1, int MinR2, int MaxR2, int Scale)
{
    int X, Y, Z, Index, CurIndex;
    unsigned char *MeanR1 =(unsigned char *)malloc( Height * Stride);
    unsigned char *MeanR2 =(unsigned char *)malloc( Height * Stride);
    unsigned char *Lab = (unsigned char *) malloc(Height * Stride);
    float *DistMap = (float *)malloc(Height * Width * sizeof(float));

    InitRGBLAB();    
    for (Y = 0; Y < Height; Y++) 
        RGBToLAB(Src + Y * Stride, Lab + Y * Stride, Width);                    //    note: this is also done in Lab space

    memcpy(MeanR1, Lab, Height * Stride);
    if (R1 > 0)                                                                    //    R1 == 0 means the original pixels are used
        BoxBlur(MeanR1, Width, Height, Stride, R1);

    memset(DistMap, 0, Height * Width * sizeof(float));

    for (Z = 0; Z < Scale; Z++)
    {
        memcpy(MeanR2, Lab, Height * Stride);
        BoxBlur(MeanR2, Width, Height, Stride, (Scale > 1 ? (MaxR2 - MinR2) * Z / (Scale - 1) : 0) + MinR2);    //    guard against division by zero when Scale == 1
        for (Y = 0; Y < Height; Y++) 
        {
            Index = Y * Stride;
            CurIndex = Y * Width;
            for (X = 0; X < Width; X++)                    //    compute the saliency of every pixel in the image
            {
                DistMap[CurIndex] += sqrt( (MeanR2[Index] - MeanR1[Index]) * (MeanR2[Index] - MeanR1[Index]) + (MeanR2[Index + 1] - MeanR1[Index + 1]) * (MeanR2[Index + 1] - MeanR1[Index + 1]) + (MeanR2[Index + 2] - MeanR1[Index + 2]) * (MeanR2[Index + 2] - MeanR1[Index + 2]) + 0.0) ;
                CurIndex++;
                Index += 3;
            }
        }
    }
    
    Normalize(DistMap, SaliencyMap, Width, Height, Stride, 0);        //    normalize the image data

    free(MeanR1);
    free(MeanR2);
    free(DistMap);
    free(Lab);
    FreeRGBLAB();
}

The core is just a box blur; note that it too is done in Lab space.

All the results above were obtained with R1 = 0, MinR2 = Min(Width, Height) / 8, MaxR2 = Min(Width, Height) / 2, Scale = 3.
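With these settings, the outer radius used at each scale follows from the linear spacing in the loop of the AC function. A small sketch of that spacing (the helper name `OuterRadii` is made up here; it also handles Scale == 1, where the expression `(MaxR2 - MinR2) * Z / (Scale - 1)` would divide by zero):

```cpp
#include <cassert>
#include <vector>

// Radii of the outer region R2 for each scale, linearly spaced from MinR2
// to MaxR2 using integer arithmetic, as in the AC loop.
std::vector<int> OuterRadii(int MinR2, int MaxR2, int Scale)
{
    assert(Scale >= 1 && MinR2 <= MaxR2);
    std::vector<int> radii;
    for (int Z = 0; Z < Scale; Z++)
    {
        int offset = (Scale > 1) ? (MaxR2 - MinR2) * Z / (Scale - 1) : 0;
        radii.push_back(MinR2 + offset);
    }
    return radii;
}
```

For a 640x480 image, MinR2 = 480 / 8 = 60 and MaxR2 = 480 / 2 = 240, so with Scale = 3 the three box-blur radii are 60, 150 and 240.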

4. The FT algorithm

Reference paper: Frequency-tuned Salient Region Detection, Radhakrishna Achanta, pages 4-5, 2009 CVPR

The paper sets out the following five requirements for saliency detection:

1. Emphasize the largest salient objects.

2. Uniformly highlight whole salient regions.

3. Establish well-defined boundaries of salient objects.

4. Disregard high frequencies arising from texture, noise and blocking artifacts.

5. Efficiently output full resolution saliency maps.

The saliency computation it finally proposes is also very simple:

$$S(x, y) = \| I_\mu - I_{\omega_{hc}}(x, y) \|$$

where $I_\mu$ is the mean image feature vector, $I_{\omega_{hc}}(x, y)$ is the corresponding image pixel vector value in the Gaussian blurred version (using a 5×5 separable binomial kernel) of the original image, and $\| \cdot \|$ is the L2 norm.

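How this formula relates to the five points above is explained quite clearly in the paper. As for why the first term is the mean, my intuition is that when the radius of a Gaussian blur goes to infinity, the result is simply the mean of the image.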
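The 5×5 separable binomial kernel mentioned in the paper's text can be sketched as two passes of the 1-D weights [1, 4, 6, 4, 1] / 16. The following is an illustrative single-channel version with border clamping; the function names and the border handling are assumptions of this sketch, and it is not the `GuassBlurF` used in the code below.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// 1-D binomial filter with weights [1, 4, 6, 4, 1] / 16; samples outside the
// signal are replaced by the nearest border sample.
std::vector<float> Binomial5(const std::vector<float> &in)
{
    assert(!in.empty());
    static const float k[5] = {1 / 16.0f, 4 / 16.0f, 6 / 16.0f, 4 / 16.0f, 1 / 16.0f};
    const int n = int(in.size());
    std::vector<float> out(in.size());
    for (int i = 0; i < n; i++)
    {
        float sum = 0;
        for (int t = -2; t <= 2; t++)
        {
            int j = std::min(std::max(i + t, 0), n - 1);
            sum += k[t + 2] * in[j];
        }
        out[i] = sum;
    }
    return out;
}

// Separable 5x5 blur of one w * h channel: filter every row, then every column.
std::vector<float> Binomial5x5(const std::vector<float> &img, int w, int h)
{
    assert(w > 0 && h > 0 && img.size() == std::size_t(w) * h);
    std::vector<float> tmp(img.size()), out(img.size());
    for (int y = 0; y < h; y++)
    {
        std::vector<float> row(img.begin() + y * w, img.begin() + (y + 1) * w);
        row = Binomial5(row);
        std::copy(row.begin(), row.end(), tmp.begin() + y * w);
    }
    for (int x = 0; x < w; x++)
    {
        std::vector<float> col(h);
        for (int y = 0; y < h; y++) col[y] = tmp[y * w + x];
        col = Binomial5(col);
        for (int y = 0; y < h; y++) out[y * w + x] = col[y];
    }
    return out;
}
```

Because the weights sum to 1, a constant image is left unchanged; only fine detail such as texture and noise is attenuated, which is the role of the blurred term in the formula.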

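The authors of this paper provide both MATLAB (M) code and VC code, but the M code does not actually correspond to the VC code: the M code contains an error, in that it computes the mean over the wrong data.

I tried implementing this with my optimized integer Lab conversion, but for some images the result differed considerably from the original author's, so in the end I used the floating-point RGBTOLAB from the author's code.

The relevant reference code: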

/// <summary>
/// Purpose:   image saliency detection based on Frequency-tuned
///    Reference: Frequency-tuned Salient Region Detection, Radhakrishna Achanta, Pages 4-5, 2009 CVPR
///               http://ivrgwww.epfl.ch/supplementary_material/RK_CVPR09/
///    Date:      2014.8.2
/// </summary>
/// <param name="Src">Image data to run detection on; only 24-bit images are supported.</param>
/// <param name="SaliencyMap">Output saliency image, also 24-bit.</param>
/// <param name="Width">Width of the input image.</param>
/// <param name="Height">Height of the input image.</param>
/// <param name="Stride">Scan-line size of the image.</param>
///    <remarks> Processed in Lab space, but the library's integer RGBLAB color functions cannot be used; the original floating-point version is required. Otherwise many results are not distinct, for reasons unknown.</remarks>

void __stdcall SalientRegionDetectionBasedOnFT(unsigned char *Src, unsigned char *SaliencyMap, int Width, int Height, int Stride)
{
    int X, Y, XX, YY, Index, Fast, CurIndex, SrcB, SrcG, SrcR, DstB, DstG, DstR;
    float *Lab = (float *) malloc(Height * Stride * sizeof(float));
    float *DistMap = (float *) malloc(Height * Width * sizeof(float));
    float MeanL = 0, MeanA = 0, MeanB = 0;
    
    for (Y = 0; Y < Height; Y++) 
        RGBToLABF(Src + Y * Stride, Lab + Y * Stride, Width);                //    floating-point conversion
    
    for (Y = 0; Y < Height; Y++) 
    {
        Index = Y * Stride;
        for (X = 0; X < Width; X++)
        {
            MeanL +=  Lab[Index];
            MeanA +=  Lab[Index + 1];
            MeanB +=  Lab[Index + 2];
            Index += 3;
        }
    }
    MeanL /= (Width * Height);                                            //    mean values in Lab space
    MeanA /= (Width * Height);
    MeanB /= (Width * Height);

    GuassBlurF(Lab, Width, Height, Stride, 1);                            //    use Gaussian blur to eliminate fine texture details as well as noise and coding artifacts

    for (Y = 0; Y < Height; Y++)                                        //    the blur part of the MATLAB code on the website is incorrect
    {
        Index = Y * Stride;
        CurIndex = Y * Width;
        for (X = 0; X < Width; X++)                                        //    compute the saliency of each pixel
        {
            DistMap[CurIndex++] = (MeanL - Lab[Index]) *  (MeanL - Lab[Index]) +  (MeanA - Lab[Index + 1]) *  (MeanA - Lab[Index + 1]) +  (MeanB - Lab[Index + 2]) *  (MeanB - Lab[Index + 2])   ;
            Index += 3;
        }
    }
    
    Normalize(DistMap, SaliencyMap, Width, Height, Stride);                //    normalize the image data

    free(Lab);
    free(DistMap);

}

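The detection results are shown below:

5. Overall comparison of the four algorithms

From some test images, I would say that of the four algorithms FT gives the most pronounced results. For example:

Original image FT(50ms) AC(25ms)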

LC(2ms) AC(23ms)

Only FT detected the leaf.

Original image FT AC

LC AC

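6. Next steps

The saliency methods I studied here are all simple, basic algorithms that are easy to implement. There are many papers with impressive results but more complex algorithms; I will look at them when I have the time or the ability. Saliency analysis is only the first step of many other kinds of processing; with this groundwork in place, I would also like to look at follow-on applications such as segmentation or content-aware resizing.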

http://files.cnblogs.com/Imageshop/salientregiondetection.rar

I put together a test set.