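A Comparison of Signal Framing Methods

1. Background

  When a time-domain signal is very long, we usually need to cut it into short segments (frames) and process them one at a time, for example in the short-time Fourier transform (STFT) or the wavelet transform.

  To make the transition between frames smooth, each frame usually shares a stretch of samples with the frame that follows it. This shared part is called the overlap.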

  We also ran into this kind of framing in an earlier post (how to invert an acoustic spectrogram back into a time-domain speech signal).

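2. Implementation (mainly Python code)

  Whichever method is used, we first need a few basic quantities:

  Suppose we have a signal sigData with total length sigLen, each frame holds blkSize samples, and the overlap ratio is Overlap.

  stepSize: the number of samples we advance from one frame to the next, stepSize = int( blkSize*(1-Overlap) ). It must be at least 1.

  frameNumSize: the total number of frames, frameNumSize = 1 + floor( (sigLen - blkSize) / stepSize )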
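
  The two formulas can be checked with a few lines of Python. This is only a minimal sketch: the helper name frame_params is hypothetical and does not appear in the code of this post.

```python
import math

def frame_params(sig_len, blk_size, overlap):
    """Return (step_size, frame_count) for the framing scheme described above."""
    # Hop between consecutive frame starts, clamped to at least 1 sample.
    step_size = max(1, int(math.floor(blk_size * (1 - overlap))))
    # Only frames that fit entirely inside the signal are counted.
    frame_count = 1 + (sig_len - blk_size) // step_size
    return step_size, frame_count

step_size, frame_count = frame_params(sig_len=20, blk_size=7, overlap=0.3)
print(step_size, frame_count)  # 4 4
```

  With 20 samples, a frame size of 7 and an overlap of 0.3, the hop is 4 and there are 4 frames. The frames cover samples 0 to 18, so the final sample is dropped because a fifth full frame does not fit.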

 

  2.1 Loop-based extraction

#%% method 1
import numpy as np
def cut_to_sigBlks_test1(sigData,blkSize,Overlap):

    # An Overlap greater than 1 is treated as a percentage (e.g. 30 -> 0.3)
    if Overlap > 1:
        Overlap = Overlap/100

    # 1. Step between frame start indices; because of the overlap, stepSize <= blkSize
    sigLen = np.size(sigData)
    stepSize = int( np.floor(blkSize*(1-Overlap)) )

    if stepSize < 1:
        stepSize = 1

    frameNumSize = int( ((sigLen-blkSize)//stepSize) +1)  # total number of frames

    # 2. Copy the frames one at a time in a loop
    sigBlks = np.zeros((frameNumSize,blkSize),dtype= sigData.dtype)
    for i in np.arange(frameNumSize):
        sigBlks[i,:] = sigData[i*stepSize:i*stepSize+blkSize]
    return sigBlks

#%% Test
sigData = np.arange(20)
blkSize = 7
Overlap = 0.3
sigBlks = cut_to_sigBlks_test1(sigData,blkSize,Overlap)

print('sigData: \n',sigData)
print('sigBlks: \n',sigBlks)

  Output:

  sigData:
  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
  sigBlks:
  [[ 0  1  2  3  4  5  6]
   [ 4  5  6  7  8  9 10]
   [ 8  9 10 11 12 13 14]
   [12 13 14 15 16 17 18]]

 

  2.2 Index-array extraction

  

#%% method 2
import numpy as np
def cut_to_sigBlks_test2(sigData,blkSize,Overlap):

    # An Overlap greater than 1 is treated as a percentage (e.g. 30 -> 0.3),
    # the same as in method 1
    if Overlap > 1:
        Overlap = Overlap/100

    # 1. Step between frame start indices; because of the overlap, stepSize <= blkSize
    sigLen = np.size(sigData)
    stepSize = int( np.floor(blkSize*(1-Overlap)) )

    if stepSize < 1:
        stepSize = 1

    frameNumSize = int( ((sigLen-blkSize)//stepSize) +1)  # total number of frames

    # 2. method 2: build an index array [vectorized]

    # The index array has frameNumSize rows and blkSize columns
    # Start index of every frame, spaced stepSize apart (this accounts for the overlap)
    startIdxArry = np.arange(0,stepSize*frameNumSize,stepSize)
    # Index array for all frames, one frame per row
    idxArray = np.tile(np.r_[0:blkSize],(frameNumSize,1)) + startIdxArry[:,np.newaxis]
    sigBlks = sigData[idxArray]
    return sigBlks
#%% Test

sigData = np.arange(20)
blkSize = 7
Overlap = 0.3
# sigBlks = cut_to_sigBlks_test1(sigData,blkSize,Overlap)
sigBlks = cut_to_sigBlks_test2(sigData,blkSize,Overlap)

print('sigData: \n',sigData)
print('sigBlks: \n',sigBlks)

  Output:

  sigData:
  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
  sigBlks:
  [[ 0  1  2  3  4  5  6]
   [ 4  5  6  7  8  9 10]
   [ 8  9 10 11 12 13 14]
   [12 13 14 15 16 17 18]]
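
  The index array in method 2 can also be built with plain broadcasting, without np.tile. The following is a minimal sketch of that variant, not the code used in this post; the variable names (starts, offsets, idx) are chosen here for illustration.

```python
import numpy as np

sig = np.arange(20)
blk_size, step_size = 7, 4
frame_count = 1 + (sig.size - blk_size) // step_size

# Column of frame start indices, shape (frame_count, 1).
starts = step_size * np.arange(frame_count)[:, np.newaxis]
# Row of within-frame offsets, shape (blk_size,).
offsets = np.arange(blk_size)
# Broadcasting the sum yields a (frame_count, blk_size) index array.
idx = starts + offsets
frames = sig[idx]  # fancy indexing copies the selected samples
print(frames.shape)  # (4, 7)
```

  Each row of idx holds the sample positions of one frame, so sig[idx] picks out all frames in a single indexing operation.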

 

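  2.3 Using NumPy's as_strided. This is equivalent to indexing, but it is done by a built-in NumPy function; the code below first makes sure the data is stored contiguously in memory. A stride plays the role of the step above, except that it is measured in bytes.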

   

#%% method 3
import numpy as np
def cut_to_sigBlks_test3(sigData,blkSize,Overlap,axis=0):

    # An Overlap greater than 1 is treated as a percentage (e.g. 30 -> 0.3),
    # the same as in method 1
    if Overlap > 1:
        Overlap = Overlap/100

    # 1. Step between frame start indices; because of the overlap, stepSize <= blkSize
    sigLen = np.size(sigData)
    stepSize = int( np.floor(blkSize*(1-Overlap)) )

    if stepSize < 1:
        stepSize = 1

    frameNumSize = int( ((sigLen-blkSize)//stepSize) +1)  # total number of frames

    # 2. method 3: build a strided view [vectorized]
    sigData = np.ascontiguousarray(sigData) # make sure the data is contiguous in memory

    strides = np.asarray(sigData.strides)
    new_stride = np.prod(strides[strides > 0] // sigData.itemsize) * sigData.itemsize
    if axis == -1:   # frames stored as columns
        shape = list(sigData.shape)[:-1] + [blkSize, frameNumSize]
        strides = list(strides) + [stepSize * new_stride]
    elif axis == 0:  # frames stored as rows (default)
        shape = [frameNumSize, blkSize] + list(sigData.shape)[1:]
        strides = [stepSize * new_stride] + list(strides)
    else:
        raise ValueError('axis must be 0 or -1')

    # The result is a view that shares memory with sigData; no samples are copied
    sigBlks = np.lib.stride_tricks.as_strided(sigData, shape=shape, strides=strides)

    return sigBlks

#%% Test

sigData = np.arange(20)
blkSize = 7
Overlap = 0.3
# sigBlks = cut_to_sigBlks_test1(sigData,blkSize,Overlap)
# sigBlks = cut_to_sigBlks_test2(sigData,blkSize,Overlap)
sigBlks = cut_to_sigBlks_test3(sigData,blkSize,Overlap)
print('sigData: \n',sigData)
print('sigBlks: \n',sigBlks)

  Output:

  sigData:
  [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
  sigBlks:
  [[ 0  1  2  3  4  5  6]
   [ 4  5  6  7  8  9 10]
   [ 8  9 10 11 12 13 14]
   [12 13 14 15 16 17 18]]
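
  For a 1-D signal, a similar strided view can be obtained without computing strides by hand. The sketch below assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available; it is an alternative shown for comparison, not the method used in this post.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

sig = np.arange(20)
blk_size, step_size = 7, 4

# All windows with a hop of 1, shape (sig.size - blk_size + 1, blk_size).
all_windows = sliding_window_view(sig, blk_size)
# Keep every step_size-th window to get the desired hop.
frames = all_windows[::step_size]
print(frames.shape)  # (4, 7)
```

  The result is a read-only view by default, which guards against accidentally writing into samples that are shared between overlapping frames.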

 

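3. Comparing the time efficiency of the three methods

  In terms of style, each method refines the previous one: the first uses an easy-to-follow loop, the second is vectorized, and the third is also vectorized while additionally using NumPy's as_strided.

  The third method has by far the shortest running time, because as_strided only builds a view on the existing data and copies no samples.

  We create a signal of 1000000 samples, frame it in blocks of 1024 samples with overlap = 0.3, and run each method 100 times. The total times are: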

#%% Test cost time
import time
import numpy as np
sigData = np.arange(1000000, dtype='float64')
blkSize = 1024
Overlap = 0.3

# Each method is called 100 times; the printed value is the total time in seconds
st = time.time()
for i in range(100):
    sigBlks1 = cut_to_sigBlks_test1(sigData,blkSize,Overlap)
et = time.time()
print('cut_to_sigBlks_test1:',et-st)

st = time.time()
for i in range(100):
    sigBlks2 = cut_to_sigBlks_test2(sigData,blkSize,Overlap)
et = time.time()
print('cut_to_sigBlks_test2:',et-st)

st = time.time()
for i in range(100):
    sigBlks3 = cut_to_sigBlks_test3(sigData,blkSize,Overlap)
et = time.time()
print('cut_to_sigBlks_test3:',et-st)

 

cut_to_sigBlks_test1: 1.0691425800323486
cut_to_sigBlks_test2: 1.8650140762329102
cut_to_sigBlks_test3: 0.003989458084106445

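So the running times rank as method 3 < method 1 < method 2.

I had expected the first method to take longer than the second, so the result came as a surprise. Still, the second one is more elegantly written, haha!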
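
One plausible reading of these timings (an interpretation, not something the measurements prove) is that method 3 is fast because a strided view copies no samples, whereas the index-array approach allocates a new array on every call. The sketch below illustrates only the memory-sharing part of that claim on a small signal; the variable names are chosen here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

sig = np.arange(20, dtype=np.float64)
blk_size, step_size = 7, 4
frame_count = 1 + (sig.size - blk_size) // step_size

# Strided view: only new shape/stride metadata, no samples are copied.
view = as_strided(sig,
                  shape=(frame_count, blk_size),
                  strides=(step_size * sig.itemsize, sig.itemsize),
                  writeable=False)
# Index-array version: fancy indexing allocates a fresh array.
idx = step_size * np.arange(frame_count)[:, None] + np.arange(blk_size)
copied = sig[idx]

print(np.shares_memory(view, sig))    # True
print(np.shares_memory(copied, sig))  # False
print(np.array_equal(view, copied))   # True
```

If independent frames are needed (for example, to modify them in place), calling view.copy() would add back the copying cost, so a like-for-like timing should include that step.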
