Weighted Median

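Problem statement: given an unsorted sequence in which every number has an associated weight, a non-negative integer giving the number of times that number occurs in the sequence, find the median of the sequence.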
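To make the definition concrete, here is a small brute-force sketch that expands every number according to its weight and takes the ordinary median. The name `expanded_median` is made up for illustration, and the sketch assumes the weights are integer counts as stated above. It is only practical when the counts are small, but it is handy as a reference for checking faster solutions.

```python
from statistics import median

def expanded_median(data, counts):
    """Median of the sequence in which data[i] is repeated counts[i] times."""
    # Write out every occurrence explicitly, then take the plain median.
    expanded = [x for x, c in zip(data, counts) for _ in range(c)]
    return median(expanded)

print(expanded_median([30, 40, 50, 60, 35], [1, 3, 5, 4, 2]))  # 50
print(expanded_median([1, 2, 3, 4, 5], [10, 1, 1, 1, 9]))      # 2.5
```

The second call has an even total count (22), so the result is the average of the two middle occurrences, 2 and 3.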

 

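1. Direct approach: sort the sequence together with its weights, then scan for the number at which the cumulative weight reaches half of the total weight. Sorting costs O(n log n) and the scan costs O(n), so the overall time complexity is O(n log n).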

import numpy as np

def weighted_median(data, weights):
    """Return the weighted median of data.

    Args:
        data (list or numpy.array): data
        weights (list or numpy.array): non-negative weights, one per element
    """
    data, weights = np.array(data).squeeze(), np.array(weights).squeeze()
    # Drop zero-weight entries so they can never be averaged into the result.
    data, weights = data[weights > 0], weights[weights > 0]
    # Sort the values, carrying each weight along with its value.
    s_data, s_weights = map(np.array, zip(*sorted(zip(data, weights))))
    midpoint = 0.5 * sum(s_weights)
    if any(weights > midpoint):
        # A single element holds more than half of the total weight.
        w_median = (data[weights == np.max(weights)])[0]
    else:
        cs_weights = np.cumsum(s_weights)
        # Last position whose cumulative weight does not exceed the midpoint.
        idx = np.where(cs_weights <= midpoint)[0][-1]
        if cs_weights[idx] == midpoint:
            # The weight splits exactly in half: average the two neighbours.
            w_median = np.mean(s_data[idx:idx+2])
        else:
            w_median = s_data[idx+1]
    return w_median

def test_weighted_median():
    data = [
        [7, 1, 2, 4, 10],
        [7, 1, 2, 4, 10],
        [7, 1, 2, 4, 10, 15],
        [1, 2, 4, 7, 10, 15],
        [0, 10, 20, 30],
        [1, 2, 3, 4, 5],
        [30, 40, 50, 60, 35],
        [2, 0.6, 1.3, 0.3, 0.3, 1.7, 0.7, 1.7, 0.4],
    ]
    weights = [
        [1, 1/3, 1/3, 1/3, 1],
        [1, 1, 1, 1, 1],
        [1, 1/3, 1/3, 1/3, 1, 1],
        [1/3, 1/3, 1/3, 1, 1, 1],
        [30, 191, 9, 0],
        [10, 1, 1, 1, 9],
        [1, 3, 5, 4, 2],
        [2, 2, 0, 1, 2, 2, 1, 6, 0],
    ]
    answers = [7, 4, 8.5, 8.5, 10, 2.5, 50, 1.7]
    for datum, weight, answer in zip(data, weights, answers):
        assert weighted_median(datum, weight) == answer

if __name__ == "__main__":
    test_weighted_median()

 

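 2. Following the idea of quicksort: pick a pivot, partition the sequence into a left part and a right part around it, and then, based on the total weight of each part, recurse into either the left or the right part.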
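The partition idea can be sketched roughly as follows. This is one possible reading of the approach, not a reference implementation: the name `weighted_median_select`, the random pivot choice, and the iterative loop (instead of explicit recursion) are all assumptions made for illustration. With a random pivot the expected running time is O(n), although the worst case is still O(n^2). The exact-split test uses `==`, so it is only reliable for integer weights, as in the problem statement.

```python
import random

def weighted_median_select(data, weights):
    """Weighted median via quickselect-style partitioning (illustrative sketch)."""
    # Zero-weight entries do not occur in the sequence, so drop them first.
    all_items = [(x, w) for x, w in zip(data, weights) if w > 0]
    if not all_items:
        raise ValueError("need at least one element with positive weight")

    items = all_items
    target = sum(w for _, w in all_items) / 2  # weight still to be covered
    while True:
        pivot = random.choice(items)[0]
        left = [(x, w) for x, w in items if x < pivot]
        right = [(x, w) for x, w in items if x > pivot]
        w_left = sum(w for _, w in left)
        w_pivot = sum(w for x, w in items if x == pivot)
        if target <= w_left:
            # Half of the weight is already reached inside the left part.
            items = left
        elif target <= w_left + w_pivot:
            # The pivot is the first value at which half the weight is reached.
            lower = pivot
            exact_split = (target == w_left + w_pivot)
            break
        else:
            # Discard the left part and the pivot, and continue on the right.
            target -= w_left + w_pivot
            items = right

    if exact_split:
        # Exactly half of the weight lies at or below `lower`: average it
        # with the next larger value, as the sorting solution does.
        upper = min(x for x, _ in all_items if x > lower)
        return (lower + upper) / 2
    return lower
```

For example, `weighted_median_select([1, 2, 3, 4, 5], [10, 1, 1, 1, 9])` returns 2.5, because the values up to 2 carry exactly half of the total weight of 22, so the result is the average of 2 and 3.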

 
