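Problem statement: given an unordered sequence of numbers, each with an associated weight, where the weight is a non-negative integer giving the number of times that number occurs in the sequence, find the median of the sequence.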
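To make the definition concrete, here is a minimal brute-force sketch: expand every number according to its weight and take the ordinary median. The function name `weighted_median_bruteforce` is hypothetical. The sketch assumes integer weights with a positive total, and that an even-length expansion averages its two middle values. It is only meant as a reference for checking faster solutions, since its cost grows with the sum of the weights.

```python
import numpy as np


def weighted_median_bruteforce(data, weights):
    """Reference answer: repeat each value `weight` times, then take the median."""
    # np.repeat drops values whose weight is 0 and copies the others weight times.
    expanded = np.repeat(np.asarray(data), np.asarray(weights, dtype=int))
    # np.median averages the two middle values when the length is even.
    return float(np.median(expanded))


print(weighted_median_bruteforce([30, 40, 50, 60, 35], [1, 3, 5, 4, 2]))  # 50.0
```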
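1. Direct approach: sort the numbers together with their weights, then scan the cumulative weights to find the number at which the running total reaches half of the total weight. The sort costs O(n log n) and the scan O(n), so the overall time complexity is O(n log n).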
    import numpy as np


    def weighted_median(data, weights):
        """Return the weighted median of data.

        Args:
            data (list or numpy.array): data
            weights (list or numpy.array): weights
        """
        data, weights = np.array(data).squeeze(), np.array(weights).squeeze()
        # Drop zero-weight entries so they cannot take part in the tie average below.
        keep = weights > 0
        data, weights = data[keep], weights[keep]
        # Sort the values and carry the weights along.
        order = np.argsort(data, kind="stable")
        s_data, s_weights = data[order], weights[order]
        midpoint = 0.5 * sum(s_weights)
        if any(weights > midpoint):
            # A single value holds more than half of the total weight.
            w_median = (data[weights == np.max(weights)])[0]
        else:
            cs_weights = np.cumsum(s_weights)
            idx = np.where(cs_weights <= midpoint)[0][-1]
            if cs_weights[idx] == midpoint:
                # Exactly half of the weight lies at or below s_data[idx]:
                # average the two middle values.
                w_median = np.mean(s_data[idx:idx+2])
            else:
                w_median = s_data[idx+1]
        return w_median


    def test_weighted_median():
        data = [
            [7, 1, 2, 4, 10],
            [7, 1, 2, 4, 10],
            [7, 1, 2, 4, 10, 15],
            [1, 2, 4, 7, 10, 15],
            [0, 10, 20, 30],
            [1, 2, 3, 4, 5],
            [30, 40, 50, 60, 35],
            [2, 0.6, 1.3, 0.3, 0.3, 1.7, 0.7, 1.7, 0.4],
        ]
        weights = [
            [1, 1/3, 1/3, 1/3, 1],
            [1, 1, 1, 1, 1],
            [1, 1/3, 1/3, 1/3, 1, 1],
            [1/3, 1/3, 1/3, 1, 1, 1],
            [30, 191, 9, 0],
            [10, 1, 1, 1, 9],
            [1, 3, 5, 4, 2],
            [2, 2, 0, 1, 2, 2, 1, 6, 0],
        ]
        answers = [7, 4, 8.5, 8.5, 10, 2.5, 50, 1.7]
        for datum, weight, answer in zip(data, weights, answers):
            assert weighted_median(datum, weight) == answer


    if __name__ == "__main__":
        test_weighted_median()
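2. Quickselect-style approach: following the idea behind quicksort, pick a pivot and partition the sequence into a left and a right part around it; then, based on the total weights of the two parts, recurse into only the left part or only the right part.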
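The partitioning idea above can be sketched as follows. This is one possible implementation, not necessarily the one the author had in mind. The names `weighted_select` and `weighted_median_select` are hypothetical, and the sketch assumes integer weights with a positive total, as in the problem statement. It uses a random pivot and a loop in place of recursion, which gives expected O(n) time but O(n^2) in the worst case. When the total weight is even it averages the two middle values, which costs a second selection pass.

```python
import random


def weighted_select(values, weights, k):
    """Return the element at 0-based position k of the weight-expanded sorted sequence.

    Assumes integer weights and 0 <= k < sum(weights).
    """
    items = [(v, w) for v, w in zip(values, weights) if w > 0]
    while True:
        pivot = random.choice(items)[0]
        left = [(v, w) for v, w in items if v < pivot]
        right = [(v, w) for v, w in items if v > pivot]
        w_left = sum(w for _, w in left)
        w_pivot = sum(w for v, w in items if v == pivot)
        if k < w_left:
            # Position k falls among the values smaller than the pivot.
            items = left
        elif k < w_left + w_pivot:
            # Position k falls inside the block of copies of the pivot.
            return pivot
        else:
            # Skip everything up to and including the pivot, continue on the right.
            k -= w_left + w_pivot
            items = right


def weighted_median_select(values, weights):
    """Weighted median via selection; averages the two middle values on even totals."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    if total % 2 == 1:
        return weighted_select(values, weights, total // 2)
    lo = weighted_select(values, weights, total // 2 - 1)
    hi = weighted_select(values, weights, total // 2)
    return (lo + hi) / 2


print(weighted_median_select([30, 40, 50, 60, 35], [1, 3, 5, 4, 2]))  # 50
print(weighted_median_select([1, 2, 3, 4, 5], [10, 1, 1, 1, 9]))      # 2.5
```

Each pass keeps only the side that contains position k, so the work shrinks geometrically on average, whereas the sorting approach always pays for ordering the whole sequence.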