Given a data stream input of non-negative integers a1, a2, ..., an, ..., summarize the numbers seen so far as a list of disjoint intervals.

For example, suppose the integers from the data stream are 1, 3, 7, 2, 6, ..., then the summary will be:

    [1, 1]
    [1, 1], [3, 3]
    [1, 1], [3, 3], [7, 7]
    [1, 3], [7, 7]
    [1, 3], [6, 7]

**Follow up:** What if there are lots of merges and the number of disjoint intervals are small compared to the data stream's size?
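The problem relies on the notion of disjoint intervals, that is, intervals that do not overlap. When a newly arrived number touches or falls inside the current set of intervals, the affected intervals have to be merged, so that the summary returned at any point always consists of disjoint intervals.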
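Broadly, there are two ways to approach this problem. The first merges any intervals that come into contact every time a number is inserted. The second only records or updates interval bounds on insertion and does not merge the intervals at that point.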
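Each approach has its advantages depending on the shape of the data. The first suits streams with many numbers but few intervals: because there are few intervals, the element shifting caused by each merge is cheap. The second is the opposite: it suits cases with many intervals, and the merge is triggered only when the intervals are read.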
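The second approach is not implemented in this post. As a rough illustration only, one simple way to defer merging is to store the distinct numbers in sorted order and build the intervals when they are requested. The class name `LazySummaryRanges` and the choice of a `TreeSet` are assumptions made for this sketch, not something taken from the original solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch of the "merge on read" idea.
class LazySummaryRanges {
    // Distinct numbers seen so far, kept in ascending order.
    private final TreeSet<Integer> seen = new TreeSet<>();

    // Insertion only records the number: O(log n), no merging.
    public void addNum(int val) {
        seen.add(val);
    }

    // Merging happens here: one ascending pass that glues consecutive numbers together.
    public int[][] getIntervals() {
        List<int[]> merged = new ArrayList<>();
        for (int v : seen) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && last[1] + 1 == v) {
                last[1] = v;                    // v extends the current interval
            } else {
                merged.add(new int[] {v, v});   // v starts a new interval
            }
        }
        return merged.toArray(new int[0][]);
    }
}
```

With this layout `addNum` stays cheap, while every `getIntervals` call pays for a full pass over the stored numbers, so it only makes sense when reads are rare compared to writes.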
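The implementation here uses the first approach. Suppose we maintain a sorted set of intervals and a new number `val` arrives from the stream. The number can relate to the existing intervals in the following ways:

1. `val` already lies inside an existing interval: nothing changes.
2. `val` is exactly one past the right bound of the interval on its left and one before the left bound of the interval on its right: the two intervals merge into one.
3. `val` is adjacent only to the interval on its left: that interval's right bound grows to `val`.
4. `val` is adjacent only to the interval on its right: that interval's left bound shrinks to `val`.
5. `val` touches no interval: a new interval `[val, val]` is inserted at the sorted position.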
The code is as follows:
    import java.util.ArrayList;
    import java.util.List;

    public class SummaryRanges {
        // Disjoint, non-adjacent intervals sorted by left bound.
        private final List<Pair> pairs = new ArrayList<>();

        public void addNum(int val) {
            if (pairs.isEmpty()) {
                pairs.add(new Pair(val, val));
                return;
            }
            // Binary search for the first interval whose left bound is greater than val.
            int left = 0, right = pairs.size() - 1;
            while (left <= right) {
                int mid = left + (right - left) / 2;
                if (pairs.get(mid).left < val) {
                    left = mid + 1;
                } else if (pairs.get(mid).left > val) {
                    right = mid - 1;
                } else {
                    return; // val is the left bound of an existing interval
                }
            }
            if (left == 0) {
                // val is smaller than every left bound.
                if (pairs.get(left).left == val + 1) {
                    pairs.get(left).left = val;
                } else {
                    pairs.add(0, new Pair(val, val));
                }
            } else if (left == pairs.size()) {
                // Only the last interval can contain or touch val.
                if (pairs.get(left - 1).right == val - 1) {
                    pairs.get(left - 1).right = val;
                } else if (pairs.get(left - 1).right < val - 1) {
                    pairs.add(new Pair(val, val));
                }
            } else if (pairs.get(left - 1).right == val - 1 && pairs.get(left).left == val + 1) {
                // val bridges its two neighbours: merge them.
                pairs.get(left - 1).right = pairs.get(left).right;
                pairs.remove(left);
            } else if (pairs.get(left - 1).right == val - 1) {
                pairs.get(left - 1).right = val;
            } else if (pairs.get(left).left == val + 1) {
                pairs.get(left).left = val;
            } else if (pairs.get(left - 1).right < val) {
                pairs.add(left, new Pair(val, val));
            }
        }

        public int[][] getIntervals() {
            int[][] result = new int[pairs.size()][2];
            for (int i = 0; i < pairs.size(); i++) {
                result[i][0] = pairs.get(i).left;
                result[i][1] = pairs.get(i).right;
            }
            return result;
        }

        public static class Pair {
            int left;
            int right;

            public Pair(int left, int right) {
                this.left = left;
                this.right = right;
            }
        }
    }
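The code above introduces a class `Pair` to represent an interval and stores the `Pair` objects in a `List`, sorted in ascending order. Each call to `addNum(val)` uses binary search to locate the nearest interval whose left bound is greater than `val`, and then handles the cases listed earlier one by one. If the stream contains n numbers and there are m intervals at a given moment, the binary search costs O(log m) per call, or O(n log m) over the whole stream. Inserting into or removing from the middle of the `ArrayList` additionally shifts the elements after that position, which is O(m) in the worst case. Because there are so many branches, the code also looks cluttered and is hard to read.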
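`TreeMap` can be used to simplify this. `TreeMap` is a sorted map backed by a red-black tree that keeps its keys in ascending order; here each key is the left bound of an interval and the value is its right bound. Methods such as `lowerKey` and `higherKey` replace most of the hand-written binary search, and each `addNum` call costs O(log m) for m intervals, so processing n numbers takes O(n log m). Only the boundary cases need to be checked. The code is as follows: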
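    import java.util.Map;
    import java.util.TreeMap;

    public class SummaryRanges {
        // Maps the left bound of each interval to its right bound.
        private final TreeMap<Integer, Integer> treeMap = new TreeMap<>();

        public void addNum(int val) {
            if (treeMap.containsKey(val)) {
                return;
            }
            Integer lowerKey = treeMap.lowerKey(val);   // closest left bound below val
            Integer higherKey = treeMap.higherKey(val); // closest left bound above val
            if (lowerKey != null && higherKey != null
                    && treeMap.get(lowerKey) == val - 1 && higherKey == val + 1) {
                // val bridges the two neighbouring intervals: merge them.
                treeMap.put(lowerKey, treeMap.get(higherKey));
                treeMap.remove(higherKey);
            } else if (lowerKey != null && treeMap.get(lowerKey) >= val - 1) {
                // val is already covered by the left interval or extends it by one.
                treeMap.put(lowerKey, Math.max(val, treeMap.get(lowerKey)));
            } else if (higherKey != null && higherKey == val + 1) {
                // val extends the right interval downwards, so its key changes.
                treeMap.put(val, treeMap.get(higherKey));
                treeMap.remove(higherKey);
            } else {
                treeMap.put(val, val);
            }
        }

        public int[][] getIntervals() {
            int[][] result = new int[treeMap.size()][2];
            int index = 0;
            for (Map.Entry<Integer, Integer> entry : treeMap.entrySet()) {
                result[index][0] = entry.getKey();
                result[index][1] = entry.getValue();
                index++; // advance to the next output row
            }
            return result;
        }
    }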
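To check the `TreeMap` idea against the example stream 1, 3, 7, 2, 6 from the problem statement, here is a small self-contained sketch. It is a compact variant written for this walk-through rather than the code above: the class name `SummaryRangesDemo` is hypothetical, and it assumes `floorEntry` is used to find the interval that might already cover `val`.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class: left bound -> right bound, kept disjoint and non-adjacent.
class SummaryRangesDemo {
    private final TreeMap<Integer, Integer> intervals = new TreeMap<>();

    void addNum(int val) {
        // Interval with the greatest left bound <= val, if any.
        Map.Entry<Integer, Integer> floor = intervals.floorEntry(val);
        if (floor != null && floor.getValue() >= val) {
            return; // val is already covered
        }
        int start = val;
        int end = val;
        if (floor != null && floor.getValue() == val - 1) {
            start = floor.getKey(); // glue onto the interval on the left
        }
        Integer next = intervals.higherKey(val);
        if (next != null && next == val + 1) {
            end = intervals.remove(next); // absorb the interval on the right
        }
        intervals.put(start, end);
    }

    int[][] getIntervals() {
        int[][] result = new int[intervals.size()][];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : intervals.entrySet()) {
            result[i++] = new int[] {e.getKey(), e.getValue()};
        }
        return result;
    }

    public static void main(String[] args) {
        SummaryRangesDemo s = new SummaryRangesDemo();
        for (int v : new int[] {1, 3, 7, 2, 6}) {
            s.addNum(v);
            System.out.println(Arrays.deepToString(s.getIntervals()));
        }
        // Prints:
        // [[1, 1]]
        // [[1, 1], [3, 3]]
        // [[1, 1], [3, 3], [7, 7]]
        // [[1, 3], [7, 7]]
        // [[1, 3], [6, 7]]
    }
}
```

Inserting 2 is the interesting step: it is adjacent to both `[1, 1]` and `[3, 3]`, so the two intervals collapse into `[1, 3]`.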