Top K Largest Numbers and Variants: A Summary

Date: 2019-12-10

Statement

These articles are my own technical notes. Please credit the source when reposting:
[1] https://segmentfault.com/u/yzwall
[2] blog.csdn.net/j_dark/java

0 Heap implementation

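Logically, a heap is a complete binary tree; physically it is stored as an array, with nodes numbered from 0 by default.
The constructor public Heap(boolean flag) creates a min-heap when flag is true and a max-heap otherwise. It exposes the following interface:

offer(int x): adds element x to the heap;
poll(): removes and returns the heap top; an empty heap returns Integer.MAX_VALUE;
top(): returns the heap top without removing it; an empty heap returns Integer.MAX_VALUE;
size(): returns the number of elements in the heap.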
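
The array layout described above rests on simple index arithmetic. Here is a minimal sketch, assuming the 0-based node numbering stated earlier; the class HeapIndex and its method names are illustrative and do not appear in the original code:

```java
// Hypothetical helper: index arithmetic for a complete binary tree stored in a 0-based array.
final class HeapIndex {
    private HeapIndex() {}

    // Parent of node i (only meaningful for i > 0).
    static int parent(int i) {
        return (i - 1) / 2;
    }

    // Left child of node i.
    static int left(int i) {
        return 2 * i + 1;
    }

    // Right child of node i.
    static int right(int i) {
        return 2 * i + 2;
    }

    // Returns true if every parent is <= its children, i.e. the array is in min-heap order.
    static boolean isMinHeap(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[parent(i)] > a[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A check such as isMinHeap is handy when testing a hand-written heap: after any sequence of insertions and removals the backing array should still pass it.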

import java.util.ArrayList;

/**
 * Heap implementation
 * @author yzwall
 */
public class Heap {
    private ArrayList<Integer> array;
    
    // true means min-heap, false means max-heap
    private boolean flag;
    
    public Heap(boolean flag) {
        this.array = new ArrayList<Integer>();
        this.flag = flag;
    }
    
    // Checks whether parent value a and child value b satisfy the heap order
    // (equal values count as ordered, which avoids needless swaps)
    private boolean isOrder(int a, int b) {
        if (flag) {
            return a <= b;
        }
        return a >= b;
    }
    
    /**
     * Adds an element, time complexity O(logn)
     * 1. Append the new element at the end of the complete binary tree, i.e. the end of the array
     * 2. Sift the new element up until the heap order holds
     */
    public void offer(int newNum) {
        array.add(newNum);
        int index = array.size() - 1;
        shiftUp(index);
    }
    // Sift-up: checks whether index and its parent satisfy the heap order
    private void shiftUp(int index) {
        while (index != 0) {
            int parent = (index - 1) / 2;
            if (isOrder(array.get(parent), array.get(index))) {
                break;
            }
            // Heap order violated: swap with the parent
            int temp = array.get(index);
            array.set(index, array.get(parent));
            array.set(parent, temp);
            index = parent;
        }
    }
    
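    /**
     * Removes the heap top, time complexity O(logn)
     * 1. Overwrite the heap top with the last element (this deletes the top)
     * 2. Remove the last element from its old position (the node count drops by 1)
     * 3. Sift the new top down, checking the heap order against its children
     */
    public int poll() {
        // Test for emptiness directly, so a stored Integer.MAX_VALUE is not mistaken for "empty"
        if (array.isEmpty()) {
            return Integer.MAX_VALUE;
        }
        int top = array.get(0);
        array.set(0, array.get(array.size() - 1));
        array.remove(array.size() - 1);
        shiftDown(0);
        return top;
    }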
    // Sift-down: checks whether index and its children satisfy the heap order
    private void shiftDown(int index) {
        int bigIndex, leftIndex, rightIndex;
        boolean hasLChild = false, hasRChild = false;
        while (true) {
            leftIndex = 2 * index + 1;
            rightIndex  = 2 * index + 2;
            // A child exists iff its index is inside the array
            if (leftIndex < array.size()) {
                hasLChild = true;
            }
            if (rightIndex < array.size()) {
                hasRChild = true;
            }
            // A leaf satisfies the heap order by default
            if (!hasLChild && !hasRChild) {
                break;
            } else if(hasLChild && hasRChild) {
                // Pick the child that belongs closer to the top (smaller in a min-heap, larger in a max-heap)
                bigIndex = isOrder(array.get(leftIndex), array.get(rightIndex)) ? leftIndex : rightIndex;
            } else if(hasLChild) {
                bigIndex = leftIndex;
            } else {
                bigIndex = rightIndex;
            }
            
            if (isOrder(array.get(index), array.get(bigIndex))) {
                break;
            }
            // Heap order violated: swap index with the chosen child
            int temp = array.get(index);
            array.set(index, array.get(bigIndex));
            array.set(bigIndex, temp);
            index = bigIndex;
            hasLChild = false;
            hasRChild = false;
        }
    }
    // Returns the heap top (Integer.MAX_VALUE if the heap is empty)
    public int top() {
        return array.size() == 0 ? Integer.MAX_VALUE : array.get(0);
    }
    // Returns the heap size
    public int size() {
        return array.size();
    }
}

1 Finding the k largest numbers

lintcode 544 Top K Largest Numbers (Solutions 1 to 3; Solution 4 cannot guarantee an ordered result)

1.1 Solution 1: max-heap, $O(n\log n)$ time

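Add all elements to a max-heap (when the element count n is very large, building the heap is expensive and costs $O(n\log n)$);
Pop the heap top k times, which yields the result in descending order (each pop works on a heap of up to n elements, so this costs $O(k\log n)$).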

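/**
 * Find the k largest numbers in the given array; the result must be in descending order
 * http://www.lintcode.com/zh-cn/problem/top-k-largest-numbers/
 * Solution 1: max-heap for the k largest numbers: push every element, then pop k times.
 * Time complexity O(nlogn); slower than the min-heap solution on large inputs
 * @author yzwall
 */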
class Solution11 {
    public int[] topk(int[] nums, int k) {
        Heap heap = new Heap(false);
        for (int i = 0; i < nums.length; i++) {
            heap.offer(nums[i]);
        }
        int[] topK = new int[k];
        for (int i = 0; i < k; i++) {
            topK[i] = heap.poll();
        }
        return topK;
    }
}

1.2 Solution 2: min-heap, $O(n\log k)$ time

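Because building the heap in the max-heap solution is too expensive, maintain a min-heap that holds at most k elements. Each time, discard the smaller of the incoming element and the heap top, so that the heap always holds the k largest numbers among the data scanned so far.
Solution:

Build the min-heap ($O(n\log k)$);
Pop the heap top k times ($O(k\log k)$);
Reverse the result as the problem requires ($O(k)$).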

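/**
 * Find the k largest numbers in the given array; the result must be in descending order
 * http://www.lintcode.com/zh-cn/problem/top-k-largest-numbers/
 * Solution 2: min-heap for the k largest numbers, time complexity O(nlogk)
 * @author yzwall
 */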
class Solution10 {
    public int[] topk(int[] nums, int k) {
        Heap heap = new Heap(true);
        for (int i = 0; i < nums.length; i++) {
            if (heap.size() < k) {
                heap.offer(nums[i]);
            } else {
                // Replace the smallest of the current k largest numbers
                if (heap.top() < nums[i]) {
                    heap.poll();
                    heap.offer(nums[i]);
                }
            }
        }
        int[] topK = new int[k];
        for (int i = 0; i < k; i++) {
            topK[i] = heap.poll();
        }
        for (int i = 0, j = k - 1; i < j; i++, j--) {
            int temp = topK[i];
            topK[i] = topK[j];
            topK[j] = temp;
        }
        return topK;
    }
}

1.3 Solution 3: priority queue, $O(n\log n)$ time

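A priority queue can serve as both the max-heap (Solution 1) and the min-heap (Solution 2); override the comparator passed to the constructor as needed.
The max-heap version built on a priority queue: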

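import java.util.Comparator;
import java.util.PriorityQueue;

/**
 * Find the k largest numbers in the given array; the result must be in descending order
 * http://www.lintcode.com/zh-cn/problem/top-k-largest-numbers/
 * Solution 3: override the priority queue's comparator to get a max-heap
 * @author yzwall
 */
class Solution {
    public int[] topk(int[] nums, int k) {
        int[] topK = new int[k];
        // The initial capacity must be at least 1, otherwise the constructor throws
        PriorityQueue<Integer> pq = new PriorityQueue<>(Math.max(1, k), new Comparator<Integer>() {
            @Override
            public int compare(Integer o1, Integer o2) {
                // Reverse the natural order so the largest value sits at the head;
                // returns 0 for equal values, as the Comparator contract requires
                return Integer.compare(o2, o1);
            }
        });
        for (int i = 0; i < nums.length; i++) {
            pq.offer(nums[i]);
        }
        for (int i = 0; i < k; i++) {
            topK[i] = pq.poll();
        }
        return topK;
    }
}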
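
Only the max-heap variant is spelled out in this section. For comparison, here is a sketch of how the min-heap idea of Solution 2 might look on top of java.util.PriorityQueue, which is a min-heap by default. The class name TopKMinHeapSketch and the clamping of k are assumptions made for illustration; they are not part of the original solutions.

```java
import java.util.PriorityQueue;

// Sketch: Solution 2 (bounded min-heap) built on java.util.PriorityQueue.
final class TopKMinHeapSketch {
    static int[] topk(int[] nums, int k) {
        // Assumption: clamp k so that k <= 0 or k > nums.length does not fail.
        int size = Math.max(0, Math.min(k, nums.length));
        int[] topK = new int[size];
        if (size == 0) {
            return topK;
        }
        PriorityQueue<Integer> pq = new PriorityQueue<>(size);
        for (int num : nums) {
            if (pq.size() < size) {
                pq.offer(num);
            } else if (pq.peek() < num) {
                // The head is the smallest of the current k largest; replace it.
                pq.poll();
                pq.offer(num);
            }
        }
        // The min-heap pops in ascending order, so fill the result from the back.
        for (int i = size - 1; i >= 0; i--) {
            topK[i] = pq.poll();
        }
        return topK;
    }
}
```

Filling the array from the back replaces the separate reversal pass, and the queue never holds more than k elements, which gives the $O(n\log k)$ time bound.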

1.4 Solution 4: Partition, $O(n)$ average time

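This turns "find the k largest numbers" into:

Use the Partition idea (as in quicksort) to find the k-th largest number in the array;
Once partitioning finishes, the first k elements of the array are guaranteed to be the k largest.

Note: after partitioning, the result is unordered.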
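
The original gives no code for this solution. Below is a minimal quickselect sketch, assuming a Lomuto-style partition in descending order with the last element as pivot. The class name QuickSelectTopK and the handling of out-of-range k are illustrative choices, not the author's.

```java
import java.util.Arrays;

// Sketch: partition-based selection of the k largest values (result is unordered).
final class QuickSelectTopK {
    // Rearranges nums in place so that nums[0..k-1] hold the k largest values.
    static int[] topk(int[] nums, int k) {
        if (k <= 0 || nums.length == 0) {
            return new int[0];
        }
        k = Math.min(k, nums.length);
        int lo = 0, hi = nums.length - 1;
        while (lo < hi) {
            int p = partition(nums, lo, hi);
            if (p == k - 1) {
                break;          // the k-th largest value is in its final position
            } else if (p < k - 1) {
                lo = p + 1;     // continue in the right part
            } else {
                hi = p - 1;     // continue in the left part
            }
        }
        return Arrays.copyOf(nums, k);
    }

    // Moves values greater than the pivot to the left; returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] > pivot) {
                swap(a, i, store);
                store++;
            }
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

With this fixed pivot choice the running time is $O(n)$ on average but can degrade to $O(n^2)$ on unfavourable inputs. Choosing the pivot at random is a common way to make the bad case unlikely.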

Comparison of solutions

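Solution	Modifies input	Time complexity	Space complexity	Ordered output
Solution 1	No	$O(n\log n)$	$O(n)$	Yes
Solution 2	No	$O(n\log k)$	$O(k)$	Yes
Solution 3	No	$O(n\log n)$ or $O(n\log k)$	$O(n)$ or $O(k)$	Yes
Solution 4	Yes	$O(n)$ on average	$O(1)$	No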

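Solution 1: the max-heap has the highest time complexity and the highest space complexity;
Solution 2: the min-heap builds on Solution 1 by limiting how many elements enter the heap, which lowers the cost of building it;
Solution 3: the priority queue is the library implementation of a heap; you only need to override the comparator java.util.Comparator<T> as the problem requires, with no hand-written heap;
Solution 4: lowest time and space complexity, but the Partition algorithm cannot guarantee that the final result is ordered.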

2 Finding the k most frequent words

lintcode 471 Top K Frequent Words

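Implemented with a priority queue. According to the problem statement:

You need to output the words sorted by frequency, with more frequent words first. If two words occur the same number of times, the lexicographically smaller one comes first.

The solution is as follows:

Count word occurrences with a HashMap;
Keep the most frequent word at the head of the priority queue;
Poll the priority queue k times.

The comparator must be overridden when creating the priority queue. The lexicographic comparison of words uses the compareTo method of java.lang.String (String implements the Comparable<T> interface).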

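import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

/**
 * Priority queue implementation; lexicographic comparison uses String's own compareTo method
 * Find the k most frequent words; the result must be ordered
 * (descending by occurrence count, ties in ascending lexicographic order)
 * http://www.lintcode.com/en/problem/top-k-frequent-words/
 * @author yzwall
 */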
class Solution {
    private class Word {
        String word;
        int count;
        public Word(String word, int count) {
            this.word = word;
            this.count = count;
        }
    }
    
    public String[] topKFrequentWords(String[] words, int k) {
        // The initial capacity must be at least 1, so guard against an empty input array
        PriorityQueue<Word> pq = new PriorityQueue<>(Math.max(1, words.length), new Comparator<Word>() {
            public int compare(Word o1, Word o2) {
                if (o1.count != o2.count) {
                    return o1.count > o2.count ? -1 : 1;
                } else {
                    return o1.word.compareTo(o2.word);
                }
            }
        });
        
        HashMap<String, Integer> map = new HashMap<>();
        for (String word : words) {
            if (map.containsKey(word)) {
                map.put(word, map.get(word) + 1);
            } else {
                // First occurrence counts as 1
                map.put(word, 1);
            }
        }
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            pq.offer(new Word(entry.getKey(), entry.getValue()));
        }
        String[] topK = new String[k];
        for (int i = 0; i < k; i++) {
            if (!pq.isEmpty()){
                topK[i] = pq.poll().word;
            } else {
                break;
            }
        }
  
        return topK;
    }
}
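
The counting step can also be written more compactly. This is a small sketch, assuming Java 8 or later for Map.merge; the class name WordCountSketch is illustrative only:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: counting word occurrences with Map.merge (Java 8+).
final class WordCountSketch {
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // The first occurrence stores 1; each later occurrence adds 1 to the stored value.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```

Because merge handles both the "absent" and the "present" case, the containsKey branch is no longer needed.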

3 Finding the k closest points

lintcode 612 K Closest Points

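Find the k points closest to the origin point. The result must be in ascending order of distance, with ties broken by ascending x coordinate and then by ascending y coordinate.
Solution: use the priority queue java.util.PriorityQueue<T> and build the comparator Comparator<T> as the problem requires.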

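import java.util.Comparator;
import java.util.PriorityQueue;

/**
 * Find the k points closest to the origin point; ties in distance are broken by ascending x, then ascending y
 * http://www.lintcode.com/en/problem/k-closest-points/
 * Uses the priority queue java.util.PriorityQueue with a Comparator built as the problem requires
 * @author yzwall
 */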
class Solution {
    private class Point {
        int x, y;
        Point() { x = 0; y = 0; };
        Point(int a, int b) { x = a; y = b; }
    }
    
    private class PPoint {
        int x, y, dist;
        PPoint(Point p, int dist) { 
            this.x = p.x; 
            this.y = p.y;
            this.dist = dist;
        }
    }
    
    private int distance(Point src, Point dest) {
        int xx = src.x - dest.x;
        int yy = src.y - dest.y;
        return xx * xx + yy * yy;
    }
    
    public Point[] kClosest(Point[] points, Point origin, int k) {
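        // Validate the input first: the queue's initial capacity k must be at least 1
        if (points == null || k <= 0 || points.length < k) {
            return null;
        }
        // Using new PriorityQueue<>(new Comparator<T>()) fails to compile on lintcode
        PriorityQueue<PPoint> pq = new PriorityQueue<>(k, new Comparator<PPoint>(){
            @Override
            public int compare(PPoint o1, PPoint o2) {
                // Integer.compare avoids the overflow that subtraction can cause
                if (o1.dist != o2.dist) {
                    return Integer.compare(o1.dist, o2.dist);
                } else {
                    if (o1.x != o2.x) {
                        return Integer.compare(o1.x, o2.x);
                    }
                    return Integer.compare(o1.y, o2.y);
                }
            }
        });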
        Point[] array = new Point[k];
        for (Point point : points) {
            pq.offer(new PPoint(point, distance(origin, point)));
        }
        for (int i = 0; i < k; i++) {
            PPoint temp = pq.poll();
            array[i] = new Point(temp.x, temp.y);
        }
        
        return array;
    }
}
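
On Java 8 or later, the three-level ordering (distance, then x, then y) can be expressed as a comparator chain. The sketch below is an illustration under that assumption: the class ClosestPointsSketch, the nested type P, and the int[][] input format are hypothetical and differ from the Point class the lintcode problem supplies.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch: k closest points with a chained comparator (Java 8+).
final class ClosestPointsSketch {
    // Illustrative point type carrying its squared distance (long, to avoid int overflow).
    static final class P {
        final int x, y;
        final long dist;

        P(int x, int y, int originX, int originY) {
            this.x = x;
            this.y = y;
            long dx = (long) x - originX;
            long dy = (long) y - originY;
            this.dist = dx * dx + dy * dy;
        }
    }

    // Ascending by distance, then by x, then by y.
    static final Comparator<P> ORDER =
            Comparator.comparingLong((P p) -> p.dist)
                      .thenComparingInt(p -> p.x)
                      .thenComparingInt(p -> p.y);

    static int[][] kClosest(int[][] points, int originX, int originY, int k) {
        PriorityQueue<P> pq = new PriorityQueue<>(ORDER);
        for (int[] pt : points) {
            pq.offer(new P(pt[0], pt[1], originX, originY));
        }
        // Assumption: clamp k instead of returning null for out-of-range values.
        int size = Math.max(0, Math.min(k, points.length));
        int[][] result = new int[size][];
        for (int i = 0; i < size; i++) {
            P p = pq.poll();
            result[i] = new int[]{p.x, p.y};
        }
        return result;
    }
}
```

Comparing squared distances gives the same order as comparing true distances, so no square root is needed.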
