Guava Source Code Analysis: Multi Collections (1)

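The Immutable collections that Guava provides do not depart from the standard collection interfaces; ImmutableList, for example, still implements List. The Multi Collections analyzed in the next few chapters, however, depart almost entirely from the collections that ship with Java (which is also why, in names of the form Multixxx, the word for the underlying data structure is lowercase) and serve as a supplement to them.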

A structure such as Map<T,Integer> is used very often. When we count how many times each T occurs, we usually iterate and tally, as in the following code:

Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : words) {
  Integer count = counts.get(word);
  if (count == null) {
    counts.put(word, 1);
  } else {
    counts.put(word, count + 1);
  }
}
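As a side note, on Java 8 and later the same tally can be written more compactly with Map.merge, which is part of the JDK and not of Guava. The following is a minimal sketch; the class name WordCountMergeSketch and the method countWords are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordCountMergeSketch {
    // Counts occurrences of each word using Map.merge (Java 8+).
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Stores 1 for a new key, otherwise adds 1 to the existing value.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "a");
        System.out.println(countWords(words).get("a")); // prints 3
    }
}
```

This removes the null check, but the result is still a plain map: a missing key yields null instead of 0, and iteration visits each key once.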

With a Multiset, the code becomes:

Multiset<String> sets = HashMultiset.create();
for (String word : words) {
  sets.add(word);
}
int count = sets.count("word");

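That is all for basic usage; for the details of the API, refer to the API documentation. Next, let's look at how Multiset is implemented, starting from its UML diagram: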

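The UML diagram includes the inheritance hierarchy of HashSet to make one point clear: Multiset is a new kind of collection and is not a subclass of AbstractSet (it is not a Set). The Multiset interface extends Collection with methods for handling duplicate elements, for example int count(Object element) and int add(@Nullable E element, int occurrences). The names are self-explanatory, so I will not go into them further.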
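To make the contract of these two methods concrete, here is a minimal sketch built on a plain HashMap. CountingBagSketch is a hypothetical stand-in and not Guava's implementation; it only assumes the documented behavior that count returns 0 for an absent element and that add(element, occurrences) returns the count before the call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a multiset; not Guava's code.
final class CountingBagSketch<E> {
    private final Map<E, Integer> counts = new HashMap<>();
    private int size; // total number of occurrences across all elements

    // Number of occurrences of the element, or 0 if it is absent.
    int count(Object element) {
        Integer c = counts.get(element);
        return c == null ? 0 : c;
    }

    // Adds the given number of occurrences and returns the count before the call.
    int add(E element, int occurrences) {
        if (occurrences < 0) {
            throw new IllegalArgumentException("occurrences must be >= 0");
        }
        int old = count(element);
        if (occurrences > 0) {
            counts.put(element, old + occurrences);
            size += occurrences;
        }
        return old;
    }

    // Total occurrences, not the number of distinct elements.
    int size() {
        return size;
    }
}
```

Note that size() here counts every occurrence, which is what separates a multiset from the key set of a counting map.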

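HashMultiset is the most commonly used implementation. It stores its data in a Map<T,Count>, and its add and remove methods operate on the Count (note that Count is not thread-safe). The biggest difference between a Multiset and a Map<T,Integer> shows up during iteration: iterating a Multiset yields every occurrence, so an element with count n appears n times and the total number of elements is the sum of all counts, whereas iterating the map yields each key only once. The difference lies in the Multiset's iterator and in the iterator over Entry<T,Count>. The element iterator is implemented as follows: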

private class MapBasedMultisetIterator implements Iterator<E> {
    final Iterator<Map.Entry<E, Count>> entryIterator;
    Map.Entry<E, Count> currentEntry;
    int occurrencesLeft; // occurrences left for the current entry; decides whether to advance the entry iterator
    boolean canRemove;

    MapBasedMultisetIterator() {
      this.entryIterator = backingMap.entrySet().iterator();
    }

    @Override
    public boolean hasNext() {
      return occurrencesLeft > 0 || entryIterator.hasNext();
    }

    @Override
    public E next() {
      if (occurrencesLeft == 0) { // nothing left for the current entry, so advance the entry iterator
        currentEntry = entryIterator.next();
        occurrencesLeft = currentEntry.getValue().get();
      }
      occurrencesLeft--;
      canRemove = true;
      return currentEntry.getKey();
    }

    @Override
    public void remove() {
      checkRemove(canRemove);
      int frequency = currentEntry.getValue().get();
      if (frequency <= 0) {
        throw new ConcurrentModificationException();
      }
      if (currentEntry.getValue().addAndGet(-1) == 0) {
        entryIterator.remove();
      }
      size--;
      canRemove = false;
    }
  }
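The core of the iterator above is the occurrencesLeft counter. The following stripped-down sketch applies the same idea to a plain Map<E, Integer>, leaving out remove() and the Count wrapper. RepeatingIteratorSketch and expand are illustrative names, and the sketch assumes every stored count is positive.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class RepeatingIteratorSketch {
    // Returns an iterator that yields each key as many times as its mapped count.
    // Assumes all counts are positive, as in a well-formed backing map.
    static <E> Iterator<E> expand(Map<E, Integer> counts) {
        final Iterator<Map.Entry<E, Integer>> entries = counts.entrySet().iterator();
        return new Iterator<E>() {
            private Map.Entry<E, Integer> current;
            private int occurrencesLeft;

            @Override
            public boolean hasNext() {
                return occurrencesLeft > 0 || entries.hasNext();
            }

            @Override
            public E next() {
                if (occurrencesLeft == 0) {
                    // Move to the next distinct element and reload its count.
                    current = entries.next();
                    occurrencesLeft = current.getValue();
                }
                occurrencesLeft--;
                return current.getKey();
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 2);
        counts.put("b", 1);
        Iterator<String> it = expand(counts);
        while (it.hasNext()) {
            System.out.println(it.next()); // prints a, a, b on separate lines
        }
    }
}
```

Only one entry per distinct element exists in the map; the repeated values are produced on the fly by staying on the same entry until its counter reaches zero.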

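Multiset.Entry<E>, for its part, is implemented by delegating to the backing map's entry set while adding a getCount() operation, and it supports remove and clear. The code is as follows: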

 Iterator<Entry<E>> entryIterator() {
        final Iterator<Map.Entry<E, Count>> backingEntries =
                backingMap.entrySet().iterator(); // delegates to the backing map's entry set
        return new Iterator<Multiset.Entry<E>>() {
            Map.Entry<E, Count> toRemove;

            @Override
            public boolean hasNext() {
                return backingEntries.hasNext();
            }

            @Override
            public Multiset.Entry<E> next() {
                final Map.Entry<E, Count> mapEntry = backingEntries.next();
                toRemove = mapEntry;
                return new Multisets.AbstractEntry<E>() {
                    @Override
                    public E getElement() {
                        return mapEntry.getKey();
                    }
                    // the getCount operation
                    @Override
                    public int getCount() {
                        Count count = mapEntry.getValue();
                        if (count == null || count.get() == 0) {
                            Count frequency = backingMap.get(getElement());
                            if (frequency != null) {
                                return frequency.get();
                            }
                        }
                        return (count == null) ? 0 : count.get();
                    }
                };
            }

            @Override
            public void remove() {
                checkRemove(toRemove != null); // next() recorded the entry, so it can be removed
                size -= toRemove.getValue().getAndSet(0);
                backingEntries.remove();
                toRemove = null;
            }
        };
    }
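The fallback inside getCount() deserves a closer look. One plausible reading, offered here as an interpretation and not as a documented guarantee, is that an entry handed out earlier can outlive its Count: if the element is removed completely and later added again, the map holds a new counter, so the entry looks it up again. The sketch below reproduces only that logic; Counter is a hypothetical stand-in for Guava's internal Count class, and liveCount is an illustrative name.

```java
import java.util.HashMap;
import java.util.Map;

final class LiveEntryCountSketch {
    // Hypothetical mutable counter standing in for Guava's internal Count class.
    static final class Counter {
        int value;

        Counter(int value) {
            this.value = value;
        }
    }

    // Mirrors the fallback in getCount(): when the counter captured during
    // iteration is missing or zero, consult the backing map for a current one.
    static <E> int liveCount(Map<E, Counter> backingMap, E element, Counter captured) {
        if (captured == null || captured.value == 0) {
            Counter fresh = backingMap.get(element);
            if (fresh != null) {
                return fresh.value;
            }
        }
        return captured == null ? 0 : captured.value;
    }

    public static void main(String[] args) {
        Map<String, Counter> backingMap = new HashMap<>();
        Counter captured = new Counter(2);
        backingMap.put("a", captured);
        System.out.println(liveCount(backingMap, "a", captured)); // prints 2

        // Simulate removing the element entirely, then adding it back.
        captured.value = 0;
        backingMap.remove("a");
        backingMap.put("a", new Counter(5));
        System.out.println(liveCount(backingMap, "a", captured)); // prints 5
    }
}
```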

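Overall, Multiset is a very handy collection with a clever implementation: each distinct element is stored only once, and the iterator's position within the current entry is used to simulate multiple copies of the data.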
