System.arraycopy: block memory copy

System#arraycopy copies a block of memory directly, one whole chunk at a time, and it is implemented in native code.

When you assign elements one by one by index, most of the time goes to addressing and assignment.

So when you copy an array or grow one dynamically, System#arraycopy is the recommended choice.

The ArrayList we use all the time stores its elements in an array internally, and when that array fills up it uses System#arraycopy to grow its internal storage.
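The grow-on-full idea can be sketched as follows. This is a minimal illustration, not ArrayList's actual source: the class name `IntList`, its methods, and the growth rule are all assumptions made up for the example.

```java
// Hypothetical minimal growable int list, for illustration only.
final class IntList {
    private int[] data = new int[4]; // backing array (initial capacity is arbitrary)
    private int size = 0;            // number of elements actually stored

    void add(int value) {
        if (size == data.length) {
            // Backing array is full: allocate a larger one and bulk-copy
            // the existing elements. The growth rule here is an assumption.
            int[] bigger = new int[data.length + (data.length >> 1) + 1];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 10; i++) {
            list.add(i * i);
        }
        System.out.println("size=" + list.size() + ", last=" + list.get(9));
    }
}
```

The caller never sees the reallocation; only `add` knows that the backing array was swapped.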

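When I was young I was stubborn, and that included copying arrays: I would write a for loop and shuffle the elements across one by one. Later, when I grew up, I discovered the benefits of System.arraycopy.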

To see the difference between the two, I wrote a simple program that copies an int[100000] both ways and uses nanoTime to measure the elapsed time:

The program is as follows:

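public class ArrayCopyTest {
    public static void main(String[] args) {
        // Source and destination for the element-by-element loop copy
        int[] a = new int[100000];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        int[] b = new int[100000];

        // Source and destination for System.arraycopy
        int[] c = new int[100000];
        for (int i = 0; i < c.length; i++) {
            c[i] = i;
        }
        int[] d = new int[100000];

        for (int k = 0; k < 10; k++) {
            // Time the for-loop copy
            long start1 = System.nanoTime();
            for (int i = 0; i < a.length; i++) {
                b[i] = a[i];
            }
            long end1 = System.nanoTime();
            System.out.println("end1 - start1 = " + (end1 - start1));

            // Time the System.arraycopy copy
            long start2 = System.nanoTime();
            System.arraycopy(c, 0, d, 0, 100000);
            long end2 = System.nanoTime();
            System.out.println("end2 - start2 = " + (end2 - start2));

            System.out.println();
        }
    }
}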

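To keep memory-allocation noise and one-off flukes from skewing the numbers, I allocated all the arrays up front and then ran the timed loop only 10 times. The results, in nanoseconds, were: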

end1 - start1 = 366806
end2 - start2 = 109154

end1 - start1 = 380529
end2 - start2 = 79849

end1 - start1 = 421422
end2 - start2 = 68769

end1 - start1 = 344463
end2 - start2 = 72020

end1 - start1 = 333174
end2 - start2 = 77277

end1 - start1 = 377335
end2 - start2 = 82285

end1 - start1 = 370608
end2 - start2 = 66937

end1 - start1 = 349067
end2 - start2 = 86532

end1 - start1 = 389974
end2 - start2 = 83362

end1 - start1 = 347937
end2 - start2 = 63638
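Timings taken this way can be influenced by JIT compilation kicking in partway through the run. As a rough cross-check, here is a sketch that warms both copy paths up first and reports the best of several rounds. The class name `CopyBench`, the helper names, and the warm-up and round counts are arbitrary assumptions, and a dedicated harness such as JMH would be the more reliable tool.

```java
import java.util.Arrays;

// Hypothetical micro-benchmark sketch; numbers it prints are indicative only.
final class CopyBench {
    // Element-by-element copy with a plain for loop.
    static void loopCopy(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // Bulk copy through System.arraycopy.
    static void bulkCopy(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    // Runs the task several times and returns the fastest time in nanoseconds.
    static long bestOf(int rounds, Runnable task) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] src = new int[100000];
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        int[] viaLoop = new int[src.length];
        int[] viaBulk = new int[src.length];

        // Warm-up so both paths have a chance to be compiled before timing.
        for (int w = 0; w < 2000; w++) {
            loopCopy(src, viaLoop);
            bulkCopy(src, viaBulk);
        }

        long loopNs = bestOf(50, () -> loopCopy(src, viaLoop));
        long bulkNs = bestOf(50, () -> bulkCopy(src, viaBulk));
        System.out.println("for loop         best = " + loopNs + " ns");
        System.out.println("System.arraycopy best = " + bulkNs + " ns");
        System.out.println("same contents: " + Arrays.equals(viaLoop, viaBulk));
    }
}
```

The gap between the two can look quite different after warm-up, so it is worth running this on your own JVM before drawing conclusions.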

As you can see, System.arraycopy performs very well. To find out how it is handled underneath, I dug up some OpenJDK code and browsed through it:

System.arraycopy is a native function, so we need to look at the native-layer code:

public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);

The matching file is openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp, which contains the JVM_ArrayCopy entry point:

JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                               jobject dst, jint dst_pos, jint length))
  JVMWrapper("JVM_ArrayCopy");
  // Check if we have null pointers
  if (src == NULL || dst == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }
  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
  // Do copy
  Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END

The earlier statements are all checks; the final copy_array(s, src_pos, d, dst_pos, length, thread) call is the real copy. Looking further, in openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp:

void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
  assert(s->is_typeArray(), "must be type array");

  // Check destination
  if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }

  // Check is all offsets and lengths are non negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  if  ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check zero copy
  if (length == 0)
    return;

  // This is an attempt to make the copy_array fast.
  int l2es = log2_element_size();
  int ihs = array_header_in_bytes() / WordSize;
  char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
  char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
  Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es); // the actual copy is handled here
}

Everything earlier in this function is again a pile of checks; only the last statement does the actual copying.
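From the Java side, those checks show up as exceptions. The sketch below triggers each one in turn; `ArrayCopyChecks` and `attempt` are made-up names for this example. It only relies on bad indices producing some IndexOutOfBoundsException, since the exact subclass is a detail of the JVM.

```java
// Hypothetical demo of the checks System.arraycopy performs before copying.
final class ArrayCopyChecks {
    // Calls System.arraycopy and returns the exception it threw, or null if none.
    static RuntimeException attempt(Object src, int srcPos, Object dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return null;
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[4];
        long[] longs = new long[4];

        // Null source or destination is rejected first.
        System.out.println(attempt(null, 0, ints, 0, 1));
        // Primitive arrays with different element types cannot be copied.
        System.out.println(attempt(ints, 0, longs, 0, 1));
        // Negative positions or lengths are rejected.
        System.out.println(attempt(ints, -1, new int[4], 0, 1));
        // The range srcPos + length must fit inside the source array.
        System.out.println(attempt(ints, 2, new int[4], 0, 3));
        // A zero-length copy passes every check and does nothing.
        System.out.println(attempt(ints, 0, new int[4], 0, 0));
    }
}
```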

The corresponding function is in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:

// Copy bytes; larger units are filled atomically if everything is aligned.
void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
  address src = (address) from;
  address dst = (address) to;
  uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

  // (Note:  We could improve performance by ignoring the low bits of size,
  // and putting a short cleanup loop after each bulk copy loop.
  // There are plenty of other ways to make this faster also,
  // and it's a slippery slope.  For now, let's keep this code simple
  // since the simplicity helps clarify the atomicity semantics of
  // this operation.  There are also CPU-specific assembly versions
  // which may or may not want to include such optimizations.)

  if (bits % sizeof(jlong) == 0) {
    Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
  } else if (bits % sizeof(jint) == 0) {
    Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
  } else if (bits % sizeof(jshort) == 0) {
    Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
  } else {
    // Not aligned, so no need to be atomic.
    Copy::conjoint_jbytes((void*) src, (void*) dst, size);
  }
}
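The dispatch in conjoint_memory_atomic hinges on OR-ing the source address, destination address, and byte count together: the result is a multiple of a unit size only if all three are. The sketch below models that selection in Java with addresses as plain long values. `CopyUnitChooser` and `unitBytes` are hypothetical names, and the addresses in main are invented for illustration.

```java
// Hypothetical model of how the widest safe copy unit is chosen.
final class CopyUnitChooser {
    // Returns the copy unit in bytes (8, 4, 2 or 1) for non-negative inputs.
    static int unitBytes(long srcAddr, long dstAddr, long sizeBytes) {
        // Any low bit set in one of the three values is also set in the OR.
        long bits = srcAddr | dstAddr | sizeBytes;
        if (bits % Long.BYTES == 0) {
            return Long.BYTES;      // everything 8-byte aligned: copy as jlongs
        } else if (bits % Integer.BYTES == 0) {
            return Integer.BYTES;   // 4-byte aligned: copy as jints
        } else if (bits % Short.BYTES == 0) {
            return Short.BYTES;     // 2-byte aligned: copy as jshorts
        }
        return Byte.BYTES;          // unaligned: plain byte copy
    }

    public static void main(String[] args) {
        System.out.println(unitBytes(0x1000, 0x2000, 400000)); // all multiples of 8
        System.out.println(unitBytes(0x1000, 0x2004, 400000)); // destination only 4-aligned
        System.out.println(unitBytes(0x1000, 0x2000, 6));      // size only 2-aligned
        System.out.println(unitBytes(0x1001, 0x2000, 8));      // odd source address
    }
}
```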

The code above chooses which copy function to use. Taking conjoint_jints_atomic as the example, look further in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp:

// jints,                 conjoint, atomic on each jint
  static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    assert_params_ok(from, to, LogBytesPerInt);
    pd_conjoint_jints_atomic(from, to, count);
  }

Continuing down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:

static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  _Copy_conjoint_jints_atomic(from, to, count);
}

Continuing down, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:

void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    if (from > to) {
      jint *end = from + count;
      while (from < end)
        *(to++) = *(from++);
    }
    else if (from < to) {
      jint *end = from;
      from += count - 1;
      to   += count - 1;
      while (from >= end)
        *(to--) = *(from--);
    }
  }
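The two branches exist because the source and destination ranges may overlap ("conjoint"): copying forward is only safe when the destination sits below the source, otherwise the copy has to run backward. Here is an index-based sketch of the same idea in Java. `ConjointCopy`, `conjointCopy`, and `naiveForward` are made-up names, with array indices standing in for the pointers.

```java
import java.util.Arrays;

// Hypothetical index-based analogue of the direction choice shown above.
final class ConjointCopy {
    // Copies count elements within arr from index 'from' to index 'to',
    // picking the direction so overlapping ranges are not clobbered.
    static void conjointCopy(int[] arr, int from, int to, int count) {
        if (from > to) {
            // Destination is below the source: copy front to back.
            for (int i = 0; i < count; i++) {
                arr[to + i] = arr[from + i];
            }
        } else if (from < to) {
            // Destination is above the source: copy back to front.
            for (int i = count - 1; i >= 0; i--) {
                arr[to + i] = arr[from + i];
            }
        }
    }

    // Always copies front to back; overwrites unread source elements
    // when the destination overlaps the source from above.
    static void naiveForward(int[] arr, int from, int to, int count) {
        for (int i = 0; i < count; i++) {
            arr[to + i] = arr[from + i];
        }
    }

    public static void main(String[] args) {
        int[] safe = {1, 2, 3, 4, 5, 0, 0};
        conjointCopy(safe, 0, 2, 5);
        System.out.println(Arrays.toString(safe));   // [1, 2, 1, 2, 3, 4, 5]

        int[] broken = {1, 2, 3, 4, 5, 0, 0};
        naiveForward(broken, 0, 2, 5);
        System.out.println(Arrays.toString(broken)); // [1, 2, 1, 2, 1, 2, 1]

        int[] builtin = {1, 2, 3, 4, 5, 0, 0};
        System.arraycopy(builtin, 0, builtin, 2, 5);
        System.out.println(Arrays.equals(safe, builtin)); // true
    }
}
```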

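As you can see, at the bottom it is plain memory-block copying logic. That avoids a lot of the time spent shuffling through references element by element, so it naturally ends up faster.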
