String Source Code Analysis

Date: 2019-12-12

Tags: string, source code analysis


1. Class definition

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {...}

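The class is final, so it cannot be subclassed.
It implements Serializable, so it can be serialized and deserialized.
It implements Comparable, so it must implement compareTo(String s), which is used for ordering.
It implements CharSequence, which declares length(): int, charAt(int): char, subSequence(int, int): CharSequence and toString(): String, plus the default methods chars(): IntStream and codePoints(): IntStream.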

2. Member variables

private final char value[];
private int hash;
private static final long serialVersionUID = -6849794470754667710L;
private static final ObjectStreamField[] serialPersistentFields = new ObjectStreamField[0];
public static final Comparator<String> CASE_INSENSITIVE_ORDER = new CaseInsensitiveComparator();

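value: the underlying storage of the String, a char array.

hash: caches the string's hash code.

serialVersionUID: the version identifier used during serialization and deserialization.

serialPersistentFields: an ObjectStreamField array used to declare a class's serializable fields. It is an empty array here and is not otherwise used in the class.

CASE_INSENSITIVE_ORDER: a comparator for case-insensitive ordering, implemented by a nested class (CaseInsensitiveComparator).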
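
To make the role of CASE_INSENSITIVE_ORDER concrete, here is a minimal sketch that sorts the same words with natural ordering and with the comparator. The class name CaseInsensitiveOrderDemo and the helper sortIgnoringCase are illustrative names, not part of the JDK.

```java
import java.util.Arrays;

class CaseInsensitiveOrderDemo {
    // Hypothetical helper: returns a sorted copy of the input, ignoring case.
    static String[] sortIgnoringCase(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};

        // Natural ordering compares char values, so uppercase 'C' sorts before lowercase letters.
        String[] natural = Arrays.copyOf(words, words.length);
        Arrays.sort(natural);
        System.out.println(Arrays.toString(natural));                 // [Cherry, apple, banana]

        // The comparator ignores case, giving dictionary-like order.
        System.out.println(Arrays.toString(sortIgnoringCase(words))); // [apple, banana, Cherry]
    }
}
```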

3. Methods

3.1 Constructors

(1) A String as the argument

public String() { this.value = "".value; }
public String(String original){
  this.value=original.value; 
  this.hash=original.hash;
}

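Initializes a String from another String object. The source String's value and hash fields are assigned directly to the new String, so both objects share the same char array; this is safe because a String is immutable.

(2) A char array as the argument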

public String(char value[]) {
  this.value = Arrays.copyOf(value, value.length);
}

public String(char value[], int offset, int count) {
  if (offset < 0) {
    throw new StringIndexOutOfBoundsException(offset);
  }
  if (count <= 0) {
    if (count < 0) { throw new StringIndexOutOfBoundsException(count); }
    // An empty range at a valid offset yields the empty string.
    if (offset <= value.length) { this.value = "".value; return; }
  }
  // Written as a subtraction so that offset + count cannot overflow.
  if (offset > value.length - count) {
    throw new StringIndexOutOfBoundsException(offset + count);
  }
  this.value = Arrays.copyOfRange(value, offset, offset + count);
}

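When a String is created from a char array, Arrays.copyOf or Arrays.copyOfRange is used. Both copy the contents of the source array into a new array owned by the String, so later changes to the caller's array do not affect the String.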
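
The effect of that defensive copy can be checked with a short sketch. The class CharArrayCopyDemo and the helper buildThenMutate are hypothetical names used only for illustration.

```java
class CharArrayCopyDemo {
    // Hypothetical helper: builds a String from the array, then mutates the array.
    static String buildThenMutate(char[] chars) {
        String s = new String(chars); // copies the array contents
        chars[0] = 'X';               // changes only the caller's array
        return s;
    }

    public static void main(String[] args) {
        char[] chars = {'j', 'a', 'v', 'a'};
        System.out.println(buildThenMutate(chars));  // java
        // offset 1, count 2, read from the (now mutated) array
        System.out.println(new String(chars, 1, 2)); // av
    }
}
```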

(3) An int array as the argument

public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                n++;
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
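
The constructor above makes two passes because a code point outside the Basic Multilingual Plane needs two chars (a surrogate pair): the first pass computes the exact array size, the second fills it. A minimal sketch of the observable result, with CodePointConstructorDemo and fromCodePoints as illustrative names:

```java
class CodePointConstructorDemo {
    // Hypothetical helper wrapping the int[] constructor.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // U+0041 is in the BMP (one char); U+1F600 is supplementary (two chars).
        String s = fromCodePoints(0x41, 0x1F600);
        System.out.println(s.length());                      // 3
        System.out.println(s.codePointCount(0, s.length())); // 2
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
    }
}
```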

(4) A byte array as the argument

public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }

public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

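Bytes are the serialized form used for network transmission and storage, so converting between byte[] and String always involves a character encoding. String(byte[] bytes, Charset charset) decodes the given byte array with charset into a Unicode char[] and builds a new String from it. All of these constructors call a StringCoding.decode overload; the charset-name variant is: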

static char[] decode(String charsetName, byte[] ba, int off, int len)
        throws UnsupportedEncodingException
    {
        StringDecoder sd = deref(decoder);
        String csn = (charsetName == null) ? "ISO-8859-1" : charsetName;
        if ((sd == null) || !(csn.equals(sd.requestedCharsetName())
                              || csn.equals(sd.charsetName()))) {
            sd = null;
            try {
                Charset cs = lookupCharset(csn);
                if (cs != null)
                    sd = new StringDecoder(cs, csn);
            } catch (IllegalCharsetNameException x) {}
            if (sd == null)
                throw new UnsupportedEncodingException(csn);
            set(decoder, sd);
        }
        return sd.decode(ba, off, len);
    }

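As the code shows, this method falls back to ISO-8859-1 when charsetName is null. The public constructors that take a charset name reject null before reaching it, and the constructors without a charset argument decode with the platform default charset (Charset.defaultCharset()), using ISO-8859-1 only if the default is not supported.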
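
Why the charset matters is easiest to see by decoding the same bytes twice. The sketch below is only an illustration; ByteDecodingDemo and decodeWith are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteDecodingDemo {
    // Hypothetical helper: decodes the whole array with the given charset.
    static String decodeWith(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is encoded as two bytes in UTF-8: 0xC3 0xA9.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 2

        // Decoding with the charset used for encoding restores the original character.
        System.out.println(decodeWith(utf8, StandardCharsets.UTF_8).length());      // 1

        // Decoding with ISO-8859-1 maps each byte to its own char, which garbles the text.
        System.out.println(decodeWith(utf8, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```

Passing an explicit Charset, as here, avoids depending on the platform default.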

(5) StringBuffer and StringBuilder as arguments

public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }

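On efficiency: the Javadoc for these constructors notes that obtaining the string through the toString method is likely to run faster and is generally preferred. Between the two classes, StringBuilder is usually the faster one, because StringBuffer's methods, including toString, are synchronized and trade speed for thread safety.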
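
Both routes produce an independent copy of the builder's current contents, as this small sketch suggests. BuilderSnapshotDemo and snapshot are illustrative names.

```java
class BuilderSnapshotDemo {
    // Hypothetical helper: copies the builder's current contents into a String.
    static String snapshot(StringBuilder sb) {
        return new String(sb);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        String viaConstructor = snapshot(sb);
        String viaToString = sb.toString(); // the generally preferred form

        sb.append("def"); // later changes to the builder do not reach the Strings

        System.out.println(viaConstructor);                     // abc
        System.out.println(viaConstructor.equals(viaToString)); // true
        System.out.println(sb);                                 // abcdef
    }
}
```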

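3.2 Common methods

length(): returns the length of the string

isEmpty(): returns whether the string is empty

charAt(int index): returns the character at zero-based position index, that is, the (index+1)-th character

char[] toCharArray(): converts the string to a char array

trim(): removes leading and trailing whitespace

toUpperCase(): converts to upper case

toLowerCase(): converts to lower case

String concat(String str) // appends str to this string

String replace(char oldChar, char newChar) // replaces every occurrence of oldChar with newChar

// Both of the methods above use the package-private constructor String(char[] value, boolean share).

boolean matches(String regex) // tests whether the whole string matches the regular expression regex

boolean contains(CharSequence s) // tests whether the string contains the character sequence s

String[] split(String regex, int limit) // splits the string around matches of regex; a positive limit yields at most limit pieces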

String[] split(String regex)
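
A few of the methods listed above in action. This is a minimal sketch; CommonMethodsDemo and splitAtMost are hypothetical names, and the sample text is made up for illustration.

```java
import java.util.Arrays;

class CommonMethodsDemo {
    // Hypothetical helper: splits on commas into at most `limit` pieces.
    static String[] splitAtMost(String line, int limit) {
        return line.split(",", limit);
    }

    public static void main(String[] args) {
        String t = "  Hello,World,Java  ".trim();
        System.out.println(t.length());           // 16
        System.out.println(t.charAt(0));          // H
        System.out.println(t.concat("!"));        // Hello,World,Java!
        System.out.println(t.replace('l', 'L'));  // HeLLo,WorLd,Java
        System.out.println(t.contains("World"));  // true
        System.out.println(t.matches("[A-Za-z,]+")); // true: the whole string must match

        System.out.println(Arrays.toString(t.split(",")));       // [Hello, World, Java]
        // With limit 2 the remainder stays in the last element.
        System.out.println(Arrays.toString(splitAtMost(t, 2)));  // [Hello, World,Java]
    }
}
```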

getBytes

public byte[] getBytes(String charsetName)throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

Comparison methods

boolean equals(Object anObject);
boolean contentEquals(StringBuffer sb);
boolean contentEquals(CharSequence cs);
boolean equalsIgnoreCase(String anotherString);
int compareTo(String anotherString);
int compareToIgnoreCase(String str);
boolean regionMatches(int toffset, String other, int ooffset, int len)  // region match
boolean regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)   // region match, optionally ignoring case

The most notable of these is equals:

public boolean equals(Object anObject) {  
        if (this == anObject) {  // same reference: both refer to the same object
            return true;
        }
        if (anObject instanceof String) {  // otherwise compare the contents char by char
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }
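
The two branches of equals correspond to reference equality and content equality. A short sketch of the difference, with EqualsDemo and sameContent as illustrative names:

```java
class EqualsDemo {
    // Hypothetical helper: content comparison that also tolerates a null left side.
    static boolean sameContent(String a, Object b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = new String("hello"); // a distinct object with the same characters

        System.out.println(a == b);            // false: different references
        System.out.println(sameContent(a, b)); // true: same length and same chars

        // equals returns false for anything that is not a String, including null.
        System.out.println(a.equals(null));                         // false
        System.out.println(a.equals(new StringBuilder("hello")));   // false
        // contentEquals compares against any CharSequence instead.
        System.out.println(a.contentEquals(new StringBuilder("hello"))); // true
    }
}
```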

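The region-matching methods compare a region of this string with a region of another string, starting at the given offsets and running for len characters.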
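
A minimal usage sketch for regionMatches follows. RegionMatchesDemo and hasWordAt are hypothetical names chosen for this example.

```java
class RegionMatchesDemo {
    // Hypothetical helper: does `text` contain `word` exactly at `offset`, ignoring case?
    static boolean hasWordAt(String text, int offset, String word) {
        return text.regionMatches(true, offset, word, 0, word.length());
    }

    public static void main(String[] args) {
        String text = "Hello World";

        System.out.println(text.regionMatches(6, "World", 0, 5)); // true
        System.out.println(text.regionMatches(6, "WORLD", 0, 5)); // false: case-sensitive
        System.out.println(hasWordAt(text, 6, "WORLD"));          // true: ignoreCase = true

        // A region that runs past the end returns false instead of throwing.
        System.out.println(text.regionMatches(8, "World", 0, 5)); // false
    }
}
```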

Checking prefixes and suffixes

public boolean startsWith(String prefix, int toffset) {  // prefix: the prefix to look for; toffset: index at which to start comparing
        char ta[] = value;
        int to = toffset;
        char pa[] = prefix.value;
        int po = 0;
        int pc = prefix.value.length;
        // Note: toffset might be near -1>>>1.
        if ((toffset < 0) || (toffset > value.length - pc)) {
            return false;
        }
        while (--pc >= 0) {
            if (ta[to++] != pa[po++]) {
                return false;
            }
        }
        return true;
    }
Similarly:
public boolean startsWith(String prefix) { return startsWith(prefix, 0); }
public boolean endsWith(String suffix) {return startsWith(suffix, value.length - suffix.value.length);}
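
Because endsWith delegates to startsWith with a computed offset, the bounds check in startsWith also covers suffixes longer than the string. A brief sketch, with PrefixSuffixDemo and hasExtension as illustrative names:

```java
class PrefixSuffixDemo {
    // Hypothetical helper: checks a file name for a given extension.
    static boolean hasExtension(String fileName, String extension) {
        return fileName.endsWith("." + extension);
    }

    public static void main(String[] args) {
        String name = "report.pdf";

        System.out.println(name.startsWith("rep"));     // true
        System.out.println(name.startsWith("port", 2)); // true: comparison starts at index 2
        System.out.println(hasExtension(name, "pdf"));  // true

        // Out-of-range offsets and over-long suffixes return false rather than throwing.
        System.out.println(name.startsWith("x", 100));        // false
        System.out.println("a".endsWith("abc"));              // false: offset would be negative
    }
}
```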

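4. Summary

String objects are immutable. Assigning a new value to a string reference changes which object the reference points to; the contents of the original object in memory are not changed.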
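
The distinction between reassigning a reference and changing an object can be sketched as follows. ImmutabilityDemo and appendSuffix are hypothetical names.

```java
class ImmutabilityDemo {
    // Hypothetical helper: returns a new String and leaves its argument untouched.
    static String appendSuffix(String s, String suffix) {
        return s.concat(suffix);
    }

    public static void main(String[] args) {
        String s = "abc";
        String original = s;        // second reference to the same object

        s = appendSuffix(s, "def"); // s now points to a different String

        System.out.println(s);             // abcdef
        System.out.println(original);      // abc: the first object is unchanged
        System.out.println(s == original); // false

        // Methods such as toUpperCase also return new objects.
        original.toUpperCase();
        System.out.println(original);      // abc
    }
}
```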

5. Notes