Java Basics Series: Strings

Date: 2019-12-19


It can be argued that string manipulation is among the most common activities in computer programming.

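Immutable Strings

String objects are immutable. If you look at the JDK documentation, you will find that every method in the String class that appears to modify a String actually creates a brand-new String object containing the modified content, while the original String is left untouched.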

public class Immutable {

    public static String upcase(String s){
        return  s.toUpperCase();
    }

    public static void main(String[] args) {
        String q = "howdy";
        System.out.println(q);
        String qq = upcase(q);
        System.out.println(qq);
        System.out.println(q);
    }
}

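Overloading "+" and StringBuilder

Immutability can have a cost in efficiency. The "+" and "+=" operators for String are the only two overloaded operators in Java, and Java does not allow programmers to overload any operators themselves.

The "+" operator can be used to concatenate Strings: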

public class Concatenation {

    public static void main(String[] args) {
        String mango = "mango";
        String s = "abc" + mango;
        System.out.println(s);
    }
}

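To see how this code actually works, you can disassemble the compiled class with javap, a tool that ships with the JDK (for example, javap -c Concatenation), and read the resulting bytecode.

On JDK 8 and earlier, the compiler automatically introduces the java.lang.StringBuilder class (because it is more efficient), calls append to join the strings, and finally calls toString to produce the result. Since JDK 9, javac by default emits an invokedynamic call to java.lang.invoke.StringConcatFactory instead, so the bytecode you see depends on the compiler version.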

Avoid "+=" Inside Loops

public String implicit(String[] fields) {
    String result = "";
    for (int i = 0; i < fields.length; i++) {
        result += fields[i];
    }
    return result;
}

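Disassembling this method (as compiled by JDK 8 or earlier) shows that the StringBuilder is constructed inside the loop body, which means a new StringBuilder object is created on every pass through the loop.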
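A common remedy is to create one StringBuilder before the loop and append to it. The following is a minimal sketch of that idea; the class name ExplicitConcat and the method name explicit are chosen here for illustration and do not come from the original listing.

```java
class ExplicitConcat {

    // Builds the result with a single StringBuilder that is reused on every iteration.
    static String explicit(String[] fields) {
        StringBuilder result = new StringBuilder();
        for (String field : fields) {
            result.append(field);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(explicit(new String[] {"a", "b", "c"}));
    }
}
```

Only one builder and one final String are allocated here, however many fields there are.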

Unintended Recursion

public class InfiniteRecursion {

    @Override
    public String toString() {
        // Intended to print the object's memory address, but "+ this" calls toString() again
        return " InfiniteRecursion address: " + this + "\n";
    }

    public static void main(String[] args) {
        InfiniteRecursion infiniteRecursion  = new InfiniteRecursion();
        System.out.println(infiniteRecursion);
    }
}

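Running this program ends with a java.lang.StackOverflowError.

The concatenation in toString() triggers an automatic type conversion from InfiniteRecursion to String. The compiler sees a String followed by "+" and then something that is not a String, so it calls this.toString() to convert it, which is a recursive call.

If you really want to print the object's address, call Object.toString() instead: rather than using this, call super.toString(). (Strictly speaking, Object.toString() returns the class name followed by "@" and the hash code in hexadecimal, which is not guaranteed to be a memory address.)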
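The fix described above might look like the sketch below. The class name SafeToString is hypothetical; the only change from the recursive version is that super.toString() replaces this.

```java
class SafeToString {

    @Override
    public String toString() {
        // super.toString() is Object.toString(): class name + "@" + hex hash code.
        // It does not call this method again, so there is no recursion.
        return " SafeToString address: " + super.toString() + "\n";
    }

    public static void main(String[] args) {
        System.out.println(new SafeToString());
    }
}
```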

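Formatted Output

Java SE5 introduced formatted output in the style of C's printf(). Instead of using the overloaded "+" operator to join quoted text and variables, you use special placeholders to mark where the data will go.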

public static void main(String[] args) {
    int x = 5;
    double y = 5.332;

    // the old way
    System.out.println("Row 1: [" + x + " " + y + "]");

    // the new way
    System.out.printf("Row 1: [%d %f]\n", x, y);

    // or
    System.out.format("Row 1: [%d %f]\n", x, y);
}

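When this program runs, the value of x is inserted at %d and then the value of y at %f. These placeholders are called format specifiers; they state not only what type of value will be inserted but also how it should be formatted.

The Formatter Class

In Java, all of the new formatting functionality is handled by the java.util.Formatter class. You can think of Formatter as a translator that turns your format string and data into the desired result.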

Formatter formatter = new Formatter(System.out);
formatter.format("Row 1: [%d %f]\n", x, y);

String.format()

String.format() is a static method that takes the same arguments as Formatter.format() but returns a String object.
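As a small illustration, the sketch below builds the same row as the earlier printf example but returns it as a String. The class name StringFormatDemo and the method name row are made up for this example.

```java
class StringFormatDemo {

    // Returns the formatted row instead of printing it.
    static String row(int x, double y) {
        return String.format("Row 1: [%d %f]", x, y);
    }

    public static void main(String[] args) {
        String line = row(5, 5.332);
        System.out.println(line);
    }
}
```

The decimal separator that %f produces follows the default locale, so the exact text can differ from one machine to another.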

Format Specifiers

To control spacing and alignment when inserting data, you need more elaborate format specifiers. Their general syntax is:

%[argument_index$][flags][width][.precision]conversion

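Field	Description
argument_index	Which argument in the argument list to format
flags	Special formatting options; for example, '-' means left-justified
width	The minimum output width; applies to all argument types
.precision	For a String, the maximum number of characters to print; for a floating-point value, the number of digits after the decimal point. Not applicable to integer types
conversion	The argument type accepted; for example, s takes a String argument and d takes an integer argument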

For a detailed description of each field, see the API documentation for java.util.Formatter.
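To make the fields concrete, here is a short sketch that combines a flag, a width, a precision, and an argument index. The class and method names are hypothetical, and Locale.US is passed only so that the decimal separator is predictable.

```java
import java.util.Locale;

class FormatSpecifierDemo {

    // %-10s : left-justified, at least 10 characters wide
    // %5d   : right-justified integer, at least 5 wide
    // %8.2f : at least 8 wide, 2 digits after the decimal point
    static String line(String name, int qty, double price) {
        return String.format(Locale.US, "%-10s|%5d|%8.2f|", name, qty, price);
    }

    // %1$.3s prints at most 3 characters of argument 1; %1$s prints it again in full.
    static String truncated(String s) {
        return String.format("%1$.3s/%1$s", s);
    }

    public static void main(String[] args) {
        System.out.println(line("Peas", 3, 5.1));      // Peas      |    3|    5.10|
        System.out.println(truncated("Formatter"));    // For/Formatter
    }
}
```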

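Regular Expressions

Regular expressions are a powerful and flexible text-processing tool. With them you can programmatically build complex text patterns and search an input string for them. Once you find the parts that match, you can process them however you like.

Java treats the backslash \ differently from other languages:

In other languages, \\ means "I want to insert a plain (literal) backslash in the regular expression; please give it no special meaning." In Java, \\ means "I am inserting a regular-expression backslash, so the character that follows has special meaning."

For example, to denote a single digit, the regular expression string is \\d. To insert a literal backslash, write \\\\. Things like newline and tab, however, need only a single backslash: \n\t.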
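The escaping rules can be checked with a few one-line methods. This is only a sketch, and the class and method names are invented for the example.

```java
class BackslashDemo {

    static boolean isSingleDigit(String s) {
        // "\\d" in Java source is the two-character regex \d (one digit).
        return s.matches("\\d");
    }

    static boolean isSingleBackslash(String s) {
        // "\\\\" in Java source is the regex \\ which matches one literal backslash.
        return s.matches("\\\\");
    }

    static String[] splitOnTab(String s) {
        // "\t" is an actual tab character, so a single backslash is enough.
        return s.split("\t");
    }

    public static void main(String[] args) {
        System.out.println(isSingleDigit("7"));           // true
        System.out.println(isSingleBackslash("\\"));      // true
        System.out.println(splitOnTab("a\tb").length);    // 2
    }
}
```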

String Methods That Support Regular Expressions

public String[] split(String regex)
public String[] split(String regex, int limit)
public String replaceFirst(String regex, String replacement)
public String replaceAll(String regex, String replacement)
public boolean matches(String regex)
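A brief sketch of these methods in use follows. The sample text and the names StringRegexDemo, splitAll, and splitLimited are assumptions made for illustration.

```java
import java.util.Arrays;

class StringRegexDemo {

    // Splits on every run of digits.
    static String splitAll(String s) {
        return Arrays.toString(s.split("\\d+"));
    }

    // Same separator, but limit = 2 yields at most two pieces.
    static String splitLimited(String s) {
        return Arrays.toString(s.split("\\d+", 2));
    }

    public static void main(String[] args) {
        String text = "one1two22three333four";
        System.out.println(splitAll(text));                   // [one, two, three, four]
        System.out.println(splitLimited(text));               // [one, two22three333four]
        System.out.println(text.replaceFirst("\\d+", "-"));   // one-two22three333four
        System.out.println(text.replaceAll("\\d+", "-"));     // one-two-three-four
        // matches() must match the entire string, not just part of it
        System.out.println(text.matches("[a-z]+"));           // false
        System.out.println(text.matches("[a-z0-9]+"));        // true
    }
}
```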

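Pattern and Matcher

In general, rather than rely on the limited regular-expression support in the String class, you will prefer to build a full regular-expression object. Compile your regular expression with Pattern.compile(), which produces a Pattern object from the String form of the expression. Then pass the string you want to search to the Pattern object's matcher() method, which produces a Matcher object with many useful operations.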

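import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExpression {

    public static void main(String[] args) {
        String phone = "18926119073";
        // The pattern is split across two string literals only for readability
        Pattern pattern = Pattern.compile("1([358][0-9]|4[579]|66"
                + "|7[0135678]|9[89])[0-9]{8}");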
        Matcher matcher = pattern.matcher(phone);
        System.out.println(matcher.matches());
    }
}

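Groups

Groups are parts of a regular expression set off by parentheses, and they can be referred to by group number. Group 0 is the whole expression, group 1 is the first parenthesized group, and so on. For example, the expression

A(B(C))D

has three groups: group 0 is ABCD, group 1 is BC, and group 2 is C.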
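The group numbering can be confirmed with Matcher.group(int) and Matcher.groupCount(). The helper below is a sketch; the class name GroupDemo and the method name groups are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Returns groups 0..groupCount() of the first match, or an empty array if nothing matches.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints ABCD, BC and C on separate lines
        for (String g : groups("A(B(C))D", "xxABCDxx")) {
            System.out.println(g);
        }
    }
}
```

Note that groupCount() does not count group 0, which is why the array has groupCount() + 1 entries.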

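Matcher offers many useful methods; see the API documentation for details. Its replacement methods can be used, for example, to mask the middle digits of a phone number or to hide part of a user name.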

String phone = "18926119073";
Pattern pattern = Pattern.compile("(\\d{3})\\d{4}(\\d{4})");
Matcher matcher = pattern.matcher(phone);
phone = matcher.replaceAll("$1****$2");
System.out.println(phone);

Scanning with Regular Expressions

Java SE5 added the Scanner class, which greatly reduces the work of scanning input.

import java.util.Scanner;

public class ScannerRead {

    public static void main(String[] args) {
        // Input is supplied from a String here instead of System.in
        Scanner scanner = new Scanner("Sir Robin of Camelot\n22 1.61803");
        System.out.println("What is your name?");
        String name = scanner.nextLine();
        System.out.println(name);
        System.out.println("(input: <age> <double>)");
        System.out.println(scanner.nextInt()+" "+ scanner.nextDouble());
    }
}

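The Scanner constructor accepts many kinds of input source, including File, InputStream, String, and Readable objects.

With Scanner, all input, tokenizing, and conversion are hidden inside the various next methods. The plain next() returns the next token as a String, and every primitive type except char has a corresponding next method, as do BigDecimal and BigInteger. Each next method returns only once a complete token is available, and the corresponding hasNext methods report whether the next token is of the required type.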
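The pairing of hasNext and next methods can be sketched as follows. The class name ScannerTokens, the method name classify, and the "int:"/"word:" labels are all invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScannerTokens {

    // Labels each whitespace-separated token as an int or a plain word.
    static List<String> classify(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNext()) {
                if (scanner.hasNextInt()) {
                    // Safe to call nextInt() because hasNextInt() returned true
                    out.add("int:" + scanner.nextInt());
                } else {
                    out.add("word:" + scanner.next());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(classify("age 22 x 7"));
    }
}
```

Checking hasNextInt() first avoids the InputMismatchException that nextInt() throws when the next token is not an integer.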

By default, Scanner splits input into tokens on whitespace, but you can specify your own delimiter with a regular expression:

import java.util.Scanner;

public class ScannerDelimiter {

    public static void main(String[] args) {
        Scanner scanner = new Scanner("12,42,78,99");
        scanner.useDelimiter(",");
        while (scanner.hasNextInt()){
            System.out.println(scanner.nextInt());
        }
    }
}

Besides scanning for primitive types, you can also scan with your own regular expressions, as shown here:

import java.util.Scanner;
import java.util.regex.MatchResult;

public class ThreatAnalyzer {

    static String threatData =
            "58.27.82.161@02/10/2005\n" +
            "124.45.82.161@02/10/2005\n" +
            "58.27.82.161@02/10/2005\n" +
            "72.27.82.161@02/10/2005\n" +
             "[Next log section with different data format]";

    public static void main(String[] args) {
        Scanner scanner = new Scanner(threatData);
        String pattern = "(\\d+[.]\\d+[.]\\d+[.]\\d+)@(\\d{2}/\\d{2}/\\d{4})";
        while (scanner.hasNext(pattern)) {
            scanner.next(pattern);
            MatchResult match = scanner.match();
            String ip = match.group(1);
            String date = match.group(2);
            System.out.format("Threat on %s from %s\n", date, ip);
        }
    }
}