Requirement: replace each of the following characters with a space:
!#$%&()[]*+-@?{|}~¢£¤¥¦§©ª«¬®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_\\^þÞ¡¨!<>\'*˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚「」„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√fflffiflfiff◊≥≤≠≈∫∞ѲҐΏГПѝѢ
The natural choice is String's replaceAll. The JDK defines the method as follows:
/**
 * Replaces each substring of this string that matches the given <a
 * href="../util/regex/Pattern.html#sum">regular expression</a> with the
 * given replacement.
 *
 * <p> An invocation of this method of the form
 * <i>str</i>{@code .replaceAll(}<i>regex</i>{@code ,} <i>repl</i>{@code )}
 * yields exactly the same result as the expression
 *
 * <blockquote>
 * <code>
 * {@link java.util.regex.Pattern}.{@link
 * java.util.regex.Pattern#compile compile}(<i>regex</i>).{@link
 * java.util.regex.Pattern#matcher(java.lang.CharSequence) matcher}(<i>str</i>).{@link
 * java.util.regex.Matcher#replaceAll replaceAll}(<i>repl</i>)
 * </code>
 * </blockquote>
 *
 *<p>
 * Note that backslashes ({@code \}) and dollar signs ({@code $}) in the
 * replacement string may cause the results to be different than if it were
 * being treated as a literal replacement string; see
 * {@link java.util.regex.Matcher#replaceAll Matcher.replaceAll}.
 * Use {@link java.util.regex.Matcher#quoteReplacement} to suppress the special
 * meaning of these characters, if desired.
 *
 * @param regex
 *        the regular expression to which this string is to be matched
 * @param replacement
 *        the string to be substituted for each match
 *
 * @return The resulting {@code String}
 *
 * @throws PatternSyntaxException
 *         if the regular expression's syntax is invalid
 *
 * @see java.util.regex.Pattern
 *
 * @since 1.4
 * @spec JSR-51
 */
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
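To make the equivalence stated in the Javadoc concrete, here is a minimal sketch. The class and method names (ReplaceAllEquivalence, viaPattern, replaceLiteral) are made up for illustration; only String.replaceAll, Pattern.compile, Matcher.replaceAll and Matcher.quoteReplacement come from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceAllEquivalence {
    // The long form that String.replaceAll delegates to, per the JDK source quoted above.
    static String viaPattern(String input, String regex, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Hypothetical helper: treats the replacement as literal text,
    // so '$' and '\' in it lose their special meaning.
    static String replaceLiteral(String input, String regex, String replacement) {
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Both calls go through the same Pattern/Matcher machinery.
        System.out.println("a.b.c".replaceAll("\\.", " "));      // a b c
        System.out.println(viaPattern("a.b.c", "\\.", " "));     // a b c
        // The replacement "$5" is inserted literally thanks to quoteReplacement.
        System.out.println(replaceLiteral("price: X", "X", "$5")); // price: $5
    }
}
```

Since the replacement used in this article is a single space, quoteReplacement is not needed here; it only matters when the replacement itself may contain $ or \.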
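The first parameter is a regular expression. Put the characters to be replaced inside [] and pass that as the first argument. That is not all: any of those characters that are regex metacharacters must also be escaped.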
See the following link for the list of special characters: link
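The escaping requirement can be illustrated with a small sketch. The class EscapedCharClass and its methods are hypothetical names chosen for this example, and the character set is deliberately tiny; the point is only that each backslash in the regex has to be written twice in a Java string literal.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class EscapedCharClass {
    // The Java literal "[\\$\\(\\)\\[\\]]" reaches the regex engine as [\$\(\)\[\]],
    // a character class matching '$', '(', ')', '[' and ']'.
    static String blankOut(String input) {
        return input.replaceAll("[\\$\\(\\)\\[\\]]", " ");
    }

    // Reports whether the regex compiles at all.
    static boolean isValid(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(blankOut("f(x)=$[1]"));
        // "[\\]" is the regex [\] : the backslash escapes ']' and the class never closes.
        System.out.println(isValid("[\\]"));   // false
        // "[\\\\]" is the regex [\\] : a class containing one backslash.
        System.out.println(isValid("[\\\\]")); // true
    }
}
```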
Pull the special characters out and replace them separately. The code is as follows:
result = result.replaceAll("[\\$\\(\\)\\*\\+\\.\\[\\]\\?\\\\^\\{\\}\\|]", " ");
result = result.replaceAll("[!#%&-@~¢£¤¥¦§©ª«¬\u00AD®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_þÞ¡¨!<>'˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚「」„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√fflffiflfiff◊≥≤≠≈∫∞ѲҐΏГПѝѢ]", " ");
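Testing this showed that digits were being replaced as well, which was odd. Bisecting the character list to find the faulty part eventually pinned the problem down to &-@. Inside a character class the hyphen is also a special character: it defines a range, so every character whose ASCII code lies between & (38) and @ (64), such as digits, parentheses, the asterisk and the plus sign, matches the regular expression. Moving the hyphen out and escaping it as well fixes the problem, as follows: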
result = result.replaceAll("[\\$\\(\\)\\*\\+\\.\\[\\]\\?\\\\^\\{\\}\\|\\-]", " ");
result = result.replaceAll("[!#%&@~¢£¤¥¦§©ª«¬\u00AD®¯°±²³µ¶¹º»¼«½¾¿×~‘’`_þÞ¡¨!<>'˝´\"ſß÷ΓΔΘΛΞΠΣΦΨΩγδθΛΦЂЃЉЊЋЍЏБДЖЗИЙЛФЦШЧЩЪЫЬЭЮЯ‐–—―‘’‚「」„†‡…•‰‹›‽₂₁₀ⁿ⁾⁽⁼⁻⁺⁹⁸⁷⁶⁵⁴⁰⁄₃₄₅₆₇₈₉₊₋₌₎₍€℅ℓ№℗⅟⅞⅝⅜⅛⅚⅙⅘⅗⅖⅕⅔⅓℮Ω™℠←↑→↓↔↕↖↗↘↙∂∆∏∑−∙√fflffiflfiff◊≥≤≠≈∫∞ѲҐΏГПѝѢ]", " ");
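The hyphen behaviour is easy to reproduce in isolation. The sketch below is not the code from this article: HyphenInCharClass and its methods are hypothetical names. It contrasts the unescaped range with the escaped hyphen, and adds one possible helper, toCharClass, that escapes every ASCII character that is not a letter or digit, so that a forgotten metacharacter cannot silently turn into a range. This relies on the documented rule that java.util.regex accepts a backslash before any non-alphabetic character.

```java
final class HyphenInCharClass {
    // Unescaped: &-@ is a range covering code points 38..64, digits included.
    static String withRange(String input) {
        return input.replaceAll("[&-@]", " ");
    }

    // Escaped: the class matches only '&', '-' and '@'.
    static String withEscapedHyphen(String input) {
        return input.replaceAll("[&\\-@]", " ");
    }

    // Hypothetical helper: builds a character class from a literal list of
    // characters, prefixing every ASCII non-alphanumeric with a backslash.
    static String toCharClass(String chars) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < chars.length(); i++) {
            char c = chars.charAt(i);
            if (c < 128 && !Character.isLetterOrDigit(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(withRange("a1&b-2@c"));         // digits are blanked too
        System.out.println(withEscapedHyphen("a1&b-2@c")); // a1 b 2 c
        System.out.println("a1&b-2@c".replaceAll(toCharClass("&-@"), " ")); // a1 b 2 c
    }
}
```

With a helper along these lines the whole list could be passed as one literal string, at the cost of a few redundant backslashes in the generated pattern.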