Using mb_convert_case in PHP

Date: 2019-11-05


mb_convert_case

(PHP 4 >= 4.3.0, PHP 5, PHP 7)

mb_convert_case — Perform case folding on a string

Description

string mb_convert_case ( string $str , int $mode [, string $encoding = mb_internal_encoding() ] )
// Performs case folding on a string, converted in the way specified by $mode.

Parameters

str

The string being converted.

mode

The mode of the conversion. It can be one of MB_CASE_UPPER, MB_CASE_LOWER, or MB_CASE_TITLE.

encoding

The character encoding of str. If it is omitted, the internal character encoding value (as returned by mb_internal_encoding()) is used.
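
As a minimal sketch of the default described above, the following compares a call that omits encoding with one that passes it explicitly. It assumes the mbstring extension is loaded and PHP 7 or later (for the "\u{...}" escapes); the variable names are illustrative only.

```php
<?php
// Set the internal encoding that mb_* functions fall back to.
mb_internal_encoding('UTF-8');

// "café", with é written as an escape so the file's own encoding does not matter.
$text = "caf\u{E9}";

// Encoding omitted: the internal encoding (UTF-8 here) is used.
$implicit = mb_convert_case($text, MB_CASE_UPPER);

// Encoding passed explicitly.
$explicit = mb_convert_case($text, MB_CASE_UPPER, 'UTF-8');

var_dump($implicit === $explicit); // bool(true)
```

Passing the encoding explicitly is usually the safer habit, since the result then does not depend on configuration set elsewhere in the application.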

Return Values

A case-folded version of str, converted in the way specified by mode.

Examples

<?php
/**
 * Created by PhpStorm.
 * User: zhangrongxiang
 * Date: 2018/1/28
 * Time: 3:16 PM
 */

/** Example #1: mb_convert_case() example */
$str = "mary had a Little lamb and she loved it so";
$str = mb_convert_case( $str, MB_CASE_UPPER, "UTF-8" );
echo $str . PHP_EOL; // Output: MARY HAD A LITTLE LAMB AND SHE LOVED IT SO
$str = mb_convert_case( $str, MB_CASE_TITLE, "UTF-8" );
echo $str . PHP_EOL; // Output: Mary Had A Little Lamb And She Loved It So

/** Example #2: mb_convert_case() example with non-Latin UTF-8 text */
$str = "Τάχιστη αλώπηξ βαφής ψημένη γη, δρασκελίζει υπέρ νωθρού κυνός";
$str = mb_convert_case( $str, MB_CASE_UPPER, "UTF-8" );
echo $str . PHP_EOL; // Output: ΤΆΧΙΣΤΗ ΑΛΏΠΗΞ ΒΑΦΉΣ ΨΗΜΈΝΗ ΓΗ, ΔΡΑΣΚΕΛΊΖΕΙ ΥΠΈΡ ΝΩΘΡΟΎ ΚΥΝΌΣ
$str = mb_convert_case( $str, MB_CASE_TITLE, "UTF-8" );
echo $str . PHP_EOL; // Output: Τάχιστη Αλώπηξ Βαφήσ Ψημένη Γη, Δρασκελίζει Υπέρ Νωθρού Κυνόσ

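/**
 * See also:
 * mb_strtolower() - Make a string lowercase
 * mb_strtoupper() - Make a string uppercase
 * strtolower() - Convert a string to lowercase
 * strtoupper() - Convert a string to uppercase
 * ucfirst() - Convert the first character of a string to uppercase
 * ucwords() - Convert the first character of each word in a string to uppercase
 */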

echo mb_convert_case( 'AAA "aaa"', MB_CASE_TITLE ) . PHP_EOL; //Aaa "aaa"
// but  I want this ===> AAA "Aaa"

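// Upper-cases the first letter of every word and leaves the other letters
// untouched; any non-letter character (such as a quote) ends the current word.
function mb_convert_case_utf8_variation( $s ) {
    // Split the UTF-8 string into individual characters.
    $arr    = preg_split( "//u", $s, - 1, PREG_SPLIT_NO_EMPTY );
    $result = "";
    $mode   = false; // true while we are inside a word
    foreach ( $arr as $char ) {
        // Does this character count as part of a word (letter, mark, modifier, ...)?
        $res = preg_match(
                   '/\\p{Mn}|\\p{Me}|\\p{Cf}|\\p{Lm}|\\p{Sk}|\\p{Lu}|\\p{Ll}|' .
                   '\\p{Lt}|\\p{Cs}/u', $char ) == 1;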
        if ( $mode ) {
            if ( ! $res ) {
                $mode = false;
            }
        } elseif ( $res ) {
            $mode = true;
            $char = mb_convert_case( $char, MB_CASE_TITLE, "UTF-8" );
        }
        $result .= $char;
    }
    
    return $result;
}

echo mb_convert_case_utf8_variation('AAA "aaa"').PHP_EOL;
//AAA "Aaa"

echo mb_convert_case("Hello 中國",MB_CASE_UPPER).PHP_EOL;//HELLO 中國
echo mb_convert_case("Hello 中國",MB_CASE_UPPER,"GBK").PHP_EOL;//HELLO 中國

Extension

Unicode

Unlike the standard case folding functions such as strtolower() and strtoupper(), this function performs case folding on the basis of Unicode character properties. Its behaviour is therefore not affected by locale settings, and it can convert any character that has the 'alphabetic' property, such as A-umlaut (Ä).
For more information about the Unicode properties, please see http://www.unicode.org/unicod...
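
The difference can be illustrated with a short sketch. It assumes the mbstring extension and PHP 7 or later; the variable names are illustrative. strtoupper() works on bytes, so how it treats the two bytes of "ä" depends on the PHP version and, before PHP 8, on the locale, which is why no fixed output is claimed for it here.

```php
<?php
// "äpfel", with ä (U+00E4) written as an escape so the file encoding does not matter.
$word = "\u{E4}pfel";

// Byte-oriented conversion: only reliably handles ASCII letters.
$byteUpper = strtoupper($word);

// Unicode-aware conversion: ä becomes Ä regardless of locale.
$mbUpper = mb_convert_case($word, MB_CASE_UPPER, 'UTF-8');

echo $mbUpper, PHP_EOL; // ÄPFEL

// Expected to be false wherever strtoupper() leaves non-ASCII bytes alone
// (for example PHP 8, or the "C" locale on earlier versions).
var_dump($byteUpper === $mbUpper);
```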

UTF-8 encoding rules

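For a single-byte symbol, the first bit of the byte is set to 0 and the remaining 7 bits hold the symbol's Unicode code point.

For English letters, therefore, the UTF-8 encoding is identical to ASCII.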

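For an n-byte symbol (n > 1), the first n bits of the first byte are set to 1 and bit n + 1 is set to 0; the first two bits of every following byte are set to 10. All remaining bits not mentioned here hold the symbol's Unicode code point.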

Unicode range       | UTF-8 encoding
(hexadecimal)       | (binary)
--------------------+---------------------------------------------
0000 0000-0000 007F | 0xxxxxxx
0000 0080-0000 07FF | 110xxxxx 10xxxxxx
0000 0800-0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx
0001 0000-0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
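
The table can be turned into code fairly directly. The following is a sketch only: encode_code_point_utf8() is a hypothetical helper written for illustration, not a PHP built-in, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: encode one Unicode code point as UTF-8 bytes,
// following the four rows of the table above.
function encode_code_point_utf8(int $cp): string
{
    // Reject values outside Unicode and the surrogate range.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException('Not a valid Unicode code point');
    }
    if ($cp <= 0x7F) {
        // 0xxxxxxx
        return chr($cp);
    }
    if ($cp <= 0x7FF) {
        // 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp <= 0xFFFF) {
        // 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo bin2hex(encode_code_point_utf8(0x41)), PHP_EOL;   // 41
echo bin2hex(encode_code_point_utf8(0xC4)), PHP_EOL;   // c384
echo bin2hex(encode_code_point_utf8(0x4E2D)), PHP_EOL; // e4b8ad
```

In real code, mb_chr() (PHP 7.2+) does the same job; the manual version is shown only to make the bit layout visible.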

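With the table above, reading UTF-8 is straightforward. If the first bit of a byte is 0, that byte is a complete character on its own. If the first bit is 1, the number of consecutive leading 1 bits tells how many bytes the current character occupies, because the first byte of every multi-byte UTF-8 sequence starts with 1 (the following bytes all start with 10).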
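
The reading rule can be sketched as follows. Both functions are hypothetical helpers for illustration (they are not part of PHP), they assume PHP 7 or later, and they only check lead bytes and string length rather than fully validating the input.

```php
<?php
// Hypothetical helper: number of bytes in the sequence that starts with this
// byte, determined by counting its leading 1 bits. Returns 0 for a
// continuation byte (10xxxxxx) or an invalid lead byte.
function utf8_sequence_length(string $byte): int
{
    $b = ord($byte[0]);
    if (($b & 0x80) === 0x00) { return 1; } // 0xxxxxxx
    if (($b & 0xE0) === 0xC0) { return 2; } // 110xxxxx
    if (($b & 0xF0) === 0xE0) { return 3; } // 1110xxxx
    if (($b & 0xF8) === 0xF0) { return 4; } // 11110xxx
    return 0;
}

// Hypothetical helper: split a UTF-8 string into characters by reading each
// lead byte and then skipping ahead by the length it announces.
function utf8_split(string $s): array
{
    $chars = [];
    $i = 0;
    $n = strlen($s);
    while ($i < $n) {
        $len = utf8_sequence_length($s[$i]);
        if ($len === 0 || $i + $len > $n) {
            throw new InvalidArgumentException("Malformed UTF-8 at byte $i");
        }
        $chars[] = substr($s, $i, $len);
        $i += $len;
    }
    return $chars;
}

// "Hi 中": three single-byte characters followed by one three-byte character.
$parts = utf8_split("Hi \u{4E2D}");
echo count($parts), PHP_EOL; // 4
```

This is the same job that preg_split( "//u", ... ) performs in the earlier example, done by hand to show how the lead byte drives the decoding.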