SQL Cookbook (6): Working with Strings

Walking a String

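For example, take the row of table emp whose ename equals KING and split that string into 4 rows, one character per row.
This relies on the helper table t10, whose id column is assumed to hold the integers 1 through 10.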

select substr(e.ename, iter.pos, 1) as C
    from (select ename from emp where ename = 'KING') e,
         (select id as pos from t10) iter
where iter.pos <= length(e.ename);


Looking at the Cartesian product produced by the FROM clause makes it clear how this works:

select ename, iter.pos
    from (select ename from emp where ename = 'KING') e,
         (select id as pos from t10) iter;
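
The two queries above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle or MySQL, and the table contents are assumptions made for illustration (emp holds two sample names, t10 holds the ids 1 to 10).

```python
import sqlite3

# In-memory stand-ins for the article's emp and t10 tables (sample rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text)")
conn.executemany("insert into emp values (?)", [("KING",), ("SMITH",)])
conn.execute("create table t10 (id integer)")
conn.executemany("insert into t10 values (?)", [(i,) for i in range(1, 11)])

# Step 1: the bare Cartesian product pairs KING with every id in t10 (10 rows).
product = conn.execute(
    """select ename, iter.pos
         from (select ename from emp where ename = 'KING') e,
              (select id as pos from t10) iter
        order by iter.pos"""
).fetchall()

# Step 2: the WHERE clause keeps only positions inside the string,
# and substr() picks the character at each remaining position.
chars = [row[0] for row in conn.execute(
    """select substr(e.ename, iter.pos, 1) as c
         from (select ename from emp where ename = 'KING') e,
              (select id as pos from t10) iter
        where iter.pos <= length(e.ename)
        order by iter.pos"""
)]

print(len(product))  # 10
print(chars)         # ['K', 'I', 'N', 'G']
```

The ORDER BY is added only to make the output deterministic. The original queries do not specify an order.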


Embedding Quotes

select 'g''day mate' qmarks from t1 union all
select 'beavers''teeth' from t1 union all
select '''' from t1;
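
Inside a string literal, two consecutive single quotes stand for one literal quote, so '''' is a string containing a single quote character. A minimal sketch with Python's sqlite3 module (an assumption; the article targets Oracle and MySQL, and the FROM t1 clause is dropped here because SQLite allows SELECT without a table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each doubled quote inside the literal becomes one quote in the result.
rows = [r[0] for r in conn.execute(
    "select 'g''day mate' as qmarks union all "
    "select 'beavers''teeth' union all "
    "select ''''"
)]

print(rows)  # ["g'day mate", "beavers'teeth", "'"]
```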


Counting the Occurrences of a Character in a String

select (length('HELLO HELLO') - 
        length(replace('HELLO HELLO', 'LL', '')))/length('LL')
    as cnt
from t1;
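
The arithmetic behind this query: removing every 'LL' shortens the 11-character string to 'HEO HEO' (7 characters), and the difference of 4 is divided by the length of the search string, 2, giving 2 occurrences. A small Python sketch of the same calculation (the function name is hypothetical):

```python
def count_occurrences(text: str, sub: str) -> int:
    """Count non-overlapping occurrences of sub, mirroring the SQL expression."""
    removed = text.replace(sub, "")
    # Characters removed, divided by the length of each removed piece.
    return (len(text) - len(removed)) // len(sub)

print(count_occurrences("HELLO HELLO", "LL"))  # 2
print(count_occurrences("HELLO HELLO", "L"))   # 4
```

Without the division, searching for 'LL' would report 4, the number of characters removed rather than the number of occurrences.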


Removing Unwanted Characters

For example, remove the vowels from ename in table emp and remove every 0 from sal:
Oracle

select ename, 
       replace(translate(ename, 'AEIOU', 'aaaaa'), 'a', '') as stripped1,
       sal,
       replace(sal, 0, '') as stripped2
    from emp;

MySQL does not provide a TRANSLATE function, so REPLACE has to be called repeatedly:

select ename,
       replace(
       replace(
       replace(
       replace(
       replace(ename, 'A', ''), 'E', ''), 'I', ''), 'O', ''), 'U', '') as stripped1,
       sal,
       replace(sal, 0, '') as stripped2
    from emp;
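
The Oracle version first maps every vowel to the same placeholder character, a lowercase 'a', and then removes that placeholder with a single REPLACE. The detour is needed because Oracle's TRANSLATE returns NULL when its third argument is an empty string. The sketch below imitates both approaches in Python to show that they give the same result; the function names and sample values are assumptions.

```python
from functools import reduce

def strip_vowels_translate(ename: str) -> str:
    # Oracle style: translate(ename, 'AEIOU', 'aaaaa'), then replace(..., 'a', '').
    return ename.translate(str.maketrans("AEIOU", "aaaaa")).replace("a", "")

def strip_vowels_replace(ename: str) -> str:
    # MySQL style: one nested replace() call per vowel.
    return reduce(lambda s, vowel: s.replace(vowel, ""), "AEIOU", ename)

def strip_zeros(sal: int) -> str:
    # replace(sal, 0, ''): the number is treated as text.
    return str(sal).replace("0", "")

print(strip_vowels_translate("ALLEN"))  # LLN
print(strip_vowels_replace("ALLEN"))    # LLN
print(strip_zeros(1600))                # 16
```

The placeholder trick only works if the placeholder never occurs in the data, which holds here because the names are assumed to be uppercase.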

Separating Numeric and Character Data

Consider the following result set:

select ename || cast(sal as char(4)) as data from emp;

The goal is to split the digits and the letters into two separate columns.

Oracle

select replace(
        translate(data, '0123456789', '0000000000'), '0', '') ename,
       to_number(
        replace(
          translate(lower(data),
            'abcdefghijklmnopqrstuvwxyz',
            rpad('z', 26, 'z')), 'z')) sal
    from (
    select ename || cast(sal as char(4)) data 
        from emp ) x;

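In the query above, rpad('z', 26, 'z') simply builds a string of 26 z characters. RPAD pads a string on the right with the specified characters:
rpad(string, padded_length, [pad_string])
string: the string to pad.
padded_length: the length of the returned string. If it is shorter than the original string, RPAD truncates the string to its first padded_length characters, counting from the left.
pad_string: optional; the string appended to the right of string. If it is omitted, RPAD pads with spaces.
For example:
rpad('tech', 7); returns 'tech   ' (three trailing spaces)
rpad('tech', 2); returns 'te'
rpad('tech', 8, '0'); returns 'tech0000'
rpad('tech on the net', 15, 'z'); returns 'tech on the net' (the original string is already 15 characters long, so nothing is added or cut)
rpad('tech on the net', 16, 'z'); returns 'tech on the netz'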
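
The RPAD behaviour described above can be imitated in a few lines of Python. This is only a sketch of the documented padding and truncation rules, assuming a non-empty pad_string; it does not reproduce Oracle's NULL handling.

```python
def rpad(string: str, padded_length: int, pad_string: str = " ") -> str:
    """Pad string on the right to padded_length, or truncate it from the left."""
    if padded_length <= len(string):
        # Shorter target: keep only the first padded_length characters.
        return string[:padded_length]
    missing = padded_length - len(string)
    # Repeat the pad string and cut it to exactly the missing length.
    fill = (pad_string * missing)[:missing]
    return string + fill

print(repr(rpad("tech", 7)))           # 'tech   '
print(repr(rpad("tech", 2)))           # 'te'
print(repr(rpad("tech", 8, "0")))      # 'tech0000'
print(repr(rpad("z", 26, "z")))        # 26 z characters, as used in the query
```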

Determining Whether a String Is Alphanumeric

Consider the following view:
Oracle

create view V as 
select ename as data
    from emp
  where deptno = 10
  union all 
select ename || ', $' || cast(sal as char(4)) || '.00' as data
    from emp
  where deptno = 20
  union all
select ename || cast(deptno as char(4)) as data
    from emp
  where deptno = 30;

MySQL

create view V as 
select ename as data
    from emp
  where deptno = 10
  union all 
select concat(ename, ', $', sal, '.00') as data
    from emp
  where deptno = 20
  union all
select concat(ename, deptno) as data
    from emp
  where deptno = 30;

The goal is to return only the rows that contain nothing other than letters and digits:

Oracle

select data from V
    where translate(lower(data),
        '0123456789abcdefghijklmnopqrstuvwxyz ',
        rpad('a', 37, 'a')) = rpad('a', length(data), 'a');

MySQL

select data from V
    where data regexp '[^0-9a-zA-Z]' = 0;

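Note: view V is created with char(4), which is fixed-length, so cast(deptno as char(4)) is padded with spaces whenever the value is shorter than 4 characters. The spaces therefore have to be handled, which is why the TRANSLATE source list in the Oracle query ends with a space. The statement on p. 96 of the original book is incorrect.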
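
The Oracle check works by mapping every allowed character (digits, lowercase letters, and the space) to 'a' and then comparing the result with a string of 'a' characters of the same length; any character outside the allowed set survives the mapping and breaks the equality. A Python sketch of the same idea, with sample rows that are assumptions modelled on the view:

```python
import string

# The 37 allowed characters: 10 digits, 26 lowercase letters, and a space.
ALLOWED = string.digits + string.ascii_lowercase + " "
TO_A = str.maketrans(ALLOWED, "a" * len(ALLOWED))

def is_alphanumeric(data: str) -> bool:
    # translate(lower(data), allowed, 'aaa...') = rpad('a', length(data), 'a')
    return data.lower().translate(TO_A) == "a" * len(data)

# 'ALLEN30  ' carries the trailing spaces produced by cast(deptno as char(4)).
rows = ["CLARK", "SMITH, $800.00", "ALLEN30  "]
print([r for r in rows if is_alphanumeric(r)])  # ['CLARK', 'ALLEN30  ']
```

Dropping the space from ALLOWED would reject 'ALLEN30  ', which is the padding problem the note describes.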

Creating a Delimited List

Consider transforming the following result set:

select deptno, ename as emps from emp order by deptno;

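The desired result has one row per deptno, with the names of that department's employees joined into a single comma-separated emps value.

Oracle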

select deptno,
       ltrim(sys_connect_by_path(ename, ','), ',') emps
    from (
select deptno, ename,
       row_number() over (partition by deptno order by empno) rn,
       count(*) over (partition by deptno) cnt
    from emp
    )
where level = cnt
    start with rn = 1
    connect by prior deptno = deptno and prior rn = rn-1;

MySQL

select deptno,
       group_concat(ename order by empno separator ',') as emps
    from emp
group by deptno;

The GROUP_CONCAT function does all the work: it concatenates the ENAME values passed to it. Because it is an aggregate function, the query needs a GROUP BY clause.
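
SQLite also has a group_concat aggregate, so the grouping behaviour can be tried with Python's sqlite3 module. This is a sketch under assumptions: the sample rows are illustrative, and the two-argument SQLite form used here does not take MySQL's ORDER BY or SEPARATOR keywords, so the order of names inside each list is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno integer, ename text, deptno integer)")
# Sample rows (assumed for illustration).
conn.executemany("insert into emp values (?, ?, ?)", [
    (7782, "CLARK", 10), (7839, "KING", 10), (7934, "MILLER", 10),
    (7369, "SMITH", 20), (7566, "JONES", 20),
])

# One output row per department; the second argument is the separator.
rows = conn.execute(
    """select deptno, group_concat(ename, ',') as emps
         from emp
        group by deptno
        order by deptno"""
).fetchall()

for deptno, emps in rows:
    print(deptno, emps)
```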

Analysis of the Oracle statement:
Consider the subquery in the FROM clause:

select deptno, ename,
       row_number() over (partition by deptno order by empno) rn,
       count(*) over (partition by deptno) cnt
    from emp;

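Here rn numbers the employees within each department in empno order, and cnt is the number of employees in that department. The outer query walks from rn = 1 upward within a department and keeps only the row where level = cnt, that is, the path that contains every name.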

Alphabetizing a String

Suppose the characters of each ename should be rearranged into alphabetical order, so that, for example, ADAMS becomes AADMS.

MySQL

select ename old_ename, group_concat(c order by c separator '') new_ename
    from (
        select ename, substr(a.ename, iter.pos, 1) c
            from emp a,
                (select id pos from t10) iter
            where iter.pos <= length(a.ename)
        ) x
    group by ename;

Consider the subquery in the FROM clause:

select ename, substr(a.ename, iter.pos, 1) c
            from emp a,
                (select id pos from t10) iter
            where iter.pos <= length(a.ename)

The subquery returns one row per character of each name. GROUP_CONCAT then not only concatenates the letters but also sorts them alphabetically.
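
The two steps of the MySQL solution, splitting each name into one character per position and then joining the characters back in sorted order, can be sketched in Python. The function name and the default range of positions (standing in for the ids 1 to 10 of t10) are assumptions.

```python
def alphabetize(name: str, positions=range(1, 11)) -> str:
    # One "row" per character, like substr(ename, pos, 1) joined to t10.
    chars = [name[pos - 1] for pos in positions if pos <= len(name)]
    # group_concat(c order by c separator ''): sort, then join with no separator.
    return "".join(sorted(chars))

print(alphabetize("ADAMS"))  # AADMS
print(alphabetize("KING"))   # GIKN
```

As in the SQL, a name longer than the number of available positions would lose its trailing characters, so the helper table must have at least as many rows as the longest name.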

In Oracle, use the SYS_CONNECT_BY_PATH function to build the list iteratively:

select old_name, new_name
    from (
        select old_name, replace(sys_connect_by_path(c, ' '), ' ') new_name
            from (
                select e.ename old_name, 
                       row_number() over (partition by e.ename 
                                            order by substr(e.ename, iter.pos, 1)) rn,
                       substr(e.ename, iter.pos, 1) c
                from emp e,
                    (select rownum pos from emp) iter
            where iter.pos <= length(e.ename)
            order by 1
        ) x
    start with rn=1
connect by prior rn = rn-1 and prior old_name = old_name
) 
where length(old_name) = length(new_name);

Analysis: start with the innermost subquery:

select e.ename old_name, 
                       row_number() over (partition by e.ename 
                                            order by substr(e.ename, iter.pos, 1)) rn,
                       substr(e.ename, iter.pos, 1) c
                from emp e,
                    (select rownum pos from emp) iter
            where iter.pos <= length(e.ename)
            order by 1

Next, take the sorted characters and rebuild each name. SYS_CONNECT_BY_PATH concatenates all the characters in order:

select old_name, replace(sys_connect_by_path(c, ' '), ' ') new_name
            from (
                select e.ename old_name, 
                       row_number() over (partition by e.ename 
                                            order by substr(e.ename, iter.pos, 1)) rn,
                       substr(e.ename, iter.pos, 1) c
                from emp e,
                    (select rownum pos from emp) iter
            where iter.pos <= length(e.ename)
            order by 1
            ) x
        start with rn=1
    connect by prior rn = rn-1 and prior old_name = old_name

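The hierarchical query also returns the partial strings built along the way, so the final step keeps only the strings that have the same length as the original name.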

Extracting the nth Delimited Substring

Consider the following view:

create view V as
select 'mo,larry,curly' as name
    from t1
  union all
select 'tina,gina,jaunita,regina,leena' as name
    from t1;

The goal is to extract the second name from each row. The key is to turn every name into a row of its own while preserving the order of the names in the list.

MySQL

select name
  from (
select iter.pos,
       substring_index(
       substring_index(src.name, ',', iter.pos), ',', -1) name
  from V src,
       (select id pos from t10) iter
 where iter.pos <= 
            -- number of commas + 1 = number of names in the list
            length(src.name) - length(replace(src.name, ',', '')) + 1
    ) x
where pos = 2;
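
The nested SUBSTRING_INDEX calls do the actual extraction: the inner call keeps everything up to the nth comma, and the outer call with -1 keeps whatever follows the last remaining comma. The Python sketch below imitates SUBSTRING_INDEX for illustration; both function names are hypothetical, and only the behaviour needed here is modelled.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Imitation of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""

def nth_name(names: str, n: int) -> str:
    # substring_index(substring_index(name, ',', n), ',', -1)
    return substring_index(substring_index(names, ",", n), ",", -1)

print(nth_name("mo,larry,curly", 2))                  # larry
print(nth_name("tina,gina,jaunita,regina,leena", 2))  # gina
print(nth_name("mo,larry,curly", 5))                  # curly
```

The last call shows why the SQL also filters on iter.pos: when n is larger than the number of names, the nested calls fall back to the last name instead of returning nothing.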

Oracle

select sub
  from (
select iter.pos,
       src.name,
       substr(src.name,
        instr(src.name, ',', 1, iter.pos) + 1,
        instr(src.name, ',', 1, iter.pos+1) - 
        instr(src.name, ',', 1, iter.pos)-1) sub
   from (select ','||name||',' as name from V) src,
        (select rownum pos from emp) iter
 where iter.pos < length(src.name) - length(replace(src.name, ','))
    )
where pos = 2;

SQL Cookbook, Chapter 6