Optimizing a full-table-scan SQL statement on a bizarrely designed table

Date: 2019-11-10

A while ago, on a fairly busy system, I captured a SQL statement that ran for a long time and consumed a lot of CPU:
SELECT * FROM Z_VISU_DATA_ALARM_LOG T
WHERE TO_DATE(T.T_TIMESTR, 'MM/DD/YY HH24:MI:SS') <= TO_DATE(TO_CHAR(SYSDATE, 'yyyy-mm-dd HH24:mi:ss'), 'yyyy-mm-dd HH24:mi:ss') - 1800 * 1000 / 1440 / 60 / 1000
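
The offset in this predicate is easier to reason about once it is reduced: 1800 * 1000 milliseconds divided by 1000, 60 and 1440 is 1800 / 86400 of a day, which is 30 minutes. In addition, an Oracle DATE carries no fractional seconds, so the TO_CHAR/TO_DATE round trip around SYSDATE should return SYSDATE unchanged. The following is a minimal sketch for checking both points against DUAL; the column aliases are made up for illustration, and how the dates are displayed depends on the session's NLS_DATE_FORMAT.

```sql
-- The offset expressed in days, and converted back to whole minutes.
-- 1800 * 1000 ms = 1800 s = 30 min = 1/48 of a day.
SELECT 1800 * 1000 / 1440 / 60 / 1000               AS offset_days,
       ROUND(1800 * 1000 / 1440 / 60 / 1000 * 1440) AS offset_minutes
FROM   dual;

-- The round trip through TO_CHAR/TO_DATE compared with plain SYSDATE,
-- plus the same cutoff written with an interval literal.
SELECT TO_DATE(TO_CHAR(SYSDATE, 'yyyy-mm-dd HH24:mi:ss'),
               'yyyy-mm-dd HH24:mi:ss')  AS round_trip,
       SYSDATE                           AS sysdate_value,
       SYSDATE - INTERVAL '30' MINUTE    AS cutoff
FROM   dual;
```

If the two checks hold on your system, the right-hand side of the comparison can be read simply as "30 minutes ago".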

1. First, look at the bizarre table design: whoever designed this table is clearly very fond of the VARCHAR2 data type, and of the number 128.
SQL> desc Z_VISU_DATA_ALARM_LOG
Name           Type          Nullable Default Comments
-------------- ------------- -------- ------- --------
T_DESC         VARCHAR2(128) Y
T_ERRORSTRING  VARCHAR2(128) Y
T_KEY          VARCHAR2(128) Y
T_POINTNAME    VARCHAR2(128) Y
T_PTNAMEEXT    VARCHAR2(128) Y
T_PTNAMELONG   VARCHAR2(128) Y
T_PTTIME       VARCHAR2(128) Y
T_PTTIMEMS     VARCHAR2(128) Y
T_RAWSTATUS    VARCHAR2(128) Y
T_RETURNSTATUS VARCHAR2(128) Y
T_STATUS       VARCHAR2(128) Y
T_TIMEMSSTR    VARCHAR2(128) Y
T_TIMESTR      VARCHAR2(128) Y
T_UNITS        VARCHAR2(128) Y
T_VALSTR       VARCHAR2(128) Y
T_VALUE        VARCHAR2(128) Y

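2. Next, the row count. With this much data, and judging by the table name, I would guess this is a large table that records alarm logs. I would love to ask how historical data gets archived, although I already know I would not get an answer.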
SQL> select count(*) from Z_VISU_DATA_ALARM_LOG;
  COUNT(*)
----------
   7971800

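3. Finally, the execution plan for this SQL. Even without looking at it you could guess that it is a full table scan, because the SQL was written so carelessly: it was written only to get the feature working, with no thought for performance. A bizarre table design, about 8 million rows, and a full table scan: how could it not be slow, and how could it not burn CPU?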

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 3652682256
--------------------------------------------------------------------------------
| Id  | Operation          | Name                  | Rows  | Bytes | Cost (%CPU)
--------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |                       |  701K | 1683K | 42632  (22)
|   1 |  DELETE            | Z_VISU_DATA_ALARM_LOG |       |       |
|*  2 |   TABLE ACCESS FULL| Z_VISU_DATA_ALARM_LOG |  701K | 1683K | 42632  (22)
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter((TO_DATE(TO_CHAR(SYSDATE@!,'yyyy-mm-dd HH24:mi:ss'),'yyyy-mm-dd
       HH24:mi:ss')-TO_DATE("T"."T_TIMESTR",'MM/DD/YY HH24:MI:SS'))*24*60
15 rows selected

SQL>
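
The article does not show the commands that produced this output, but a listing headed PLAN_TABLE_OUTPUT is what DBMS_XPLAN.DISPLAY prints after an EXPLAIN PLAN. The following is a sketch of how a comparable plan could be obtained, assuming that is how it was captured. Note that the captured plan is for a DELETE, whereas this sketch explains the SELECT form quoted in the article.

```sql
-- Store the optimizer's plan for the statement in PLAN_TABLE
-- (the statement itself is not executed).
EXPLAIN PLAN FOR
SELECT *
FROM   Z_VISU_DATA_ALARM_LOG T
WHERE  TO_DATE(T.T_TIMESTR, 'MM/DD/YY HH24:MI:SS')
       <= TO_DATE(TO_CHAR(SYSDATE, 'yyyy-mm-dd HH24:mi:ss'),
                  'yyyy-mm-dd HH24:mi:ss') - 1800 * 1000 / 1440 / 60 / 1000;

-- Format and print the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

A TABLE ACCESS FULL line with a filter predicate that wraps the column in a function, as in the plan above, is the signature to look for.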

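So how should this SQL be optimized?
1. When designing the table, use DATE or TIMESTAMP for time columns. BTW, is 128 characters really enough for t_desc?
2. Partition the table by the time interval the queries work on. Archiving historical data then only takes an ALTER TABLE xxx EXCHANGE PARTITION, and unnecessary I/O is reduced as well.
3. Optimize this SQL directly:
(1) First rewrite it: rearrange the comparison algebraically so that the expression on the column stands alone on one side of the operator and everything else moves to the other side.
SELECT * FROM Z_VISU_DATA_ALARM_LOG T
WHERE TO_DATE(T.T_TIMESTR, 'MM/DD/YY HH24:MI:SS') <= TO_DATE(TO_CHAR(SYSDATE, 'yyyy-mm-dd HH24:mi:ss'), 'yyyy-mm-dd HH24:mi:ss') - 1800 * 1000 / 1440 / 60 / 1000
(2) Create a function-based index on TO_DATE(T.T_TIMESTR, 'MM/DD/YY HH24:MI:SS').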
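
Suggestions 1 and 2 above can be sketched together as a redesigned table. This is only an illustration under stated assumptions: Oracle 11g or later with the Partitioning option, daily interval partitions, and hypothetical names (Z_VISU_DATA_ALARM_LOG_NEW, T_TIME, P_INITIAL, ALARM_LOG_ARCHIVE_STAGE) that do not appear in the original system. Only a few of the original columns are shown.

```sql
-- Hypothetical redesign: the timestamp is a real DATE and the table is
-- range-partitioned on it, with one partition per day created on demand.
CREATE TABLE Z_VISU_DATA_ALARM_LOG_NEW (
  T_KEY        VARCHAR2(128),
  T_POINTNAME  VARCHAR2(128),
  T_DESC       VARCHAR2(128),
  T_TIME       DATE NOT NULL           -- replaces the string column T_TIMESTR
)
PARTITION BY RANGE (T_TIME)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(
  PARTITION P_INITIAL VALUES LESS THAN (DATE '2019-01-01')
);

-- The predicate compares the bare column, so no function is applied to it
-- and the optimizer can skip partitions outside the range.
SELECT *
FROM   Z_VISU_DATA_ALARM_LOG_NEW
WHERE  T_TIME <= SYSDATE - INTERVAL '30' MINUTE;

-- Archiving one day: swap that day's partition with an empty staging table
-- of identical structure. Assumes the partition for that day already exists.
ALTER TABLE Z_VISU_DATA_ALARM_LOG_NEW
  EXCHANGE PARTITION FOR (DATE '2019-11-01')
  WITH TABLE ALARM_LOG_ARCHIVE_STAGE;
```

The exchange is a data dictionary operation rather than a row-by-row copy, which is why it suits archiving large volumes of history.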

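The function-based index in step (2) has a catch: you may run into ORA-01743: only pure functions can be indexed.
The error occurs because, in TO_DATE(T.T_TIMESTR, 'MM/DD/YY HH24:MI:SS'), the year mask YY supplies only the last two digits of the year, so the result is not deterministic. The index therefore has to be created on TO_DATE(T.T_TIMESTR, 'yyyy-mm-dd HH24:MI:SS') instead, and of course the SQL has to be changed to match. This presumes that the stored strings actually carry a four-digit year in that layout.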
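
A minimal sketch of that fix follows. It assumes that every T_TIMESTR value is stored in the 'yyyy-mm-dd HH24:MI:SS' layout (the data behind the original query used MM/DD/YY, so it would need converting first), and the index name IDX_ALARM_LOG_TIMESTR is hypothetical.

```sql
-- Function-based index on a deterministic expression (four-digit year).
-- Assumption: all T_TIMESTR values parse with this mask; a value that
-- does not parse would make the CREATE INDEX fail.
CREATE INDEX IDX_ALARM_LOG_TIMESTR
  ON Z_VISU_DATA_ALARM_LOG (TO_DATE(T_TIMESTR, 'yyyy-mm-dd HH24:MI:SS'));

-- The query repeats the indexed expression so the optimizer can match it
-- to the index; the cutoff is the same 30-minute offset as before.
SELECT *
FROM   Z_VISU_DATA_ALARM_LOG T
WHERE  TO_DATE(T.T_TIMESTR, 'yyyy-mm-dd HH24:MI:SS')
       <= SYSDATE - INTERVAL '30' MINUTE;
```

Whether the optimizer actually uses the index depends on how selective the predicate is. A cutoff of "older than 30 minutes" may still match most of the table, in which case a full scan can remain the cheaper plan.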

PS: Below is Tom Kyte's explanation of the ORA-01743 error:
One quirk I have noticed with function-based indexes is that if you create one on the built-in
function TO_DATE, it will not succeed in some cases, for example:
ops$tkyte@ORA10GR1> create table t ( year varchar2(4) );
Table created.
ops$tkyte@ORA10GR1> create index t_idx on t( to_date(year,'YYYY') );
create index t_idx on t( to_date(year,'YYYY') )
*
ERROR at line 1:
ORA-01743: only pure functions can be indexed
This seems strange, since we can sometimes create a function using TO_DATE, for example:
ops$tkyte@ORA10GR1> create index t_idx on t( to_date('01'||year,'MMYYYY') );
Index created.
The error message that accompanies this isn’t too illuminating either:
ops$tkyte@ORA10GR1> !oerr ora 1743
01743, 00000, "only pure functions can be indexed"
// *Cause: The indexed function uses SYSDATE or the user environment.
// *Action: PL/SQL functions must be pure (RNDS, RNPS, WNDS, WNPS). SQL
// expressions must not use SYSDATE, USER, USERENV(), or anything
// else dependent on the session state. NLS-dependent functions
// are OK.
We are not using SYSDATE. We are not using the "user environment" (or are we?). No
PL/SQL functions are used, and nothing about the session state is involved. The trick lies in
the format we used: YYYY. That format, given the same exact inputs, will return different
answers, anytime in the month of May
ops$tkyte@ORA10GR1> select to_char( to_date('2005','YYYY'),
2 'DD-Mon-YYYY HH24:MI:SS' )
3 from dual;
TO_CHAR(TO_DATE('200
--------------------
01-May-2005 00:00:00
the YYYY format will return May 1, in June it will return June 1, and so on. It turns out that
TO_DATE, when used with YYYY, is not deterministic! That is why the index cannot be created: it
would only work correctly in the month you created it in (or insert/updated a row in). So, it is
due to the user environment, which includes the current date itself.
To use TO_DATE in a function-based index, you must use a date format that is unambiguous
and deterministic, regardless of what day it is currently.
