1.grouping sets
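A few days ago, when I first came across grouping sets, I was completely baffled.
Later I stumbled on the MSDN description of grouping sets and everything clicked: grouping sets is simply several group by clauses combined with union all. The relationship is as follows.
select A , B from table group by grouping sets(A, B)
is equivalent to
select A , null as B from table group by A
union all
select null as A , B from table group by B
To understand this better I created a teacher table with the data shown below. In the query results, the result set on the left comes from the group by clauses joined with union all, and the one on the right comes from grouping sets.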
select null as teacherAddress, MAX(teacherSalary), ascriptionInstitute
from teacher
group by ascriptionInstitute
union all
select teacherAddress, MAX(teacherSalary), NULL as ascriptionInstitute
from teacher
group by teacherAddress;

select teacherAddress, MAX(teacherSalary), ascriptionInstitute
from teacher
group by GROUPING SETS (ascriptionInstitute, teacherAddress);
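I said above that grouping sets is equivalent to group by clauses joined with union all. The reason it is "equivalent" rather than "identical" is obvious from comparing the two result sets: the rows come back in a different order. This suggests that grouping sets is not merely syntactic sugar for group by, and that the two are probably executed quite differently internally. (Strictly speaking, SQL guarantees no row order without ORDER BY, so the order seen here reflects the execution plan.)
While searching on Baidu I found that most answers repeat the same sentence: "The aggregation fetches all the required data from the database in one go, aggregates it in memory, and produces the result, whereas UNION ALL scans the table several times and unions the returned results. In terms of performance, grouping sets reduces IO but increases CPU time."
What I do not understand is how the aggregation is carried out in memory once the data has been fetched in a single pass. The result sets differ in order but contain the same data, so some kind of combining step still takes place, just without scanning the table repeatedly; how that works internally is what I am curious about. As for performance, I want to know why it behaves this way, not merely observe that it does.
In addition, swapping the arguments inside the parentheses of grouping sets also changes the result, which shows that the order of the result set depends on the position of the arguments as well. That makes me even more curious about how grouping sets is executed internally.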
select MAX(teacherSalary), ascriptionInstitute, teacherAddress
from teacher
group by GROUPING SETS (ascriptionInstitute, teacherAddress);

select MAX(teacherSalary), ascriptionInstitute, teacherAddress
from teacher
group by GROUPING SETS (teacherAddress, ascriptionInstitute);
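To make the "one scan versus several scans" idea concrete, here is a minimal conceptual sketch in Python. It is not how SQL Server actually plans the query; the sample rows and the function names (`project`, `union_all_of_group_bys`, `single_scan_grouping_sets`) are made up for illustration. It only shows that a single pass over the rows, updating one accumulator per grouping set, can produce the same rows as several separate group by scans joined with union all.

```python
from collections import Counter

# Hypothetical sample rows: (teacherAddress, ascriptionInstitute, teacherSalary)
ROWS = [
    ("Beijing", "Math", 5000),
    ("Beijing", "Physics", 7000),
    ("Shanghai", "Math", 6000),
]
COLUMNS = ("teacherAddress", "ascriptionInstitute")


def project(address, institute, gset):
    """Keep the columns named in `gset`; every other column becomes None (NULL)."""
    record = dict(zip(COLUMNS, (address, institute)))
    return tuple(record[c] if c in gset else None for c in COLUMNS)


def union_all_of_group_bys(rows, grouping_sets):
    """One full scan of `rows` per grouping set, results concatenated."""
    out = []
    for gset in grouping_sets:
        best = {}
        for address, institute, salary in rows:  # a scan for every set
            key = project(address, institute, gset)
            best[key] = max(best.get(key, salary), salary)
        out.extend(key + (top,) for key, top in best.items())
    return out


def single_scan_grouping_sets(rows, grouping_sets):
    """One scan of `rows`; each row updates one accumulator per grouping set."""
    tables = [dict() for _ in grouping_sets]
    for address, institute, salary in rows:  # the only scan
        for table, gset in zip(tables, grouping_sets):
            key = project(address, institute, gset)
            table[key] = max(table.get(key, salary), salary)
    return [key + (top,) for table in tables for key, top in table.items()]


SETS = [("ascriptionInstitute",), ("teacherAddress",)]
multi = union_all_of_group_bys(ROWS, SETS)
single = single_scan_grouping_sets(ROWS, SETS)
same_rows = Counter(multi) == Counter(single)  # compare as multisets, ignoring order
```

The comparison deliberately ignores row order, since order is exactly the part the two forms are free to differ on.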
2.grouping( )
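The grouping function is used to tell NULL values apart. A NULL in the result can arise in two ways: either the data in the table was NULL to begin with, or the NULL was generated by rollup, cube, or grouping sets.
For a NULL of the first kind, GROUPING(column) returns 0; for a NULL of the second kind, it returns 1. An example follows. In the results, values that appear as NULL in the second result set are displayed as the string ROLLUP-NULL wherever the grouping function returns 1.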
select teacherAddress, ascriptionInstitute, COUNT(teacherId)
from teacher
group by teacherAddress, ascriptionInstitute;

select teacherAddress, ascriptionInstitute, COUNT(teacherId)
from teacher
group by rollup(teacherAddress, ascriptionInstitute);

select ISNULL(teacherAddress, case when GROUPING(teacherAddress)=1 then 'ROLLUP-NULL' end) as teacherAddress,
       ISNULL(ascriptionInstitute, case when GROUPING(ascriptionInstitute)=1 then 'ROLLUP-NULL' end) as ascriptionInstitute,
       COUNT(teacherId)
from teacher
group by rollup(teacherAddress, ascriptionInstitute);
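The difference between the two kinds of NULL can also be emulated outside the database. The following Python sketch is an illustration only: the rows and the names `rollup_with_grouping` and `label` are hypothetical, and it counts rows rather than non-null teacherId values. It mimics rollup over two columns and attaches a 0/1 flag per column, the way GROUPING() does.

```python
from collections import defaultdict

# Hypothetical rows: (teacherAddress, ascriptionInstitute).
# None stands for a NULL that is really stored in the table.
ROWS = [
    ("Beijing", "Math"),
    ("Beijing", None),
    (None, "Math"),
]


def rollup_with_grouping(rows):
    """Emulate GROUP BY ROLLUP(address, institute) with a row count.

    Each output row is (address, institute, count, g_address, g_institute);
    a g_* flag is 1 only when that column was aggregated away by the rollup.
    """
    counts = defaultdict(int)
    for address, institute in rows:
        # ROLLUP(a, b) produces the grouping sets (a, b), (a) and ().
        counts[(address, institute, 0, 0)] += 1
        counts[(address, None, 0, 1)] += 1
        counts[(None, None, 1, 1)] += 1
    return [(a, i, n, ga, gi) for (a, i, ga, gi), n in counts.items()]


def label(value, flag):
    """Mirror ISNULL(col, CASE WHEN GROUPING(col) = 1 THEN 'ROLLUP-NULL' END)."""
    if value is not None:
        return value
    return "ROLLUP-NULL" if flag == 1 else None


result = rollup_with_grouping(ROWS)
labelled = [(label(a, ga), label(i, gi), n) for a, i, n, ga, gi in result]
```

In `labelled`, the group whose institute is genuinely NULL keeps `None`, while the subtotal row for the same address gets `'ROLLUP-NULL'`; without the flag the two rows would look identical apart from the count.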
3.grouping_id( )
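The grouping_id function also computes the grouping level. Note that grouping_id can only be used together with a group by clause, and its arguments must match the columns in the group by clause: with group by A,B you must write grouping_id(A,B). The link between grouping_id() and grouping() can be expressed as an equivalence: grouping_id(A, B) is equivalent to grouping(A) + grouping(B), where the + sign is not arithmetic addition but concatenation of binary digits. For example, if grouping(A)=1 and grouping(B)=1, then grouping_id(A, B)=11B, which is 3 in decimal. With the original table data the SQL statements below returned too many rows for the effect to be clear, so I changed the table data a little; comparing the two result sets now shows the effect clearly.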
select ISNULL(teacherAddress, case when GROUPING(teacherAddress)=1 then 'ROLLUP-NULL' end) as teacherAddress,
       ISNULL(ascriptionInstitute, case when GROUPING(ascriptionInstitute)=1 then 'ROLLUP-NULL' end) as ascriptionInstitute,
       ISNULL(teacherSex, case when GROUPING(teacherSex)=1 then 'ROLLUP-NULL' end) as teacherSex,
       COUNT(teacherId)
from teacher
group by rollup(teacherAddress, ascriptionInstitute, teacherSex);

select ISNULL(teacherAddress, case when GROUPING(teacherAddress)=1 then 'ROLLUP-NULL' end) as teacherAddress,
       ISNULL(ascriptionInstitute, case when GROUPING(ascriptionInstitute)=1 then 'ROLLUP-NULL' end) as ascriptionInstitute,
       ISNULL(teacherSex, case when GROUPING(teacherSex)=1 then 'ROLLUP-NULL' end) as teacherSex,
       COUNT(teacherId) as '數量',  -- the alias means "count"
       GROUPING_ID(teacherAddress, ascriptionInstitute, teacherSex)
from teacher
group by rollup(teacherAddress, ascriptionInstitute, teacherSex);
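The bit concatenation behind grouping_id can be checked with a few lines of Python. This is only a sketch of the arithmetic, not the database's implementation; the function name `grouping_id` simply mirrors the SQL function, and the flags passed in stand for the individual GROUPING() results, leftmost argument first.

```python
def grouping_id(*flags):
    """Combine GROUPING() flags (each 0 or 1) into one integer.

    The leftmost flag becomes the most significant bit, so
    grouping_id(1, 1) is binary 11, i.e. 3.
    """
    value = 0
    for flag in flags:
        if flag not in (0, 1):
            raise ValueError("each GROUPING() flag must be 0 or 1")
        value = (value << 1) | flag  # shift left, then append the new bit
    return value


# Levels produced by ROLLUP(A, B, C): columns are aggregated away from the right.
rollup_levels = [
    grouping_id(0, 0, 0),  # detail rows
    grouping_id(0, 0, 1),  # C rolled up
    grouping_id(0, 1, 1),  # B and C rolled up
    grouping_id(1, 1, 1),  # grand total
]
```

For a three-column rollup like the one above, the levels come out as 0, 1, 3 and 7, which is why a single grouping_id column is enough to tell every subtotal level apart.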