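Analytic functions compute an aggregate value over a group of rows and are used together with the OVER() window clause. They make it easy to compute year-over-year and period-over-period changes, medians, and the maximum and minimum values within a group. Unlike aggregate functions, analytic functions do not need a GROUP BY clause: the OVER() clause partitions the result set of the SELECT clause, and a value is returned for every row.
Use the following script to insert the sample data: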
-- Build the sample rows in a CTE, then copy them into the temp table #data
;with cte_data as
(
    select 'Document Control' as Department, 'Arifin' as LastName, 17.78 as Rate
    union all select 'Document Control', 'Norred', 16.82
    union all select 'Document Control', 'Kharatishvili', 16.82
    union all select 'Document Control', 'Chai', 10.25
    union all select 'Document Control', 'Berge', 10.25
    union all select 'Information Services', 'Trenary', 50.48
    union all select 'Information Services', 'Conroy', 39.66
    union all select 'Information Services', 'Ajenstat', 38.46
    union all select 'Information Services', 'Wilson', 38.46
    union all select 'Information Services', 'Sharma', 32.45
    union all select 'Information Services', 'Connelly', 32.45
    union all select 'Information Services', 'Berg', 27.40
    union all select 'Information Services', 'Meyyappan', 27.40
    union all select 'Information Services', 'Bacon', 27.40
    union all select 'Information Services', 'Bueno ', 27.40
)
select Department
    ,LastName
    ,Rate
into #data
from cte_data
go
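Analytic functions are normally used together with the OVER() clause. SQL Server has four categories of analytic functions: cumulative distribution (CUME_DIST, PERCENT_RANK), percentiles (PERCENTILE_CONT, PERCENTILE_DISC), offset access (LAG, LEAD), and first and last values (FIRST_VALUE, LAST_VALUE).
Note: the DISTINCT clause is evaluated after the analytic functions.
CUME_DIST is computed as: (number of rows in the partition whose value is less than or equal to the current value) / (total number of rows in the partition).
PERCENT_RANK is computed as: (RANK of the current row within the partition - 1) / (total number of rows in the partition - 1), where the rank is the value returned by the RANK() function.
The following code computes the cumulative distribution and the percent rank: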
select Department
    ,LastName
    ,Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    ,percent_rank() over(partition by Department order by Rate) as PctRank
    ,rank() over(partition by Department order by Rate asc) as rank_number
    ,count(0) over(partition by Department) as count_in_group
from #data
order by Department
    ,Rate desc
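As a cross-check of the two formulas above, the following sketch recomputes both values by hand from COUNT and RANK, next to the built-in functions. It assumes the #data table created earlier; the aliases CumeDistManual and PctRankManual are names chosen here for illustration only.

```sql
-- Recompute CUME_DIST and PERCENT_RANK from their definitions (sketch).
select Department
    ,LastName
    ,Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    -- rows with Rate <= current Rate, divided by the rows in the partition
    ,count(*) over(partition by Department order by Rate
                   range between unbounded preceding and current row) * 1.0
        / count(*) over(partition by Department) as CumeDistManual
    ,percent_rank() over(partition by Department order by Rate) as PctRank
    -- (RANK - 1) / (rows in partition - 1); nullif/isnull return 0
    -- instead of dividing by zero when a partition has a single row
    ,isnull((rank() over(partition by Department order by Rate) - 1) * 1.0
        / nullif(count(*) over(partition by Department) - 1, 0), 0) as PctRankManual
from #data
order by Department
    ,Rate;
```

For Document Control (rates 10.25, 10.25, 16.82, 16.82, 17.78), the cumulative distribution works out to 0.4, 0.4, 0.8, 0.8, 1.0 and the percent rank to 0, 0, 0.5, 0.5, 1.0, so the manual and built-in columns should agree.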
PERCENTILE_CONT and PERCENTILE_DISC both compute percentile values, that is, the value of a column at a given percentile.
PERCENTILE_CONT ( numeric_literal ) WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] ) OVER ( [ <partition_by_clause> ] )
PERCENTILE_DISC ( numeric_literal ) WITHIN GROUP ( ORDER BY order_by_expression [ ASC | DESC ] ) OVER ( [ <partition_by_clause> ] )
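The two functions differ in that the former is continuous and the latter is discrete: CONT stands for continuous and DISC for discrete. PERCENTILE_CONT interpolates between adjacent values, so its result may be a value that does not appear in the data (for an even number of rows, the 0.5 percentile is the midpoint of the two middle values). PERCENTILE_DISC does not interpolate: it returns the smallest value in the set whose cumulative distribution is greater than or equal to the requested percentile, so its result is always an actual value from the data.
The following script obtains the percentile values: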
select Department
    ,LastName
    ,Rate
    ,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Rate) OVER (PARTITION BY Department) AS MedianCont
    ,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Rate) OVER (PARTITION BY Department) AS MedianDisc
    ,row_number() over(partition by Department order by Rate) as rn
from #data
order by Department
    ,Rate asc
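With the sample data the two medians happen to coincide in both departments (16.82 and 32.45), so the difference between the continuous and discrete variants is not visible. The sketch below uses four hypothetical values, supplied inline through a VALUES constructor, where the two functions disagree. DISTINCT collapses the four identical result rows into one, since it is applied after the analytic functions.

```sql
-- Four hypothetical values; the median falls between 20 and 30.
select distinct
     percentile_cont(0.5) within group (order by v) over () as MedianCont  -- 25, interpolated
    ,percentile_disc(0.5) within group (order by v) over () as MedianDisc  -- 20, an actual value
from (values (10), (20), (30), (40)) as t(v);
```

PERCENTILE_CONT returns 25, the midpoint of the two middle values, while PERCENTILE_DISC returns 20, the first value whose cumulative distribution (0.5) reaches the requested percentile.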
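Within a single query, with the rows ordered by a given column, LAG returns a value from the row N rows before the current row in the same partition, and LEAD returns a value from the row N rows after it.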
LAG ( scalar_expression [ , offset ] [ , default ] ) OVER ( [ partition_by_clause ] order_by_clause )
LEAD ( scalar_expression [ , offset ] [ , default ] ) OVER ( [ partition_by_clause ] order_by_clause )
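Combined with a date column, these two functions are particularly well suited to computing year-over-year and period-over-period changes.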
select Department
    ,LastName
    ,Rate
    ,lag(Rate,1,0) over(partition by Department order by LastName) as LastRate
    ,lead(Rate,1,0) over(partition by Department order by LastName) as NextRate
from #data
order by Department
    ,LastName
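To illustrate the year-over-year and period-over-period use case, here is a minimal sketch over hypothetical quarterly sales figures. The sales CTE, its column names, and all amounts are made up for illustration. The offset-based approach assumes there are no missing quarters; with gaps, offset 4 would no longer point at the same quarter of the previous year.

```sql
-- Hypothetical quarterly sales; names and figures are illustrative only.
;with sales as
(
    select SalesYear, SalesQuarter, Amount
    from (values (2022, 1, 100.0), (2022, 2, 110.0), (2022, 3, 120.0), (2022, 4, 130.0)
                ,(2023, 1, 120.0), (2023, 2, 121.0), (2023, 3, 150.0), (2023, 4, 143.0)
         ) as s(SalesYear, SalesQuarter, Amount)
)
select SalesYear
    ,SalesQuarter
    ,Amount
    -- previous quarter: offset 1; same quarter of the previous year: offset 4
    ,lag(Amount, 1) over(order by SalesYear, SalesQuarter) as PrevQuarter
    ,lag(Amount, 4) over(order by SalesYear, SalesQuarter) as SameQuarterLastYear
    -- no default is supplied, so these are NULL where no earlier row exists
    ,(Amount - lag(Amount, 1) over(order by SalesYear, SalesQuarter)) * 100.0
        / nullif(lag(Amount, 1) over(order by SalesYear, SalesQuarter), 0) as QoQPct
    ,(Amount - lag(Amount, 4) over(order by SalesYear, SalesQuarter)) * 100.0
        / nullif(lag(Amount, 4) over(order by SalesYear, SalesQuarter), 0) as YoYPct
from sales
order by SalesYear
    ,SalesQuarter;
```

Leaving out the default argument, rather than passing 0 as in the example above, keeps the growth columns NULL for the first periods instead of reporting a misleading change against zero.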
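FIRST_VALUE and LAST_VALUE return the first and last values of an ordered partition. When the rows are ordered by the column of interest and the window frame covers the whole partition, they give the minimum and maximum values in the partition; each partition has exactly one minimum and one maximum.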
LAST_VALUE ( [ scalar_expression ] ) OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
FIRST_VALUE ( [ scalar_expression ] ) OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
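The following sketch, which assumes the #data table created earlier, shows the two functions on the sample data. It also illustrates a common pitfall: when an ORDER BY is present and no frame is given, the default frame ends at the current row, so LAST_VALUE returns the current row's value rather than the last value in the partition. The column aliases are names chosen here for illustration.

```sql
select Department
    ,LastName
    ,Rate
    -- first row of the ordered partition: the minimum Rate
    ,first_value(Rate) over(partition by Department order by Rate) as MinRate
    -- explicit frame covering the whole partition: the maximum Rate
    ,last_value(Rate) over(partition by Department order by Rate
        rows between unbounded preceding and unbounded following) as MaxRate
    -- default frame (range unbounded preceding to current row):
    -- simply echoes the current row's Rate
    ,last_value(Rate) over(partition by Department order by Rate) as LastValueDefaultFrame
from #data
order by Department
    ,Rate;
```

For Document Control, MinRate is 10.25 and MaxRate is 17.78 on every row, while LastValueDefaultFrame equals the Rate of each row.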
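SQL Server's ranking functions rank and group the rows of a query result. T-SQL has four ranking functions: RANK, NTILE, DENSE_RANK, and ROW_NUMBER. They are used together with the OVER() clause and rank rows in a specified order.
1. ROW_NUMBER function
ROW_NUMBER effectively generates a sequence: a separate sequence is created for each partition, starting at 1 and increasing by 1 in the specified order.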
ROW_NUMBER ( ) OVER ( [ PARTITION_BY_clause ] order_by_clause )
The largest sequence value within a partition equals the number of rows in that partition.
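A common use of this per-partition sequence is picking one row per group. The sketch below, which assumes the #data table created earlier, returns the highest-paid person in each department; the CTE name ranked and the tie-breaker on LastName are choices made here for illustration.

```sql
;with ranked as
(
    select Department
        ,LastName
        ,Rate
        -- number rows within each department, highest Rate first;
        -- LastName breaks ties so the numbering is deterministic
        ,row_number() over(partition by Department order by Rate desc, LastName) as rn
    from #data
)
select Department
    ,LastName
    ,Rate
from ranked
where rn = 1;
```

With the sample data this returns Arifin (17.78) for Document Control and Trenary (50.48) for Information Services.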
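2. RANK function
RANK does not necessarily return consecutive integers. Within each partition, rows are ranked in the specified order, starting from 1. Ranking works in units of ties (rows with equal ordering values): all rows in a tie receive the same rank, so the ranks may have gaps.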
RANK ( ) OVER ( [ partition_by_clause ] order_by_clause )
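The ranking algorithm is: a row's rank is one plus the number of rows in the partition that precede its tie in the specified order.
3. DENSE_RANK
DENSE_RANK returns consecutive integers. Each tie occupies exactly one rank, and all rows in a tie receive the same rank, so the ranks have no gaps.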
DENSE_RANK ( ) OVER ( [ <partition_by_clause> ] < order_by_clause > )
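The ranking algorithm is: a row's rank is one plus the number of distinct ties that precede it in the specified order.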
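To make the difference between RANK and DENSE_RANK concrete, the following sketch ranks six hypothetical values containing two ties. The values are supplied inline through a VALUES constructor and are not part of the sample data.

```sql
-- Hypothetical values with two ties: (10, 10) and (30, 30).
select v
    ,rank() over(order by v) as rnk              -- 1, 1, 3, 4, 4, 6
    ,dense_rank() over(order by v) as dense_rnk  -- 1, 1, 2, 3, 3, 4
from (values (10), (10), (20), (30), (30), (40)) as t(v)
order by v;
```

RANK skips rank 2 after the two-row tie at 10 and rank 5 after the tie at 30, whereas DENSE_RANK uses every integer from 1 to 4.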
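4. NTILE
Within each partition, NTILE distributes the rows, in the specified order, into N groups (tiles) and returns the tile number. Rows in the same partition with the same tile number belong to the same tile. Note: tile numbers are assigned by row count, not by column value, so two rows in the same partition with the same column value can receive different tile numbers.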
NTILE (integer_expression) OVER ( [ <partition_by_clause> ] < order_by_clause > )
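If the number of rows in a partition is not divisible by integer_expression, the tiles differ in size by one row: the larger tiles come before the smaller tiles in the order specified by the OVER clause. For example, if 8 rows are split into 3 tiles, the first 2 tiles have 3 rows each and the last tile has 2 rows.
If the number of rows in a partition is divisible by integer_expression, every tile has the same number of rows.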
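The 8-rows-into-3-tiles case described above can be reproduced with a short sketch. The eight values are hypothetical and supplied inline through a VALUES constructor.

```sql
-- Eight hypothetical rows split into three tiles.
select n
    ,ntile(3) over(order by n) as tile  -- 1, 1, 1, 2, 2, 2, 3, 3
from (values (1), (2), (3), (4), (5), (6), (7), (8)) as t(n)
order by n;
```

Tiles 1 and 2 receive three rows each and tile 3 receives the remaining two.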
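In particular, NTILE(4) splits a partition into four parts, called quartiles. For example, the following script shows the results of each ranking function: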
select Department
    ,LastName
    ,Rate
    ,row_number() over(order by Rate) as [row number]
    ,rank() over(order by rate) as rate_rank
    ,dense_rank() over(order by rate) as rate_dense_rank
    ,ntile(4) over(order by rate) as quartile_by_rate
from #data