Statistical Concepts and Market Returns

Categories of statistics

  • Descriptive statistics: used to summarize the important characteristics of large data sets.
  • Inferential statistics: pertain to the procedures used to make forecasts, estimates, or judgments about a large set of data on the basis of the statistical characteristics of a sample.

Measures of Central Tendency

When describing investments, measures of central tendency provide an indication of an investment's expected return.

  • Arithmetic mean
  • Geometric mean: often used when calculating investment returns over multiple periods or when measuring compound growth rates.
  • Weighted mean
  • Median: the midpoint of a data set when the data are arranged in ascending or descending order.
  • Mode: the value that occurs most frequently in a data set. A data set may have more than one mode or even no mode.
  • Harmonic mean: the reciprocal of the arithmetic mean of the reciprocals of the observations. It is used for certain computations, such as the average cost of shares purchased over time.
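
As a rough illustration of the measures listed above, the following sketch uses Python's standard `statistics` module. The data values, portfolio weights, and variable names are hypothetical and chosen only for the example.

```python
import statistics

# Hypothetical observations (illustrative only, not from the notes).
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)        # arithmetic mean: sum / count
median = statistics.median(data)    # midpoint of the sorted data
modes = statistics.multimode(data)  # list of all most-frequent values

# Weighted mean: hypothetical portfolio weights (summing to 1) and asset returns.
weights = [0.5, 0.3, 0.2]
asset_returns = [0.08, 0.04, 0.12]
weighted_mean = sum(w * r for w, r in zip(weights, asset_returns))

print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("weighted mean:", round(weighted_mean, 4))
```

Note that `statistics.multimode` returns every value when all observations are distinct, so the "no mode" case has to be detected by the caller if it matters.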

Note: The geometric mean is always less than or equal to the arithmetic mean, and the difference increases as the dispersion of the observations increases. The only time the arithmetic and geometric means are equal is when there is no variability in the observations (i.e., all observations are equal).
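
To make the note concrete, here is a minimal sketch of a geometric mean return computed from periodic returns, compared with the arithmetic mean. The function name `geometric_mean_return` and the return series are assumptions made for illustration.

```python
import math
import statistics


def geometric_mean_return(returns):
    """Compound the growth factors (1 + r), take the n-th root, subtract 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Hypothetical annual returns (illustrative only).
returns = [0.10, -0.05, 0.20, 0.03]

arithmetic = statistics.mean(returns)
geometric = geometric_mean_return(returns)

print(f"arithmetic mean return: {arithmetic:.4%}")
print(f"geometric mean return:  {geometric:.4%}")

# With no variability the two means coincide.
constant = [0.05, 0.05, 0.05]
print(f"constant series: {geometric_mean_return(constant):.4%}")
```

The gap between the two figures widens as the returns become more dispersed, which is the relationship the note describes.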

Note: For values that are not all equal: harmonic mean < geometric mean < arithmetic mean. This mathematical fact is the basis for the claimed benefit of purchasing the same dollar amount of mutual fund shares each month or each week. Some refer to this practice as "dollar cost averaging".
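
The link between dollar cost averaging and the harmonic mean can be checked numerically. In this sketch the prices and the fixed purchase amount are hypothetical; the average cost per share from equal-dollar purchases should match the harmonic mean of the purchase prices.

```python
import statistics

# Hypothetical share prices on three purchase dates (illustrative only).
prices = [8.0, 10.0, 12.5]
dollars_per_purchase = 100.0  # same dollar amount invested each time

# Fewer shares are bought when the price is high, more when it is low.
shares_bought = [dollars_per_purchase / p for p in prices]
total_spent = dollars_per_purchase * len(prices)
average_cost = total_spent / sum(shares_bought)

harmonic = statistics.harmonic_mean(prices)
geometric = statistics.geometric_mean(prices)
arithmetic = statistics.mean(prices)

print(f"average cost per share: {average_cost:.4f}")
print(f"harmonic mean:          {harmonic:.4f}")
print(f"geometric mean:         {geometric:.4f}")
print(f"arithmetic mean:        {arithmetic:.4f}")
```

The average cost equals the harmonic mean, which is below the arithmetic mean of the prices whenever the prices differ.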

Note: Modal interval: for any frequency distribution, the interval with the greatest frequency is referred to as the modal interval.

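Note: "Mean" and "average" can often be used interchangeably, but there is a distinction: (1) the "mean" of a sample is computed with the arithmetic mean formula (the sum of the observations divided by the number of observations); (2) an "average" is any one of several summary statistics that describe the typical value or central tendency of a sample.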

Measures of Dispersion

When describing investments, measures of dispersion indicate the riskiness of an investment.
Dispersion is defined as the variability around the central tendency. The common theme in finance and investments is the tradeoff between reward and variability, where the central tendency is the measure of the reward and dispersion is a measure of risk.

  • Range (範圍): range = maximum value - minimum value
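  • Mean absolute deviation (MAD): the average of the absolute values of the deviations of individual observations from the arithmetic mean: MAD = Σ|x_i - x̄| / n
  • Variance: the average of the squared deviations from the mean. Population variance: σ^2 = Σ(x_i - μ)^2 / N. Sample variance: s^2 = Σ(x_i - x̄)^2 / (n - 1)
  • Standard deviation: the square root of the variance (σ for a population, s for a sample).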

Note: The most noteworthy difference between the sample and population variance formulas is that the denominator for s^2 is n-1, one less than the sample size n, whereas σ^2 uses the entire population size N. Based on the mathematical theory behind statistical procedures, using the full number of sample observations, n, instead of n-1 as the divisor in the computation of s^2 will systematically underestimate the population parameter σ^2, particularly for small sample sizes. This systematic underestimation makes the sample variance computed with n what is referred to as a biased estimator of the population variance. Using n-1 instead of n in the denominator improves the statistical properties of s^2 as an estimator of σ^2. Thus, s^2 computed with n-1 is considered to be an unbiased estimator of σ^2.
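
The dispersion measures above, including the n versus n-1 divisor, can be sketched as follows. The sample values are hypothetical; the results are cross-checked against `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import statistics

# Hypothetical sample of returns in percent (illustrative only).
sample = [4.0, 7.0, 1.0, 10.0, 3.0]
n = len(sample)
mean = statistics.mean(sample)

value_range = max(sample) - min(sample)

# Mean absolute deviation: average absolute distance from the mean.
mad = sum(abs(x - mean) for x in sample) / n

sum_sq = sum((x - mean) ** 2 for x in sample)
variance_n = sum_sq / n              # divisor n (population formula)
variance_n_minus_1 = sum_sq / (n - 1)  # divisor n - 1 (sample formula)
sample_std = variance_n_minus_1 ** 0.5

print("range:", value_range)
print("MAD:", mad)
print("variance with divisor n:", variance_n)
print("variance with divisor n - 1:", variance_n_minus_1)
print("sample standard deviation:", round(sample_std, 4))

# The standard library uses the same two conventions.
print("pvariance:", statistics.pvariance(sample))
print("variance:", statistics.variance(sample))
```

Dividing by n always gives the smaller figure, which is the direction of the bias described in the note.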

Chebyshev's Inequality

Chebyshev's inequality states that for any set of observations, whether sample or population data and regardless of the shape of the distribution, the percentage of the observations that lie within k standard deviations of the mean is at least 1 - 1/k^2 for all k > 1.

The importance of Chebyshev's inequality is that it applies to any distribution.
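
A small numerical check of the bound, as a sketch only: the helper names `chebyshev_lower_bound` and `fraction_within` and the data set are hypothetical, and the population standard deviation is used for the comparison.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


def fraction_within(data, k):
    """Observed fraction of data within k population standard deviations of the mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)
    inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
    return inside / len(data)


# Hypothetical, deliberately skewed data (illustrative only).
data = [1, 2, 2, 3, 3, 3, 4, 4, 20]

for k in (1.5, 2, 3):
    bound = chebyshev_lower_bound(k)
    observed = fraction_within(data, k)
    print(f"k={k}: bound {bound:.4f}, observed {observed:.4f}")
```

For k = 2 the bound is 75% and for k = 3 it is about 89%; the observed fractions can be much higher, since the inequality only gives a floor.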

Coefficient of Variation

Relative dispersion is the amount of variability in a distribution relative to a reference point or benchmark. Relative dispersion is commonly measured with the coefficient of variation (CV).
The coefficient of variation is a commonly used statistical measure, mainly for comparing the degree of dispersion of data series that have different levels, and for judging how representative their means are.

CV = (standard deviation of x) / (average value of x)

CV measures the amount of dispersion in a distribution relative to the distribution's mean. In an investments setting, the CV is used to measure the risk (variability) per unit of expected return (mean).
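
The CV comparison can be sketched as below, assuming the sample standard deviation is the one intended. The two return series and the function name `coefficient_of_variation` are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the arithmetic mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical monthly returns in percent for two funds (illustrative only).
fund_a = [1.0, 2.0, 3.0]
fund_b = [4.0, 6.0, 8.0]

cv_a = coefficient_of_variation(fund_a)
cv_b = coefficient_of_variation(fund_b)

print(f"fund A: stdev {statistics.stdev(fund_a):.2f}, CV {cv_a:.3f}")
print(f"fund B: stdev {statistics.stdev(fund_b):.2f}, CV {cv_b:.3f}")
```

Fund B has the larger standard deviation but the smaller CV, so it carries less variability per unit of mean return in this made-up example.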

Sharpe Ratio

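The Sharpe measure (a.k.a. the Sharpe ratio or reward-to-variability ratio) is widely used for investment performance measurement and measures excess return per unit of risk.
The Sharpe ratio relates return to risk: the portion of the portfolio return above the risk-free return is divided by the standard deviation of returns over a given period, and the result is the excess return earned per unit of risk. The higher the ratio, the higher the risk-adjusted return.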
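
A minimal sketch of the ratio described above, assuming the mean portfolio return and the sample standard deviation of returns over the same period are used. The function name `sharpe_ratio`, the return series, and the risk-free rate are hypothetical.

```python
import statistics


def sharpe_ratio(portfolio_returns, risk_free_rate):
    """Excess mean return over the risk-free rate, per unit of standard deviation."""
    excess_return = statistics.mean(portfolio_returns) - risk_free_rate
    return excess_return / statistics.stdev(portfolio_returns)


# Hypothetical annual portfolio returns and risk-free rate (illustrative only).
portfolio_returns = [0.06, 0.10, 0.14]
risk_free_rate = 0.04

ratio = sharpe_ratio(portfolio_returns, risk_free_rate)
print(f"Sharpe ratio: {ratio:.2f}")
```

The returns and the risk-free rate must refer to the same period length (for example, both annual) for the ratio to be meaningful.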

Skewness

Skewness, or skew, refers to the extent to which a distribution is not symmetrical. Nonsymmetrical distributions may be either positively or negatively skewed and result from the occurrence of outliers in the data set. Outliers are observations with extraordinarily large values, either positive or negative.

  • A positively skewed distribution is characterized by many outliers in the upper region, or right tail. A positively skewed distribution is said to be skewed right because of its relatively long upper (right) tail.

  • A negatively skewed distribution has a disproportionately large number of outliers that fall within its lower (left) tail. A negatively skewed distribution is said to be skewed left because of its long lower tail.

Values of sample skewness (Sk) in excess of 0.5 in absolute value indicate significant levels of skewness.
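
The notes do not reproduce the formula for Sk. The sketch below assumes one common sample skewness formula, Sk = [n / ((n-1)(n-2))] × Σ(x_i - x̄)^3 / s^3, with s the sample standard deviation; other texts use slightly different scaling. The function name and data are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Sample skewness: n/((n-1)(n-2)) * sum of cubed deviations / s^3 (assumed formula)."""
    n = len(values)
    if n < 3:
        raise ValueError("sample skewness needs at least 3 observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    cubed_deviations = sum((x - mean) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed_deviations / s**3


# Hypothetical data sets (illustrative only).
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 20]  # one large positive outlier
left_skewed = [-x for x in right_skewed]      # mirror image

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name}: Sk = {sample_skewness(data):.3f}")
```

The sign of Sk follows the direction of the long tail, and mirroring a data set flips the sign without changing the magnitude.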

Kurtosis

Kurtosis is a measure of the degree to which a distribution is more or less "peaked" than a normal distribution. Leptokurtic describes a distribution that is more peaked than a normal distribution, whereas platykurtic refers to a distribution that is less peaked, or flatter, than a normal distribution. A distribution is mesokurtic if it has the same kurtosis as a normal distribution.

A distribution is said to exhibit excess kurtosis if it has either more or less kurtosis than the normal distribution. The computed kurtosis for all normal distributions is 3. A normal distribution has excess kurtosis equal to 0, a leptokurtic distribution has excess kurtosis greater than 0, and a platykurtic distribution has excess kurtosis less than 0.

In general, greater positive kurtosis and more negative skew in returns distributions indicate increased risk.

Excess kurtosis values that exceed 1.0 in absolute value are considered large.

excess kurtosis = sample kurtosis - 3
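
The notes do not give the sample kurtosis formula. The sketch below assumes the simple moment form, sample kurtosis = (1/n) × Σ(x_i - x̄)^4 / s^4 with s the sample standard deviation, which is a large-sample approximation; small-sample corrections exist and give different numbers. Function names and data are hypothetical.

```python
import statistics


def sample_kurtosis(values):
    """Average fourth-power deviation divided by s^4 (assumed large-sample form)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    fourth_moment = sum((x - mean) ** 4 for x in values) / n
    return fourth_moment / s**4


def excess_kurtosis(values):
    """Sample kurtosis minus 3, the kurtosis of a normal distribution."""
    return sample_kurtosis(values) - 3


# Hypothetical data sets (illustrative only).
flat = list(range(1, 21))        # evenly spread values: platykurtic
peaked = [0] * 18 + [-10, 10]    # mass at the center plus two outliers: leptokurtic

print(f"flat:   excess kurtosis = {excess_kurtosis(flat):.3f}")
print(f"peaked: excess kurtosis = {excess_kurtosis(peaked):.3f}")
```

The evenly spread data give a negative excess kurtosis, while the data with most observations at the center and a few extreme values give a large positive one.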