Preparations
To run this procedure, select a range, and then run the Statistics→Basic
Statistics and Tables→Descriptive Statistics
command or press Control+D.
Results
Count - sample size.
Mean - The
mean is a particularly informative
measure of the "central tendency" of the variable if it is reported along with
its confidence intervals. Usually we are interested in
statistics (such as the mean) from our
sample only to the extent to which they are informative about the population.
The larger the sample size, the more reliable its
mean. The larger the variation of data
values, the less reliable the mean.
$\text{Mean} = \left(\sum_{i=1}^{n} x_i\right) / n$
where n is the sample size.
Standard
deviation (this term was first used by Pearson, 1894) is a commonly-used
measure of variation. The standard deviation
of a population of values is computed as:
$\sigma = \left[\sum (x_i-\mu)^2 / N\right]^{1/2}$
where
µ - is the population mean
N - is the population size.
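As a minimal sketch of the two formulas above, the mean and the population standard deviation can be computed in Python as follows. The function names and the sample data are illustrative assumptions, not part of the program.

```python
import math

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def population_sd(values):
    """Population standard deviation: divides by N, not N - 1."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data), population_sd(data))
```

For this data the mean is 5 and the population standard deviation is 2.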
Standard Error of the Mean. The standard error of the mean (first used by Yule, 1897) is the theoretical
standard deviation of all sample means of size n drawn from a population and
depends on both the population variance ($\sigma^2$) and the sample size (n) as
indicated below:
$\sigma_{\bar{x}} = (\sigma^2/n)^{1/2}$
where
$\sigma^2$ - is the population variance
n - is the sample size.
Since the population variance is typically unknown, the best estimate for the standard error of the
mean is then calculated as:
$s_{\bar{x}} = (s^2/n)^{1/2}$
where
$s^2$ - is the sample variance (our best estimate of the population variance)
n - is the sample size.
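The estimated standard error of the mean can be sketched in Python as below. This is an illustration under the assumption that the sample variance uses the n - 1 divisor; the function name and data are hypothetical.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Estimate the standard error of the mean as sqrt(s^2 / n)."""
    n = len(values)
    s2 = statistics.variance(values)  # sample variance, divides by n - 1
    return math.sqrt(s2 / n)

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sem = standard_error_of_mean(data)
print(sem)
```

Because n appears in the denominator, quadrupling the sample size halves the standard error when the variance stays the same.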
Minimum - smallest number in the sample.
Maximum - largest number in the sample.
Range - difference between the maximum and the minimum.
Sum - sum of all sample elements.
Sum Standard Error - standard deviation of the distribution of the sum.
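The simple statistics above can be gathered in one pass, as in the following sketch. The help text does not give a formula for the sum standard error, so the code assumes the usual result for n independent observations, s * sqrt(n); the function name, dictionary keys, and data are hypothetical.

```python
import math
import statistics

def basic_summary(values):
    """Return count, minimum, maximum, range, sum, and sum standard error."""
    n = len(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "count": n,
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "sum": sum(values),
        # Assumption: the sum of n independent values has standard error s * sqrt(n).
        "sum_standard_error": s * math.sqrt(n),
    }

# Illustrative data, not taken from the documentation.
summary = basic_summary([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```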
Total Sum Squares - This is the sum of the squared values of the variable. It is
sometimes referred to as the unadjusted sum of squares. It is reported for its
usefulness in calculating other statistics and is not interpreted directly.
$\sum x_i^2$
Adjusted Sum Squares - This is the sum of the squared
differences from the mean.
$\sum (x_i-\mu)^2$
where
µ - sample mean.
Variance (this term was first used by Fisher,
1918a) is computed as:
$s^2 = \sum (x_i-\mu)^2 / (N-1)$
where
µ - sample mean
N - sample size
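The relationship between the two sums of squares and the variance can be illustrated with a short Python sketch. The function names and data are assumptions made for the example.

```python
def total_sum_squares(values):
    """Unadjusted sum of squares: sum of x_i squared."""
    return sum(x * x for x in values)

def adjusted_sum_squares(values):
    """Sum of squared differences from the sample mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

def sample_variance(values):
    """Adjusted sum of squares divided by N - 1."""
    return adjusted_sum_squares(values) / (len(values) - 1)

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(total_sum_squares(data), adjusted_sum_squares(data), sample_variance(data))
```

The adjusted sum of squares equals the total sum of squares minus N times the squared mean, which is why the unadjusted value is useful for computing other statistics.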
Geometric Mean - is a "summary"
statistic useful when the measurement scale is not linear. The geometric mean
(GM) is an alternative type of mean that is used for business, economic, and
biological applications. Only nonnegative values are used in the computation. If
one of the values is zero, the geometric mean is defined to be zero. It is
computed as:
$G = \left(\prod x_i\right)^{1/N}$
where
$\prod$ - product of all sample elements
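A minimal Python sketch of the geometric mean follows. It reads "only nonnegative values are used" as meaning that negative values are skipped, which is an assumption about the program's behavior; the function name is hypothetical. Logarithms are used instead of a direct product to avoid overflow on long series.

```python
import math

def geometric_mean(values):
    """Geometric mean of the nonnegative values; zero if any value is zero."""
    used = [x for x in values if x >= 0]  # assumption: negatives are skipped
    if not used:
        raise ValueError("no nonnegative values to average")
    if any(x == 0 for x in used):
        return 0.0
    # exp(mean(log x)) equals the N-th root of the product of the values.
    return math.exp(sum(math.log(x) for x in used) / len(used))

print(geometric_mean([1, 3, 9]))
```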
Harmonic Mean -
is a "summary" statistic used in analyses of frequency data; it is computed as:
$H = n \cdot \frac{1}{\sum (1/x_i)}$
where
n is the sample size.
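The harmonic mean formula can be sketched in Python as below. Rejecting zero and negative values is a choice made for this example, since the help text does not say how the program treats them; the function name is hypothetical.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of reciprocals."""
    if any(x <= 0 for x in values):
        # Assumption: the harmonic mean is only computed for positive values.
        raise ValueError("harmonic mean requires positive values")
    return len(values) / sum(1 / x for x in values)

print(harmonic_mean([1, 2, 4]))
```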
Mode -
A measure of central tendency, the mode
(the term first used by Pearson, 1895) of a sample is the value which occurs
most frequently in the sample.
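Finding the mode amounts to counting occurrences, as in this Python sketch. A sample can have several equally frequent values; returning all of them is a design choice for the example and not necessarily what the program reports. The function name is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come back in order of appearance.
    return [value for value, count in counts.items() if count == top]

print(modes([2, 4, 4, 4, 5, 5, 7, 9]))
```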
Lower value of a reliable interval (LCL), Upper value of a reliable interval (UCL) - These are the lower and upper limits of a 100(1-α)% confidence interval
estimate for the mean based on a t distribution with n-1 degrees of freedom.
This interval estimate assumes that the population standard deviation is not
known and that the data for this variable are normally distributed.
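The interval limits can be sketched in Python as the mean plus or minus a critical t value times the standard error of the mean. To keep the example free of third-party libraries, the critical value is passed in as a parameter taken from a t table; the function name, parameter name, and data are assumptions.

```python
import math
import statistics

def mean_confidence_interval(values, t_critical):
    """Return (LCL, UCL) for the mean, given a two-sided critical t value."""
    n = len(values)
    m = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # estimated standard error
    half_width = t_critical * sem
    return m - half_width, m + half_width

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
# 2.365 is the tabulated two-sided 95% critical value of t with 7 degrees of freedom.
lcl, ucl = mean_confidence_interval(data, t_critical=2.365)
print(lcl, ucl)
```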
Skewness -
This statistic measures the direction and degree of asymmetry. A value of zero
indicates a symmetrical distribution. A positive value indicates skewness (long-tailedness)
to the right while a negative value indicates skewness to the left. Values
between -3 and +3 are typical of samples from a normal
distribution.
Kurtosis - This statistic measures the heaviness of the
tails of a distribution. The usual reference point in kurtosis is the normal
distribution, which has a kurtosis of three and a skewness of zero. Values close to these
are consistent with a normal distribution but do not prove it. Unimodal distributions that have kurtosis greater
than three have heavier or thicker tails than the normal. These same
distributions also tend to have higher peaks in the center of the distribution
(leptokurtic). Unimodal distributions whose tails are lighter than the normal
distribution tend to have a kurtosis that is less than three. In this case, the
peak of the distribution tends to be broader than the normal (platykurtic). Be
forewarned that this statistic is an unreliable estimator of kurtosis for small
sample sizes.
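The help text does not state the exact estimators behind the Skewness and Kurtosis entries. One common moment-based definition, for which the normal distribution has kurtosis three, is sketched below in Python; treat it as an assumption rather than the program's formula. The function names are hypothetical.

```python
def central_moment(values, k):
    """k-th moment about the mean, dividing by the sample size."""
    m = sum(values) / len(values)
    return sum((x - m) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Moment-based kurtosis: m4 / m2^2 (equals 3 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

print(skewness([2, 4, 4, 4, 5, 5, 7, 9]), kurtosis([1, 2, 3, 4, 5]))
```

A symmetric sample such as 1, 2, 3, 4, 5 has zero skewness under this definition, while a sample with a long right tail gives a positive value.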
Fisher's Alternative Skewness
and Kurtosis - Fisher's measures are alternative estimates of
skewness and kurtosis. They are the values that Microsoft Excel calculates
as Skewness and Kurtosis.
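Assuming these entries match the formulas documented for Excel's SKEW and KURT worksheet functions, they can be sketched in Python as follows. The function names are hypothetical; skewness needs at least three values and kurtosis at least four, and the values must not all be equal.

```python
import statistics

def fisher_skewness(values):
    """Sample skewness as documented for Excel's SKEW function."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def fisher_kurtosis(values):
    """Sample excess kurtosis as documented for Excel's KURT function."""
    n = len(values)
    m = statistics.mean(values)
    s = statistics.stdev(values)
    scaled = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    fourth = sum(((x - m) / s) ** 4 for x in values)
    return scaled * fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))

print(fisher_skewness([2, 4, 4, 4, 5, 5, 7, 9]), fisher_kurtosis([1, 2, 3, 4, 5]))
```

Unlike the kurtosis with reference value three, this kurtosis is an excess measure, so a normal distribution corresponds to a value near zero.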
Standard Deviation (relative to the mean) - is a
relative measure of dispersion, commonly known as the coefficient of variation. It is most often used to compare the amount of
variation in two samples. It can be used for the same data over two time periods
or for the same time period but two different places. It is the standard
deviation divided by the mean.
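The ratio described above, the standard deviation divided by the mean (commonly called the coefficient of variation), can be sketched in Python as below. Using the sample standard deviation is an assumption, and the function name is hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = statistics.mean(values)
    if m == 0:
        raise ValueError("the ratio is undefined when the mean is zero")
    return statistics.stdev(values) / m

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(data)
print(cv)
```

Because the result is unitless, it allows two samples measured on different scales to be compared directly.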
Second Moment (about Mean) (m2),
Third Moment (about Mean) (m3), Fourth Moment (about Mean) (m4) -
moments about the mean.
Median - is the number in
the middle of a set of numbers; that is, half the numbers have values that are
greater than the median, and half have values that are less.
Median Error -
is computed as
$m_e = \sigma \left(\pi / (2N)\right)^{1/2}$
where σ is the standard deviation and N is the sample size.
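The median and its error can be sketched in Python as below. The sketch assumes that the factor (π/2N)^(1/2) multiplies the sample standard deviation, as in the usual large-sample approximation for normally distributed data; the function name and data are hypothetical.

```python
import math
import statistics

def median_error(values):
    """Approximate standard error of the median: s * sqrt(pi / (2 * N))."""
    n = len(values)
    # Assumption: the sample standard deviation stands in for sigma.
    return statistics.stdev(values) * math.sqrt(math.pi / (2 * n))

# Illustrative data, not taken from the documentation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
med = statistics.median(data)  # average of the two middle values when N is even
print(med, median_error(data))
```

Under this approximation the median error is about 1.25 times the standard error of the mean.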
p-level - see Elementary Concepts