Statistics
====================
Basic Measures
-------------------------
The sample distribution has finite size and is what has actually been
measured; the parent distribution is infinite and smooth, and is the limit of
the sample distribution as the number of measurements $N \to \infty$.
The mean, or average, is (of course):
$$\langle x \rangle = \frac{1}{N} \sum_{i=1}^{N}x_i$$
The sample variance is:
$$s^{2}_x = \frac{1}{N-1}\sum^{N}_{i=1}\left(x_i-\langle x \rangle\right)^2$$
The standard deviation is the square root of the variance; the standard
deviation of the parent distribution is represented by $\sigma_x$ instead of
$s_x$. The mean of the parent distribution is $\mu$ instead of
$\langle x \rangle$ (the sample mean is also written $\bar{x}$).
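
As a quick illustration of the sample mean, variance, and standard deviation defined above, here is a minimal Python sketch. The function names and the `data` values are made up for the example; only the standard library is used.

```python
import math

def sample_mean(xs):
    """Arithmetic mean <x> of a list of measurements."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance s_x^2 with the 1/(N-1) normalization."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """Sample standard deviation s_x, the square root of the variance."""
    return math.sqrt(sample_variance(xs))

# Hypothetical measurements, chosen only for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data), sample_variance(data), sample_std(data))
```

The $N-1$ in the denominator means at least two measurements are needed; with a single measurement the sample variance is undefined.
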
Binomial Distribution
-------------------------
If we are playing a yes/no game (e.g., flipping a coin), the binomial
distribution gives the probability of getting 'yes' exactly $x$ times out of
$n$ attempts, where $p$ is the probability of getting 'yes' on a single attempt.
$$P(x;n,p) = \frac{n!}{x! (n-x)!} p^x (1-p)^{n-x}$$
The mean of this distribution is $\mu = np$, and $\sigma = \sqrt{np (1-p)}$.
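
The binomial formula and its quoted mean and standard deviation can be checked numerically. The sketch below assumes a fair coin flipped ten times (`n = 10`, `p = 0.5`), values picked purely for illustration, and uses `math.comb` for the binomial coefficient.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Illustrative parameters: ten flips of a fair coin.
n, p = 10, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Moments computed directly from the distribution.
mean = sum(x * P for x, P in enumerate(probs))
sigma = math.sqrt(sum((x - mean) ** 2 * P for x, P in enumerate(probs)))

print(sum(probs))                           # total probability
print(mean, n * p)                          # compare with mu = n p
print(sigma, math.sqrt(n * p * (1 - p)))    # compare with sqrt(n p (1 - p))
```
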
Poisson Distribution
------------------------
$$P(x;\mu) = \frac{\mu^x}{x!} e^{-\mu}$$
The mean is $\mu$, and $\sigma=\sqrt{\mu}$.
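
A small numerical check of the Poisson formula, assuming $\mu = 3$ as an arbitrary example value. The sum over $x$ is truncated at 60 terms, which is an approximation, but the neglected tail is negligible for a mean this small.

```python
import math

def poisson_pmf(x, mu):
    """Probability of observing x counts when the expected count is mu."""
    return mu**x / math.factorial(x) * math.exp(-mu)

# Illustrative mean; the sum over x is truncated at 60 terms.
mu = 3.0
probs = [poisson_pmf(x, mu) for x in range(60)]

mean = sum(x * P for x, P in enumerate(probs))
variance = sum((x - mean) ** 2 * P for x, P in enumerate(probs))

print(sum(probs))                        # close to 1
print(mean)                              # close to mu
print(math.sqrt(variance), math.sqrt(mu))  # sigma compared with sqrt(mu)
```
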
Gaussian Distribution
--------------------------
The classic! Also called a normal distribution.
$$P(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\left(\frac{(x-\mu)^2}{2\sigma^2}\right)}$$
The mean is $\mu$ and the standard deviation is $\sigma$; unlike in the
Poisson distribution, the two are independent parameters.
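
The normalization of the Gaussian is easy to get wrong, so a numerical check is worthwhile. The sketch below uses the prefactor $1/(\sigma\sqrt{2\pi})$ and a simple midpoint-rule integrator; `integrate`, the step count, and the parameter values are assumptions made for this example, not part of any library.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma**2))

def integrate(f, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Illustrative parameters.
mu, sigma = 1.5, 0.7
pdf = lambda x: gaussian_pdf(x, mu, sigma)

total = integrate(pdf, mu - 10 * sigma, mu + 10 * sigma)  # should be close to 1
within_one_sigma = integrate(pdf, mu - sigma, mu + sigma)  # roughly 0.68
print(total, within_one_sigma)
```

The second number is the familiar statement that about 68% of the probability lies within one standard deviation of the mean.
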
Lorentzian Distribution
---------------------------
This distribution describes the line shape of a damped resonance; it is also
the shape of the power spectrum (the squared magnitude of the Fourier
transform) of an exponentially decaying sinusoid.
$$P(x;\mu,\Gamma) = \frac{1}{\pi} \frac{\Gamma/2}{(x-\mu)^2 + (\Gamma/2)^2}$$
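where the peak is centered at $\mu$ and the linewidth (the full width of the
peak at half its maximum height) is $\Gamma$. Strictly speaking, the tails
fall off too slowly for the mean and variance integrals to converge, so $\mu$
is the center of the peak rather than a mean.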
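
To make the meaning of $\Gamma$ concrete, the following sketch evaluates the Lorentzian at its center and at $\mu \pm \Gamma/2$, where it should have dropped to half of its peak value. The parameter values are arbitrary.

```python
import math

def lorentzian_pdf(x, mu, gamma):
    """Lorentzian (Cauchy) density centered at mu with full width gamma."""
    half_width = gamma / 2.0
    return (1.0 / math.pi) * half_width / ((x - mu) ** 2 + half_width**2)

# Illustrative parameters.
mu, gamma = 5.0, 0.4

peak = lorentzian_pdf(mu, mu, gamma)               # equals 2 / (pi * gamma)
upper = lorentzian_pdf(mu + gamma / 2, mu, gamma)  # half of the peak value
lower = lorentzian_pdf(mu - gamma / 2, mu, gamma)  # half of the peak value
print(peak, upper, lower)
```
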
Error Analysis
-------------------
For a given set of measurements, the error on the mean is not the standard
deviation (which measures the spread of the individual measurements); it is
$\frac{s_x}{\sqrt{N}}$. The standard deviation should stay roughly constant as
$N$ gets very large, but the error on the mean should get smaller. More
generally, if the errors $\sigma_i$ are different for each individual
measurement, the mean is the weighted average and its error is:
$$\bar{x}=
\frac{ \sum_{i=1}^{N} x_i / \sigma_{i}^2}{\sum_{i=1}^{N} 1/\sigma_{i}^2}
\pm \sqrt{ \frac{1}{\sum_{i=1}^{N} 1/\sigma_{i}^2}}$$
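
The weighted mean and its uncertainty translate directly into code. This is a minimal sketch with made-up numbers; `weighted_mean` is a name chosen for the example. When every $\sigma_i$ is the same, the result reduces to the plain mean with error $\sigma/\sqrt{N}$.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty."""
    weights = [1.0 / s**2 for s in sigmas]
    weight_sum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / weight_sum
    error = math.sqrt(1.0 / weight_sum)
    return mean, error

# Two hypothetical measurements of the same quantity; the first is more precise,
# so it pulls the weighted mean toward itself.
mean, error = weighted_mean([10.0, 12.0], [1.0, 2.0])
print(mean, error)

# Equal uncertainties: plain average, error sigma / sqrt(N).
equal_mean, equal_error = weighted_mean([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(equal_mean, equal_error, 0.5 / math.sqrt(3))
```
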
$\chi^2$ Distribution
------------------------
$\chi^2$ is often written "chi-squared" and is a metric for how well a fit
curve matches uncertain data.
$$\chi^2 = \sum_{i=1}^{N}\left(\frac{x_i-\mu_i}{\sigma_i}\right)^2$$
The number of degrees of freedom of the system is the number of measurements
$N$ minus the number of variable parameters in a curve fit $N_c$: $\nu = N-N_c$.
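The reduced $\chi^2$ value is $\chi^{2}_r = \chi^2 /\nu$. You want $\chi^{2}_r$
to be around (but not exactly!) 1. If it is significantly larger, the fit is
bad or the uncertainties $\sigma_i$ have been underestimated; if it is
significantly smaller, the curve probably has too many free parameters (it is
fitting the noise) or the uncertainties have been overestimated.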
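
A minimal sketch of computing $\chi^2$ and the reduced $\chi^2$ for a fit. The data, the straight-line model $y = x$, and the assumption that the line had two fitted parameters are all hypothetical, chosen only to show the bookkeeping.

```python
def chi_squared(observed, expected, sigmas):
    """Sum of squared residuals, each scaled by its measurement uncertainty."""
    return sum(((x - m) / s) ** 2 for x, m, s in zip(observed, expected, sigmas))

def reduced_chi_squared(observed, expected, sigmas, n_params):
    """Chi-squared divided by the degrees of freedom nu = N - N_c."""
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more measurements than fit parameters")
    return chi_squared(observed, expected, sigmas) / dof

# Hypothetical data with a hypothetical model y = x evaluated at x = 1..5.
observed = [1.1, 1.9, 3.2, 3.9, 5.1]
expected = [1.0, 2.0, 3.0, 4.0, 5.0]
sigmas = [0.1] * 5

chi2 = chi_squared(observed, expected, sigmas)
chi2_r = reduced_chi_squared(observed, expected, sigmas, n_params=2)
print(chi2, chi2_r)
```

Here the residuals are 1, 1, 2, 1, and 1 standard deviations, so $\chi^2 \approx 8$ with $\nu = 3$, giving $\chi^{2}_r \approx 2.7$.
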