I was curious how well, in practice, the standard deviation can be estimated from different numbers of observed samples. I thought the plot was interesting, so I figured I’d share.
Here is the Julia code I used:
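    using Distributions
    using Gadfly
    using Statistics            # std lives here on Julia 0.7 and later
    import Cairo, Fontconfig    # needed by Gadfly's PNG backend

    # Standard deviation estimated from n draws of a normal with mean 2 and σ = 5
    function std_of_sample(n)
        std(rand(Normal(2, 5), n))
    end

    nmax = 200
    # Sample sizes 2 through nmax + 1, each repeated about 100 times
    n_observations = collect(0:nmax * 100) .% nmax .+ 2

    p = plot(
        x=n_observations,
        y=map(std_of_sample, n_observations),
        Guide.XLabel("Number of observations"),
        Guide.YLabel("Estimated σ"),
        Theme(default_color=colorant"black"),    # color("black") on older Gadfly versions
        Coord.Cartesian(ymin=-3, ymax=13, xmin=-20, xmax=220),
        Guide.xticks(ticks=[0, 10, 25, 50, 100, 200])
    )

    draw(PNG(640px, 480px), p)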
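Each point on the plot draws n samples from a normal distribution with mean 2 and known standard deviation 5, then computes the standard deviation of those samples. The sample size n runs from 2 to 201, and each size is repeated about 100 times, so the vertical scatter at a given n shows how much the estimate varies.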
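To put rough numbers on the scatter, here is a minimal sketch that summarizes the estimates at a few sample sizes instead of plotting them. It is not part of the original code: `std_estimates` is a hypothetical helper, the draws use `randn` from the standard library rather than Distributions, and the comparison value σ / sqrt(2(n - 1)) is the usual large-sample approximation for the spread of the sample standard deviation of normal data, so it will be poor at very small n.

```julia
using Random, Statistics

# Hypothetical helper (not in the original post): repeat the experiment `reps`
# times at sample size n and return the estimated standard deviations.
function std_estimates(n; reps=1000, mu=2.0, sigma=5.0, rng=Random.default_rng())
    # mu .+ sigma .* randn(rng, n) draws n values from a normal with mean mu and std sigma
    [std(mu .+ sigma .* randn(rng, n)) for _ in 1:reps]
end

rng = MersenneTwister(1)   # fixed seed so reruns give the same numbers
for n in (2, 10, 50, 200)
    est = std_estimates(n; rng=rng)
    # Large-sample approximation (an assumption, rough for small n)
    approx = 5.0 / sqrt(2 * (n - 1))
    println("n = $n: mean estimate = $(round(mean(est), digits=2)), ",
            "spread = $(round(std(est), digits=2)), ",
            "approximation = $(round(approx, digits=2))")
end
```

The spread should shrink roughly like 1 / sqrt(n), which is the funnel shape visible in the plot.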