Standard deviation is a number that tells you how far the values in a dataset typically stray from the average.
That one sentence covers most of what you need to know day to day. But the mechanics behind it are worth understanding, because standard deviation shows up everywhere from school report cards to financial risk disclosures to scientific papers. Once you see how it works, you will know exactly what it is telling you each time you encounter it.
Start with a simple idea: a group of numbers has an average, and each number in the group sits some distance away from that average. Standard deviation is a way of summarizing those distances into a single figure. A small standard deviation means the numbers cluster tightly around the average. A large one means they scatter widely.
Consider two sets of exam scores. Class A: 78, 80, 79, 81, 80. Class B: 50, 65, 80, 95, 100. The two classes have nearly the same average (79.6 for Class A and 78 for Class B), yet the experiences in those two rooms were completely different. Standard deviation captures that difference. Class A has a standard deviation near 1; Class B's is around 20.
That gap tells a story no average can.
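The comparison can be checked in a few lines of Python. This is a minimal sketch using the standard library's `statistics` module; the variable names are illustrative, and using `pstdev` (the divide-by-count form) is a choice made here, not something the example specifies.

```python
import statistics

class_a = [78, 80, 79, 81, 80]
class_b = [50, 65, 80, 95, 100]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = statistics.mean(scores)
    # pstdev treats the list as the complete group and divides by the count.
    spread = statistics.pstdev(scores)
    print(f"{name}: mean = {mean:.1f}, standard deviation = {spread:.2f}")
```

Running it prints each class's mean next to its standard deviation, which makes the contrast between similar averages and very different spreads easy to see.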
For a population (every member of the group, not just a sample), the formula is:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \text{mean}\right)^2}$$
Where sigma is the population standard deviation, N is the count of values, xi is each individual value, and mean is the arithmetic average of all values. The sum inside the square root adds up the squared distance from the mean for every data point. Dividing by N turns that total into an average squared distance. Taking the square root converts it back to the original units of measurement.
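The description above translates almost line for line into code. The function below is a sketch, not a reference implementation; the name `population_std` is made up for this illustration, and the sample input is Class A from the earlier exam example.

```python
import math

def population_std(values):
    """Square root of the average squared distance from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    # Squared distance from the mean for every data point.
    squared_distances = [(x - mean) ** 2 for x in values]
    # Divide by N for the average squared distance, then take the square root
    # to get back to the original units.
    return math.sqrt(sum(squared_distances) / n)

print(population_std([78, 80, 79, 81, 80]))
```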
Squaring the deviations before averaging them serves two purposes. It makes all the distances positive (so negative and positive distances do not cancel each other out), and it gives extra weight to values far from the mean. Those outliers pull the standard deviation upward.
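Both effects of squaring can be seen on the Class B scores from the earlier example. The snippet is a rough illustration with made-up variable names: it shows that the signed distances cancel out, and it compares how much of the total the farthest score contributes before and after squaring.

```python
scores = [50, 65, 80, 95, 100]  # Class B from the exam example
mean = sum(scores) / len(scores)

signed = [x - mean for x in scores]
squared = [d ** 2 for d in signed]

# Signed distances cancel out, so their sum says nothing about spread.
print("sum of signed distances:", sum(signed))

# Share of the total contributed by the score farthest from the mean,
# first using plain (absolute) distances, then using squared distances.
farthest = max(signed, key=abs)
share_plain = abs(farthest) / sum(abs(d) for d in signed)
share_squared = farthest ** 2 / sum(squared)
print(f"farthest score's share: {share_plain:.0%} plain, {share_squared:.0%} squared")
```

The farthest score takes up a larger share of the squared total than of the plain total, which is the extra weight the paragraph above refers to.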
This eight-value dataset appears in many statistics textbooks as a clean illustration of the method. Work through it step by step.
So for this dataset, the typical value sits about 2 units away from the mean of 5. Check that against the raw numbers: the values 4 and 6 are both exactly 1 away from 5, while 2 is 3 away and 9 is 4 away. An "average distance" of 2 feels about right.
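The step-by-step working can be sketched in Python. The eight values are not listed in this text, so the list below is an assumption: it is a set commonly used in textbook illustrations that also has a mean of 5 and a population standard deviation of 2. Substitute the actual values if they differ.

```python
import math

# Assumed dataset: a common textbook example, not taken from this article.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)                           # Step 1: count the values
mean = sum(data) / n                    # Step 2: find the mean
deviations = [x - mean for x in data]   # Step 3: distance of each value from the mean
squared = [d ** 2 for d in deviations]  # Step 4: square each distance
variance = sum(squared) / n             # Step 5: average the squares (divide by N)
sigma = math.sqrt(variance)             # Step 6: take the square root

print(f"N = {n}, mean = {mean}")
print(f"squared distances = {squared}")
print(f"variance = {variance}, standard deviation = {sigma}")
```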
You can skip the arithmetic on your own datasets using the standard deviation calculator, which shows the full step-by-step working just as above.
There are two versions of the formula, and mixing them up is the most common mistake people make when first learning this topic.
Population standard deviation divides by N. Use it when your dataset is the entire group you care about: the test scores of every student in one specific class, the weights of every package produced in a single factory run, the daily temperatures recorded at one weather station over a complete year.
Sample standard deviation divides by N minus 1 instead of N. Use it when your data is a subset drawn from a larger population and you want to estimate the spread of the full group. Dividing by N minus 1 makes the result slightly larger, which corrects for the fact that a sample tends to underestimate true population spread. This correction is known as Bessel's correction, and it gives an unbiased estimate of the population variance.
In practice: if someone hands you a spreadsheet of survey responses from 200 people out of a possible 10,000, use the sample formula. If the spreadsheet contains data from all 200 employees at a company and those 200 people are the entire population you are studying, use the population formula. Most statistics software (and most spreadsheet functions) defaults to the sample version, so it is worth knowing which one you are looking at.
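To see the two versions side by side, here is a small sketch using Python's standard `statistics` module, which provides both. The data is Class A from the exam example; the variable names are illustrative.

```python
import statistics

scores = [78, 80, 79, 81, 80]

population_sd = statistics.pstdev(scores)  # divides by N
sample_sd = statistics.stdev(scores)       # divides by N - 1 (Bessel's correction)

print(f"population: {population_sd:.3f}")
print(f"sample:     {sample_sd:.3f}")
```

The sample figure comes out slightly larger, as expected. Defaults do vary between tools: NumPy's `numpy.std` uses the population form unless `ddof=1` is passed, while pandas' `Series.std` uses the sample form, so it is worth checking the documentation of whichever tool produced a number.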
On its own, the number has no universal meaning. A standard deviation of 15 is enormous for a dataset of classroom test scores (max 100) and negligible for a dataset of stellar distances measured in light-years. Context always matters.
That said, a few practical rules of thumb hold across many applications. In a normal distribution (the familiar bell curve), about 68 percent of values fall within one standard deviation of the mean, about 95 percent fall within two, and about 99.7 percent fall within three. This is sometimes called the empirical rule or the 68-95-99.7 rule. When a value falls more than two standard deviations from the mean, it draws attention. Three or more standard deviations away is genuinely unusual in most naturally occurring distributions.
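The 68-95-99.7 figures can be reproduced from the normal distribution itself. This sketch uses `NormalDist` from Python's standard `statistics` module; the variable name `standard_normal` is illustrative, and the result applies only to data that is actually close to normally distributed.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

for k in (1, 2, 3):
    # Probability of landing within k standard deviations of the mean.
    within = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} SD: {within:.1%}")
```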
Financial analysts use standard deviation to measure volatility. A stock with a high standard deviation in daily returns swings widely from one day to the next, which makes its short-term performance harder to predict. Quality control engineers track it to catch manufacturing drift before products go out of spec. Psychologists report it alongside mean scores to show how much individuals varied from the group average.
If you have a list of numbers ready, the standard deviation calculator handles the whole process in one step. Paste in your values, choose population or sample mode, and it returns the result with all intermediate steps shown so you can verify the work or learn from it.
A high standard deviation means the values in a dataset are spread far from the mean. Test scores with a standard deviation of 20 points vary much more than scores with a standard deviation of 3 points. High spread can signal inconsistency, wide natural variation, or a dataset that contains outliers worth investigating.
Variance is the average of the squared deviations from the mean. Standard deviation is the square root of variance. Because variance is in squared units (square inches, square dollars), it is harder to interpret directly. Taking the square root brings the measure back to the same units as the original data, which is why standard deviation gets used in most reports and textbooks.
Use population standard deviation (divide by N) when your dataset contains every member of the group you care about. Use sample standard deviation (divide by N minus 1) when your data is a subset drawn from a larger population and you want to estimate the true spread of the full group. Most statistics software defaults to the sample version.
Standard deviation is not the same as standard error. Standard deviation describes the spread of individual values within a dataset. Standard error describes how much a sample mean is likely to differ from the true population mean. Standard error equals the standard deviation divided by the square root of the sample size, so it shrinks as your sample grows larger.
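The relationship between the two can be sketched as follows. The helper name `standard_error` is made up for this example, and using the sample standard deviation (`statistics.stdev`) in the numerator is an assumption, chosen because standard error is normally computed from a sample. The scores are Class A from the earlier exam example.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(sample size)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

scores = [78, 80, 79, 81, 80]
print(f"standard deviation: {statistics.stdev(scores):.3f}")
print(f"standard error:     {standard_error(scores):.3f}")
```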

Editor at Encore Editorial, Chris Terry sets the editorial standards here and turns dense topics into plain English. He has written widely on education, finance, and consumer markets.