Statistics & Probabilities
un+certainty
"So, You're Saying There's a Chance?"
Classic… and a great way to think about statistics and probabilities, because that's what they are at heart: the chance itself. We tend to think of these things as risk assessment, and that's not wrong. It's just incomplete. It is not simply a risk assessment; it is a motion assessment. If I take motion x, then motion y or motion z happens as a result. In reality, it is a choice of path. Some of these paths are not truly linear, but they have linearity built into them. You can walk into a grocery store and win the lottery, but only if you followed the linear procedure of purchasing the ticket.
Let that sink in. Every "random" outcome has a chain of required motions behind it. The lottery looks random because we only see the final step—the drawing. But the path to getting there is completely deterministic. You decided to go to the store. You walked in. You picked a ticket. You paid for it. Every step was a choice, a motion, a cause. The randomness isn't in the process. It's in the outcome of the last step, and only because we don't have enough information to predict it.
That's statistics and probability in a nutshell. It's the math of what happens when you can't see the whole path.
What Even Is Probability?
Here's the simplest definition: probability is a number between 0 and 1 that measures how available a particular outcome is.
0 means impossible. That path is blocked. You cannot take it.
1 means certain. That path is the only one. You must take it.
Everything in between is a measure of how open that path is relative to all the other paths available.
Flip a coin. Two paths: heads or tails. If the coin is fair, both paths are equally available. So each one gets probability 1/2 = 0.5. Not because God decrees it. Not because of some mystical property of coins. Because there are two outcomes and the coin doesn't favor either one. The availability is split equally.
Roll a die. Six paths, equally available. Each gets 1/6. Simple division of availability across options.
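That "division of availability" is literally just counting and dividing. A minimal sketch in Python (the `availability` helper is my name for it, not anything standard):

```python
from fractions import Fraction

def availability(outcomes):
    """Split probability equally across equally available paths."""
    outcomes = list(outcomes)
    return {o: Fraction(1, len(outcomes)) for o in outcomes}

coin = availability(["heads", "tails"])
die = availability(range(1, 7))

print(coin["heads"])  # 1/2
print(die[3])         # 1/6
```

Note that the availabilities always sum to 1: the full map accounts for every path.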
Now here's where it gets interesting. Roll two dice and ask: what's the probability the sum is 7? There are 36 total combinations (6 × 6). Six of them sum to 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1). So the probability is 6/36 = 1/6.
But the probability of rolling a sum of 2? Only one way: (1,1). So it's 1/36. The paths aren't equally available anymore. Some sums have more routes leading to them than others. Probability is measuring how many paths converge on an outcome.
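You can check the path-counting by brute force: enumerate all 36 combinations and count how many routes converge on each sum. A quick sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 paths
routes = Counter(a + b for a, b in rolls)      # how many paths reach each sum

p_seven = Fraction(routes[7], len(rolls))  # 6/36 = 1/6
p_two = Fraction(routes[2], len(rolls))    # 1/36
```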
The Sample Space: Your Map of All Possible Paths
Mathematicians call the set of all possible outcomes the sample space. This is your map. Every path you could possibly take is somewhere on this map.
For a coin flip: {Heads, Tails}. That's the whole map. Two locations.
For a die: {1, 2, 3, 4, 5, 6}. Six locations.
For two dice: {(1,1), (1,2), ... (6,6)}. Thirty-six locations.
An event is a subset of the sample space—a collection of paths you care about. "Rolling a 7" is the event {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}. Six paths out of thirty-six.
Probability of an event = (paths in the event) / (total paths in the sample space).
That's it. That's the fundamental formula. Everything else in probability is a variation on this theme: counting the paths you want, counting the total paths, dividing.
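The fundamental formula is one line of code. A sketch (the `probability` helper is my naming; the event here is "sum is 7" from the dice example):

```python
from fractions import Fraction
from itertools import product

def probability(event, sample_space):
    """P(event) = paths in the event / total paths on the map."""
    hits = sum(1 for path in sample_space if event(path))
    return Fraction(hits, len(sample_space))

two_dice = list(product(range(1, 7), repeat=2))
p = probability(lambda roll: sum(roll) == 7, two_dice)  # 1/6
```

Writing the event as a predicate (a yes/no test on each path) makes "a collection of paths you care about" concrete: the event is just the subset of the map that passes the test.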
Conditional Probability: Narrowing the Map
This is where probability becomes motion assessment for real.
Conditional probability asks: given that you've already taken some steps, what paths are still available?
You're in that grocery store. You've already walked in. You've already gone to the counter. You've already bought the lottery ticket. Given all those motions have occurred, what's the probability you win?
Before you left your house, the probability of winning was basically zero—because the path required steps you hadn't taken yet. After buying the ticket, the probability jumps. Not because the lottery changed. Because YOUR position on the map changed. You've moved closer to the outcome.
The notation is P(A|B), read "probability of A given B." The vertical bar means "given that B has already happened." It's not just a filter. It's telling you where you are on the map. B is your current position. A is where you're trying to go. P(A|B) measures how available A is from where you're standing now.
The formula:
P(A|B) = P(A and B) / P(B)
Translation: the probability of reaching A from position B equals the probability of the path that goes through both A and B, divided by the probability of being at B in the first place.
Think of it like this. You're in a big building with 100 rooms. 30 rooms have windows. 10 rooms are on the top floor. 5 rooms are on the top floor AND have windows. If you know you're on the top floor—that's your B—then the probability of having a window is 5/10 = 0.5. Not 30/100. Because knowing your position eliminates rooms you can't be in. The map got smaller.
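The building example, run straight through the formula (counts as given in the text):

```python
from fractions import Fraction

# 100 rooms total; 10 on the top floor; 5 of those also have windows.
p_top = Fraction(10, 100)            # P(B): being on the top floor
p_top_and_window = Fraction(5, 100)  # P(A and B): top floor AND window

p_window_given_top = p_top_and_window / p_top  # P(A|B) = 1/2, not 30/100
```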
Bayes' Theorem: Updating Your Beliefs
Thomas Bayes. 18th century Presbyterian minister. Figured out how to reverse conditional probability, and it changed everything.
Here's the problem: you know P(B|A)—the probability of seeing evidence B if hypothesis A is true. But what you want is P(A|B)—the probability that hypothesis A is true given that you observed evidence B. Those are not the same thing.
Example. A medical test is 99% accurate, meaning it gives the right answer 99% of the time whether you're sick or healthy. You test positive. What's the probability you're actually sick?
Your gut says 99%. Your gut is wrong.
If the disease affects 1 in 1000 people, then out of 1000 people tested, the 1 sick person tests positive (a true positive). But about 10 healthy people also test positive: the test is wrong 1% of the time, and 1% of the other 999 people is roughly 10 false positives.
So out of 11 positive tests, only 1 actually has the disease. The real probability is about 1/11 ≈ 9%. Not 99%.
What happened? The base rate matters. How common the disease is changes the meaning of the test result. Bayes' theorem keeps track of this:
P(A|B) = P(B|A) × P(A) / P(B)
P(A) is the prior—what you believed before seeing evidence. P(B|A) is the likelihood—how expected the evidence is if your hypothesis is true. P(B) is the total probability of the evidence. P(A|B) is the posterior—your updated belief after seeing evidence.
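Here's the medical test run through the theorem directly. A sketch, assuming "99% accurate" means both a 99% true-positive rate and a 1% false-positive rate, as in the head-count above:

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) = P(B|A) * P(A) / P(B), with P(B) expanded over both hypotheses."""
    # P(B): total probability of a positive test, sick or healthy.
    p_evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / p_evidence

p_sick = bayes_posterior(prior=0.001, likelihood=0.99, false_positive_rate=0.01)
# ≈ 0.09: about 9%, not 99%
```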
This is learning. Mathematically. You start with a belief, you observe the world, you update. Every time you update, you're moving on the map. Your position changes. The available paths reconfigure.
Expected Value: The Average Path
If you could take every possible path at once and average where you end up, that's the expected value.
Flip a coin. Heads you win $10, tails you lose $5. Expected value: (0.5 × $10) + (0.5 × -$5) = $5 - $2.50 = $2.50.
You won't win $2.50 on any single flip. It's not even a possible outcome. But if you flip 1000 times, your average gain per flip will be damn close to $2.50. The expected value is where the path leads over time.
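Here's that convergence in action, with a quick simulation (the seed is arbitrary, chosen only so the run is repeatable):

```python
import random

def expected_value(outcomes):
    """Sum of probability * payoff over every path."""
    return sum(p * payoff for p, payoff in outcomes)

ev = expected_value([(0.5, 10), (0.5, -5)])  # 2.5

random.seed(42)
flips = [random.choice([10, -5]) for _ in range(100_000)]
average_gain = sum(flips) / len(flips)  # lands near 2.5, never exactly on a flip
```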
This is why casinos make money. Every game has a negative expected value for the player. Each individual play is uncertain—you might win big. But the path over thousands of plays leads exactly where the math says. The casino isn't gambling. The casino is doing math. You're the one gambling.
Expected value is also why the lottery is a terrible investment and an okay entertainment purchase. The expected value of a $2 lottery ticket is about $0.80. You lose $1.20 on average. But you're not buying the expected value. You're buying the path—the possibility, the dream, the linear procedure of purchasing hope. As long as you know the math, that's your choice to make.
Distributions: The Shape of Uncertainty
When you have a bunch of possible outcomes with different probabilities, the pattern they form is called a distribution. This is uncertainty given a shape.
The most famous is the normal distribution—the bell curve. Heights, test scores, measurement errors, blood pressure—an insane number of things in nature follow this shape. Most values cluster near the middle (the mean), and fewer values appear as you move further out.
Why does it show up everywhere? Because of the Central Limit Theorem—one of the most powerful results in all of statistics. It says: take almost any random process (it just needs a finite mean and spread), average a bunch of samples from it, and those averages will form a bell curve. Doesn't matter what the original process looks like. Could be coin flips, dice rolls, heights of trees, anything. Average enough of them and you get a bell curve.
That's not a coincidence. It's structure. When many small independent factors add together, they converge to the same shape. The universe has a default shape for "lots of small things combining," and it's the bell curve.
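You can watch this happen with nothing but a fair die. A single roll is flat, not bell-shaped at all, yet averages of rolls pile up around 3.5. A sketch:

```python
import random
import statistics

random.seed(0)

# 10,000 samples, each one the average of 50 rolls of a fair die.
averages = [
    statistics.mean(random.randint(1, 6) for _ in range(50))
    for _ in range(10_000)
]

center = statistics.mean(averages)   # ≈ 3.5, the die's own mean
spread = statistics.stdev(averages)  # ≈ 0.24, far tighter than a single roll
```

Plot a histogram of `averages` and the bell appears, even though the process underneath is a flat six-way split.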
The bell curve has two numbers that describe it completely: the mean (where the center is) and the standard deviation (how spread out it is). A tight bell curve means most values are close to the mean—high certainty. A wide bell curve means values are spread out—high uncertainty.
See that? Certainty and uncertainty aren't opposites. They're the same measurement. The standard deviation. Small spread = certain. Large spread = uncertain. The bell curve doesn't judge. It just shows you how wide the path is.
Statistics: Looking Backward
Probability looks forward—what might happen. Statistics looks backward—what already happened, and what can we learn from it.
You collected data. A bunch of numbers. Measurements. Observations. Statistics is the toolkit for extracting meaning from those numbers.
The mean (average) tells you the center of gravity. Add everything up, divide by how many. It's where the data balances.
The median tells you the middle value. Line everything up in order, pick the one in the center. This is more robust than the mean because one billionaire in a room of broke people doesn't distort the median like it distorts the mean.
The standard deviation tells you how spread out the data is. Are all the values clustered tight, or scattered all over? This is the uncertainty measurement from before, but applied to real observed data instead of theoretical outcomes.
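The billionaire-in-the-room effect, with made-up numbers to show the distortion:

```python
import statistics

# Nine people earning $30,000 and one billionaire (hypothetical incomes).
incomes = [30_000] * 9 + [1_000_000_000]

mean_income = statistics.mean(incomes)      # 100,027,000: dragged wildly off
median_income = statistics.median(incomes)  # 30,000: unmoved by the outlier
spread = statistics.stdev(incomes)          # enormous, flagging the distortion
```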
Here's the deep insight: statistics is using the paths already taken to estimate the paths not yet taken. You measure 1000 people's heights to predict the next person's height. You track 10,000 medical outcomes to predict the next treatment's effectiveness. The past is a map of the future—not a perfect map, but the best one you've got.
Correlation Is Not Causation (but It's Not Nothing)
You've heard this a million times. "Correlation is not causation." And it's true. Ice cream sales and drowning deaths both increase in summer. Ice cream doesn't cause drowning. Summer causes both.
But here's what people miss: correlation is still information. It's telling you that two paths tend to move together. Maybe one causes the other. Maybe something else causes both. Maybe it's pure coincidence. Correlation doesn't tell you which—but it points you toward something worth investigating.
Correlation is a number between -1 and 1. Positive correlation means they move together (one goes up, the other goes up). Negative correlation means they move opposite (one goes up, the other goes down). Zero means no linear relationship; a nonlinear pattern can still hide behind a zero.
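Pearson's correlation coefficient is the standard version of this number. A from-scratch sketch (Python 3.10+ also ships this as `statistics.correlation`):

```python
import statistics

def correlation(xs, ys):
    """Pearson correlation: covariance rescaled to land in [-1, 1]."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

moving_together = correlation([1, 2, 3, 4], [2, 4, 6, 8])  # 1.0
moving_opposite = correlation([1, 2, 3, 4], [8, 6, 4, 2])  # -1.0
```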
The mistake isn't finding correlations. The mistake is assuming the path you see is the only path that could explain it. That's lazy navigation. Good statistics means finding the correlation, then doing the hard work of figuring out the actual structure underneath.
The Motion Assessment
Here's what I want you to take away. Statistics and probability aren't about predicting the future or analyzing the past. They're about understanding the path structure of reality.
Every choice you make is a motion through a space of possibilities. Some paths are wide open—high probability. Some are narrow—low probability. Some paths require other paths first—conditional probability. Some paths look available but lead nowhere—misleading correlations.
The math doesn't make your choices for you. The math shows you the map. What paths exist, how wide they are, where they've led before. What you do with that map is your decision.
But here's the thing Lloyd Christmas understood that a lot of smart people don't: knowing the chance is small doesn't mean the path doesn't exist. It means the path is narrow. And sometimes narrow paths lead to the most interesting places.
You just have to know you're choosing to walk a narrow path. That's the difference between a gambler and a fool. The gambler knows the odds. The fool doesn't know there are odds.
Un+certainty. They come together. You can't have one without the other. And the math that handles both is the math of motion through possibility.