Statistics & Probability Reference

Comprehensive reference for statistical concepts, probability distributions, data analysis methods, and common formulas used in statistics and probability.

1. Descriptive Statistics

Mean (Arithmetic Average)

Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Variables:
- $x_i$ = each value in the dataset
- $n$ = number of values
Explanation: The sum of all values divided by their count. The mean is sensitive to outliers, so use it for roughly symmetric distributions without extreme values.

Median

Explanation: The middle value of a dataset when ordered; for an even number of values, the average of the two middle values. Use it for skewed distributions or when outliers are present.

Mode

Explanation: The most frequently occurring value in a dataset. Useful for categorical data.

Variance

Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
Variables:
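- $s^2$ = sample variance
- $x_i$ = each value in the dataset
- $\bar{x}$ = mean of the dataset
- $n$ = number of values
Explanation: Measures the dispersion of a sample around its mean, in squared units. Larger values indicate more spread. Dividing by $n-1$ rather than $n$ (Bessel's correction) makes $s^2$ an unbiased estimate of the population variance.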

Standard Deviation

Formula: $s = \sqrt{s^2}$
Explanation: The square root of variance, representing data spread in the same units as the data.
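
Example: A minimal sketch of the descriptive statistics above using Python's standard `statistics` module. The dataset is made up for illustration; `statistics.variance` and `statistics.stdev` use the $n-1$ denominator shown in the variance formula.

```python
import statistics

# Illustrative values only (not taken from this reference sheet).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # sum of values divided by n
median = statistics.median(data)      # average of the two middle values when n is even
mode = statistics.mode(data)          # most frequently occurring value
variance = statistics.variance(data)  # sample variance, denominator n - 1
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean, median, mode, variance, std_dev)
```

For population (rather than sample) figures, the same module offers `statistics.pvariance` and `statistics.pstdev`, which divide by $n$.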

2. Probability Basics

Probability Rules

Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication Rule: $P(A \cap B) = P(A) \times P(B|A)$

Conditional Probability

Formula: $P(A|B) = \frac{P(A \cap B)}{P(B)}$, defined for $P(B) > 0$
Explanation: Probability of $A$ occurring given that $B$ has occurred.
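
Example: The addition rule, multiplication rule, and conditional probability can be checked by enumerating a small sample space. The sketch below assumes two fair six-sided dice and two made-up events; the helper `prob` is hypothetical and exists only for this illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair dice (an assumption for this example).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def A(o):
    return o[0] == 6            # first die shows 6

def B(o):
    return o[0] + o[1] >= 10    # total is at least 10

p_a = prob(A)
p_b = prob(B)
p_a_and_b = prob(lambda o: A(o) and B(o))
p_a_or_b = prob(lambda o: A(o) or B(o))

# Conditional probabilities from the definition P(X|Y) = P(X and Y) / P(Y).
p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

print(p_a_or_b == p_a + p_b - p_a_and_b)   # addition rule holds
print(p_a_and_b == p_a * p_b_given_a)      # multiplication rule holds
```

Using `Fraction` keeps the arithmetic exact, so the rules can be compared with `==` rather than a floating-point tolerance.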

Bayes' Theorem

Formula: $P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$
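Explanation: Updates the probability of $A$ given new evidence $B$. Here $P(A)$ is the prior, $P(B|A)$ the likelihood, and $P(A|B)$ the posterior; the formula requires $P(B) > 0$.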
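
Example: A sketch of Bayes' theorem for a diagnostic-test scenario. The numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are invented for illustration, and the function name `bayes_posterior` is hypothetical. $P(B)$ is expanded with the law of total probability, which this sheet does not list separately.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# A = "has the condition", B = "test is positive" (illustrative numbers).
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

Even with an accurate test, the posterior stays well below the sensitivity because the prior is small.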

3. Common Distributions

Normal Distribution

Formula: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
Variables:
- $\mu$ = mean
- $\sigma$ = standard deviation
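Explanation: Symmetric, bell-shaped curve centered at $\mu$ with spread set by $\sigma$. Appropriate for continuous data that cluster symmetrically around the mean with light tails; symmetry alone does not make data normal.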
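
Example: The density formula can be written out directly and cross-checked against `statistics.NormalDist` from the Python standard library. The parameters $\mu = 100$ and $\sigma = 15$ are arbitrary choices for this sketch.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x, written out from the formula."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

dist = NormalDist(mu=100, sigma=15)  # illustrative parameters

# The hand-written density should agree with the library value.
print(normal_pdf(115, mu=100, sigma=15), dist.pdf(115))

# Probability of landing within one standard deviation of the mean (roughly 68%).
within_one_sigma = dist.cdf(115) - dist.cdf(85)
print(within_one_sigma)
```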

Binomial Distribution

Formula: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
Variables:
- $n$ = number of trials
- $k$ = number of successes
- $p$ = probability of success
Explanation: Probability of exactly $k$ successes in $n$ independent trials, each with two possible outcomes (success/failure) and the same success probability $p$.
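
Example: A direct translation of the binomial formula, assuming a fair coin flipped 10 times. `math.comb` computes the binomial coefficient $\binom{n}{k}$.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of exactly 3 heads in 10 fair coin flips (illustrative).
p_three = binomial_pmf(3, 10, 0.5)

# Sanity check: probabilities over k = 0..n should sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(p_three, total)
```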

Poisson Distribution

Formula: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
Variables:
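- $\lambda$ = average number of occurrences in the interval
- $k$ = number of occurrences ($k = 0, 1, 2, \dots$)
Explanation: Models the number of events in a fixed interval of time or space, assuming events occur independently at a constant average rate.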
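
Example: A sketch of the Poisson formula, assuming an average of $\lambda = 2$ events per interval (an arbitrary value for illustration).

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) when events occur at an average of lam per interval."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 2.0  # illustrative average rate

p_zero = poisson_pmf(0, lam)                             # no events in the interval
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))  # P(X <= 2)
print(p_zero, p_at_most_two)
```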

Exponential Distribution

Formula: $f(x|\lambda) = \lambda e^{-\lambda x}$ for $x \geq 0$
Variables:
- $\lambda$ = rate parameter ($\lambda > 0$)
Explanation: Models the time between consecutive events in a Poisson process. The mean waiting time is $1/\lambda$.
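
Example: A sketch of the exponential density together with its cumulative distribution function $P(X \leq x) = 1 - e^{-\lambda x}$, which is not listed above but follows by integrating the density. The rate of 0.5 events per minute is a made-up value.

```python
import math

def exponential_pdf(x, lam):
    """Density of the waiting time at x; zero for negative x."""
    if x < 0:
        return 0.0
    return lam * math.exp(-lam * x)

def exponential_cdf(x, lam):
    """P(X <= x): probability the next event arrives within x time units."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-lam * x)

rate = 0.5  # illustrative: 0.5 events per minute, so the mean wait is 2 minutes

p_wait_under_2 = exponential_cdf(2.0, rate)
print(exponential_pdf(0.0, rate), p_wait_under_2)
```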

4. Hypothesis Testing

Z-test

Formula: $Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$
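Explanation: Used when the population standard deviation $\sigma$ is known. For large samples ($n > 30$ is a common rule of thumb), the sample standard deviation $s$ is often substituted for $\sigma$ as an approximation.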
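
Example: A sketch of a one-sample Z-test with a two-sided p-value from the standard normal distribution. The function name `z_test` and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return the Z statistic and its two-sided p-value."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # Two-sided: probability of a value at least as far from 0 as |z|.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Illustrative inputs: sample mean 52 from n = 64, testing mu = 50 with sigma = 8.
z, p = z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=8.0, n=64)
print(z, p)
```

Here the standard error is $8/\sqrt{64} = 1$, so $Z = 2$ and the two-sided p-value is a little under 0.05.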

t-test

Formula: $t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$
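Explanation: Used when the population variance is unknown and estimated by the sample standard deviation $s$. The statistic follows a t-distribution with $n - 1$ degrees of freedom when the data are approximately normal, which matters most for small samples ($n \leq 30$).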
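
Example: A sketch that computes the one-sample t statistic and its degrees of freedom. The sample values and the function name `one_sample_t` are made up. Python's standard library has no Student's t distribution, so this sketch stops at the statistic; the p-value would come from a t table or a statistics package.

```python
import math
import statistics

def one_sample_t(sample, pop_mean):
    """Return the t statistic and degrees of freedom for a one-sample test."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample standard deviation (n - 1 denominator)
    t = (x_bar - pop_mean) / (s / math.sqrt(n))
    return t, n - 1

# Illustrative measurements, testing against a hypothesized mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
t_stat, df = one_sample_t(sample, pop_mean=5.0)
print(t_stat, df)
```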

p-value

Explanation: Probability of observing test results at least as extreme as the observed results, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true.

Confidence Intervals

Formula: $\bar{x} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$
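Explanation: Confidence interval for the population mean when $\sigma$ is known; $Z_{\alpha/2}$ is the standard normal critical value (about 1.96 for 95% confidence). Across repeated samples, a proportion $1 - \alpha$ of intervals built this way contain the true mean.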
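
Example: A sketch of the interval formula, using `NormalDist().inv_cdf` to obtain $Z_{\alpha/2}$. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, pop_sd, n, confidence=0.95):
    """Return (lower, upper) bounds for the population mean, sigma known."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    margin = z_crit * pop_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Illustrative inputs: sample mean 52, sigma 8, n = 64.
low, high = z_confidence_interval(sample_mean=52.0, pop_sd=8.0, n=64)
print(low, high)
```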

5. Regression & Correlation

Linear Regression

Formula: $y = \beta_0 + \beta_1 x$
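Variables:
- $y$ = response (dependent) variable
- $x$ = predictor (independent) variable
- $\beta_0$ = y-intercept
- $\beta_1$ = slope
Explanation: Models the relationship between two variables by fitting a straight line to the data, typically by least squares.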
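
Example: A sketch of fitting the line by ordinary least squares. The estimator formulas $\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $\beta_0 = \bar{y} - \beta_1 \bar{x}$ are the standard least-squares solution and are an assumption here, since this sheet gives only the model equation. The data are constructed to lie exactly on $y = 1 + 2x$.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # illustrative points on y = 1 + 2x

b0, b1 = fit_line(xs, ys)
print(b0, b1)
```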

Correlation Coefficient (Pearson's r)

Formula: $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
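Explanation: Measures the strength and direction of a linear relationship between two variables. Ranges from $-1$ (perfect negative) to $+1$ (perfect positive); $0$ indicates no linear relationship.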
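
Example: A direct translation of the Pearson formula, tried on three small made-up datasets: a perfect positive relationship, a perfect negative one, and a noisier one.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_sq_x = sum((x - x_bar) ** 2 for x in xs)
    sum_sq_y = sum((y - y_bar) ** 2 for y in ys)
    return numerator / math.sqrt(sum_sq_x * sum_sq_y)

xs = [1, 2, 3, 4, 5]
r_pos = pearson_r(xs, [2, 4, 6, 8, 10])    # points on a rising line
r_neg = pearson_r(xs, [10, 8, 6, 4, 2])    # points on a falling line
r_mixed = pearson_r(xs, [2, 1, 4, 3, 5])   # rising trend with scatter
print(r_pos, r_neg, r_mixed)
```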

6. Sampling

Types of Sampling

Simple Random Sampling: Every member has an equal chance of being selected.
Stratified Sampling: Population divided into strata, and random samples taken from each stratum.
Cluster Sampling: Population divided into clusters, and entire clusters are randomly selected.
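
Example: A sketch of the three sampling schemes on a made-up population of 100 records. The record layout, the 10% sampling fraction, the proportional allocation across strata, and the clusters of 10 consecutive records are all assumptions for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 records in group A, 40 in group B.
population = [{"id": i, "group": "A" if i < 60 else "B"} for i in range(100)]

# Simple random sampling: every member has an equal chance of selection.
simple = rng.sample(population, 10)

def stratified_sample(pop, key, fraction):
    """Draw the same fraction at random from each stratum."""
    strata = {}
    for item in pop:
        strata.setdefault(item[key], []).append(item)
    chosen = []
    for members in strata.values():
        k = round(len(members) * fraction)
        chosen.extend(rng.sample(members, k))
    return chosen

stratified = stratified_sample(population, "group", 0.1)

# Cluster sampling: split into clusters, then keep entire randomly chosen clusters.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [item for cluster in rng.sample(clusters, 2) for item in cluster]

print(len(simple), len(stratified), len(cluster_sample))
```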

Central Limit Theorem

Explanation: For independent samples from a population with finite variance, the sampling distribution of the mean is approximately normal when the sample size is sufficiently large, regardless of the shape of the population distribution.
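
Example: A small simulation sketch of the theorem. It draws repeated samples from a strongly skewed population (exponential with rate 1, so mean 1 and standard deviation 1) and looks at the distribution of the sample means. The sample size of 50 and the 2000 repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def sample_means(n, trials, rate=1.0):
    """Means of `trials` independent samples of size n from an exponential population."""
    return [
        statistics.fmean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(trials)
    ]

means = sample_means(n=50, trials=2000)

center = statistics.fmean(means)  # should be close to the population mean, 1
spread = statistics.stdev(means)  # should be close to sigma / sqrt(n) = 1 / sqrt(50)
print(center, spread)
```

A histogram of `means` would look roughly bell-shaped even though the individual draws are heavily right-skewed.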

This reference sheet provides a concise overview of essential statistics and probability concepts, allowing quick access to formulas, definitions, and explanations for practical use in both academic and professional settings.
