Statistics & Probability Reference

Comprehensive reference for statistical concepts, probability distributions, data analysis methods, and common formulas used in statistics and probability.

1. Descriptive Statistics

Mean (Arithmetic Average)

  • Formula: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
  • Variables:
    • $x_i$ = each value in the dataset
    • $n$ = number of values
  • Explanation: The mean provides the central value of a dataset. Use it for symmetric distributions without outliers.

Median

  • Explanation: The middle value of a dataset when ordered. Use it for skewed distributions or when outliers are present.

Mode

  • Explanation: The most frequently occurring value in a dataset. Useful for categorical data.

Variance

  • Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
  • Variables:
    • $s^2$ = variance
    • $x_i$ = each value in the dataset
    • $\bar{x}$ = mean of the dataset
    • $n$ = number of values
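  • Explanation: Measures the dispersion of the dataset around its mean, in squared units. Larger values indicate more spread. The $n - 1$ divisor makes this the sample variance; dividing by $n$ instead gives the population variance.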

Standard Deviation

  • Formula: $s = \sqrt{s^2}$
  • Explanation: The square root of variance, representing data spread in the same units as the data.
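
The measures in this section can be checked with Python's standard `statistics` module. The sketch below uses a small made-up dataset (an assumption for illustration only); `variance` and `stdev` use the $n - 1$ divisor shown above.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # sum of values divided by n
median = statistics.median(data)      # middle of the sorted values (average of the two middle ones here)
mode = statistics.mode(data)          # most frequent value
variance = statistics.variance(data)  # sample variance, divides by n - 1
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean, median, mode, variance, std_dev)
```

For population (divide by $n$) versions, the same module offers `pvariance` and `pstdev`.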

2. Probability Basics

Probability Rules

  • Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
  • Multiplication Rule: $P(A \cap B) = P(A) \times P(B|A)$

Conditional Probability

  • Formula: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
  • Explanation: Probability of $A$ occurring given that $B$ has occurred. Defined only when $P(B) > 0$.
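
As a quick check of the addition rule, the multiplication rule, and conditional probability, the sketch below enumerates one roll of a fair six-sided die. The events `A` (even roll) and `B` (roll greater than 3) are hypothetical choices, not part of the reference itself.

```python
from fractions import Fraction

# Sample space for one roll of a fair die; all outcomes equally likely.
omega = set(range(1, 7))
A = {x for x in omega if x % 2 == 0}  # hypothetical event: roll is even
B = {x for x in omega if x > 3}       # hypothetical event: roll is greater than 3

def prob(event):
    """Probability of an event as (favourable outcomes) / (all outcomes)."""
    return Fraction(len(event), len(omega))

p_union = prob(A | B)             # P(A or B)
p_inter = prob(A & B)             # P(A and B)
p_a_given_b = p_inter / prob(B)   # P(A|B)
p_b_given_a = p_inter / prob(A)   # P(B|A)

print(p_union == prob(A) + prob(B) - p_inter)  # addition rule holds
print(p_inter == prob(A) * p_b_given_a)        # multiplication rule holds
print(p_a_given_b)
```

Exact fractions are used instead of floats so the equalities can be compared without rounding error.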

Bayes' Theorem

  • Formula: $P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$
  • Explanation: Used to update the probability of $A$ given new evidence $B$.
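
A minimal sketch of Bayes' theorem for a screening-test scenario. All numbers are invented for illustration, and the function name `bayes_posterior` is hypothetical. The denominator $P(B)$ is expanded with the law of total probability, $P(B) = P(B|A)P(A) + P(B|\neg A)P(\neg A)$.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) given P(A), P(B|A), and P(B|not A)."""
    # P(B) via the law of total probability.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

With these assumed inputs the posterior stays well below the test's sensitivity, because the prior is small.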

3. Common Distributions

Normal Distribution

  • Formula: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
  • Variables:
    • $\mu$ = mean
    • $\sigma$ = standard deviation
  • Explanation: Bell-shaped curve, symmetric about the mean. Applicable when data cluster symmetrically around the mean with thin tails; symmetry alone does not guarantee normality.
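
The density above can be written out directly and cross-checked against `statistics.NormalDist` from the standard library. The function name `normal_pdf` and the evaluation point are assumptions made for this sketch.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Cross-check the hand-written formula at a hypothetical point.
manual = normal_pdf(1.5, mu=1.0, sigma=2.0)
library = NormalDist(mu=1.0, sigma=2.0).pdf(1.5)
print(manual, library)
```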

Binomial Distribution

  • Formula: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
  • Variables:
    • $n$ = number of trials
    • $k$ = number of successes
    • $p$ = probability of success
  • Explanation: Models the number of successes in a fixed number of independent trials, each with two possible outcomes (success/failure) and the same probability of success.
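
A short sketch of the binomial formula using `math.comb` for the binomial coefficient. The coin-flip numbers (10 fair flips, 5 heads) are a hypothetical example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical example: probability of exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, n=10, p=0.5)

# The probabilities over k = 0..n should sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(p_five_heads, total)
```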

Poisson Distribution

  • Formula: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
  • Variables:
    • $\lambda$ = average number of occurrences
  • Explanation: Models the number of events in a fixed interval of time or space.
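
The Poisson formula translates almost term by term into code. In this sketch the rate of 3 events per interval is an assumed value, and `poisson_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) when events occur at an average of lam per interval."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Hypothetical example: exactly 2 events when the average is 3 per interval.
p_two = poisson_pmf(2, lam=3.0)

# Summing over a generous range of k should give a value very close to 1.
total = sum(poisson_pmf(k, 3.0) for k in range(50))
print(p_two, total)
```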

Exponential Distribution

  • Formula: $f(x|\lambda) = \lambda e^{-\lambda x}$, for $x \geq 0$
  • Variables:
    • $\lambda$ = rate parameter
  • Explanation: Models the waiting time between consecutive events in a Poisson process with rate $\lambda$.
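
The sketch below evaluates the exponential density and simulates waiting times with `random.expovariate`. The rate of 2 events per unit time and the seed are assumptions for illustration. The mean of an exponential distribution is $1/\lambda$, so the simulated average should land near 0.5.

```python
import math
import random

def exponential_pdf(x, lam):
    """Density lam * exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

lam = 2.0                # hypothetical rate: two events per unit time on average
rng = random.Random(42)  # fixed seed so the run is repeatable

# Simulated waiting times between events.
waits = [rng.expovariate(lam) for _ in range(100_000)]
sample_mean = sum(waits) / len(waits)  # should be close to 1 / lam

print(exponential_pdf(0.5, lam), sample_mean)
```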

4. Hypothesis Testing

Z-test

  • Formula: $Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$
  • Explanation: Used when the population standard deviation $\sigma$ is known and the sample size is large (rule of thumb: $n > 30$), so that the sample mean is approximately normally distributed.
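
A minimal sketch of a one-sample Z-test. The reference gives only the statistic; the two-sided p-value computed here is an added assumption about how the test is used, and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Return the Z statistic and a two-sided p-value under the standard normal."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Two-sided: probability of a value at least as far from 0 as |z|.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical inputs: sample mean 52 from n = 100, testing mu0 = 50 with known sigma = 10.
z, p_value = z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(z, round(p_value, 4))
```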

t-test

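  • Formula: $t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$
  • Explanation: Used when the population variance is unknown and is estimated by the sample standard deviation $s$. For a normally distributed population, the statistic follows a t-distribution with $n - 1$ degrees of freedom under the null hypothesis; the difference from the Z-test matters most for small samples (rule of thumb: $n \leq 30$).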
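
The t statistic can be computed from raw data with the standard library alone. This sketch stops at the statistic and its degrees of freedom; turning it into a p-value needs the Student's t distribution, which is not in the standard library (SciPy's `scipy.stats.ttest_1samp` is one option). The sample and the null value are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom (n - 1)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    t_stat = (x_bar - mu0) / (s / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical sample and null-hypothesis mean.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t_stat, dof = one_sample_t(sample, mu0=4.0)
print(t_stat, dof)
```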

p-value

  • Explanation: Probability of observing test results at least as extreme as the observed results, assuming the null hypothesis is true.

Confidence Intervals

  • Formula: $\bar{x} \pm Z_{\alpha/2} \times \frac{\sigma}{\sqrt{n}}$
  • Explanation: Range of values expected to contain the population mean at confidence level $1 - \alpha$. This form assumes the population standard deviation $\sigma$ is known.
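
A sketch of the interval above, using `NormalDist().inv_cdf` to look up the critical value $Z_{\alpha/2}$. The function name and the input numbers are assumptions for illustration.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z_{alpha/2}
    margin = z_crit * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical inputs: sample mean 52 from n = 100 with known sigma = 10.
lower, upper = z_confidence_interval(52.0, sigma=10.0, n=100)
print(round(lower, 2), round(upper, 2))
```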

5. Regression & Correlation

Linear Regression

  • Formula: $y = \beta_0 + \beta_1 x$
  • Variables:
    • $\beta_0$ = y-intercept
    • $\beta_1$ = slope
  • Explanation: Models the relationship between two variables by fitting a linear equation.
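
The reference does not say how the line is fitted. A common choice, assumed here, is ordinary least squares, where $\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $\beta_0 = \bar{y} - \beta_1 \bar{x}$. The data points below are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # y-intercept
    return b0, b1

# Hypothetical data points.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```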

Correlation Coefficient (Pearson's r)

  • Formula: $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$
  • Explanation: Measures the strength and direction of a linear relationship between two variables.
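
Pearson's r follows directly from the formula above. The helper name `pearson_r` and the data are hypothetical; perfectly linear data should give $r = 1$ or $r = -1$ depending on direction.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data with a strong positive linear trend.
r = pearson_r([1, 2, 3, 4], [2, 4, 5, 8])
print(round(r, 3))
```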

6. Sampling

Types of Sampling

  • Simple Random Sampling: Every member has an equal chance of being selected.
  • Stratified Sampling: Population divided into strata, and random samples taken from each stratum.
  • Cluster Sampling: Population divided into clusters, and entire clusters are randomly selected.
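
The three schemes can be sketched with `random.sample`. The population of 100 unit IDs, the two strata, and the ten clusters are all invented for illustration; real strata and clusters come from the structure of the actual population.

```python
import random

rng = random.Random(0)         # fixed seed so the run is repeatable
population = list(range(100))  # hypothetical population of 100 unit IDs

# Simple random sampling: every unit has the same chance of selection.
srs = rng.sample(population, 10)

# Stratified sampling: split into strata, then sample within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Cluster sampling: split into clusters, then keep every unit of the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen = rng.sample(clusters, 2)
cluster_sample = [unit for cluster in chosen for unit in cluster]

print(sorted(srs), sorted(stratified), sorted(cluster_sample), sep="\n")
```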

Central Limit Theorem

  • Explanation: For independent samples from a population with finite variance, the sampling distribution of the mean becomes approximately normal as the sample size grows, regardless of the shape of the population distribution.
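
A small simulation can illustrate the theorem. The sketch below draws repeated samples from a strongly skewed exponential population (mean 1, standard deviation 1) and looks at the distribution of the sample means. The sample size of 50, the 5,000 repetitions, and the seed are arbitrary assumptions. The means should centre near 1 with a spread near $1/\sqrt{50}$.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 50                  # size of each sample (arbitrary choice)
repetitions = 5_000     # number of samples drawn (arbitrary choice)

# Each entry is the mean of one sample from an exponential population with rate 1.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(repetitions)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(mean_of_means, sd_of_means, 1 / math.sqrt(n))
```

Plotting a histogram of `sample_means` would show the bell shape directly; the numeric check here only compares the centre and spread with their theoretical values.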

This reference sheet provides a concise overview of essential statistics and probability concepts, allowing quick access to formulas, definitions, and explanations for practical use in both academic and professional settings.
