Introduction
A joint distribution describes the probability behavior of two or more random variables simultaneously. It’s foundational in multivariate probability, helping us understand how variables interact and depend on one another.
Definition
For two random variables X and Y, their joint probability distribution gives the probability that X = x and Y = y simultaneously.
Discrete Case
P(X = x, Y = y) = p(x, y)
The joint probability mass function (pmf) must satisfy:
p(x, y) ≥ 0 for all x, y
∑_x ∑_y p(x, y) = 1
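As a quick sketch, both pmf conditions can be checked numerically. The 2×2 table below is an assumed example (it matches the factory example later in this article), stored as a NumPy array:

```python
import numpy as np

# Assumed joint pmf over two binary variables,
# rows indexed by values of X, columns by values of Y
p = np.array([[0.10, 0.30],
              [0.20, 0.40]])

# Condition 1: every probability is nonnegative
assert (p >= 0).all()
# Condition 2: all probabilities sum to 1
assert np.isclose(p.sum(), 1.0)
print("valid joint pmf")
```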
Continuous Case
f(x, y) = joint probability density function
It must satisfy:
f(x, y) ≥ 0
∬ f(x, y) dx dy = 1
Marginal Distributions
The marginal distributions give the individual probabilities for X or Y by summing or integrating over the other variable.
Discrete:
P(X = x) = ∑_y p(x, y)
P(Y = y) = ∑_x p(x, y)
Continuous:
f_X(x) = ∫ f(x, y) dy
f_Y(y) = ∫ f(x, y) dx
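In the discrete case, "summing over the other variable" is just a sum along one axis of the joint table. A minimal sketch, using an assumed 2×2 pmf (the same numbers as the factory example later in this article):

```python
import numpy as np

# Assumed joint pmf: rows are values of X, columns are values of Y
p = np.array([[0.10, 0.30],
              [0.20, 0.40]])

# Marginal of X: sum out Y (sum across each row)
p_x = p.sum(axis=1)   # [0.4, 0.6]
# Marginal of Y: sum out X (sum down each column)
p_y = p.sum(axis=0)   # [0.3, 0.7]
print(p_x, p_y)
```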
Conditional Distributions
Conditional distributions give the probability distribution of one variable given the observed value of the other.
Discrete:
P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)
Continuous:
f(x | y) = f(x, y) / f_Y(y)
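For a discrete joint table, conditioning on Y = y means dividing the column for y by the marginal P(Y = y). A sketch with the same assumed 2×2 pmf:

```python
import numpy as np

# Assumed joint pmf: rows are values of X, columns are values of Y
p = np.array([[0.10, 0.30],
              [0.20, 0.40]])
p_y = p.sum(axis=0)        # marginal of Y

# Conditional pmf of X given Y: divide each column by P(Y = y);
# NumPy broadcasting applies the division column-wise
p_x_given_y = p / p_y
print(p_x_given_y[:, 1])   # P(X = x | Y = 1) -> [0.3/0.7, 0.4/0.7]
```

Each column of `p_x_given_y` sums to 1, as a conditional pmf must.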
Independence
Two variables X and Y are independent if:
Discrete:
P(X = x, Y = y) = P(X = x) * P(Y = y)
Continuous:
f(x, y) = f_X(x) * f_Y(y)
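In table form, independence means the joint pmf equals the outer product of its marginals, which is easy to test numerically. A sketch with the same assumed 2×2 pmf:

```python
import numpy as np

# Assumed joint pmf: rows are values of X, columns are values of Y
p = np.array([[0.10, 0.30],
              [0.20, 0.40]])
p_x = p.sum(axis=1)
p_y = p.sum(axis=0)

# Independent iff joint == outer product of the marginals
independent = bool(np.allclose(p, np.outer(p_x, p_y)))
print(independent)  # False for this table
```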
Practical Example: Discrete Joint Distribution Table
Suppose a factory produces two types of items, A and B, and records whether each item passes (1) or fails (0) quality inspection. Let:
– X: type of item (A = 0, B = 1)
– Y: result of inspection (Fail = 0, Pass = 1)
The joint distribution is given by the table below:
| X\Y | Y = 0 (Fail) | Y = 1 (Pass) | Marginal P(X) |
|---|---|---|---|
| X = 0 (A) | 0.10 | 0.30 | 0.40 |
| X = 1 (B) | 0.20 | 0.40 | 0.60 |
| Marginal P(Y) | 0.30 | 0.70 | 1.00 |
– P(X = 0, Y = 1) = 0.30
– P(X = 1 | Y = 1) = 0.40 / 0.70 ≈ 0.571
– Check for independence: P(X=0) * P(Y=1) = 0.40 × 0.70 = 0.28 ≠ 0.30 → Not independent
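The three calculations above can be reproduced directly from the table entries; a minimal sketch:

```python
# Joint pmf from the table, keyed by (x, y)
p = {(0, 0): 0.10, (0, 1): 0.30,
     (1, 0): 0.20, (1, 1): 0.40}

p_y1 = p[(0, 1)] + p[(1, 1)]        # marginal P(Y = 1) = 0.70
p_x0 = p[(0, 0)] + p[(0, 1)]        # marginal P(X = 0) = 0.40

p_x1_given_y1 = p[(1, 1)] / p_y1    # 0.40 / 0.70 ≈ 0.571
print(round(p_x1_given_y1, 3))      # 0.571

# Independence check: P(X=0) * P(Y=1) = 0.28, but P(X=0, Y=1) = 0.30
print(abs(p_x0 * p_y1 - p[(0, 1)]) < 1e-9)  # False -> not independent
```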
Example: Continuous Joint Distribution
Let f(x, y) = 2 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. Then:
- Total probability = ∬ f(x, y) dx dy = 2 × 1 × 1 = 2 → not a valid density!
- To normalize, use f(x, y) = 1 instead, so the total probability equals 1.
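This normalization check can be done numerically, e.g. with `scipy.integrate.dblquad` (a sketch; `dblquad` expects the integrand as a function of (y, x)):

```python
from scipy.integrate import dblquad

# Integrate the invalid density f(x, y) = 2 over the unit square
bad_total, _ = dblquad(lambda y, x: 2.0, 0, 1, 0, 1)
print(round(bad_total, 6))  # 2.0 -> not a valid density

# Integrate the normalized density f(x, y) = 1 over the unit square
total, _ = dblquad(lambda y, x: 1.0, 0, 1, 0, 1)
print(round(total, 6))      # 1.0 -> valid
```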
Python Example (Continuous)
```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

# Define mean vector and covariance matrix
mu = [0, 0]
cov = [[1, 0.5], [0.5, 1]]

# Evaluate the joint PDF on a grid
x, y = np.mgrid[-3:3:.1, -3:3:.1]
pos = np.dstack((x, y))
z = multivariate_normal(mu, cov).pdf(pos)

# Plot the density as filled contours
plt.contourf(x, y, z)
plt.title('Joint Normal Distribution')
plt.colorbar()
plt.show()
```
Applications
- Modeling correlations between variables
- Bayesian statistics and joint likelihoods
- Multivariate regression and classification
- Econometrics, finance, and machine learning
Conclusion
A joint distribution is crucial for analyzing relationships between multiple random variables. By understanding joint, marginal, and conditional distributions, we can uncover dependencies and structure in data, forming the backbone of multivariate statistics and data science.