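This section covers two nonparametric tests for location: the sign test and the Wilcoxon signed-rank test.
The sign test checks whether the median of a population equals a given value $m_0$, or whether the median of paired differences equals zero.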
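Let $S$ be the number of positive differences $X_i - m_0$ (or $X_i - Y_i$ in the paired case).
Zero differences are discarded, and $n$ denotes the number of nonzero differences that remain.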
Under H₀: $$ S \sim \text{Binomial}(n, 1/2) $$
Exact p-values are computed from the binomial distribution.
If $m_0$ is the true median and the distribution is continuous (so that $\mathbb{P}(X_i = m_0) = 0$), then $$ \mathbb{P}(X_i > m_0) = \mathbb{P}(X_i < m_0) = \tfrac{1}{2}. $$
By independence, the indicator variables are i.i.d. Bernoulli(1/2), yielding an exact distribution-free test.
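The distribution-free claim can be checked numerically. The following is a minimal sketch, assuming a skewed Exp(1) population (true median $\ln 2$); the helper name `simulate_sign_counts`, the sample size, and the seed are illustrative choices, not part of the original material.

```python
import math
import random
from math import comb


def simulate_sign_counts(n, reps, seed=0):
    """Draw `reps` samples of size n from Exp(1) and count values above the true median."""
    rng = random.Random(seed)
    m0 = math.log(2.0)  # true median of the Exp(1) distribution
    return [sum(rng.expovariate(1.0) > m0 for _ in range(n)) for _ in range(reps)]


n, reps = 9, 20000
counts = simulate_sign_counts(n, reps)

# Empirical versus Binomial(n, 1/2) upper-tail probability P(S >= 8)
empirical_tail = sum(s >= 8 for s in counts) / reps
binomial_tail = sum(comb(n, i) for i in range(8, n + 1)) / 2**n

print(f"mean of S: {sum(counts) / reps:.3f} (theory {n / 2})")
print(f"P(S >= 8): {empirical_tail:.4f} (theory {binomial_tail:.4f})")
```

Although the population is far from symmetric, the simulated counts should track the Binomial(n, 1/2) values closely, because only the signs relative to the median enter the statistic.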
import numpy as np
from math import comb


def sign_test(x, y=None, median=0, alternative="two-sided"):
    """
    Sign test.

    Parameters
    ----------
    x : array-like
        Sample data (or first sample if paired test).
    y : array-like or None
        Second sample for paired sign test.
    median : float
        Hypothesized median (used only if y is None).
    alternative : {"two-sided", "greater", "less"}

    Returns
    -------
    dict with test statistic and p-value
    """
    x = np.asarray(x)
    if y is not None:
        y = np.asarray(y)
        d = x - y
    else:
        d = x - median

    # Remove zeros (ties)
    d = d[d != 0]
    n = len(d)
    if n == 0:
        raise ValueError("All differences are zero.")

    S = int(np.sum(d > 0))  # number of positive signs, as a plain int

    # Binomial(n, 1/2) tail probabilities
    if alternative == "two-sided":
        k = min(S, n - S)
        p_value = 2 * sum(comb(n, i) * 0.5**n for i in range(k + 1))
        p_value = min(p_value, 1.0)  # doubling can exceed 1 when S is near n/2
    elif alternative == "greater":
        p_value = sum(comb(n, i) * 0.5**n for i in range(S, n + 1))
    elif alternative == "less":
        p_value = sum(comb(n, i) * 0.5**n for i in range(0, S + 1))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

    return {
        "n": n,
        "S": S,
        "p_value": p_value
    }

# Example data
x = [2.1, -0.3, 1.4, 0.7, -1.2, 0.5, 0.9, -0.4]
result = sign_test(x, median=0, alternative="two-sided")
result
{'n': 8, 'S': 5, 'p_value': 0.7265625}
A study is conducted to determine the effects of removing a renal blockage in patients whose renal function is impaired because of advanced metastatic malignancy of nonurologic cause. The arterial blood pressure of a random sample of 10 patients, measured before and after surgery for treatment of the blockage, yielded the following data:
before = [150, 132, 130, 116, 107, 100, 101, 96, 90, 78]
after = [90, 102, 80, 82, 90, 94, 84, 93, 90, 80]
sign_test(before, after, alternative="greater")
{'n': 9, 'S': 8, 'p_value': 0.01953125}
The Wilcoxon signed-rank test is a non-parametric test for detecting a location shift in one-sample or paired-sample settings.
It is a robust alternative to the one-sample or paired t-test under non-normality.
Let $$ X_1, \dots, X_n \quad \text{i.i.d.} $$ and let $m_0$ be a hypothesized median.
Define differences: $$ d_i = X_i - m_0. $$
Given paired observations $(X_i, Y_i)$, define $$ d_i = X_i - Y_i. $$
In both cases, inference is performed on the distribution of the differences $d_i$.
Null hypothesis $$ H_0: \text{the distribution of } d_i \text{ is symmetric about } 0 $$
Alternative hypothesis $$ H_1: \text{the distribution of } d_i \text{ is not symmetric about } 0 $$ (or one-sided variants)
⚠️ Note: this is not merely a median test; symmetry is essential.
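Let $R_i$ be the rank of $|d_i|$ among $|d_1|, \dots, |d_n|$ and let $S_i = \operatorname{sign}(d_i)$. The two common statistics, the positive-rank sum $W^+ = \sum_{i:\, d_i > 0} R_i$ and the signed-rank sum $T_n = \sum_{i=1}^n R_i S_i$, are affinely equivalent: $$ W^+ = \frac{n(n+1)}{4} + \frac{1}{2} T_n, \quad T_n = 2W^+ - \frac{n(n+1)}{2}. $$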
They induce identical tests, p-values, and decisions.
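The affine relation can be verified on a small example. This is a minimal sketch, assuming nonzero differences with distinct absolute values; the function name `signed_rank_statistics` and the five data values are made up for illustration.

```python
def signed_rank_statistics(d):
    """Return (W_plus, T_n) for nonzero differences with distinct absolute values."""
    # Rank the absolute differences from smallest (rank 1) to largest (rank n)
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0] * len(d)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    t_n = sum(r if di > 0 else -r for r, di in zip(ranks, d))
    return w_plus, t_n


d = [1.5, -0.4, 2.2, -3.1, 0.9]  # illustrative values, not from the source
n = len(d)
w_plus, t_n = signed_rank_statistics(d)
print(w_plus, t_n)  # 9 3

# Both directions of the affine relation
assert t_n == 2 * w_plus - n * (n + 1) // 2
assert w_plus == n * (n + 1) / 4 + t_n / 2
```

Because one statistic is an increasing affine function of the other, rejecting for large $W^+$ is the same as rejecting for large $T_n$.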
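Under $H_0$ (with a continuous distribution, so that there are no ties or zeros), and conditionally on the ranks $R_i$, the signs $S_i$ are i.i.d. with $\mathbb{P}(S_i = +1) = \mathbb{P}(S_i = -1) = \tfrac{1}{2}$.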
Thus, $$ T_n = \sum_{i=1}^n R_i S_i $$ has an exact permutation distribution.
Equivalently, $$ W^+ = \sum_{i=1}^n R_i B_i, \quad B_i \sim \text{Bernoulli}(1/2). $$
This distribution depends only on $n$ and can be enumerated explicitly.
Let $$ S = \sum_{k=1}^n k = \frac{n(n+1)}{2}. $$
Then $$ W^+ \in \{0,1,\dots,S\}. $$
Each value corresponds to the sum of a subset of $\{1,\dots,n\}$.
Formally: $$ \mathbb{P}(W^+ = w) = \frac{\#\{\text{subsets of } \{1,\dots,n\} \text{ with sum } w\}}{2^n}. $$
For every subset $A \subseteq \{1,\dots,n\}$, its complement $A^c$ satisfies: $$ \sum_{k \in A^c} k = S - \sum_{k \in A} k. $$
Hence: $$ \mathbb{P}(W^+ = w) = \mathbb{P}(W^+ = S - w), $$ and $$ \mathbb{E}[W^+] = \frac{S}{2} = \frac{n(n+1)}{4}. $$
For $W^+$: $$ \mathbb{E}[W^+] = \frac{n(n+1)}{4}, \quad \mathrm{Var}(W^+) = \frac{n(n+1)(2n+1)}{24}. $$
For $T_n$: $$ \mathbb{E}[T_n] = 0, \quad \mathrm{Var}(T_n) = \frac{n(n+1)(2n+1)}{6}. $$
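The subset-counting description translates directly into a small dynamic program. The sketch below assumes no ties; the function name `signed_rank_null_counts` is an illustrative choice. It checks the symmetry and moment formulas for $n = 10$ and recomputes a two-sided p-value for an observed $\min(W^+, W^-) = 15$.

```python
def signed_rank_null_counts(n):
    """counts[w] = number of subsets of {1, ..., n} whose elements sum to w."""
    total = n * (n + 1) // 2
    counts = [0] * (total + 1)
    counts[0] = 1  # the empty subset
    for k in range(1, n + 1):
        # Iterate downwards so that each rank k is used at most once
        for w in range(total, k - 1, -1):
            counts[w] += counts[w - k]
    return counts


n = 10
counts = signed_rank_null_counts(n)
pmf = [c / 2**n for c in counts]

mean = sum(w * p for w, p in enumerate(pmf))
var = sum((w - mean) ** 2 * p for w, p in enumerate(pmf))
two_sided_p = min(1.0, 2 * sum(pmf[:15 + 1]))  # 2 * P(W+ <= 15)

print(counts == counts[::-1])  # symmetry about n(n+1)/4
print(mean, var)               # compare with n(n+1)/4 and n(n+1)(2n+1)/24
print(two_sided_p)             # 0.232421875
```

The table has only $n(n+1)/2 + 1$ entries, so the exact distribution is cheap to compute for the sample sizes at which exact tables are normally used.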
Under symmetry: $$ d_i \stackrel{d}{=} -d_i. $$
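Therefore the signs $S_i = \operatorname{sign}(d_i)$ are i.i.d. symmetric $\pm 1$ variables, independent of the absolute values $|d_i|$ and hence of the ranks $R_i$.
Thus the statistic reduces to a randomly signed sum of fixed ranks, yielding an exact, distribution-free null distribution.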
Conditionally on the ranks: $$ T_n = \sum_{i=1}^n R_i S_i $$ is a sum of independent, mean-zero random variables.
Let $$ \sigma_n^2 = \sum_{i=1}^n R_i^2 \sim \frac{n^3}{3}. $$
Since $$ \max_i \frac{R_i^2}{\sigma_n^2} \to 0, $$ the Lindeberg condition holds.
Hence: $$ \frac{T_n}{\sqrt{\sigma_n^2}} \xrightarrow{d} N(0,1). $$
Equivalently: $$ \frac{W^+ - \mathbb{E}[W^+]}{\sqrt{\mathrm{Var}(W^+)}} \xrightarrow{d} N(0,1). $$
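To see how quickly the normal limit becomes usable, the exact null CDF of $W^+$ can be compared with its normal approximation. This is a rough sketch assuming no ties and a continuity correction of 0.5; the helper names `null_cdf`, `normal_cdf`, and `max_cdf_error` are illustrative.

```python
import math


def null_cdf(n):
    """Exact null CDF of W+ (no ties), by counting subsets of {1, ..., n}."""
    total = n * (n + 1) // 2
    counts = [1] + [0] * total
    for k in range(1, n + 1):
        for w in range(total, k - 1, -1):
            counts[w] += counts[w - k]
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / 2**n)
    return cdf


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_cdf_error(n):
    """Largest gap between P(W+ <= w) and its continuity-corrected normal value."""
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return max(abs(p - normal_cdf((w + 0.5 - mu) / sigma))
               for w, p in enumerate(null_cdf(n)))


for n in (5, 10, 30):
    print(n, round(max_cdf_error(n), 4))
```

The printed error shrinks as $n$ grows, which is the finite-sample counterpart of the convergence statement above.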
When some $|d_i|$ are tied or some differences are exactly zero, the null distribution is distorted.
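One way to quantify the effect of ties is through the variance of $W^+$ when tied $|d_i|$ receive average ranks (midranks). The sketch below uses the ten nonzero differences from the dental sealant example later in this section and compares the no-ties variance with the standard tie-corrected value $\frac{n(n+1)(2n+1)}{24} - \sum_j \frac{t_j^3 - t_j}{48}$, where $t_j$ are the tie-group sizes. The helper name `midranks` is an illustrative choice.

```python
def midranks(values):
    """Average ranks (1-based) of `values`, sharing ranks within tied groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average
        start = end + 1
    return ranks


# Nonzero differences (uncoated - coated) from the dental sealant example
d = [2, 2, 1, -1, 1, 4, -2, 5, 3, -1]
n = len(d)
abs_d = [abs(v) for v in d]
ranks = midranks(abs_d)

var_no_ties = n * (n + 1) * (2 * n + 1) / 24
var_with_ties = sum(r * r for r in ranks) / 4  # Var(sum R_i B_i), B_i ~ Bernoulli(1/2)
tie_sizes = [abs_d.count(v) for v in set(abs_d)]
var_formula = var_no_ties - sum(t**3 - t for t in tie_sizes) / 48

print(var_no_ties, var_with_ties, var_formula)  # 96.25 94.5 94.5
```

The difference is small here, but it shows why tables and formulas derived for distinct ranks are only approximate once ties are present.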
The Wilcoxon signed-rank test is an exact, distribution-free test for symmetry-based location shifts, whose null distribution arises from random sign permutations of fixed ranks and converges asymptotically to a normal distribution.
Let $X_i$ denote the length (in centimeters) of a randomly selected pygmy sunfish,
for $i = 1, 2, \dots, 10$.
Suppose we obtain the following sample:
$$ 5.0,\; 3.9,\; 5.2,\; 5.5,\; 2.8,\; 6.1,\; 6.4,\; 2.6,\; 1.7,\; 4.3 $$
Can we conclude that the median length of pygmy sunfish differs significantly from $3.7$ centimeters?
Formulate and perform an appropriate statistical test at a suitable significance level.
import numpy as np
from math import sqrt
from scipy.stats import wilcoxon, norm


def wilcoxon_signed_rank_one_sample(
    x,
    median_0=0,
    alpha=0.05,
    alternative="two-sided"
):
    """
    One-sample Wilcoxon signed-rank test.

    Reports:
    - W+ , W-
    - W_exact = min(W+, W-) (classical table statistic)
    - p-value from scipy.stats.wilcoxon (exact only for n <= 25)
    - asymptotic z and p-value (normal approximation)

    Note: `alpha` is accepted for convenience but not used here;
    the decision is made by the caller.
    """
    x = np.asarray(x)
    d = x - median_0
    d = d[d != 0]  # discard observations equal to the hypothesized median
    n = len(d)
    if n == 0:
        raise ValueError("All observations equal the hypothesized median.")

    # ----- Signed ranks -----
    abs_d = np.abs(d)
    # Ordinal ranks 1..n; this assumes no ties among |d|
    # (with ties, use midranks, e.g. scipy.stats.rankdata)
    ranks = abs_d.argsort().argsort() + 1
    W_plus = np.sum(ranks[d > 0])
    W_minus = np.sum(ranks[d < 0])
    W_exact = min(W_plus, W_minus)

    # ----- p-value from SciPy -----
    # Exact for n <= 25; for larger n this is SciPy's normal approximation,
    # which applies no continuity correction by default.
    stat_scipy, p_exact = wilcoxon(
        d,
        alternative=alternative,
        mode="exact" if n <= 25 else "approx"
    )

    # ----- Asymptotic approximation (with continuity correction) -----
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (W_plus - mu - 0.5 * np.sign(W_plus - mu)) / sigma
    if alternative == "two-sided":
        p_asym = 2 * (1 - norm.cdf(abs(z)))
    elif alternative == "greater":
        p_asym = 1 - norm.cdf(z)
    else:
        p_asym = norm.cdf(z)

    return {
        "n": n,
        "W_plus": W_plus,
        "W_minus": W_minus,
        "W_exact": W_exact,
        "scipy_stat": stat_scipy,
        "p_value_exact": p_exact,
        "z": z,
        "p_value_asymptotic": p_asym
    }

# Example: one-sample Wilcoxon signed-rank test (table statistic)
# Input data
x = [5.0, 3.9, 5.2, 5.5, 2.8, 6.1, 6.4, 2.6, 1.7, 4.3]
median_0 = 3.7
alpha = 0.05

# Run the test
result = wilcoxon_signed_rank_one_sample(
    x,
    median_0=median_0,
    alpha=alpha,
    alternative="two-sided"
)

# Output
print("One-sample Wilcoxon signed-rank test")
print("-----------------------------------")
print(f"n = {result['n']}")
print(f"W+ = {result['W_plus']}")
print(f"W- = {result['W_minus']}")
print(f"W (min(W+, W-)) = {result['W_exact']}")

print("\nExact test (SciPy)")
print(f"Exact p-value = {result['p_value_exact']:.4f}")
if result["p_value_exact"] < alpha:
    print("Exact decision: REJECT H0")
else:
    print("Exact decision: DO NOT reject H0")

print("\nAsymptotic normal approximation")
print(f"z = {result['z']:.3f}")
print(f"p-value ≈ {result['p_value_asymptotic']:.4f}")
if result["p_value_asymptotic"] < alpha:
    print("Asymptotic decision: REJECT H0")
else:
    print("Asymptotic decision: DO NOT reject H0")

One-sample Wilcoxon signed-rank test
-----------------------------------
n = 10
W+ = 40
W- = 15
W (min(W+, W-)) = 15

Exact test (SciPy)
Exact p-value = 0.2324
Exact decision: DO NOT reject H0

Asymptotic normal approximation
z = 1.223
p-value ≈ 0.2213
Asymptotic decision: DO NOT reject H0

The median age of the onset of diabetes is thought to be $45$ years.
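The ages at onset (in years) for a random sample of $30$ people with diabetes are:
35.5, 44.5, 39.8, 33.3, 51.4, 51.3, 30.5, 48.9, 42.1, 40.3, 46.8, 38.0, 40.1, 36.8, 39.3, 65.4, 42.6, 42.8, 59.8, 52.4, 26.2, 60.9, 45.6, 27.1, 47.3, 36.6, 55.6, 45.1, 52.2, 43.5.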
Assuming that the distribution of the age at the onset of diabetes is symmetric,
is there evidence to conclude that the median age of the onset of diabetes
differs significantly from $45$ years?
Formulate and perform an appropriate statistical test at a suitable significance level.
# Input data
x = [35.5, 44.5, 39.8, 33.3, 51.4, 51.3, 30.5, 48.9, 42.1, 40.3,
     46.8, 38.0, 40.1, 36.8, 39.3, 65.4, 42.6, 42.8, 59.8, 52.4,
     26.2, 60.9, 45.6, 27.1, 47.3, 36.6, 55.6, 45.1, 52.2, 43.5]
median_0 = 45
alpha = 0.05

# Run the test
result = wilcoxon_signed_rank_one_sample(
    x,
    median_0=median_0,
    alpha=alpha,
    alternative="two-sided"
)

# Output
print("One-sample Wilcoxon signed-rank test")
print("-----------------------------------")
print(f"n = {result['n']}")
print(f"W+ = {result['W_plus']}")
print(f"W- = {result['W_minus']}")
print(f"W (min(W+, W-)) = {result['W_exact']}")

print("\nExact test (SciPy)")
print(f"Exact p-value = {result['p_value_exact']:.4f}")
if result["p_value_exact"] < alpha:
    print("Exact decision: REJECT H0")
else:
    print("Exact decision: DO NOT reject H0")

print("\nAsymptotic normal approximation")
print(f"z = {result['z']:.3f}")
print(f"p-value ≈ {result['p_value_asymptotic']:.4f}")
if result["p_value_asymptotic"] < alpha:
    print("Asymptotic decision: REJECT H0")
else:
    print("Asymptotic decision: DO NOT reject H0")

One-sample Wilcoxon signed-rank test
-----------------------------------
n = 30
W+ = 200
W- = 265
W (min(W+, W-)) = 200

Exact test (SciPy)
Exact p-value = 0.5038
Exact decision: DO NOT reject H0

Asymptotic normal approximation
z = -0.658
p-value ≈ 0.5104
Asymptotic decision: DO NOT reject H0

Dental researchers have developed a new material for preventing cavities:
a plastic sealant that is applied to the chewing surfaces of teeth.
To determine whether the sealant is effective, it was applied to half of the teeth of each of 12 school-aged children. The remaining teeth for each child were left untreated. After two years, the number of cavities in the sealant-coated teeth and the uncoated teeth was recorded, resulting in the following data:
| Child | Coated | Uncoated | Diff |
|---|---|---|---|
| 1 | 3 | 3 | 0 |
| 2 | 1 | 3 | 2 |
| 3 | 0 | 2 | 2 |
| 4 | 4 | 5 | 1 |
| 5 | 1 | 0 | -1 |
| 6 | 0 | 1 | 1 |
| 7 | 1 | 5 | 4 |
| 8 | 2 | 0 | -2 |
| 9 | 1 | 6 | 5 |
| 10 | 0 | 0 | 0 |
| 11 | 0 | 3 | 3 |
| 12 | 4 | 3 | -1 |
Here, the difference is defined as
$$ \text{Diff} = (\text{Uncoated}) - (\text{Coated}). $$
Is there sufficient evidence to conclude that sealant-coated teeth are less prone to cavities than untreated teeth?
Formulate and perform an appropriate statistical test at a suitable significance level.
import numpy as np
from math import sqrt
from scipy.stats import wilcoxon, norm, rankdata


def wilcoxon_signed_rank_paired(
    x,
    y,
    alpha=0.05,
    alternative="two-sided"
):
    """
    Paired Wilcoxon signed-rank test.

    Reports:
    - W+ , W-
    - W_exact = min(W+, W-) (classical table statistic)
    - p-value from scipy.stats.wilcoxon (exact mode only for n <= 25)
    - asymptotic z and p-value (normal approximation)

    Note: `alpha` is accepted for convenience but not used here;
    the decision is made by the caller.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) != len(y):
        raise ValueError("x and y must have the same length.")

    # Paired differences
    d = x - y
    d = d[d != 0]  # remove zero differences
    n = len(d)
    if n == 0:
        raise ValueError("All paired differences are zero.")

    # ----- Signed ranks -----
    abs_d = np.abs(d)
    ranks = rankdata(abs_d, method="average")  # midranks for tied |d|
    W_plus = np.sum(ranks[d > 0])
    W_minus = np.sum(ranks[d < 0])
    W_exact = min(W_plus, W_minus)

    # ----- p-value from SciPy -----
    # The tabulated exact distribution presumes distinct |d|; with ties
    # the "exact" p-value should be read as an approximation.
    stat_scipy, p_exact = wilcoxon(
        d,
        alternative=alternative,
        mode="exact" if n <= 25 else "approx"
    )

    # ----- Asymptotic normal approximation (with continuity correction) -----
    mu = n * (n + 1) / 4
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction applied
    z = (W_plus - mu - 0.5 * np.sign(W_plus - mu)) / sigma
    if alternative == "two-sided":
        p_asym = 2 * (1 - norm.cdf(abs(z)))
    elif alternative == "greater":
        p_asym = 1 - norm.cdf(z)
    elif alternative == "less":
        p_asym = norm.cdf(z)
    else:
        raise ValueError("Invalid alternative.")

    return {
        "n": n,
        "W_plus": W_plus,
        "W_minus": W_minus,
        "W_exact": W_exact,
        # W+ for one-sided alternatives, min(W+, W-) for "two-sided"
        "scipy_stat": stat_scipy,
        "p_value_exact": p_exact,
        "z": z,
        "p_value_asymptotic": p_asym
    }

# Dental sealant study: number of cavities per child
coated = [3, 1, 0, 4, 1, 0, 1, 2, 1, 0, 0, 4]
uncoated = [3, 3, 2, 5, 0, 1, 5, 0, 6, 0, 3, 3]
alpha = 0.05

# Paired Wilcoxon signed-rank test
result = wilcoxon_signed_rank_paired(
    uncoated,
    coated,
    alpha=alpha,
    alternative="greater"  # H1: median(uncoated − coated) > 0
)

print("Paired Wilcoxon signed-rank test (Dental sealant study)")
print("------------------------------------------------------")
print(f"n = {result['n']}")
print(f"W+ = {result['W_plus']}")
print(f"W- = {result['W_minus']}")
print(f"W = min(W+, W-) = {result['W_exact']}")

print("\nExact test (SciPy)")
print(f"Test statistic (W+) = {result['scipy_stat']}")
print(f"Exact p-value = {result['p_value_exact']:.4f}")
print("Decision:", "REJECT H0" if result["p_value_exact"] < alpha else "DO NOT reject H0")

print("\nAsymptotic normal approximation")
print(f"z = {result['z']:.3f}")
print(f"p-value ≈ {result['p_value_asymptotic']:.4f}")
print("Decision:", "REJECT H0" if result["p_value_asymptotic"] < alpha else "DO NOT reject H0")

Paired Wilcoxon signed-rank test (Dental sealant study)
------------------------------------------------------
n = 10
W+ = 44.0
W- = 11.0
W = min(W+, W-) = 11.0

Exact test (SciPy)
Test statistic (W+) = 44.0
Exact p-value = 0.0527
Decision: DO NOT reject H0

Asymptotic normal approximation
z = 1.631
p-value ≈ 0.0515
Decision: DO NOT reject H0