D'Agostino-Pearson Test: Formula, K2 Statistic, Interpretation, R, Python, SPSS and Excel Guide

The D'Agostino-Pearson test is an omnibus normality test that combines skewness and kurtosis into one K2 statistic. In this guide, the test is used to check whether G3 final grades from student-por.csv follow a normal distribution. The article covers the formula, hypotheses, K2 statistic, chi-square p-value, R workflow, Python workflow, SPSS verification, Excel method, chart interpretation and verified student performance results.

Quick Answer: DAgostino Pearson Test Result

A DAgostino Pearson Test was conducted to evaluate whether G3 final grades follow a normal distribution. The verified result was K2 = 114.2048, with p = 1.587642e-25. The skewness was -0.9108 and the excess kurtosis was 2.6821. Since p < 0.05, the analysis rejects normality for G3 final grades.

DAgostino Pearson Test Overview

The test checks normality by looking at two important features of a distribution: skewness and kurtosis. Skewness tells whether a distribution leans left or right. Kurtosis tells whether the distribution has unusually heavy tails or a sharper/flatter shape compared with a normal distribution.

The method transforms skewness and kurtosis into two approximately standard normal z-scores. These two z-scores are then squared and added to create the K2 statistic. Under the null hypothesis of normality, K2 is compared with a chi-square distribution with 2 degrees of freedom.

Outcome: G3
Sample size: 649
K2 statistic: 114.2048
Verified software: R, Python, SPSS

In this example, the very small p-value gives strong evidence that G3 is not normally distributed. The negative skewness and positive excess kurtosis show that the departure comes from both asymmetry (a longer left tail) and heavier tails than a normal distribution has.

What Is the DAgostino Pearson Test?

The D’Agostino-Pearson omnibus normality test is a formal statistical test for checking whether a sample is consistent with a normal distribution. It is called “omnibus” because it does not check only one feature. It combines skewness and kurtosis into a single overall test statistic.

In simple language: the method asks whether the distribution is too asymmetric, too heavy-tailed, too sharply peaked, or too flat to be treated as normal.

This is useful before applying statistical methods that assume normality. A histogram or Q-Q plot can show the pattern visually, but the K2 statistic gives a numerical test result.

DAgostino Pearson Test Formula

The test combines the transformed skewness and kurtosis components:

K² = Z²(skewness) + Z²(kurtosis)

The p-value is calculated using a chi-square distribution with 2 degrees of freedom:

p-value = P(χ²₂ ≥ K²)

For this dataset, the verified G3 result is:

K² = (-8.2817)² + (6.7542)²
K² = 114.2048
p = 1.587642e-25

The skewness z-score is strongly negative, and the kurtosis z-score is strongly positive. When squared and added, they produce a very large K2 statistic, far beyond the 5% chi-square critical value for 2 degrees of freedom, which is about 5.99.
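As a quick arithmetic check, the two formula steps can be sketched in Python. The z-scores below are the rounded values reported in this guide, so the sum comes out near 114.206 rather than exactly 114.2048. The closed form exp(-K2/2) is the chi-square survival function for 2 degrees of freedom.

```python
import math

# Rounded component z-scores reported in this guide.
z_skewness = -8.2817
z_kurtosis = 6.7542

# K2 is the sum of the two squared z-scores.
k2 = z_skewness**2 + z_kurtosis**2

# For 2 degrees of freedom, P(chi-square >= x) = exp(-x / 2).
p_value = math.exp(-k2 / 2)

print(f"K2 = {k2:.4f}, p = {p_value:.3e}")
```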

DAgostino Pearson Test Null Hypothesis and Alternative Hypothesis

Hypothesis	Meaning	Decision rule
H0	The sample comes from a normally distributed population.	If p-value is 0.05 or greater, do not reject normality.
H1	The sample does not come from a normally distributed population.	If p-value is less than 0.05, reject normality.

For G3 final grades, p is far below 0.05. Therefore, the analysis rejects the null hypothesis and concludes that the G3 distribution is not normal.

Dataset and Variables Used

This example uses the student-por.csv dataset. The verified workflow uses 649 rows, 34 columns and no missing cells in the selected analysis variables. The main outcome is G3, which represents final grade. The supporting variables are G1, G2, absences and studytime.

Variable	Role	Meaning
G3	Main outcome	Final grade from 0 to 20.
G1	Comparison variable	First-period grade.
G2	Comparison variable	Second-period grade.
absences	Comparison variable	Number of school absences.
studytime	Group variable	Weekly study time category.

External data source: UCI Machine Learning Repository: Student Performance dataset.

Verified DAgostino Pearson Test Results

The analysis was verified in R, Python and SPSS. R and Python produced the same K2 statistic and p-value. SPSS manually reproduced the same calculation using a clean CSV file, skewness, kurtosis, transformed z-scores, K2 statistic and chi-square p-value.

Final report sentence: A DAgostino Pearson Test was conducted to evaluate whether G3 final grades follow a normal distribution. The result was K2 = 114.2048, p = 1.587642e-25. The skewness was -0.9108 and the excess kurtosis was 2.6821. Because p < 0.05, the analysis rejected normality for G3.

Main G3 Result

Variable	N	Mean	SD	Skewness	Excess kurtosis	K2	p-value	Decision
G3	649	11.9060	3.2307	-0.9108	2.6821	114.2048	1.587642e-25	Reject normality

Component Z-Scores

Component	Z-score	Meaning
Skewness z	-8.2817	The negative sign shows a strong left-skewed pattern in G3.
Kurtosis z	6.7542	The positive value shows strong kurtosis departure from normality.
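The component z-scores can be approximately reproduced from the summary statistics alone. The sketch below assumes the workflow uses the standard D'Agostino skewness transformation and the Anscombe-Glynn kurtosis transformation (the same formulas used by scipy.stats.normaltest). The function names skewness_z and kurtosis_z are hypothetical, and the inputs are the rounded values from the tables above, so the last decimal may differ slightly.

```python
import math

def skewness_z(g1, n):
    """D'Agostino transformation of moment skewness g1 for sample size n."""
    y = g1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
    beta2 = (3 * (n**2 + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1 + math.sqrt(2 * (beta2 - 1))
    delta = 1 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2 / (w2 - 1))
    return delta * math.asinh(y / alpha)

def kurtosis_z(b2, n):
    """Anscombe-Glynn transformation of Pearson kurtosis b2 (normal = 3)."""
    mean_b2 = 3 * (n - 1) / (n + 1)
    var_b2 = 24 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    x = (b2 - mean_b2) / math.sqrt(var_b2)
    sqrt_beta1 = (6 * (n**2 - 5 * n + 2) / ((n + 7) * (n + 9))
                  * math.sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6 + 8 / sqrt_beta1 * (2 / sqrt_beta1 + math.sqrt(1 + 4 / sqrt_beta1**2))
    denom = 1 + x * math.sqrt(2 / (a - 4))
    # Signed cube root, so a negative denominator is handled safely.
    cube_root = math.copysign(abs((1 - 2 / a) / denom) ** (1 / 3), denom)
    return (1 - 2 / (9 * a) - cube_root) / math.sqrt(2 / (9 * a))

n = 649
z_s = skewness_z(-0.9108, n)       # moment skewness of G3
z_k = kurtosis_z(2.6821 + 3, n)    # excess kurtosis + 3 = Pearson kurtosis
k2 = z_s**2 + z_k**2
p = math.exp(-k2 / 2)              # chi-square survival function, df = 2

print(round(z_s, 4), round(z_k, 4), round(k2, 4), p)
```

Both functions expect the moment-based (divisor n) skewness and kurtosis, not the bias-corrected versions that some packages report by default.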

Variable Comparison Results

Variable	N	Mean	SD	Skewness	Excess kurtosis	K2	p-value	Decision
G1	649	11.3991	2.7453	-0.0028	0.0271	0.0802	0.9607	Do not reject normality
G2	649	11.5701	2.9136	-0.3594	1.6405	40.1870	1.877145e-09	Reject normality
G3	649	11.9060	3.2307	-0.9108	2.6821	114.2048	1.587642e-25	Reject normality
absences	649	3.6595	4.6408	2.0160	5.7274	287.7460	3.286672e-63	Reject normality
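The p-values and decisions in this table follow directly from the K2 column. A minimal sketch, with the K2 values copied from the table and a 0.05 significance level assumed:

```python
import math

# K2 values copied from the variable comparison table.
k2_by_variable = {"G1": 0.0802, "G2": 40.1870, "G3": 114.2048, "absences": 287.7460}
alpha = 0.05

results = {}
for name, k2 in k2_by_variable.items():
    p = math.exp(-k2 / 2)  # chi-square survival function for df = 2
    decision = "Reject normality" if p < alpha else "Do not reject normality"
    results[name] = (p, decision)
    print(f"{name:9s} K2 = {k2:9.4f}  p = {p:.6e}  {decision}")
```

Because the K2 values are rounded to four decimals, the recomputed p-values agree with the table only to about four significant digits.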

G3 by Studytime Group

Studytime group	N	Mean G3	SD	Skewness	Excess kurtosis	K2	p-value	Decision
1: <2 hours	212	10.8443	3.2186	-1.0705	3.0162	50.5457	1.057158e-11	Reject normality
2: 2 to 5 hours	305	12.0918	3.2431	-1.0230	2.9751	66.2103	4.193909e-15	Reject normality
3: 5 to 10 hours	97	13.2268	2.5021	-0.1872	-0.5374	2.0646	0.3562	Do not reject normality
4: >10 hours	35	13.0571	3.0384	0.2002	-0.4590	0.3974	0.8198	Do not reject normality

DAgostino Pearson Test Result Images and Chart Interpretation

1. Histogram with Fitted Normal Curve

Figure: G3 distribution with fitted normal curve.

This chart shows why the normality result is not surprising. The distribution is not a smooth bell shape. G3 grades are bounded between 0 and 20, measured as integer scores, and contain a visible group of low scores near zero. The fitted normal curve cannot fully match this shape.
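One way to see the mismatch numerically is to ask how many students a fitted normal curve would place at the two ends of the 0 to 20 scale. The sketch below uses only the summary values reported in this guide (N, mean and SD); treating an integer grade g as the interval from g - 0.5 to g + 0.5 is an assumption made for illustration.

```python
from statistics import NormalDist

# Summary values for G3 reported in this guide.
n, mean_g3, sd_g3 = 649, 11.9060, 3.2307
fitted = NormalDist(mean_g3, sd_g3)

# Expected number of students at grade 0 or below under the fitted normal.
expected_zero_or_below = n * fitted.cdf(0.5)

# Expected number of students above the maximum grade of 20.
expected_above_20 = n * (1 - fitted.cdf(20.5))

print(f"expected at 0 or below: {expected_zero_or_below:.2f}")
print(f"expected above 20:      {expected_above_20:.2f}")
```

Under these assumptions the fitted curve predicts well under one student at zero, so any visible cluster of zero grades is something a normal model cannot accommodate.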

2. Normal Q-Q Plot

Figure: Normal Q-Q plot for G3 final grade.

The Q-Q plot compares observed grade quantiles with theoretical normal quantiles. If G3 were normally distributed, the points would follow the reference line more closely. Instead, the plot shows stair-step behavior from repeated integer grades, a strong lower-tail departure and a high-end flattening. This visual pattern supports the formal rejection of normality.
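The points of such a plot can be sketched with the standard library. The function name normal_qq_points, the plotting positions (i - 0.5) / n and the sample grades are all illustrative assumptions, not the exact settings behind the chart.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    observed = sorted(values)
    n = len(observed)
    fitted = NormalDist(mean(observed), stdev(observed))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))

# Hypothetical integer grades, not the real G3 data.
sample = [0, 0, 8, 10, 10, 11, 11, 12, 12, 13, 13, 14, 15, 16]
points = normal_qq_points(sample)

for t, o in points:
    print(f"{t:7.2f} {o:3d}")
```

Using quantiles of the fitted normal rather than the standard normal puts both axes on the grade scale, so the reference line is simply y = x. Repeated integers produce the stair-step pattern, and the zero grades fall below the line in the lower tail.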

3. Skewness and Excess Kurtosis

This chart shows the two shape features used by the test. G3 has skewness of about -0.9108, meaning the left (low-grade) tail is longer than the right tail. It also has excess kurtosis of about 2.6821, whereas a normal distribution has excess kurtosis of 0, so the tails are heavier than normality predicts.
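A small illustration of how the sign of these two statistics responds to the shape of the data. The function name shape_statistics and both samples are hypothetical; the formulas are the moment-based (divisor n) definitions used elsewhere in this guide.

```python
def shape_statistics(values):
    """Return (skewness, excess kurtosis) from central moments with divisor n."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m3 / m2**1.5, m4 / m2**2 - 3

# Hypothetical grades with two zeros pulling the left tail out.
left_tailed = [0, 0, 8, 10, 10, 11, 11, 12, 12, 13, 13, 14, 15, 16]
# Hypothetical grades spread evenly around 11.
symmetric = [8, 9, 10, 11, 12, 13, 14]

left_skew, left_excess = shape_statistics(left_tailed)
sym_skew, sym_excess = shape_statistics(symmetric)

print(f"left-tailed: skewness = {left_skew:.3f}, excess kurtosis = {left_excess:.3f}")
print(f"symmetric:   skewness = {sym_skew:.3f}, excess kurtosis = {sym_excess:.3f}")
```

The sample with zeros has negative skewness and positive excess kurtosis, the same direction as G3, while the evenly spread sample has zero skewness.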

4. Component Z-Scores

Figure: Skewness and kurtosis component z-scores for G3.

This chart explains the K2 statistic. The skewness component is strongly negative, with z ≈ -8.2817. The kurtosis component is strongly positive, with z ≈ 6.7542. The test squares both values, so both components strongly increase K2. This is why the final statistic becomes very large.
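To see how much each component contributes, the squared z-scores can be expressed as shares of K2. A minimal sketch using the rounded z-scores reported in this guide:

```python
# Rounded component z-scores reported in this guide.
z_scores = {"skewness": -8.2817, "kurtosis": 6.7542}

squared = {name: z**2 for name, z in z_scores.items()}
k2 = sum(squared.values())
shares = {name: value / k2 for name, value in squared.items()}

for name in z_scores:
    print(f"{name:8s} z^2 = {squared[name]:8.4f}  share of K2 = {shares[name]:.1%}")
```

With these values, skewness accounts for roughly 60% of K2 and kurtosis for roughly 40%, so neither component alone explains the rejection.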

5. Chi-Square Null Distribution for K2

Figure: Chi-square null distribution for K2 with observed statistic and 95% critical value.

This chart shows the decision visually. The dashed line marks the 95% chi-square critical value, while the observed K2 statistic is far to the right. Since K2 = 114.2048 is much larger than the critical value, the p-value becomes extremely small and the normality assumption is rejected.
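For 2 degrees of freedom the critical value has a closed form, because the chi-square survival function is exp(-x/2). A minimal sketch, using the K2 value reported in this guide:

```python
import math

alpha = 0.05
# Solving exp(-x / 2) = alpha for x gives the critical value.
critical_value = -2 * math.log(alpha)

observed_k2 = 114.2048  # reported in this guide

print(f"critical value = {critical_value:.4f}")
print(f"observed K2 is {observed_k2 / critical_value:.1f} times the critical value")
```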

6. K2 Comparison Across Variables

Figure: K2 comparison across G1, G2, G3 and absences.

This chart compares normality departure across four variables. G1 has a very small K2 value, so normality is not rejected. Normality is rejected for G2 and G3, while absences has the largest K2 value because it is strongly right-skewed count data. The chart shows that normality can differ across variables inside the same dataset.

7. G3 K2 by Studytime Group

Figure: G3 K2 statistic by studytime group.

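This chart shows that G3 normality differs across studytime categories. The first two groups, <2 hours and 2 to 5 hours, have large K2 values and normality is rejected. The 5 to 10 hours and >10 hours groups have smaller K2 values and normality is not rejected at the 0.05 level. This suggests the overall G3 rejection is mainly driven by the two larger, lower-studytime groups. The two higher-studytime groups are also much smaller (N = 97 and N = 35), so the test has less power there and a non-significant result does not prove that those grades are normal.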

DAgostino Pearson Test in R

In R, the test can be calculated by computing skewness, kurtosis, component transformations, K2 and the chi-square p-value.

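student <- read.csv("student-por.csv", sep = ";", stringsAsFactors = FALSE)

g3 <- as.numeric(student$G3)
g3 <- g3[!is.na(g3)]

n <- length(g3)
mu <- mean(g3)
s <- sd(g3)

# Central moments with divisor n (not n - 1)
m2 <- mean((g3 - mu)^2)
m3 <- mean((g3 - mu)^3)
m4 <- mean((g3 - mu)^4)

skewness <- m3 / (m2^(3/2))
pearson_kurtosis <- m4 / (m2^2)
excess_kurtosis <- pearson_kurtosis - 3

# z-score transformations: the standard D'Agostino (skewness) and
# Anscombe-Glynn (kurtosis) formulas, assumed to match the full workflow.
y <- skewness * sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
beta2 <- 3 * (n^2 + 27 * n - 70) * (n + 1) * (n + 3) /
  ((n - 2) * (n + 5) * (n + 7) * (n + 9))
w2 <- -1 + sqrt(2 * (beta2 - 1))
z_skewness <- asinh(y / sqrt(2 / (w2 - 1))) / sqrt(0.5 * log(w2))

x <- (pearson_kurtosis - 3 * (n - 1) / (n + 1)) /
  sqrt(24 * n * (n - 2) * (n - 3) / ((n + 1)^2 * (n + 3) * (n + 5)))
sb1 <- 6 * (n^2 - 5 * n + 2) / ((n + 7) * (n + 9)) *
  sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3)))
a <- 6 + 8 / sb1 * (2 / sb1 + sqrt(1 + 4 / sb1^2))
d <- 1 + x * sqrt(2 / (a - 4))
z_kurtosis <- (1 - 2 / (9 * a) - sign(d) * ((1 - 2 / a) / abs(d))^(1/3)) /
  sqrt(2 / (9 * a))

# Combine the two z-scores and compare with chi-square, df = 2
K2 <- z_skewness^2 + z_kurtosis^2
p_value <- pchisq(K2, df = 2, lower.tail = FALSE)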

The verified R output gives K2 = 114.2048 and p = 1.587642e-25.

DAgostino Pearson Test in Python

Python can reproduce the same result by calculating skewness, kurtosis, component z-scores and K2.

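import pandas as pd
import numpy as np
import math

student = pd.read_csv("student-por.csv", sep=";")

g3 = pd.to_numeric(student["G3"], errors="coerce").dropna().to_numpy()

n = len(g3)
mean = np.mean(g3)
sd = np.std(g3, ddof=1)

# Central moments with divisor n (not n - 1)
m2 = np.mean((g3 - mean) ** 2)
m3 = np.mean((g3 - mean) ** 3)
m4 = np.mean((g3 - mean) ** 4)

skewness = m3 / (m2 ** 1.5)
pearson_kurtosis = m4 / (m2 ** 2)
excess_kurtosis = pearson_kurtosis - 3

# z-score transformations: the standard D'Agostino (skewness) and
# Anscombe-Glynn (kurtosis) formulas, assumed to match the full workflow.
y = skewness * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
beta2 = (3 * (n**2 + 27 * n - 70) * (n + 1) * (n + 3)
         / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
w2 = -1 + math.sqrt(2 * (beta2 - 1))
z_skewness = math.asinh(y / math.sqrt(2 / (w2 - 1))) / math.sqrt(0.5 * math.log(w2))

x = (pearson_kurtosis - 3 * (n - 1) / (n + 1)) / math.sqrt(
    24 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5)))
sb1 = (6 * (n**2 - 5 * n + 2) / ((n + 7) * (n + 9))
       * math.sqrt(6 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
a = 6 + 8 / sb1 * (2 / sb1 + math.sqrt(1 + 4 / sb1**2))
d = 1 + x * math.sqrt(2 / (a - 4))
z_kurtosis = ((1 - 2 / (9 * a) - math.copysign(1, d) * ((1 - 2 / a) / abs(d)) ** (1 / 3))
              / math.sqrt(2 / (9 * a)))

# Combine the two z-scores and compare with chi-square, df = 2
K2 = z_skewness ** 2 + z_kurtosis ** 2
p_value = math.exp(-K2 / 2)  # chi-square df = 2 survival function

print(K2, p_value)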

The verified Python result matches R and SPSS: K2 = 114.204755, p = 1.587642e-25.

DAgostino Pearson Test in SPSS

SPSS can manually verify the calculation by importing a clean CSV file, computing central moments, transforming skewness and kurtosis into z-scores, and calculating the K2 statistic.

SPSS Manual G3 Result

N	Mean G3	SD G3	Skewness	Excess kurtosis	Skewness z	Kurtosis z	K2	p-value	Decision
649	11.906009	3.230656	-0.910798	2.682123	-8.281651	6.754184	114.204755	1.58764160E-25	Reject normality

SPSS Syntax Used

GET DATA
  /TYPE=TXT
  /FILE='D:\dagostino_pearson_test\student_dap_spss_clean.csv'
  /ENCODING='UTF8'
  /DELIMITERS=","
  /QUALIFIER='"'
  /FIRSTCASE=2
  /VARIABLES=
  studytime F1.0
  G1 F2.0
  G2 F2.0
  G3 F2.0
  absences F3.0.
CACHE.
EXECUTE.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /n_total=N(G3)
  /mean_G3=MEAN(G3)
  /sd_G3=SD(G3).

COMPUTE dev_G3 = G3 - mean_G3.
COMPUTE dev2_G3 = dev_G3 ** 2.
COMPUTE dev3_G3 = dev_G3 ** 3.
COMPUTE dev4_G3 = dev_G3 ** 4.
EXECUTE.

AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=
  /m2_G3=MEAN(dev2_G3)
  /m3_G3=MEAN(dev3_G3)
  /m4_G3=MEAN(dev4_G3).

COMPUTE skewness_G3 = m3_G3 / (m2_G3 ** 1.5).
COMPUTE pearson_kurtosis_G3 = m4_G3 / (m2_G3 ** 2).
COMPUTE excess_kurtosis_G3 = pearson_kurtosis_G3 - 3.

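* Skewness z-score transformation, standard formula assumed to match the full workflow.
COMPUTE y_G3 = skewness_G3 * SQRT((n_total + 1) * (n_total + 3) / (6 * (n_total - 2))).
COMPUTE beta2_G3 = 3 * (n_total ** 2 + 27 * n_total - 70) * (n_total + 1) * (n_total + 3) /
  ((n_total - 2) * (n_total + 5) * (n_total + 7) * (n_total + 9)).
COMPUTE w2_G3 = -1 + SQRT(2 * (beta2_G3 - 1)).
COMPUTE u_G3 = y_G3 / SQRT(2 / (w2_G3 - 1)).
COMPUTE z_skewness_G3 = LN(u_G3 + SQRT(u_G3 ** 2 + 1)) / SQRT(0.5 * LN(w2_G3)).

* Kurtosis z-score transformation, standard formula assumed to match the full workflow.
COMPUTE x_G3 = (pearson_kurtosis_G3 - 3 * (n_total - 1) / (n_total + 1)) /
  SQRT(24 * n_total * (n_total - 2) * (n_total - 3) /
  ((n_total + 1) ** 2 * (n_total + 3) * (n_total + 5))).
COMPUTE sb1_G3 = 6 * (n_total ** 2 - 5 * n_total + 2) / ((n_total + 7) * (n_total + 9)) *
  SQRT(6 * (n_total + 3) * (n_total + 5) / (n_total * (n_total - 2) * (n_total - 3))).
COMPUTE a_G3 = 6 + 8 / sb1_G3 * (2 / sb1_G3 + SQRT(1 + 4 / sb1_G3 ** 2)).
COMPUTE d_G3 = 1 + x_G3 * SQRT(2 / (a_G3 - 4)).
COMPUTE z_kurtosis_G3 = (1 - 2 / (9 * a_G3) -
  d_G3 / ABS(d_G3) * ((1 - 2 / a_G3) / ABS(d_G3)) ** (1 / 3)) /
  SQRT(2 / (9 * a_G3)).

COMPUTE K2_G3 = (z_skewness_G3 ** 2) + (z_kurtosis_G3 ** 2).
COMPUTE p_value_G3 = EXP(-K2_G3 / 2).
EXECUTE.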

DAgostino Pearson Test in Excel

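Excel can calculate skewness and kurtosis, but the full transformed K2 calculation is easier in R or Python. Still, Excel is useful for understanding the logic. Note that =SKEW() and =KURT() apply small-sample bias corrections, so they will not exactly reproduce the moment-based values used in this guide (-0.9108 and 2.6821). =SKEW.P() and a manual fourth-moment formula match the moment-based definitions.

1. Place G3 values in one column.
2. Calculate sample size, mean and standard deviation.
3. Calculate skewness.
4. Calculate kurtosis or excess kurtosis.
5. Transform skewness and kurtosis into z-scores using the formula.
6. Square both z-scores and add them to get K2.
7. Use the chi-square distribution with 2 degrees of freedom to calculate the p-value.

=SKEW.P(G3_range)
=SUMPRODUCT((G3_range-AVERAGE(G3_range))^4)/COUNT(G3_range)/VAR.P(G3_range)^2-3
=Z_skewness^2 + Z_kurtosis^2
=CHISQ.DIST.RT(K2, 2)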

How to Report the DAgostino Pearson Test

A strong report should include the variable, sample size, skewness, excess kurtosis, K2 statistic, p-value and final decision.

APA-style report: A D'Agostino-Pearson test was conducted to evaluate whether G3 final grades followed a normal distribution. The result was K2 = 114.20, p < .001 (exact p = 1.587642e-25). The distribution showed negative skewness of -0.91 and excess kurtosis of 2.68. Since p < .05, the normality assumption was rejected.

Plain-language report: G3 final grades are not normally distributed. The distribution is left-skewed and has strong kurtosis departure, so a normal curve does not describe the grade pattern well.

Common Mistakes in DAgostino Pearson Test

1. Reporting only the p-value

The p-value is important, but a complete interpretation should also mention skewness, kurtosis and K2.

2. Ignoring the direction of skewness

A significant result tells you normality is rejected, but skewness tells whether the departure is left-sided or right-sided.

3. Treating K2 as variance

K2 is not variance. It is the sum of squared skewness and kurtosis z-scores.

4. Using the test without visual charts

A histogram and Q-Q plot help explain why normality is rejected.

5. Using it for very small samples

The method is generally more useful with moderate or large samples. For very small samples, normality tests can be unstable.

Download DAgostino Pearson Test Files

The SPSS PDF contains the verified manual output, including import checks, G3 result, variable comparison and studytime-group comparison.

Sources and Method Notes

This guide uses verified R, Python and SPSS outputs from the student performance dataset (UCI Machine Learning Repository: Student Performance dataset).

FAQs About DAgostino Pearson Test

What is the DAgostino Pearson Test?

It is an omnibus normality test that combines skewness and kurtosis into one K2 statistic.

What does K2 mean?

K2 is the sum of the squared skewness z-score and squared kurtosis z-score. A large K2 value indicates stronger departure from normality.

What was the result in this example?

The result for G3 was K2 = 114.2048 with p = 1.587642e-25, so normality was rejected.

What do skewness and kurtosis show here?

G3 has negative skewness of about -0.9108 and excess kurtosis of about 2.6821, indicating strong departure from a normal shape.

Can this test be run in R?

Yes. R can calculate skewness, kurtosis, transformed z-scores, K2 and the chi-square p-value.

Can this test be run in Python?

Yes. Python can reproduce the same calculation using pandas, numpy and the chi-square p-value formula.

Can this test be run in SPSS?