Distribution Testing, Normality and ECDF Analysis
The Kolmogorov Smirnov Test is a distribution-comparison test that measures the largest distance between cumulative distribution functions. This guide explains the one-sample KS normality test, the two-sample KS test, the KS D statistic, ECDF charts, SPSS output, Python validation graphs and practical interpretation using a real student performance dataset.
Quick Answer: Kolmogorov Smirnov Test Result
The Kolmogorov Smirnov Test was applied to the G3 final grade variable from the student performance dataset. The one-sample KS test compared the empirical cumulative distribution of G3 with a fitted normal cumulative distribution. The result showed that G3 does not follow a smooth normal distribution.
Main result: The Kolmogorov Smirnov Test for G3 final grade produced D ≈ 0.124. The maximum ECDF-normal CDF gap occurred around G3 = 10, where the observed empirical cumulative probability was about 0.154 and the fitted normal CDF was about 0.278. The practical conclusion is that G3 final grades are not normally distributed.
Important reporting note: G3 is an integer grade variable with many tied values. The normal mean and standard deviation are estimated from the same sample. Therefore, the one-sample KS p-value should be explained carefully as a practical distribution-comparison result, not as a perfect continuous-distribution normality test.
Table of Contents
- What is the Kolmogorov Smirnov Test?
- Kolmogorov Smirnov Test formula and KS D statistic
- Null hypothesis and interpretation
- Dataset and variables used
- Verified SPSS, R and Python results
- Chart-by-chart explanation
- Two-sample Kolmogorov Smirnov Test
- How to run the test in SPSS, Python, R and Excel
- How to report the Kolmogorov Smirnov Test
- Common mistakes
- Download output and chart files
- Sources and method notes
- FAQs
What Is the Kolmogorov Smirnov Test?
The Kolmogorov Smirnov Test, often called the KS test, is a nonparametric test that compares cumulative distribution functions. Unlike a mean test, it does not ask only whether one group has a higher average. It asks whether the whole distribution differs.
There are two common forms of the test. The one-sample Kolmogorov Smirnov Test compares an observed empirical cumulative distribution function with a theoretical cumulative distribution, such as a normal distribution. The two-sample Kolmogorov Smirnov Test compares the empirical cumulative distribution of one group with the empirical cumulative distribution of another group.
| KS test type | What it compares | Example in this article | Question answered |
|---|---|---|---|
| One-sample KS test | Observed ECDF vs theoretical CDF | G3 final grade vs fitted normal distribution | Does G3 look normal? |
| Two-sample KS test | Group 1 ECDF vs group 2 ECDF | G3 in school GP vs G3 in school MS | Do two groups have different grade distributions? |
| Variable screening | Several variables vs fitted normal CDFs | G1, G2, G3, age, absences, studytime, failures | Which variables depart most strongly from normality? |
In this student performance example, the Kolmogorov Smirnov Test is useful because final grades are not smooth continuous measurements. G3 is an integer score from 0 to 20. Many students cluster around middle grades, and a small group has very low scores. A normal curve cannot fully reproduce that step-shaped grade distribution.
Practical note: The Kolmogorov Smirnov Test is well suited to explaining ECDF distance, but for normality testing it should be read with caution when the data are discrete, tied or rounded, or when the normal parameters are estimated from the same sample. For related normality checks, also see the D'Agostino Pearson Test and the Cramer von Mises Test.
Kolmogorov Smirnov Test Formula and KS D Statistic
The most important value in the Kolmogorov Smirnov Test is the KS D statistic. It is the largest vertical distance between two cumulative distribution curves.
One-Sample KS D Statistic
For a one-sample KS test, the D statistic is:
D = max |F_n(x) - F_0(x)|
Where:
- F_n(x) is the empirical cumulative distribution function from the observed sample.
- F_0(x) is the theoretical cumulative distribution function, such as the normal CDF.
- D is the largest vertical distance between the two curves.
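The definition above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workflow used for the article's results. The function name `ks_one_sample` and the synthetic integer "grades" are assumptions made for the example. Because the ECDF is a step function, the gap is checked on both sides of each jump, which gives D+ and D−, and tied values are grouped so that each distinct value produces one jump.

```python
import random
from statistics import NormalDist, mean, stdev

def ks_one_sample(values, cdf):
    """Return (D, D_plus, D_minus, x_at_max) for a sample against a reference CDF."""
    xs = sorted(values)
    n = len(xs)
    d_plus = d_minus = 0.0
    x_at_max = xs[0]
    i = 0
    while i < n:
        # Group tied values so the ECDF jumps once per distinct value.
        j = i
        while j < n and xs[j] == xs[i]:
            j += 1
        f0 = cdf(xs[i])
        above = j / n - f0   # ECDF just after the jump minus reference CDF
        below = f0 - i / n   # reference CDF minus ECDF just before the jump
        if max(above, below) > max(d_plus, d_minus):
            x_at_max = xs[i]
        d_plus = max(d_plus, above)
        d_minus = max(d_minus, below)
        i = j
    return max(d_plus, d_minus), d_plus, d_minus, x_at_max

# Synthetic integer scores on a 0-20 scale (illustrative only, not the G3 data).
random.seed(0)
sample = [min(20, max(0, round(random.gauss(12, 3)))) for _ in range(200)]

# Reference normal fitted to the same sample, as in the article's one-sample test.
reference = NormalDist(mean(sample), stdev(sample))
d, d_plus, d_minus, x_max = ks_one_sample(sample, reference.cdf)
print("D:", d, "D+:", d_plus, "D-:", d_minus, "largest gap at x =", x_max)
```

Returning the location of the largest gap alongside D makes it easy to mark that point on an ECDF chart.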
Two-Sample KS D Statistic
For a two-sample KS test, the D statistic is:
D = max |F_1(x) - F_2(x)|
Here, F_1(x) and F_2(x) are the empirical CDFs of two groups. In this post, this logic is used to compare G3 distributions for school groups, studytime groups and sex groups.
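A small sketch of the two-sample statistic, again using only the Python standard library. The helper name `ks_two_sample_d` and the two short grade lists are made up for illustration and are not taken from the student dataset.

```python
from bisect import bisect_right

def ks_two_sample_d(a, b):
    """Largest vertical gap between the ECDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # Both ECDFs only change at observed values, so checking those is enough.
    for x in sorted(set(a_sorted) | set(b_sorted)):
        f1 = bisect_right(a_sorted, x) / len(a_sorted)  # share of a at or below x
        f2 = bisect_right(b_sorted, x) / len(b_sorted)  # share of b at or below x
        d = max(d, abs(f1 - f2))
    return d

group_1 = [8, 10, 10, 11, 12, 13, 13, 14, 15, 17]  # made-up grades
group_2 = [0, 7, 8, 9, 10, 10, 11, 11, 12, 13]     # made-up grades
d_two = ks_two_sample_d(group_1, group_2)
print("Two-sample D:", d_two)
```

With real data, the same function would take the G3 values of two groups, for example the two school groups.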
Plain-language formula meaning
If the ECDF and the reference CDF stay close together, the KS D statistic is small. If they separate strongly at any point, D becomes larger. The test therefore focuses on the maximum distribution gap, not on the average gap.
Kolmogorov Smirnov Test Null Hypothesis and Interpretation
| Test | Null hypothesis | Alternative hypothesis | Decision rule |
|---|---|---|---|
| One-sample KS test | The observed data follow the reference distribution. | The observed distribution differs from the reference distribution. | If p < .05, reject the reference distribution fit. |
| Two-sample KS test | The two groups come from the same distribution. | The two groups have different distributions. | If p < .05, conclude that the group distributions differ. |
For this article, the one-sample result for G3 rejects the fitted normal reference. The two-sample results also show distribution differences for school, studytime group and sex, with school showing the largest distribution gap.
Dataset and Variables Used
The analysis uses the Portuguese student performance dataset. The main outcome variable is G3, the final grade. The cleaned dataset contains 649 valid student records. The workflow was run in R, Python and SPSS, with Python used for chart generation and SPSS used for formal output verification.
| Variable | Meaning | Role in this Kolmogorov Smirnov Test guide |
|---|---|---|
| G3 | Final grade | Main one-sample KS normality variable. |
| G1 | First period grade | Comparison variable in KS D statistic chart. |
| G2 | Second period grade | Comparison variable in KS D statistic chart. |
| failures | Past class failures | Strongest non-normal variable because most students have zero failures. |
| studytime | Studytime category | Ordinal predictor and two-sample grouping variable. |
| absences | Number of absences | Count variable with strong non-normality. |
| school | School GP or MS | Two-sample KS group comparison. |
| sex | Female or male group | Additional two-sample KS group comparison. |
External dataset source: UCI Machine Learning Repository: Student Performance dataset.
Verified SPSS, R and Python Results
G3 Descriptive Statistics
| Statistic | G3 value | Meaning |
|---|---|---|
| N | 649 | No G3 cases were lost in the cleaned file. |
| Mean | 11.91 | The average final grade is slightly below 12. |
| Median | 12.00 | The central grade is exactly 12. |
| Standard deviation | 3.23 | Grades vary by about 3.23 grade points around the mean. |
| Minimum | 0 | Some students have extremely low final grades. |
| Maximum | 19 | The highest observed final grade is 19. |
One-Sample Kolmogorov Smirnov Test Results
The one-sample KS test compared each numeric variable with a fitted normal CDF. Because the normal mean and standard deviation were estimated from the same data, the p-values should be treated as approximate practical distribution comparisons. The D statistics are still useful because they show which variables depart most strongly from a fitted normal shape.
| Variable | N | Mean | SD | KS D | Maximum gap location | Interpretation |
|---|---|---|---|---|---|---|
| failures | 649 | 0.222 | 0.593 | 0.492 | 0 | Strongest non-normal variable; most students have zero failures. |
| studytime | 649 | 1.93 | 0.830 | 0.263 | 2 | Ordinal categories create strong departure from smooth normality. |
| absences | 649 | 3.66 | 4.64 | 0.215 | 0 | Count variable with many low values and skewness. |
| Fedu | 649 | 2.31 | 1.10 | 0.211 | 2 | Ordinal education categories are not normally distributed. |
| Medu | 649 | 2.51 | 1.13 | 0.191 | 2 | Mother education is categorical/ordinal, so normality is not expected. |
| age | 649 | 16.7 | 1.22 | 0.175 | 16 | Age is concentrated around teenage years, not normal. |
| G3 | 649 | 11.9 | 3.23 | 0.124 | 10 | Final grades depart from fitted normality because of integer steps and low-score tail. |
| G2 | 649 | 11.6 | 2.91 | 0.088 | 11 | Second period grades depart from normality less than G3. |
| G1 | 649 | 11.4 | 2.75 | 0.086 | 11 | First period grades have the smallest D among the listed variables. |
G3-Focused KS Result
G3 final grade result: D = 0.124, D+ = 0.0742, D− = 0.124, maximum gap location = G3 = 10. At that location, the empirical cumulative probability is about 0.154, while the fitted normal CDF is about 0.278. This means the fitted normal curve accumulates probability faster than the observed grade distribution around that point.
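The arithmetic behind this gap can be checked from the rounded summary values reported in this article (mean 11.91, SD 3.23, empirical cumulative probability 0.154). The sketch below uses those rounded inputs, so the last digit can differ slightly from the reported 0.278 and 0.124. The variable names are chosen for the example.

```python
from statistics import NormalDist

# Rounded values taken from the tables in this article.
mean_g3, sd_g3 = 11.91, 3.23
ecdf_near_10 = 0.154

# Fitted normal CDF evaluated at G3 = 10.
normal_cdf_at_10 = NormalDist(mean_g3, sd_g3).cdf(10)

# D- is the amount by which the reference CDF exceeds the ECDF.
gap = normal_cdf_at_10 - ecdf_near_10
print("normal CDF at 10:", round(normal_cdf_at_10, 3))
print("gap:", round(gap, 3))
```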
Two-Sample KS Results
The two-sample Kolmogorov Smirnov Test compared G3 distributions across important groups. Unlike a t-test, this comparison does not only test whether averages differ. It checks whether the cumulative grade distributions differ.
| Comparison | Group sizes | Means | Medians | KS D | p-value | Interpretation |
|---|---|---|---|---|---|---|
| G3 by school: GP vs MS | 423 vs 226 | 12.6 vs 10.7 | 13 vs 11 | 0.295 | 1.57e-11 | Strong distribution difference between schools. |
| G3 by studytime: low vs higher | 517 vs 132 | 11.6 vs 13.2 | 11 vs 13 | 0.240 | 1.11e-5 | Higher studytime group has a visibly different grade distribution. |
| G3 by sex: F vs M | 383 vs 266 | 12.3 vs 11.4 | 12 vs 11 | 0.138 | 0.00515 | Statistically significant but smaller distribution difference. |
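As a rough cross-check on the table, a two-sample p-value can be approximated from D and the group sizes with the large-sample Kolmogorov series. This is a sketch of one common approximation, not necessarily the method the software used, and it takes the rounded D values from the table as inputs. The function name and the `comparisons` dictionary are assumptions made for the example.

```python
import math

def ks_two_sample_pvalue(d, n1, n2, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov limiting distribution."""
    lam = d * math.sqrt(n1 * n2 / (n1 + n2))
    if lam < 0.2:
        # The series converges slowly here, and the p-value is effectively 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))

# (D, n1, n2) copied from the two-sample results table.
comparisons = {
    "school": (0.295, 423, 226),
    "studytime": (0.240, 517, 132),
    "sex": (0.138, 383, 266),
}
p_values = {name: ks_two_sample_pvalue(*args) for name, args in comparisons.items()}
for name, p in p_values.items():
    print(name, f"{p:.3g}")
```

Because the inputs are rounded to three decimals, the approximations land close to the reported p-values without matching them exactly.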
Kolmogorov Smirnov Test Charts and Graph-by-Graph Explanation
The charts below explain the result in a way that a table alone cannot. The KS D statistic is a maximum ECDF distance, so the best visual explanation is to show the ECDF, fitted normal CDF, maximum gap, histogram and group ECDF comparisons.
1. G3 Histogram with Fitted Normal Curve

The histogram shows why the Kolmogorov Smirnov Test rejects normality for G3. The grade distribution is not a smooth bell curve. It has clear integer steps, a concentration around middle grades, and a visible low-grade tail. The fitted normal curve is useful as a reference, but the real data do not follow it closely enough.
This chart also explains why the result should not be overreported as a pure continuous-data normality test. G3 is a bounded grade score from 0 to 20, and the observed maximum is 19. Because grades are repeated integer values, ties are expected. The KS test is detecting both shape departure and the discreteness of the measurement scale.
2. G3 ECDF Versus Fitted Normal CDF

This is the most important conceptual chart in the post. The step-shaped line is the empirical cumulative distribution of actual G3 scores. The smooth curve is the fitted normal cumulative distribution. Where the two lines separate most, the KS D statistic is largest.
For G3, the maximum gap appears around grade 10. The empirical cumulative probability around that point is about 0.154, while the fitted normal CDF is about 0.278. In plain language, the normal model expects more cumulative probability below that score than the observed grade distribution actually has. That vertical distance creates the KS D value.
3. G3 Maximum Distance Chart

The maximum-distance chart marks the exact point where the KS statistic is created. This is helpful because many readers misunderstand the KS test as an average distance test. It is not. It uses the single largest vertical gap between the empirical distribution and the reference distribution.
In this example, the D statistic is about 0.124. That means the biggest gap between the observed G3 ECDF and fitted normal CDF is about 12.4 percentage points. This is large enough, with 649 cases, to reject the fitted normal reference.
4. G3 ECDF-Normal Gap Contributions by Grade Value

This chart shows how much each sorted grade value contributes to the difference between the observed ECDF and the fitted normal CDF. Instead of looking only at the final D statistic, the chart shows where the distribution mismatch builds up.
The peak around grade 10 confirms that the largest mismatch is not at the extreme maximum or minimum grade. It is in the central-lower part of the distribution, where the observed cumulative distribution and fitted normal distribution disagree most strongly.
5. G3 Q-Q Plot

The Q-Q plot supports the KS result. If G3 were normally distributed, the points would follow the diagonal reference line more closely. Instead, the bounded integer grade scale and low-grade tail create visible departures, especially at the lower end.
This chart is important because KS testing and Q-Q plotting answer the same broad question from different angles. The KS test gives a formal D statistic. The Q-Q plot shows whether the observed quantiles align with normal quantiles. Together they make the normality conclusion stronger and easier to explain.
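For readers who want to see what a normal Q-Q plot actually plots, the sketch below builds the (theoretical quantile, observed value) pairs with the standard library. The plotting position (i − 0.5)/n is one common convention and is an assumption here, as are the function name and the synthetic scores. Plotting libraries may use a slightly different formula.

```python
import random
from statistics import NormalDist, mean, stdev

def normal_qq_pairs(values):
    """Pair each sorted observation with the matching fitted-normal quantile."""
    xs = sorted(values)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    # (i + 0.5) / n with zero-based i is the (i - 0.5) / n plotting position.
    return [(reference.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

# Synthetic integer scores (illustrative only, not the G3 data).
random.seed(1)
scores = [min(20, max(0, round(random.gauss(12, 3)))) for _ in range(100)]
pairs = normal_qq_pairs(scores)
print(pairs[:3])   # lowest observations against the lowest normal quantiles
print(pairs[-3:])  # highest observations against the highest normal quantiles
```

If the data were close to normal, the pairs would fall near the line y = x. Integer data produce horizontal runs of points instead.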
6. G3 Grade Band Counts

The grade-band chart is useful for non-technical readers. It shows that most students are concentrated in middle passing grade bands, while a smaller group appears in very low and high bands. The distribution is educationally meaningful, but it is not a smooth continuous normal distribution.
This chart also explains why the normal curve is imperfect. Grade data are bounded, rounded and often clustered around school grading thresholds. In such cases, normality tests often reject. The more important question is whether the planned statistical method is robust to that distribution shape.
7. Variable KS D Statistic Comparison

The variable comparison chart shows that G3 is non-normal, but it is not the most extreme variable. The strongest departure is failures, with D = 0.492. This happens because failures is heavily concentrated at zero. Studytime, absences, parent education and age also show strong non-normality because they are ordinal, count or concentrated variables.
G1, G2 and G3 have smaller D statistics than failures or studytime because grades are more spread out than those categorical variables. Still, G3 has D = 0.124, which is strong enough to reject the fitted normal reference with 649 observations.
8. Variable Approximate P-Value Strength

The p-value chart should be explained carefully. Because the variables are discrete and because the normal parameters are estimated from the same sample, the one-sample p-values are approximate. Still, the chart is useful because it shows that the same variables with large D values also produce strong evidence against the fitted normal reference.
This chart should not be used to claim that every variable needs a nonparametric method. Instead, it should be used to identify which variables are farthest from smooth normality and therefore need more careful visual inspection.
9. G3 Simulation Null Distribution

The simulation chart gives a practical understanding of the D statistic. In 2,000 simulated normal samples, the average simulated D was about 0.025, the 95th percentile was about 0.0363, and the observed G3 D was 0.124. The simulation p-value was approximately 0 because none of the simulated normal samples produced a D as large as the observed G3 D.
This chart is very useful for readers because it turns an abstract p-value into a visual comparison. The observed G3 distribution is not just slightly different from simulated normal samples. Its ECDF gap is much larger than what the simulation would usually produce under a normal pattern.
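A simulation of this kind can be sketched as follows. The details are assumptions rather than the article's exact procedure: each simulated sample has n = 649 standard normal draws, the mean and standard deviation are re-estimated from each simulated sample before D is computed, and only 500 replications are used to keep the pure-Python run time short.

```python
import random
from statistics import NormalDist, mean, stdev

def fitted_normal_d(values):
    """KS D of a sample against a normal CDF fitted to that same sample."""
    xs = sorted(values)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        f0 = reference.cdf(x)
        # Check the gap just after and just before each ECDF jump.
        d = max(d, (i + 1) / n - f0, f0 - i / n)
    return d

def simulate_null_d(n, n_sim, seed=0):
    """Sorted D values from n_sim samples that really are normal."""
    rng = random.Random(seed)
    return sorted(
        fitted_normal_d([rng.gauss(0, 1) for _ in range(n)])
        for _ in range(n_sim)
    )

observed_d = 0.124                      # reported G3 statistic
sim_d = simulate_null_d(n=649, n_sim=500)
q95 = sim_d[int(0.95 * len(sim_d)) - 1]  # empirical 95th percentile
p_sim = sum(d >= observed_d for d in sim_d) / len(sim_d)
print("mean simulated D:", mean(sim_d))
print("95th percentile:", q95)
print("simulation p-value:", p_sim)
```

Re-estimating the parameters inside every replication is what makes the simulated null distribution comparable to a test against a fitted normal rather than a pre-specified one.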
10. School Group ECDF Comparison

The school-group ECDF chart compares the cumulative distribution of G3 for school GP and school MS. The two-sample KS result is D = 0.295 with p = 1.57e-11. This is the largest two-sample distribution difference in the article.
The result means that the two schools differ not only in average final grade, but in the full grade distribution. School GP has a higher mean and median, while school MS accumulates lower grades faster. The ECDF chart shows this distributional separation more clearly than a mean table alone.
11. Studytime Group ECDF Comparison

The studytime-group comparison gives D = 0.240 with p = 1.11e-5. The lower studytime group has n = 517 and mean G3 about 11.6. The higher studytime group has n = 132 and mean G3 about 13.2. The ECDF chart shows that the higher studytime group tends to accumulate probability at higher grade values.
This is a good example of why the two-sample KS test is more informative than a simple mean comparison. It shows that the distribution shape shifts, not just the average. The difference between medians, 11 versus 13, also supports the visual interpretation.
12. Two-Sample KS D Statistic Comparison

This chart summarizes the two-sample KS comparisons. School GP vs MS has the largest D statistic at 0.295. Studytime group has D = 0.240. Sex has D = 0.138. All three are statistically significant, but the practical size of the distribution gap is clearly strongest for school and studytime.
The correct interpretation is not simply that all three p-values are significant. The D statistic shows the size of the maximum distribution gap. Therefore, the school distribution difference is larger in practical terms than the sex distribution difference in this dataset.
Two-Sample Kolmogorov Smirnov Test Interpretation
The two-sample KS test asks whether two groups have the same distribution. It is useful when the researcher wants to compare more than the mean. In this article, the two-sample test was used for three practical comparisons: school, studytime group and sex.
School GP vs School MS
The strongest group difference is the school comparison. GP has n = 423, mean G3 = 12.6 and median G3 = 13. MS has n = 226, mean G3 = 10.7 and median G3 = 11. The KS D statistic is 0.295, with p = 1.57e-11. This means the whole distribution of final grades differs strongly between the schools.
Low vs Higher Studytime
The studytime comparison also shows a strong distribution difference. The low studytime group has n = 517, mean G3 = 11.6 and median G3 = 11. The higher studytime group has n = 132, mean G3 = 13.2 and median G3 = 13. The KS D statistic is 0.240, with p = 1.11e-5.
Female vs Male Students
The sex comparison is statistically significant but weaker. Female students have n = 383, mean G3 = 12.3 and median G3 = 12. Male students have n = 266, mean G3 = 11.4 and median G3 = 11. The KS D statistic is 0.138, with p = 0.00515. This suggests a distribution difference, but the maximum gap is smaller than school or studytime.
How to Run the Kolmogorov Smirnov Test in SPSS, Python, R and Excel
Kolmogorov Smirnov Test in SPSS
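SPSS can run the one-sample Kolmogorov Smirnov Test through the nonparametric tests procedure (NPAR TESTS syntax) or through the Explore procedure (EXAMINE syntax), depending on the version and menu workflow. The Explore normality table reports the KS statistic with the Lilliefors significance correction. For grouped comparisons, SPSS can also prepare group summaries and ECDF-style outputs.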
* Example SPSS one-sample Kolmogorov Smirnov normality test.
NPAR TESTS
  /K-S(NORMAL)=G3
  /MISSING ANALYSIS.
* Example descriptive check before interpreting the KS result.
EXAMINE VARIABLES=G3
  /PLOT=HISTOGRAM NPPLOT
  /STATISTICS=DESCRIPTIVES
  /MISSING=LISTWISE.
In the corrected SPSS workflow for this article, the cleaned CSV was imported, grouping variables were created, one-sample normality checks were run and output was exported automatically as a PDF.
Kolmogorov Smirnov Test in Python
Python can run the one-sample KS test using scipy.stats.kstest and the two-sample KS test using scipy.stats.ks_2samp. The example below compares G3 with a fitted normal CDF.
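import pandas as pd
from scipy import stats

# Load the cleaned dataset (hypothetical file name; adjust to your own path).
df = pd.read_csv("student_performance_clean.csv")

# Fit the normal reference to G3 itself.
g3 = df["G3"].dropna()
mu = g3.mean()
sigma = g3.std(ddof=1)
# The p-value is approximate because mu and sigma come from the same sample.
ks_result = stats.kstest(g3, "norm", args=(mu, sigma))
print("KS D statistic:", ks_result.statistic)
print("p-value:", ks_result.pvalue)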
For the two-sample school comparison:
gp = df.loc[df["school"] == "GP", "G3"].dropna()
ms = df.loc[df["school"] == "MS", "G3"].dropna()
two_sample_result = stats.ks_2samp(gp, ms)
print("Two-sample KS D:", two_sample_result.statistic)
print("p-value:", two_sample_result.pvalue)
Kolmogorov Smirnov Test in R
In R, the one-sample KS test can be run with ks.test(). When parameters are estimated from the same sample, the result should be described carefully.
g3 <- na.omit(student_data$G3)
ks.test(g3, "pnorm", mean(g3), sd(g3))
For the two-sample school comparison:
gp <- student_data$G3[student_data$school == "GP"]
ms <- student_data$G3[student_data$school == "MS"]
ks.test(gp, ms)
Kolmogorov Smirnov Test in Excel
Excel does not provide a simple built-in KS test button, but the logic can be created manually:
- Sort the observed variable from smallest to largest.
- Calculate the empirical cumulative probability for each row.
- Calculate the fitted theoretical CDF using NORM.DIST.
- Calculate the absolute difference between ECDF and theoretical CDF.
- The largest absolute difference is the KS D statistic.
=NORM.DIST(A2, mean_value, sd_value, TRUE)
=ABS(empirical_cdf - normal_cdf)
=MAX(abs_difference_column)
Excel is useful for learning the ECDF logic, but SPSS, R and Python are better for reproducible analysis, group comparisons and chart generation.
How to Report the Kolmogorov Smirnov Test
APA-Style One-Sample Report for G3
Report: A one-sample Kolmogorov Smirnov Test was conducted to compare G3 final grades with a fitted normal distribution. The result indicated a significant departure from the normal reference distribution, D = 0.124, n = 649, p < .001. The largest ECDF-normal CDF gap occurred around G3 = 10, where the empirical cumulative probability was about 0.154 and the fitted normal CDF was about 0.278.
Plain-Language Report for G3
Plain-language interpretation: G3 final grades do not follow a smooth normal distribution. The scores are discrete integer grades, many students cluster around the middle grade range, and a small group has very low scores. The histogram, ECDF chart and Q-Q plot support the Kolmogorov Smirnov Test result.
Two-Sample Report for School Groups
Report: A two-sample Kolmogorov Smirnov Test showed that the G3 distribution differed between school GP (n = 423) and school MS (n = 226), D = 0.295, p < .001 (p = 1.57e-11). School GP had a higher mean and median final grade, and the ECDF comparison showed a clear distributional separation between the two schools.
Common Mistakes in Kolmogorov Smirnov Test Interpretation
1. Treating the KS Test as Only a Normality Test
The Kolmogorov Smirnov Test can be used for normality checking, but it is broader than that. It compares cumulative distributions. The two-sample KS test compares two empirical distributions without requiring a normal reference.
2. Ignoring Ties and Discrete Values
G3 is an integer grade variable. Failures, studytime and parent education are also discrete or ordinal variables. These ties matter. A strict continuous KS assumption is not perfectly matched to these variables, so results should be interpreted as practical distribution comparisons.
3. Reporting Only the P-Value
The p-value tells whether the difference is statistically significant. It does not tell where the difference occurs. The maximum-distance chart and ECDF gap chart explain the location and shape of the difference.
4. Forgetting That Parameters Were Estimated
When the normal mean and standard deviation are estimated from the same sample, the one-sample KS test p-value is not the same as a test against a fully pre-specified normal distribution. This is why Lilliefors-style correction or cautious interpretation is important.
5. Automatically Rejecting Parametric Methods
A significant KS test does not automatically mean every parametric method is forbidden. With n = 649, small or moderate deviations can become significant. The correct response is to inspect plots, consider robustness, check residuals and choose the method that fits the research question.
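The effect of sample size can be made concrete with the usual large-sample rule of thumb for the one-sample KS test at the 5% level, D_crit ≈ 1.358 / sqrt(n). This constant applies to a fully pre-specified reference distribution. With estimated parameters the threshold is smaller still, so the sketch below is only an approximate illustration. The function name is an assumption.

```python
import math

def approx_critical_d(n, coefficient=1.358):
    """Approximate 5% critical D for a fully specified reference distribution."""
    return coefficient / math.sqrt(n)

# The gap needed for significance shrinks as the sample grows.
for n in (50, 200, 649, 5000):
    print(n, round(approx_critical_d(n), 4))
```

At n = 649 the threshold is already close to 0.05, so even a modest ECDF gap is flagged as significant.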
Download SPSS Output and Chart Files
The SPSS PDF and chart images used in this article support the complete Kolmogorov Smirnov Test interpretation.
Sources and Method Notes
This guide uses verified SPSS, R and Python outputs from the student performance dataset. The one-sample KS tests compare variables with fitted normal CDFs. Since the normal parameters are estimated from the same data and the variables contain ties, the p-values are explained as practical distribution-comparison evidence rather than perfect continuous-distribution tests.
FAQs About the Kolmogorov Smirnov Test
What is the Kolmogorov Smirnov Test?
The Kolmogorov Smirnov Test is a distribution-comparison test. It measures the largest distance between cumulative distribution functions.
What is the KS D statistic?
The KS D statistic is the maximum vertical distance between the observed empirical cumulative distribution and a theoretical or second empirical cumulative distribution.
What was the G3 Kolmogorov Smirnov Test result in this example?
The G3 final grade result was D ≈ 0.124 with 649 cases. The maximum ECDF-normal CDF gap occurred around G3 = 10, so G3 did not follow the fitted normal reference distribution.
Is the Kolmogorov Smirnov Test a normality test?
It can be used as a normality test when the observed distribution is compared with a normal CDF, but it is more generally a cumulative distribution comparison test.
What is the difference between one-sample and two-sample KS tests?
The one-sample KS test compares observed data with a theoretical distribution. The two-sample KS test compares the empirical distributions of two groups.
Why should KS results be interpreted carefully for grades?
Grades are discrete, rounded and bounded. They contain repeated values and ties, while the standard KS test assumes continuous data without ties and a fully specified reference distribution.
Which variable had the largest KS D statistic in this analysis?
Failures had the largest one-sample KS D statistic, D = 0.492, because most students had zero failures and the variable was highly non-normal.
Which two-sample comparison was strongest?
The school comparison was strongest. G3 by school GP versus MS had D = 0.295 and p = 1.57e-11.
Can I run the Kolmogorov Smirnov Test in Python?
Yes. Use scipy.stats.kstest for one-sample tests and scipy.stats.ks_2samp for two-sample tests.
Can I run the Kolmogorov Smirnov Test in SPSS?
Yes. SPSS supports one-sample Kolmogorov Smirnov testing through its nonparametric and normality-testing procedures.


