Regression Diagnostics and Heteroscedasticity Testing
The Goldfeld Quandt Test is a regression diagnostic that checks whether residual variance changes across an ordered variable such as fitted values, income, time, study time, absences or another suspected source of heteroscedasticity. This guide explains the Goldfeld Quandt Test formula, null hypothesis, F statistic, p-value interpretation, R workflow, Python workflow, SPSS output, Excel calculation, charts and verified results from the student-por.csv dataset.
Quick Answer: Goldfeld Quandt Test Result
A Goldfeld Quandt Test was conducted on residuals from a regression model predicting G3 final grade. The observations were ordered by fitted G3 values, the middle section was dropped, and residual variances were compared between the low-ordered and high-ordered groups. The low fitted-value group had a residual variance of about 2.86, while the high fitted-value group had a residual variance of about 0.59. The directional statistic for high ordered variance divided by low ordered variance was about F = 0.205.
Final report sentence: A Goldfeld Quandt Test was used to examine heteroscedasticity in a regression model predicting G3 final grade. After ordering cases by fitted G3 and comparing the low and high ordered groups, the high ordered group showed lower residual variance than the low ordered group. The directional increasing-variance test was not supported (F ≈ 0.205, p ≈ 1.000), but the reverse direction indicated that residual spread was larger at the low fitted-grade end. Therefore, the model does not show increasing heteroscedasticity as fitted G3 rises, but the residual spread is not perfectly uniform across the fitted-value range.
Important interpretation: a small F ratio does not mean “no variance difference.” It means the variance in the high ordered group is smaller than the variance in the low ordered group when the ratio is calculated as high variance divided by low variance. Direction matters in the Goldfeld Quandt Test.
What Is the Goldfeld Quandt Test?
The Goldfeld Quandt Test is a test for heteroscedasticity in regression. Heteroscedasticity means that the error variance is not constant across the range of the data. In a good ordinary least squares regression model, residuals should ideally have roughly constant spread. If the residuals become wider or narrower as fitted values increase, the homoscedasticity assumption may be questionable.
The Goldfeld Quandt Test works by ordering observations according to a variable suspected of causing unequal variance. The data are then split into lower and upper groups, often with the middle portion omitted. In the classical procedure, a separate regression is fitted to each group, and the residual variance from one group is compared with the residual variance from the other group using an F ratio. The manual checks in this post use a simplified variant that compares the variance of the full-model residuals within each group.
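The ordering, splitting and refitting steps can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the data are synthetic (not student-por.csv), there is a single predictor, and the helper names `simple_ols_rss` and `goldfeld_quandt_ratio` are hypothetical, not part of any library.

```python
import random
import statistics

def simple_ols_rss(x, y):
    """Fit y = a + b*x by least squares and return the residual sum of squares."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def goldfeld_quandt_ratio(x, y, drop_fraction=0.20):
    """Order by x, drop the middle, refit each outer group, return high/low variance ratio."""
    pairs = sorted(zip(x, y))                      # order observations by x
    n = len(pairs)
    group_n = int(n * (1 - drop_fraction) / 2)     # size of each outer group
    low, high = pairs[:group_n], pairs[-group_n:]
    k = 2                                          # parameters per fit: intercept + slope
    var_low = simple_ols_rss(*zip(*low)) / (group_n - k)
    var_high = simple_ols_rss(*zip(*high)) / (group_n - k)
    return var_high / var_low

# Synthetic data (an assumption for illustration): error spread grows with x.
random.seed(1)
x = [random.uniform(1, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + random.gauss(0, 0.3 * xi) for xi in x]

f_stat = goldfeld_quandt_ratio(x, y)
print(round(f_stat, 2))  # well above 1 here, because the spread increases with x
```

Because the synthetic errors are generated with a standard deviation proportional to x, the high ordered group is noisier and the ratio lands well above 1, which is the increasing-variance pattern.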
Most short explanations simply say that the Goldfeld Quandt Test checks heteroscedasticity. That is incomplete. A proper guide should explain the ordering variable, the dropped middle section, the residual variance ratio, the direction of the alternative hypothesis, and how the result changes when the ordering variable changes. This post does that with R, Python, SPSS and Excel workflows.
Practical note: this test is most useful when you have a specific theory about where variance should change. For example, residual variance may increase with fitted income, predicted sales, time, production volume or an academic performance measure. In this worked example, fitted G3 is used as the main ordering variable because it directly shows whether prediction errors become larger or smaller across the predicted final-grade range.
If you are building a complete regression diagnostics library, you may also need the Durbin Watson Test for residual autocorrelation, the Brown Forsythe Test for robust variance comparison, the Cochran C Test for largest-variance detection, the Cramer von Mises Test for distributional fit, and the D'Agostino Pearson Test for skewness-kurtosis normality checking.
Goldfeld Quandt Test Formula
The Goldfeld Quandt Test compares the residual variance of two ordered groups. If the data are ordered from low to high according to a suspected heteroscedasticity variable, the test statistic can be written as:
F = s²_high / s²_low
Here, s²_high is the residual variance in the high ordered group and s²_low is the residual variance in the low ordered group. If the alternative hypothesis is that variance increases as the ordering variable increases, then a large F value supports heteroscedasticity. If F is much smaller than 1, the result points in the opposite direction: the higher ordered group has lower residual variance than the lower ordered group.
| Statistic pattern | General interpretation | Practical meaning |
|---|---|---|
| F near 1 | Similar residual variances | The lower and upper ordered groups have similar error spread. |
| F much greater than 1 | Possible increasing heteroscedasticity | The high ordered group has larger residual variance. |
| F much less than 1 | Possible decreasing heteroscedasticity | The high ordered group has smaller residual variance. |
| Small p-value | Evidence against equal residual variance | The residual spread differs across the ordered groups. |
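As a quick arithmetic check of the formula, the ratio can be recomputed from the two rounded group variances reported in this post. The inputs are the rounded values 2.86 and 0.59, so the result is about 0.206 rather than the reported 0.205, which presumably comes from unrounded variances.

```python
var_low = 2.86   # residual variance, low fitted-G3 group (rounded, from this post)
var_high = 0.59  # residual variance, high fitted-G3 group (rounded, from this post)

f_high_low = var_high / var_low   # ratio used for the increasing-variance direction
f_low_high = var_low / var_high   # reciprocal: how much larger the low-group variance is

print(round(f_high_low, 3))  # 0.206
print(round(f_low_high, 2))  # 4.85
```

Reading the reciprocal makes the direction easier to see: the low ordered group has nearly five times the residual variance of the high ordered group.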
Why the Middle Data Are Dropped
The middle part of the ordered data is often dropped to make the lower and upper groups more distinct. If the purpose is to compare low fitted values with high fitted values, the middle observations may blur the contrast. Dropping the middle section creates a cleaner comparison between the two ends of the ordered range.
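The split itself is simple index arithmetic. The sketch below assumes the same rule used in the manual workflows in this post (each outer group is the floor of 40% of n); the function name `gq_split` is hypothetical.

```python
import math

def gq_split(n, drop_fraction=0.20):
    """Return (low, dropped, high) index ranges for n observations already in order."""
    group_n = math.floor(n * (1 - drop_fraction) / 2)
    low = range(0, group_n)
    dropped = range(group_n, n - group_n)
    high = range(n - group_n, n)
    return low, dropped, high

low, dropped, high = gq_split(649)
print(len(low), len(dropped), len(high))  # 259 131 259
```

With 649 rows the two outer groups hold 259 cases each, so the omitted middle is 131 cases, slightly more than a nominal 20%.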
Directional and Two-Sided Interpretation
The Goldfeld Quandt Test can be interpreted directionally or two-sided. A directional increasing test asks whether the high ordered group has greater variance than the low ordered group. A directional decreasing test asks whether the high ordered group has lower variance. A two-sided version asks whether the two variances differ in either direction.
Goldfeld Quandt Test Null Hypothesis and Alternative Hypothesis
| Hypothesis | Meaning | Applied to this example |
|---|---|---|
| H0 | The residual variance is equal across the ordered groups. | The low fitted G3 group and high fitted G3 group have equal residual variance. |
| H1 increasing | The high ordered group has greater residual variance. | Residual variance increases as fitted G3 increases. |
| H1 decreasing | The high ordered group has lower residual variance. | Residual variance decreases as fitted G3 increases. |
| H1 two-sided | The ordered groups have different residual variances. | Residual variance is not equal across low and high fitted G3 groups. |
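The three alternatives map onto three tail probabilities of the same F distribution. The sketch below is dependency-free and computes the F cumulative distribution through the regularized incomplete beta function; in practice a library routine such as `scipy.stats.f` is the usual choice. The helper names are hypothetical, and the degrees of freedom (group size minus 1, so 258 and 258) follow the manual variance comparison in this post.

```python
import math

def _betacf(a, b, x, max_iter=500, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:   # last correction factor is ~1: converged
            break
    return h

def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def f_cdf(f, df_num, df_den):
    """P(F <= f) for an F distribution with (df_num, df_den) degrees of freedom."""
    x = df_num * f / (df_num * f + df_den)
    return _reg_inc_beta(df_num / 2.0, df_den / 2.0, x)

f_stat, df_high, df_low = 0.205, 258, 258        # values reported in this post
p_less = f_cdf(f_stat, df_high, df_low)          # H1 decreasing: a small F is evidence
p_greater = 1.0 - p_less                         # H1 increasing: a large F is evidence
p_two_sided = min(1.0, 2.0 * min(p_less, p_greater))

print(p_greater > 0.999, p_less < 0.001, p_two_sided < 0.001)  # True True True
```

The same statistic therefore gives a p-value near 1 for the increasing alternative and a very small p-value for the decreasing and two-sided alternatives, which is why the direction has to be stated in the report.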
Dataset and Regression Model Used
This worked example uses the student-por.csv student performance dataset. The outcome variable is G3, the final grade. The regression model predicts G3 using earlier grades, study behavior and selected background variables. The Goldfeld Quandt Test is then applied to the regression residuals.
| Item | Verified value | Explanation |
|---|---|---|
| Dataset | student-por.csv | Student performance dataset used for regression diagnostics. |
| Rows | 649 | Total student observations used in the analysis. |
| Main outcome | G3 | Final grade, measured from 0 to 20. |
| Main model predictors | G1, G2, studytime, failures, absences, age, Medu, Fedu | Academic, study and background predictors. |
| Main ordering variable | Fitted G3 | Used to test whether residual variance changes across predicted final-grade level. |
| Middle section | 20% dropped | Used to separate the lower and upper ordered residual groups more clearly. |
External dataset source: UCI Machine Learning Repository: Student Performance dataset.
Verified Goldfeld Quandt Test Results in R, Python and SPSS
The analysis was reproduced in R, Python and SPSS. R used lm() with gqtest() from the lmtest package plus a manual variance check. Python used statsmodels and a manual split-verification method. SPSS saved predicted values and residuals, ordered the cases, created the low and high ordered groups, and manually compared residual variances.
Main Fitted-G3 Ordering Result
| Software | Ordering variable | Low group residual variance | High group residual variance | F statistic | Interpretation |
|---|---|---|---|---|---|
| R | Fitted G3 | ≈ 2.86 | ≈ 0.59 | ≈ 0.205 | High fitted G3 group has lower residual variance. |
| Python | Fitted G3 | ≈ 2.86 | ≈ 0.59 | ≈ 0.205 | No evidence of increasing variance; evidence points toward decreasing variance. |
| SPSS | Fitted G3 | ≈ 2.86 | ≈ 0.59 | ≈ 0.205 | Manual grouped residual calculation supports the R and Python result. |
Directional P-Value Interpretation
| Test direction | Approximate result | Meaning | Conclusion |
|---|---|---|---|
| Increasing variance with fitted G3 | p ≈ 1.000 | The high fitted group is not more variable. | Not supported. |
| Decreasing variance with fitted G3 | p < 0.001 | The high fitted group is less variable than the low fitted group. | Supported. |
| Two-sided variance difference | p < 0.001 | The two ordered groups have different residual variances. | Supported. |
SPSS Regression Model Summary
The SPSS output confirms that the regression model strongly predicts G3 before residual variance is checked. The Goldfeld Quandt Test is not a model-fit test by itself; it is a diagnostic test on residual variance after fitting the model.
| Model statistic | Value | Meaning |
|---|---|---|
| R | 0.922 | Strong association between fitted and observed G3. |
| R Square | 0.851 | The model explains about 85.1% of G3 variation. |
| Adjusted R Square | 0.849 | Adjusted explanatory power after accounting for predictors. |
| Std. Error of Estimate | 1.256 | Typical prediction error size in grade units. |
| F statistic | 456.111 | Overall model significance test. |
| Model Sig. | < .001 (printed as .000) | The model is statistically significant overall. |
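The summary statistics in this table are linked by standard formulas, so they can be cross-checked from R Square, the number of rows and the number of predictors. The sketch below uses the rounded R Square of 0.851, so the overall F comes out near 457 rather than exactly 456.111.

```python
import math

n, k = 649, 8        # rows and predictors reported in this post
r_squared = 0.851    # rounded R Square from the SPSS table

multiple_r = math.sqrt(r_squared)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
f_overall = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(round(multiple_r, 3))     # 0.922
print(round(adj_r_squared, 3))  # 0.849
print(round(f_overall, 1))      # 456.9
```

The small gap between 456.9 and the reported 456.111 is consistent with rounding R Square to three decimals before the calculation.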
Main Regression Coefficients from SPSS
| Predictor | B | Std. Error | Beta | t | Sig. | Short interpretation |
|---|---|---|---|---|---|---|
| Constant | -0.501 | 0.774 | — | -0.648 | 0.518 | Not significant. |
| G1 | 0.143 | 0.037 | 0.122 | 3.910 | < .001 | Earlier grade G1 positively predicts G3. |
| G2 | 0.885 | 0.034 | 0.798 | 25.744 | < .001 | G2 is the strongest predictor of G3. |
| studytime | 0.097 | 0.062 | 0.025 | 1.556 | 0.120 | Not significant after controlling for other variables. |
| failures | -0.235 | 0.095 | -0.043 | -2.471 | 0.014 | Previous failures negatively predict G3. |
| absences | 0.023 | 0.011 | 0.033 | 2.085 | 0.038 | Small positive coefficient in this controlled model. |
| age | 0.023 | 0.044 | 0.009 | 0.520 | 0.604 | Not significant. |
| Medu | -0.045 | 0.058 | -0.016 | -0.776 | 0.438 | Not significant. |
| Fedu | 0.022 | 0.059 | 0.007 | 0.371 | 0.711 | Not significant. |
Goldfeld Quandt Model Comparison
| Model | Predictors | Approximate F ratio | Interpretation |
|---|---|---|---|
| Main model | G1, G2, studytime, failures, absences, age, Medu, Fedu | ≈ 0.205 | High fitted group has lower residual variance than low fitted group. |
| Simple G1+G2 model | G1, G2 | ≈ 0.198 | Similar decreasing variance pattern. |
| Background model | studytime, failures, absences, age, Medu, Fedu, traveltime, health | ≈ 0.945 | Closer to equal variance than the grade-based models. |
Goldfeld Quandt Test Charts and Interpretation
1. Actual vs Fitted G3

This chart shows the relationship between actual G3 final grades and the model’s fitted values. The points follow a strong upward diagonal pattern, which means the regression model predicts G3 reasonably well. However, prediction strength does not automatically prove equal residual variance. That is why the Goldfeld Quandt Test is applied after the regression model is fitted.
2. F Null Distribution for Goldfeld Quandt Test

The observed F statistic lies far to the left of the upper-tail critical value. This means the data do not support the increasing-variance alternative when the ratio is high ordered variance divided by low ordered variance. In simple words, the high fitted G3 group does not have larger residual spread. The result points in the opposite direction because the low fitted G3 group has more residual variance.
3. Group Residual Variances

This is the most direct chart for understanding the result. The low ordered group has much higher residual variance than the high ordered group. That is why the F ratio is much smaller than 1 when calculated as high group variance divided by low group variance. The chart clearly shows that the main issue is not increasing spread at high fitted grades; the larger spread is concentrated at the lower fitted-grade end.
4. Goldfeld Quandt F Statistic Across Regression Models

The main model and the G1+G2 model both show small F ratios, meaning residual variance is lower in the high ordered fitted group. The background model has a value closer to 1, suggesting less group variance difference under that specification. This chart is important because heteroscedasticity diagnostics can change when the regression model changes.
5. Goldfeld Quandt Test P-Value Comparison

This chart shows why the choice of ordering variable matters. When ordered by fitted G3, the increasing-variance alternative has a p-value near 1 because the high fitted group is not more variable. The p-values for the reverse and two-sided comparisons are much smaller. Other ordering variables, such as absences, studytime and failures, can also produce different diagnostic signals. A serious report should therefore explain the ordering variable instead of reporting a single unexplained p-value.
6. Regression Residuals in Goldfeld Quandt Subgroups

The boxplot shows that the low ordered group has more extreme residual values and a wider spread. The high ordered group is more compact. This visual result supports the numerical variance comparison. It also shows that some low fitted-grade cases have large negative residuals, which increases the residual variance in the lower ordered group.
7. Residuals vs Fitted Values

This chart is a standard heteroscedasticity diagnostic plot. The residuals are not evenly spread across all fitted values. The lower fitted region contains several larger negative residuals, while the higher fitted region looks tighter. This explains why the Goldfeld Quandt Test detects lower residual variance in the high fitted group.
8. Squared Residuals Ordered by Fitted G3

This plot makes the test mechanism visible. The observations are sorted by fitted G3, the middle section is dropped, and the two outer sections are compared. Large squared residuals are concentrated more strongly in the low fitted section. That is why the low ordered group has greater residual variance and why the high/low F ratio is small.
How to Run the Goldfeld Quandt Test in R, Python, SPSS and Excel
Goldfeld Quandt Test in R
In R, the Goldfeld Quandt Test can be run with the lmtest package. The safest workflow is to fit the regression model, calculate fitted values, order residuals by fitted values, and then compare the lower and upper residual variance groups manually as a verification step.
# install.packages("lmtest")  # run once if lmtest is not yet installed
library(lmtest)
student <- read.csv("student-por.csv", sep = ";", stringsAsFactors = FALSE)
num_vars <- c("G1", "G2", "G3", "studytime", "failures", "absences", "age", "Medu", "Fedu")
student[num_vars] <- lapply(student[num_vars], as.numeric)
model <- lm(G3 ~ G1 + G2 + studytime + failures + absences + age + Medu + Fedu,
data = student)
# Goldfeld Quandt Test ordered by fitted G3
fit_values <- fitted(model)
gq_greater <- gqtest(model, order.by = fit_values, fraction = 0.20, alternative = "greater")
gq_less <- gqtest(model, order.by = fit_values, fraction = 0.20, alternative = "less")
print(gq_greater)
print(gq_less)
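# Manual verification: ratio of full-model residual variances in the outer groups.
# Note: gqtest() refits the regression separately in each group, so its F
# statistic can differ slightly from this ratio.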
res <- resid(model)
ord <- order(fit_values)
res_ord <- res[ord]
n <- length(res_ord)
group_n <- floor(n * 0.40)
low_res <- res_ord[1:group_n]
high_res <- res_ord[(n - group_n + 1):n]
var_low <- var(low_res)
var_high <- var(high_res)
f_high_low <- var_high / var_low
var_low
var_high
f_high_low
Goldfeld Quandt Test in Python
In Python, use statsmodels for the regression model and then manually verify the Goldfeld Quandt logic by sorting residuals according to fitted values. This method is transparent and makes the split, dropped section and F ratio easy to understand.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats
student = pd.read_csv("student-por.csv", sep=";")
cols = ["G1", "G2", "G3", "studytime", "failures", "absences", "age", "Medu", "Fedu"]
for col in cols:
    student[col] = pd.to_numeric(student[col], errors="coerce")
data = student[cols].dropna().copy()
y = data["G3"]
X = data[["G1", "G2", "studytime", "failures", "absences", "age", "Medu", "Fedu"]]
X = sm.add_constant(X)
model = sm.OLS(y, X).fit()
data["fitted_g3"] = model.fittedvalues
data["residual"] = model.resid
data["residual_sq"] = data["residual"] ** 2
# Order by fitted G3
ordered = data.sort_values("fitted_g3").reset_index(drop=True)
n = len(ordered)
group_n = int(np.floor(n * 0.40))
low_group = ordered.iloc[:group_n].copy()
high_group = ordered.iloc[-group_n:].copy()
var_low = low_group["residual"].var(ddof=1)
var_high = high_group["residual"].var(ddof=1)
f_high_low = var_high / var_low
df_high = len(high_group) - 1
df_low = len(low_group) - 1
p_greater = stats.f.sf(f_high_low, df_high, df_low)
p_less = stats.f.cdf(f_high_low, df_high, df_low)
p_two_sided = min(1, 2 * min(p_greater, p_less))
print(model.summary())
print("Low ordered residual variance:", var_low)
print("High ordered residual variance:", var_high)
print("Goldfeld Quandt F statistic high/low:", f_high_low)
print("p-value greater:", p_greater)
print("p-value less:", p_less)
print("p-value two-sided:", p_two_sided)
Goldfeld Quandt Test in SPSS
This workflow does not rely on a dedicated SPSS menu option. You can save predicted values and residuals from the regression model, sort by the ordering variable, define low and high ordered groups, and then compare residual variances manually.
* Step 1: Save predicted values and residuals from the regression model.
REGRESSION
/MISSING LISTWISE
/STATISTICS COEFF R ANOVA COLLIN
/DEPENDENT G3
/METHOD=ENTER G1 G2 studytime failures absences age Medu Fedu
/SAVE PRED(pred_g3) RESID(res_g3).
* Step 2: Keep a case id and sort by fitted G3.
COMPUTE caseid = $CASENUM.
EXECUTE.
SORT CASES BY pred_g3(A).
EXECUTE.
* Step 3: Create ordered position and total n.
COMPUTE ordered_position = $CASENUM.
EXECUTE.
AGGREGATE
/OUTFILE=* MODE=ADDVARIABLES
/BREAK=
/total_n = N(res_g3).
* Step 4: Use lower 40% and upper 40%, dropping the middle 20%.
COMPUTE group_n = TRUNC(total_n * .40).
COMPUTE gq_group = $SYSMIS.
IF (ordered_position LE group_n) gq_group = 1.
IF (ordered_position GT total_n - group_n) gq_group = 2.
VALUE LABELS gq_group
1 "Low ordered group"
2 "High ordered group".
EXECUTE.
* Step 5: Compare residual variance by group.
TEMPORARY.
SELECT IF NOT MISSING(gq_group).
EXAMINE VARIABLES=res_g3 BY gq_group
/PLOT BOXPLOT
/STATISTICS DESCRIPTIVES
/MISSING LISTWISE.
* Step 6: Save squared residuals for manual checking.
COMPUTE res_g3_sq = res_g3 ** 2.
EXECUTE.
TEMPORARY.
SELECT IF NOT MISSING(gq_group).
MEANS TABLES=res_g3 res_g3_sq BY gq_group
/CELLS MEAN COUNT STDDEV VARIANCE.
Goldfeld Quandt Test in Excel
Excel can reproduce the Goldfeld Quandt Test if you already have actual values, fitted values and residuals. The key is to sort by the selected ordering variable and compare residual variance in the lower and upper groups.
| Excel column | Content | Example formula or action |
|---|---|---|
| A | Observation ID | 1, 2, 3, … |
| B | Actual G3 | Observed final grade |
| C | Fitted G3 | Predicted final grade from regression |
| D | Residual | =B2-C2 |
| E | Squared residual | =D2^2 |
| Step | Sort | Sort all rows by fitted G3 from smallest to largest. |
| Step | Split | Use lower 40% and upper 40%; drop the middle 20%. |
| Low variance | Variance of lower group residuals | =VAR.S(D2:D260) |
| High variance | Variance of upper group residuals | =VAR.S(D392:D650) |
F statistic = High ordered residual variance / Low ordered residual variance
Right-tail p-value for increasing variance (df_high and df_low are each group's row count minus 1):
=F.DIST.RT(F_statistic, df_high, df_low)
Left-tail p-value for decreasing variance:
=F.DIST(F_statistic, df_high, df_low, TRUE)
Two-sided p-value approximation:
=MIN(1, 2*MIN(left_tail_p, right_tail_p))
For this dataset and fitted-G3 ordering, Excel should return an F ratio close to 0.205 if the same residuals, same ordering and same split rule are used.
How to Report the Goldfeld Quandt Test Result
A good report should mention the regression model, the ordering variable, the dropped middle section, the residual variance comparison, the F statistic and the direction of interpretation. Do not report only a p-value because the Goldfeld Quandt Test is directional and depends strongly on the chosen ordering variable.
APA-style report: A Goldfeld Quandt Test was conducted to evaluate heteroscedasticity in a regression model predicting G3 final grade from G1, G2, studytime, failures, absences, age, mother’s education and father’s education. Cases were ordered by fitted G3 values, and the middle 20% was omitted before comparing residual variances in the lower and upper ordered groups. The high ordered group showed lower residual variance than the low ordered group (F ≈ 0.205). The increasing-variance alternative was not supported, but the result indicated that residual spread was greater in the low fitted-value range.
Plain-language report: The model does not show that prediction errors become larger as fitted G3 increases. Instead, the errors are more spread out among lower fitted grades and more compact among higher fitted grades. This means the model has a variance-pattern issue, but the direction is decreasing rather than increasing across fitted G3.
When Should You Use the Goldfeld Quandt Test?
Use this test when you suspect that regression residual variance changes systematically across an ordered variable. It is especially useful when the expected variance pattern is connected to a specific scale, such as income, fitted values, production volume, age, study time, absences or another meaningful ordering variable.
| Situation | Use it? | Reason |
|---|---|---|
| Regression errors may increase with fitted values | Yes | Ordering by fitted values can reveal changing residual spread. |
| Residual variance may rise with income or sales | Yes | Economic variables often produce scale-related heteroscedasticity. |
| No clear ordering variable exists | Use caution | The test depends on a meaningful order. |
| You want a general heteroscedasticity test | Use with other tests | Breusch-Pagan or White tests may be more general. |
| Data contain strong outliers | Use visual checks too | Outliers can strongly affect residual variance comparisons. |
Common Mistakes in the Goldfeld Quandt Test
1. Not explaining the ordering variable
The Goldfeld Quandt Test depends on how the data are ordered. A result ordered by fitted values is not the same as a result ordered by absences, study time, failures or age. Always report the ordering variable.
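A toy example shows how much the ratio can move when only the ordering variable changes. The residuals and both ordering variables below are made-up numbers, and `variance_ratio` is a hypothetical helper that follows the simplified split used in this post.

```python
import statistics

def variance_ratio(residuals, order_by, drop=2):
    """Sort residuals by order_by, drop `drop` middle cases, return var(high) / var(low)."""
    ordered = [r for _, r in sorted(zip(order_by, residuals))]
    group_n = (len(ordered) - drop) // 2
    low, high = ordered[:group_n], ordered[-group_n:]
    return statistics.variance(high) / statistics.variance(low)

# Hypothetical residuals with two candidate ordering variables.
residuals = [0.1, -0.2, 0.1, -0.1, 0.3, -0.3, 2.0, -2.5, 1.5, -1.8]
order_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # spread grows along this variable
order_b = [1, 10, 2, 9, 5, 6, 3, 8, 4, 7]   # large and small residuals are mixed

ratio_a = variance_ratio(residuals, order_a)
ratio_b = variance_ratio(residuals, order_b)
print(round(ratio_a, 1), round(ratio_b, 2))
```

Ordered by the first variable, the high group is far more variable than the low group; ordered by the second, the same residuals give a ratio much closer to 1.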
2. Treating F less than 1 as automatically non-significant
An F ratio below 1 can still be meaningful. It may show that variance decreases instead of increases. Directional interpretation matters.
3. Ignoring the dropped middle section
If the middle section is dropped, report how much was dropped. In this post, the middle 20% was omitted to compare the lower 40% and upper 40% ordered groups.
4. Confusing heteroscedasticity with poor model fit
A model can predict well and still have unequal residual variance. The SPSS model has a high R Square, but the residual spread still changes across fitted G3.
5. Using only one test
The Goldfeld Quandt Test is useful, but it should be considered alongside residual plots and other heteroscedasticity tests. A complete diagnostic workflow often includes residuals vs fitted plots, Breusch-Pagan, White test and robust standard errors.
6. Forgetting outlier influence
Large residual outliers can strongly affect group variance. In this example, low fitted-grade cases include several larger negative residuals, which increases the variance of the low ordered group.
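The effect of a single extreme case on a group variance is easy to demonstrate. The residual values below are hypothetical and chosen only to keep the arithmetic simple.

```python
import statistics

low_group = [-1.0, 0.0, 1.0, 0.5, -0.5]   # hypothetical residuals
with_outlier = low_group + [-6.0]         # one large negative residual added

var_before = statistics.variance(low_group)
var_after = statistics.variance(with_outlier)
print(var_before)  # 0.625
print(var_after)   # 6.5
```

One added case raises the sample variance roughly tenfold, so a variance ratio built on that group would shift by the same factor. This is why the boxplot and residual plots should be read alongside the F ratio.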
Download SPSS Output and Verification Files
The SPSS PDF verifies the regression output, saved predicted values, saved residuals, subgroup construction and manual Goldfeld Quandt residual variance comparison.
Sources and Method Notes
This post uses verified R, Python and SPSS outputs together with official software documentation for the test implementation and dataset source.
FAQs About the Goldfeld Quandt Test
What does the Goldfeld Quandt Test check?
It checks whether regression residual variance changes across an ordered variable. It is commonly used as a heteroscedasticity diagnostic.
What is the null hypothesis of the Goldfeld Quandt Test?
The null hypothesis is that residual variance is equal across the ordered groups. In simple terms, the model errors have constant variance.
What does a small p-value mean?
A small p-value means there is evidence that residual variance differs across the ordered groups. The direction depends on whether the alternative is increasing, decreasing or two-sided.
What was the result in this example?
The high fitted G3 group had lower residual variance than the low fitted G3 group. The increasing-variance alternative was not supported, but the result showed decreasing residual spread across fitted G3.
Why is fitted G3 used as the ordering variable?
Fitted G3 is used because it tests whether residual variance changes across the predicted grade range. It is a common diagnostic choice when checking whether model errors become wider or narrower as fitted values increase.
Can the Goldfeld Quandt Test be run in R?
Yes. In R, the lmtest package provides the gqtest() function, and the statistic can also be calculated manually from ordered residual variances.
Can the Goldfeld Quandt Test be run in Python?
Yes. Python users can apply statsmodels and manually verify the test by sorting residuals, dropping the middle section and comparing residual variance between low and high ordered groups.
Can the Goldfeld Quandt Test be run in SPSS?
Yes. SPSS can save predicted values and residuals. After sorting by the ordering variable, you can create low and high ordered groups and compare residual variance manually.
Can the Goldfeld Quandt Test be done in Excel?
Yes. If you have actual values, fitted values and residuals, Excel can sort by the ordering variable, calculate group variances and compute the F statistic and p-values.
Is the Goldfeld Quandt Test the same as the Breusch-Pagan test?
No. Both are heteroscedasticity tests, but the Goldfeld Quandt Test compares residual variances across ordered groups, while the Breusch-Pagan test models squared residuals against explanatory variables.


