Open Conference Systems, ITC 2016 Conference

POSTER: The Impact of Group Imbalance on Logistic Regression Analyses with Assessment Data
Arwa Alkhalaf, Bruno Zumbo

Building: Pinnacle
Room: 2F-Harbourside Ballroom
Date: 2016-07-02 11:00 AM – 12:30 PM
Last modified: 2016-05-22

Abstract


Introduction

Logistic regression (LogReg) is widely used in analyzing educational assessment data, a special case of which is LogReg for differential item functioning (DIF). The model of interest in this paper is a LogReg akin to an analysis of covariance, with a dichotomous outcome variable and three predictors: a continuous covariate, a (skewed) dichotomous grouping variable, and their interaction. The skewed grouping variable reflects an imbalance in the sample sizes of the two groups. Little to no research has examined the effects of skewed predictors on parameter estimates in logistic regression.
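
For concreteness, the model described above can be written as follows. The symbols are illustrative assumptions rather than the authors' notation: $x_i$ is the continuous covariate, $g_i \in \{0, 1\}$ is the grouping variable, and $\pi_i$ is the probability that the dichotomous outcome equals 1. The Wald statistic shown is the standard one for a single coefficient; which coefficients the study tests is inferred from the abstract.

```latex
\[
  \operatorname{logit}(\pi_i)
    = \ln\frac{\pi_i}{1 - \pi_i}
    = \beta_0 + \beta_1 x_i + \beta_2 g_i + \beta_3 x_i g_i ,
  \qquad \pi_i = P(Y_i = 1 \mid x_i, g_i).
\]
% Wald test of a single coefficient beta_j (e.g. j = 2 for the grouping
% variable, j = 3 for the interaction): under H_0 the statistic is
% approximately standard normal in large samples.
\[
  H_0\colon \beta_j = 0, \qquad
  z_j = \frac{\hat{\beta}_j}{\widehat{\mathrm{SE}}(\hat{\beta}_j)} .
\]
```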

Objectives

The present simulation study investigates the impact of unbalanced group membership on the Type I error rate and statistical power of the Wald tests in this model.

Methods

Type I error of the Wald tests is examined with a 4×4×10 completely crossed factorial design varying three factors: sample size (from 200 to 5000), skewness of the dichotomous predictor (from 50:50 to 1:99), and skewness of the dependent variable (from 50:50 to 1:99). Power is examined with a 4×4×10×2 completely crossed factorial design varying four factors: the same three studied for Type I error, plus the effect size of the dichotomous predictor and the interaction (odds ratios of 2 and 4).
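
One cell of such a Type I error simulation might be sketched as below. This is a minimal illustration, not the authors' code. Several details are assumptions not stated in the abstract: the function names `fit_logistic` and `type1_rate`, the standard-normal covariate with its slope fixed at 1, the use of the intercept `b0` to shift the outcome split, the replication count, and the choice to drop replications in which the Newton-Raphson fit fails.

```python
import numpy as np
from math import erfc, sqrt


def fit_logistic(X, y, max_iter=50, tol=1e-8):
    """Maximum-likelihood logistic regression by Newton-Raphson.

    Returns (beta, standard_errors), or None if the fit fails
    (singular information matrix or no convergence).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)          # guard against overflow
        p = 1.0 / (1.0 + np.exp(-eta))
        info = X.T @ (X * (p * (1.0 - p))[:, None])   # Fisher information
        try:
            step = np.linalg.solve(info, X.T @ (y - p))
            cov = np.linalg.inv(info)
        except np.linalg.LinAlgError:
            return None
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta, np.sqrt(np.diag(cov))
    return None


def type1_rate(n, p_group, b0, reps=300, alpha=0.05, seed=0):
    """Empirical rejection rate of the Wald test on the grouping
    coefficient when its true value (and the interaction's) is zero."""
    rng = np.random.default_rng(seed)
    rejections = 0
    fitted = 0
    for _ in range(reps):
        x = rng.standard_normal(n)                     # continuous covariate
        g = (rng.random(n) < p_group).astype(float)    # imbalanced grouping
        X = np.column_stack([np.ones(n), x, g, x * g])
        # Null model: only the intercept and covariate drive the outcome.
        # Covariate slope of 1 is an illustrative assumption.
        p = 1.0 / (1.0 + np.exp(-(b0 + x)))
        y = (rng.random(n) < p).astype(float)
        fit = fit_logistic(X, y)
        if fit is None:
            continue                                   # assumption: drop failed fits
        beta, se = fit
        z = beta[2] / se[2]                            # Wald statistic for group
        p_value = erfc(abs(z) / sqrt(2.0))             # two-sided normal p-value
        fitted += 1
        rejections += int(p_value < alpha)
    return rejections / fitted if fitted else float("nan")


if __name__ == "__main__":
    # Balanced groups and outcome versus a heavily imbalanced grouping variable.
    print(type1_rate(n=500, p_group=0.50, b0=0.0))
    print(type1_rate(n=500, p_group=0.05, b0=0.0))
```

A power cell could be sketched the same way by adding nonzero group and interaction terms to the data-generating logit, for example ln 2 or ln 4 to match odds ratios of 2 and 4.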

Results and Conclusions

The Type I error and power findings reflect a complicated interaction among the skewness of the dependent variable, the imbalance of the group sizes (i.e., the skewness of the grouping predictor variable), and sample size. As a general statement, the Type I error rate and power are negatively affected by severe imbalance in group sizes. In cases where the Type I error rate of the Wald test of the grouping variable is affected, it is consistently deflated and close to zero. The complicated findings will be interpreted with a focus on providing advice for data analysts and practitioners.

