Penalized Logistic Regression in Case-Control Studies
Likelihood-based inference of odds ratios in logistic regression models is problematic for small samples. For example, maximum-likelihood estimators maybe seriously biased or even non-existent due to separation. Firth (1993) proposed a penalized likelihood approach which avoids these problems. However, his approach is based on a prospective sampling design and its application to case-control data has not yet been fully justified.
To address the shortcomings of standard likelihood-based inference, we describe: i) naive application of Firth logistic regression, which ignores the case-control sampling design, and ii) an extension of Firth's method to case-control data (Zhang, 2006). We present a simulation study evaluating the empirical performance of the two approaches in small to moderate case-control samples. Our simulation results suggest that even though there is no formal justification for applying Firth logistic regression to case-control data, it performs as well as Zhang logistic regression which is justified for case-control data.