Statistics and Data Science Academic Seminar Series
Moving beyond Population Variable Importance: Concept, Theory and Applications of Individual Variable Importance

Abstract: In a nonparametric regression setting, we propose a novel concept of "individual variable importance", which refers to the relevance of some covariates with respect to an outcome variable among individuals with certain features. This concept holds practical importance for both risk assessment and association identification. For example, it can represent (i) the usefulness of expensive biomarkers in disease prediction for individuals at certain baseline risk, or (ii) age-specific associations between physiological indicators. We quantify the individual variable importance by a ratio parameter between two conditional mean squared errors. To infer this parameter, we develop fully nonparametric estimators and establish their asymptotic properties. Our inferential approaches perform well in simulation studies. We further demonstrate our approaches through application to a real data set, showing a scientifically interesting result: the association between body shape and systolic blood pressure decays with increasing age. While our finding aligns with the existing medical literature based on standard parametric regression techniques, our analysis methods are more reliable because their validity is not affected by model misspecification. More importantly, the fully nonparametric nature equips the individual variable importance framework with broader applicability in contexts that go beyond traditional parametric modeling. It can capture complex relationships between variables that are likely to be missed by standard parametric interaction analyses.


About the Speaker:

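Guorong Dai (戴国榕) is a Lecturer in the Department of Statistics and Data Science, School of Management, Fudan University. He received his Ph.D. in Statistics from Texas A&M in 2019 and remained there as a postdoctoral researcher until joining Fudan University in 2021. Dr. Dai's research interests include high-dimensional statistics, missing data, semiparametric theory, semi-supervised inference, and applications of statistical methods in biomedicine.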