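Lecture Title: Statistical Modeling Methods for Multi-Source Data Fusion and Their Applications
Speaker: Professor Fang Kuangnan, Xiamen University
Time: Thursday, June 15, 2023, 14:30-16:00
Venue: Conference Room 402, College Building 6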
Organizer: Zhejiang Province 2011 "Collaborative Innovation Center for Data Science and Big Data Analysis"
Abstract:
In diverse fields ranging from finance to omics, it is increasingly common that data are distributed across multiple individual sources (referred to as "clients" in some studies). Integrating raw data, although powerful, is often not feasible, for example, when there are considerations on privacy protection. Distributed learning techniques have been developed to integrate summary statistics as opposed to raw data. In many of the existing distributed learning studies, it is stringently assumed that all the clients have the same model. To accommodate data heterogeneity, some federated learning methods allow for client-specific models. In this article, we consider the scenario in which clients form clusters: those in the same cluster have the same model, and different clusters have different models. Taking the clustering structure into account can lead to a better understanding of the "interconnections" among clients and reduce the number of parameters. To this end, we develop a novel penalization approach. Specifically, group penalization is imposed for regularized estimation and selection of important variables, and fusion penalization is imposed to automatically cluster clients. An effective ADMM algorithm is developed, and the estimation, selection, and clustering consistency properties are established under mild conditions. Simulation and data analysis further demonstrate the practical utility and superiority of the proposed approach.
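Speaker Biography:
Fang Kuangnan is a professor and doctoral supervisor in the Department of Statistics and Data Science, School of Economics, Xiamen University, and a former postdoctoral fellow at Yale University. He is Deputy Chair of the Department of Statistics and Data Science, Director of the Xiamen University Research Center for Credit Big Data and Intelligent Risk Control, an Elected Member of the International Statistical Institute, and Chief Expert of a Major Project of the National Social Science Fund of China. His research focuses on statistical machine learning, statistics for economics and management, and financial technology. He has been selected as a National High-Level Young Top-Notch Talent, a Fujian Province High-Level Talent (Category A), and a Young Top-Notch Talent under Fujian Province's "Special Support Double Hundred Plan", among other honors. He also serves as Vice President of the National Industrial Statistics Teaching and Research Association, Executive Council Member of the China Commercial Statistical Society, and editorial board member of Statistical Research (《统计研究》) and Journal of Applied Statistics and Management (《数理统计与管理》). He has published more than 100 academic papers in leading domestic and international journals and has authored 6 monographs and textbooks. He has received more than 10 research awards at the provincial or ministerial level or above, and several of his research outputs have received written comments from leaders at the provincial or ministerial level or above. He has led 1 Major Project of the National Social Science Fund of China, 4 National Natural Science Foundation of China projects, and major projects of the Ministry of Education (humanities and social sciences) and the National Bureau of Statistics, among more than 10 government-funded projects, and has undertaken more than 30 industry-commissioned projects for enterprises such as Huawei and China Star Optoelectronics Technology (华星光电).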
All interested faculty and students are welcome to attend!