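Lecture title: Dual-Partition Model Aggregation for High-Dimensional Massive Data
Speaker: Professor Liu Yanyan, Wuhan University
Time: Thursday, December 8, 2022, 14:30-15:30
Venue: Tencent Meeting (469-612-022)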
Host: Zhejiang Province 2011 "Collaborative Innovation Center for Data Science and Big Data Analysis"
Abstract:
Massive data often feature both high dimensionality and large sample size. Such data typically cannot be stored on a single machine, which makes both analysis and prediction challenging. We propose a distributed gridding model aggregation (DGMA) approach to predicting the conditional mean of a response variable, which overcomes the storage limitation of a single machine and the curse of dimensionality. Specifically, on each local machine, which stores a portion of the data with a relatively moderate sample size, we develop a model aggregation approach that splits the predictors, for which a greedy algorithm is developed. To obtain the optimal weights across all local machines, we further design a distributed and communication-efficient algorithm. Our procedure effectively distributes the workload and dramatically reduces the communication cost. Extensive numerical experiments on both simulated and real datasets demonstrate the feasibility of the DGMA method.
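About the speaker:
Liu Yanyan is a professor and doctoral supervisor at the School of Mathematics and Statistics, Wuhan University. She received her Doctor of Science degree from Wuhan University in 2001. Her main research interests include survival analysis, semiparametric statistical inference, model structure selection for complex high-dimensional data, and statistical analysis techniques for big data. She has made short-term visits to, and worked at, the University of North Carolina at Chapel Hill (USA), Simon Fraser University (Canada), The Hong Kong Polytechnic University, The Chinese University of Hong Kong, and the University of Greifswald (Germany), among others. She has led and completed six projects funded by the National Natural Science Foundation of China and the Ministry of Education, and has published more than sixty SCI-indexed research papers in journals including Journal of Machine Learning Research, Biometrics, Biostatistics, Genetics, and Lifetime Data Analysis. She currently serves as an associate editor of the international statistics journal Statistical Papers, an associate editor of 数理统计与管理 (Journal of Applied Statistics and Management) (2022.01-2025.12), an executive council member of the 11th Council of the Chinese Association for Applied Statistics (中国现场统计学会), and a member of the Working Committee for Women Experts of the Chinese Mathematical Society.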
All interested faculty and students are welcome to attend!