BSAN 730: Large Scale Data Analysis (graduate) - Spring 2021 - 2025
In this course I focus on the statistical analysis of large-scale data. Students learn how some well-known statistical tools can be adapted to the analysis of Big Data, and how the limitations of classical tools have driven the development of modern techniques for data analysis. I cover topics such as split-and-conquer techniques for variable selection, the scalable bootstrap, and conformal inference, and I give a gentle introduction to large-scale multiple testing. The course relies heavily on computer programming in R, and the emphasis is primarily on business applications. Special thanks to the guest speakers (Weinan Wang, Aniruddha Neogi, Joshua Derenski, Bradley Rava, Jacob Dice, Sara Almohtasib), who have each given a lecture in this class and shared their unique experiences in managing and analyzing large data.