The Center for Big Data Analytics (CBDA) is an interdisciplinary research center focusing on large-scale data analysis.

Over the past decade, off-the-shelf statistical machine learning methods have frequently proved insufficient for modern data settings. These settings pose three key challenges, which largely come under the rubric of “big data”: (a) the data might have a large number of features, which we call “big-p” data because the dimension p of the data is large; (b) the data might have a large number of instances, which we call “big-n” data because the number of samples n is large; or (c) the data types might be complex, such as permutations, strings, or graphs, which typically lie in some large and complex discrete space.

The center is engaged in fundamental research on all of the above aspects of big data, developing novel computational and statistical analyses of massive and complex data sets that arise in varied scientific and industrial applications. Its interdisciplinary research brings together computer scientists, applied mathematicians, and statisticians who work on applications in diverse areas such as network analysis, predictive modeling, cancer genomics, and bioinformatics.

The research themes of the center thus revolve around mathematical modeling of big data, new statistical methods for the analysis of high-dimensional data, fast and memory-efficient algorithms that leverage new parallel computer architectures, and, more broadly, data-analytic solutions for challenging applications.