Recovery Guarantees for Kernel-based Clustering under Non-parametric Mixture Models

Despite the ubiquity of kernel-based clustering, surprisingly few statistical guarantees exist beyond settings with strong structural assumptions on the data-generating process. In this work, we take a step towards bridging this gap by …

On the optimality of kernels for high-dimensional clustering

This paper studies the optimality of kernel methods for high-dimensional data clustering. Recent works have analysed the large-sample performance of kernel clustering in the high-dimensional regime, where the Euclidean distance becomes less informative. …
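The claim that "Euclidean distance becomes less informative" in high dimensions can be seen numerically: pairwise distances between random points concentrate around a common value as the dimension grows, so their relative contrast shrinks. The sketch below (standard Gaussian data and the specific contrast statistic are my own illustrative assumptions, not taken from the paper) makes this concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_contrast(dim, n=200):
    """Relative contrast (max - min) / min of the pairwise Euclidean
    distances among n standard-Gaussian points in `dim` dimensions."""
    x = rng.standard_normal((n, dim))
    # squared pairwise distances via the Gram matrix
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    d = np.sqrt(np.maximum(d2, 0.0))
    vals = d[np.triu_indices(n, k=1)]
    return (vals.max() - vals.min()) / vals.min()

# contrast drops sharply as the dimension grows
for dim in (2, 20, 2000):
    print(dim, distance_contrast(dim))
```

In low dimension the nearest and farthest neighbours of a point look very different; in high dimension all distances are nearly equal, which is precisely why naive distance-based (and hence kernel-based) clustering degrades without further structure.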

Statistical vs computational limits of kernels.

Surprisingly (or rather unsurprisingly), studying the interplay between the statistical and computational properties of algorithms is not commonplace in machine learning, even though two facts are glaringly obvious: 1) without computational constraints, exhaustive search or maximum-likelihood approaches can always find the right solution, if one exists (statistical optimality); 2) without statistical guarantees, computationally efficient algorithms are mere heuristics and cannot be deployed in high-stakes real-world applications. The focus of this project is to contribute to bridging this gap.

Neural networks as optimization toolboxes.

The idea of using neural networks for optimisation is not particularly novel; a few papers (1, 2) from the 1980s and 1990s already touched upon it. Having said that, the idea has not received the attention it deserves from the ML community. For instance, many traditional unsupervised learning methods amount to minimising an objective function, with or without constraints (for instance, clustering, dimensionality reduction, ordinal embedding).
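To make the "unsupervised learning as objective minimisation" view concrete, here is a minimal sketch (my own toy construction, not from the papers cited above): the k-means objective, smoothed with a soft-min so it becomes differentiable, minimised by plain gradient descent on the centroids — exactly the kind of computation a neural-network toolbox performs. The temperature, data, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: two well-separated Gaussian blobs in the plane
x = np.concatenate([rng.normal(-2.0, 0.3, (50, 2)),
                    rng.normal(+2.0, 0.3, (50, 2))])

def soft_kmeans(c, x, temp=0.5):
    """Soft-min k-means loss  -temp * sum_i log sum_k exp(-||x_i - c_k||^2 / temp)
    and its exact gradient with respect to the centroids c (shape (k, d))."""
    diff = x[:, None, :] - c[None, :, :]          # (n, k, d)
    d2 = (diff ** 2).sum(-1)                      # (n, k) squared distances
    s = -d2 / temp
    m = s.max(axis=1, keepdims=True)              # stabilised log-sum-exp
    loss = -temp * (m[:, 0] + np.log(np.exp(s - m).sum(axis=1))).sum()
    w = np.exp(s - m)
    w /= w.sum(axis=1, keepdims=True)             # soft cluster assignments
    grad = -2.0 * np.einsum('nk,nkd->kd', w, diff)
    return loss, grad

c = x[[0, 60]].copy()                             # init at one point per blob
for _ in range(300):
    loss, g = soft_kmeans(c, x)
    c -= 0.002 * g                                # plain gradient step
```

After a few hundred steps the centroids sit near the blob means; swapping the hand-written loop for an autodiff framework turns the same objective into a "neural network as optimiser" in the sense discussed above.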