Modal-set Estimation using kNN graphs, and Applications to Clustering



Date Posted: October 12, 2018

Date Recorded: October 12, 2018

Speaker(s): Samory Kpotufe

Brains, Minds and Machines Seminar Series

Associated CBMM Pages:

Brains, Minds + Machines Seminar Series: Modal-Set Estimation using kNN graphs, and Applications to Clustering

Description:

Samory Kpotufe, Princeton University

Estimating the mode or modal-sets (i.e. extrema points or surfaces) of an unknown density from a sample is a basic problem in data analysis. Such estimation is relevant to other problems such as clustering and outlier detection, or can simply serve to identify low-dimensional structures in high-dimensional data (e.g. point-cloud data from medical imaging, astronomy, etc.). Theoretical work on mode estimation has largely concentrated on understanding its statistical difficulty, while less attention has been given to implementable procedures. Thus, theoretical estimators, which are often statistically optimal, are for the most part hard to implement. Furthermore, for more general modal-sets (general extrema of any dimension and shape) much less is known, although various existing procedures (e.g. for manifold denoising or density-ridge estimation) have similar practical aims. I’ll present two related contributions of independent interest: (1) practical estimators of modal-sets, based on particular subgraphs of a k-NN graph, which attain minimax-optimal rates under surprisingly general distributional conditions; (2) high-probability finite-sample rates for k-NN density estimation, which is at the heart of our analysis. Finally, I’ll discuss recent successful work towards the deployment of these modal-set estimators for various clustering applications.

Much of the talk is based on a series of works with collaborators S. Dasgupta, K. Chaudhuri, U. von Luxburg, and Heinrich Jiang.
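
For readers unfamiliar with k-NN density estimation, the following is a minimal sketch of the standard textbook estimator, not the speaker's implementation. It estimates the density at each sample point as k / (n · v_d · r_k^d), where r_k is the distance to the k-th nearest neighbor and v_d is the volume of the unit ball in d dimensions, and then takes the sample point with the highest estimate as a simple single-mode estimate. The function name `knn_density`, the synthetic Gaussian data, and the choices n = 2000 and k = 100 are illustrative assumptions; the modal-set estimators described in the talk use subgraphs of the k-NN graph and are not reproduced here.

```python
import math
import numpy as np


def knn_density(points, k):
    """k-NN density estimate at each sample point: k / (n * v_d * r_k^d).

    Assumption: r_k is the distance to the k-th nearest *other* sample point.
    """
    n, d = points.shape
    # Pairwise Euclidean distances (O(n^2) memory; fine for a small sketch).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest other point.
    r_k = np.sort(dists, axis=1)[:, k]
    # Volume of the unit ball in d dimensions.
    unit_ball_volume = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return k / (n * unit_ball_volume * r_k ** d)


# Synthetic data (an assumption for illustration): a 2-D Gaussian
# whose true mode is at (2, -1).
rng = np.random.default_rng(0)
sample = rng.normal(loc=[2.0, -1.0], scale=1.0, size=(2000, 2))

density = knn_density(sample, k=100)
# Single-mode estimate: the sample point with the largest estimated density.
mode_estimate = sample[np.argmax(density)]
```

Because the estimate is a decreasing function of r_k, picking the point with the highest estimated density is the same as picking the point with the smallest k-NN radius, so the mode estimate needs only the neighbor distances.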
