Skip navigation
Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp010z708z75q
Full metadata record
DC FieldValueLanguage
dc.contributorSinger, Amit-
dc.contributor.advisorKpotufe, Samory-
dc.contributor.authorJiang, Hanxi Heinrich-
dc.date.accessioned2015-06-15T15:53:53Z-
dc.date.available2015-06-15T15:53:53Z-
dc.date.created2015-05-04-
dc.date.issued2015-06-15-
dc.identifier.urihttp://arks.princeton.edu/ark:/88435/dsp010z708z75q-
dc.description.abstractWe present a new clustering algorithm based on first estimating the modes of the unknown density f from sample points generated by f. Our notion of mode is generalized to include points with densities similar to that of a local mode of f. Thus, we first identify sample points which can be confidently clustered together. We call these the cluster cores. The next step then consists of assigning the remaining points to cluster cores. We make mild assumptions on the underlying distribution and the shapes of the clusters. We develop mode estimation procedures based on the k-NN density estimate and use high probability finite-sample uniform bounds on the k-NN density estimator to obtain theoretically strong guarantees. The procedure can be implemented in O(nk log n) time. We show that the clustering algorithm is competitive against the state-of-art (including DBSCAN, mean-shift, and spectral clustering) based on simulations and real data experiments.en_US
dc.format.extent69 pagesen_US
dc.language.isoen_USen_US
dc.titleA practical clustering algorithm based on k-NN density mode estimation with provable learning boundsen_US
dc.typePrinceton University Senior Theses-
pu.date.classyear2015en_US
pu.departmentMathematicsen_US
pu.pdf.coverpageSeniorThesisCoverPage-
Appears in Collections:Mathematics, 1934-2020

Files in This Item:
File SizeFormat 
PUTheses2015-Jiang_Hanxi_Heinrich.pdf12.11 MBAdobe PDF    Request a copy


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.