Scaling-up model-based clustering algorithm by working on clustering features

Document Type

Conference paper

Source Publication

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Publication Date

1-1-2002

Volume

2412

First Page

569

Last Page

575

Publisher

Springer Verlag

Abstract

In this paper, we propose EMACF (Expectation-Maximization Algorithm for Clustering Features), which generates clusters from data summaries rather than directly from data items. Combining it with an adaptive grid-based data summarization procedure, we establish a scalable clustering algorithm, gEMACF. Experimental results show that gEMACF generates more accurate results than other scalable clustering algorithms, and that it can run two orders of magnitude faster than the traditional expectation-maximization algorithm with little loss of accuracy.
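
As a rough illustration of what "data summaries" can mean in this setting, the sketch below groups points into grid cells and keeps one clustering feature per cell. It assumes a clustering feature is the triple (count, per-dimension sum, per-dimension sum of squares) and uses a fixed cell width, whereas the paper describes an adaptive grid. The function names are hypothetical and are not taken from the paper.

```python
def summarize_on_grid(points, cell_width):
    """Return one clustering feature per occupied grid cell.

    Assumption of this sketch: a clustering feature is
    [count, per-dimension sum, per-dimension sum of squares],
    and the grid has a fixed cell width (the paper's grid is adaptive).
    """
    features = {}
    for point in points:
        # Index of the grid cell that contains this point.
        cell = tuple(int(x // cell_width) for x in point)
        if cell not in features:
            dims = len(point)
            features[cell] = [0, [0.0] * dims, [0.0] * dims]
        cf = features[cell]
        cf[0] += 1
        for d, x in enumerate(point):
            cf[1][d] += x
            cf[2][d] += x * x
    return features


def cell_mean_and_variance(cf):
    """Recover the per-dimension mean and variance from one clustering feature."""
    count, linear_sum, square_sum = cf
    mean = [s / count for s in linear_sum]
    variance = [sq / count - m * m for sq, m in zip(square_sum, mean)]
    return mean, variance


points = [(0.1, 0.2), (0.3, 0.4), (2.5, 2.5)]
features = summarize_on_grid(points, cell_width=1.0)
mean, variance = cell_mean_and_variance(features[(0, 0)])
```

A clustering algorithm that works on such summaries only needs the count, mean, and variance of each cell, so its cost depends on the number of occupied cells rather than on the number of data items.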

DOI

10.1007/3-540-45675-9_86

Print ISSN

0302-9743

Publisher Statement

Copyright © Springer-Verlag Berlin Heidelberg 2002. Access to external full text or publisher's version may require subscription.

Additional Information

Paper presented at the 3rd International Conference on Intelligent Data Engineering and Automated Learning, Aug 12-14, 2002, Manchester, England.

ISBN of the source publication: 9783540440253

Full-text Version

Publisher’s Version

Language

English
