"Matryoshka: A HMM Based Temporal Data Clustering Methodology for Modeling System Dynamics"
Author(s)
Li, C
Biswas, G
Dale, M
Dale, P
Year published
2002
Abstract
This paper discusses a temporal data clustering system that is based on the Hidden Markov Model (HMM) methodology. The proposed methodology improves upon existing HMM clustering methods in two ways. First, an explicit HMM model size selection procedure is incorporated into the clustering process, i.e., the sizes of the individual HMMs are dynamically determined for each cluster. This improves the interpretability of cluster models, and the quality of the final clustering partition results. Second, a partition selection method is developed to ensure an objective, data-driven selection of the number of clusters in the partition. The result is a heuristic sequential search control algorithm that is computationally feasible. Experiments with artificially generated data and real world ecology data show that: (i) the HMM model size selection algorithm is effective in re-discovering the structure of the generating HMMs, (ii) the HMM clustering with model size selection significantly outperforms HMM clustering using uniform HMM model sizes for re-discovering clustering partition structures, (iii) it is able to produce interpretable and "interesting" models for real world data.
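As an illustration of what a model size selection step can look like, the sketch below picks the number of hidden states by a penalized likelihood score. It rests on assumptions: the abstract does not name the selection criterion, so the Bayesian Information Criterion (BIC) is used here as a stand-in, the HMMs are taken to have discrete outputs, and the function names and log-likelihood values are hypothetical, invented only for the example.

```python
import math


def hmm_free_parameters(n_states, n_symbols):
    """Count free parameters of a discrete-output HMM.

    Each probability row sums to 1, so it contributes (length - 1) free values.
    """
    initial = n_states - 1                    # initial state distribution
    transitions = n_states * (n_states - 1)   # one row per state
    emissions = n_states * (n_symbols - 1)    # one row per state
    return initial + transitions + emissions


def bic_score(log_likelihood, n_states, n_symbols, n_observations):
    """BIC in 'higher is better' form: log-likelihood minus a complexity penalty."""
    k = hmm_free_parameters(n_states, n_symbols)
    return log_likelihood - 0.5 * k * math.log(n_observations)


def select_model_size(log_likelihoods, n_symbols, n_observations):
    """Return the state count with the best BIC.

    log_likelihoods maps a candidate number of states to the log-likelihood
    of an HMM of that size already fitted to the cluster's data.
    """
    return max(
        log_likelihoods,
        key=lambda n: bic_score(log_likelihoods[n], n, n_symbols, n_observations),
    )


# Hypothetical log-likelihoods for HMMs with 2 to 5 states (illustrative only).
candidate_fits = {2: -520.0, 3: -480.0, 4: -476.0, 5: -474.0}
best_size = select_model_size(candidate_fits, n_symbols=4, n_observations=500)
print(best_size)
```

With these made-up values the likelihood keeps improving as states are added, but the gain beyond three states is too small to offset the penalty, so the three-state model is selected. Fitting the candidate HMMs themselves (for example with Baum-Welch) is outside the scope of this sketch.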
Journal Title
Intelligent Data Analysis
Volume
6
Issue
3
Subject
Data management and data science
Cognitive and computational psychology