Clustering of unevenly sampled gene expression time-series data

C. S. Möller-Levet, F. Klawonn, K. H. Cho*, H. Yin, O. Wolkenhauer

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

56 Scopus citations

Abstract

Time-course measurements are becoming a common type of microarray experiment. The temporal order of the data and the varying length of the sampling intervals are important and should be taken into account when clustering time-series. However, the shortness of gene expression time-series data limits the use of conventional statistical models and techniques for time-series analysis. To address this problem, this paper proposes the fuzzy short time-series (FSTS) clustering algorithm, which clusters profiles based on the similarity of their relative change of expression level and the corresponding temporal information. One of the major advantages of fuzzy clustering is that genes can belong to more than one group, revealing distinctive features of each gene's function and regulation. Several examples are provided to illustrate the performance of the proposed algorithm. In addition, we validate the algorithm by clustering the genes which define the model profiles in Chu et al. (Science, 282 (1998) 699). The FSTS algorithm is compared with fuzzy c-means, k-means, average linkage hierarchical clustering and random clustering. Performance is evaluated with a well-established cluster validity measure, which indicates that the FSTS algorithm performs better than the compared algorithms at clustering similar rates of change of expression across successive, unevenly distributed time points. Moreover, the FSTS algorithm clustered the genes defining the model profiles in a biologically meaningful way.
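The idea of comparing profiles by their rate of change over unevenly spaced time points can be illustrated with a minimal Python sketch. The abstract does not state the distance or membership formulas, so everything here is an assumption: the distance is taken to be the sum of squared differences between piecewise slopes, the memberships follow the standard fuzzy c-means rule with fuzzifier m, and the names slopes, slope_distance_sq and fuzzy_memberships are hypothetical. It is not the authors' implementation.

```python
from typing import List, Sequence


def slopes(values: Sequence[float], times: Sequence[float]) -> List[float]:
    """Rate of change between successive (possibly unevenly spaced) time points."""
    if len(values) != len(times) or len(values) < 2:
        raise ValueError("need equally long sequences with at least two points")
    return [
        (values[k + 1] - values[k]) / (times[k + 1] - times[k])
        for k in range(len(values) - 1)
    ]


def slope_distance_sq(x: Sequence[float], v: Sequence[float],
                      times: Sequence[float]) -> float:
    """Assumed distance: sum of squared differences between piecewise slopes."""
    return sum((a - b) ** 2 for a, b in zip(slopes(x, times), slopes(v, times)))


def fuzzy_memberships(profile: Sequence[float],
                      prototypes: Sequence[Sequence[float]],
                      times: Sequence[float],
                      m: float = 2.0) -> List[float]:
    """Standard fuzzy c-means membership rule applied to the slope distance."""
    d2 = [slope_distance_sq(profile, p, times) for p in prototypes]
    # A profile that coincides with one or more prototypes belongs fully to them.
    exact = [i for i, d in enumerate(d2) if d == 0.0]
    if exact:
        return [1.0 / len(exact) if i in exact else 0.0 for i in range(len(d2))]
    weights = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]


# Unevenly spaced sampling times (illustrative values, not from the paper).
times = [0.0, 1.0, 3.0, 7.0]
rising = [0.0, 1.0, 3.0, 7.0]    # slope +1 on every interval
shifted = [2.0, 3.0, 5.0, 9.0]   # same slopes, different absolute level
falling = [7.0, 6.0, 4.0, 0.0]   # slope -1 on every interval
gene = [0.0, 0.5, 1.5, 3.5]      # slope +0.5 on every interval

same_shape = slope_distance_sq(rising, shifted, times)
memberships = fuzzy_memberships(gene, [rising, falling], times)
print(same_shape, memberships)
```

Because only slopes are compared, two profiles with the same shape but different absolute expression levels have zero distance, and dividing by the interval length is what lets uneven sampling enter the comparison. The membership values sum to one across prototypes, so a gene can belong partly to more than one cluster.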

Original language: English
Pages (from-to): 49-66
Number of pages: 18
Journal: Fuzzy Sets and Systems
Volume: 152
Issue number: 1
State: Published - 16 May 2005
Externally published: Yes

Keywords

  • Fuzzy clustering
  • Gene expression
  • Short time series
  • Unevenly sampled
