Open Access Research Article

Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives

S Déjean1*, PGP Martin2, A Baccini1 and P Besse1

Author Affiliations

1 Laboratoire de Statistique et Probabilités, UMR 5583, Université Paul Sabatier, Toulouse Cedex 9 31062, France

2 Laboratoire de Pharmacologie et Toxicologie, UR 66, Institut National de la Recherche Agronomique (INRA), 180 Chemin de Tournefeuille, BP 3, Toulouse Cedex 9 31931, France

EURASIP Journal on Bioinformatics and Systems Biology 2007, 2007:70561 doi:10.1155/2007/70561


The electronic version of this article is the complete one and can be found online at: http://bsb.eurasipjournals.com/content/2007/1/70561


Received: 14 December 2006
Revisions received: 6 March 2007
Accepted: 16 May 2007
Published: 18 June 2007

© 2007 S. Déjean et al.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Microarray data acquired during time-course experiments allow temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse, and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of the temporal profiles of gene expression. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression. Specifically, we combined spline smoothing and first-derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters were illustrated a posteriori through principal component analysis and heatmap visualization. Most results were found to be in agreement with the literature on the effects of fasting on the mouse liver, and they provide promising directions for future biological investigations.
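As a rough illustration of the shape-based approach summarized above, the following Python sketch smooths each expression profile with a cubic spline, evaluates the first derivative on a common grid, and applies hierarchical clustering to the derivative curves. It is not the authors' implementation: the 11 time points, the synthetic profiles, the smoothing value, and the helper names `derivative_profiles` and `cluster_by_shape` are all assumptions made for the example. Note also that SciPy's `UnivariateSpline` controls smoothness through a bound `s` on the residual sum of squares rather than through a roughness penalty weight, so it only stands in for the smoothing spline used in the study.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.interpolate import UnivariateSpline


def derivative_profiles(times, expr, smooth, n_grid=50):
    """Fit a cubic smoothing spline to each row of `expr` and return
    its first derivative evaluated on a regular grid of `n_grid` points."""
    grid = np.linspace(times[0], times[-1], n_grid)
    derivs = np.empty((expr.shape[0], n_grid))
    for i, profile in enumerate(expr):
        # `s` bounds the residual sum of squares (assumed value, not tuned).
        spline = UnivariateSpline(times, profile, k=3, s=smooth)
        derivs[i] = spline.derivative(1)(grid)
    return grid, derivs


def cluster_by_shape(derivs, n_clusters):
    """Ward hierarchical clustering of the derivative curves,
    cut so that at most `n_clusters` groups are returned."""
    tree = linkage(derivs, method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical design: 11 time points between 0 and 72 hours.
times = np.array([0, 3, 6, 9, 12, 18, 24, 36, 48, 60, 72], dtype=float)

# Synthetic data: three rising and three falling genes whose absolute
# expression levels differ (offsets) while their shapes are shared.
rng = np.random.default_rng(0)
rising = np.tanh(times / 12.0)
offsets = (0.0, 2.0, 5.0)
expr = np.vstack(
    [rising + off + rng.normal(0.0, 0.02, times.size) for off in offsets]
    + [-rising + off + rng.normal(0.0, 0.02, times.size) for off in offsets]
)

grid, derivs = derivative_profiles(times, expr, smooth=0.02)
labels = cluster_by_shape(derivs, n_clusters=2)
print(labels)
```

Because differentiation removes any constant offset, genes with the same temporal shape but different baseline levels end up close together, which is the motivation for clustering derivatives rather than raw profiles. A partitioning method such as k-means could be applied to `derivs` in the same way.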
