Research Article
Assessing the Exceptionality of Coloured Motifs in Networks
1 Institut National de la Recherche Agronomique (INRA), UR1077, Unité Mathématique, Informatique et Génome, 78352 Jouy-en-Josas, France
2 Centre for Genomic Regulation (CRG), Genome Bioinformatics Group, Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
3 Université de Lyon, 69000 Lyon, France
4 Laboratoire de Biométrie et Biologie Évolutive, Université Claude Bernard Lyon 1, CNRS/UMR 5558, 69622 Villeurbanne, France
5 Projet BAMBOO, Institut National de Recherche Informatique et en Automatique (INRIA) Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot Saint-Martin, France
EURASIP Journal on Bioinformatics and Systems Biology 2009, 2009:616234 doi:10.1155/2009/616234
Published: 26 October 2008Abstract
Various methods have been recently employed to characterise the structure of biological
networks. In particular, the concept of network motif and the related one of coloured
motif have proven useful to model the notion of a functional/evolutionary building
block. However, algorithms that enumerate all the motifs of a network may produce
a very large output, and methods to decide which motifs should be selected for downstream
analysis are needed. A widely used method is to assess if the motif is exceptional,
that is, over- or under-represented with respect to a null hypothesis. Much effort
has been put in the last thirty years to derive
-values for the frequencies of topological motifs, that is, fixed subgraphs. They
rely either on (compound) Poisson and Gaussian approximations for the motif count
distribution in Erdös-Rényi random graphs or on simulations in other models. We focus
on a different definition of graph motifs that corresponds to coloured motifs. A coloured
motif is a connected subgraph with fixed vertex colours but unspecified topology.
Our work is the first analytical attempt to assess the exceptionality of coloured
motifs in networks without any simulation. We first establish analytical formulae
for the mean and the variance of the count of a coloured motif in an Erdös-Rényi random
graph model. Using simulations under this model, we further show that a Pólya-Aeppli
distribution better approximates the distribution of the motif count compared to Gaussian
or Poisson distributions. The Pólya-Aeppli distribution, and more generally the compound
Poisson distributions, are indeed well designed to model counts of clumping events.
Altogether, these results enable to derive a
-value for a coloured motif, without spending time on simulations.



