PLoS ONE (Jan 2012)

Random phenotypic variation of yeast (Saccharomyces cerevisiae) single-gene knockouts fits a double pareto-lognormal distribution.

  • John H Graham,
  • Daniel T Robb,
  • Amy R Poe

DOI
https://doi.org/10.1371/journal.pone.0048964
Journal volume & issue
Vol. 7, no. 11
p. e48964

Abstract

Read online

Distributed robustness is thought to influence the buffering of random phenotypic variation through the scale-free topology of gene regulatory, metabolic, and protein-protein interaction networks. If this hypothesis is true, then the phenotypic response to the perturbation of particular nodes in such a network should be proportional to the number of links those nodes make with neighboring nodes. This suggests a probability distribution approximating an inverse power-law of random phenotypic variation. Zero phenotypic variation, however, is impossible, because random molecular and cellular processes are essential to normal development. Consequently, a more realistic distribution should have a y-intercept close to zero in the lower tail, a mode greater than zero, and a long (fat) upper tail. The double Pareto-lognormal (DPLN) distribution is an ideal candidate distribution. It consists of a mixture of a lognormal body and upper and lower power-law tails.If our assumptions are true, the DPLN distribution should provide a better fit to random phenotypic variation in a large series of single-gene knockout lines than other skewed or symmetrical distributions. We fit a large published data set of single-gene knockout lines in Saccharomyces cerevisiae to seven different probability distributions: DPLN, right Pareto-lognormal (RPLN), left Pareto-lognormal (LPLN), normal, lognormal, exponential, and Pareto. The best model was judged by the Akaike Information Criterion (AIC).Phenotypic variation among gene knockouts in S. cerevisiae fits a double Pareto-lognormal (DPLN) distribution better than any of the alternative distributions, including the right Pareto-lognormal and lognormal distributions.A DPLN distribution is consistent with the hypothesis that developmental stability is mediated, in part, by distributed robustness, the resilience of gene regulatory, metabolic, and protein-protein interaction networks. Alternatively, multiplicative cell growth, and the mixing of lognormal distributions having different variances, may generate a DPLN distribution.