Mani et al. 10.1073/pnas.0712255105.
Fig. 3. Coefficients of variation for Study J data. Plot a shows the coefficient of variation versus the mean for each single and double mutant studied in Study J. Plot b shows a high-density region of plot a, superimposed with a running median calculated with a window size of 30.
Fig. 4. Comparing distributions of |eMin|, |eProduct|, |eLog|, and |eAdditive| using data from Study J. (a) Pairs from Study J for which the single and double mutants exhibited slower than wild-type growth. (b) The subset of pairs from a for which the phenotypes of both single mutants were also >0.9. (c) The subset of pairs from a for which the phenotypes of both single mutants were <0.9 but >0.75. (d) The subset of pairs for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.
Fig. 5. Comparing distributions of eMin, eProduct, eLog, and eAdditive, using data from Study S in the absence of MMS. (a) All pairs from Study S (MMS-). (b) The subset of pairs from a for which the phenotypes of both single mutants were also >0.9. (c) The subset of pairs from a for which the phenotypes of both single mutants were <0.9.
Fig. 6. Comparing distributions of eMin, eProduct, eLog, and eAdditive using data from Study S in the presence of MMS. (a) All pairs from Study S. (b) The subset of pairs from a for which the phenotypes of both single mutants were <0.9 but >0.75. (c) The subset of pairs from a for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.
Fig. 7. Comparing distributions of |eMin|, |eProduct|, |eLog|, and |eAdditive| using data from Study S in the absence of MMS. (a) All pairs from Study S (MMS-). (b) The subset of pairs from a for which the phenotypes of both single mutants were also >0.9. (c) The subset of pairs from a for which the phenotypes of both single mutants were <0.9.
Fig. 8. Comparing distributions of |eMin|, |eProduct|, |eLog|, and |eAdditive| using data from Study S in the presence of MMS. (a) All pairs from Study S. (b) The subset of pairs from a for which the phenotypes of both single mutants were <0.9 but >0.75. (c) The subset of pairs from a for which the phenotype of at least one of the single mutants was <0.75, and the phenotype of the other was <0.9.
Fig. 9. Differences in e between Min, Product, Log, and Additive definitions. The six plots show differences in e between each pair of definitions for various single-mutant fitness values Wx and Wy: (a) Min and Product definitions, (b) Min and Log definitions, (c) Min and Additive definitions, (d) Product and Log definitions, (e) Product and Additive definitions, and (f) Log and Additive definitions. Dark plot regions indicate that the pair of definitions agree closely on the double-mutant fitness predicted for a non-interacting gene pair. These plots show that all four definitions produce identical results when at least one of the single mutants has wild-type fitness. They also show that Product and Log definitions yield practically equivalent results under all circumstances.
Fig. 10. Comparison of Product and Log definitions. This plot shows Fig. 9d with more resolution on the contours.
Fig. 11. Distributions of e for interactions reported by Study T and Study P. Plots show how interactions identified in Study T (plot a) and Study P (plots b and c) map onto the corresponding e calculated for the Product definition in Study S. Plot b shows only Study Psevere interactions. With this restriction, both Study T and Study Psevere (which use the Min definition) generally find more severe synthetic interactions that are also identified by the Product definition. Plot c shows only Study Pslight interactions. Those interactions reported by Study Pslight (which used the Min definition) and not by Study SProduct are near e = 0, suggesting that the differences between these studies arise from the definition of genetic interaction.
Fig. 12. Comparison of number of synthetic interactions reported in Study P and Study T. Results either include (Upper) or exclude (Lower) the SF and SF-slight interaction classes in Study P that were most affected by interaction definition choice, and consider only gene pairs tested in both studies (see Table 1).
SI Text
Computing distributions of e
For subsets of gene pairs, medians and median absolute deviations (defined as the median of the absolute deviation of the samples from sample median) were computed using the "median" and "mad" programs in MATLAB (MathWorks). For subsets of pairs, the standard error on the median was computed using 10,000 bootstrapped samples obtained using the "bootstrp" program in MATLAB. Distributions of e were plotted by computing histograms with a bin width of 0.04.
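The summary statistics described above can be sketched outside MATLAB as follows. This is a minimal illustration in Python using only the standard library, not the code used in the study; the function names, the fixed random seed, and the choice to align bin edges at multiples of the bin width are assumptions made for the example.

```python
import math
import random
import statistics
from collections import Counter


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    center = statistics.median(values)
    return statistics.median([abs(v - center) for v in values])


def bootstrap_se_of_median(values, n_boot=10_000, seed=0):
    """Standard error of the median from n_boot resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    n = len(values)
    medians = [statistics.median(rng.choices(values, k=n)) for _ in range(n_boot)]
    return statistics.stdev(medians)


def histogram(values, bin_width=0.04):
    """Counts per bin; bin k covers [k*bin_width, (k+1)*bin_width).

    Aligning bin edges at multiples of bin_width is an assumption;
    the text states only the bin width.
    """
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {round(k * bin_width, 10): c for k, c in sorted(counts.items())}


# Hypothetical e values for a small subset of gene pairs.
eps = [-0.10, -0.02, 0.0, 0.01, 0.03, 0.05, 0.30]
summary = (statistics.median(eps), mad(eps), bootstrap_se_of_median(eps))
bins = histogram(eps)
```

The median and median absolute deviation are used here, as in the text, because they are less sensitive to outlying e values than the mean and standard deviation.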
Measuring confidence intervals
Confidence intervals reported for confirmation and rejection rates were computed using the Clopper and Pearson exact method as implemented by the "binofit" program in MATLAB.
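For readers without MATLAB, the Clopper and Pearson exact interval can be sketched by inverting the binomial tail probabilities directly. The Python code below is an illustrative stand-in, not the "binofit" implementation; the function names and the use of bisection (rather than beta-distribution quantiles) are choices made for this example.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, iters=200):
    """Bisection root of a function g that increases from g(0) < 0 to g(1) > 0."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Hypothetical example: 7 confirmed interactions out of 20 tested.
low, high = clopper_pearson(7, 20)
```

The interval is called exact because it is built from the binomial distribution itself rather than a normal approximation, which matters for the small counts typical of confirmation and rejection rates.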
Defining shared function or functional links
A set of additional studies used the methodology described in the main manuscript for defining functional links between genes (1-3).
Deriving the Product definition
The Product definition is obtained by applying a multiplicative neutrality function to the fitness measure W defined for a strain as
Deriving the Additive definition
The Additive definition is obtained by applying a multiplicative neutrality function to the fitness measure Fexp defined for a strain as
Now, applying the relative-growth-rate fitness definition, W, fitness values for strains x, y, and xy are
Deriving the Log definition
The Log definition is obtained by applying a multiplicative neutrality function to the fitness measure
Genetic interaction definitions used previously but not considered here
Some other fitness measures used in conjunction with the multiplicative neutrality function (8-11) have not been considered here, because we were unable to replicate the fitness scoring steps carried out in the corresponding studies across the quantitative datasets we wanted to compare in our work.
Bias shift for pairs involving extreme single-mutant fitness defects
As described in the main text, we observed a positive shift in the distribution of e under each definition for gene pairs involving an extremely deleterious mutation, relative to the distribution (for the same definition) for pairs involving less severely deleterious mutations. We cannot rule out the possibility that definitions of genetic interaction truly have a shifted bias in the context of more deleterious mutations, as has been recently suggested (12). Another explanation is the preferential existence of compensatory mutations in extremely slow-growing mutant strains. This is supported by the fact that the positive shift in e bias we observed in Study J was more severe than that observed in Study S. Specifically, the more extreme positive shift in e for Study J (relative to Study S) under every genetic interaction definition may be explained by compensatory mutations that have arisen more frequently in Study J strains. Strains carrying compensatory mutations (e.g., aneuploidy) may overtake a population over time and affect measured growth rates, and this effect will be more pronounced for populations of doubly mutant strains with more severe fitness defects. Study S was less prone to compensatory mutations occurring before, during, or after the meiosis combining the deletions. All of the parental strains with the nourseothricin resistance marker gene (Natr) were freshly generated from wild type and would have had less opportunity for aneuploidy or other compensatory mutation than strains from the preexisting library containing strains with the kanamycin resistance marker gene (Kanr). Furthermore, growth rates for Kanr and Natr strains corresponding to the same deleted gene were compared, and where differences were observed, the Kanr and/or Natr strain was rederived from wild type. If this did not correct the difference in growth rates, the corresponding gene was eliminated from further analyses.
Thus, Study S was less prone to compensatory mutations occurring before the meiotic combination of two deletion alleles. Furthermore, Study S strains were generated in the absence of MMS. Because no strains in Study S had severe growth defects in the absence of MMS, there was less selection for compensatory mutations both before and after the meiosis combining the two deletion alleles. In Study J, it was argued that compensatory mutations present before the meiosis would not lead to any bias, because the compensatory mutations would randomly segregate to wild-type, single-mutant, and double-mutant strains. Accepting this argument, the positive shift in e observed for extreme relative to moderate mutations could still be due to compensatory mutations occurring after the meiosis. Alternatively, mutations that suppress the growth defect of a mutation with which they were coselected may not impact growth similarly in the absence of that mutation. Thus, the fact that the study with the greater positive bias in e was also the study more prone to compensatory mutation is consistent with (but by no means proves) the idea that compensatory mutations are causing the bias.
Data
Difference between Product and Log definitions
For a given pair of deleterious single mutations, the predictions for double-mutant phenotype from the Product and Log definitions presented herein are practically indistinguishable. In SI Fig. 9d, we demonstrate this fact using a heatmap of the absolute difference between the double-mutant fitness predictions for the two definitions over all deleterious fitness values of Wx and Wy. In SI Fig. 10, we show the same plot but with the scale modified. It is clear from this plot that the numerical difference between the two definitions peaks at Wx = Wy = 0.5. The maximum difference in e between Product and Log definitions is 0.02155. Note that the difference between Log and Product definitions becomes significantly larger for advantageous mutations, so that Product and Log models are only practically equivalent for deleterious mutations.
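The size of this difference can be checked numerically with a simple grid search. The Python sketch below assumes the expected double-mutant fitness under each definition takes the forms given in its comments; these formulas are not restated in this SI text, so they should be read as assumptions for the example (the Log form in particular), and the function names and grid resolution are illustrative. Because e is the observed double-mutant fitness minus the expected value, a difference in expected values between two definitions equals the difference in e.

```python
from math import log2

# Assumed expected double-mutant fitness for a non-interacting pair
# under each definition (formulas are assumptions, not quoted from this text).
def expected_min(wx, wy):
    return min(wx, wy)

def expected_product(wx, wy):
    return wx * wy

def expected_additive(wx, wy):
    return wx + wy - 1

def expected_log(wx, wy):
    return log2((2**wx - 1) * (2**wy - 1) + 1)


def max_product_log_gap(steps=100):
    """Grid search over deleterious fitness values 0 <= Wx, Wy <= 1.

    Returns (largest |Product - Log| difference, Wx, Wy) found on the grid.
    """
    best = (0.0, None, None)
    for i in range(steps + 1):
        for j in range(steps + 1):
            wx, wy = i / steps, j / steps
            gap = abs(expected_product(wx, wy) - expected_log(wx, wy))
            if gap > best[0]:
                best = (gap, wx, wy)
    return best


gap, wx_at_max, wy_at_max = max_product_log_gap()
```

Under these assumed formulas, the grid maximum is found at Wx = Wy = 0.5 and agrees with the 0.02155 value quoted above; setting Wy = 1 in any of the four functions returns Wx, consistent with the definitions agreeing when one single mutant has wild-type fitness.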
Overlap of Study J with Study T and Study P
Among the 45 pairs tested by both Study T and Study J is the pair NUP60-CTF18 that is labeled as synthetic by Study T. This pair has negative e for all four definitions. Among the 56 pairs tested by both Study J and Study P are two interactions labeled as synthetic by Study P: VAC14-CCR4 (interaction type SL/SF) and RAD52-RML2 (interaction type SF). The value of e in Study J is negative for all four definitions for VAC14-CCR4, but for RAD52-RML2 only the Min definition has a negative e.
1. Lee I, Date SV, Adai AT, Marcotte EM (2004) A probabilistic functional network of yeast genes. Science 306:1555-1558.
2. Wong SL, Zhang LV, Roth FP (2005) Discovering functional relationships: biochemistry versus genetics. Trends Genet 21:424-427.
3. Rual JF, et al. (2005) Towards a proteome-scale map of the human protein-protein interaction network. Nature 437:1173-1178.
4. St. Onge RP, et al. (2007) Systematic pathway analysis using high-resolution fitness profiling of combinatorial gene deletions. Nat Genet 39:199-206.
5. Szafraniec K, Wloch DM, Sliwa P, Borts RH, Korona R (2003) Small fitness effects and weak genetic interactions between deleterious mutations in heterozygous loci of the yeast Saccharomyces cerevisiae. Genet Res 82:19-31.
6. Jasnos L, Korona R (2007) Epistatic buffering of fitness loss in yeast double deletion strains. Nat Genet 39:550-554.
7. Sanjuan R, Elena SF (2006) Epistasis correlates to genomic complexity. Proc Natl Acad Sci USA 103:14402-14405.
8. Hartman JL, Tippery NP (2004) Systematic quantification of gene interactions by phenotypic array analysis. Genome Biol 5:R49.
9. Schuldiner M, et al. (2005) Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell 123:507-519.
10. Collins SR, Schuldiner M, Krogan NJ, Weissman JS (2006) A strategy for extracting and analyzing large-scale quantitative epistatic interaction data. Genome Biol 7:R63.
11. Collins SR, et al. (2007) Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature 446:806-810.
12. Beerenwinkel N, et al. (2007) Analysis of epistatic interactions and fitness landscapes using a new geometric approach. BMC Evol Biol 7:60.