Schmid, Karl
- Department of Plant Biology, Swedish University of Agricultural Sciences
Sliding-window analysis has been widely used to uncover synonymous (silent, dS) and nonsynonymous (replacement, dN) rate variation along the protein sequence and to detect regions of a protein under selective constraint (indicated by dN < dS) or positive selection (indicated by dN > dS). The approach compares two or more protein-coding genes and plots the estimates d̂S and d̂N from each sliding window along the sequence. Here we demonstrate that the approach produces artifactual trends of synonymous and nonsynonymous rate variation, with greater variation in d̂S than in d̂N. Such trends are generated even if the true dS and dN are constant along the whole protein and different codons are evolving independently. Many published tests of negative and positive selection using sliding windows that we have examined appear to be invalid because they fail to correct for multiple testing. Likelihood ratio tests instead provide a more rigorous framework for detecting signals of natural selection affecting protein evolution. We demonstrate that a previous finding, that a particular region of the BRCA1 gene experienced a synonymous rate reduction driven by purifying selection, is likely an artifact of the sliding-window analysis. We evaluate various sliding-window analyses in molecular evolution, population genetics, and comparative genomics, and argue that the approach is not generally valid if it is not known a priori that a trend exists and if no correction for multiple testing is applied.
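The central claim of the abstract, that sliding windows show apparent peaks and troughs even when the underlying rate is constant and sites evolve independently, can be illustrated with a small simulation. The sketch below is not the authors' method: it does not estimate dS or dN, and instead assumes a deliberately simplified model in which each codon carries an independent 0/1 substitution indicator with the same probability. The function names, the rate of 0.2, the sequence length of 600 codons, and the window size of 50 are all hypothetical choices made for illustration.

```python
import random


def simulate_sites(n_codons, rate, rng):
    """Return one 0/1 substitution indicator per codon, all drawn with the same rate."""
    return [1 if rng.random() < rate else 0 for _ in range(n_codons)]


def sliding_window_means(values, window, step=1):
    """Return the mean of `values` in each window of length `window`, advanced by `step`."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    means = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        means.append(sum(chunk) / window)
    return means


# Assumed, illustrative parameters: a constant true rate along the whole sequence.
rng = random.Random(1)
true_rate = 0.2
sites = simulate_sites(600, true_rate, rng)
estimates = sliding_window_means(sites, window=50, step=1)

# The window estimates fluctuate around the constant true rate; because
# neighbouring windows share most of their sites, the fluctuations look smooth.
print("true rate:", true_rate)
print("lowest window estimate:", min(estimates))
print("highest window estimate:", max(estimates))
```

Plotting `estimates` against window position would give a smooth curve with apparent highs and lows, although nothing in the simulated data varies along the sequence. Testing every window separately for a departure from the overall rate would also amount to many correlated tests, which is the multiple-testing problem the abstract raises.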
PLoS ONE
2008, volume: 3, number: 11, pages: e3746
Publisher: Public Library of Science
Food Science
Forest Science
https://res.slu.se/id/publ/60364