1/30/2024

GTR model evolution

Accurate modelling of substitution processes in protein-coding sequences is often hampered by the computational burden associated with full codon models. Lately, codon partition models have been proposed as a viable alternative, mimicking the substitution behaviour of codon models at a low computational cost. Such codon partition models, however, impose independent evolution of the different codon positions, which is overly restrictive from a biological point of view. Given that empirical research has provided indications of context-dependent substitution patterns at four-fold degenerate sites, we take those indications into account in this paper. We present so-called context-dependent codon partition models to assess previous empirical claims that the evolution of four-fold degenerate sites is strongly dependent on the composition of their two flanking bases. To this end, we have estimated and compared various existing independent models, codon models, codon partition models and context-dependent codon partition models for the atpB and rbcL genes of the chloroplast genome, which are frequently used in plant systematics.

Such context-dependent codon partition models employ a full dependency scheme for four-fold degenerate sites, whilst maintaining the independence assumption for the first and second codon positions. We show that, in both the atpB and rbcL alignments of a collection of land plants, these context-dependent codon partition models significantly improve model fit over existing codon partition models. Using Bayes factors based on thermodynamic integration, we show that in both datasets the same context-dependent codon partition model yields the largest increase in model fit compared to an independent evolutionary model. Context-dependent codon partition models hence perform closer to codon models than existing codon partition models do. Codon models remain the best-performing models, but at a drastically increased computational cost, so context-dependent codon partition models remain computationally interesting alternatives to codon models. Finally, we observe that the substitution patterns in both datasets are drastically different, leading to the conclusion that combined analysis of these two genes using a single model may not be advisable from a context-dependent point of view.

While the modelling of evolutionary processes in non-coding sequences has received much attention from a context-dependence point of view in the last two decades, the same cannot be said for modelling approaches for coding sequences, at least not in terms of developed model-based approaches. With the advent of new evolutionary models and drastic increases in computational power during the past decades, with desktop machines becoming more powerful and computer clusters offering large numbers of processors (and processor cores) and vast amounts of memory, maximum likelihood and Bayesian MCMC approaches now allow very complex evolutionary models to be used in the analysis of large alignments. Probabilistic modelling of sequence evolution has become the norm in phylogenetic inference, but complex evolutionary models are often not used in studies on molecular evolution, in part due to their increased computational burden but mainly due to the absence of such models in popular model testing tools.

Indeed, while researchers are widely adopting model selection techniques in phylogenetics in order to select the model that best fits their dataset, this may be problematic when complex evolutionary models are not included in popular model selection tools, such as Modeltest. For example, in the case of analyzing protein-coding sequences, one needs to make sure to incorporate codon models as well as codon partition models in the model selection procedure. Chloroplast genes, such as atpB and rbcL (the subjects of the analyses in this paper), are protein-coding genes that are often analyzed in concatenated alignments with non-coding sequences, or on their own, using independent nucleotide models. While the choice for an independent model of evolution, such as the general time-reversible model (GTR) in combination with varying rates across sites (RAS) ...
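The site partitioning described above can be illustrated with a small sketch. It is not the authors' implementation: it only shows how the sites of an in-frame coding sequence might be split into first positions, second positions, ordinary third positions and four-fold degenerate third positions, recording the two flanking bases for the latter. The function name `classify_sites`, the partition labels, the use of the standard genetic code, and the separate `pos3` class for non-four-fold third positions are all assumptions made for illustration.

```python
# Codon prefixes (first two bases) whose third position is four-fold
# degenerate under the standard genetic code (assumed for this sketch):
# Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def classify_sites(cds):
    """Assign each site of an in-frame coding sequence to a partition.

    Returns a list of (index, partition, context) tuples. `partition` is one
    of 'pos1', 'pos2', 'pos3' or 'fourfold' (hypothetical labels). `context`
    is the (left, right) pair of flanking bases for four-fold degenerate
    sites and None otherwise; `right` is None at the end of the sequence.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    sites = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        sites.append((start, "pos1", None))
        sites.append((start + 1, "pos2", None))
        third = start + 2
        if codon[:2] in FOURFOLD_PREFIXES:
            # Left neighbour: second position of the same codon.
            # Right neighbour: first position of the next codon, if any.
            left = cds[third - 1]
            right = cds[third + 1] if third + 1 < len(cds) else None
            sites.append((third, "fourfold", (left, right)))
        else:
            sites.append((third, "pos3", None))
    return sites
```

In a context-dependent codon partition model of the kind described, only the sites labelled `fourfold` would use substitution parameters that depend on their context, while the first and second codon positions keep the independence assumption.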
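For readers unfamiliar with Bayes factors based on thermodynamic integration, the sketch below shows one common form of the calculation, which may differ from the exact scheme used in the paper. It assumes that, for each model, an MCMC sampler has already been run at a series of inverse temperatures beta between 0 (prior) and 1 (posterior) and that the mean log-likelihood at each beta has been recorded. The log marginal likelihood is then approximated by integrating those means over beta with the trapezoidal rule. The function names and the input layout are hypothetical.

```python
def log_marginal_likelihood(betas, mean_log_likelihoods):
    """Trapezoidal estimate of the integral over beta in [0, 1] of the
    expected log-likelihood under the power posterior at each beta."""
    if len(betas) != len(mean_log_likelihoods) or len(betas) < 2:
        raise ValueError("need at least two (beta, mean log-likelihood) pairs")
    # Sort by beta so the grid may be supplied in any order.
    pairs = sorted(zip(betas, mean_log_likelihoods))
    total = 0.0
    for (b0, e0), (b1, e1) in zip(pairs, pairs[1:]):
        total += 0.5 * (b1 - b0) * (e0 + e1)
    return total


def log_bayes_factor(betas, mean_ll_model1, mean_ll_model0):
    """Log Bayes factor of model 1 over model 0, assuming both models were
    sampled on the same beta grid."""
    return (log_marginal_likelihood(betas, mean_ll_model1)
            - log_marginal_likelihood(betas, mean_ll_model0))
```

A positive log Bayes factor favours model 1. The accuracy of the estimate depends on how finely the beta grid is spaced, especially near beta = 0, where the expected log-likelihood typically changes fastest.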