Selectoscope: A Modern Web-App for Positive Selection Analysis of Genomic Data

  • Andrey V. Zaika
  • Iakov I. Davydov
  • Mikhail S. Gelfand
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9683)


Selectoscope is a web application that combines a number of popular tools used to infer positive selection into an easy-to-use pipeline. A set of homologous DNA sequences is submitted to the server by uploading protein-coding gene sequences in FASTA format. The sequences are aligned and a phylogenetic tree is constructed. The codeml procedure from the PAML package is used first to adjust branch lengths and to find a starting point for the likelihood maximization; FastCodeML is then executed. Upon completion, branches and positions under positive selection are visualized simultaneously in the tree and alignment viewers. Run logs are accessible through the web interface. Selectoscope is based on Docker virtualization technology, which makes the application easy to install with negligible performance overhead. The application is highly scalable and can be used on a single PC or on a large high-performance cluster. The source code is freely available at
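The order of steps described in the abstract can be illustrated with a minimal Python sketch. It is not the Selectoscope implementation: the function names, the stage labels in `STAGES`, and the codon-length check (sequence length divisible by three) are assumptions introduced here for illustration, and each stage is a caller-supplied placeholder rather than a real call to an aligner, codeml, or FastCodeML.

```python
from typing import Callable, Dict, List, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if name in records:
                raise ValueError(f"duplicate FASTA header: {name!r}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name] += line.upper()
    return records


def non_codon_length(records: Dict[str, str]) -> List[str]:
    """Return headers whose sequence length is not a multiple of three.

    Assumption: this is a cheap sanity check for protein-coding input,
    not a check that Selectoscope is known to perform.
    """
    return [name for name, seq in records.items() if len(seq) % 3 != 0]


# Stage order as given in the abstract; the labels are hypothetical.
STAGES = ["align", "build_tree", "codeml_init", "fastcodeml", "visualize"]


def run_pipeline(
    records: Dict[str, str],
    steps: Dict[str, Callable[[dict], dict]],
) -> Tuple[dict, List[str]]:
    """Run the stages in order, passing a state dict from one to the next.

    Each entry in `steps` is a placeholder callable supplied by the caller;
    the returned log lists the stages that completed.
    """
    state: dict = {"sequences": records}
    log: List[str] = []
    for stage in STAGES:
        state = steps[stage](state)
        log.append(stage)
    return state, log
```

Passing the stages as callables keeps the ordering logic separate from the tools themselves, so any stage can be replaced by a stub when testing the flow.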


Keywords: Positive selection, Codeml, FastCodeML



This study was supported by the Scientific & Technological Cooperation Program Switzerland-Russia (RFBR grant 16-54-21004 and Swiss National Science Foundation project ZLRZ3_163872).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Andrey V. Zaika (1)
  • Iakov I. Davydov (3, 4)
  • Mikhail S. Gelfand (1, 2)

  1. A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
  2. Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
  3. Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
  4. Swiss Institute of Bioinformatics, Lausanne, Switzerland
