Genome Biol. 2013;14(10):R116.
doi: 10.1186/gb-2013-14-10-r116.

Open-Phylo: a customizable crowd-computing platform for multiple sequence alignment

Daniel Kwak et al. Genome Biol. 2013.

Abstract

Citizen science games such as Galaxy Zoo, Foldit, and Phylo aim to harness the intelligence and processing power generated by crowds of online gamers to solve scientific problems. However, the selection of the data to be analyzed through these games is under the exclusive control of the game designers, and so are the results produced by gamers. Here, we introduce Open-Phylo, a freely accessible crowd-computing platform that enables any scientist to enter our system and use crowds of gamers to assist computer programs in solving one of the most fundamental problems in genomics: the multiple sequence alignment problem.

Figures

Figure 1
Open-Phylo crowd-computing system. (1) Scientists upload their sequences to the database, validate the alignment puzzles built by the system (see green box in the data administration interface), or select new ones. (2) The same users monitor the progress of the crowd in improving their alignments, close puzzles, open new puzzles, and finally (3) download the best solutions. The crowd-computing engine is powered by (a) many casual gamers playing classic puzzles and (b) a smaller number of experienced players, who have access to larger and more difficult puzzles.
Figure 2
Performance of Open-Phylo using the casual or expert version of the Phylo video game. Ratio of puzzles improved by Open-Phylo for the scoring functions Ancestor (top left), MUSCLE (top right), GUIDANCE (bottom right) and T-Coffee (bottom left). The alignment program used to calculate the initial MSAs is indicated on the axis of the radar charts: Multiz (north), MUSCLE (west), PRANK (south) and T-Coffee (east). The area surrounded by a blue line corresponds to the performance achieved with the casual puzzles only, while the area surrounded by a red line indicates the performance of the expert version only. The area surrounded by a dashed green line shows the ratio of alignments improved by either the classic or expert version.
Figure 3
A multiple sequence alignment improved with the expert version of Phylo. (a) A section of the input alignment of the P53 gene calculated with MUSCLE. (b) The improved alignment obtained with the expert version of Phylo. Three nucleotides from the elephant sequence (loxAfr3) have been moved to increase the conservation of alignment columns 6, 32 and 33. The player also improved the alignment of columns 48 and 49 and revealed similarities not found in the original alignment. Image produced with Jalview [21].
Figure 4
Comparison of the improvements provided by the casual and expert versions. The ratio of optimal solutions obtained with the casual version is shown in the area surrounded by a blue line, and the ratio obtained with the expert version in red. Each radar chart corresponds to one of the objective functions: Ancestor (top left), MUSCLE (top right), GUIDANCE (bottom right) and T-Coffee (bottom left). The alignment program used to calculate the initial MSAs is indicated on the axis of the radar charts: Multiz (north), MUSCLE (west), PRANK (south) and T-Coffee (east).
Figure 5
Performance of the game scoring function in identifying the best alignments. The graphs show the distributions of the rank (calculated using the scoring function used in the game) of the best solutions found in the casual or classic game (that is, where casual submissions were inserted into the initial MSA and found to have the best score). Each histogram corresponds to a different objective function: Ancestor (top left), GUIDANCE (top right), MUSCLE (bottom left) and T-Coffee (bottom right).
Figure 6
Usage statistics for Phylo from 3 December 2012 until 3 April 2013. (a) Number of classic (in green) and expert puzzles (in red) completed over time. The number of classic puzzles submitted by registered players is shown in blue. (b) Number of visits to the Phylo website per country. (c) Number of registered players vs minimum number of puzzles completed. Statistics are shown in green for the classic version and in red for the expert version. (d) Social posts that led to a visit to Phylo. Social networks are shown in blue, social news services in red and blogs in green.
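
The Figure 3 caption describes a player moving nucleotides to increase the conservation of individual alignment columns. As a rough illustration of what "column conservation" can mean, the sketch below computes, for each column, the fraction of rows that share the most common non-gap character. This is a simplified toy measure chosen for illustration, not the scoring function used in Phylo or any of the objective functions named in the captions; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column fraction of rows sharing the most common non-gap character.

    Toy measure for illustration only (assumption: not Phylo's game score).
    """
    n_rows = len(alignment)
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have the same length")
    scores = []
    for col in range(length):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n_rows)
    return scores


# Hypothetical toy alignment: the third row has its gap in a poor position.
before = ["ACGT-A", "ACGT-A", "AC-GTA"]
# Moving the gap (the underlying sequence ACGTA is unchanged) lines up G and T.
after = ["ACGT-A", "ACGT-A", "ACGT-A"]

# A gap move must never alter the ungapped sequences themselves.
assert [r.replace("-", "") for r in before] == [r.replace("-", "") for r in after]

total_before = sum(column_conservation(before))
total_after = sum(column_conservation(after))
print(total_before < total_after)
```

Only gap positions differ between the two toy alignments, which mirrors the kind of move available in an alignment puzzle: residues can be shifted, but the sequences themselves stay fixed.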

References

    1. Blanchette M. Computation and analysis of genomic multi-sequence alignments. Annu Rev Genomics Hum Genet. 2007;8:193–213. doi: 10.1146/annurev.genom.8.080706.092300.
    2. Wang L, Jiang T. On the complexity of multiple sequence alignment. J Comput Biol. 1994;1:337–348. doi: 10.1089/cmb.1994.1.337.
    3. Notredame C. Recent evolutions of multiple sequence alignment algorithms. PLoS Comput Biol. 2007;3:e123. doi: 10.1371/journal.pcbi.0030123.
    4. Burge SW, Daub J, Eberhardt R, Tate J, Barquist L, Nawrocki E, Eddy SR, Gardner PP, Bateman A. Rfam 11.0: 10 years of RNA families. Nucleic Acids Res. 2013;41:D226–D232. doi: 10.1093/nar/gks1005.
    5. Kawrykow A, Roumanis G, Kam A, Kwak D, Leung C, Wu C, Zarour E, Sarmenta L, Blanchette M, Waldispühl J, Phylo players. Phylo: a citizen science approach for improving multiple sequence alignment. PLoS ONE. 2012;7:e31362. doi: 10.1371/journal.pone.0031362.
