BMC Bioinformatics. 2010 Nov 2;11:542.
doi: 10.1186/1471-2105-11-542.

CaGrid Workflow Toolkit: a Taverna based workflow tool for cancer grid

Wei Tan et al. BMC Bioinformatics. 2010.

Abstract

Background: In the biological and medical domains, web services have made data and computational functionality accessible in a unified manner, which has helped automate data pipelines that were previously run manually. Workflow technology is widely used to orchestrate multiple services in support of in-silico research. The Cancer Biomedical Informatics Grid (caBIG) is an information network that enables the sharing of cancer research resources, and caGrid is its underlying service-based computation infrastructure. CaBIG requires that services be composed and orchestrated in a given sequence to realize data pipelines, which are often called scientific workflows.

Results: CaGrid selected Taverna as its workflow execution system because of, among other reasons, its integration with web service technology, its support for a wide range of web services, and its plug-in architecture, which eases the integration of third-party extensions. The caGrid Workflow Toolkit (or the toolkit for short), an extension to the Taverna workflow system, is designed and implemented to ease building and running caGrid workflows. It supports users in the phases of working with workflows that the caGrid community has identified as challenging in a multi-institutional, cross-discipline domain: service discovery, composition and orchestration, data access, and secure service invocation.

Conclusions: By extending the Taverna Workbench, the caGrid Workflow Toolkit provides a comprehensive solution for composing and coordinating services in caGrid that would otherwise remain isolated and disconnected from each other. With it, users can access more than 140 services and are offered a rich set of features, including discovery of data and analytical services, query and transfer of data, security protection for service invocations, state management in service interactions, and sharing of workflows, experiences and best practices. The proposed solution is general enough to be applicable and reusable within other service-computing infrastructures that leverage a similar technology stack.
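
As a rough illustration of what composing services "in a given sequence" means, the following minimal Python sketch chains three steps into one pipeline. It is not Taverna or caGrid code: `compose_pipeline`, `query_data`, `preprocess`, and `summarize` are hypothetical local stand-ins for remote services, and the returned values are made-up sample data.

```python
from typing import Any, Callable, Sequence

# A workflow step takes the previous step's output and returns its own output.
Step = Callable[[Any], Any]


def compose_pipeline(steps: Sequence[Step]) -> Step:
    """Return a function that runs the given steps in order."""
    def run(data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data
    return run


# Hypothetical stand-ins for remote services (assumptions, not caGrid APIs).
def query_data(query: str) -> list:
    # A real data service would evaluate the query; here we return fixed values.
    return [3.0, 1.0, 2.0]


def preprocess(values: list) -> list:
    # Scale every value by the maximum so the largest becomes 1.0.
    top = max(values)
    return [v / top for v in values]


def summarize(values: list) -> dict:
    # Reduce the preprocessed values to a small summary record.
    return {"n": len(values), "mean": sum(values) / len(values)}


workflow = compose_pipeline([query_data, preprocess, summarize])
result = workflow("hypothetical query")
print(result)
```

Each step only sees the output of the step before it, which is the property that lets a workflow system replace a manually run data pipeline with an automated one.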

Figures

Figure 1
Architecture of the caGrid workflow toolkit. The solid rectangle comprises the components of the toolkit, and each component is numbered to illustrate its interactions with the environment.
Figure 2
All four plug-ins in the Taverna service panel. The caGrid workflow toolkit currently contains four plug-ins: cagrid-activity (caGrid service... and caGrid service from WSDL...) for service discovery, invocation and security enforcement; cql-builder (CQL Builder) for visual construction of CQL clauses to query data services; cagrid-transfer-activity (CaGrid Transfer Activity) for file transfers between clients and services; and cds-activity (CDS Activity) for credential delegation.
Figure 3
Service discovery GUI and the result. The search is for services whose description contains "array", that are hosted by NCICB, and that are annotated with concept code C44282.
Figure 4
Query the caArray data service and retrieve files [33]. CQL_Builder provides a GUI to build a complex CQL clause querying caArray files. CaGrid_Transfer_Activity downloads files to a local directory using the caGrid transfer utility.
Figure 5
CQL Builder used to construct a CQL query clause for the caArray service. The CQL Builder dialog provides a GUI to build a complex CQL clause querying data services. The Edit criterion dialog is used to build query criteria in CQL.
Figure 6
Invoke caGrid FQP securely and use credential delegation [34]. CDS_Activity issues an EPR of the delegated credential. FQP uses this EPR to fetch the actual delegated credential, also from CDS, and uses it to invoke multiple data services (the query activity) on behalf of the invoker.
Figure 7
Lymphoma type prediction workflow and the result. Microarray data is extracted from caArray, preprocessed, and used to learn a model for lymphoma type prediction. The result is a CSV file describing the actual lymphoma type of each tumor sample and the prediction results using the SVM and KNN algorithms, respectively.
Figure 8
The complete lymphoma type prediction workflow [35].
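
The three search criteria in the Figure 3 caption (description keyword, hosting center, concept code) can be read as a conjunctive filter over service metadata. The sketch below illustrates that reading over an in-memory list. It is an assumption-laden illustration, not the toolkit's discovery API: `ServiceRecord`, `find_services`, and all sample records are hypothetical, and only the values "array", "NCICB", and "C44282" come from the caption.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ServiceRecord:
    """Hypothetical, simplified view of one service's registered metadata."""
    name: str
    description: str
    hosting_center: str
    concept_codes: FrozenSet[str]


def find_services(
    records: List[ServiceRecord],
    keyword: Optional[str] = None,
    center: Optional[str] = None,
    concept_code: Optional[str] = None,
) -> List[ServiceRecord]:
    """Return records matching every criterion that is not None."""
    matches = []
    for rec in records:
        # Case-insensitive substring match on the free-text description.
        if keyword is not None and keyword.lower() not in rec.description.lower():
            continue
        if center is not None and rec.hosting_center != center:
            continue
        if concept_code is not None and concept_code not in rec.concept_codes:
            continue
        matches.append(rec)
    return matches


# Made-up sample records for illustration only.
records = [
    ServiceRecord("CaArraySvc", "Microarray data service for array experiments",
                  "NCICB", frozenset({"C44282"})),
    ServiceRecord("ImagingSvc", "Image archive data service",
                  "NCICB", frozenset({"C99999"})),
    ServiceRecord("ArrayMirror", "Mirror of array data",
                  "OtherCenter", frozenset({"C44282"})),
]

hits = find_services(records, keyword="array", center="NCICB", concept_code="C44282")
print([rec.name for rec in hits])
```

Leaving a criterion as `None` skips it, so the same function covers a keyword-only search as well as the fully constrained search shown in the figure.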

References

    1. Bhagat J, Tanoh F, Nzuobontane E, Laurent T, Orlowski J, Roos M, Wolstencroft K, Aleksejevs S, Stevens R, Pettifer S, BioCatalogue: a universal catalogue of web services for the life sciences. Nucl Acids Res. 2010. p. gkq394.
    2. Hull D, Wolstencroft K, Stevens R, Goble C, Pocock M, Li P, Oinn T. Taverna: a tool for building and running workflows of services. Nucleic Acids Research. 2006;34:W729–W732. doi: 10.1093/nar/gkl320.
    3. Von Eschenbach A, Buetow K. Cancer Informatics Vision: caBIG. Cancer Informatics. 2006;2:22.
    4. Saltz J, Kurc T, Hastings S, Langella S, Oster S, Ervin D, Sharma A, Pan T, Gurcan M, Permar J. et al. e-Science, caGrid, and Translational Biomedical Research. Computer. 2008;41:58–66. doi: 10.1109/MC.2008.459.
    5. Foster I. Globus Toolkit Version 4: Software for Service-Oriented Systems. Journal of Computer Science and Technology. 2006;21:513–520. doi: 10.1007/s11390-006-0513-y.
