Experiences with workflows for automating data-intensive bioinformatics

dc.contributor.author: Spjuth, Ola
dc.contributor.author: Bongcam-Rudloff, Erik
dc.contributor.author: Carrasco Hernández, Guillermo
dc.contributor.author: Forer, Lucas
dc.contributor.author: Giovacchini, Mario
dc.contributor.author: Valls Guimera, Roman
dc.contributor.author: Kallio, Aleksi
dc.contributor.author: Korpelainen, Eija
dc.contributor.author: Kanduła, Maciej M
dc.contributor.author: Krachunov, Milko
dc.contributor.author: Kreil, David P.
dc.contributor.author: Kulev, Ognyan
dc.contributor.author: Łabaj, Pavel P.
dc.contributor.author: Lampa, Samuel
dc.contributor.author: Pireddu, Luca
dc.contributor.author: Schönherr, Sebastian
dc.contributor.author: Siretskiy, Alexey
dc.contributor.author: Vassilev, Dimitar
dc.date.accessioned: 2015-09-17T10:26:44Z
dc.date.available: 2015-09-17T10:26:44Z
dc.date.issued: 2015-08-19
dc.description.abstract: High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution.
dc.description.status: Published
dc.identifier.doi: 10.1186/s13062-015-0071-8
dc.identifier.issn: 1745-6150
dc.identifier.uri: http://hdl.handle.net/11050/1148
dc.language.iso: en
dc.publisher: BioMed Central
dc.relation.ispartof: Biology Direct
dc.relation.ispartofseries: 10;34
dc.rights: Attribution - NonCommercial - ShareAlike 3.0 Italy
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/it/
dc.subject: workflow
dc.subject: automation
dc.subject: big data
dc.subject: reproducibility
dc.subject: high-performance computing
dc.subject: data-intensive
dc.subject.een-cordis: EEN CORDIS::BIOLOGICAL SCIENCES::Biology / Biotechnology::Molecular Design
dc.subject.een-cordis: EEN CORDIS::BIOLOGICAL SCIENCES::Genome Research::Bioinformatics
dc.subject.program: Program::Data Fusion::Visual Computing (VIC)
dc.title: Experiences with workflows for automating data-intensive bioinformatics
dc.type: Article
Files
Original bundle
Name: s13062-015-0071-8.pdf
Size: 1.35 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.06 KB
Format: Item-specific license agreed upon to submission