Scripting for large-scale sequencing based on Hadoop

dc.contributor.author: Schumacher, André
dc.contributor.author: Pireddu, Luca
dc.contributor.author: Kallio, Aleksi
dc.contributor.author: Niemenmaa, Matti
dc.contributor.author: Korpelainen, Eija
dc.contributor.author: Zanetti, Gianluigi
dc.contributor.author: Heljanko, Keijo
dc.date.accessioned: 2014-05-16T08:03:04Z
dc.date.available: 2014-05-16T08:03:04Z
dc.date.issued: 2013
dc.description.abstract: The large volumes of data generated by modern sequencing experiments present significant challenges for manipulation and analysis. Traditional approaches often prove difficult to scale. We describe our ongoing work on SeqPig, a tool that facilitates the use of the Pig Latin distributed scripting language to manipulate, analyze and query sequencing data, applying the advances motivated by the “big data revolution” in data-intensive activities. SeqPig provides access to popular data formats and implements a number of custom sequencing-specific functions. Most importantly, it grants users access to the scalable Hadoop platform from a high-level scripting language. [IT]
dc.description.pagenumber: 84-85 [IT]
dc.description.status: Published [IT]
dc.identifier.doi: 10.14806/ej.19.A.628 [IT]
dc.identifier.issn: 2226-6089
dc.identifier.uri: http://hdl.handle.net/11050/909
dc.language.iso: en [IT]
dc.relation.ispartof: EMBnet.journal. The Next NGS Challenge Conference: Data Processing and Integration, 14-16 May 2013, Valencia, Spain [IT]
dc.relation.ispartofseries: 19;Suppl. A
dc.rights: Attribution - NonCommercial - ShareAlike 3.0 Italy [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/it/ [*]
dc.subject: bioinformatics [IT]
dc.subject: ngs [IT]
dc.subject: data analysis [IT]
dc.subject: cloud computing [IT]
dc.subject: high-performance computing [IT]
dc.subject.een-cordis: EEN CORDIS::BIOLOGICAL SCIENCES::Genome research::Bioinformatics [IT]
dc.subject.program: Program::Biomedicine::Bioinformatics (BI) [IT]
dc.title: Scripting for large-scale sequencing based on Hadoop [IT]
dc.type: Article [IT]
File
Original bundle
Name: 628-3761-2-PB.pdf
Size: 327.01 KB
Format: Adobe Portable Document Format
Description: Open Access article
License bundle
Name: license.txt
Size: 2.06 KB
Format: Item-specific license agreed upon to submission
Description: