A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms.

abstract

BACKGROUND: With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries.
RESULTS: We develop a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while utilizing as many RNA-Seq libraries as possible that contain hundreds of gigabases of data. New techniques are developed so that computations can be performed on a computing cluster with a moderate amount of physical memory.
CONCLUSIONS: Our strategy minimizes memory consumption while simultaneously obtaining comparable or improved accuracy over existing algorithms. It provides support for incremental updates of assemblies when new libraries become available.

published proceedings

BMC Genomics

altmetric score

17.446

author list (cited authors)

Sze, S., Pimsler, M. L., Tomberlin, J. K., Jones, C. D., & Tarone, A. M.

citation count

7

complete list of authors

Sze, Sing-Hoi||Pimsler, Meaghan L||Tomberlin, Jeffery K||Jones, Corbin D||Tarone, Aaron M

publication date

January 2017

publisher

Springer Nature Publisher

published in

BMC GENOMICS Journal

keywords

Algorithms
Alternative Splicing
Animals
Diptera
Drosophila melanogaster
Gene Expression
Gene Expression Profiling
Mole Rats
RNA Splicing
RNA-seq
Sequence Analysis, RNA
Transcriptome Assembly

PubMed ID

28589866

Digital Object Identifier (DOI)

10.1186/s12864-017-3735-1

start page

387

volume

18

issue

Suppl 4

URL

http://dx.doi.org/10.1186/s12864-017-3735-1
