NextDenovo is a string graph-based de novo assembler for long reads (CLR, HiFi and ONT). It uses a “correct-then-assemble” strategy similar to canu (with no correction step for PacBio HiFi reads), but requires significantly less computing resources and storage. After assembly, the per-base accuracy is about 98-99.8%. To further improve single-base accuracy, try NextPolish.
NextDenovo contains two core modules: NextCorrect and NextGraph. NextCorrect can be used to correct long noisy reads with approximately 15% sequencing errors, and NextGraph can be used to construct a string graph with corrected reads. It also contains a modified version of minimap2 and some useful utilities (see utilities for more details).
We benchmarked NextDenovo against other assemblers using Oxford Nanopore long reads from human and Drosophila melanogaster, and PacBio continuous long reads (CLR) from Arabidopsis thaliana. NextDenovo produces more contiguous assemblies with fewer contigs than the other tools. NextDenovo also shows a high level of accuracy in terms of assembly consistency and single-base accuracy.
Download the latest release from the GitHub releases page, or use the following commands:
wget https://github.com/Nextomics/NextDenovo/releases/latest/download/NextDenovo.tgz
tar -vxzf NextDenovo.tgz && cd NextDenovo
If you want to compile from the source, run:
git clone git@github.com:Nextomics/NextDenovo.git
cd NextDenovo && make
ls reads1.fasta reads2.fastq reads3.fasta.gz reads4.fastq.gz ... > input.fofn
cp doc/run.cfg ./
Feel free to raise an issue at the issue page.
Please ask questions on the issue page first. They are also helpful to other users.
For additional help, please send an email to huj_at_grandomics_dot_com.
Hu, J. et al. An efficient error correction and accurate assembly tool for noisy long reads. bioRxiv 2023.03.09.531669 (2023) doi:10.1101/2023.03.09.531669.
NextDenovo is optimized for assembly with seed_cutoff >= 10kb. This should not be a big problem, because it only requires that the longest 30x-45x of seeds have a length >= 10kb. For shorter seeds, it may produce unexpected results for some complex genomes, so check the assembly quality carefully.
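To make the 30x-45x requirement concrete, here is a minimal Python sketch of how one might estimate a seed length cutoff from a list of read lengths. It is not NextDenovo's own code: the function name `estimate_seed_cutoff` and its parameters are hypothetical, and it assumes the cutoff is simply the length of the shortest read among the longest reads that together cover `seed_depth` times the genome size.

```python
def estimate_seed_cutoff(read_lengths, genome_size, seed_depth=45):
    """Estimate a seed length cutoff (hypothetical helper, not part of NextDenovo).

    Walks the reads from longest to shortest and returns the length of the
    read at which the accumulated bases first reach seed_depth * genome_size.
    Returns None if all reads together do not reach that depth.
    """
    target_bases = seed_depth * genome_size
    total = 0
    for length in sorted(read_lengths, reverse=True):
        total += length
        if total >= target_bases:
            return length
    return None


if __name__ == "__main__":
    # Toy numbers only: a 10 kb "genome" and a 3x seed depth.
    lengths = [15000, 12000, 11000, 9000, 5000]
    cutoff = estimate_seed_cutoff(lengths, genome_size=10_000, seed_depth=3)
    print("estimated cutoff:", cutoff)
    print("meets the 10 kb recommendation:", cutoff is not None and cutoff >= 10_000)
```

If the estimated cutoff falls below 10 kb for a seed depth in the 30x-45x range, the data set is in the "shorter seeds" regime described above and the resulting assembly deserves extra scrutiny.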
You can track updates by clicking the Star button in the upper-right corner of the GitHub page.