@sjackman http://sjackman.ca

Abstract

Methods

Conclusion

Spaced Seed de Bruijn Graph

Read
  AGATGTGCTGCCGCCTTGGACAGCGTTACTCTAAT
Spaced seed k-mers
  AGATGTGC----------GACAGCGT
   GATGTGCT----------ACAGCGTT
    ATGTGCTG----------CAGCGTTA
     TGTGCTGC----------AGCGTTAC
      GTGCTGCC----------GCGTTACT
       TGCTGCCG----------CGTTACTC
        GCTGCCGC----------GTTACTCT
         CTGCCGCC----------TTACTCTA
          TGCCGCCT----------TACTCTAA
           GCCGCCTT----------ACTCTAAT
            CCGCCTTG----------CTCTAATT
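
The figure above can be reproduced with a short sketch. It assumes the notation used in the Results table, where k is the full span of the spaced seed and K is the length of the seed at each end; here k=26 and K=8, read off the figure. The function name `spaced_seed_kmers` is hypothetical and is not part of ABySS. With the 35 bp read as printed, a span of 26 yields 10 windows; the eleventh row of the figure would need one more base than the read shown.

```python
def spaced_seed_kmers(read, k, K):
    """Return the spaced seed k-mers of a read.

    k is the total span and K the length of each end seed (assumed notation).
    The k - 2K positions between the two seeds are masked with '-'.
    """
    if 2 * K > k:
        raise ValueError("the two seeds must fit inside the span k")
    gap = "-" * (k - 2 * K)
    return [
        read[i:i + K] + gap + read[i + k - K:i + k]
        for i in range(len(read) - k + 1)
    ]


read = "AGATGTGCTGCCGCCTTGGACAGCGTTACTCTAAT"
kmers = spaced_seed_kmers(read, k=26, K=8)

# Print each k-mer indented by its offset in the read, as in the figure.
for offset, kmer in enumerate(kmers):
    print(" " * offset + kmer)
```

Only the two end seeds contribute to the identity of a spaced seed k-mer, so two windows that differ only in the masked positions map to the same string.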

Sealer: Bloom Filter de Bruijn Graph
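
As a rough illustration of the data structure named in this heading, the sketch below stores k-mers in a Bloom filter and finds the successors of a k-mer by querying its four possible one-base extensions. It is a generic toy version, not Sealer's implementation: the class `BloomFilter`, the function `successors`, the filter size, the number of hash functions, and k=8 are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """A minimal Bloom filter over strings (illustrative only)."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several hash values by salting BLAKE2b with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(item.encode(), salt=bytes([seed])).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p >> 3] |= 1 << (p & 7)

    def __contains__(self, item):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(item))


def successors(bloom, kmer):
    """Return the one-base extensions of kmer that the filter reports as present."""
    return [kmer[1:] + base for base in "ACGT" if kmer[1:] + base in bloom]


read = "AGATGTGCTGCCGCCTTGGACAGCGTTACTCTAAT"
k = 8  # assumed k-mer size for the example
bloom = BloomFilter()
for i in range(len(read) - k + 1):
    bloom.add(read[i:i + k])

print(successors(bloom, "AGATGTGC"))
```

The graph is never stored explicitly: edges are implied by membership queries. A Bloom filter can report false positives but never false negatives, so a traversal may occasionally see a spurious edge but never misses a k-mer that was inserted.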

[Figure panels: Sealer; C. elegans (two panels); E. coli MiSeq; E. coli; S. cerevisiae (two panels)]

Data

Species        Accession           Genome size  Read length            Reads  Library            Fold cov.
E. coli K-12   BaseSpace 3756762   4.6 Mbp      301 bp                 4.5 M  MiSeq 600 bp       290x
C. elegans     BaseSpace 13037213  100 Mbp      9125 bp N50, ≥1500 bp  190 k  Moleculo           12x
C. elegans     SRA DRR008445                    100 bp                 139 M  Mate-pair 4550 bp  139x
S. cerevisiae  doi:10.1101/013490  12.5 Mbp     8497 bp, ≥1000 bp      75 k   Oxford Nanopore    38x
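
The fold coverage column can be checked with a small calculation. This sketch assumes the usual definition, number of reads times read length divided by genome size, which the poster does not state explicitly. It also assumes that the mate-pair C. elegans row shares the 100 Mbp genome size of the row above it. The function name `fold_coverage` is hypothetical.

```python
def fold_coverage(num_reads, read_length_bp, genome_size_bp):
    """Estimate fold coverage as total sequenced bases over genome size."""
    return num_reads * read_length_bp / genome_size_bp


# E. coli K-12 MiSeq: 4.5 M reads of 301 bp over a 4.6 Mbp genome.
miseq = fold_coverage(4.5e6, 301, 4.6e6)

# C. elegans mate-pair: 139 M reads of 100 bp, assuming a 100 Mbp genome.
mate_pair = fold_coverage(139e6, 100, 100e6)

print(f"E. coli MiSeq: {miseq:.0f}x")
print(f"C. elegans mate-pair: {mate_pair:.0f}x")
```

The MiSeq estimate comes out near 294x, which the table reports rounded as 290x, and the mate-pair estimate matches the tabulated 139x. The Moleculo and Oxford Nanopore rows list an N50 or a minimum length rather than a mean read length, so they cannot be checked the same way from the table alone.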

Results

Species        Assembler                    Scaffolds  Scaftigs  Memory (GB)
E. coli        ABySS 1.5.2 k=64             133        107       6.3
E. coli        ABySS 1.5.2 k=364            157        134       5.6
E. coli        ABySS 1.6.0 k=364 K=32       176        176       3.2
E. coli        SPAdes 3.1.1                 204        204       NA
C. elegans     ABySS 1.5.2 k=192            408        77        7.7
C. elegans     ABySS 1.5.2 k=512            475        83        21.7
C. elegans     ABySS 1.6.0 k=512 K=96       347        73        8.4
C. elegans     Celera Assembler 8.3rc1      71         71        NA
S. cerevisiae  ABySS 1.5.2 k=192            358        72        8.9
S. cerevisiae  ABySS 1.5.2 k=384            185        65        29.3
S. cerevisiae  ABySS 1.6.0 k=384 K=96       221        75        11.7
S. cerevisiae  Nanocorr + Celera            586        586       NA

References

SPAdes, Bankevich et al. (2012) doi:10.1089/cmb.2012.0021
Nanocorr, Goodwin et al. (2015) doi:10.1101/013490
Celera Assembler, Myers et al. (2000) doi:10.1126/science.287.5461.2196