I live in Vancouver, Canada, where I'm a PhD candidate studying bioinformatics at the University of British Columbia with my supervisor, Inanc Birol. I work with the genome sequencing data of a variety of species, including human and the white spruce tree. I develop software that takes fragmented genome sequencing data and attempts to reassemble the original genome from which the short fragments were derived. My undergraduate degree is in computer engineering. I'm a programmer, an avid traveller, a singer, and an experimental amateur chef.


Correct Misassemblies Using Linked Reads From Large Molecules

Tigmint identifies and corrects misassemblies using linked reads from 10x Genomics Chromium. The reads are first aligned to the assembly, and the extents of the large DNA molecules are inferred from the alignments of the reads. The physical coverage of the large molecules is more consistent and less prone to... [Read More]

ABySS 2.0

Resource-Efficient Assembly of Large Genomes using a Bloom Filter

ABySS 1.0 was the first genome sequence assembly software capable of assembling a human genome using short-read sequencing data. To aggregate enough memory to make that possible, multiple machines worked together in parallel and communicated using the Message Passing Interface (MPI). ABySS 2.0, on the other hand, reduces the... [Read More]

Automating data-analysis pipelines using R and Make

Slides and a hands-on activity

‘Automating’ comes from the roots ‘auto-’, meaning ‘self-’, and ‘mating’, meaning ‘screwing’. Bioinformatics analysis often involves designing a pipeline of commands and running that pipeline on many data sets. There are many ways to tackle this common task. Running commands interactively at the command line... [Read More]


Abbreviate gene sequences to unique and stable identifiers

UniqTag is used to abbreviate gene sequences to unique and stable identifiers. It selects a representative k-mer from the sequence of each gene to be used as a systematic identifier for that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without... [Read More]


Genome sequence assembler for large genomes

ABySS is a genome sequence assembler that distributes the computation of large genome sequence assembly over a cluster of computers using MPI. ABySS was used to assemble the twenty-gigabase white spruce (Picea glauca) genome, seven times the size of the human genome. [Read More]