I live in Vancouver, Canada, where I'm a PhD candidate studying bioinformatics at the University of British Columbia with my supervisor, Inanc Birol. I work with the genome sequencing data of a variety of species, including human and the white spruce tree. I develop software that takes fragmented genome sequencing data and attempts to reassemble the original genome from which the short fragments were derived. My undergraduate degree is in computer engineering. I'm a programmer, an avid traveller, a singer, and an experimental amateur chef.

Automating data-analysis pipelines using R and Make

Slides and a hands-on activity

'Automating' comes from the roots 'auto-' meaning 'self-', and 'mating', meaning 'screwing'. Bioinformatics analysis often involves designing a pipeline of commands and running that pipeline on many data sets. There are many ways to tackle this common task. Running commands interactively at the command line...


Abbreviate gene sequences to unique and stable identifiers

UniqTag is used to abbreviate gene sequences to unique and stable identifiers. It selects a representative k-mer from the sequence of each gene to be used as a systematic identifier for that gene. Unlike serial numbers, these identifiers are stable between different assemblies and annotations of the same data without...
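
To make the idea of a representative k-mer concrete, here is a minimal Python sketch. It is not UniqTag's own code. It assumes the selection rule is "take the k-mer of the gene that occurs in the fewest genes of the data set, breaking ties by lexicographic order". The function names `kmers` and `uniqtags` and the fallback for sequences shorter than k are hypothetical, chosen only for illustration.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers (substrings of length k) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def uniqtags(genes, k):
    """Map each gene name to one representative k-mer of its sequence.

    genes is a dict of gene name -> sequence string.
    Assumed rule (not taken from the text above): pick the k-mer found in
    the fewest genes, and break ties by taking the lexicographically
    smallest one.
    """
    # Count the number of genes in which each k-mer occurs.
    # Each gene contributes at most once per k-mer because kmers() is a set.
    counts = Counter()
    for seq in genes.values():
        counts.update(kmers(seq, k))

    tags = {}
    for name, seq in genes.items():
        candidates = kmers(seq, k)
        if not candidates:
            # Hypothetical fallback: a sequence shorter than k has no k-mers,
            # so use the whole sequence as its identifier.
            tags[name] = seq
            continue
        tags[name] = min(candidates, key=lambda kmer: (counts[kmer], kmer))
    return tags


if __name__ == "__main__":
    example = {"geneA": "ACGTT", "geneB": "CGTTA"}
    print(uniqtags(example, 3))
```

Because the identifier is derived from the sequence content rather than from the order in which genes are listed, reordering or renaming the inputs in this sketch does not change the tag assigned to a given sequence.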


Genome sequence assembler for large genomes

ABySS is a genome sequence assembler that distributes the computation of large genome sequence assembly over a cluster of computers using MPI. ABySS was used to assemble the twenty-gigabase white spruce (Picea glauca) genome, seven times the size of the human genome.