When working with sequencing datasets of ecological interest, a recurring problem is how to tease out the genetic diversity present in the population being sequenced. Usually, assembly software simply aligns the short reads and determines the consensus sequence by majority vote at each position, which discards minority variants. However, we may wish to separately assemble each haplotype (version) of each gene. We formulate this as a graph problem: each short read is a node, and two overlapping reads share an edge if they should go in different haplotypes.
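To make the graph formulation concrete, here is a minimal sketch in Python. It rests on several assumptions that are not part of the original description: reads are already aligned and given as hypothetical `(start, sequence)` pairs, any mismatch in the overlapping region means two reads belong to different haplotypes (so sequencing errors are ignored), and there are exactly two haplotypes, so a valid assignment is a 2-coloring of the graph. The function names are illustrative only.

```python
from collections import deque
from itertools import combinations


def conflicts(a, b):
    """Return True if two aligned reads disagree anywhere they overlap.

    Each read is a (start, sequence) pair; this representation is an
    assumption made for the sketch.
    """
    (start_a, seq_a), (start_b, seq_b) = a, b
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    # An empty range (no overlap) yields False.
    return any(seq_a[p - start_a] != seq_b[p - start_b] for p in range(lo, hi))


def build_conflict_graph(reads):
    """Nodes are read indices; an edge joins reads that must be separated."""
    graph = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph


def two_color(graph):
    """Assign each read to haplotype 0 or 1 by breadth-first 2-coloring.

    Returns None if the graph has an odd cycle, i.e. two haplotypes
    cannot explain the conflicts (assumed diploid case only).
    """
    color = {}
    for root in graph:
        if root in color:
            continue
        color[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return None
    return color


# Toy data: reads drawn from two made-up haplotypes,
# ACGTACGT and ACCTACTT, which differ at positions 2 and 6.
reads = [(0, "ACGTA"), (1, "CCTAC"), (3, "TACGT"), (3, "TACTT")]
graph = build_conflict_graph(reads)
coloring = two_color(graph)
```

In this sketch each connected component is colored independently, so the labels 0 and 1 are only meaningful within a component; linking components into full haplotypes would need additional information. A return value of `None` indicates that, under these assumptions, the reads cannot be split into just two haplotypes.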