shipped 2026

Genome Decoder

Raw DNA file in, drug-response calls out, analyzed entirely in the browser.

The console mid-scan: a DNA helix sweeps under a scan beam while the analysis stream resolves CYP2C19 to a *1/*2 diplotype and an intermediate-metabolizer call.

Why this exists

I read that sequencing a genome, which once took over a decade and billions of dollars, now costs a couple hundred dollars, and the price is still dropping. That made me want to understand what a sequenced genome is actually good for, now that it’s within reach of anyone curious about their own biology and health. It turns out it is genuinely useful, and a layperson can analyze the raw data themselves. That’s what led me to this project.

What it does

Drop in a DNA file and it reports how your genes shape your response to three common drugs:

How it works

  1. Read: It detects whether the file is a genotyping array or a VCF and streams it line by line, so a whole genome never has to fit in memory.
  2. Decompress: Gzipped genomes are unpacked on the fly. Real ones are multi-member BGZF, which the browser’s built-in decompressor chokes on, so it falls back to a streaming library that handles them.
  3. Match: It keeps only the handful of variants the drug modules need, matching each first by rsID, then by chromosome position plus reference allele if the file isn’t annotated.
  4. Call: Each gene module turns the genotypes into a diplotype, a phenotype, and the matching CPIC guidance.
  5. Report: The result renders to the neon console and to a printable summary.
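
For a rough feel of the Read, Decompress, and Match steps, here is a minimal Python sketch. It is not the browser code. The target table uses placeholder IDs and coordinates, and the names `read_lines` and `match_variants` are made up for illustration. Python's `gzip` module reads concatenated gzip members, so it stands in for the streaming library mentioned above. The sketch assumes a tab-separated array export (rsID, chromosome, position, genotype) or a single-sample VCF.

```python
import gzip
import io
from typing import BinaryIO, Dict, Iterator, Tuple

# Hypothetical targets: placeholder IDs and coordinates, not real variants.
# rsID -> (chromosome, position, reference allele)
TARGETS_BY_RSID: Dict[str, Tuple[str, int, str]] = {
    "rs_demo1": ("10", 1000, "G"),
    "rs_demo2": ("10", 2000, "C"),
}
TARGETS_BY_POS = {key: rsid for rsid, key in TARGETS_BY_RSID.items()}


def read_lines(raw: BinaryIO) -> Iterator[str]:
    """Yield text lines one at a time, unpacking gzip input lazily."""
    magic = raw.read(2)
    raw.seek(0)
    # GzipFile keeps reading across concatenated members, as found in BGZF.
    stream = gzip.GzipFile(fileobj=raw) if magic == b"\x1f\x8b" else raw
    for line in io.TextIOWrapper(stream, encoding="utf-8"):
        yield line.rstrip("\r\n")


def match_variants(raw: BinaryIO) -> Dict[str, str]:
    """Return {rsID: genotype letters} for the target variants only."""
    found: Dict[str, str] = {}
    is_vcf = None
    for line in read_lines(raw):
        if is_vcf is None:
            # Format detection: a VCF announces itself on its first line.
            is_vcf = line.startswith("##fileformat=VCF")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if is_vcf:
            if len(cols) < 10:
                continue
            chrom, pos, vid, ref, alt = cols[:5]
            chrom = chrom[3:] if chrom.startswith("chr") else chrom
            # Match by rsID first, then by position plus reference allele.
            rsid = vid if vid in TARGETS_BY_RSID else TARGETS_BY_POS.get(
                (chrom, int(pos), ref)
            )
            if rsid is None:
                continue
            alleles = [ref] + alt.split(",")
            calls = cols[9].split(":")[0].replace("|", "/").split("/")
            if "." in calls:
                continue  # no-call for this sample
            found[rsid] = "".join(alleles[int(i)] for i in calls)
        elif len(cols) >= 4 and cols[0] in TARGETS_BY_RSID:
            found[cols[0]] = cols[3]
    return found
```

Only the matched variants are kept in memory, so the size of the input file does not matter much. A real whole-genome VCF may omit sites that match the reference, which this sketch does not handle.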

One detail that mattered: the same variant can be reported on either DNA strand, so a 23andMe file and a reference genome can write it with opposite letters. Counting a base or its complement keeps the call correct either way.
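
A minimal Python sketch of that strand-tolerant count, under a few assumptions: the function name `count_alt` is hypothetical, genotypes are plain base letters, and palindromic SNPs (A/T or C/G) are rejected because their letters alone cannot reveal the strand.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def count_alt(genotype: str, ref: str, alt: str) -> int:
    """Count copies of the variant allele, whichever strand the file used."""
    if COMPLEMENT[ref] == alt:
        # For A/T or C/G SNPs the complement of alt is ref, so letters are ambiguous.
        raise ValueError("palindromic SNP: strand cannot be inferred from letters")
    # Accept the variant base itself or its complement on the opposite strand.
    hits = {alt, COMPLEMENT[alt]}
    return sum(1 for base in genotype if base in hits)
```

For a G>A variant, `count_alt("AG", "G", "A")` and `count_alt("TC", "G", "A")` both count one copy, since `TC` is the same heterozygous genotype written on the opposite strand.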

What’s next

What I learned

This was a good crash course in what matters when you’re reading the presence or absence of a variant. I’m still a beginner in this domain, and a lot of the value came from asking naive questions and watching the answer explain why a particular detail mattered. The other surprise was how far a vibecoded Python script gets you. It’s powerful enough for a layperson to analyze a genome, which would have been unthinkable two years ago.

Status

Shipped. Live and running in the browser at the link below. The next pass is about breadth and clearer language, not a rebuild.
