Sunday, October 23, 2016

SeqLL: Helicos van Winkle

Helicos was the first company to launch a single molecule DNA sequencing system. They never sold many systems (my estimate is fewer than 20), but some of those sites really loved their machines. One beauty of the system was a very simple sample prep: fragment your DNA, add terminal transferase and dATP, and the sample was ready to hybridize. Helicos demonstrated a number of interesting applications, such as direct loading of RNA onto the system and performing capture on the flowcell. But with anemic demand and mounting losses, the company faded, finally filing for bankruptcy in November 2012. One true believer bought up key IP and hardware and kept the torch burning as SeqLL, headquartered just down the road from me in Woburn, Massachusetts (best known for the book and movie A Civil Action, but also the birthplace of thermodynamicist Count Rumford). Now SeqLL is re-launching the technology in a new box, or is it similar technology? That's the big question, and it is poorly addressed by both their beta test announcement and their website. Worse, that website seems to be positioning the new SeqLL box against the competing sequencing technologies as they stood back when Helicos folded.

SeqLL's solicitation for beta testers was accompanied by a very plain vanilla press release. The only additional press is a piece in the premium In Sequence portion of GenomeWeb, which I don't have a subscription to. Apparently neither I nor any other bloggers were solicited to do pieces. Obviously I have a vested interest in that channel of communication, but I do honestly feel that I and others (such as James Hadfield, Lex Nederbragt, Dale Yuzuki, Shawn Baker, Nick Loman and Mick Watson) often give insightful and honest takes on new technologies - even if some of us are becoming very identified with specific technologies (by us, I mean Nick). Given that essentially nothing technical or novel was in the press release, beyond promising that the new instrument is benchtop in size, I went to the technology page on SeqLL's website. When it finally loaded (I was on a mobile device, and 4G is iffy in Boston), I found something designed by marketing types with little interest in technical details. That's not at all the way to get experienced technologists interested in your reboot.

To review, Helicos uses a single molecule reversible terminator sequencing-by-synthesis approach. In each cycle, a single nucleotide type is presented to the individual DNAs on the flowcell. Molecules which have successfully extended are imaged via a fluorescent tag, then the tag and terminator are removed and the cycle is repeated with another nucleotide. So it is very similar to Illumina in the sequencing cycle, other than these are single molecules (vs. polonies on Illumina) and only a single nucleotide is in the pool (vs. all four for Illumina).

Helicos in the past had two serious issues. First, the cycle times were very slow, as each cycle of sequencing required four times as many images as Illumina. Second, sometimes correct bases would not be seen, the problem of "dark bases". So Helicos had a high single nucleotide deletion rate. Helicos also had very short read lengths, even for that time -- if I remember correctly most reads were in the 35-45 base range, though as with 454 and Ion Torrent not all reads would be expected to be the same length, due to the inherent lack of synchrony from providing one nucleotide in each cycle (e.g., if nucleotides are cycled A,T,C,G, then the sequence ATCG will advance sooner than GCTA). Many of the advantages touted for Helicos stemmed from the lack of PCR anywhere in the sample or template prep; this enabled the direct loading of RNA on the flowcell and was also emphasized for the avoidance of amplification biases in counting RNA or DNA tags. Plus that simple library prep. The old Helicos machine was actually a ton in weight (for vibration isolation) and cost around $750K.
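To make the synchrony point concrete, here is a toy Python sketch (mine, not anything from Helicos or SeqLL). It assumes an idealized chemistry with no dark bases, at most one base added per nucleotide flow (the terminator), and that the sequence given is the strand being synthesized. The function name flows_needed and the A,T,C,G flow order are just illustrative choices.

```python
def flows_needed(synth_seq, flow_order="ATCG"):
    """Count single-nucleotide flows needed to synthesize synth_seq.

    Toy model (assumption): each flow offers one nucleotide type, and the
    reversible terminator allows at most one base to be added per flow.
    """
    missing = set(synth_seq) - set(flow_order)
    if missing:
        # Without this check a base absent from the flow order would loop forever.
        raise ValueError(f"bases not in flow order: {sorted(missing)}")

    flows = 0
    pos = 0
    while pos < len(synth_seq):
        offered = flow_order[flows % len(flow_order)]
        flows += 1
        if synth_seq[pos] == offered:
            pos += 1  # this molecule extends by one base and would be imaged
    return flows


if __name__ == "__main__":
    for seq in ("ATCG", "GCTA"):
        print(seq, flows_needed(seq))
```

Under these assumptions ATCG is finished after 4 flows while GCTA needs 13, so after a fixed number of flows two molecules can sit at quite different read lengths.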

But look at an apropos sampling of the sequencing landscape today.  

Illumina has a range of instruments, with read length options often longer than 100 bases, plus the possibility of paired ends. Illumina's least expensive instrument is only $50K, with other benchtop instruments in the $100K and $250K ranges. PCR-free and low-PCR library preps are available to reduce amplification biases. A number of simpler genomic preps are now available, ranging from Nextera to iGenomX's new linked read technology. Plus there are a gazillion Illumina instruments around the world, with growing evidence of a capacity glut.

If you're truly nervous about amplification bias, then there are also competing single molecule sequencers from Pacific Biosciences and Oxford Nanopore. PacBio's Sequel isn't quite a benchtop machine, but does have a price tag around $250K and with long reads can give far more information on transcripts than short reads. MinION is delivering ever increasing throughput, can be run as either a short read or long read system, and will soon have a direct RNA sequencing method. Indeed, the Oxford method is truly direct, analyzing the input RNA with a capability of detecting base modifications. And, of course, MinION has no initial capital cost. Oh, and there's the 5-minute genomic prep that is expected to be launched any day now (even if it never launches, one can make do with the current 10-minute prep).

So where does SeqLL fit in this landscape? Great question, and you won't find much help in any of the provided literature. Few hints as to the time from sample-to-result. Nothing on the per base accuracy or read length. No suggestion of what the cost of the instrument might be, or the cost of a sequencing run. And certainly no datasets to look at. The website cites studies by the old Helicos gang -- some of whom have been employed by other companies since 2010, which underscores how dated these studies are.

Of course, you can try to contact the company and get this information.  But let's face it, this isn't how things work much in the Internet era.  If you want to build excitement around a system, then you need to provide some stats.  This is an attempt to solicit beta testers, and anyone who has done this will tell you that it is a lot of fun and a lot of pain.  After all, the point of a beta test is to shake the problems out of a system as it transitions from development to commercialization, and that can't ever be expected to go smoothly.  A company running a beta test will frequently try the patience of the testers, so that company needs to build a lot of trust with their testing community.  The best way to build trust is to be as open as possible, and hiding all the key statistics at the outset is no way to be open or build trust. 

SeqLL's strange re-launch PR doesn't mean the technology isn't interesting or doesn't have a place. However, the company must shake off the cobwebs quickly and demonstrate this. My unsolicited advice, in addition to releasing the key performance details, would be to focus on a small number of applications. The list on the website reads like a complete laundry list of modern sequencing applications, which is far too little focus for a new (or rebooting) platform. Pick three to six applications in which the SeqLL/Helicos technology potentially has a strong edge, given the competition of today and the near future. SeqLL needs to differentiate itself in a crowded sequencing technology marketplace, where former advantages have potentially been shrunk or even eclipsed. Properly focused execution -- and a more open information strategy -- are the best shots at success in such an environment.


Kyle Serikawa said...

Very much agree with advice to target that specific application that SeqLL might be best suited for and that is not currently dominated by other technologies or library prep methods. That's hard to do (nothing pops into my head at the moment), but to get the users jazzed up--who you really need in a case like this--there's gotta be a hook. PacBio is long, long single molecule reads, so phasing and translocations. Oxford is portability, $, and fast pathogen ID. Seems like they may have launched prematurely.

I'm all for the concept of letting the users define and develop the use cases and techniques; to me that was Illumina's genius, and something Oxford is doing well too. But there has to be some reason to even get the user to try.

Shawn Baker said...

Unless they've reduced the price, PacBio's Sequel (I wonder how much that name annoys SeqLL) lists for $350k. As for SeqLL, they should focus on what Helicos should have focused on all those years ago - building a short read counting machine for RNA-Seq and ChIP-Seq. I think they're going to struggle with anything else.

Georgi Marinov said...


Was truly awesome and only possible with their machine.

Also, chromatin assays in organisms with very high-AT genomes would be a great fit for the technology.

But those are niche markets indeed.

The direct RNA-seq stuff is also great in principle, but it is best suited to 3' end profiling. Which is also a niche market.

It doesn't give you full-length RNAs the way the nanopores are promising to.