Omics! Omics!: May 2007

Thursday, May 31, 2007

Myeloma @ Ground Zero

News wires are carrying the story that a doctor has observed a number of cases of multiple myeloma in workers from the World Trade Center site.

First, it is important to note that this is an early report, and such epidemiological clusters could be the result of many things other than a causal link. It could be observational bias: those exposed at the WTC are watched much more closely than the general public. The report apparently doesn't say how many cases were seen, only that the cases occurred in younger individuals.

If this unfortunate observation turns out to represent a new cluster, it will be interesting to find out to what degree the patients have genetically similar tumors. Multiple myeloma often results from translocations which generate 'always on' growth signals in terminal B-cells, the cells which normally produce antibodies in response to challenge by a foreign substance. I don't believe there is much evidence linking specific translocations or other genetic abnormalities with particular sources of multiple myeloma -- particularly since little is known about the causes of myeloma other than it is most frequent in the elderly. If particular translocations are prevalent in these cases, then that may be helpful in treatment -- myeloma is probably on the cusp of having treatments focused on genetic subtypes, due to many personalized medicine studies (including the one I had peripheral involvement in). There is also a lot of pharmaceutical company interest in myeloma, a situation which wasn't very true a decade ago.

The witches' brew of dust, asbestos, smoke and other chemicals will surely take a toll on those who served their country in that time of need. The best we can hope for is to try to get a little ahead of the curve.

Wednesday, May 30, 2007

Origins of infectious disease

You'll need a subscription (or visit to the local library), but there is a fascinating review from Jared Diamond and colleagues in a recent Nature: "Origins of major human infectious diseases".

One of the ideas espoused here is that there is a typical progression which human pathogens follow over long time periods. For example, Stage 1 means the pathogen is never naturally found in humans (i.e. this excludes laboratory exposures), whereas Stage 2 pathogens are found in humans but do not transmit from human-to-human. Stage 3 pathogens sometimes transmit human-human, but only for a few cycles; Stage 4 pathogens routinely transmit between humans but retain animal-human transmission routes and still are animal pathogens; and finally Stage 5 pathogens are pathogenic only in humans.

There are lots of interesting evolutionary stories wrapped up in this model. For example, there is a group of viruses called simian foamy viruses which can be inferred to have speciated via the speciation of their host -- each virus species is specific to a single primate (and none infects humans).

One very interesting hypothesis in the review attempts to explain why during the European exploration & conquest of the Americas most diseases traveled from Old World to New World and few in the opposite direction. Of the 25 diseases explored, only 1 (Chagas) is clearly of New World origin with two others (TB & syphilis) still controversial and four others apparently untraceable to date (rotavirus, rubella, tetanus & typhus); the remaining 18 are all of Old World origin. The authors suggest two explanations. First, many of the diseases originated in domestic livestock; Old World peoples domesticated more species and tended to live in much closer proximity to their livestock. Second, many more tropical diseases originated in the Old World because Old World primates are genetically more similar to humans than New World primates.

The review also discusses some interesting open questions about pathogen emergence and evolution. Rubella virus has no known animal relative, but is thought to have emerged in humans as little as 11K years ago. Humans and chimps have distinctive Plasmodium species, but it is unknown whether these arose because of or after the human-chimp split. Whether TB and mumps have gone animal->human or human->animal remains an open question. To answer these questions, a lot of good old-fashioned virology needs to be done -- but I think there is also a huge opportunity to use next generation sequencing. By sequencing many Plasmodium species, a very detailed tree of their relationships might emerge and the genetic differences between them identified, ultimately leading to a mechanistic understanding of how each species has adapted to its host -- information that might be used to fight these nasty creatures. Metagenomic searches for viruses, perhaps using enrichment schemes or simply treating the host genome as a byproduct, might uncover new viruses.

A popular shibboleth of the anti-evolution crowd is that evolution is one of many (at best) equally valid theories for explaining the biological world. This review underlines the fallacy of this argument: hypotheses about evolutionary relationships and sequences are used as a framework to organize a lot of information, but also to generate new hypotheses for further exploration. It is this latter topic that is virtually never addressed by anti-evolution researchers; it is a conception of science which seems to perpetually evade their grasp.

Tuesday, May 29, 2007

Taking out the garbage

Tonight is garbage & recycling night at the home office, which is a good way to free up some space. Lately, I've been thinking about the same subject in my work.

My first two serious programming languages, Basic & APL, both had garbage collection, but only for resizable arrays. The challenge in garbage collection is figuring out what isn't needed, and arrays make that simple.

Later when I dealt with Pascal & C and C++, all languages lacking garbage collection, I cheated and just allocated gross amounts of memory every time. The problems were simple enough & the memory large enough I could get away with not managing memory.

Then I moved to two garbage collecting languages, Perl and Java (plus a proprietary language called BTL). Since I was now writing object-oriented programs that threw off lots of big objects (to hold big genomic sequences and their features), I needed to start thinking about garbage collection.

Computer programs need memory and can slurp it in either of two general fashions: static and dynamic. Static allocation is my old trick: ask for more than you think you will ever need, and hope that (a) you never need more and (b) that it doesn't cause other problems by being such a hog. Dynamic memory allocation requests memory as needed, but you need to free that memory again or it ends up just being indistinguishable from progressive static allocation. Some dynamic memory allocation is quite painless, such as the allocation of local variables in a recursive routine (such as a recursive implementation of Smith-Waterman). Variables are created as you descend the recursion, but then freed again as you unwind it on your way back up.

Far more complex problems arise if your dynamic process throws off things which persist. For example, suppose each step of the recursive descent throws out a variable which may or may not be captured by another object. Now we have a problem: the recursion may unwind, but the memory for those objects can't be reclaimed unless they are truly going unused.

A very common form of garbage collection is reference counting, in which the language implementation keeps track of the number of times each object is being pointed to by something else. If the reference count drops to zero, then the object can be reclaimed. There are two catches. First, such systems are more inefficient than might be obvious, because you do a lot of incrementing and decrementing counters. My object might have one billion other objects pointing to it, but I still must keep track of any change in who is referencing it. The bigger problem is cycles: if a set of objects form a circular chain of references to each other, then most reference counting garbage collectors will never collect them. The simplest case of this is an object which points to itself. Now, my favorite computer scientist (in part because he shares both my mitochondrial and Y-chromosome sequences, as well as quite a few of my SNPs and other genetic variants) points out that there are reference counting garbage collectors which can collect cycles, but these seem to not have entered common usage.
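The cycle problem is easy to see in miniature. The following is a minimal sketch, assuming CPython (whose primary collector is reference counting, with a separate cycle detector that can be switched off); the class and function names are made up for illustration.

```python
import gc
import weakref


class Node:
    """A tiny object that can point at another object."""

    def __init__(self, name):
        self.name = name
        self.other = None


def collected_without_cycle_collector(make_cycle):
    """Return True if a Node is reclaimed by reference counting alone."""
    gc.disable()  # switch off CPython's separate cycle detector
    try:
        node = Node("a")
        if make_cycle:
            node.other = node  # simplest cycle: the object points to itself
        probe = weakref.ref(node)  # a weak reference does not add to the count
        del node  # drop the only outside reference
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up anything the cycle left behind


print(collected_without_cycle_collector(make_cycle=False))
print(collected_without_cycle_collector(make_cycle=True))
```

On CPython the first call should report True (the count hit zero, so the object was freed at once) and the second False (the self-reference keeps the count at one until a cycle-aware pass runs). Other Python implementations manage memory differently and may not behave this way.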

Perl uses a simple reference counting garbage collector, and so one must be careful to either avoid forming cycles or explicitly break them. You can avoid cycles, but it tends to be a pain -- lots of storing ids and running lookups on them instead of just having objects refer to one another directly. It's the equivalent of trying to converse with someone by saying "could you tell me your spouse's SSN so I can look her up?", instead of the person just pointing you directly at the spouse.
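A middle path between id lookups and direct references is a weak reference: a pointer that does not add to the reference count. Perl's Scalar::Util module offers weaken() for this; the sketch below shows the same idea in Python, again assuming CPython with the cycle detector turned off to mimic a pure reference counter. The Gene and Feature classes are hypothetical stand-ins, not code from any real project.

```python
import gc
import weakref


class Gene:
    def __init__(self, name):
        self.name = name
        self.features = []  # strong references: the gene owns its features

    def add_feature(self, label):
        feature = Feature(label, self)
        self.features.append(feature)
        return feature


class Feature:
    def __init__(self, label, gene):
        self.label = label
        self._gene = weakref.ref(gene)  # weak back-pointer: no cycle is formed

    @property
    def gene(self):
        return self._gene()  # None once the gene has been reclaimed


def demo():
    gc.disable()  # rely on reference counting only, as a Perl-like model
    try:
        gene = Gene("TP53")
        exon = gene.add_feature("exon 1")
        name_while_alive = exon.gene.name
        del gene  # the last strong reference to the gene goes away
        return name_while_alive, exon.gene
    finally:
        gc.enable()


print(demo())
```

The feature can still walk back to its gene directly while the gene is alive, but holding a feature does not keep the gene from being reclaimed. The price is that every use of the back-pointer has to cope with it having gone away.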

Many languages sport better garbage collectors. Python seems to have reference counting as default but a more sophisticated scheme as an option (perhaps the default now?). When I wrote Java it was reference counted, but it apparently moved out of that world long ago. C# & Ruby use advanced approaches.

Of course, having garbage collection isn't a panacea. I tend to think in terms of rich networks of objects, and if you build such a rich network no garbage collector can collect it. So some manual management is inevitable: just as an overloaded ship might toss away empty barrels, I end up having to sever parts of the network when I am done with them. Some of my current code is particularly in need of these techniques because it is depth-first searching a rich object hierarchy to get to very large amounts of data at the leaves. Once a result is generated for a leaf, the pile of data for the leaf is superfluous and can be discarded. But since I'm in Perl, I need to be careful about the discards or all will be for naught: leave a circular reference in, and all that debris clings to the mother ship. So the flush() methods I write often hack mercilessly -- and I hope don't do any collateral damage.
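The pattern described above can be sketched roughly as follows. This is an illustration in Python rather than the Perl in question, and the TreeNode class, its fields and the summarize() function are all invented for the example; a real flush() would depend on what else holds references into the hierarchy.

```python
class TreeNode:
    def __init__(self, name, data=None):
        self.name = name
        self.parent = None
        self.children = []
        self.data = data  # large payload, only present on leaves

    def add(self, child):
        child.parent = self  # back-pointer: forms a parent <-> child cycle
        self.children.append(child)
        return child

    def flush(self):
        """Drop the payload and cut the links that tie this node into cycles."""
        self.data = None
        self.parent = None
        self.children = []


def summarize(node, results):
    """Depth-first walk: record a result per leaf, then discard what was used."""
    if node.data is not None:
        results[node.name] = len(node.data)  # stand-in for the real analysis
    for child in list(node.children):
        summarize(child, results)
    node.flush()  # everything below this node has already been processed
    return results


root = TreeNode("genome")
chrom = root.add(TreeNode("chr1"))
leaf_a = chrom.add(TreeNode("contig_a", data="ACGT" * 3))
leaf_b = chrom.add(TreeNode("contig_b", data="ACGT"))
print(summarize(root, {}))
```

Flushing on the way back up means each subtree is severed only after its results are in hand, which limits the collateral damage to code that tries to revisit a node afterwards.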

Garbage collection is just one of the seismic forces that seems to be compressing my programming style in Perl. Will the rock fold or fracture? It still isn't clear, but it's clearly a subject for another post.

Friday, May 25, 2007

A little color in our lives

For the general public, all too many reports on genetic phenomena are either very abstract or extremely scary: either it is some phenomenon never before discussed in the popular press or some sort of new risk factor for a dreaded disease. It is nice to see progress on traits which are non-threatening and which can be scored by most individuals. Such visible traits are often used in introductory instruction. While the rare and exotic may be striking, it is far more exciting to discover that science is tackling a trait you see in someone near-and-dear, a trait you see sported on a daily basis. Thanks to the glory of random Medline search noise, I discovered just such a paper (open access!), greatly apropos my dear Amanda.

What the paper found is that explaining the modulation of hair color by two known genes of relevance, Mc1r and agouti, is insufficient. The color pattern known as brindle doesn't fit these results, and they identify a new genetic region (K) responsible for brindling.

Color phenotypes are fun to look at because they are so easy, and because they can often simply illustrate some of the more complex genetic phenomena. A classic example is tortoiseshell and calico cats, which are nearly always female and are due to X-inactivation (I think XXY male cats are viable, and therefore could exhibit these colors). In this case, the various color loci and alleles exhibit epistasis, the hiding (Gr: "stand upon") of the phenotype from one locus due to the genotype at another locus. For example, an Mc1r allele known to encode a truncated (and therefore non-functional) protein prevents any combination of K and agouti alleles from producing a black coat.

Now this is just a mapping paper, and so the gene in question has not been pinpointed. The paper points to there being ~250 genes in the interval in question, with nothing rolling over on first glance. However, they suggest that looking at non-SNP markers, such as simple nucleotide repeats, and additional pedigrees should enable a candidate to be identified.

What will also be interesting, and perhaps not make this discovery quite so benign, is whether variants at this locus affect energy metabolism. Due to some deep evolutionary links (the details of which, alas, I've forgotten), coat color mutations often result in weight regulation disorders or severe developmental defects. Both of the other genes mentioned in the paper, Mc1r and Agouti, are important areas for obesity and diabetes research, particularly since Mc1r is in a class of proteins (GPCRs) which has historically been targetable by oral medications. The beige mutant shows both weight and developmental issues, and steel (a grey phenotype) has a host of developmental problems. The brindle phenotype is not known in humans, so it will be interesting to see (a) whether the gene is functional in humans, (b) whether it has variation across human populations and (c) whether that variation is linked to subtle variations in phenotype.

Thursday, May 24, 2007

Two more farewells

Today's paper's obituaries brought the news of Stanley Miller's passing. Miller's experiment with Harold Urey is notable for many reasons. First, it sparked the whole field of abiogenesis, and second it is recognizable to many persons outside of biology or chemistry. Indeed, there used to be a video at the National Air & Space Museum of Julia Child running the experiment, cooking primordial soup (I think the video can be found in some libraries). Miller's experiment did not prove abiogenesis, nor did it prove a particular model, but it did demonstrate that interestingly complex organic compounds could be generated from simple processes that might have existed on a pre-life Earth. Indeed, Miller's work stimulated debate & testing about what the prebiotic Earth was or was not like, the hallmark of good science on the outer fringe.

The obituary also noted that Miller's thesis advisor, Nobel Laureate Harold Urey, insisted that Miller be sole author on the paper. I've always known this as the Miller-Urey experiment, but that was awfully gracious of a senior scientist, and a model not always followed. At my department at Harvard there was a story of a graduate student whose defense was snubbed by his advisor due to a dispute about failing to include the advisor on a submitted paper.

I've been meaning to note one other passing of a great pioneer. In The Right Stuff, there is a scene of the potential Mercury astronauts enduring an exhalation test, and at the end only Scott Carpenter & John Glenn are still blowing bubbles. That is now the case in real life with the passing of Wally Schirra. Schirra was notorious as a jokester, but it is also notable that when it came time to name his capsule, he picked Sigma 7, for the letter's relevance to math, science & engineering. His sigma was indeed spectacular, splashing down within sight of his recovery craft. As a kid I never understood why Schirra retired just before he would have had a lock on a slot to go to the moon. As an adult, I can begin to fathom how exhausting all the training was.

Wednesday, May 23, 2007

Death of a Laptop Bag

I had a rude surprise this morning on the walk to the train -- my laptop bag sliding off my shoulder. Miraculously (especially for first thing in the morning) I was quick enough to snag it before it dumped my machine on the pavement. One of the metal rings connecting the shoulder strap to the bag had sheared off.

I will miss that bag. It was a freebie -- a token from my visit last June to ASCO, the truly enormous oncology meeting held that time in Atlanta. A seam for a fabric divider had blown a month or so ago, but otherwise the leatherette messenger bag was in good shape. It had just enough pockets to be useful, not so many as to invite me to lose things. It also became my interview bag & my networking bag while I was making the transition away from the old shop, so there are multiple sentimental attachments to it.

My previous bag was also a freebie, that time from the ASBMB meeting (the really huge biochemistry meeting, held in Boston that time) a few years back. It had sat in a closet for a while, extending its lifetime, but it eventually was too attractive. It was a great backpack style bag, with a water bottle pouch on the side (very useful!) and two main compartments. When I could start seeing the contents through the bottom, I realized the jig was up.

The bag before that actually survived its tour of duty, but only because it was ill-suited for the task -- it was a nice L.L. Bean bag (!) my wife had given me, but the zipper went too far around -- fail to zip it completely and the zipper could work itself open, dumping contents all over. A few too many near-escapes of my laptop convinced me to find another bag.

The bag before that also was flawed, initially. It may have also been a freebie, but I think it was MLNM-provided. Its problem was again a weak snap clip that tried to loose the bag from my shoulder strap.

The bags before that have been lost to memory, but I think it ran about 1 per year, most worn to shreds.

I am hard on laptop bags. First, there is the laptop. Then, too many journals to be read on the train -- plus the periodic library book or paperback. Plus an assortment of bills & correspondence to deal with, some random printouts, etc. Often the cell phone and Palm Pilot are riding along, as well as the beloved Nano. A few key pharmaceuticals don't add much weight, but are critical -- there are those days you just need to stifle your histamine H2 receptors and COXs.

Somehow, I've been pretty lucky with the laptops themselves. There was one unfortunate case of the treacherous shoulder strap clip dumping the machine onto the sidewalk, triggering a crash of the hard drive -- with an unsaved PowerPoint, and a big chip in a corner of my last Millennium machine. The personal machine is missing the bezel from the DVD drive -- but it appears to have relied on van der Waals forces to begin with. I think I did once take out another hard drive, but in 9 years of laptop commuting it really wasn't a bad record.

So, for a few days I must carry my bag by the handles -- until they break. Maybe I'll actually get to a store. Given my track record, maybe I should skip Staples and go straight to the Army-Navy store.

Tuesday, May 22, 2007

The other question

I introduced my post on the appropriate math background for biology with the comment that the high school student at my banquet table had asked two questions, and that was one of them. I've been tardy about following up with the other question, and have even gotten a prodding in the comments.

The other question, which I hadn't really thought about until he asked it, and whose surprisingly many facets I have since spent a lot of time contemplating, was 'how important are grades'? Now that's a doozy.

My first thought is that they are very important. I never took a college course pass-fail unless it was only offered that way. I once enrolled in a class pass-fail, but changed my mind later -- which was most foolhardy, because it was clear that I wouldn't be acing the course -- it wasn't just a book course, and Martha Graham I ain't.

On the other hand, how powerful a motivator were grades in the end? They clearly were ineffective in high school, where a powerful case of senioritis set in early in my junior year. Nor were they much of a motivator in my biology coursework, where I generally did more than necessary (making up for the lack of effort in some other areas!). They were a mixed motivator in the various courses I blew off until things looked grim, and somehow always pulled my nose up before hitting ground (though on at least one occasion, the instructor was kind enough to move the ground a bit lower).

Grades were obviously not a factor in the courses I did take pass-fail, yet I still knocked myself out for the ones that were interesting and did enough for those that weren't. And they certainly weren't a driver of one of my graduate courses that I took unofficially for no credit, simply because I thought it would be a great course -- a notion amply proven.

There was also the horrible spectacle of grade-grubbing, the line of students before each exam trying to extract the precise exam questions from the instructor, and afterwards trying to wheedle every last point out. This group was highly enriched for pre-meds, though many pre-meds avoided this graceless pursuit. I must confess indulging in some schadenfreude as the worst grade grubbers got their due, and their ambitions moved from pre-med to pre-dental and finally to pre-chiropractic (on the other hand, Miss Amanda is treated by the best students: Vet schools are rarer than Med schools, and can therefore be much pickier).

But, on the other hand, there is the external issue. One of our talented staff at the new shop is being lured away by the siren of law school and has chosen Northeastern. A strength of NU is that they have a strong internship component to all their degree programs, which gives both practical experience and monetary remuneration. But, what I didn't know is the law school doesn't give grades. The catch is that when a graduate goes job hunting, the firms considering them don't have any grades to use as a guide -- just descriptions of the courses and text descriptions of the student's work. In effect, each potential employer must grade the student based on a digest of the student's work -- not altogether an objectionable concept, but certainly a bit more complicated than sorting on GPA.

I got a B in chemistry in my first college semester, and at some social event I was mildly moaning about it to the dean I knew. Instead of reprimanding me, he congratulated me: no longer would I be worried about besmirching my perfect 4.0. He felt that many good students had been unraveled by an unhealthy obsession with perfection, so a little tarnish early was a good thing.

That first B set up a pattern, one that somehow I developed an unhealthy superstition around. Every semester I would get one B, and it was always outside of Biology. The superstition came to a horrific climax when I got all the cold hard estimates of grades early -- all except one Biology course I had found challenging -- and they were all A. So some streak had to give, and for me the terror was that the one I cared about would be vanquished. Luckily, that was not the streak that ended.

A far worse see-saw with grades was in graduate school. I had one deficiency to remedy: no coursework in Physical Chemistry. Obviously this was going to be an easy class, as at least 3 undergraduate chemistry courses had covered it. Neatly neglected was the fact that the PChem material had always been the greatest struggle for the lowest grades, and even more neglected was my PChem coursework. Before I knew it, I was in a deep hole, grade-wise. I went to the instructor to try to get some help, but he mostly saw a very worried student -- and that was the time the ground was lowered. He kept saying "you'll get a B" & I kept despairing that it was mathematically impossible and anything else would scotch my funding. Somehow it finally sank in: get your act together, and you'll get the B.

How important are grades? They're useful, but imperfect. In the end, I probably would have put about the same effort into each class, grades or no grades. The grades probably 'kicked it up a notch', but just a notch. External viewers, such as potential employers or grad schools, might find GPAs convenient, but more likely you will be hired / accepted based on your connections & recommendations. If I had to do college again, I'd probably accept a few more B's and take a few classes pass-fail -- and try to play frisbee on Delaware's magnificent Mall a bit more. Coursework is important, but it isn't everything.

One last question, with regards to this blog entry -- did I pass?

Saturday, May 19, 2007

Circling back to sequence

One of the open-access papers in this week's PNAS Early Edition describes a new approach for large scale targeted resequencing. Their 'selector' technology, described previously in an open access Nucleic Acids paper, enables the amplification of specific restriction fragments from a very large population. The target DNA is digested with the restriction enzyme of interest, denatured, and then hybridized to the set of selector probes. Successful binding to a selector probe is then followed by ligation to form a circular DNA structure lacking free ends. Exonuclease treatment destroys all the non-circular DNA, and universal priming sites built into the selector probe allow amplification of all the selectors with a single pair of PCR primers.

In the new paper, the selectors were built to target 10 genes known to be mutated in cancer and the PCR products were explored using 454 sequencing. Six tumor samples were used. In any given sample about 74% of the selectors yielded at least one sequencing read. Mutations detected for p53 (HUGO:TP53) were confirmed by conventional Sanger sequencing.

This is a first report, but it does point to some challenges -- and the promise of this sort of approach. The challenge for resequencing, whether for cancer work or for genotyping important disease loci in a patient (or newborn), is to make sure you find everything you want. 74% sequence coverage is clearly not enough -- what if the allele you are interested in is in that other 26%? Furthermore, particularly for cancer resequencing, the standard was rather generous -- a single read from the amplicon. Even if you fully trusted the sequence data on a single read (yeah, right!), you obviously need at least two to properly assess a germline genotype -- and of course this is really a sampling exercise so you need a lot more to be confident you searched both chromosomes.
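To put rough numbers on the sampling point, here is a back-of-the-envelope sketch. It assumes an idealized heterozygous site where every read is drawn independently and with equal chance from either chromosome, and it ignores sequencing error and amplification bias, so real experiments would need more reads than this suggests. The function names are my own.

```python
def prob_both_alleles_seen(n_reads):
    """Chance that n independent reads include at least one from each chromosome."""
    if n_reads < 2:
        return 0.0
    # All n reads land on the same chromosome with probability 2 * (1/2)**n.
    return 1.0 - 2.0 * 0.5 ** n_reads


def reads_needed(confidence):
    """Smallest read count whose two-chromosome probability meets `confidence`."""
    if not 0.0 <= confidence < 1.0:
        raise ValueError("confidence must be in [0, 1)")
    n = 2
    while prob_both_alleles_seen(n) < confidence:
        n += 1
    return n


for n in (1, 2, 4, 8):
    print(n, prob_both_alleles_seen(n))
print("reads for 99%:", reads_needed(0.99))
```

Under these assumptions two reads see both chromosomes only half the time, and it takes eight reads to get past 99%.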

For cancer genomics this is a question of particular interest. A huge effort is rolling forward, with many critics, to sequence cancer-related genes in a large number of tumor samples. The definition of cancer-related is still murky and would depend on the technology used -- it might end up being every gene, but more likely will be a very large set of genes with possible relevance to cancer. For example, one published pilot project (in Science, not yet free) looked at 13K genes; as covered here previously other studies have focused on pharmacologically tractable gene families such as kinases. Yet another post on the subject, with links in need of repair. A key challenge is that every tumor is really a complex community of mutants, and the mutations of interest may be at a very low frequency -- the mutations which are in cancer stem cells or perhaps a rare mutation which later therapy will select for. Treating this as a population genetics problem, your sensitivity to detect a mutation will largely be a function of the number of independent reads which can be generated for a single amplicon from each sample. One interesting, but probably difficult, area of exploration would be to NOT PCR-amplify the circles prior to capturing them on the 454 (or similar next-gen sequencing technology) so that each bead represents a single captured molecule.
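The dependence of sensitivity on read count can be sketched with a simple binomial model. This assumes each read is an independent draw from the tumor's mix of molecules and that a mutant read is always recognized as such; both are optimistic, and the function names and the example numbers are illustrative only. It uses math.comb, which needs Python 3.8 or later.

```python
import math


def detection_probability(mutant_fraction, n_reads, min_mutant_reads=1):
    """Chance of seeing at least `min_mutant_reads` mutant reads among `n_reads`."""
    p_too_few = sum(
        math.comb(n_reads, k)
        * mutant_fraction ** k
        * (1.0 - mutant_fraction) ** (n_reads - k)
        for k in range(min_mutant_reads)
    )
    return 1.0 - p_too_few


def reads_for_detection(mutant_fraction, confidence):
    """Reads needed to see at least one mutant read with the given confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - mutant_fraction))


# A mutation present in 1% of the molecules sampled from a tumor:
print(reads_for_detection(0.01, 0.95))
print(detection_probability(0.01, 100))
print(detection_probability(0.01, 1000, min_mutant_reads=3))
```

Even in this idealized model, a variant at 1% needs roughly three hundred independent reads of the amplicon for a 95% chance of showing up once, and more if you insist on several supporting reads to guard against sequencing error.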

Wednesday, May 16, 2007

What Math for Biology?

At the banquet I attended recently, the high school senior I sat next to asked me two very interesting questions about his future education, one I had thought a lot about and one that I hadn't.

I'll tackle the first one now: what math should he take in college?

My own B.S. program required only Calculus. To be honest, I've never actually had to use calculus in my studies; the one time I attempted to integrate a function it turned out to be one without a closed-form antiderivative! But I would hesitate to ditch Calculus altogether, because it is a powerful concept and the concept of integrating shows up so often.

The easy recommendation was statistics. Of course, one needs a good statistics course -- the one course I took was awful. The main problem was that it had been dumbed down to meet the perceived weaknesses of the business majors for which it was a requirement, spreading over three semesters what should have been no more than 3/4 of a semester. It was only later I realized that I should have taken the Psych department's stats class, as it emphasized experimental design. Ideally, a modern stats course for biology would give a heavy exposure to Bayesian statistics and touch on topics that may see limited relevance in other areas (such as the Extreme Value Distribution -- important to search programs).

IMHO biologists would also be served well by a survey of various mathematical disciplines, a survey that would emphasize understanding of the key concepts over being able to execute all the calculations. I realize that is probably heresy in many math circles, but it would be very useful. For example, basic topology and graph theory -- everyone should understand the difference between a Directed Acyclic Graph and an Undirected Cyclic Graph without being able to prove theorems about them. You're going to run into eigenvalues constantly in the bioinformatics literature; demystifying them early is important. At home we have a great textbook (my mother collected them for tutoring) called Mathematics: A Human Endeavor, which has just the right flavor.
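As a taste of why the "acyclic" part matters in practice: a directed acyclic graph can always be laid out in an order where every arrow points forward, and a graph with a cycle cannot. The sketch below uses Kahn's algorithm, a standard technique I am swapping in for illustration; the node names are arbitrary examples.

```python
from collections import deque


def topological_order(edges):
    """Return nodes in dependency order, or None if the directed graph has a cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    # Start with every node that nothing points to.
    ready = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Nodes stuck on a cycle never reach indegree zero, so they are never emitted.
    return order if len(order) == len(nodes) else None


dag = [("gene", "transcript"), ("transcript", "protein")]
print(topological_order(dag))
print(topological_order(dag + [("protein", "gene")]))
```

No theorem-proving required: the first graph yields an ordering, and adding one arrow back to the start makes an ordering impossible.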

It should come as no surprise that I believe every science student should have an exposure to computer programming. Even if you don't ever write code for fun or profit, the underlying thinking is very useful. Understanding the fundamental data structures (tree, linked list, etc) and some common algorithms should simply be part of everyone's intellectual foundation.

Given the opportunity (make me education czar!), I would, of course, like to see these things pushed far lower. My own math education, which I think is typical of the U.S., was far too unaggressive and bored the heck out of too many people too quickly. Subjects like plane geometry need to be distilled down to their essence, with the remaining time filled with a taste of programming, which involves similar thinking to mathematical proof execution. It appalls me now that at an early age I learned how to compute mean and median, but nowhere in grade school was it pointed out when one might be superior to the other (especially since this distinction is so routinely ignored by the media).
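The mean-versus-median point takes only a few lines to demonstrate. The numbers below are made up: five modest incomes and one very large one, in thousands of dollars.

```python
import statistics

# Hypothetical household incomes (thousands of dollars); the last is an outlier.
incomes = [30, 32, 35, 38, 40, 1000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:   {mean_income:.1f}")
print(f"median: {median_income:.1f}")
```

A single outlier drags the mean far above what any typical household in the list earns, while the median barely notices it. That is the case where the median is the better summary, and the case news reports about "average" income or home prices so often gloss over.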

One last comment: if my math education was representative, then too many math programs fail to stimulate real excitement in math. I do feel it is important to learn to compute certain things, and you really need to have the basic multiplication tables memorized to have any luck at algebra, but too much drilling flattens the excitement. I was disappointed, but not surprised, that my tablemate had never heard of the Seven Bridges of Königsberg. I doubt he had been exposed to snowflake curves (a simple-to-sketch fractal) either, a concept that blew my mind back in 3rd grade or so -- but it wasn't in the curriculum. A little less drilling, a little more mind expansion & math education would be on the right track.
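For anyone who missed the snowflake curve, the mind-blowing part fits in a short calculation. This sketch assumes the usual Koch construction: start with an equilateral triangle and, at every step, replace the middle third of each side with a smaller outward-pointing triangle. The function name is my own.

```python
import math


def koch_snowflake(side, iterations):
    """Return (perimeter, area) of a Koch snowflake after the given iterations."""
    n_sides = 3
    length = side
    area = math.sqrt(3) / 4 * side ** 2  # area of the starting triangle
    for _ in range(iterations):
        length /= 3  # each segment shrinks to a third
        # One new small triangle is glued onto every existing side.
        area += n_sides * (math.sqrt(3) / 4) * length ** 2
        n_sides *= 4  # and every side becomes four
    return n_sides * length, area


start_area = math.sqrt(3) / 4
for n in (0, 1, 5, 20):
    perimeter, area = koch_snowflake(1.0, n)
    print(n, round(perimeter, 3), round(area / start_area, 4))
```

The perimeter grows by a factor of 4/3 at every step and so increases without limit, while the area settles down toward 8/5 of the original triangle: an infinitely long fence around a finite yard.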

Tuesday, May 15, 2007

Here a pedia, there a pedia

There's a lot of science blog activity debating the utility of Wikipedia (e.g. Eye on DNA, Wired Science, Science Roll, Epidemix), and I've never been one to not jump on a good bandwagon. In general, I think the various viewpoints agree more than they disagree, but there is a significant divergence of opinion on whether Wikipedia can be saved from itself. I will boldly and decisively plant my flag in the mushy middle.

I find Wikipedia very useful as a general reference (and often link to it from here), but it is clearly flawed. The anonymous nature has always given me the willies. In general, I have found the information pretty good, but often maddening. For example, I was looking up a particular branch of Protestantism today (in the interest of general knowledge), and unfortunately the entry was written by a true believer (and not flagged) -- surprising, since usually WP bends over backwards to flag everything not written in a neutral style. A friend once asked me for some help for her daughter's high school project on serotonin, and I was surprised to find the Wikipedia entry lacked any history of its discovery (but did I edit it then -- no! -- however, it does look like it's gained a little bit on the history). Some Wikipedia entries have absurd levels of depth, whereas others are too shallow.

Back in grade school the library always had multiple encyclopedias, which often had different strengths and levels of detail. At home we had both Junior Britannica and Encyclopedia Britannica, though both were purchased before my entry into this world. I quickly learned that my then favorite topic was badly truncated, with the big one ending during the early Mercury shots and the Junior set going into Gemini (I think). Wikipedia is amazingly up-to-date, with events showing up there before the newspaper with the same info can hit the street. But in any case, the world is probably best served by competing encyclopedias.

An alternative to Wikipedia has been launched called Citizendium, and the model has some interesting differences from Wikipedia. Contributors will be non-anonymous and much more limited in number, and generally chosen for recognized authority. Alas, Citizendium is pretty limited in coverage. Searching it for topics from my last post, things are pretty bad. 'Ailuropoda' brings up nothing, whereas the first hit for 'panda' is to an article on creationism (those poor hypercute ursids, commandeered for pseudoscience!). 'Dodder' brings up articles on horizontal gene transfer. Wikipedia's article on serotonin is flawed, but Citizendium's is non-existent, though there are articles on neurotransmitters and Julius Axelrod. Even more striking, 'Watson-Crick' pulls up nada. Yikes!

What is potentially interesting is that Wikipedia is on a 'copyleft' model, which theoretically means Citizendium could (and plans to) use Wikipedia as a major building block for their effort. So, it is possible that Citizendium will evolve to largely be a buffered version of Wikipedia, slower to be updated and smaller in scope, but with the worst excesses filtered out. Of course, the copyleft also means Wikipedia can raid Citizendium for the improvements. Let's hope this works out to a virtuous cycle.

For the sake of completeness I feel I must include the other heavily publicized encyclopedia effort, but I can't recommend it. Conservapedia is an explicitly ideological view of the world. In particular, it attempts to be even-handed towards various branches of creationism. So the article on the speed of light (at this time of writing) touches on the 'minor' problems it creates for young earth creationism and cites only a creationist tract, but doesn't discuss Michelson-Morley at all. We learn that kangaroos originated in the Middle East. Now, if you dig you can find that some folks who like the general concept but not the anti-science are trying to contribute.

Wikipedia is flawed, but what to do? The eager can try to reform Wikipedia or lead the expansion of Citizendium. But if Don Quixote is your hero, then perhaps you'll try to keep Conservapedia on the straight-and-narrow.

Monday, May 14, 2007

What's left to sequence?

The era of complete organismal genome sequences is just over a decade old. Somewhere on another blog (alas, I cannot find the page) there was a rumor that the National Human Genome Research Institute is no longer soliciting white papers for genome sequencing, but rather the funding focus will shift to standard funding mechanisms. If true, this may mark the end of the era of genome sequencing for the sake of understanding genome sequences.

The initial sequences had quite a rush of excitement to them, because they were blazing new territory. Even before the E. coli genome was finished, the early rushes on it revealed exciting details of genome organization. Haemophilus gave the first picture of a complete genome and Mycoplasma genitalium a hint at a minimalist genome. Saccharomyces, Caenorhabditis and Homo each represented a huge new milestone in genome complexity. As genomes came rushing out it became harder and harder to track what had and had not been completed, and harder still if you expanded the list to what was well underway. Since it wasn't among my specific professional needs, my personal tracking grew very sloppy.

It's gotten so that now I'm surprised when something is sequenced more by the claims of how the genome is novel than the actual content. What, we hadn't sequenced one of those yet? If the genome sequencing era is entering a plateau, what has been overlooked?

Of course, a key question is what makes a genome interesting to sequence. Obviously, if the organism is important to you, then seeing its genome sequenced is important. For example, our household's next generation of omicist will be quite disappointed to learn that his beloved Ailuropoda melanoleuca hasn't made the cut. Similarly, had the new Massachusetts biotech proposal come in the era of genome sequencing excitement, perhaps we would see as its centerpiece the sequencing of local favorites Homarus americanus and Gadus morhua (favorites, that is, with drawn butter and beer batter, respectively).

So one criterion for genome sequencing has obviously been those species which are very important to a lot of people, or at least a lot of biologists. Another has been genome novelty, or the probability that the genome will on its own tell a very interesting genomic story, in terms of organization, content or evolutionary history. And another has been the use of genomes for cross-comparison. In this last category, it could be argued that the mammalian space is pretty well covered now.

But was anything overlooked? To ask the question with the criteria above is to, unfortunately, severely probe my ignorance in a number of fields. For example, what is the most serious (perhaps measured by worldwide deaths) bacterial pathogen yet unsequenced? How many genera of bacteria or archaea have a currently culturable member but no member sequenced? What is the most important (economic value as a proxy) industrial microorganism to not yet reveal its genomic secrets?

Even if I stick to eukaryotes, I find my understanding lacking. Despite a lifetime of gardening & outdoors exploration, I'll confess mostly ignorance in the plant arena. Of course, a lot of economically relevant plants have been tough going due to high repeat content (with maize as the poster child for this issue) or have large genomes with high ploidy (many important wheat varieties are tetraploid or hexaploid). Despite this, I will propose below one plant that seems to have been missed. Fungi are another area where I am more ignorant than knowledgeable (luckily, fungal genomes have their own blog!).

Furthermore, as one might hope, the people who think about this stuff for a living have covered most of the obvious stuff. Clearly it would be useful to round out our sampling of chordates with a jawless fish and a cartilaginous fish, and indeed a lamprey and a shark are in progress. The remaining orders of mammals, particularly the monotremes (egg-layers), are getting their due. Early animal evolution, with all sorts of interesting questions about the appearance of genes relevant to complex developmental processes, is being covered with a sponge and a hydra. Lichen fungus? Covered! Flatworm -- yes! Earthworm -- no, but a leech covers the segmented worms.

I fully expect that someone will point out half or more of these as actively being sequenced, but here is my list of possible oversights. I tried to hit Google with a reasonable number of searches, but there's always one more to do. Furthermore, I hope -- no, I challenge -- readers of this will make cases for other species.

Euglena is the original taxonomic paradigm-busting organism, neither clearly plant nor animal. It is now believed (last I heard) to be an ancient symbiotic fusion of a trypanosome-like organism and a photosynthetic species. There has been an EST project, but apparently no full genome sequence.

Genome sequencing has been applied to a number of insects and at least one tick, but this hardly covers the amazing variety of arthropods. Certainly it would be interesting from an evo-devo angle to have some more. Millipede would be an obvious one, and of course (as noted above) many crustaceans are economically important.

Many plants rely on symbiotic fungi on their roots to extract various nutrients from the soil; one (or a few) of these would give a window into what is entailed in this particular symbiosis. Another fungus that might be interesting would be the one cultured by leaf-cutter ants.

Dodder is one weird-looking plant, due to its lack of photosynthesis. I've periodically stumbled across patches of the stuff, and it looks more like a human creation or the spinnings of a deranged extraterrestrial spider than a plant -- tangles of orange thread-like tendrils. Dodder's chloroplast genome was sequenced quite a while back, and I'm surprised the nuclear genome hasn't followed -- parasites seem to always have quirky genomes and a lot to say (by comparison) about their non-parasitic relatives.

Finally, there is what I wish I had thought to propose for the 454 sequencing contest: a rotifer genome project. Rotifers are tiny invertebrates which are found ubiquitously in standing water. What is striking is that one of the two large subdivisions of rotifers, the Bdelloids, shows every evidence of avoiding any sexual reproduction or recombination over vast evolutionary time. This long-term clonality has fascinating effects on the genome -- each copy of each gene in this diploid genome is evolving on its own trajectory. For comparison, at least one sexual rotifer species should be sequenced.

Thursday, May 10, 2007

Rewarding Failure

Yesterday I blogged on Governor Deval Patrick's $1B biotech proposal and suggested that perhaps the money should be spent on something other than grants. Today I'll offer a modest suggestion on how to run the grant competitions.

Give it to the losers.

Which losers? Failed grant applications for NIH grants, that's who.

Of course, not just any failed grant should be submitted -- only the best will make it. This isn't an attempt to simply have a consolation prize, but rather to address the oft-lamented failing of grant competitions, the fact that truly novel & innovative research is often rejected.

My process, if implemented, would attempt to be very efficient with researchers' time. The grant proposal would be resubmitted, with the prior reviewers' comments, with a short description of what parts of the original proposal would be attempted under the new grant. No rewriting allowed.

Of course, not just any failed grant would be eligible. Obviously since this would be a state program the researcher must be Massachusetts based; such parochialism is foolish but politically necessary. New England could probably compete better with California if the various states worked together, but instead you see the mayor of Boston threatening to sue a New Hampshire airport for putting Boston in its name. Rhode Island has an active effort to poach biotech companies from Massachusetts, but their most heralded catch (Alpha-Beta) had the bad manners to go bust after moving.

Furthermore, not all failures need apply. The goal is to find the interesting rejections, the ones bounced because the grant reviewers lacked sufficient imagination. Not the grants that are 90% likely to succeed, but the proposals with a 10% chance of success -- but that potential success would be spectacular. Or, proposals where a scientist successful in one sub-specialty is attempting something in a very different one -- precisely the type of cross-pollination which leads to radically new thinking. Of course, the proposals would need to address important topics -- no stamp collecting expeditions, no routine cloning studies or genome sequencing.

High risk, high reward -- that's the requirement. Most of these won't produce direct results, though they will help train new scientists. But if a few go big, then everyone benefits -- and perhaps the next Amgen or Google will be based in Massachusetts.

Wednesday, May 09, 2007

Deval's Big Biotech Play

Massachusetts governor Deval Patrick announced today a 10 year, $1B plan to encourage further growth of the biotech sector. The money will be used on a variety of programs, including grants and several showpiece centers, particularly a stem cell bank to be housed at U. Mass Worcester Medical Center. Much of the funding would target stem cell research and RNAi (Craig Mello, one of the Nobel laureates for RNAi, is also at the medical center).

A big driver is the fear that Massachusetts will lose future biotech activity to other states, particularly California with its huge stem cell initiative. As mentioned previously, all sorts of states and countries are here at BIO trying to poach companies.

I wonder if this approach will really address the most important needs for Massachusetts biotech. More research funding is nice, but is that really the key issue?

For example, one could imagine funding programs to educate high school students in biotech, particularly laboratory technique. Making biotech retraining available to adults through community colleges and adult education programs would be another worthwhile activity. Both of these would create a large reservoir of potential technicians, and many of these skills are transferable elsewhere. In a similar manner, one could imagine programs to retrain health care workers, such as nurses and med techs, in relevant areas. This isn't an attempt to rob the healthcare Peter to pay the biotech Paula, but rather would reflect the reality that many such professionals burn out or want career changes for other reasons.

The governor might also target the permitting process, which is routinely bemoaned by the construction industry. Builders shouldn't have carte blanche, but reducing the complexity of rules and processes (especially across jurisdictions) would enable new research (and other) buildings to go up faster. Taking a look at local laws would be helpful; Cambridge's neighbor Somerville missed the biotech boom because into the 1990s (and perhaps later; I never did hear of a repeal) it was illegal to recombine DNA in Somerville.

It would also be worth looking at the growth of future biotech in Cambridge. Most research activities in Cambridge are in an irregular 2-mile or so long swath bounded by residential neighborhoods, MIT and the Charles River, with a large railroad & road corridor forming the extreme end. It is easy to scan that region with a builder's eye and see that there aren't many more parking lots or dilapidated buildings that will be easy to do away with (other than in the I-93/MBTA corridor, which is just starting to be built out). Throw in some reasonable urban mix of housing and retail, and there's only so much growth potential left -- particularly given Cambridge's aversion to very tall buildings & the very appropriate desire to have low-rise buffer zones on the edges of older residential neighborhoods. When Cambridge is truly full, where will the activity go? That might be a good question to start asking -- there is plenty of other biotech scattered in the state, but the concentration in Cambridge would argue there is advantage to density.

Transportation would be another important issue to tackle -- the Boston area already has a strained transportation system. The Cambridge biotech district is crossed by a number of bus lines and one subway line, but you can be somewhere in the Boston area with good transit service and not have good connections to Cambridge. One proposed transit line, the Urban Ring, would run smack through the middle of the district and connect it to the airport and the Longwood Medical Area.

In other words, perhaps the best way to support biotech in the state is to stick to some more traditional domains of state government. It might not be as sexy, but it might have a bigger impact and more beneficial side effects.

Tuesday, May 08, 2007

A Tale of Two Drugs

The BIO convention has unleashed a flurry of opinion items on drug pricing bemoaning the high prices of drugs coming out of biotech. While few are as absurd as the one Derek Lowe skewered today, they are coming from voices taken very seriously. Marcia Angell, former New England Journal of Medicine editor-in-chief, was quoted as saying that biotech companies can charge whatever they want due to holding monopolies on their treatments, whereas today's Globe had an op-ed from Harvard prof Jerry Avorn all but proposing a tax on biotech drugs to be earmarked for NIH funding.

In the end the claim is that drug prices could be much, much lower but real pharmaceutical innovation would be preserved or even enhanced. Angell in particular seems to be fond of the White Queen's habit of believing impossible things; a run of Op-Eds from her this year in the Globe first excoriated the pharmaceutical industry for spending R&D dollars on 'me-too' drug development and then celebrated that the availability of multiple closely related compounds enabled large payers to bargain with pharma companies over dispensary prices. Angell also fails to explain why companies which can charge 'whatever they want to' don't charge more -- are they idiots? Or is the world really not so neat and tidy?

I'd like to use the stories of two drugs from my former shop to illustrate how complex reality is. The two drugs are Velcade and MLN02, both entering Millennium's portfolio through the acquisition of Leukosite. Two or so other drugs from other companies will also make appearances, though that's getting ahead of ourselves.

Velcade is Millennium's biggest drug. Nobody can claim it is 'me too': it is the first proteasome inhibitor ever to enter the clinic, and is still the only approved one. Velcade has been approved to treat two cancers of B-cells, multiple myeloma and mantle cell lymphoma. That short list is not for lack of trying: between Millennium, the NCI and individual investigators it has probably been thrown at virtually every known cancer, alone or in combination with standard chemotherapy agents. Positive signals are few and far between and have the nasty habit of disappearing once the trials get large. In many cases, the right dosing or combination may not have been found yet, so oncologists continue to explore Velcade even in indications where it has not succeeded previously.

Now Millennium and its pharma partner J&J do market Velcade, but the effort is quite modest (I don't have numbers, but the U.S. sales force I think is a few hundred). Millennium continues to plow cash into Velcade trials in the hopes of hitting a significant jackpot; MM & MCL are important but won't drive sales to the stratosphere. Velcade was the first agent in over forty years to demonstrate a survival advantage in second line myeloma. Yet Millennium's stock price is stagnant and the company is trimming expenses annually. So why isn't Millennium in clover?

The answer quite simply is competition. Celgene had thalidomide, with its dark history, and thal is quite useful in myeloma. Thal is oral, whereas Velcade is injectable, and convenience wins all other things being equal -- and at the moment there is no hard evidence to say the two drugs aren't comparably effective. But what's really knocking Millennium around is the thalidomide follow-on Revlimid, which is claimed to be significantly less teratogenic. Now Rev is a follow-on and chemically related to thalidomide -- is this a me-too? Revlimid is also oral and is already looking good in front line trials of myeloma, the setting into which Millennium hoped to expand Velcade -- that expansion will probably be successful, but it will be the same dogfight all over again.

Note that this competition has some very real effects. It is chic in some circles to sneer at worrying about stock prices, but that stock is a very real mechanism for raising money to plow into further R&D. Millennium is still an independent company because it was lucky enough to issue stock near the peak of the biotech bubble; money is still marching out the door faster than it marches in.

Now let's look at MLN02, another drug with an interesting story. MLN02 is an antibody which targets certain integrins, heterodimeric extracellular protein molecules important for the recruitment of immune cells to sites of inflammation. MLN02 targets an integrin believed to be specific to the gut, and so might offer a very specific approach to downregulating excessive immune activity in ulcerative colitis and Crohn's disease.

MLN02 has had a rocky history at Millennium. Leukosite had partnered with Genentech on the drug, but Genentech later bailed out. A paper was published in the New England Journal with the results of a large study, but even these results are not as clear as one might like. In any case, the development of MLN02 has at times been a top priority on Landsdowne Street, but at other times the drug was essentially tabled. Why?

A lot has to do with the competitive landscape. Crohn's and UC are not huge markets, so dividing the market up isn't very attractive -- especially if the competitor gets there first. The first entrant advantage is quite large in pharmaceuticals. So the tea leaves are read daily -- and the newswires scanned obsessively -- to see what the competition is up to.

One development which iced down MLN02 enthusiasm greatly was the accelerated development of another biotech company's integrin targeting drug -- and that company was large and successful. Their drug would go first for another indication, but might hit Crohn's and/or UC before MLN02 could be expected to get there. With lots of safety data from the other indication & some of the same docs prescribing in both areas, the deck would be stacked against a new entrant -- even if MLN02 had the theoretical advantage of being gut-specific. Launch of the potential competitor in the other indication ahead of schedule did not help matters any.

What heated up MLN02 interest again was what happened to that competitor, Biogen's Tysabri. Tysabri works in MS, but in a very small number of patients a lethal viral infection was enabled by the drug. Suddenly, the competitive landscape was altered -- though with a new regulatory challenge of convincing the regulators that MLN02 really doesn't alter lymphocyte trafficking in the brain.

To some degree, the numbers folks were daily running estimates of what the expected gain from MLN02 would be, given the competitive landscape (I've left the other big player, Remicade, out of the story -- and it is probably going to waltz all over these markets). Even when Tysabri was in trouble the models suggested that MLN02 might end up being a money pit after all -- depending on its efficacy and the price payers were willing to pay for it. Biotech has proven many times it is possible to fail by succeeding; your drug works, but not well enough to make it to market -- and there are no money-back guarantees on clinical trials.

Like it or not, money is the lifeblood of pharmaceutical development. Trials are expensive. No matter how much you hacked away at marketing or executive salaries at Millennium, the brutal reality of costly trials and ever changing competitive markets would prevail. We might want to pay less for new medications and perhaps through price caps or other government fiats society may accomplish this desire. But to claim that new drugs will continue to flow as before is to ignore the real world -- the drugs that go forward are those predicted to pay for their development costs and cover the money sunk into expensive failures. Cut the reimbursement rates and you inevitably negatively change the risk-reward perception for every project in development. Some will survive, but many, particularly the MLN02s of the world, will not.
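The risk-reward arithmetic is easy to sketch. The toy Python model below is my own illustration, not anything resembling a real portfolio model: the probability, cost and revenue figures are entirely hypothetical, and it ignores timing, discounting and competition. It only shows how a project that clears the bar at one price can fall below it when reimbursement is cut.

```python
def expected_gain(p_success, revenue_if_success, trial_cost, price_factor=1.0):
    """Expected payoff of one project, in millions of dollars.

    The trial cost is paid whether or not the drug works; revenue
    arrives only on success, scaled by the price payers will accept.
    All inputs are hypothetical illustration values.
    """
    return p_success * revenue_if_success * price_factor - trial_cost

# A made-up long shot: 20% chance of success, $2,000M in lifetime
# revenue if it works, $300M in trial costs either way.
baseline = expected_gain(0.2, 2000, 300)

# The same project if reimbursement is cut by 30%.
after_cut = expected_gain(0.2, 2000, 300, price_factor=0.7)

print(f"baseline expected gain:  {baseline:+.0f}M")
print(f"after a 30% price cut:   {after_cut:+.0f}M")
print("go" if after_cut > 0 else "no go")
```

With these invented numbers the project flips from a modest expected gain to an expected loss. A safer bet with a higher chance of success would survive the same cut, which is the sense in which the marginal projects are the first to go.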

Amazing Teachers

I had the distinct honor last weekend to serve on a panel of judges evaluating ten finalists for the Genzyme-Invitrogen Biotech Educator Award. Indeed, Genzyme's hospitality was the reason I got to marvel at the interior of their building.

WOW.

That's the best reaction to viewing all ten of the applications, which included a video of a lesson with annotation & reflection, a 10 minute presentation and 10 minutes of questioning by the panel. The submissions came from across the United States (though the contest is not restricted to the U.S.; I believe one past winner was from Canada), from near biotech centers to rural communities far from any biotech company. Some schools were small, some were large. All had found creative ways to cope with the expense of biotech and the challenge of catching the attention of high school students.

The union of the presentations contained an amazing array of ideas. Parents' nights where the kids become the mentors, dancing out the act of translation, demonstrating enzyme kinetics with nothing higher tech than paper, running experiments with enzymes purchased at the grocery, students creating instructional videos on laboratory technique -- even high schoolers running the show when third graders come for a day.

What was particularly impressive was how these programs were not limited to elite academic science geeks (I can say that; I am one). Many of the programs were in Vocational-Technical schools, and many were in schools with high populations of economically disadvantaged children. Non-scientists were lured in by bioethics discussions and art contests.

Now I'm a few years post-secondary school, but I had nothing like this. A lot of the topics covered (and experiments run) I didn't hit until college -- of course, a few (e.g. PCR) weren't invented until then. I couldn't help a friendly chuckle at the exercise on Sanger sequencing; that's one lesson to be radically rewritten in the near future.

Does it make a difference? I believe so, and I have one data point to back it up. Later during the weekend I was describing this to a friend of ours, and her fourth-grader perked up. I asked if she knew about DNA, and she had heard of it. I asked how she liked science class (we live in a well-funded school district), and her smile dropped to a frown. "Mostly reading" (deeper frown) -- "No experiments!".

A bunch of themes reappeared in presentation after presentation. Money is a challenge, due to the expense of equipment and reagents (at least one teacher scans the paper for biotechs going through downsizing to find opportunities to gain gear!). Rigid state standards & testing requirements make some lessons difficult to fit in. Many teachers reported success enlisting teachers in other disciplines to cross-plan lessons, but it's not at all easy to accomplish. Surprisingly, none mentioned problems with controversial topics, and many of the lessons probably wouldn't have flown in my staid high school.

At the awards banquet I got a chance to see two other high school student engagement efforts: the Biodreaming poster competition (sponsored by Dow Agrosciences & Lilly) and the BioGENEius research contest, two of whose young participants were at my dinner table. I didn't have an opportunity to meet any of the minority scholars there, but a lot of young men and women stood when they were recognized.

Many of these teachers were trying to propagate (metastasize?) their programs to other districts or were participating in state or national standards setting efforts. Let us hope that they succeed magnificently so that all students have the opportunity to be excited by our science, and even if not excited by it they leave high school with a background which enables them to make reasoned choices as consumers and voters.

Friday, May 04, 2007

Biotech Buildings

I had the opportunity today to attend an event at the Genzyme Center and boy is that building a stunner. A soaring atrium contains mobiles which cast rainbows all over the space. There are watercourses and plantings at ground level, glass elevators -- more of a hotel lobby than an office building.

Novartis' Cambridge facility also has a nice atrium, though a bit more staid. Cell Signaling Technology's lobby on the North Shore resembles a small jungle.

Biotech buildings in Cambridge are a mix. Some renovated older buildings are quite attractive, and some really are pretty plain. New buildings are a mix too. Space is precious, so those atria really shout 'we can afford it!'. Millennium's first custom building (75 Sidney) had a small atrium with a spiral stair (alas, not a double spiral!), but the later buildings used decorative ornamentation (granite in the bathrooms!) and non-rectangular walls in place of unusable air space.

Much as old banks built solid buildings with serious marble & columns to emphasize their solidity & seriousness, so too does a flashy building speak of a company's confidence in its future. Of course, such confidence is all too often misplaced. As a graduate student I watched Hybridon's headquarters emerge from a rehabbed tire warehouse, but then at Millennium I got to see the gorgeous inside -- because Hybridon was subletting the space to us. One company going up, another going down. Later, Millennium started shedding space and discovered that a two-story atrium with a staircase looks nice, but doesn't make subletting the building a floor at a time practical without some changes.

It is also useful to be skeptical of some of the touted benefits of architecture. I am a fan of good architecture, but what looks good doesn't always work well. I love seeing Frank Lloyd Wright houses, but living in one is reputed to require some getting used to. All-glass conference room walls may emphasize openness & light, but sometimes you don't really want to be a goldfish. MIT's Stata Center is very funky, but just try finding an office in there (and worse, the interior is dead for at least one major cellphone carrier, meaning you can't be guided in).

In the end, some is just a matter of taste. I actually had the privilege of living in a famous bit of architecture for a semester, a Gropius-designed dorm at Harvard. I loved it; most students hated the small rooms. Plus, noise propagated dreadfully (our late night card games were often shut down) and it really didn't work well as a co-ed dorm - only one bathroom per floor, and those were definitely not suitable for unisex use. Worse, you had to go through the stairwell to get to another floor -- and the stairwell was keyed. Don't forget your key at night, or you get locked in a fishbowl in your PJs!

Do architectural gems translate to a happier, more productive workforce? Or are you stuck with a museum piece which resists change? I don't have a crystal ball -- though perhaps you can find a conference room which looks like one.

Where Biotech?

Biotech's big industry trade group, BIO, is convening in Boston this weekend & that means lots of dough to various media and advertising groups. The radio ads claim 20K biotech leaders will be here. One very visible consequence is the billboards around town urging biotech companies to relocate to Las Vegas.

Even without the convention, there are regular TV and radio spots with Jeff Daniels urging life science companies to relocate to Michigan. Rhode Island has made specific attempts to schmooze companies to head south (alas for them, their biggest success, Alpha-Beta, moved just prior to a clinical trial failure and the company going bust).

The Nevada ads tout 'No Taxes', ignoring the fact that biotechs generally don't pay taxes -- because you have to make money to pay taxes, an extreme rarity in biotech. Actually, I'd bet the cost structure for Nevada is probably lower across the board -- cheaper electricity, lower heating costs (but higher cooling -- perhaps a wash?), certainly cheaper housing. Yet biotech is clearly strongly clustered. Are there any biotechs in Nevada?

I'm not trying to knock Nevada, or Michigan, or anywhere else. It's just that the secret to getting biotechs to grow is elusive. A lot of the genomics companies started near big genome centers -- but why didn't Oklahoma reap companies from the center there? Big research universities are important -- but many big research universities do not have a garden of biotech in their neighborhood. Why is our local biotech largely in urban areas, whereas in Pennsylvania it seems to be entirely suburban? Even the area where I grew up (a region named for the 24th letter of the alphabet) has a cluster of biotechs. Why is non-Cambridge biotech around here largely west of town, whereas similar regions to the north and south have little to none? Worcester MA has a number of companies, but similarly distant Providence RI or southern NH have very few. Why in the Midwest is it easy to name companies headquartered near U Wisconsin, but not U Illinois?

I don't have the answer, and I suspect each region has a different answer. The U Mass Medical Center in Worcester may have pulled companies out that way, whereas the Philadelphia area doesn't have a major academic research environment just outside the city.

One thought, and one which won't make the politicians trying to seed their own biotech clusters happy: what you need to get lots of new biotech is to have some old biotech. When companies grow and grow, they slowly shed talented people who often stay in the same region but start new ventures. And when companies crash-and-burn, a lot of people are looking for new opportunities. At the old shop we had large cohorts of persons previously at Biogen or Genetics Institute or Genome Therapeutics, and each time those companies went through convulsions a few more came on. Now, of course, every company in Cambridge is riddled with ex-Millennium hands, and many who learned the ropes of business there have gone on to start small companies. In addition to a bunch of folks I knew in my old life, my new shop has clusters of people from other former employers.

In the forest, when the elements topple an old tree the opportunity for new trees is created. The old roots may sprout new shoots, more sunlight comes in, and most of all the rotten log returns its resources to the surrounding soil. The analogy, like all analogies, is imperfect, but it's the same in biotech. The marketplace's creative destruction is a powerful force, but in order for it to create it needs something to destroy. States and localities wishing they had more biotech companies should continue their efforts, but temper their expectations, as those that gots gets and those that ain't gots gets slowly.

Thursday, May 03, 2007

Where can I find a good yurt?

Last Friday was move day -- my first at the new shop. If you are in computational biology in industry, expect to move.

Of course, a lot of folks move. Companies grow (yea!) and shrink (boo!) and relocate (sometimes yea!, sometimes boo!). I felt like I moved a lot at the old shop, but in reality many others moved more often.

However, it is easy to recognize that a computational position involves the least attachment to place. Find me WiFi and I can be pretty near top effectiveness. Today I discovered a machine screw in a tire of one of our cars -- so a trip to the repair shop. But they have WiFi! -- so waiting could be (and was) productive time. Need to deal with a child care gap -- WiFi! Doctor's appointments splitting up your day? -- find a cafe and connect!

But the downside is that your office is (rightly) seen as easily movable. The best place to park is near your laboratory collaborators, but when push comes to shove, the facilities folks will move you. Office space is cheap and more widely available; lab space is dear & very restricted (in Cambridge, chemistry labs can't be above a certain floor). Other office-only departments are similarly afflicted -- legal, finance, etc.

So one learns to expect to be a nomad. This move was luckily only a few floors, plus, between my short time here & discipline, I actually haven't accumulated much. It's also for a very short stay -- before long, more moves for bigger quarters and subsequent reshufflings of the stay-behinds. Who knows which category I'll be in? But you can't say it enough: moving because the company is growing far beats moving because the company is shrinking.
