09 June 2011

Explaining BlueSEQ

This is my (non-snarky) attempt to convince Anthony Fejes that BlueSEQ might be a good idea.

Anthony Fejes, in the process of doing a phenomenal job of real-time blogging the Copenhagenomics talks, expressed his concerns and doubts about BlueSEQ's business plan. For those of you who don't know, BlueSEQ is a Danish startup that is attempting to create a neutral next-generation sequencing (NGS) exchange. Their goal is to match researchers who have a sequencing project with service providers who have excess capacity on their sequencers.

Take a minute to read Anthony's comments (but please come back when you're done).

His main concerns seem to be:
1) Why would service providers want to participate?
2) How can BlueSEQ standardize NGS in a meaningful way?

Let's tackle the first point. Why would providers want to be a part of this? I'll give you my opinions, but the simple fact is they ARE interested. BlueSEQ has already signed up over 20 providers and they're getting more requests all the time. Providers really want access to these customers to help them drive and expand their business. It also helps them optimize their workflow. For example, if they're in the middle of a large human genome resequencing project and an internal user wants to run a 5-sample small RNA-Seq experiment, they could outsource the small project so as not to interrupt the large one (which would require a completely different setup). Alternatively, if they're running some ChIP-Seq projects, they may want to go onto the exchange to find enough additional ChIP-Seq samples to fill up an entire 5500xl SOLiD or HiSeq 2000 run. Basically, it gives them more customers and more flexibility.

Now to the second point. Anthony correctly points out the folly of trying to standardize such things as the bioinformatics analysis process. Fortunately that's not what BlueSEQ is trying to do. In talking with dozens of service providers, they've found that a big issue providers face is educating potential customers about what NGS can and can't do, and translating the customer's general idea of a biological experiment into a particular NGS protocol. This is where BlueSEQ can step in. The goal is to help guide researchers in communicating their needs to the providers. For example, translating something like "I want to compare gene expression levels between these 10 samples" into something like "Starting from 100ng of total RNA, I need to run RNA-Seq (including library prep) on 10 mouse samples with 50M 1x100bp reads per sample, followed by an analysis that looks for both expression level differences and SNP variants". The final details would likely be worked out between the researcher and the provider, but they would at least be on the same page.
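To make the idea of "being on the same page" a bit more concrete, here is a rough Python sketch of what a structured version of that example request could look like. To be clear, this is purely my own illustration: the class name, the field names and the helper methods are all hypothetical, and I have no idea how BlueSEQ actually represents requests internally. The values are the ones from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequencingRequest:
    """Hypothetical structured form of a researcher's request (not BlueSEQ's format)."""
    application: str                  # e.g. "RNA-Seq"
    organism: str                     # e.g. "mouse"
    n_samples: int
    input_material: str               # e.g. "total RNA"
    input_amount_ng: float
    library_prep_by_provider: bool
    reads_per_sample_millions: float
    read_layout: str                  # e.g. "1x100bp"
    analyses: List[str] = field(default_factory=list)

    def total_reads_millions(self) -> float:
        """Total read count the provider has to plan capacity for."""
        return self.n_samples * self.reads_per_sample_millions

    def summary(self) -> str:
        """One-line description a provider could quote against."""
        prep = "incl. library prep" if self.library_prep_by_provider else "libraries supplied"
        return (
            f"{self.application} ({prep}) on {self.n_samples} {self.organism} samples, "
            f"{self.reads_per_sample_millions:g}M {self.read_layout} reads per sample, "
            f"from {self.input_amount_ng:g}ng {self.input_material}"
        )


# The example request from the paragraph above.
request = SequencingRequest(
    application="RNA-Seq",
    organism="mouse",
    n_samples=10,
    input_material="total RNA",
    input_amount_ng=100,
    library_prep_by_provider=True,
    reads_per_sample_millions=50,
    read_layout="1x100bp",
    analyses=["differential expression", "SNP calling"],
)

print(request.summary())
print(request.total_reads_millions(), "million reads in total")
```

The point is simply that once the vague request is pinned down to fields like these, different providers can quote against the same specification.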

If you're still with me, here are my quick responses to the rest of Anthony's specific questions:

[This is one stop shopping for next-gen sequencing providers? How do you make money doing this?]
BlueSEQ makes money by charging the service provider a small fee (generally 5-10%, depending on the project size) when, and only when, the provider wins a bid.
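As a back-of-the-envelope illustration of that fee model, here is a minimal Python sketch. The function name and parameters are my own invention, and since I don't know how the rate actually scales with project size, the rate is simply passed in as an argument rather than derived from the bid. The only things taken from the description above are the typical 5-10% range and the rule that nothing is charged unless the provider wins.

```python
# Typical fee range quoted above; how the rate varies with project size
# within this range is not specified, so it is left to the caller.
TYPICAL_FEE_RANGE = (0.05, 0.10)


def exchange_fee(bid_amount: float, fee_rate: float, bid_won: bool) -> float:
    """Fee owed by a provider for one bid (hypothetical helper, not BlueSEQ's code)."""
    if bid_amount < 0:
        raise ValueError("bid_amount must be non-negative")
    if not 0 <= fee_rate <= 1:
        raise ValueError("fee_rate must be a fraction between 0 and 1")
    if not bid_won:
        # Providers pay nothing for bids they do not win.
        return 0.0
    return bid_amount * fee_rate


low, high = TYPICAL_FEE_RANGE
bid = 20_000  # illustrative project price, in whatever currency the bid is in

print("lost bid:", exchange_fee(bid, high, bid_won=False))
print("won bid, low end of range:", exchange_fee(bid, low, bid_won=True))
print("won bid, high end of range:", exchange_fee(bid, high, bid_won=True))
```

So on a bid of 20,000 a winning provider would owe somewhere between 1,000 and 2,000, and a losing provider would owe nothing.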

[How do you standardize the bioinformatics? Seems... naive.]
Yes, that would be naive, but they're not trying to do that.

[again, why would providers want to opt in to this?]
Because they want more business. Of course, if a service provider is running their machines 24x7, they wouldn't want to join. However, it appears that most providers don't find themselves in this situation.

[Why would any provider want customer reviews of NGS data... ]
It's possible they wouldn't, especially if they're not very good. But this is really more for the users as over time it will help them pick the best providers (and not, for example, choose simply based on price).

[the sample prep is a huge part of the quality, and if they don't control it, it's just going to be disaster.]
This would be entirely up to the researcher and provider. Some providers want that level of control (to ensure high quality) while others are happy to have their customers prepare the libraries.


  1. Where are these providers that don't have full queues? Pfizer tried to outsource some sequencing to a third party when our internal queues filled up: there were 4-6 week delays for sequencing at a third party company. The provider queues seem pretty full...

  2. Thanks for taking the time to reply. I've responded over at my blog, http://blog.fejes.ca/?p=770#comment-1543

    I still have a few other questions about the BlueSEQ model. Please let me know if you might have some time to talk about this - I'd be happy to write a follow-up post to correct any mistakes I may have made.

  3. Usually a service provider's 4-6 week queue just means:

    1- that they are waiting for other samples to fill up the plate or
    2- that they are waiting for samples that were supposed to be delivered weeks before but were not, because the user is late or messed up the samples (which adds to problem 1-) or
    3- that the machine is broken (happens at least once or twice a year, based on my experience with NGS sequencers) or
    4- that a previous run went bad, which usually means they got a bad reagent lot (happens more than twice a year) or
    5- that they are re-sequencing / re-preparing the libraries because 3 or 4 just happened.

    In my experience 30-40% of the delay is due to a lack of samples to run (to fill up the plate), 30-40% is due to unreliable reagents/machines (on Illumina, 454 and SOLiD alike) and the remaining 20% is due to unreliable users who are not able to provide the samples as requested (quantity, quality, timing, etc.)