Sequencing microeukaryotes

November 15, 2011

Next-generation (and now third-generation) sequencing technologies are enabling molecular biologists to obtain massive sequence libraries of DNA and RNA directly from the environment. Determining the taxonomic affiliation of a sequence read, along with the gene it likely encodes, requires an extensive reference sequence database of organisms. Hundreds of prokaryotic genomes from marine environments have been sequenced, providing a fairly extensive reference genome database (although I must acknowledge that marine bacteria and archaea are incredibly diverse). For marine eukaryotes, including both autotrophic and heterotrophic members, our reference sequence databases are very sparse, largely because of their larger genomes and the difficulty of maintaining them in culture. Just recently, genomes and extensive Expressed Sequence Tag (EST) libraries have been sequenced for members of each of the major marine eukaryotic phytoplankton functional groups. These include representatives of diatoms, coccolithophores, dinoflagellates, green algae and cryptomonads. However, the currently available reference databases for marine eukaryotes represent only a minute fraction of the total species that exist within marine environments. This creates a roadblock for taxonomic identification and functional gene annotation of reads in metagenomic and metatranscriptomic projects from marine environments (see my research page for an example of such a study).

With the aim of improving the interpretation of marine sequencing projects, the Gordon and Betty Moore Foundation is funding the sequencing of 750 EST libraries of marine microeukaryotes using the Illumina HiSeq 2000 platform. Sequencing is currently underway. I am pretty excited about this large sequencing effort, although I wish they had initiated it sooner!

More details about this project can be found here.