Stanford Genome Technology Center
Cryptococcus neoformans Genome Project
[ Image courtesy of Jennifer Lodge ]
The Cryptococcus neoformans Genome Project is a collaboration among scientists at two centers: the Stanford Genome Technology Center (SGTC) and The Institute for Genomic Research (TIGR). Since March 2000, the SGTC team (Richard Hyman, Eula Fung, Dan Bruno, Marilyn Fukushima, Molly Miranda, Don Rowley, and Ron Davis) has been funded by cooperative agreement AI47087 from the NIAID to undertake the C. neoformans Genome Project as a whole-genome shotgun. In April 2001, a team of scientists at TIGR joined the project, with funding provided by cooperative agreement AI48594 from the NIAID. At the end of September 2001, we (SGTC) completed the shotgun sequencing phase of the project when we reached our shotgun sequence goal of seven-fold genome coverage. As of October 2001, the shotgun sequencing phase had reached our combined goal of 12-to-13-fold genome coverage. At that time, the reads were assembled, and the results of the assembly (contigs-in-progress; hereafter called contigs) were posted on this website and on TIGR's website.
UPDATE: March 2003.
Two problems in the shotgun sequence data became apparent. 1. The frequency of chimeric reads in the SGTC sequencing libraries was greater than usual. Using straightforward procedures, we have removed most of the chimeric reads from our data set. 2. The combined shotgun reads turned out to come from two closely related, but distinguishable, C. neoformans strains: B-3501A and JEC21. Rory Duncan, our Project Manager at the NIAID, asked us (SGTC) to concentrate on finishing the genome sequence of the B-3501A strain and asked TIGR to finish the JEC21 genome sequence.
As Richard Hyman wrote in Science (2001): "During finishing, physical gaps in
the sequence are closed, ambiguities in the sequence are resolved,
contaminating sequences are removed, and errors in the sequence are identified
and corrected. Finishing is a slow process, often taking 2 to 3 years for
large sequencing projects. Thus, the almost complete sequence will be
available for an extended length of time while the sequence is completed and
published." We (SGTC) are well underway with the finishing phase of the
C. neoformans B-3501A genome sequence, as demonstrated by the assembled
sequence data posted in this update. Because finishing is inherently a slow
(and expensive) process, we ask users of our unpublished sequence data
to be patient. You can help us finish the C. neoformans genome
sequence. If you find an apparent problem in our current C. neoformans
B-3501A genome sequence, please communicate that information directly to
Richard Hyman (email@example.com). Thank you. Annotation will follow.
Hyman RW. Sequence data: posted vs. published. Science. 2001;291: 827.
Schein JE, Tangen KL, Chiu R, Shin H, Lengeler KB, MacDonald WK, Bosdet I,
Heitman J, Jones SJ, Marra MA, Kronstad JW. Physical maps for genome analysis
of serotype A and D strains of the fungal pathogen Cryptococcus neoformans.
Genome Res. 2002;12: 1445-53.