| Parameter | Value |
| --- | --- |
| Genome Size | 751,723 base pairs |
| G+C% | 25.5% |
| Sequencing Reactions | ~13,000 |
| Average Redundancy | 7.8 |
| Man Months Labor | 29 months |
| Duration of Cloning & Sequencing Work | 24 months |
| Cost per Finished Base Pair | 43¢ |
| Project Costs | ~$320,000 |
We sequenced the genome of the bacterium Ureaplasma urealyticum serovar 3 (Uu), an opportunistic pathogen of the human urogenital tract and a significant cause of adverse pregnancy outcomes. Uu is the third mycoplasma whose genome has been completely sequenced. Sequencing combined random shotgun and ordered shotgun approaches. One aim of the project was to demonstrate that two scientists could sequence an entire microbial genome rapidly and cost-effectively. Sequencing was completed at a cost of 43¢ per finished base. Approximately 13,000 sequencing reactions were performed, for an average sequence redundancy of 7.8. At the end of the sequencing phase, the final two gaps were closed by sequencing directly from genomic DNA. The sequences were assembled using the PE-Applied Biosystems AutoAssembler software.
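The cost and redundancy figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers reported here. The mean usable read length it prints is not a reported value: it is derived under the assumption that redundancy means total assembled read bases divided by genome size, and the function names are illustrative.

```python
# Figures reported in this summary (cost and read count are approximate).
GENOME_BP = 751_723
PROJECT_COST_USD = 320_000
SEQUENCING_REACTIONS = 13_000
AVERAGE_REDUNDANCY = 7.8


def cost_per_base_cents(cost_usd: float, genome_bp: int) -> float:
    """Project cost divided by finished genome length, in cents."""
    return 100.0 * cost_usd / genome_bp


def implied_mean_read_length(genome_bp: int, redundancy: float, reads: int) -> float:
    """Mean usable bases per read, ASSUMING
    redundancy = (reads * mean read length) / genome size."""
    return genome_bp * redundancy / reads


cents = cost_per_base_cents(PROJECT_COST_USD, GENOME_BP)
read_len = implied_mean_read_length(GENOME_BP, AVERAGE_REDUNDANCY, SEQUENCING_REACTIONS)

print(f"{cents:.1f} cents per finished base")        # rounds to the reported 43 cents
print(f"about {read_len:.0f} bp per read (derived, not reported)")
```

Under that assumption the reported totals are mutually consistent: roughly $320,000 over 751,723 bp rounds to 43¢ per base.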
Annotation of the Uu genome is proceeding using a combination of sequence analysis on a UNIX workstation and a data management system developed in-house, built on a Microsoft SQL Server database with a Web-browser interface. The set of potential peptides encoded by Uu, along with the complete nucleic acid sequence, was subjected to a variety of BLAST database searches and to other analyses such as tRNAscan for identification of tRNAs. The output of these analysis tools was then parsed and imported into tables within the SQL database. The Web interface gives access to all of these data and provides a framework for building up a map of the genome as significant coding sequences are identified. Although this system is less automated than other available annotation systems, it is easily configurable and can be personalized and maintained within a small laboratory environment.
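The parse-and-import step might look roughly like the following. This is a minimal sketch, not the in-house system: it assumes BLAST results in the 12-column tab-separated layout (the format BLAST+ writes with `-outfmt 6`), uses Python's built-in `sqlite3` as a stand-in for Microsoft SQL Server, and the table name, sequence identifiers, and E-value cutoff are all hypothetical.

```python
import csv
import io
import sqlite3

# Standard 12 columns of BLAST tabular output, with the Python type of each.
BLAST_COLUMNS = [
    ("qseqid", str), ("sseqid", str), ("pident", float), ("length", int),
    ("mismatch", int), ("gapopen", int), ("qstart", int), ("qend", int),
    ("sstart", int), ("send", int), ("evalue", float), ("bitscore", float),
]


def load_blast_hits(conn, handle):
    """Parse tab-separated BLAST hits and insert them into table blast_hit."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS blast_hit ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(fields) != len(BLAST_COLUMNS):
            raise ValueError(f"expected 12 columns, got {len(fields)}")
        rows.append([cast(value) for (_, cast), value in zip(BLAST_COLUMNS, fields)])
    conn.executemany("INSERT INTO blast_hit VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", rows)
    conn.commit()
    return len(rows)


def best_hits(conn, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} for the top-scoring hit
    of each query whose E-value passes the (hypothetical) cutoff."""
    cursor = conn.execute(
        "SELECT qseqid, sseqid, evalue, bitscore FROM blast_hit "
        "WHERE evalue <= ? ORDER BY qseqid, bitscore DESC",
        (max_evalue,),
    )
    best = {}
    for qseqid, sseqid, evalue, bitscore in cursor:
        best.setdefault(qseqid, (sseqid, evalue, bitscore))  # first row = highest score
    return best


# Made-up example hits; the identifiers are placeholders, not real Uu data.
SAMPLE = (
    "orf0001\tsubjA\t62.5\t200\t75\t0\t1\t200\t5\t204\t1e-60\t230.0\n"
    "orf0001\tsubjB\t40.0\t180\t108\t2\t10\t189\t3\t182\t1e-20\t95.5\n"
    "orf0002\tsubjC\t30.0\t60\t42\t1\t1\t60\t1\t60\t0.5\t28.0\n"
)

conn = sqlite3.connect(":memory:")
n_loaded = load_blast_hits(conn, io.StringIO(SAMPLE))
hits = best_hits(conn)
print(n_loaded, hits)
```

Keeping the raw hits in one table and deriving "best hit" views by query is one simple way to let a Web front end filter on significance without re-running the searches.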
The single circular chromosome of Uu contains 751,723 base pairs, of which only 25.5% are G or C. The genome contains 3,370 reading frames coding for 50 or more amino acids. Ureaplasmas use an unusual genetic code in which TGA codes for tryptophan rather than signaling translation termination. Although all 62 amino acid codons are represented in the Uu genome, the genome appears to encode only 30 different tRNAs. There are two rRNA operons.
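The effect of reading TGA as tryptophan on reading-frame detection can be illustrated with a short sketch. It is not the method used to obtain the counts above: it assumes a "reading frame" is any stop-free run of at least 50 codons in one of the six frames, treats the sequence as linear (ignoring frames that span the origin of the circular chromosome), and runs on a made-up toy sequence.

```python
from itertools import product

STOPS_STANDARD = {"TAA", "TAG", "TGA"}
STOPS_UU = {"TAA", "TAG"}  # TGA is read as tryptophan, so only two stops remain

ALL_CODONS = {"".join(c) for c in product("ACGT", repeat=3)}
SENSE_CODONS_UU = ALL_CODONS - STOPS_UU  # 64 - 2 = 62 amino acid codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def open_frames(seq, stops, min_codons=50):
    """Yield (strand, frame, start, end) for each stop-free codon run of at
    least min_codons. Coordinates are 0-based, end-exclusive, and for the
    '-' strand they refer to the reverse-complemented sequence."""
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in stops:
                    if (i - run_start) // 3 >= min_codons:
                        yield strand, frame, run_start, i
                    run_start = i + 3
            # Handle the run that reaches the end of the sequence.
            end = frame + ((len(s) - frame) // 3) * 3
            if (end - run_start) // 3 >= min_codons:
                yield strand, frame, run_start, end


def forward_frame0(seq, stops):
    """Convenience filter: runs found on the '+' strand in frame 0 only."""
    return [(start, end) for strand, frame, start, end
            in open_frames(seq, stops) if strand == "+" and frame == 0]


# Toy sequence: an in-frame TGA splits the frame under the standard code.
toy = "ATG" + "GCA" * 30 + "TGA" + "GCA" * 30 + "TAA"

print(len(SENSE_CODONS_UU))                 # 62
print(forward_frame0(toy, STOPS_STANDARD))  # []
print(forward_frame0(toy, STOPS_UU))        # [(0, 186)]
```

With the standard code the in-frame TGA breaks the toy frame into two runs of about 30 codons, both below the threshold. Treating TGA as a sense codon yields a single 62-codon frame, which is why the choice of genetic code matters when counting candidate coding sequences in Uu.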