Francisella tularensis, the aetiological agent of tularemia, is an important pathogen throughout much of the Northern Hemisphere. We have carried out sample sequencing of its genome to gain greater insight into this organism, about which very little is known, especially at the genetic level. Nucleotide sequence data from a genomic DNA shotgun library of the virulent F. tularensis strain Schu 4 have been partially assembled to provide 1·83 Mb of the genome sequence. A preliminary analysis of the F. tularensis genome sequence has been performed, and the data were compared with 20 fully sequenced and annotated bacterial genomes. Plasmid-encoded genes, previously isolated from low-virulence strains of F. tularensis, were not identified. A total of 1289 potential coding ORFs were identified in the data set. Analysis of these data revealed 413 ORFs that would encode proteins with no homology to known proteins. ORFs that could encode proteins involved in amino acid and purine biosynthesis were also identified. These biosynthetic pathways provide targets for the construction of a defined attenuated mutant of F. tularensis for use as a vaccine against tularemia.