Codon usage and base composition in sequences from the A + T-rich genome of Rickettsia prowazekii, a member of the alpha Proteobacteria, have been investigated. Synonymous codon usage patterns are roughly similar among genes, even though the data set includes genes expected to be expressed at very different levels, indicating that translational selection has been ineffective in this species. However, multivariate statistical analysis differentiates genes according to their G + C contents at the first two codon positions. To study this variation, we have compared the amino acid composition patterns of 21 R. prowazekii proteins with those of a homologous set of proteins from Escherichia coli. The analysis shows that individual genes have been affected by biased mutation rates to very different extents, with genes encoding proteins highly conserved among other species being the least affected. Overall, protein coding and intergenic spacer regions have G + C content values of 32.5% and 21.4%, respectively. Extrapolation from these values suggests that R. prowazekii has around 800 genes and that 60-70% of the genome may be coding.
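The position-specific G + C content referred to above can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence given as a plain string; the function name `gc_by_codon_position` and the toy sequence are hypothetical and are not taken from the paper, whose own software and data are not described in the abstract.

```python
def gc_by_codon_position(cds: str) -> list[float]:
    """Return the G + C fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence whose length is a
    non-zero multiple of three (an assumption of this sketch).
    """
    seq = cds.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")

    n_codons = len(seq) // 3
    fractions = []
    for pos in range(3):
        # Every third base, starting at offset `pos`, belongs to one codon position.
        bases = seq[pos::3]
        gc_count = sum(1 for base in bases if base in "GC")
        fractions.append(gc_count / n_codons)
    return fractions


# Hypothetical toy sequence (six codons), used only to show the call.
toy_cds = "ATGAAAGCTTTAGGTTAA"
gc1, gc2, gc3 = gc_by_codon_position(toy_cds)
print(f"GC1={gc1:.3f} GC2={gc2:.3f} GC3={gc3:.3f}")
```

Averaging the first two values per gene would give one number per gene of the kind the multivariate analysis is said to separate genes by; how the authors actually computed it is not stated here.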
|Number of pages||12|
|Journal||Journal of molecular evolution|
|Publication status||Published - May 1996|