Sequence Diversity

HIV mutates rapidly within the host, due to:

Large number of replications (10^10 virions per day) [Perelson et al, Science 271, 582 (1996)]

Reverse transcription errors (10^-4 to 10^-5 per base per cycle) [Perelson et al, Science 271, 582 (1996)]

Recombination (10^-5 per site per generation) [R.A. Neher & T. Leitner, PLoS Comput. Biol. 6, e1000660 (2010)]
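
To get a feel for what these rates imply, the following back-of-the-envelope sketch multiplies them out. It is a rough illustration, not a result from the cited papers. It assumes a genome length of about 9,700 bases (not stated above), treats each virion produced as one replication cycle (an overestimate, since most virions never infect a cell), and assumes each base is equally likely to mutate to any of the other three. The function names are hypothetical.

```python
# Rough orders of magnitude only; all constants not quoted above are assumptions.
GENOME_LENGTH = 9_700        # assumed approximate HIV-1 genome length in bases
VIRIONS_PER_DAY = 1e10       # production rate quoted above
ERROR_RATES = (1e-5, 1e-4)   # per base per cycle, range quoted above


def errors_per_genome(rate, length=GENOME_LENGTH):
    """Expected number of copying errors in one genome per replication cycle."""
    return rate * length


def daily_occurrences_of_point_mutation(rate, virions=VIRIONS_PER_DAY):
    """Expected number of times per day one specific base change arises.

    Assumes every virion counts as one cycle and that a mutating base
    is equally likely to become any of the 3 alternatives.
    """
    return virions * rate / 3


for rate in ERROR_RATES:
    print(
        f"rate={rate:g}: "
        f"{errors_per_genome(rate):.2f} errors per genome per cycle, "
        f"{daily_occurrences_of_point_mutation(rate):.1e} "
        "occurrences per day of any given point mutation"
    )
```

Under these assumptions, the per-genome error count is of order 0.1 to 1 per cycle, and every single point mutation is expected to arise many times each day.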

There is large sequence diversity: [B. Korber, IAVIReport (2010)]

Patient-derived sequences can be used to construct a prevalence landscape, using a maximum entropy construction.
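
As an illustration of what a maximum entropy construction can look like, the sketch below fits the least-structured distribution P(z) proportional to exp(-E(z)) that reproduces the observed one-site and two-site mutant frequencies, with E(z) = sum_i h_i z_i + sum_{i<j} J_ij z_i z_j. This is a toy under stated assumptions, not the fitting procedure of the cited work: sequences are encoded as binary (0 = consensus, 1 = mutant), the data are made up, and the model is fitted by plain gradient ascent with exact enumeration of all states, which is only feasible for a handful of sites. All function and variable names are hypothetical.

```python
import itertools

import numpy as np

# Made-up toy alignment: rows are sequences, columns are sites (0 = consensus).
TOY_SEQS = np.array([
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [1, 0, 0],
    [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1],
], dtype=float)


def moments(seqs):
    """Observed one-site and two-site mutant frequencies."""
    seqs = np.asarray(seqs, dtype=float)
    return seqs.mean(axis=0), (seqs.T @ seqs) / len(seqs)


def energy(states, h, J):
    """E(z) = sum_i h_i z_i + sum_{i<j} J_ij z_i z_j (J is upper triangular)."""
    return states @ h + np.einsum("si,ij,sj->s", states, J, states)


def model_probs(states, h, J):
    """Boltzmann probabilities over all enumerated states."""
    e = energy(states, h, J)
    w = np.exp(-(e - e.min()))  # shift by the minimum for numerical stability
    return w / w.sum()


def fit_maxent(seqs, lr=0.5, steps=10_000):
    """Fit h and J so that model moments match the observed moments."""
    seqs = np.asarray(seqs, dtype=float)
    n_sites = seqs.shape[1]
    f1, f2 = moments(seqs)
    states = np.array(list(itertools.product([0, 1], repeat=n_sites)), dtype=float)
    iu = np.triu_indices(n_sites, k=1)  # index pairs with i < j
    h = np.zeros(n_sites)
    J = np.zeros((n_sites, n_sites))
    for _ in range(steps):
        p = model_probs(states, h, J)
        m1 = p @ states
        m2 = (states * p[:, None]).T @ states
        # Raise the energy cost where the model over-predicts mutations.
        h += lr * (m1 - f1)
        J[iu] += lr * (m2[iu] - f2[iu])
    return h, J, states


h, J, states = fit_maxent(TOY_SEQS)
for z, e in sorted(zip(states.astype(int).tolist(), energy(states, h, J)),
                   key=lambda pair: pair[1]):
    print(z, f"E = {e:.3f}")
```

In this picture, low-energy sequences are the prevalent ones, so the energy plays the role of the (inverted) prevalence landscape. Real alignments have far too many sites for exact enumeration, so practical fits need sampling or approximate inference instead.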

Patient-derived sequence frequencies are the outcome of many factors, including

Empirical evidence suggests that prevalence is a reasonable proxy for replicative fitness (shown for Gag) [A.L. Ferguson et al, Immunity 38, 606 (2013)]

Potential uses of the prevalence landscape as a proxy for the fitness landscape:

Immunogenic versus non-immunogenic proteins

Design of vaccines

Retrodicting drug resistance sites