Abstract in Englisch:
Human Immunodeficiency Virus 1 uses for entry into host cells a receptor (CD4) and one of two co-receptors (CCR5 or CXCR4). Recently, a new class of antiretroviral drugs has entered clinical practice that specifically bind to the co-receptor CCR5, and thus inhibit virus entry. Accurate prediction of the co-receptor used by the virus in the patient is important as it allows for personalized selection of effective drugs and prognosis of disease progression. We have investigated whether it is possible to predict co-receptor usage accurately by analyzing the amino acid sequence of the main determinant of co-receptor usage, i.e., the third variable loop V3 of the gp120 protein. We developed a two-level machine learning approach that in the first level considers two different properties important for protein-protein binding derived from structural models of V3 and V3 sequences. The second level combines the two predictions of the first level. The two-level method predicts usage of CXCR4 co-receptor for new V3 sequences within seconds, with an area under the ROC curve of 0.937+/-0.004. Moreover, it is relatively robust against insertions and deletions, which frequently occur in V3. The approach could help clinicians to find optimal personalized treatments, and it offers new insights into the molecular basis of co-receptor usage. For instance, it quantifies the importance for co-receptor usage of a pocket that probably is responsible for binding sulfated tyrosine.