WAM background

Modelling of antibody combining sites combines protein homology modelling, for which the large number of known antibody structures serves as a knowledge database, with ab initio modelling, which must be used for those parts of the antibody that are too variable for homology methods. Most of the variable fragment (Fv) framework region (FW) is well conserved in structure between different antibodies, more so than is typical among members of many other protein families. After taking into account those variations that do occur in particular β-strands, the framework can be modelled by selecting the known structure that is closest in sequence to the Fv to be modelled.
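As an illustration of selecting the framework closest in sequence, the following is a minimal sketch only. It assumes the target and candidate framework sequences have already been aligned to a common length (gaps written as "-"), and it scores them by simple percentage identity. The function names and the short sequences are made up for the example and are not taken from WAM or from any real antibody.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical residues between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def pick_framework(target: str, templates: dict[str, str]) -> tuple[str, float]:
    """Return (template id, identity) for the template closest in sequence."""
    scored = {name: percent_identity(target, seq) for name, seq in templates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical, made-up framework fragments (assumption: already aligned).
templates = {
    "tmplA": "EVQLVESGGG",
    "tmplB": "QVQLQQSGAE",
}
best, identity = pick_framework("EVQLVQSGGG", templates)
print(best, round(identity, 1))
```

In practice a substitution matrix or separate scoring of light and heavy chains might be preferred; plain identity is used here only to keep the idea visible.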

Large sequence and structural variability in antibodies is restricted to the hypervariable regions: six loops of variable sequence and structure located on the antibody surface. Because these bind antigen, they are referred to as complementarity-determining regions (CDRs), three located in the light chain (CDR-L1, L2 and L3) and three in the heavy chain (CDR-H1, H2 and H3). It is the modelling of these CDRs that poses the challenge, as the following section shows.


The Fv of an antibody (1vfa). The light chain is in red; the heavy chain in magenta; the light chain CDRs in white; CDR-H1 and H2 in yellow and CDR-H3 in orange.

Canonical classes

Five of the six CDRs (all except H3) frequently fall into one of between two and six structural classes, referred to as canonical classes (Chothia and Lesk, 1987; Chothia et al, 1989). Members of a canonical class all have approximately the same backbone conformation. This is determined by the loop length and the presence of a number of key residues, both in the CDR and in the framework, which hold the CDR in a given conformation by hydrogen bonding, electrostatic and hydrophobic interactions. So, to model an unknown CDR, the sequence is examined, the appropriate canonical class is assigned, and the known CDR of that class that is closest in sequence is used as the template. For each loop except L2, a few examples fall outside the existing canonical classes and, along with the H3 loop, must be modelled in other ways. However, it may be possible to determine further canonical classes as more crystal structures are solved.
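The assignment step described above (loop length plus key residues) can be sketched as a simple rule lookup. This is an illustrative sketch only: the class names, position labels ("p1", "p2") and allowed residues below are placeholders invented for the example, not the published canonical class definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalRule:
    """One illustrative class: a loop length plus allowed key residues."""
    name: str
    loop_length: int
    key_residues: dict[str, str]  # position label -> allowed one-letter codes


def assign_class(loop_seq: str, residues: dict[str, str], rules: list[CanonicalRule]):
    """Return the first matching class name, or None if no rule matches.

    residues maps position labels (CDR or framework) to the residue observed
    at that position in the sequence being modelled.
    """
    for rule in rules:
        if len(loop_seq) != rule.loop_length:
            continue
        if all(residues.get(pos) in set(allowed)
               for pos, allowed in rule.key_residues.items()):
            return rule.name
    return None  # falls outside the known classes; model by other means


# Placeholder rules (assumption: NOT real canonical class definitions).
RULES = [
    CanonicalRule("class-1", 10, {"p1": "I", "p2": "A"}),
    CanonicalRule("class-2", 11, {"p1": "IV"}),
]

print(assign_class("GSTGSTGSTG", {"p1": "I", "p2": "A"}, RULES))
print(assign_class("GSTGSTG", {"p1": "I", "p2": "A"}, RULES))
```

A loop for which `assign_class` returns `None` corresponds to the non-canonical examples mentioned above.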

Modelling the CDR-H3 loop

The H3 loop is more difficult to model because its conformation varies considerably between structures. There are essentially two approaches: knowledge-based methods such as database searching, in which the database loop that most closely matches in sequence and length (taken either from antibodies or from the entire Brookhaven Protein Data Bank (PDB; Berman et al, 2000)) is used as the model; and ab initio methods, such as the CONGEN conformational search method (Bruccoleri and Karplus, 1987).
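The database-search idea can be sketched as a ranking of candidate loops by closeness in length and then in sequence. The ordering used here (length difference first, then number of identical positions) is an assumption made for the example rather than a published scoring scheme, and the loop names and sequences are invented.

```python
def rank_loops(query: str, database: dict[str, str]) -> list[str]:
    """Rank database loops: smallest length difference first, then most matches."""
    def sort_key(item):
        name, seq = item
        length_gap = abs(len(seq) - len(query))
        # Position-by-position matches over the shared length (no alignment).
        matches = sum(q == s for q, s in zip(query, seq))
        return (length_gap, -matches, name)

    return [name for name, _ in sorted(database.items(), key=sort_key)]


# Hypothetical loop database (assumption: made-up entries).
loop_db = {
    "x1": "ARGDYMDY",
    "x2": "ARSSYFDV",
    "x3": "ARGDYYFDY",
}
ranking = rank_loops("ARGDYFDY", loop_db)
print(ranking)
```

The top-ranked entry would supply the backbone for the model; an ab initio conformational search does not use such a lookup at all.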

Current methods

Current methods of antibody modelling from other laboratories have generally taken the homology approach, exemplified by the methods of Pulito et al (1996), Eigenbrot et al (1993) and Barry et al (1994). Pulito et al modelled non-humanised and humanised variants of an antibody of unknown structure in order to predict that structure, whereas Eigenbrot et al and Barry et al tested the modelling procedure by modelling known structures: Eigenbrot et al modelled three variants of the humanised anti-p185 antibody 4D5, and Barry et al modelled three anti-DNA antibodies. In these methods, the most homologous framework from the antibody database was selected, and canonical CDRs were modelled on known CDRs of the same canonical class. (Note: Eigenbrot et al use a slightly different approach for modelling the framework: a number of known structures are 'averaged', followed by energy minimisation to relieve the strain caused by 'average' bond lengths and angles.)

For the H3 loop, the H3 in a known structure most closely matching it in length and sequence is used. Deletions are handled by removing the residue and rotating the phi/psi angles of the two surrounding residues to enable a join whilst conserving geometry as far as possible. Insertion is essentially the reverse of this. Certain 'canonical-like' key residues can also be taken into account when modelling H3: for example, a salt bridge forms between residues H3:N-1 and H3:C-1 (see here for numbering convention) if these are Arg and Asp, respectively (Rees et al, 1996). So if the unknown sequence has Arg and Asp in these positions, the known H3 chosen is one which also has Arg and Asp here (unless the length is very different). Finally, energy minimisation is performed on the structure.

During the modelling process, the framework and CDRs typically come from different crystal structures. A grafting process therefore takes place: when the loops are extracted from the database, two or three framework residues on each side are also included, and these are fitted onto the corresponding residues on the template framework.

The program ABGEN (Mandal et al, 1996) has automated the homology process described above, with the pre-minimisation stages completed in about 6 minutes. These methods (ABGEN and also the methods of Eigenbrot et al and Barry et al) predict the framework and canonical loops accurately, with global backbone RMSD (see Appendix 1) for these sections of less than 1.5 Å, but the H3 loop is of more variable quality (from 1.0 to 4.0 Å), because it lacks a clear-cut classification system. The alternative approach to modelling H3 described by us involves conformational search (Martin, Cheetham and Rees 1989, 1991; Pedersen et al, 1992) and, as shown here, offers much improved results in H3 modelling.
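For readers unfamiliar with the RMSD figures quoted above, the following minimal sketch computes the usual root-mean-square deviation between two equal-length sets of atom coordinates. It assumes the model and the reference structure have already been superposed and that coordinates are in ångströms. The definition in Appendix 1 may differ in detail (for example in which backbone atoms are included), and the coordinates below are invented.

```python
import math


def backbone_rmsd(model, reference):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the two structures are already superposed; no fitting is done here.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Made-up coordinates: every atom is displaced by exactly 1 unit along z.
model_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference_atoms = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd = backbone_rmsd(model_atoms, reference_atoms)
print(rmsd)
```

Because every atom in the example is displaced by the same distance, the RMSD equals that distance; in a real comparison a few badly placed loop atoms can dominate the value.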

Last updated 22/11/01