Protein side-chain placement: probabilistic inference and integer programming methods
Author(s)Hong, Eun-Jong; Lozano-Pérez, Tomás
The prediction of energetically favorable side-chain conformations is a fundamental element in homology modeling of proteins and the design of novel protein sequences. The space of side-chain conformations can be approximated by a discrete space of probabilistically representative side-chain conformations (called rotamers). The problem is, then, to find a rotamer selection for each amino acid that minimizes a potential energy function. This is called the Global Minimum Energy Conformation (GMEC) problem. This problem is an NP-hard optimization problem. The Dead-End Elimination theorem together with the A* algorithm (DEE/A*) has been successfully applied to this problem. However, DEE fails to converge for some complex instances. In this paper, we explore two alternatives to DEE/A* in solving the GMEC problem. We use a probabilistic inference method, the max-product (MP) belief-propagation algorithm, to estimate (often exactly) the GMEC. We also investigate integer programming formulations to obtain the exact solution. There are known ILP formulations that can be directly applied to the GMEC problem. We review these formulations and compare their effectiveness using CPLEX optimizers. We also present preliminary work towards applying the branch-and-price approach to the GMEC problem. The preliminary results suggest that the max-product algorithm is very effective for the GMEC problem. Though the max-product algorithm is an approximate method, its speed and accuracy are comparable to those of DEE/A* in large side-chain placement problems and may be superior in sequence design.
Computer Science (CS);
protein side-chain placement, protein design, belief propagation, integer programming, computational biology