next up previous contents
Next: The histogram technique Up: Model and simulation algorithm Previous: Model and simulation algorithm   Contents


Monte Carlo methods and the Metropolis algorithm

Monte Carlo (MC) methods refer, in a very general sense, to any simulation of an arbitrary system which uses a computer algorithm explicitly dependent on a series of (pseudo)random numbers (see, for example, [32]). The name, which derives from the famous Monaco casino, emphasizes the importance of randomness, or chance, in the method. MC is particularly important in statistical physics, where systems have a large number of degrees of freedom and quantities of interest, such as thermal averages, cannot be computed exactly. In a system with D degrees of freedom, for example, the thermal average of a quantity A associated with each microstate of the system in equilibrium at absolute temperature T is given by

\begin{displaymath}
\langle A \rangle =\frac{1}{Z} \int A({\bf x}) e^{-\frac{E({\bf x})}{T}}d{\bf x}
\end{displaymath} (2.1)

where ${\bf x}$ is a point in D-dimensional space representing the state of the system, $E({\bf x})$ is the energy of state ${\bf x}$, $Z=\int e^{-\frac{E({\bf x})}{T}}d{\bf x}$ is the partition function. Note that I am using units in which the Boltzmann constant is one (kB=1), so it does not appear in the equation. In the case of protein lattice models where conformational space is discrete, the integral in the above equation is replaced by a sum over all conformations:
\begin{displaymath}
\langle A \rangle =\frac{1}{Z}\sum_{{\bf x}}
A({\bf x}) e^{-\frac{E({\bf x})}{T}}
\end{displaymath} (2.2)

where different states x of the system correspond to different conformations and the partition function is $Z=\sum\limits_{\bf x} A({\bf x}) e^{-\frac{E({\bf x})}{T}}$.

In the case of very small chains all conformations can be enumerated and thermal averages (as well as extensive quantities such as entropy and free energy) can be computed exactly by Eq. 2.2 [13]. For longer chains, however, as the 36mer in the cubic lattice which is considered in this study, complete enumeration of conformational space is impossible with present day computers. In MC simulations this difficulty is solved by replacement of the set of all conformations in Eq. 2.2 by a representative tractable subset of M conformations, where M is much smaller than the total number of conformations, N. An estimate for the thermal average, $\langle A \rangle _{est}$ is then obtained:

\begin{displaymath}
\langle A \rangle _{est} = \frac{\sum_{l=1}^M A({\bf x}_l) e...
...E({\bf x}_l)}{T}}}
{\sum_{l=1}^M e^{-\frac{E({\bf x}_l)}{T}}}
\end{displaymath} (2.3)

It is clear that the accuracy of the estimate will depend directly on the quality of the representative subset of M conformations. In a simple sampling method, for example, where the M conformations are chosen randomly, the immense majority of the conformations will have energy very different from the average energy of the system at temperature T and their contribution to the estimate will be insignificant. The estimate obtained by simple sampling would be, therefore, very inaccurate, unless M becomes as big as N, or even larger.

The idea of importance sampling in MC simulations is to choose the representative set of conformations not completely at random, but in such a way that the selection is somehow biased towards conformations that are significantly populated at equilibrium. In general, if the probability that a given conformation ${\bf x_l}$ appears in the sample representative of conformations is $P_{samp}({\bf x_l})$, then Eq. 2.3 becomes:

\begin{displaymath}
\langle A \rangle _{est} = \frac{\sum_{l=1}^M
\frac{A({\bf...
...M
\frac{e^{-\frac{E({\bf x}_l)}{T}}}
{P_{samp}({\bf x_l})}}
\end{displaymath} (2.4)

In particular, if the conformations are chosen with probability $P_{samp}({\bf x_l}) \propto e^{-\frac{E({\bf x}_l)}{T}}$, then the Boltzmann factors cancel out and the estimate for the thermal average becomes:
\begin{displaymath}
\langle A \rangle _{est} = \frac{\sum_{l=1}^M A({\bf x}_l)}{M}
\end{displaymath} (2.5)

Samples of representative conformations with this particularly convenient property, where the probability of occurrence of a given conformation is proportional to its Boltzmann factor, are generated in the present study by the Metropolis algorithm [33]. The algorithm constructs a Markov chain of conformations, where the first conformation, ${\bf x_1}$, is arbitrarily chosen (e.g., randomly) and an appropriate probability function, $W({\bf x}_{i-1} \rightarrow
{\bf x}_i)$, is used to construct each conformation ${\bf x}_i$, from the previous conformation ${\bf x}_{i-1}$. $W({\bf x}_{i-1} \rightarrow
{\bf x}_i)$ is the probability of a ``move'' from conformation ${\bf x}_{i-1}$ to conformation ${\bf x}_i$. In general, to make such a chain of conformations converge to the desired canonical distribution it is sufficient (but not necessary) to impose the condition of detailed balance, according to which the following equality must hold for any arbitrary pair of conformations, ${\bf x_l}$ and ${\bf x}_m$,

\begin{displaymath}
P_{eq}({\bf x}_l) W({\bf x}_l \rightarrow{\bf x}_m) =
P_{eq}({\bf x}_m) W({\bf x}_m \rightarrow{\bf x}_l)
\end{displaymath} (2.6)

where $P_{eq}({\bf x})=\frac{e^{-\frac{E({\bf x}_l)}{T}}}{Z}$ is the equilibrium probability of conformation ${\bf x}$.

The condition of detailed balance given by Eq. 2.6 implies that, at equilibrium, the average number of moves ${\bf x}_l \rightarrow{\bf x}_m$ is the same as the average number of inverse moves ${\bf x}_m \rightarrow{\bf x}_l$. As this is true for any two arbitrary conformations it follows that if the system in equilibrium is submitted to moves that obey the detailed balance condition there will be no change in the probability of any conformation and the system will remain in equilibrium. It can also be shown (see, for example [32]) that if the system is not in equilibrium then the ratio between the probabilities of any two conformations tends to increase if it is initially below its equilibrium value and to decrease if it is initially above its equilibrium value. It follows that for sufficiently long simulations the system will reach thermodynamic equilibrium.

It is usually convenient to restrict the possible moves from a particular conformation only to a restricted number of ``adjacent'' conformations. The condition of detailed balance (Eq. 2.6) requires the following for any two conformations ${\bf x_l}$ and ${\bf x}_m$:

1.
If $W({\bf x}_l \rightarrow {\bf x}_m) = 0$ then $W({\bf x}_m \rightarrow {\bf x}_l) = 0$.
2.
If $W({\bf x}_l \rightarrow {\bf x}_m) \neq 0$ then
\begin{displaymath}
\frac{W({\bf x}_l \rightarrow {\bf x}_m)}{W({\bf x}_m \right...
...
{\rm exp}\left( - \frac{E({\bf x}_m)-E({\bf x}_l)}{T} \right)
\end{displaymath} (2.7)

So, a move from one conformation to another is possible if, and only if, the inverse move is also possible, or, in other words, two arbitrary conformations must be necessarily either mutually adjacent or not adjacent. Also if two conformations are mutually adjacent then the probability of a move between them is related to the probability of inverse move by a well defined relation dependent only on the difference in energy between the two conformations.

The rules that determine which conformations are adjacent to any arbitrary conformation are given by the ``move set'' used in the simulation. As long as detailed balance is respected, the particular move set should have no effect on the the equilibrium canonical distribution reached after sufficiently long time but it can have drastic effects on the rate at which this equilibrium distribution is reached. An appropriate move set, where adjacent conformations are not very different from each other, also permits a dynamic interpretation of the Markov chain of conformations generated during the simulation, according to which the number of conformations generated is considered to be proportional to time. According to these considerations, therefore, thermodynamic properties of the system are independent of the particular choice of move set but kinetic properties are not.

The Metropolis algorithm is as follows. The first conformation is randomly generated. At each point in the construction of the chain of conformations a move is attempted to the current conformation. The move is rejected immediately if the local chain conformation is not compatible with the attempted move or if it violates the excluded volume condition. If these two conditions are satisfied then the so called Metropolis criterion is applied. If the difference between the energy of the resulting conformation and the energy of the current conformation, $\Delta E$, is negative (i.e. the energy of the resulting conformation is smaller than the energy of the current conformation), then the resulting conformation is accepted and it becomes the new conformation in the chain. If $\Delta E$ is positive, however, a (pseudo)random number between 0 and 1, 0<R<1, is generated and the resulting conformation is only accepted if $e^{- \Delta E / T} > R$. If $e^{- \Delta E / T} < R$ then the resulting conformation is refused. Whenever the conformation resulting from the attempted move is refused for any of the three possible reasons, then the new conformation of the chain is the same current conformation. For sequence selection the same algorithm is used but the ``moves'' correspond to switching the position of two monomers while the conformation is kept fixed [23].

The Metropolis criterion can be summarized by the following expression for the probability of acceptance of an attempted conformation:

\begin{displaymath}
P_{{\rm accept}}={\rm min}(1, e^{-\Delta E / T})
\end{displaymath} (2.8)

Note that the ratio between the acceptance probabilities of a move is related to the acceptance probability for the inverse move by the same relation presented in Eq. 2.7. As the transition probability between two conformations is the product of the probability of attempting a given adjacent conformation, $P_{\rm attempt}$ and the probability of accepting the conformation $P_{{\rm accept}}$, then the detailed balance condition is only valid with the described algorithm because $P_{\rm attempt}$ is constant. This is the reason why local moves that are not compatible with the chain conformation or violate the excluded volume condition must be considered during the choice of conformation to be attempted in the simulation in conformational space, even if they will always be refused. If, for example, the choice of local moves were restricted a priori to moves compatible with the local conformation of the chain at the bead where the move is being attempted then $P_{\rm attempt}$ would depend on the number of such possible moves and would be different for different conformations.

In the present study averages estimated from long MC trajectories by Eq. 2.5 are considered to be good estimates for the true thermodynamic average, so that $\langle A \rangle _{est}\approx\langle A \rangle $ and the subscript is not used to distinguish estimates from real thermal averages.


next up previous contents
Next: The histogram technique Up: Model and simulation algorithm Previous: Model and simulation algorithm   Contents
Antonio Francisco Pereira de Araujo
1999-02-23