next up previous
Next: Minimization and Structural Study of a Model System Up: Introduction to QM simulations Previous: Introduction to QM simulations

Subsections

Introduction

This hands-on session will introduce you to quantum mechanical (QM) simulation methods and provide you with a few examples illustrating the powerful tools QM simulations offer for biomolecular modeling. The first part constitutes a basic introduction to QM simulations: How to use the QM software, set up runs etc.. All the files that you need to have will be provided for you. You will be able to calculate some useful properties and then put them to work. Since QM calculations are often quite time consuming we will provide you with intermediate results, so that you can focus on the essentials and continue on, when you are stuck.
In the second part of the tutorial you will then use QM simulations to calculate the proton affinity of water and methanol, which constitutes a very important quantity of biomolecular systems.

Before submitting your first simulation you need to make yourself familiar with the quantum chemistry package that you will use to perform your computations. This will be the topic of the following section, where you will learn how to set up a QM calculation. You also need to copy some files into you working directory before you can get started.

Computational Setup and Preparations

The quantum chemistry program that you will use to perform your calculations is called GAMESS and can be accessed via a shell-script called rungms. In order to make life easier for you we will provide a complete directory structure named QM_tutorial containing all the files you will need. First go into the working directory where you will perform all the exercises for this tutorial. Hence type

tbss> cd ~/tbss.work

Next, copy the directory which contains the QM tutorial files into your working directory by typing

tbss> cp -rp TOP_DIR/sumschool03/tutorials/05-qm-tutorial/QM_tutorial ./ 

Here you have to replace TOP_DIR by the directory where the main sumschool03 directory is located, e.g. ~/Desktop if it is located on your desktop or /mnt/cdrom/ if you copy it from CD.

Now you will have a directory QM_tutorial containing three subdirectories: one for part 1 and one for part 2 of the tutorial. The third one contains the GAMESS manual which will be helpful in case you would like to run your own simulations (you can view pdf files by using the command acroread filename.pdf).
In the directory part1 you will find sub-directories for each of the calculations you will perform in part 1. Please make sure to run each simulation in its appropriate directory in order to keep track of all the different results (Don’t worry, we will show you how!). Each directory also has a sub-directory called sample which contains sample input and pre-calculated output files. This will allow you to check your results and to make sure that you have set up your simulation properly.

Setting up runs with GAMESS

Just like NAMD2, the quantum chemistry package GAMESS is driven and controlled by an input file which tells the program about the system under investigation and what to do with it. This will be explained by means of the following example file, which, in fact, is the input file for your first real QM calculation using GAMESS. You don’t need to worry about where the file is located until part 1 of the tutorial.

! Minimization of retinal analogue (6-31G)
 $CONTRL SCFTYP=RHF RUNTYP=OPTIMIZE COORD=ZMT NZVAR=36
 ICHARG=1 $END
 $SYSTEM MEMORY=90000000 $END
 $BASIS NGAUSS=6 GBASIS=N31 $END
 $STATPT OPTTOL=1.0E-4 NSTEP=30 $END
 $SCF DIRSCF=.TRUE. $END
 $ELPOT IEPOT=1 WHERE=PDC $END
 $PDC PTSEL=CONNOLLY CONSTR=CHARGE $END
 $DATA
 Minimization of retinal analogue
C1
 H
 N      1     1.008000
 H      2     1.008000    1    120.000
 C      2     1.300000    3    120.000    1    180.000
 C      4     1.450000    3    109.471    2    180.000
 C      5     1.335000    4    120.000    3    180.000
 C      6     1.450000    5    109.471    4    180.000
 C      7     1.335000    6    120.000    5    180.000
 H      4     1.089000    3    109.471    2    300.000
 H      5     1.089000    4    120.000    9    180.000
 H      6     1.089000    5    109.471   10    120.000
 H      7     1.089000    6    120.000   11    180.000
 H      8     1.089000    7    109.471   12    120.000
 H      8     1.089000    7    109.471   12    240.000
 $END

We will now guide you through the file to help you understand what is being done. The GAMESS manual contains much more detail as well as additional options and you should have a look at it. The input file basically needs to specify at least three essential items: the molecular structure of the system under consideration, the quantum mechanical method ("level of theory") to solve the corresponding Schroedinger equation and the type of run, e.g. energy calculation, structure optimization etc..

The input file which controls the computation performed by GAMESS consists of a list of keyword groups. Each group starts with a $ sign followed by its name (note the required single space in front of each of them), e.g. $CONTRL, and ends with $END. The following will explain the keyword groups relevant to you. For more information please refer to the GAMESS manual (you can open it by via acroread manual/gamess.pdf&).

$CONTRL: As its name implies, this field controls the GAMESS run. The SCFTYP keyword tells GAMESS which wave function type to use and for you it will always be SCFTYP=RHF (RHF=Restricted Hartree-Fock). RUNTYP tells GAMESS what to do: single point energy calculation or, in your first example, optimization of the provided geometry via RUNTYP=OPTIMIZE. The next two keywords control how the coordinates of the molecule are input: COORD=ZMT tells GAMESS that you provide a so called Z-matrix (to be explained below) and NZVAR=33 lists the number of internal coordinates (there are 3N-6, where N is the number of atoms). NZVAR=33 also tells GAMESS that you request the optimization to be performed in the internal coordinate space (as opposed to simple cartesian coordinates which you can request via NZVAR=0). The final item in the main control group, ICHARG=1 provides GAMESS with the total charge of the system, 1 in this case.

$SYSTEM: This group tells GAMESS the amount of core memory available to it. You don’t need to worry about it!

$BASIS: In this section you tell GAMESS what particular basis to use for your simulation. Initially you will be using the 6-31G basis, which is specified by the keyword sequence NGAUSS=6 (i.e. 6-31G), GBASIS=N31 (i.e. 6-31G).

$STATPT: This group controls the properties of the geometry optimization run. With OPTTOL you specify the value of the gradient at which the minimization is considered to be converged. NSTEP is the maximum number of optimization steps allowed; it has to be large enough to accommodate the number of steps required for optimization of your system.

$SCF: This option controls the specific way GAMESS performs its calculations and you don’t need to worry about the details. Consult the documentation in case you’re interested.

$ELPOT: This keyword section requests the calculation of the electrostatic potential at points controlled by the input group $PDC.

$PDC: This group determines the points at which to calculate the electrostatic potential for the purpose of fitting atomic charges to this potential. These are the so called electrostatic potential ESP derived charges which you will need in the first part of the hands-on session.

$DATA: This section provides GAMESS with the conformation of the system under consideration. In principle, there are several ways to provide this information: one could simply supply the list of atoms and their respective x,y and z coordinates, similar to a pdb file. Sometimes, however, it is more convenient to specify the molecular arrangement in terms of internal coordinates via a Z-matrix. In this case, rather than giving the xyz-coordinates of all atoms, one specifies the distances, angles and dihedrals between the system’s atoms. The next section will provide you with a short tutorial on how to set up a Z-matrix.

A successful GAMESS run will create at least a *.dat file (certain RUNTYPs create more files) in addition to the standard output which is typically stored in the *.log file. The latter contains all the information that you need during this tutorial. The *.dat file stores additional information, e.g. the wavefunction and other restart values important for more advanced simulations.

Understanding Z-matrices

The purpose of this section is to help you understand how to find internal coordinates (i.e. distances, angles and dihedrals) that specify the conformation of the system under consideration and also to tell you how to use them to set up a Z-matrix. It is very important to keep in mind that in contrast to simple xyz-coordinates there is generally no unique set of internal coordinates for a system. This implies that you have the possibility of choosing a set of internal coordinates that is well suited for your system. Particularly for structure optimization and transition state searches a "good" choice of internal coordinates can be crucial for successful and efficient completion of your run. This is related to the fact that some of the internal coordinates may be linearly dependent, making it impossible to vary them independently, thereby prohibiting the optimization procedure to reach the equilibrium structure efficiently. In this sense the internal coordinates provided in the sample input file are "bad" since they constitute a rather linearly dependent set. They are nevertheless good enough and illustrate well the concept of a Z-matrix. (If you have time during one of you evening sessions we encourage you to develop a better one and repeat the minimization of part 1 using e.g. cartesian (NZVAR=0) and your new internal coordinates. You will observe that the convergence of the optimization runs will greatly depend on the choice of coordinates.) Let us now look more closely at the $DATA field of the input file:

 $DATA
 Minimization of retinal analogue
C1
 H
 N      1     1.008000
 H      2     1.008000    1    120.000
 C      2     1.300000    3    120.000    1    180.000
 C      4     1.450000    3    109.471    2    180.000
 C      5     1.335000    4    120.000    3    180.000
 C      6     1.450000    5    109.471    4    180.000
 C      7     1.335000    6    120.000    5    180.000
 H      4     1.089000    3    109.471    2    300.000
 H      5     1.089000    4    120.000    9    180.000
 H      6     1.089000    5    109.471   10    120.000
 H      7     1.089000    6    120.000   11    180.000
 H      8     1.089000    7    109.471   12    120.000
 H      8     1.089000    7    109.471   12    240.000
 $END

The geometry input starts in line 3 of the $DATA field (the first two lines, however, need to be present: the first one is a text comment, the second line provides the symmetry of the system). Each line specifies the location of an atom in terms of its position with respect to its molecular neighbors. The overall orientation of the molecule obviously does not matter. The first atom in the Z-matrix (a hydrogen atom) constitutes the origin of the internal coordinate system and hence no distances or angles to any other atoms need to be specified. The second item in the list is a nitrogen atom (N) whose position is completely determined by providing the distance from the chosen origin, the hydrogen of line 1. Hence

N      1     1.008000

tells GAMESS that N has a distance of 1.008 Å from atom number 1. Atom number 3 (H) can be uniquely defined by providing, e.g. the distance to atom 2 and the angle it forms with atoms 1 and 2. Therefore

 H      2     1.008000    1    120.000

means that atom 3 (H) has a distance of 1.008 Å from atom 2 (N) and the angle between atoms 1-2-3 (H-N-H) is 120 degrees. All remaining atoms can then be specified in terms of a distance, an angle and a dihedral as is the case in lines 4 to 13, e.g.

C      2     1.300000    3    120.000    1    180.000

implies that atom 4 (C) is located 1.30 Å from atom 2 (N) and that the angle and dihedral between atoms 3-2-4 and 1-3-2-4 is 120 and 180 degrees respectively. Again, note that this assignment is not unique and one should always try to chose the representation which is most convenient for the problem at hand.



next up previous
Next: Up: Previous:
markus@ks.uiuc.edu