From: Ivan Gregoretti (ivangreg_at_gmail.com)
Date: Wed Mar 19 2014 - 19:18:50 CDT

Hi Josh

Valid Question:
Why to avoid splitting the original pdb file into multiple, manually built
partial pdb files?

Answer:
Because I need to create tools (scripts) that will allow a group of
co-workers with less computational skills to start from non-ideal PDB
structures and move quickly to a molecular dynamic simulations. (Needless
to say, I need to create the scripts that will analyse the trajectories for
them too.)

Unfortunately, I must expect that if they do it manually, they will make
mistakes or overlook missing amino-acids and only realise it after a week
of MD simulation. The solution is automation. Programming.

Thanks for asking. I am pretty sure you are not the only one wondering the
same.

Ivan

Ivan Gregoretti, PhD
Bioinformatics

On Wed, Mar 19, 2014 at 7:39 PM, Josh Vermaas <vermaas2_at_illinois.edu> wrote:

> Hi Ivan,
>
> If you have access to CHARMM, I know that its structure building does
> allow the specification of ranges when dealing with these kinds of cases
> and it might make your life easier without exploding the number of files.
> Why is an explosion of pdb files bad though? You can just delete them once
> the full structure is built if they bug you, and since you know where the
> gaps are, its reasonably simple to script the generation of these temporary
> pdbs if you are so inclined.
>
> -Josh
>
>
> On 03/19/2014 05:58 PM, Ivan Gregoretti wrote:
>
> Hi Josh,
>
> Splitting the pdb into nterm.pdb and cterm.pdb does the trick. Thank you.
>
> Now I am trying to figure out a way to do that without splitting the
> original file into two. Why? Because in the structures I will be working on
> I expect multiple missing amino-acids per monomer. And I will be working
> with many structures. I better try not to provoke and explosion of partial
> pdb files.
>
> The solution would be (it may not exist) to call the pdb command with a
> range, something like
>
> segment H {
> pdb 1:134 4HBC_p.pdb
> residue 135 SER
> residue 136 SER
> pdb 137:end 4HBC_p.pdb
> }
>
> I do not think that a call to "pdb" offers that option. I will keep
> trying, I have no choice honestly.
>
> Thank you
>
> Ivan
>
>
> Ivan Gregoretti, PhD
> Bioinformatics
>
>
>
> On Wed, Mar 19, 2014 at 3:01 PM, Josh Vermaas <vermaas2_at_illinois.edu>wrote:
>
>> Hi Ivan,
>>
>> The residue command operates within the segment. For example, this is
>> what I did for a protein that was missing its N-terminal methionine:
>>
>> segment EMU {
>> residue 1 MET
>> pdb 3VVJ-chainA.pdb
>> }
>>
>> For your case, I think what would be cleanest would be to make two pdbs,
>> one containing residues 1 to 134 (call it nterm.pdb), one containing the
>> rest (call it cterm.pdb), and then do the following:
>>
>> segment H {
>> pdb nterm.pdb
>> residue 135 SER
>> residue 136 SER
>> pdb cterm.pdb
>> }
>>
>> The coordpdb command can use the original pdb, as I believe psfgen
>> doesn't count on the residues being contiguous when adding coordinates to
>> the topology (although I've been known to be wrong in the past).
>>
>> Good luck!
>> -Josh Vermaas
>>
>>
>>
>> On 03/19/2014 01:07 PM, Ivan Gregoretti wrote:
>>
>> Hi everybody,
>>
>> Can somebody share an example of how to use "residue" in psfgen to fill
>> in amino-acids that are missing in a PDB?
>>
>>
>> I am working with structure 4HBC. The monomer named H has two serines
>> in positions 135 and 136 that are missing in the positional information.
>>
>> This is an excerpt fro the 4HBC.pdb wher you can see that positional
>> information skips from resid 134 to 137.
>> ...
>> ATOM 1017 N THR H 133 -60.259 26.798 <60.259%C2%A0%2026.798>-21.481 1.00 44.57 N
>> ATOM 1018 CA THR H 133 -60.887 25.485 <60.887%C2%A0%2025.485>-21.267 1.00 51.05 C
>> ATOM 1019 C THR H 133 -59.916 24.273 -21.475 1.00
>> 54.98 C
>> ATOM 1020 O THR H 133 -59.389 24.074 -22.579 1.00
>> 57.77 O
>> ATOM 1021 CB THR H 133 -62.200 25.381 -22.089 1.00
>> 53.68 C
>> ATOM 1022 OG1 THR H 133 -62.808 24.096 -21.894 1.00
>> 60.25 O
>> ATOM 1023 CG2 THR H 133 -61.958 25.650 <61.958%C2%A0%2025.650>-23.594 1.00 56.91 C
>> ATOM 1024 N PRO H 134 -59.670 23.473 -20.405 1.00
>> 57.37 N
>> ATOM 1025 CA PRO H 134 -58.706 22.338 -20.445 1.00
>> 56.94 C
>> ATOM 1026 C PRO H 134 -59.074 21.172 -21.373 1.00
>> 62.62 C
>> ATOM 1027 O PRO H 134 -58.254 20.268 -21.584 1.00
>> 62.58 O
>> ATOM 1028 CB PRO H 134 -58.654 21.855 <58.654%C2%A0%2021.855>-18.984 1.00 58.07 C
>> ATOM 1029 CG PRO H 134 -59.897 22.397 -18.347 1.00
>> 58.81 C
>> ATOM 1030 CD PRO H 134 -60.205 23.690 -19.044 1.00
>> 50.39 C
>> ATOM 1031 N THR H 137 -56.064 18.107 -18.639 1.00
>> 50.63 N
>> ATOM 1032 CA THR H 137 -54.612 18.345 -18.474 1.00
>> 44.58 C
>> ATOM 1033 C THR H 137 -54.202 19.838 -18.557 1.00
>> 40.24 C
>> ATOM 1034 O THR H 137 -54.496 20.527 <54.496%C2%A0%2020.527>-19.551 1.00 40.54 O
>> ATOM 1035 CB THR H 137 -53.780 17.473 -19.454 1.00
>> 52.37 C
>> ATOM 1036 OG1 THR H 137 -52.386 17.754 -19.298 1.00
>> 53.79 O
>> ATOM 1037 CG2 THR H 137 -54.188 17.709 <54.188%C2%A0%2017.709>-20.924 1.00 53.97 C
>> ATOM 1038 N VAL H 138 -53.514 20.333 -17.524 1.00
>> 31.81 N
>> ATOM 1039 CA VAL H 138 -53.117 21.759 -17.499 1.00
>> 28.77 C
>> ATOM 1040 C VAL H 138 -51.619 21.892 -17.199 1.00
>> 26.21 C
>> ATOM 1041 O VAL H 138 -51.065 21.118 <51.065%C2%A0%2021.118>-16.437 1.00 27.01 O
>> ATOM 1042 CB VAL H 138 -54.025 22.634 <54.025%C2%A0%2022.634>-16.556 1.00 29.39 C
>> ATOM 1043 CG1 VAL H 138 -53.992 22.156 <53.992%C2%A0%2022.156>-15.122 1.00 31.74 C
>> ATOM 1044 CG2 VAL H 138 -53.651 24.114 -16.583 1.00
>> 27.45 C
>> ...
>>
>> I am trying to figure out how to modify my VMD script to fill in those
>> missing serines.
>>
>> At the moment I am only establishing the disulfide bonds and adding the
>> hydrogen:
>>
>> package require psfgen
>> topology top_all27_prot_lipid.inp
>> pdbalias residue HIS HSE
>> pdbalias atom ILE CD1 CD
>>
>> segment H {pdb 4HBC_p_H.pdb}
>> patch DISU H:21 H:92
>> patch DISU H:142 H:195
>> patch DISU H:130 H:215
>> coordpdb 4HBC_p_H.pdb H
>> guesscoord
>> writepdb 4HBC_p_H_H.pdb
>> writepsf 4HBC_p_H_H.psf
>>
>> quit
>>
>>
>> I tried inserting the statement
>>
>> residue 135 SER H
>>
>> into a number of positions in that script but I consistently get the
>> same error: "no segment in progress for residue".
>>
>> Could somebody enlighten me with a proper segment definition?
>>
>> Thank you,
>>
>> Ivan
>>
>>
>>
>>
>>
>>
>> I have a tcl script that takes a protein only pdb file and
>>
>>
>> Ivan Gregoretti, PhD
>> Bioinformatics
>>
>>
>>
>
>