From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Thu May 21 2015 - 08:09:00 CDT

On Thu, May 21, 2015 at 2:16 PM, Stefan Doerr <stefdoerr_at_gmail.com> wrote:
> Okay!
> So here I am back with the promised weird case of a "same residue as"
> selection.
> I attach the offending PDB file.
> If you load it up in VMD and you do the following selections you will see
> something weird:
>
> segid L23 and resid 5
> // shows the whole lipid nicely
> residue 1877
> // shows the lipid but notice it's missing the H17R atom!
> same residue as (segid L23 and resid 5 and name H17R) // shows only the
> H17R atom
> same residue as (segid L23 and resid 5 and name H17S) // shows every
> atom except H17R (like the residue 1877 selection)
>
> Also, cutting the lipid out from the PDB and putting it into a new PDB file
> by itself seems to solve the selection problem.
> So there seems to be something broken in the calculation of "residue".
> Is the PDB file corrupted somehow and it affects the calculation of
> "residue" or is it a bug after all?

after a little testing, looking at the output and also digging through
the source code, here is the evidence:

what i noticed is that for the specific residue a couple of bonds are
missing or only printed half. looking at the whole system, shows that
there are some overlapping molecules. this is confirmed by the
following error messages printed during loading the PDB file.

Info) Determining bond structure from distance search ...
ERROR) MolAtom 6190: Exceeded maximum number of bonds (12).
ERROR) MolAtom 21396: Exceeded maximum number of bonds (12).
ERROR) MolAtom 21399: Exceeded maximum number of bonds (12).

now looking through the sources leads us to a method called int
BaseMolecule::make_uniq_resids(int *flgs), which comes with the
following comments:

// assign a uniq resid (uniq_resid) to each set of connected atoms
// with the same residue id. There could be many residues with the
// same id, but not connected (the SSN problem - SSNs are not unique
// so don't use them as the primary key)

so this explains the algorithm behind computing unique residue ids.
the constraints are same residue id *and* being connected. thus when
the bond list overflows due to overlapping molecules or otherwise the
heuristic bond determination fails (same as when you load the pdb with
"autobonds off") then the residue keyword will lead to (somewhat)
unexpected numbering.

now, whether this is a bug or a feature is difficult to say. if you
feed VMD a proper bond topology, e.g. via a .psf, .js or similar file,
then the connectivity is read from that file and the (unique) residue
numbering should work as expected.

axel.
>
> Best greetings,
> Stefan
>
> ps. Is there some program which checks for correctness of PDB files?
>
> On Thu, Apr 2, 2015 at 4:00 PM, Stefan Doerr <stefdoerr_at_gmail.com> wrote:
>>
>> Haha ok, you have a point. I don't doubt your knowledge ;)
>> Good to know residue is based on resid and not on guessed bonds.
>>
>> Then I really need to find that case I had where residue did not
>> work for me and see what caused it.
>>
>> Thanks for the advice :)
>>
>>
>> On Thu, Apr 2, 2015 at 3:42 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
>>>
>>> On Thu, Apr 2, 2015 at 9:32 AM, Stefan Doerr <stefdoerr_at_gmail.com> wrote:
>>> > Yes Josh thank you very much :) We reached the same conclusion it
>>> > seems.
>>> > Thank you both for clearing it up for me!
>>> >
>>> > Axel, I have the opposite preference actually. I slightly dislike stuff
>>> > like
>>> > residue/index etc.
>>> > Since it gets calculated internally by VMD and often by guessing bonds
>>> > I
>>> > cannot trust it 100% especially lacking the psf.
>>> > On the other hand if I know 100% sure that the PDB numbering is correct
>>> > (like in the case of the water box) I would prefer
>>> > to stick the PDB values (resid,serial etc.)
>>> > It all depends on how much you trust your PDB, hehehe.
>>>
>>> no, it depends on whether you have read the code. ;-)
>>>
>>> residue is based(!) on resid. so it is like resid but better as it is
>>> guaranteed to be unique and thus has no unwanted side effects of
>>> accidentally selecting atoms that you don't want.
>>>
>>> i am helping people on this mailing list for many years and have
>>> written quite a few plugins/scripts for VMD. quite a significant chunk
>>> of problems happened because people use resid where they should be
>>> using residue. writing scripts based on resid is an invitation for
>>> problems, because most of the time it is not exactly what you want.
>>>
>>> axel.
>>>
>>> >
>>> > I actually had a case recently where "same residue" did not correctly
>>> > select
>>> > all 3 atoms of a water Molecule.
>>> > If I find it again I might send it around.
>>> >
>>> > On Thu, Apr 2, 2015 at 3:20 PM, Axel Kohlmeyer <akohlmey_at_gmail.com>
>>> > wrote:
>>> >>
>>> >> On Thu, Apr 2, 2015 at 9:15 AM, Stefan Doerr <stefdoerr_at_gmail.com>
>>> >> wrote:
>>> >> > Thanks for the answer Axel :) I think I finally found the solution.
>>> >> >
>>> >> > You are right. The problem is that there might be a resid, lets say
>>> >> > number
>>> >> > 6, in the membrane and hence in the selection (within 2.4 of not
>>> >> > segid
>>> >> > WT5).
>>> >> > Then "same resid as (within 2.4 of not segid WT5)" would select
>>> >> > resid 6
>>> >> > in
>>> >> > WT5 which in reality is not close to the membrane.
>>> >> >
>>> >> > same resid as (segid WT5 and (within 2.4 of not segid WT5))
>>> >> > as you suggested does not work since it will select non-WT5 atoms
>>> >> > with
>>> >> > the
>>> >> > same resid as the ones in WT5 (due to moving it to the left).
>>> >> >
>>> >> > What actually works is this!
>>> >> > segid WT5 and (same resid as (segid WT5 and (within 2.4 of not segid
>>> >> > WT5)))
>>> >>
>>> >> as should:
>>> >>
>>> >> same residue as (segid WT5 and (within 2.4 of not segid WT5))
>>> >>
>>> >> and i think it is the more correct variant for what you want.
>>> >>
>>> >> > So once I have the atoms which are within 2.4 of not segid WT5 I
>>> >> > filter
>>> >> > to
>>> >> > only keep the ones in segid WT5. Then I do the same resid as to get
>>> >> > their
>>> >> > resids
>>> >> > and last I limit again only to WT5.
>>> >> > what a brainf**k...
>>> >> > Okay so we can safely say it's not a bug :) hehe
>>> >> >
>>> >> > ps.
>>> >> > Indeed same fragment works but I prefer not to depend on VMD
>>> >> > guessing
>>> >> > the
>>> >> > bonds correctly since I don't have a psf file and don't want to
>>> >> > generate
>>> >> > one.
>>> >>
>>> >> ...and i have a strong dislike for using resid, since it is almost
>>> >> never exactly what you think it is.
>>> >>
>>> >> axel.
>>> >>
>>> >>
>>> >> >
>>> >> >
>>> >> > On Thu, Apr 2, 2015 at 2:43 PM, Axel Kohlmeyer <akohlmey_at_gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Thu, Apr 2, 2015 at 8:08 AM, Stefan Doerr <stefdoerr_at_gmail.com>
>>> >> >> wrote:
>>> >> >> > Right, but I don't think this covers my problem. If you look at
>>> >> >> > the
>>> >> >> > water
>>> >> >> > box atoms in the PDB
>>> >> >> > (which is by the way the exact water box VMD uses for solvating)
>>> >> >> > a
>>> >> >> > unique
>>> >> >> > resid is correctly assigned to the 3 atoms of each
>>> >> >> > water molecule. The segid is also correctly assigned to only the
>>> >> >> > water
>>> >> >> > box.
>>> >> >> > Hence the segid and resid combination in the selection is unique.
>>> >> >> >
>>> >> >> > In that case using "segid WT5 and same residue" should be
>>> >> >> > equivalent
>>> >> >> > to
>>> >> >> > "segid WT5 and same resid".
>>> >> >> > We are not talking about the absolute values of resid or residue
>>> >> >> > but
>>> >> >> > about
>>> >> >> > the "same" resid or residue inside a segment.
>>> >> >>
>>> >> >> ok. my answer was not correct. but also your analysis is not
>>> >> >> correct.
>>> >> >> what you and i have overlooked is that the "same" qualifier applies
>>> >> >> to
>>> >> >> the
>>> >> >> (within 2.4 of not segid WT5) part. so you first select all
>>> >> >> particles
>>> >> >> that are within 2.4 angstrom of the atoms in all segments but WT5,
>>> >> >> then you expand this to all atoms in all segments that have the
>>> >> >> same
>>> >> >> residue/resid and *then* you reduce it to all of those in segment
>>> >> >> WT5.
>>> >> >> so using resid should pick more atoms.
>>> >> >> to make this work as you expect, you have to move the "same
>>> >> >> resid/residue as" to the very left of your selection.
>>> >> >>
>>> >> >> as always, computers stick to the rules and don't use common sense.
>>> >> >> ;-)
>>> >> >>
>>> >> >> >
>>> >> >> > On Thu, Apr 2, 2015 at 1:54 PM, Axel Kohlmeyer
>>> >> >> > <akohlmey_at_gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> On Thu, Apr 2, 2015 at 4:41 AM, Stefan Doerr
>>> >> >> >> <stefdoerr_at_gmail.com>
>>> >> >> >> wrote:
>>> >> >> >> > Hello,
>>> >> >> >> > I wanted to point out what looks like an atomselect bug to me.
>>> >> >> >> >
>>> >> >> >> > I have a lipid membrane system with a protein and I added a
>>> >> >> >> > water
>>> >> >> >> > box
>>> >> >> >> > on
>>> >> >> >> > it
>>> >> >> >> > which clashes with the membrane and protein.
>>> >> >> >> >
>>> >> >> >> > I am trying to select all water atoms that clash with
>>> >> >> >> > non-water
>>> >> >> >> > atoms
>>> >> >> >> > to
>>> >> >> >> > remove them.
>>> >> >> >> > Given the following two selections I would expect identical
>>> >> >> >> > results
>>> >> >> >> > (as
>>> >> >> >> > long
>>> >> >> >> > as the water molecules have correct resid's in the PDB which
>>> >> >> >> > they
>>> >> >> >> > indeed
>>> >> >> >> > have):
>>> >> >> >> >
>>> >> >> >> > segid WT5 and same residue as (within 2.4 of not segid WT5)
>>> >> >> >> > segid WT5 and same resid as (within 2.4 of not segid WT5)
>>> >> >> >> >
>>> >> >> >> > I made a imgur album with some explanatory pictures
>>> >> >> >> > http://imgur.com/a/RfPMq
>>> >> >> >> >
>>> >> >> >> > The first selection seems to works fine (see album: red
>>> >> >> >> > balls). If
>>> >> >> >> > you
>>> >> >> >> > change residue to resid it wrongly finds many more water
>>> >> >> >> > molecules
>>> >> >> >> > (yellow
>>> >> >> >> > balls). Although I actually have the impression that the first
>>> >> >> >> > selection
>>> >> >> >> > is
>>> >> >> >> > not totally correct either. I have seen it remove only pieces
>>> >> >> >> > of a
>>> >> >> >> > water
>>> >> >> >> > molecule and not all of it.
>>> >> >> >> >
>>> >> >> >> > I don't see a reason why resid should not work in this
>>> >> >> >> > selection
>>> >> >> >> > exactly
>>> >> >> >> > like residue.
>>> >> >> >>
>>> >> >> >> residue is a number designated by VMD and guaranteed to be
>>> >> >> >> unique,
>>> >> >> >> resid is just a label imported from the PDB file and can repeat.
>>> >> >> >> as
>>> >> >> >> far as i remember, the PDB standard only requires it to be
>>> >> >> >> unique
>>> >> >> >> within a segment.
>>> >> >> >>
>>> >> >> >> however, since residue is still dependent on a correct
>>> >> >> >> assignment of
>>> >> >> >> residue labels, i.e. resids to molecules, the safest way to
>>> >> >> >> select
>>> >> >> >> whole molecules is to use "same fragment as". fragment is also a
>>> >> >> >> property computed by VMD and assigns a unique id to each
>>> >> >> >> group/fragment of connected atoms, which coincides with
>>> >> >> >> molecules in
>>> >> >> >> classical force fields (that is assuming that you have correct
>>> >> >> >> bond
>>> >> >> >> information imported from a .psf file or similar).
>>> >> >> >>
>>> >> >> >> axel.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> >
>>> >> >> >> > Here is the PDB file if you want to test it. I tested it with
>>> >> >> >> > 1.9.1
>>> >> >> >> > and
>>> >> >> >> > 1.9.2 VMD:
>>> >> >> >> >
>>> >> >> >> > https://www.dropbox.com/s/o117eo5lpxmsp9e/onlyWT5andrest.pdb?dl=0
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> --
>>> >> >> >> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
>>> >> >> >> College of Science & Technology, Temple University, Philadelphia
>>> >> >> >> PA,
>>> >> >> >> USA
>>> >> >> >> International Centre for Theoretical Physics, Trieste. Italy.
>>> >> >> >
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
>>> >> >> College of Science & Technology, Temple University, Philadelphia
>>> >> >> PA,
>>> >> >> USA
>>> >> >> International Centre for Theoretical Physics, Trieste. Italy.
>>> >> >
>>> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
>>> >> College of Science & Technology, Temple University, Philadelphia PA,
>>> >> USA
>>> >> International Centre for Theoretical Physics, Trieste. Italy.
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
>>> College of Science & Technology, Temple University, Philadelphia PA, USA
>>> International Centre for Theoretical Physics, Trieste. Italy.
>>
>>
>

-- 
Dr. Axel Kohlmeyer  akohlmey_at_gmail.com  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.