From: Tristan Croll (tristan.croll_at_qut.edu.au)
Date: Thu Nov 05 2015 - 22:37:24 CST

You forgot the bit from the previous email I sent you. You need to think it through logically and work out what you want each step of the program to do. You’re getting that error because you haven’t defined $resnames. That has to be done, as I suggested, like this:

set sel [atomselect top protein]
set resnames [lsort -unique [$sel get resname]]

You’re also going to run into further problems: inside the if {$usepdb} clause you’ve put instructions that work on all the PDB files, rather than only the one you just checked. Everything I’ve sent you should be inside the foreach file $filelist loop. In pseudocode:

For each PDB file {
  Check for nonstandard residues
  If no nonstandard residues {
    Make a PSF/PDB from this file
  }
}
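
The pseudocode above could be sketched in Tcl roughly as follows. This is only a sketch under some assumptions: the proc names getResnames, allStandard and selectUsableFiles are made up for illustration, the residue list is copied from the script quoted below, and the actual PSF/PDB building step is left to the caller.

```tcl
# Standard residue names, copied from the script quoted below.
set standardnames {ALA LEU ARG ASN ASP ASX CYS GLU GLN GLX GLY HIS ILE LYS MET PHE PRO SER THR TRP TYR VAL}

# Hypothetical helper: load one PDB file in VMD and return its unique
# protein residue names. Only works inside VMD (uses mol and atomselect).
proc getResnames {file} {
    set molid [mol new $file waitfor all]
    set sel [atomselect $molid protein]
    set resnames [lsort -unique [$sel get resname]]
    $sel delete
    mol delete $molid
    return $resnames
}

# Return 1 if every name in resnames is in standardnames, otherwise 0.
proc allStandard {resnames standardnames} {
    foreach testname $resnames {
        if {[lsearch -exact $standardnames $testname] == -1} {
            return 0
        }
    }
    return 1
}

# Check each file separately and return only the files that pass.
# resnameCmd is a command prefix that is called with the file name appended.
proc selectUsableFiles {filelist standardnames {resnameCmd getResnames}} {
    set usable {}
    foreach file $filelist {
        set resnames [{*}$resnameCmd $file]
        if {[allStandard $resnames $standardnames]} {
            lappend usable $file
        }
    }
    return $usable
}
```

Inside VMD, something like "foreach file [selectUsableFiles [glob *.PDB] $standardnames] { ... }" would then run the PSF-building step only on the files that passed the check. Passing the residue lookup as a command prefix is just a design choice so the filtering logic can be tried outside VMD with a stand-in proc.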

Also, your use of psfgen is only going to give you the results you expect if each PDB file contains a single protein chain with no breaks. Anything else will fail badly – you’re better off using autopsf here.

Cheers,

Tristan

From: Akshay Bhatnagar [mailto:akshaybhatnagar2790_at_gmail.com]
Sent: Friday, 6 November 2015 2:24 PM
To: Tristan Croll; vmd-l_at_ks.uiuc.edu
Subject: Re: vmd-l: Query for eliminating proteins with unusual amino acids

Hello
I have modified the script you gave, as I wanted to create PSF files for the dataset files. The script I used is:

set usepdb true
set standardnames {ALA LEU ARG ASN ASP ASX CYS GLU GLN GLX GLY HIS ILE LYS MET PHE PRO SER THR TRP TYR VAL}

foreach testname $resnames {
  if { [lsearch $standardnames $testname] == -1 } {
    set usepdb false
}

if { $usepdb } {
  package require psfgen
  psfcontext reset
  topology top_all27_prot_lipid.inp
  set filelist [glob *.PDB]
  foreach file $filelist {
    mol new $file
    set name [file rootname $file]
    set selp [atomselect top protein]
    echo $selp
    $selp writepdb $file-1.pdb
    pdbalias residue HIS HSE
    pdbalias atom ILE CD1 CD
    pdbalias atom LYS 1HZ HZ1
    pdbalias atom LYS 2HZ HZ2
    pdbalias atom LYS 3HZ HZ3
    pdbalias atom ARG 1HH1 HH11
    pdbalias atom ARG 2HH1 HH12
    pdbalias atom ARG 1HH2 HH21
    pdbalias atom ARG 2HH2 HH22
    pdbalias atom ASN 1HD2 HD21
    pdbalias atom ASN 2HD2 HD22
    pdbalias atom SER HG HG1
    segment U {pdb $file-1.pdb}
    coordpdb $file-1.pdb U
    guesscoord
    writepdb $name-psf.pdb
    writepsf $name-psf.psf
    resetpsf
  }
}
}
But it gives an error: "can't read "resnames": no such variable"

Can you please help in this regard?

With Regards
Akshay Bhatnagar
PhD Student
BITS Pilani Hyderabad Campus


On Fri, Nov 6, 2015 at 5:55 AM, Akshay Bhatnagar <akshaybhatnagar2790_at_gmail.com> wrote:
Thank you very much Sir, I will try this immediately.

With Regards
Akshay Bhatnagar
PhD Student
BITS Pilani Hyderabad Campus


On Fri, Nov 6, 2015 at 5:54 AM, Tristan Croll <tristan.croll_at_qut.edu.au> wrote:
Yes, but you can easily do it in a script.

set usepdb true
set standardnames {ALA LEU … (put all the standard amino acid names here)}

foreach testname $resnames {
  if { [lsearch $standardnames $testname] == -1 } {
    set usepdb false
  }
}

if { $usepdb } {
  # Whatever you want to do with your pdb files
}


From: Akshay Bhatnagar [mailto:akshaybhatnagar2790_at_gmail.com]
Sent: Friday, 6 November 2015 10:19 AM
To: Tristan Croll
Subject: Re: vmd-l: Query for eliminating proteins with unusual amino acids

Hello Sir

Thank you very much for the reply. But I have around 5600 PDB files, and I cannot do this manually for all of them.

With Regards
Akshay Bhatnagar
PhD Student
BITS Pilani Hyderabad Campus


On Fri, Nov 6, 2015 at 3:58 AM, Tristan Croll <tristan.croll_at_qut.edu.au> wrote:
Hi Akshay,

Probably the easiest way to do this is to take advantage of the fact that, to VMD, “protein” is anything that contains atoms named C, N, CA and O – which should catch all non-standard amino acids as well. So if you do:

set sel [atomselect top protein]
set resnames [lsort -unique [$sel get resname]]

… then all you have to do from there is a simple check of $resnames against a list of standard amino acid resnames.
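
One way that check might look, as a small sketch: the proc name nonstandardResidues and the example input are made up for illustration, and the standard list is written out by hand. It returns the residue names that are not in the standard list, so an empty result means only standard residues were found.

```tcl
# Hand-written list of the 20 standard amino acid residue names.
# Assumption: plain PDB naming; force-field variants such as HSE or HSD
# are not included.
set standardnames {ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE LEU LYS MET PHE PRO SER THR TRP TYR VAL}

# Hypothetical helper: return the names in resnames that are not standard.
proc nonstandardResidues {resnames standardnames} {
    set unknown {}
    foreach name $resnames {
        if {$name ni $standardnames} {
            lappend unknown $name
        }
    }
    return $unknown
}

# Made-up example input; in VMD, resnames would come from
# [$sel get resname] on a protein selection.
set resnames {ALA GLY MSE SER}
set unknown [nonstandardResidues $resnames $standardnames]
if {[llength $unknown] == 0} {
    puts "only standard residues"
} else {
    puts "nonstandard residues: $unknown"
}
```

Returning the offending names rather than a plain true/false makes it easy to log why a given file was rejected. The "ni" operator needs Tcl 8.5 or later.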

Cheers,

Tristan

From: owner-vmd-l_at_ks.uiuc.edu [mailto:owner-vmd-l_at_ks.uiuc.edu] On Behalf Of Akshay Bhatnagar
Sent: Thursday, 5 November 2015 3:46 PM
To: vmd-l_at_ks.uiuc.edu
Subject: vmd-l: Query for eliminating proteins with unusual amino acids

Hello everyone
I have around 5600 PDB files downloaded from rcsb.org, and I want to generate a PSF for each of them. I have already written a Tcl script for this, but PSF generation stops whenever it encounters a PDB file containing an amino acid other than the 20 standard amino acids. These are generally post-translationally modified amino acids. I tried filtering them out with a Perl script that keeps only PDB files containing nothing but the 20 standard amino acids and HOH, but that left only 2 PDB files, because it also removes every file that contains ligands or drug molecules. Is there an identifier I can use to exclude only the proteins that contain post-translationally modified amino acids?


With Regards
Akshay Bhatnagar
PhD Student
BITS Pilani Hyderabad Campus