
Double Quotes and Regular Expressions

Double quotes around a string specify a regular expression search (compatible with Perl 5.005, using the Perl-compatible regular expressions library written by Philip Hazel). Regular expressions are a very powerful concept but rather hard to explain from scratch. If you don't know how to use them, the Unix man pages for ed, egrep, vi, or regex may help. If not, ask someone, or consult one of the many books on the subject, such as the O'Reilly and Associates sed & awk book. The following should give an idea of how they work.

Regular expressions allow the selection of all atoms with a name starting with C as:

        name "C.*"

or segment names containing a number as

        segname ".*[0-9]+.*"

As expected, multiple terms can still be provided on the list of matching keywords, as in

        resname "A.*" GLY ".*T"

to select residues starting with an A, the glycine residues, and residues ending with a T. Kind of silly, but it is just to demonstrate. As with a string, a regular expression in a numeric context gets converted to an integer, which will always be zero.
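
The resname example above can be imitated outside VMD. The following is a minimal sketch in Python, assuming that double-quoted terms must match the whole name and that unquoted terms are compared literally; Python's re module stands in for the Perl-compatible library, and the helper matches_any and the sample residue names are hypothetical.

```python
import re

def matches_any(terms, value):
    """Return True if value matches any term.

    terms is a list of (text, is_regex) pairs; this representation is an
    assumption made for the sketch, not VMD's internal format.
    """
    for text, is_regex in terms:
        if is_regex:
            # Double-quoted term: the whole name must match the pattern.
            if re.fullmatch(text, value):
                return True
        elif text == value:
            # Unquoted term: plain string comparison.
            return True
    return False

# Equivalent of: resname "A.*" GLY ".*T"
terms = [("A.*", True), ("GLY", False), (".*T", True)]
resnames = ["ALA", "GLY", "THR", "MET", "ASP", "WAT"]
selected = [r for r in resnames if matches_any(terms, r)]
print(selected)
```

THR is left out because it neither starts with an A nor ends with a T.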

In brief, a regular expression allows matching to multiple possibilities instead of just one fixed string. Table 5.8 shows some of the methods that can be used.

Table 5.8: Regular expression methods.
Symbol   Example                   Definition
.        . , A.C                   match any character
[]       [ABCabc] , [A-Ca-c]       match any char in the list
[^]      [^Z] , [^XYZ] , [^x-z]    match all except the chars in the list
^        ^C , ^A.*                 next token must be the first part of string
$        [CO]G$                    prev token must be the last part of string
*        C* , [ab]*                match 0 or more copies of prev char or regular expression token
+        C+ , [ab]+                match 1 or more copies of the prev token
|        C|O                       match either the 1st token or the 2nd
()       (CA)+                     combines multiple tokens into one

Some selections can therefore be written in several ways. For example, choosing atoms with a name of either CA or CB can be done in the following ways:

        name CA CB
        name "CA|CB"
        name "C[AB]"
        name "C(A|B)"
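
As a quick check that the three quoted forms above pick the same names as the plain list CA CB, here is a small sketch using Python's re module as a stand-in for the Perl-compatible library. The atom name list and the helper select are made up for illustration, and whole-name matching is assumed.

```python
import re

# Hypothetical atom names; CAB and CG are included as near misses.
atom_names = ["N", "CA", "C", "O", "CB", "CG", "CAB"]

def select(pattern, names):
    """Keep the names that match the pattern in full."""
    return [n for n in names if re.fullmatch(pattern, n)]

patterns = ["CA|CB", "C[AB]", "C(A|B)"]
results = [select(p, atom_names) for p in patterns]

for p, r in zip(patterns, results):
    print(p, r)
```

Each pattern keeps only CA and CB; CAB is rejected because the whole name has to match.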

Several caveats for those who already understand regular expressions. VMD automatically prepends ``^('' and appends ``)$'' to the selection string, so the selection O matches only O and not OG or PRO. As a consequence, putting ^ and $ into the command won't really affect anything, and selections that match on a substring must be preceded and followed by ``.*'', as in .*O.* . Also, some illegal selections could be accepted as correct but behave strangely, as in C)|(O , which gets converted to ^(C)|(O)$ and matches anything starting with a C or ending with an O.
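
The wrapping described above can be reproduced in a few lines. This is only a sketch: it uses Python's re module rather than the library VMD links against, and the function name vmd_style_match is hypothetical.

```python
import re

def vmd_style_match(pattern, value):
    """Wrap the pattern as described in the text: '^(' + pattern + ')$'."""
    wrapped = "^(" + pattern + ")$"
    return re.search(wrapped, value) is not None

# A bare O matches only the exact name O.
print(vmd_style_match("O", "O"))      # True
print(vmd_style_match("O", "OG"))     # False
print(vmd_style_match("O", "PRO"))    # False

# Substring matches need .* on both sides.
print(vmd_style_match(".*O.*", "PRO"))  # True

# The odd pattern C)|(O becomes ^(C)|(O)$ : starts with C, or ends with O.
print(vmd_style_match("C)|(O", "CA"))   # True
print(vmd_style_match("C)|(O", "PRO"))  # True
print(vmd_style_match("C)|(O", "N"))    # False
```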

Regular expression matching is similar to wildcard matching in X-PLOR. Table 5.9 lists conversions from X-PLOR style wildcards to the matching regular expressions.

Table 5.9: Regular expression conversions.
X-PLOR Wildcard   Description                   Regular Expression
*                 matches any string            .*
%                 matches a single character    .
+                 matches any digit             [0-9]
#                 matches any number            [0-9]+
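
The conversions in Table 5.9 are mechanical, so they can be applied character by character. The sketch below does this in Python; the function xplor_to_regex and the sample wildcards are illustrative assumptions, and any character not listed in the table is treated as a literal.

```python
import re

# One entry per row of Table 5.9.
XPLOR_TO_REGEX = {
    "*": ".*",      # any string
    "%": ".",       # a single character
    "+": "[0-9]",   # any digit
    "#": "[0-9]+",  # any number
}

def xplor_to_regex(wildcard):
    """Translate an X-PLOR style wildcard into a regular expression."""
    return "".join(XPLOR_TO_REGEX.get(ch, re.escape(ch)) for ch in wildcard)

print(xplor_to_regex("C*"))    # C.*
print(xplor_to_regex("H%"))    # H.
print(xplor_to_regex("HB+"))   # HB[0-9]
print(xplor_to_regex("SEG#"))  # SEG[0-9]+
```

The resulting string can then be placed in double quotes in a selection, for example name "C.*".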
