From: Axel Kohlmeyer (akohlmey_at_cmm.chem.upenn.edu)
Date: Tue Jan 06 2009 - 19:01:25 CST

On Tue, 6 Jan 2009, jouko_at_berkeley.edu wrote:

J> I am trying to convert a dcd file that is larger than 7 GB to individual
J> pdb files. When I run the script I normally use to convert dcd files to

now there is a big waste of disk space...
why would you want to do something like this?

J> pdb files vmd aborts. The script is pasted below.
J>
J> mol load psf ubq_solution_wb.psf dcd ubq_wb.12ns.dcd
J> set nf [molinfo top get numframes]
J> for {set i 0} {$i<$nf} {incr i} {[atomselect top all frame $i] writepdb
J> /home/jouko/1ubq_phosphate5/0-3_ns/ubq_$i.pdb}

ouch. this is bad programming. each frame will create a new
atomselect function without need, and none of them is ever deleted,
so memory usage keeps growing.

better would be to do it like this:

# mol load is deprecated!
set mol [mol new ubq_solution_wb.psf type psf]
mol addfile ubq_wb.12ns.dcd type dcd waitfor all
set nf [molinfo $mol get numframes]
set sel [atomselect $mol all]
for {set i 0} {$i < $nf} {incr i} {
   $sel frame $i
   $sel writepdb ubq_$i.pdb
}
$sel delete
unset sel

J> I was thinking that the dcd file might be too large to be read by vmd, so

depends on how much memory you have in your machine. .dcd files are
binary and fairly compact but recent VMD versions have been quite
conservative on memory usage, so if you have 12-16GB main memory,
you might be able to fit it in (with the corrected non-memory leaking
script code).

but since you process it frame by frame, using the bigdcd script
should actually do the trick nicely and would require only enough
memory to keep one or two frames in memory. see the vmd script library
and the many questions to this mailing list for details.
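
a minimal sketch of the bigdcd route, under several assumptions: that
bigdcd.tcl from the vmd script library has been saved into the current
directory, that it calls the given proc once per loaded frame with a
running frame counter as its only argument, and that your copy provides
bigdcd_wait. the callback name write_frame_pdb is made up for this
example. check the header comments of your bigdcd.tcl before relying on it.

```tcl
# sketch only; the bigdcd calling convention below is an assumption,
# verify it against the comments in your copy of bigdcd.tcl.
source bigdcd.tcl

# hypothetical callback: write the frame currently held in memory.
# the selection is created and deleted per call so nothing accumulates.
proc write_frame_pdb {frame} {
    global mol
    set sel [atomselect $mol all]
    $sel writepdb ubq_$frame.pdb
    $sel delete
}

# load only the structure; bigdcd feeds in the trajectory frame by frame
set mol [mol new ubq_solution_wb.psf type psf]
bigdcd write_frame_pdb dcd ubq_wb.12ns.dcd

# assumed to block until the whole trajectory has been processed
bigdcd_wait
```

the frame counter handed to the callback may start at 1 rather than 0,
so the output file names can be shifted by one compared to the loop above.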

J> I thought I could split it up using sed. However when I do that vmd is

ouch^2. .dcd files are binary and sed is a text tool. but more
importantly, if you don't know the format, you cannot split them
easily. each .dcd file has a header and then some frames, so you'd
have to cut off the header from the rest, then split the rest into
appropriately sized chunks (the size can be determined with hexdump,
if you know the details of the format and how binary unformatted
fortran files are written) and then you have to combine each chunk
with the proper header. the tool to do this would be 'dd', but it
is horribly complicated (though it is the only chance of rescuing
corrupt files) and unless you run into a problem with a 2GB file size
limit on your platform, there should be no problem with using bigdcd
on the large file.
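
to illustrate the header/frame arithmetic described above, here is a
plain tcl sketch (no vmd needed) that works out the header size and the
per-frame size from the leading bytes of a .dcd file. it assumes the
common charmm/namd layout: 32-bit fortran record markers, an 84-byte
first record starting with "CORD", no fixed atoms and no 4th dimension.
the proc name dcd_layout is made up for this example.

```tcl
# sketch: compute header and frame sizes of a DCD file from its first bytes.
# assumes 32-bit fortran record markers, no fixed atoms, no 4th dimension.
proc dcd_layout {data} {
    # the first record marker must be 84; try little-endian, then big-endian
    if {[binary scan $data i marker] != 1} { error "header too short" }
    set int i
    if {$marker != 84} {
        binary scan $data I marker
        if {$marker != 84} { error "no 84-byte first record found" }
        set int I
    }
    binary scan $data "@4a4" magic
    if {$magic ne "CORD"} { error "missing CORD magic" }

    # control words start at byte 8, four bytes each
    binary scan $data "@8${int}"  nframes   ;# number of frames (NSET)
    binary scan $data "@40${int}" nfixed    ;# number of fixed atoms
    binary scan $data "@48${int}" hascell   ;# unit cell record per frame?
    binary scan $data "@52${int}" has4d     ;# 4th dimension record?
    if {$nfixed != 0 || $has4d != 0} {
        error "fixed atoms or 4D data: frame sizes are not uniform here"
    }

    # title record starts at byte 92: marker, NTITLE, 80*NTITLE chars, marker
    if {[binary scan $data "@96${int}" ntitle] != 1} { error "header too short" }
    # atom count record follows: marker, NATOM, marker
    set natom_off [expr {92 + 4 + 4 + 80*$ntitle + 4 + 4}]
    if {[binary scan $data "@${natom_off}${int}" natoms] != 1} {
        error "header too short"
    }
    set header [expr {$natom_off + 4 + 4}]

    # each frame: optional cell record (48 bytes + 2 markers), then x, y, z
    # records of natoms 4-byte floats, each wrapped in two 4-byte markers
    set frame [expr {3*(4*$natoms + 8) + ($hascell ? 56 : 0)}]
    return [list header $header frame $frame natoms $natoms nframes $nframes]
}
```

under these assumptions frame k (counting from 0) starts at byte
header + k*frame, which are the numbers a dd-based split would need.
note that the frame count stored in a copied header (nframes here) would
no longer match the number of frames in a chunk.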

cheers,
   axel.

J> not able to read the split up dcd files, except for the first one. Any
J> help would be appreciated.
J>
J>
J> Thanks,
J>
J> Jouko
J>

-- 
=======================================================================
Axel Kohlmeyer   akohlmey_at_cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.