Re: parallel code in tclforce script

From: Axel Kohlmeyer
Date: Fri Sep 06 2013 - 00:48:44 CDT

On Fri, Sep 6, 2013 at 1:48 AM, Teerapong Pirojsirikul wrote:
> Hi Axel and Norman,
> Thank you very much for your advice. As for writing a C plugin, it

you are doing the second step before the first. have you measured how
much time is actually spent in your tclforces script per timestep?

> sounds worth trying to me. However, I don't quite have a clear picture
> of how this actually works because I have no experience writing any
> plugins. For example, how can I make use of the Tcl built-in
> commands such as getbond, vecsub, and other stuff in writing a C plugin?

the whole point about writing such a plugin would be to *avoid* having
to use them and compute what these commands do more efficiently and
directly in C. but - again - before even considering that, two things
need to be found out: how much time is spent in total on this part of
the calculation (i.e. how much would be the gain and thus warrant the
effort) and is the existing code as efficient as it could be.
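
a minimal sketch of such a measurement, assuming a Tcl 8.5 or newer
interpreter (for "clock microseconds"). do_work is a hypothetical
stand-in for the body of calcforces and is not part of NAMD or of the
original script; only the timing pattern around it matters.

```tcl
# sketch only: accumulate the wall-clock time spent in a per-step callback.
# do_work is a hypothetical placeholder for the real calcforces body.
proc do_work {n} {
    set sum 0.0
    for {set i 0} {$i < $n} {incr i} {
        set sum [expr {$sum + sqrt($i)}]
    }
    return $sum
}

set nsteps 0
set total_us 0

# wrap the work with timestamps and add the difference to a running total
proc timed_step {} {
    global nsteps total_us
    set t0 [clock microseconds]
    do_work 10000
    set t1 [clock microseconds]
    incr total_us [expr {$t1 - $t0}]
    incr nsteps
}

# stand-in for the MD loop calling the callback once per time step
for {set step 0} {$step < 20} {incr step} {
    timed_step
}

set avg_us [expr {double($total_us) / $nsteps}]
puts [format "average time per call: %.1f us over %d calls" $avg_us $nsteps]
```

the average per call can then be compared with the wall time per step
that NAMD reports in its log, to see what fraction of a step the script
actually costs.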

this is not the mailing list for Tcl programming, and i don't have the
time to give you a tutorial in writing a Tcl plugin in C. the Tcl
documentation has detailed explanations for that.

> Anyway, I have searched through namd mailing list and found the
> following thread, but I'm not sure whether it might apply to my
> problem.

what makes you even think it could?? outside of mentioning a plugin,
there is nothing pertinent to your problem.

> Also, may I put the part of the 'for loop' I would like to speedup
> calculating here.
> # This piece of code is in the calcforces proc
> for {set i 1} {$i <= $oxnum} {incr i} {
>     set indx [lindex $ox [expr {$i - 1}]]
>     set coor $coords($indx)
>     set blength [getbond $coor $r1c]
>     # only O atoms within the cutoff of Mg are processed
>     if {$blength <= $cutoff} {
>         # do sth
>     }
> }
> Here the $oxnum ~ 10000 atoms and I have to search through this set of
> atoms just to get a few particles (oxygens) within the $cutoff of the
> target atom (Mg) in every step. This is the most time-consuming part
> that I want to improve.

have you read *anything* about how interactions with a cutoff are sped
up in MD simulations? does the term neighbor list or verlet list mean
anything to you? if not, please read up on them, because this would be
the solution to your problem. rather than looping over all possible
pairs in every time step, you first build a list of atoms that are
within the cutoff plus a margin (say 2-3 angstrom) of the reference
atom and then you loop over the atoms in this list only. since atoms
don't move around quickly, you can reuse this list for several time
steps, say 5 or perhaps even 10 (needs to be tested), and with this you
suddenly make this loop almost 5 or 10 times faster, respectively.
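
a minimal sketch of this idea in plain Tcl, under stated assumptions:
dist is a hypothetical helper standing in for getbond so that the
example runs outside of NAMD, and the coordinates, margin and
rebuild_every values are made up for illustration. it is not the
original script, only the list-building pattern.

```tcl
# sketch only: a verlet-style neighbor list around one reference atom.
# dist stands in for getbond; all numbers below are illustration values.
proc dist {a b} {
    foreach {ax ay az} $a break
    foreach {bx by bz} $b break
    set dx [expr {$ax - $bx}]
    set dy [expr {$ay - $by}]
    set dz [expr {$az - $bz}]
    return [expr {sqrt($dx*$dx + $dy*$dy + $dz*$dz)}]
}

# full scan: keep every atom within cutoff + margin of the reference
proc build_neighbors {oxlist coordsName ref cutoff margin} {
    upvar 1 $coordsName coords
    set rlist [expr {$cutoff + $margin}]
    set neigh {}
    foreach indx $oxlist {
        if {[dist $coords($indx) $ref] <= $rlist} {
            lappend neigh $indx
        }
    }
    return $neigh
}

# toy data: four "oxygens" on the x axis, reference atom at the origin
array set coords {
    1 {2.0 0.0 0.0}
    2 {4.5 0.0 0.0}
    3 {6.5 0.0 0.0}
    4 {20.0 0.0 0.0}
}
set ox {1 2 3 4}
set r1c {0.0 0.0 0.0}
set cutoff 5.0
set margin 2.0
set rebuild_every 5

set neigh {}
set hits {}
for {set step 0} {$step < 10} {incr step} {
    # expensive scan over all atoms only every rebuild_every steps
    if {$step % $rebuild_every == 0} {
        set neigh [build_neighbors $ox coords $r1c $cutoff $margin]
    }
    # cheap loop over the short list on every step
    set hits {}
    foreach indx $neigh {
        if {[dist $coords($indx) $r1c] <= $cutoff} {
            lappend hits $indx
        }
    }
}
puts "neighbor list: $neigh"
puts "within cutoff: $hits"
```

the margin and the rebuild interval go together: the list stays valid
only as long as no atom can travel from outside cutoff plus margin to
inside the cutoff between two rebuilds, which is why both need to be
tested for the system at hand.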

there may be more optimization potential, but with only seeing a
fragment and not knowing what the purpose of the entire Tcl code is,
there is little room to provide more and better help.


> Any further advice would be appreciated.
> Tee
> 2013/9/5 Axel Kohlmeyer:
>> On Thu, Sep 5, 2013 at 8:45 AM, Norman Geist wrote:
>>> I would expect NAMD to have the full TCL included, so therefore if it is
>>> possible in general with TCL, why not. If you google around a little "tcl
>>> parallel" "tcl threading" or what might be even more interesting "tcl mpi"
>>> you will find a lot. You will also find a nice looking tcl/mpi interface on
>>> Axels Homepage. In addition with a NAMD build against MPI, you might be able
>>> to use the same startup environment to run your tcl script on the same nodes
>>> out of namd directly.
>> i don't think the latter will work for two reasons:
>> 1) you are only allowed to call MPI_Init() once in an application, so you
>> would have to implement a way to pass the MPI communicator from
>> charm++ through NAMD to TclMPI
>> 2) TclForces scripts are only called on the head node (unlike TclBC),
>> if i remember correctly
>> threading may be tricky as well: while threading in Tcl has been
>> supported for a while as a compile time option, it has not been
>> compiled in by default until the very latest release.
>> in any case, over many years i have observed that when people ask
>> about needing to parallelize script code, that the best solution is
>> usually a different one. thus what i would recommend to do is to first
>> benchmark how much time is spent on the TclForces script relative to
>> the total time of the time step. it only makes sense to optimize or
>> parallelize something that takes a lot of time. second, i would try to
>> optimize the script. many times, there is significant optimization
>> potential and then the need to parallelize may vanish. third, it may
>> be possible to write a C plugin (just as many VMD
>> plugins have little "helper modules" written in C) to do the same task
>> and gain a significant speedup. at that level it would also be trivial
>> to include OpenMP based multi-threading for additional parallelization
>> at least on the head node.
>> axel.
>>> Norman Geist.
>>>> -----Original Message-----
>>>> From: [] On
>>>> Behalf Of Teerapong Pirojsirikul
>>>> Sent: Thursday, 5 September 2013 05:03
>>>> To: NAMD list
>>>> Subject: namd-l: parallel code in tclforce script
>>>> Hi NAMD users,
>>>> Does anyone have any idea whether or not we can write parallel code in a
>>>> tclforce script? I have a 'for loop' that goes through a large no. of
>>>> atoms and would like to parallelize it.
>>>> Best,
>>>> Tee
>> --
>> Dr. Axel Kohlmeyer
>> International Centre for Theoretical Physics, Trieste. Italy.

