Lattice Cutoff Computation For Multilevel Summation
---------------------------------------------------

Use the makefile with GNU make to build the following programs:

  genrandq - generate a random charge grid of indicated size

  compgrid - compute potential grid from given charge grid

  diffgrid - report the difference (error) between two grids


Simple implementations of the lattice cutoff algorithms are in
lattice_cutoff.c (built into compgrid) for you to examine and
test your code against.


Work through the following example:

$ make

$ ./genrandq 23 19 22 q.dat
  (generate random 23 x 19 x 22 charge grid)

$ ./compgrid q.dat ecutoff.dat
  (compute potentials using cutoff algorithm, nonperiodic boundaries)

$ ./compgrid -v q.dat edirect.dat
  (use direct algorithm, nonperiodic boundaries)

$ ./diffgrid ecutoff.dat edirect.dat
  (should indicate that grids are identical)

$ ./compgrid -p q.dat epcutoff.dat
  (use cutoff algorithm, periodic boundaries - note how slow this is!
  It's slow because of the remainder operator used in the inner loop
  that wraps an index around the edge of the grid.  You'll need to
  determine a better way to do this.)

$ ./compgrid -v -p q.dat epdirect.dat
  (use direct algorithm, periodic boundaries)

$ ./diffgrid epcutoff.dat epdirect.dat
  (this time you'll have some roundoff error due to summing the terms
  in a different order)
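
For the periodic cutoff run above, one possible way to keep the
remainder operator out of the inner loop is to precompute a table of
wrapped indices once per dimension.  The sketch below is only an
illustration in plain C, reduced to one dimension; the names
build_wrap_table and periodic_sum_1d are hypothetical and do not
appear in lattice_cutoff.c, and this is not necessarily the best
strategy for a GPU.

```c
#include <assert.h>
#include <stdlib.h>

/* Build a lookup table t of length n + 2*r such that t[i + r] is the
 * periodic image in [0, n) of index i, for -r <= i < n + r.
 * The remainder operator is used only here, outside the hot loop.
 * Returns NULL on allocation failure; the caller frees the table.
 * (Hypothetical helper, not part of the provided sources.) */
int *build_wrap_table(int n, int r)
{
    assert(n > 0 && r >= 0);
    int *t = malloc((size_t)(n + 2 * r) * sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    for (int i = -r; i < n + r; i++) {
        int m = i % n;          /* C remainder may be negative */
        if (m < 0) {
            m += n;
        }
        t[i + r] = m;
    }
    return t;
}

/* 1-D periodic cutoff sum, standing in for the 3-D inner loop:
 * e[i] = sum over k in [-r, r] of w[k + r] * q[wrap(i + k)].
 * The wrap is a table lookup, so no remainder appears in the loop. */
void periodic_sum_1d(int n, int r, const float *w, const float *q,
                     const int *wrap, float *e)
{
    for (int i = 0; i < n; i++) {
        float sum = 0.0f;
        for (int k = -r; k <= r; k++) {
            sum += w[k + r] * q[wrap[i + k + r]];
        }
        e[i] = sum;
    }
}
```

In three dimensions the same idea would use one table per axis.
Another common option is to copy the charge grid into a buffer padded
with r periodic "ghost" layers on each side, so that the inner loop
needs no wrapping at all.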

Run "make opcount" and then re-run the cutoff algorithms to get the
floating-point operation count.


For this project, you need to implement the cutoff routines in
lattice_cutoff.c with CUDA.  You will need to find a better indexing
strategy for periodic boundaries than the one presented here (the
remainder operator would kill your GPU performance).  Your code needs
to be able to handle lattice sizes ranging from 25^3 up to 150^3, and
you are welcome to push these limits.  The extent of the weight
lattice is determined by the integer radius r = ceil(2*a/h)-1, where
a is the cutoff distance and h is the grid spacing (see
lattice_weights.c for details on the softened potential function to
be used).  A weight lattice of radius r has 2*r+1 points in each
dimension.  You can expect the parameters to stay within 8<=a<=12 and
2<=h<=3, so r ranges from 5 (a=8, h=3) to 11 (a=12, h=2) and the
weight lattices range in size from 11^3 up to 23^3.
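
The radius formula above can be checked with a few lines of C.  This
is a minimal sketch; the function names weight_radius and weight_dim
are made up for illustration and are not taken from lattice_weights.c.
It assumes the weight lattice spans 2*radius+1 points per dimension,
which is what the quoted sizes imply.

```c
#include <assert.h>

/* Ceiling of a positive value, written out by hand so the sketch
 * does not need to link against the math library. */
int ceil_positive(double x)
{
    assert(x > 0.0);
    int n = (int) x;            /* truncates toward zero */
    if ((double) n < x) {
        n++;
    }
    return n;
}

/* Integer radius of the weight lattice, ceil(2*a/h) - 1, for cutoff
 * distance a and grid spacing h.  (Hypothetical helper.) */
int weight_radius(double a, double h)
{
    assert(a > 0.0 && h > 0.0);
    return ceil_positive(2.0 * a / h) - 1;
}

/* Points per dimension of the weight lattice, assuming it extends
 * from -radius to +radius inclusive. */
int weight_dim(double a, double h)
{
    return 2 * weight_radius(a, h) + 1;
}
```

For example, weight_dim(8.0, 3.0) gives the smallest lattice size and
weight_dim(12.0, 2.0) the largest one in the expected parameter range.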

You should systematically test the full range of grid dimensions and
method parameters.  Your concluding report needs to explain your
implementation(s), present the performance results, and discuss any
hardware restrictions that limit your performance.
