NAMD
Macros | Enumerations | Functions | Variables
msm_longrng_sprec.c File Reference
#include "msm_defn.h"

Go to the source code of this file.

Macros

#define STENCIL_1D(_phi, _delta, _approx)
 
#define D_STENCIL_1D(_dphi, _phi, _h_1, _delta, _approx)
 

Enumerations

enum  { MAX_POLY_DEGREE = 9 }
 
enum  { MAX_NSTENCIL = 11 }
 

Functions

static int anterpolation (NL_Msm *)
 
static int interpolation (NL_Msm *)
 
static int restriction (NL_Msm *, int level)
 
static int prolongation (NL_Msm *, int level)
 
static int restriction_factored (NL_Msm *, int level)
 
static int prolongation_factored (NL_Msm *, int level)
 
static int gridcutoff (NL_Msm *, int level)
 
int NL_msm_compute_long_range_sprec (NL_Msm *msm)
 

Variables

static const int PolyDegree [NL_MSM_APPROX_END]
 
static const float PhiStencilFactored [NL_MSM_APPROX_END][2 *MAX_POLY_DEGREE+1]
 
static const int Nstencil [NL_MSM_APPROX_END]
 
static const int IndexOffset [NL_MSM_APPROX_END][MAX_NSTENCIL]
 
static const float PhiStencil [NL_MSM_APPROX_END][MAX_NSTENCIL]
 

Macro Definition Documentation

#define D_STENCIL_1D (   _dphi,
  _phi,
  _h_1,
  _delta,
  _approx 
)

Calculate the stencil of basis function and derivatives of (1/h)Phi. dphi - stencil array (up to size MAX_POLY_DEGREE+1) phi - stencil array (up to size MAX_POLY_DEGREE+1) h_1 - 1/h, h is the grid spacing delta - normalized distance of atom from lowest grid point in stencil approx - APPROX enum value from msm.h

Definition at line 468 of file msm_longrng_sprec.c.

Referenced by interpolation().

#define STENCIL_1D (   _phi,
  _delta,
  _approx 
)

Calculate the stencil of basis function values of Phi. phi - stencil array (up to size MAX_POLY_DEGREE+1) delta - normalized distance of atom from lowest grid point in stencil approx - APPROX enum value from msm.h

Definition at line 280 of file msm_longrng_sprec.c.

Referenced by anterpolation().

Enumeration Type Documentation

anonymous enum

Approximation formulaes are up to degree 9 polynomials.

Enumerator
MAX_POLY_DEGREE 

Definition at line 159 of file msm_longrng_sprec.c.

anonymous enum

Max stencil length is basically PolyDegree+2 for those approximations that interpolate. (We skip over zero in the complete stencils above.)

Enumerator
MAX_NSTENCIL 

Definition at line 205 of file msm_longrng_sprec.c.

205 { MAX_NSTENCIL = 11 };

Function Documentation

int anterpolation ( NL_Msm msm)
static

msm_longrng.c

Compute the long-range part of MSM forces, using single precision.

The entire procedure can be depicted as an inverse V-cycle with bars across each level to represent the short-range gridded calculations. The algorithm first performs anterpolation of the charge to the finest level grid. Each intermediate grid level, below the coarsest level at the top, calculates a more slowly varying long-range part of the interaction by restricting the charges to a grid with twice the spacing. Each grid level also has a short-range cutoff part between charges that contributes to the potentials. Also summed to the potentials at each intermediate grid level are the long-range contributions prolongated from a grid with twice the spacing. Finally, the forces are interpolated (or approximated if the basis functions do not interpolate) from the finest level grid of potentials. The steps of the algorithm are detailed in section 2.3 of the thesis.

The restriction and prolongation operations use the factored approach having a work constant which is linear in the degree of the polynomial, i.e. O(pM), where p is the degree and M is the number of grid points.

The algorithm below uses stencil widths expressed as a function of the degree of the basis function polynomial Phi. The constants for restriction and prolongation stencils are stored in an array indexed by the APPROX enum defined in msm.h header. Calculations of the grid stencils of the 1D Phi and its derivative, needed for interpolation and anterpolation, are expressed as macros that use the APPROX enum to select the polynomial pieces from a switch statement. Periodic boundaries are handled by wrapping around the edges of the grid, as discussed in section 6.2 of the thesis.

XXX The short-range gridded calculations contribute to the virial, but it is not calculated.

XXX The factored restriction and prolongation procedures are not suitable for parallel and GPU computation due to the sequential dependence along each dimension of the grid.

Definition at line 779 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, ASSERT, NL_Msm_t::atom_f, GRID_INDEX, GRID_INDEX_CHECK, GRID_ZERO, NL_Msm_t::gx_f, NL_Msm_t::gy_f, NL_Msm_t::gz_f, NL_Msm_t::hx_f, NL_Msm_t::hy_f, NL_Msm_t::hz_f, MAX_POLY_DEGREE, NL_Msm_t::msmflags, NL_MSM_ERROR_RANGE, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, NL_Msm_t::numatoms, PolyDegree, NL_Msm_t::qh_f, and STENCIL_1D.

Referenced by NL_msm_compute_long_range_sprec().

780 {
781  const float *atom = msm->atom_f;
782  const int natoms = msm->numatoms;
783 
784  float xphi[MAX_POLY_DEGREE+1]; /* Phi stencil along x-dimension */
785  float yphi[MAX_POLY_DEGREE+1]; /* Phi stencil along y-dimension */
786  float zphi[MAX_POLY_DEGREE+1]; /* Phi stencil along z-dimension */
787  float rx_hx, ry_hy, rz_hz; /* normalized distance from atom to origin */
788 #ifndef TEST_INLINING
789  float delta; /* normalized distance to stencil point */
790 #else
791  float t; /* normalized distance to stencil point */
792 #endif
793  float ck, cjk;
794  const float hx_1 = 1/msm->hx_f;
795  const float hy_1 = 1/msm->hy_f;
796  const float hz_1 = 1/msm->hz_f;
797  const float xm0 = msm->gx_f; /* grid origin x */
798  const float ym0 = msm->gy_f; /* grid origin y */
799  const float zm0 = msm->gz_f; /* grid origin z */
800  float q;
801 
802  NL_Msmgrid_float *qhgrid = &(msm->qh_f[0]);
803  float *qh = qhgrid->data;
804  const int ni = qhgrid->ni;
805  const int nj = qhgrid->nj;
806  const int nk = qhgrid->nk;
807  const int ia = qhgrid->i0;
808  const int ib = ia + ni - 1;
809  const int ja = qhgrid->j0;
810  const int jb = ja + nj - 1;
811  const int ka = qhgrid->k0;
812  const int kb = ka + nk - 1;
813 
814  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
815  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
816  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
817  int iswithin;
818 
819  int n, i, j, k, ilo, jlo, klo, koff;
820  int jkoff, index;
821 
822  const int approx = msm->approx;
823  const int s_size = PolyDegree[approx] + 1; /* stencil size */
824  const int s_edge = (PolyDegree[approx] - 1) >> 1; /* stencil "edge" size */
825 
826  GRID_ZERO(qhgrid);
827 
828  for (n = 0; n < natoms; n++) {
829 
830  /* atomic charge */
831  q = atom[4*n + 3];
832  if (0==q) continue;
833 
834  /* distance between atom and origin measured in grid points */
835  rx_hx = (atom[4*n ] - xm0) * hx_1;
836  ry_hy = (atom[4*n + 1] - ym0) * hy_1;
837  rz_hz = (atom[4*n + 2] - zm0) * hz_1;
838 
839  /* find smallest numbered grid point in stencil */
840  ilo = (int) floorf(rx_hx) - s_edge;
841  jlo = (int) floorf(ry_hy) - s_edge;
842  klo = (int) floorf(rz_hz) - s_edge;
843 
844  /* calculate Phi stencils along each dimension */
845 #ifndef TEST_INLINING
846  delta = rx_hx - (float) ilo;
847  STENCIL_1D(xphi, delta, approx);
848  delta = ry_hy - (float) jlo;
849  STENCIL_1D(yphi, delta, approx);
850  delta = rz_hz - (float) klo;
851  STENCIL_1D(zphi, delta, approx);
852 #else
853  t = rx_hx - (float) ilo;
854  xphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
855  t--; \
856  xphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
857  t--; \
858  xphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
859  t--; \
860  xphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
861 
862  t = ry_hy - (double) jlo;
863  yphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
864  t--; \
865  yphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
866  t--; \
867  yphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
868  t--; \
869  yphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
870 
871  t = rz_hz - (double) klo;
872  zphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
873  t--; \
874  zphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
875  t--; \
876  zphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
877  t--; \
878  zphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
879 
880 #endif
881 
882  /* test to see if stencil is within edges of grid */
883  iswithin = ( ia <= ilo && (ilo+(s_size-1)) <= ib &&
884  ja <= jlo && (jlo+(s_size-1)) <= jb &&
885  ka <= klo && (klo+(s_size-1)) <= kb );
886 
887  if ( iswithin ) { /* no wrapping needed */
888 
889  /* determine charge on cube of grid points around atom */
890  for (k = 0; k < s_size; k++) {
891  koff = (k + klo) * nj;
892  ck = zphi[k] * q;
893  for (j = 0; j < s_size; j++) {
894  jkoff = (koff + (j + jlo)) * ni;
895  cjk = yphi[j] * ck;
896  for (i = 0; i < s_size; i++) {
897  index = jkoff + (i + ilo);
898  GRID_INDEX_CHECK(qhgrid, i+ilo, j+jlo, k+klo);
899  ASSERT(GRID_INDEX(qhgrid, i+ilo, j+jlo, k+klo) == index);
900  qh[index] += xphi[i] * cjk;
901  }
902  }
903  }
904  } /* if */
905 
906  else { /* requires wrapping around grid */
907  int ip, jp, kp;
908 
909  /* adjust ilo, jlo, klo so they are within grid indexing */
910  if (ispx) {
911  if (ilo < ia) do { ilo += ni; } while (ilo < ia);
912  else if (ilo > ib) do { ilo -= ni; } while (ilo > ib);
913  }
914  else if (ilo < ia || (ilo+(s_size-1)) > ib) {
915  return NL_MSM_ERROR_RANGE;
916  }
917 
918  if (ispy) {
919  if (jlo < ja) do { jlo += nj; } while (jlo < ja);
920  else if (jlo > jb) do { jlo -= nj; } while (jlo > jb);
921  }
922  else if (jlo < ja || (jlo+(s_size-1)) > jb) {
923  return NL_MSM_ERROR_RANGE;
924  }
925 
926  if (ispz) {
927  if (klo < ka) do { klo += nk; } while (klo < ka);
928  else if (klo > kb) do { klo -= nk; } while (klo > kb);
929  }
930  else if (klo < ka || (klo+(s_size-1)) > kb) {
931  return NL_MSM_ERROR_RANGE;
932  }
933 
934  /* determine charge on cube of grid points around atom, with wrapping */
935  for (k = 0, kp = klo; k < s_size; k++, kp++) {
936  if (kp > kb) kp = ka; /* wrap stencil around grid */
937  koff = kp * nj;
938  ck = zphi[k] * q;
939  for (j = 0, jp = jlo; j < s_size; j++, jp++) {
940  if (jp > jb) jp = ja; /* wrap stencil around grid */
941  jkoff = (koff + jp) * ni;
942  cjk = yphi[j] * ck;
943  for (i = 0, ip = ilo; i < s_size; i++, ip++) {
944  if (ip > ib) ip = ia; /* wrap stencil around grid */
945  index = jkoff + ip;
946  GRID_INDEX_CHECK(qhgrid, ip, jp, kp);
947  ASSERT(GRID_INDEX(qhgrid, ip, jp, kp) == index);
948  qh[index] += xphi[i] * cjk;
949  }
950  }
951  }
952  } /* else */
953 
954  } /* end loop over atoms */
955 
956  return NL_MSM_SUCCESS;
957 } /* anterpolation */
int msmflags
Definition: msm_defn.h:590
static const int PolyDegree[NL_MSM_APPROX_END]
int approx
Definition: msm_defn.h:642
#define GRID_INDEX_CHECK(a, _i, _j, _k)
Definition: msm_defn.h:114
const float * atom_f
Definition: msm_defn.h:572
int numatoms
Definition: msm_defn.h:599
float gz_f
Definition: msm_defn.h:648
float hy_f
Definition: msm_defn.h:637
float hx_f
Definition: msm_defn.h:637
#define GRID_ZERO(_p)
Definition: msm_defn.h:98
#define GRID_INDEX(_p, _i, _j, _k)
Definition: msm_defn.h:66
float hz_f
Definition: msm_defn.h:637
#define ASSERT(E)
#define STENCIL_1D(_phi, _delta, _approx)
float gx_f
Definition: msm_defn.h:648
NL_Msmgrid_float * qh_f
Definition: msm_defn.h:670
float gy_f
Definition: msm_defn.h:648
int gridcutoff ( NL_Msm msm,
int  level 
)
static

Definition at line 1690 of file msm_longrng_sprec.c.

References ASSERT, NL_Msm_t::eh_f, NL_Msm_t::gc_f, GRID_INDEX, GRID_INDEX_CHECK, NL_Msm_t::msmflags, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, and NL_Msm_t::qh_f.

Referenced by NL_msm_compute_long_range_sprec().

1691 {
1692  float eh_sum;
1693 
1694  /* grids of charge and potential */
1695  const NL_Msmgrid_float *qhgrid = &(msm->qh_f[level]);
1696  const float *qh = qhgrid->data;
1697  NL_Msmgrid_float *ehgrid = &(msm->eh_f[level]);
1698  float *eh = ehgrid->data;
1699  const int ia = qhgrid->i0; /* lowest x-index */
1700  const int ib = ia + qhgrid->ni - 1; /* highest x-index */
1701  const int ja = qhgrid->j0; /* lowest y-index */
1702  const int jb = ja + qhgrid->nj - 1; /* highest y-index */
1703  const int ka = qhgrid->k0; /* lowest z-index */
1704  const int kb = ka + qhgrid->nk - 1; /* highest z-index */
1705  const int ni = qhgrid->ni; /* length along x-dim */
1706  const int nj = qhgrid->nj; /* length along y-dim */
1707  const int nk = qhgrid->nk; /* length along z-dim */
1708 
1709  /* grid of weights for pairwise grid point interactions within cutoff */
1710  const NL_Msmgrid_float *gcgrid = &(msm->gc_f[level]);
1711  const float *gc = gcgrid->data;
1712  const int gia = gcgrid->i0; /* lowest x-index */
1713  const int gib = gia + gcgrid->ni - 1; /* highest x-index */
1714  const int gja = gcgrid->j0; /* lowest y-index */
1715  const int gjb = gja + gcgrid->nj - 1; /* highest y-index */
1716  const int gka = gcgrid->k0; /* lowest z-index */
1717  const int gkb = gka + gcgrid->nk - 1; /* highest z-index */
1718  const int gni = gcgrid->ni; /* length along x-dim */
1719  const int gnj = gcgrid->nj; /* length along y-dim */
1720 
1721  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1722  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1723  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1724 
1725  const int ispnone = !(ispx || ispy || ispz);
1726 
1727  int i, j, k;
1728  int gia_clip, gib_clip;
1729  int gja_clip, gjb_clip;
1730  int gka_clip, gkb_clip;
1731  int koff;
1732  long jkoff, index;
1733  int id, jd, kd;
1734  int knoff;
1735  long jknoff, nindex;
1736  int kgoff, jkgoff, ngindex;
1737 
1738 #ifdef NL_DEBUG
1739  if (level==0) {
1740  const float *gcbuf = gcgrid->buffer;
1741  int index = 0;
1742  printf("+++ qh[0,0,0] = %g\n", qh[0]);
1743 #if 0
1744  for (k = gka; k <= gkb; k++) {
1745  for (j = gja; j <= gjb; j++) {
1746  for (i = gia; i <= gib; i++, index++) {
1747  printf("=== gc[%d,%d,%d] = %g\n", i, j, k, gcbuf[index]);
1748  }
1749  }
1750  }
1751 #endif
1752  }
1753 #endif
1754 
1755  if ( ispnone ) { /* non-periodic boundaries */
1756 
1757  /* loop over all grid points */
1758  for (k = ka; k <= kb; k++) {
1759 
1760  /* clip gc ranges to keep offset for k index within grid */
1761  gka_clip = (k + gka < ka ? ka - k : gka);
1762  gkb_clip = (k + gkb > kb ? kb - k : gkb);
1763 
1764  koff = k * nj; /* find eh flat index */
1765 
1766  for (j = ja; j <= jb; j++) {
1767 
1768  /* clip gc ranges to keep offset for j index within grid */
1769  gja_clip = (j + gja < ja ? ja - j : gja);
1770  gjb_clip = (j + gjb > jb ? jb - j : gjb);
1771 
1772  jkoff = (koff + j) * ni; /* find eh flat index */
1773 
1774  for (i = ia; i <= ib; i++) {
1775 
1776  /* clip gc ranges to keep offset for i index within grid */
1777  gia_clip = (i + gia < ia ? ia - i : gia);
1778  gib_clip = (i + gib > ib ? ib - i : gib);
1779 
1780  index = jkoff + i; /* eh flat index */
1781 
1782  /* sum over "sphere" of weighted charge */
1783  eh_sum = 0;
1784  for (kd = gka_clip; kd <= gkb_clip; kd++) {
1785  knoff = (k + kd) * nj; /* find qh flat index */
1786  kgoff = kd * gnj; /* find gc flat index */
1787 
1788  for (jd = gja_clip; jd <= gjb_clip; jd++) {
1789  jknoff = (knoff + (j + jd)) * ni; /* find qh flat index */
1790  jkgoff = (kgoff + jd) * gni; /* find gc flat index */
1791 
1792  for (id = gia_clip; id <= gib_clip; id++) {
1793  nindex = jknoff + (i + id); /* qh flat index */
1794  ngindex = jkgoff + id; /* gc flat index */
1795 
1796  GRID_INDEX_CHECK(qhgrid, i+id, j+jd, k+kd);
1797  ASSERT(GRID_INDEX(qhgrid, i+id, j+jd, k+kd) == nindex);
1798 
1799  GRID_INDEX_CHECK(gcgrid, id, jd, kd);
1800  ASSERT(GRID_INDEX(gcgrid, id, jd, kd) == ngindex);
1801 
1802  eh_sum += qh[nindex] * gc[ngindex]; /* sum weighted charge */
1803  }
1804  }
1805  } /* end loop over "sphere" of charge */
1806 
1807  GRID_INDEX_CHECK(ehgrid, i, j, k);
1808  ASSERT(GRID_INDEX(ehgrid, i, j, k) == index);
1809  eh[index] = eh_sum; /* store potential */
1810  }
1811  }
1812  } /* end loop over all grid points */
1813 
1814  } /* if nonperiodic boundaries */
1815  else {
1816  /* some boundary is periodic */
1817  int ilo, jlo, klo;
1818  int ip, jp, kp;
1819 
1820  /* loop over all grid points */
1821  for (k = ka; k <= kb; k++) {
1822  klo = k + gka;
1823  if ( ! ispz ) { /* nonperiodic z */
1824  /* clip gc ranges to keep offset for k index within grid */
1825  gka_clip = (k + gka < ka ? ka - k : gka);
1826  gkb_clip = (k + gkb > kb ? kb - k : gkb);
1827  if (klo < ka) klo = ka; /* keep lowest qh index within grid */
1828  }
1829  else { /* periodic z */
1830  gka_clip = gka;
1831  gkb_clip = gkb;
1832  if (klo < ka) do { klo += nk; } while (klo < ka);
1833  }
1834  ASSERT(klo <= kb);
1835 
1836  koff = k * nj; /* find eh flat index */
1837 
1838  for (j = ja; j <= jb; j++) {
1839  jlo = j + gja;
1840  if ( ! ispy ) { /* nonperiodic y */
1841  /* clip gc ranges to keep offset for j index within grid */
1842  gja_clip = (j + gja < ja ? ja - j : gja);
1843  gjb_clip = (j + gjb > jb ? jb - j : gjb);
1844  if (jlo < ja) jlo = ja; /* keep lowest qh index within grid */
1845  }
1846  else { /* periodic y */
1847  gja_clip = gja;
1848  gjb_clip = gjb;
1849  if (jlo < ja) do { jlo += nj; } while (jlo < ja);
1850  }
1851  ASSERT(jlo <= jb);
1852 
1853  jkoff = (koff + j) * ni; /* find eh flat index */
1854 
1855  for (i = ia; i <= ib; i++) {
1856  ilo = i + gia;
1857  if ( ! ispx ) { /* nonperiodic x */
1858  /* clip gc ranges to keep offset for i index within grid */
1859  gia_clip = (i + gia < ia ? ia - i : gia);
1860  gib_clip = (i + gib > ib ? ib - i : gib);
1861  if (ilo < ia) ilo = ia; /* keep lowest qh index within grid */
1862  }
1863  else { /* periodic x */
1864  gia_clip = gia;
1865  gib_clip = gib;
1866  if (ilo < ia) do { ilo += ni; } while (ilo < ia);
1867  }
1868  ASSERT(ilo <= ib);
1869 
1870  index = jkoff + i; /* eh flat index */
1871 
1872  /* sum over "sphere" of weighted charge */
1873  eh_sum = 0;
1874  for (kd = gka_clip, kp = klo; kd <= gkb_clip; kd++, kp++) {
1875  /* clipping makes conditional always fail for nonperiodic */
1876  if (kp > kb) kp = ka; /* wrap z direction */
1877  knoff = kp * nj; /* find qh flat index */
1878  kgoff = kd * gnj; /* find gc flat index */
1879 
1880  for (jd = gja_clip, jp = jlo; jd <= gjb_clip; jd++, jp++) {
1881  /* clipping makes conditional always fail for nonperiodic */
1882  if (jp > jb) jp = ja; /* wrap y direction */
1883  jknoff = (knoff + jp) * ni; /* find qh flat index */
1884  jkgoff = (kgoff + jd) * gni; /* find gc flat index */
1885 
1886  for (id = gia_clip, ip = ilo; id <= gib_clip; id++, ip++) {
1887  /* clipping makes conditional always fail for nonperiodic */
1888  if (ip > ib) ip = ia; /* wrap x direction */
1889  nindex = jknoff + ip; /* qh flat index */
1890  ngindex = jkgoff + id; /* gc flat index */
1891 
1892  GRID_INDEX_CHECK(qhgrid, ip, jp, kp);
1893  ASSERT(GRID_INDEX(qhgrid, ip, jp, kp) == nindex);
1894 
1895  GRID_INDEX_CHECK(gcgrid, id, jd, kd);
1896  ASSERT(GRID_INDEX(gcgrid, id, jd, kd) == ngindex);
1897 
1898  eh_sum += qh[nindex] * gc[ngindex]; /* sum weighted charge */
1899 
1900  }
1901  }
1902  } /* end loop over "sphere" of charge */
1903 
1904  GRID_INDEX_CHECK(ehgrid, i, j, k);
1905  ASSERT(GRID_INDEX(ehgrid, i, j, k) == index);
1906  eh[index] = eh_sum; /* store potential */
1907  }
1908  }
1909  } /* end loop over all grid points */
1910 
1911  } /* else some boundary is periodic */
1912 
1913  return NL_MSM_SUCCESS;
1914 }
int msmflags
Definition: msm_defn.h:590
#define GRID_INDEX_CHECK(a, _i, _j, _k)
Definition: msm_defn.h:114
NL_Msmgrid_float * gc_f
Definition: msm_defn.h:672
#define GRID_INDEX(_p, _i, _j, _k)
Definition: msm_defn.h:66
#define ASSERT(E)
NL_Msmgrid_float * eh_f
Definition: msm_defn.h:671
NL_Msmgrid_float * qh_f
Definition: msm_defn.h:670
int interpolation ( NL_Msm msm)
static

Definition at line 960 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, ASSERT, NL_Msm_t::atom_f, D_STENCIL_1D, NL_Msm_t::eh_f, NL_Msm_t::felec_f, GRID_INDEX, GRID_INDEX_CHECK, NL_Msm_t::gx_f, NL_Msm_t::gy_f, NL_Msm_t::gz_f, NL_Msm_t::gzero_f, NL_Msm_t::hx_f, NL_Msm_t::hy_f, NL_Msm_t::hz_f, MAX_POLY_DEGREE, NL_Msm_t::msmflags, NL_MSM_ERROR_RANGE, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, NL_Msm_t::numatoms, PolyDegree, NL_Msm_t::qh_f, and NL_Msm_t::uelec.

Referenced by NL_msm_compute_long_range_sprec().

961 {
962  float *f = msm->felec_f;
963  const float *atom = msm->atom_f;
964  const int natoms = msm->numatoms;
965 
966  float xphi[MAX_POLY_DEGREE+1]; /* Phi stencil along x-dimension */
967  float yphi[MAX_POLY_DEGREE+1]; /* Phi stencil along y-dimension */
968  float zphi[MAX_POLY_DEGREE+1]; /* Phi stencil along z-dimension */
969  float dxphi[MAX_POLY_DEGREE+1]; /* derivative of Phi along x-dimension */
970  float dyphi[MAX_POLY_DEGREE+1]; /* derivative of Phi along y-dimension */
971  float dzphi[MAX_POLY_DEGREE+1]; /* derivative of Phi along z-dimension */
972  float rx_hx, ry_hy, rz_hz; /* normalized distance from atom to origin */
973 #ifndef TEST_INLINING
974  float delta; /* normalized distance to stencil point */
975 #else
976  float t; /* normalized distance to stencil point */
977 #endif
978  const float hx_1 = 1/msm->hx_f;
979  const float hy_1 = 1/msm->hy_f;
980  const float hz_1 = 1/msm->hz_f;
981  const float xm0 = msm->gx_f; /* grid origin x */
982  const float ym0 = msm->gy_f; /* grid origin y */
983  const float zm0 = msm->gz_f; /* grid origin z */
984  float q;
985  double u = 0; /* need double precision for accumulating potential */
986 
987  const NL_Msmgrid_float *ehgrid = &(msm->eh_f[0]);
988  const float *eh = ehgrid->data;
989  const float *ebuf = ehgrid->buffer;
990  const NL_Msmgrid_float *qhgrid = &(msm->qh_f[0]);
991  const float *qbuf = qhgrid->buffer;
992  const int ni = ehgrid->ni;
993  const int nj = ehgrid->nj;
994  const int nk = ehgrid->nk;
995  const int ia = ehgrid->i0;
996  const int ib = ia + ni - 1;
997  const int ja = ehgrid->j0;
998  const int jb = ja + nj - 1;
999  const int ka = ehgrid->k0;
1000  const int kb = ka + nk - 1;
1001 
1002  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1003  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1004  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1005  int iswithin;
1006 
1007  float fx, fy, fz, cx, cy, cz;
1008 
1009  int n, i, j, k, ilo, jlo, klo, koff;
1010  long jkoff, index;
1011  const int nn = (ni*nj) * nk;
1012 
1013  const int approx = msm->approx;
1014  const int s_size = PolyDegree[approx] + 1; /* stencil size */
1015  const int s_edge = (PolyDegree[approx] - 1) >> 1; /* stencil "edge" size */
1016 
1017 #ifdef NL_DEBUG
1018 #if 1
1019  printf("+++ eh[0,0,0] = %g\n", eh[0]);
1020 #else
1021  n = 0;
1022  for (k = ka; k <= kb; k++) {
1023  for (j = ja; j <= jb; j++) {
1024  for (i = ia; i <= ib; i++, n++) {
1025  printf("=== eh[%d,%d,%d] = %g\n", i, j, k, ebuf[n]);
1026  }
1027  }
1028  }
1029 #endif
1030 #endif
1031 
1032  for (n = 0; n < natoms; n++) {
1033 
1034  /* atomic charge */
1035  q = atom[4*n + 3];
1036  if (0==q) continue;
1037 
1038  /* distance between atom and origin measured in grid points */
1039  rx_hx = (atom[4*n ] - xm0) * hx_1;
1040  ry_hy = (atom[4*n + 1] - ym0) * hy_1;
1041  rz_hz = (atom[4*n + 2] - zm0) * hz_1;
1042 
1043  /* find smallest numbered grid point in stencil */
1044  ilo = (int) floorf(rx_hx) - s_edge;
1045  jlo = (int) floorf(ry_hy) - s_edge;
1046  klo = (int) floorf(rz_hz) - s_edge;
1047 
1048  /* calculate Phi stencils and derivatives along each dimension */
1049 #ifndef TEST_INLINING
1050  delta = rx_hx - (float) ilo;
1051  //STENCIL_1D(xphi, delta, approx);
1052  D_STENCIL_1D(dxphi, xphi, hx_1, delta, approx);
1053  delta = ry_hy - (float) jlo;
1054  //STENCIL_1D(yphi, delta, approx);
1055  D_STENCIL_1D(dyphi, yphi, hy_1, delta, approx);
1056  delta = rz_hz - (float) klo;
1057  //STENCIL_1D(zphi, delta, approx);
1058  D_STENCIL_1D(dzphi, zphi, hz_1, delta, approx);
1059 #else
1060  t = rx_hx - (float) ilo;
1061  xphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
1062  dxphi[0] = (1.5f * t - 2) * (2 - t) * hx_1; \
1063  t--; \
1064  xphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
1065  dxphi[1] = (-5 + 4.5f * t) * t * hx_1; \
1066  t--; \
1067  xphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
1068  dxphi[2] = (-5 - 4.5f * t) * t * hx_1; \
1069  t--; \
1070  xphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
1071  dxphi[3] = (1.5f * t + 2) * (2 + t) * hx_1; \
1072 
1073  t = ry_hy - (double) jlo;
1074  yphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
1075  dyphi[0] = (1.5f * t - 2) * (2 - t) * hy_1; \
1076  t--; \
1077  yphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
1078  dyphi[1] = (-5 + 4.5f * t) * t * hy_1; \
1079  t--; \
1080  yphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
1081  dyphi[2] = (-5 - 4.5f * t) * t * hy_1; \
1082  t--; \
1083  yphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
1084  dyphi[3] = (1.5f * t + 2) * (2 + t) * hy_1; \
1085 
1086  t = rz_hz - (double) klo;
1087  zphi[0] = 0.5f * (1 - t) * (2 - t) * (2 - t); \
1088  dzphi[0] = (1.5f * t - 2) * (2 - t) * hz_1; \
1089  t--; \
1090  zphi[1] = (1 - t) * (1 + t - 1.5f * t * t); \
1091  dzphi[1] = (-5 + 4.5f * t) * t * hz_1; \
1092  t--; \
1093  zphi[2] = (1 + t) * (1 - t - 1.5f * t * t); \
1094  dzphi[2] = (-5 - 4.5f * t) * t * hz_1; \
1095  t--; \
1096  zphi[3] = 0.5f * (1 + t) * (2 + t) * (2 + t); \
1097  dzphi[3] = (1.5f * t + 2) * (2 + t) * hz_1; \
1098 
1099 #endif
1100 
1101  /* test to see if stencil is within edges of grid */
1102  iswithin = ( ia <= ilo && (ilo+(s_size-1)) <= ib &&
1103  ja <= jlo && (jlo+(s_size-1)) <= jb &&
1104  ka <= klo && (klo+(s_size-1)) <= kb );
1105 
1106  if ( iswithin ) { /* no wrapping needed */
1107 
1108  /* determine force on atom from cube of grid point potentials */
1109  fx = fy = fz = 0;
1110  for (k = 0; k < s_size; k++) {
1111  koff = (k + klo) * nj;
1112  for (j = 0; j < s_size; j++) {
1113  jkoff = (koff + (j + jlo)) * ni;
1114  cx = yphi[j] * zphi[k];
1115  cy = dyphi[j] * zphi[k];
1116  cz = yphi[j] * dzphi[k];
1117  for (i = 0; i < s_size; i++) {
1118  index = jkoff + (i + ilo);
1119  GRID_INDEX_CHECK(ehgrid, i+ilo, j+jlo, k+klo);
1120  ASSERT(GRID_INDEX(ehgrid, i+ilo, j+jlo, k+klo) == index);
1121  fx += eh[index] * dxphi[i] * cx;
1122  fy += eh[index] * xphi[i] * cy;
1123  fz += eh[index] * xphi[i] * cz;
1124  }
1125  }
1126  }
1127  } /* if */
1128 
1129  else { /* requires wrapping around grid */
1130  int ip, jp, kp;
1131 
1132  /* adjust ilo, jlo, klo so they are within grid indexing */
1133  if (ispx) {
1134  if (ilo < ia) do { ilo += ni; } while (ilo < ia);
1135  else if (ilo > ib) do { ilo -= ni; } while (ilo > ib);
1136  }
1137  else if (ilo < ia || (ilo+(s_size-1)) > ib) {
1138  return NL_MSM_ERROR_RANGE;
1139  }
1140 
1141  if (ispy) {
1142  if (jlo < ja) do { jlo += nj; } while (jlo < ja);
1143  else if (jlo > jb) do { jlo -= nj; } while (jlo > jb);
1144  }
1145  else if (jlo < ja || (jlo+(s_size-1)) > jb) {
1146  return NL_MSM_ERROR_RANGE;
1147  }
1148 
1149  if (ispz) {
1150  if (klo < ka) do { klo += nk; } while (klo < ka);
1151  else if (klo > kb) do { klo -= nk; } while (klo > kb);
1152  }
1153  else if (klo < ka || (klo+(s_size-1)) > kb) {
1154  return NL_MSM_ERROR_RANGE;
1155  }
1156 
1157  /* determine force on atom from cube of grid point potentials, wrapping */
1158  fx = fy = fz = 0;
1159  for (k = 0, kp = klo; k < s_size; k++, kp++) {
1160  if (kp > kb) kp = ka; /* wrap stencil around grid */
1161  koff = kp * nj;
1162  for (j = 0, jp = jlo; j < s_size; j++, jp++) {
1163  if (jp > jb) jp = ja; /* wrap stencil around grid */
1164  jkoff = (koff + jp) * ni;
1165  cx = yphi[j] * zphi[k];
1166  cy = dyphi[j] * zphi[k];
1167  cz = yphi[j] * dzphi[k];
1168  for (i = 0, ip = ilo; i < s_size; i++, ip++) {
1169  if (ip > ib) ip = ia; /* wrap stencil around grid */
1170  index = jkoff + ip;
1171  GRID_INDEX_CHECK(ehgrid, ip, jp, kp);
1172  ASSERT(GRID_INDEX(ehgrid, ip, jp, kp) == index);
1173  fx += eh[index] * dxphi[i] * cx;
1174  fy += eh[index] * xphi[i] * cy;
1175  fz += eh[index] * xphi[i] * cz;
1176  }
1177  }
1178  }
1179  } /* else */
1180 
1181  /* update force */
1182  f[3*n ] -= q * fx;
1183  f[3*n+1] -= q * fy;
1184  f[3*n+2] -= q * fz;
1185 
1186  } /* end loop over atoms */
1187 
1188  /* compute potential, subtract self potential */
1189  u = 0;
1190  if (1) {
1191  for (n = 0; n < natoms; n++) {
1192  float q = atom[4*n + 3];
1193  u += q * q;
1194  }
1195  u *= -msm->gzero_f;
1196  }
1197  for (index = 0; index < nn; index++) {
1198  u += qbuf[index] * ebuf[index];
1199  }
1200  msm->uelec += 0.5 * u;
1201 
1202  return NL_MSM_SUCCESS;
1203 } /* interpolation() */
int msmflags
Definition: msm_defn.h:590
static const int PolyDegree[NL_MSM_APPROX_END]
int approx
Definition: msm_defn.h:642
#define GRID_INDEX_CHECK(a, _i, _j, _k)
Definition: msm_defn.h:114
const float * atom_f
Definition: msm_defn.h:572
int numatoms
Definition: msm_defn.h:599
float gz_f
Definition: msm_defn.h:648
float hy_f
Definition: msm_defn.h:637
float hx_f
Definition: msm_defn.h:637
double uelec
Definition: msm_defn.h:566
#define D_STENCIL_1D(_dphi, _phi, _h_1, _delta, _approx)
#define GRID_INDEX(_p, _i, _j, _k)
Definition: msm_defn.h:66
float * felec_f
Definition: msm_defn.h:571
float hz_f
Definition: msm_defn.h:637
#define ASSERT(E)
NL_Msmgrid_float * eh_f
Definition: msm_defn.h:671
float gx_f
Definition: msm_defn.h:648
NL_Msmgrid_float * qh_f
Definition: msm_defn.h:670
float gy_f
Definition: msm_defn.h:648
float gzero_f
Definition: msm_defn.h:651
int NL_msm_compute_long_range_sprec ( NL_Msm msm)

Definition at line 52 of file msm_longrng_sprec.c.

References anterpolation(), gridcutoff(), interpolation(), NL_Msm_t::msmflags, NL_MSM_COMPUTE_CUDA_FALL_BACK, NL_MSM_COMPUTE_CUDA_GRID_CUTOFF, NL_MSM_COMPUTE_NONFACTORED, NL_MSM_ERROR_SUPPORT, NL_MSM_SUCCESS, NL_Msm_t::nlevels, prolongation(), prolongation_factored(), NL_Msm_t::report_timings, restriction(), restriction_factored(), NL_Msm_t::timer_longrng, wkf_timer_start(), wkf_timer_stop(), and wkf_timer_time().

Referenced by NL_msm_compute_force_sprec().

52  {
53  double time_delta;
54  int rc = 0;
55  int level;
56  int nlevels = msm->nlevels;
57  int use_cuda_gridcutoff = (msm->msmflags & NL_MSM_COMPUTE_CUDA_GRID_CUTOFF);
58  int fallback_cpu = (msm->msmflags & NL_MSM_COMPUTE_CUDA_FALL_BACK);
59  int use_nonfactored = (msm->msmflags & NL_MSM_COMPUTE_NONFACTORED);
60 
62  rc = anterpolation(msm);
63  if (rc) return rc;
65  time_delta = wkf_timer_time(msm->timer_longrng);
66  if (msm->report_timings) {
67  printf("MSM anterpolation: %6.3f sec\n", time_delta);
68  }
69 
71  if (use_nonfactored) {
72  for (level = 0; level < nlevels - 1; level++) {
73  rc = restriction(msm, level);
74  if (rc) return rc;
75  }
76  }
77  else {
78  for (level = 0; level < nlevels - 1; level++) {
79  rc = restriction_factored(msm, level);
80  if (rc) return rc;
81  }
82  }
84  time_delta = wkf_timer_time(msm->timer_longrng);
85  if (msm->report_timings) {
86  printf("MSM restriction: %6.3f sec\n", time_delta);
87  }
88 
90  if (use_cuda_gridcutoff && nlevels > 1) {
91 #ifdef NL_USE_CUDA
92  if ((rc = NL_msm_cuda_condense_qgrids(msm)) != NL_MSM_SUCCESS ||
93  (rc = NL_msm_cuda_compute_gridcutoff(msm)) != NL_MSM_SUCCESS ||
94  (rc = NL_msm_cuda_expand_egrids(msm)) != NL_MSM_SUCCESS) {
95  if (fallback_cpu) {
96  printf("Falling back on CPU for grid cutoff computation\n");
97  use_cuda_gridcutoff = 0;
98  }
99  else return rc;
100  }
101 #else
102  if (fallback_cpu) {
103  printf("Falling back on CPU for grid cutoff computation\n");
104  use_cuda_gridcutoff = 0;
105  }
106  else return NL_MSM_ERROR_SUPPORT;
107 #endif
108  }
109 
110  if ( ! use_cuda_gridcutoff ) {
111  for (level = 0; level < nlevels - 1; level++) {
112  rc = gridcutoff(msm, level);
113  if (rc) return rc;
114  }
115  }
116 
117  rc = gridcutoff(msm, level); /* top level */
118  if (rc) return rc;
119 
121  time_delta = wkf_timer_time(msm->timer_longrng);
122  if (msm->report_timings) {
123  printf("MSM grid cutoff: %6.3f sec\n", time_delta);
124  }
125 
127  if (use_nonfactored) {
128  for (level--; level >= 0; level--) {
129  rc = prolongation(msm, level);
130  if (rc) return rc;
131  }
132  }
133  else {
134  for (level--; level >= 0; level--) {
135  rc = prolongation_factored(msm, level);
136  if (rc) return rc;
137  }
138  }
140  time_delta = wkf_timer_time(msm->timer_longrng);
141  if (msm->report_timings) {
142  printf("MSM prolongation: %6.3f sec\n", time_delta);
143  }
144 
146  rc = interpolation(msm);
147  if (rc) return rc;
149  time_delta = wkf_timer_time(msm->timer_longrng);
150  if (msm->report_timings) {
151  printf("MSM interpolation: %6.3f sec\n", time_delta);
152  }
153 
154  return 0;
155 }
double wkf_timer_time(wkf_timerhandle v)
Definition: wkfutils.c:134
static int interpolation(NL_Msm *)
int msmflags
Definition: msm_defn.h:590
static int prolongation_factored(NL_Msm *, int level)
static int gridcutoff(NL_Msm *, int level)
static int prolongation(NL_Msm *, int level)
void wkf_timer_stop(wkf_timerhandle v)
Definition: wkfutils.c:129
wkf_timerhandle timer_longrng
Definition: msm_defn.h:685
static int anterpolation(NL_Msm *)
static int restriction(NL_Msm *, int level)
void wkf_timer_start(wkf_timerhandle v)
Definition: wkfutils.c:124
int nlevels
Definition: msm_defn.h:645
static int restriction_factored(NL_Msm *, int level)
int report_timings
Definition: msm_defn.h:683
int prolongation ( NL_Msm msm,
int  level 
)
static

Definition at line 1595 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, NL_Msm_t::eh_f, IndexOffset, NL_Msm_t::msmflags, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, Nstencil, and PhiStencil.

Referenced by NL_msm_compute_long_range_sprec().

1595  {
1596  /* grids of charge, finer grid and coarser grid */
1597  NL_Msmgrid_float *ehgrid = &(msm->eh_f[level]);
1598  float *eh = ehgrid->data; /* index the offset data buffer */
1599  const NL_Msmgrid_float *e2hgrid = &(msm->eh_f[level+1]);
1600  const float *e2h_buffer = e2hgrid->buffer; /* index the raw buffer */
1601 
1602  /* finer grid index ranges and dimensions */
1603  const int ia1 = ehgrid->i0; /* lowest x-index */
1604  const int ib1 = ia1 + ehgrid->ni - 1; /* highest x-index */
1605  const int ja1 = ehgrid->j0; /* lowest y-index */
1606  const int jb1 = ja1 + ehgrid->nj - 1; /* highest y-index */
1607  const int ka1 = ehgrid->k0; /* lowest z-index */
1608  const int kb1 = ka1 + ehgrid->nk - 1; /* highest z-index */
1609  const int ni1 = ehgrid->ni; /* length along x-dim */
1610  const int nj1 = ehgrid->nj; /* length along y-dim */
1611  const int nk1 = ehgrid->nk; /* length along z-dim */
1612 
1613  /* coarser grid index ranges and dimensions */
1614  const int ia2 = e2hgrid->i0; /* lowest x-index */
1615  const int ib2 = ia2 + e2hgrid->ni - 1; /* highest x-index */
1616  const int ja2 = e2hgrid->j0; /* lowest y-index */
1617  const int jb2 = ja2 + e2hgrid->nj - 1; /* highest y-index */
1618  const int ka2 = e2hgrid->k0; /* lowest z-index */
1619  const int kb2 = ka2 + e2hgrid->nk - 1; /* highest z-index */
1620 
1621  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1622  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1623  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1624 
1625  const int nstencil = Nstencil[msm->approx];
1626  const int *offset = IndexOffset[msm->approx];
1627  const float *phi = PhiStencil[msm->approx];
1628 
1629  float cjk;
1630 
1631  int i1, j1, k1, index1, jk1off, k1off;
1632  int i2, j2, k2;
1633  int index2;
1634  int i, j, k;
1635 
1636  for (index2 = 0, k2 = ka2; k2 <= kb2; k2++) {
1637  k1 = k2 * 2;
1638  for (j2 = ja2; j2 <= jb2; j2++) {
1639  j1 = j2 * 2;
1640  for (i2 = ia2; i2 <= ib2; i2++, index2++) {
1641  i1 = i2 * 2;
1642 
1643  for (k = 0; k < nstencil; k++) {
1644  k1off = k1 + offset[k];
1645  if (k1off < ka1) {
1646  if (ispz) do { k1off += nk1; } while (k1off < ka1);
1647  else continue;
1648  }
1649  else if (k1off > kb1) {
1650  if (ispz) do { k1off -= nk1; } while (k1off > kb1);
1651  else break;
1652  }
1653  k1off *= nj1;
1654  for (j = 0; j < nstencil; j++) {
1655  jk1off = j1 + offset[j];
1656  if (jk1off < ja1) {
1657  if (ispy) do { jk1off += nj1; } while (jk1off < ja1);
1658  else continue;
1659  }
1660  else if (jk1off > jb1) {
1661  if (ispy) do { jk1off -= nj1; } while (jk1off > jb1);
1662  else break;
1663  }
1664  jk1off = (k1off + jk1off) * ni1;
1665  cjk = phi[j] * phi[k];
1666  for (i = 0; i < nstencil; i++) {
1667  index1 = i1 + offset[i];
1668  if (index1 < ia1) {
1669  if (ispx) do { index1 += ni1; } while (index1 < ia1);
1670  else continue;
1671  }
1672  else if (index1 > ib1) {
1673  if (ispx) do { index1 -= ni1; } while (index1 > ib1);
1674  else break;
1675  }
1676  index1 += jk1off;
1677  eh[index1] += e2h_buffer[index2] * phi[i] * cjk;
1678  }
1679  }
1680  } /* end loop over finer grid stencil */
1681 
1682  }
1683  }
1684  } /* end loop over each coarser grid point */
1685 
1686  return NL_MSM_SUCCESS;
1687 }
int msmflags
Definition: msm_defn.h:590
int approx
Definition: msm_defn.h:642
static const int Nstencil[NL_MSM_APPROX_END]
NL_Msmgrid_float * eh_f
Definition: msm_defn.h:671
static const float PhiStencil[NL_MSM_APPROX_END][MAX_NSTENCIL]
static const int IndexOffset[NL_MSM_APPROX_END][MAX_NSTENCIL]
int prolongation_factored ( NL_Msm msm,
int  level 
)
static

Definition at line 1353 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, NL_Msm_t::eh_f, NL_Msm_t::lyzd_f, NL_Msm_t::lzd_f, NL_Msm_t::msmflags, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, PhiStencilFactored, and PolyDegree.

Referenced by NL_msm_compute_long_range_sprec().

1353  {
1354  /* grids of potential, finer grid and coarser grid */
1355  NL_Msmgrid_float *ehgrid = &(msm->eh_f[level]);
1356  float *eh = ehgrid->data;
1357  const NL_Msmgrid_float *e2hgrid = &(msm->eh_f[level+1]);
1358  const float *e2h = e2hgrid->data;
1359 
1360  /* finer grid index ranges and dimensions */
1361  const int ia1 = ehgrid->i0; /* lowest x-index */
1362  const int ib1 = ia1 + ehgrid->ni - 1; /* highest x-index */
1363  const int ja1 = ehgrid->j0; /* lowest y-index */
1364  const int jb1 = ja1 + ehgrid->nj - 1; /* highest y-index */
1365  const int ka1 = ehgrid->k0; /* lowest z-index */
1366  const int kb1 = ka1 + ehgrid->nk - 1; /* highest z-index */
1367  const int ni1 = ehgrid->ni; /* length along x-dim */
1368  const int nj1 = ehgrid->nj; /* length along y-dim */
1369  const int nk1 = ehgrid->nk; /* length along z-dim */
1370 
1371  /* coarser grid index ranges and dimensions */
1372  const int ia2 = e2hgrid->i0; /* lowest x-index */
1373  const int ib2 = ia2 + e2hgrid->ni - 1; /* highest x-index */
1374  const int ja2 = e2hgrid->j0; /* lowest y-index */
1375  const int jb2 = ja2 + e2hgrid->nj - 1; /* highest y-index */
1376  const int ka2 = e2hgrid->k0; /* lowest z-index */
1377  const int kb2 = ka2 + e2hgrid->nk - 1; /* highest z-index */
1378  const int nrow_e2 = e2hgrid->ni;
1379  const int nstride_e2 = nrow_e2 * e2hgrid->nj;
1380 
1381  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1382  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1383  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1384 
1385  /* set buffer using indexing offset, so that indexing matches eh grid */
1386  float *ezd = msm->lzd_f + (-ka1);
1387  float *eyzd = msm->lyzd_f + (-ka1*nj1 + -ja1);
1388 
1389  const size_t size_lzd = nk1 * sizeof(float);
1390  const size_t size_lyzd = nj1 * nk1 * sizeof(float);
1391 
1392  const float *phi = NULL;
1393 
1394  int i2, j2, k2;
1395  int im, jm, km;
1396  int i, j, k;
1397  int index_plane_e2, index_e2;
1398  int index_jk, offset_k;
1399  int offset;
1400 
1401  const float *phi_factored = PhiStencilFactored[msm->approx];
1402  const int r_stencil = PolyDegree[msm->approx]; /* "radius" of stencil */
1403  const int diam_stencil = 2*r_stencil + 1; /* "diameter" of stencil */
1404 
1405  for (i2 = ia2; i2 <= ib2; i2++) {
1406  memset(msm->lyzd_f, 0, size_lyzd);
1407 
1408  for (j2 = ja2; j2 <= jb2; j2++) {
1409  memset(msm->lzd_f, 0, size_lzd);
1410  index_plane_e2 = j2 * nrow_e2 + i2;
1411 
1412  for (k2 = ka2; k2 <= kb2; k2++) {
1413  index_e2 = k2 * nstride_e2 + index_plane_e2;
1414  km = (k2 << 1); /* = 2*k2 */
1415  if ( ! ispz ) { /* nonperiodic */
1416  int lower = km - r_stencil;
1417  int upper = km + r_stencil;
1418  if (lower < ka1) lower = ka1;
1419  if (upper > kb1) upper = kb1; /* clip edges */
1420  phi = phi_factored + r_stencil; /* center of stencil */
1421  for (k = lower; k <= upper; k++) {
1422  ezd[k] += phi[k-km] * e2h[index_e2];
1423  }
1424  }
1425  else { /* periodic */
1426  int kp = km - r_stencil; /* index at left end of stencil */
1427  if (kp < ka1) do { kp += nk1; } while (kp < ka1); /* start inside */
1428  phi = phi_factored; /* left end of stencil */
1429  for (k = 0; k < diam_stencil; k++, kp++) {
1430  if (kp > kb1) kp = ka1; /* wrap around edge of grid */
1431  ezd[kp] += phi[k] * e2h[index_e2];
1432  }
1433  }
1434  } /* for k2 */
1435 
1436  for (k = ka1; k <= kb1; k++) {
1437  offset = k * nj1;
1438  jm = (j2 << 1); /* = 2*j2 */
1439  if ( ! ispy ) { /* nonperiodic */
1440  int lower = jm - r_stencil;
1441  int upper = jm + r_stencil;
1442  if (lower < ja1) lower = ja1;
1443  if (upper > jb1) upper = jb1; /* clip edges */
1444  phi = phi_factored + r_stencil; /* center of stencil */
1445  for (j = lower; j <= upper; j++) {
1446  eyzd[offset + j] += phi[j-jm] * ezd[k];
1447  }
1448  }
1449  else { /* periodic */
1450  int jp = jm - r_stencil; /* index at left end of stencil */
1451  if (jp < ja1) do { jp += nj1; } while (jp < ja1); /* start inside */
1452  phi = phi_factored; /* left end of stencil */
1453  for (j = 0; j < diam_stencil; j++, jp++) {
1454  if (jp > jb1) jp = ja1; /* wrap around edge of grid */
1455  eyzd[offset + jp] += phi[j] * ezd[k];
1456  }
1457  }
1458  } /* for k */
1459 
1460  } /* for j2 */
1461 
1462  for (k = ka1; k <= kb1; k++) {
1463  offset_k = k * nj1;
1464 
1465  for (j = ja1; j <= jb1; j++) {
1466  index_jk = offset_k + j;
1467  offset = index_jk * ni1;
1468  im = (i2 << 1); /* = 2*i2 */
1469  if ( ! ispx ) { /* nonperiodic */
1470  int lower = im - r_stencil;
1471  int upper = im + r_stencil;
1472  if (lower < ia1) lower = ia1;
1473  if (upper > ib1) upper = ib1; /* clip edges */
1474  phi = phi_factored + r_stencil; /* center of stencil */
1475  for (i = lower; i <= upper; i++) {
1476  eh[offset + i] += phi[i-im] * eyzd[index_jk];
1477  }
1478  }
1479  else { /* periodic */
1480  int ip = im - r_stencil; /* index at left end of stencil */
1481  if (ip < ia1) do { ip += ni1; } while (ip < ia1); /* start inside */
1482  phi = phi_factored; /* left end of stencil */
1483  for (i = 0; i < diam_stencil; i++, ip++) {
1484  if (ip > ib1) ip = ia1; /* wrap around edge of grid */
1485  eh[offset + ip] += phi[i] * eyzd[index_jk];
1486  }
1487  }
1488  } /* for j */
1489 
1490  } /* for k */
1491 
1492  } /* for i2 */
1493 
1494  return NL_MSM_SUCCESS;
1495 } /* prolongation_factored */
int msmflags
Definition: msm_defn.h:590
static const int PolyDegree[NL_MSM_APPROX_END]
int approx
Definition: msm_defn.h:642
static const float PhiStencilFactored[NL_MSM_APPROX_END][2 *MAX_POLY_DEGREE+1]
float * lyzd_f
Definition: msm_defn.h:681
NL_Msmgrid_float * eh_f
Definition: msm_defn.h:671
float * lzd_f
Definition: msm_defn.h:680
int restriction ( NL_Msm msm,
int  level 
)
static

Definition at line 1498 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, IndexOffset, NL_Msm_t::msmflags, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, Nstencil, PhiStencil, and NL_Msm_t::qh_f.

Referenced by NL_msm_compute_long_range_sprec().

1498  {
1499  /* grids of charge, finer grid and coarser grid */
1500  const NL_Msmgrid_float *qhgrid = &(msm->qh_f[level]);
1501  const float *qh = qhgrid->data; /* index the offset data buffer */
1502  NL_Msmgrid_float *q2hgrid = &(msm->qh_f[level+1]);
1503  float *q2h_buffer = q2hgrid->buffer; /* index the raw buffer */
1504 
1505  /* finer grid index ranges and dimensions */
1506  const int ia1 = qhgrid->i0; /* lowest x-index */
1507  const int ib1 = ia1 + qhgrid->ni - 1; /* highest x-index */
1508  const int ja1 = qhgrid->j0; /* lowest y-index */
1509  const int jb1 = ja1 + qhgrid->nj - 1; /* highest y-index */
1510  const int ka1 = qhgrid->k0; /* lowest z-index */
1511  const int kb1 = ka1 + qhgrid->nk - 1; /* highest z-index */
1512  const int ni1 = qhgrid->ni; /* length along x-dim */
1513  const int nj1 = qhgrid->nj; /* length along y-dim */
1514  const int nk1 = qhgrid->nk; /* length along z-dim */
1515 
1516  /* coarser grid index ranges and dimensions */
1517  const int ia2 = q2hgrid->i0; /* lowest x-index */
1518  const int ib2 = ia2 + q2hgrid->ni - 1; /* highest x-index */
1519  const int ja2 = q2hgrid->j0; /* lowest y-index */
1520  const int jb2 = ja2 + q2hgrid->nj - 1; /* highest y-index */
1521  const int ka2 = q2hgrid->k0; /* lowest z-index */
1522  const int kb2 = ka2 + q2hgrid->nk - 1; /* highest z-index */
1523 
1524  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1525  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1526  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1527 
1528  const int nstencil = Nstencil[msm->approx];
1529  const int *offset = IndexOffset[msm->approx];
1530  const float *phi = PhiStencil[msm->approx];
1531 
1532  float q2h_sum, cjk;
1533 
1534  int i1, j1, k1, index1, jk1off, k1off;
1535  int i2, j2, k2;
1536  int index2;
1537  int i, j, k;
1538 
1539  for (index2 = 0, k2 = ka2; k2 <= kb2; k2++) {
1540  k1 = k2 * 2;
1541  for (j2 = ja2; j2 <= jb2; j2++) {
1542  j1 = j2 * 2;
1543  for (i2 = ia2; i2 <= ib2; i2++, index2++) {
1544  i1 = i2 * 2;
1545 
1546  q2h_sum = 0;
1547  for (k = 0; k < nstencil; k++) {
1548  k1off = k1 + offset[k];
1549  if (k1off < ka1) {
1550  if (ispz) do { k1off += nk1; } while (k1off < ka1);
1551  else continue;
1552  }
1553  else if (k1off > kb1) {
1554  if (ispz) do { k1off -= nk1; } while (k1off > kb1);
1555  else break;
1556  }
1557  k1off *= nj1;
1558  for (j = 0; j < nstencil; j++) {
1559  jk1off = j1 + offset[j];
1560  if (jk1off < ja1) {
1561  if (ispy) do { jk1off += nj1; } while (jk1off < ja1);
1562  else continue;
1563  }
1564  else if (jk1off > jb1) {
1565  if (ispy) do { jk1off -= nj1; } while (jk1off > jb1);
1566  else break;
1567  }
1568  jk1off = (k1off + jk1off) * ni1;
1569  cjk = phi[j] * phi[k];
1570  for (i = 0; i < nstencil; i++) {
1571  index1 = i1 + offset[i];
1572  if (index1 < ia1) {
1573  if (ispx) do { index1 += ni1; } while (index1 < ia1);
1574  else continue;
1575  }
1576  else if (index1 > ib1) {
1577  if (ispx) do { index1 -= ni1; } while (index1 > ib1);
1578  else break;
1579  }
1580  index1 += jk1off;
1581  q2h_sum += qh[index1] * phi[i] * cjk;
1582  }
1583  }
1584  } /* end loop over finer grid stencil */
1585 
1586  q2h_buffer[index2] = q2h_sum; /* store charge to coarser grid */
1587  }
1588  }
1589  } /* end loop over each coarser grid point */
1590 
1591  return NL_MSM_SUCCESS;
1592 }
int msmflags
Definition: msm_defn.h:590
int approx
Definition: msm_defn.h:642
static const int Nstencil[NL_MSM_APPROX_END]
static const float PhiStencil[NL_MSM_APPROX_END][MAX_NSTENCIL]
static const int IndexOffset[NL_MSM_APPROX_END][MAX_NSTENCIL]
NL_Msmgrid_float * qh_f
Definition: msm_defn.h:670
int restriction_factored ( NL_Msm msm,
int  level 
)
static

Definition at line 1206 of file msm_longrng_sprec.c.

References NL_Msm_t::approx, NL_Msm_t::lyzd_f, NL_Msm_t::lzd_f, NL_Msm_t::msmflags, NL_MSM_PERIODIC_VEC1, NL_MSM_PERIODIC_VEC2, NL_MSM_PERIODIC_VEC3, NL_MSM_SUCCESS, PhiStencilFactored, PolyDegree, and NL_Msm_t::qh_f.

Referenced by NL_msm_compute_long_range_sprec().

1206  {
1207  /* grids of charge, finer grid and coarser grid */
1208  const NL_Msmgrid_float *qhgrid = &(msm->qh_f[level]);
1209  const float *qh = qhgrid->data;
1210  NL_Msmgrid_float *q2hgrid = &(msm->qh_f[level+1]);
1211  float *q2h = q2hgrid->data;
1212 
1213  /* finer grid index ranges and dimensions */
1214  const int ia1 = qhgrid->i0; /* lowest x-index */
1215  const int ib1 = ia1 + qhgrid->ni - 1; /* highest x-index */
1216  const int ja1 = qhgrid->j0; /* lowest y-index */
1217  const int jb1 = ja1 + qhgrid->nj - 1; /* highest y-index */
1218  const int ka1 = qhgrid->k0; /* lowest z-index */
1219  const int kb1 = ka1 + qhgrid->nk - 1; /* highest z-index */
1220  const int ni1 = qhgrid->ni; /* length along x-dim */
1221  const int nj1 = qhgrid->nj; /* length along y-dim */
1222  const int nk1 = qhgrid->nk; /* length along z-dim */
1223 
1224  /* coarser grid index ranges and dimensions */
1225  const int ia2 = q2hgrid->i0; /* lowest x-index */
1226  const int ib2 = ia2 + q2hgrid->ni - 1; /* highest x-index */
1227  const int ja2 = q2hgrid->j0; /* lowest y-index */
1228  const int jb2 = ja2 + q2hgrid->nj - 1; /* highest y-index */
1229  const int ka2 = q2hgrid->k0; /* lowest z-index */
1230  const int kb2 = ka2 + q2hgrid->nk - 1; /* highest z-index */
1231  const int nrow_q2 = q2hgrid->ni;
1232  const int nstride_q2 = nrow_q2 * q2hgrid->nj;
1233 
1234  const int ispx = (msm->msmflags & NL_MSM_PERIODIC_VEC1);
1235  const int ispy = (msm->msmflags & NL_MSM_PERIODIC_VEC2);
1236  const int ispz = (msm->msmflags & NL_MSM_PERIODIC_VEC3);
1237 
1238  /* set buffer using indexing offset, so that indexing matches qh grid */
1239  float *qzd = msm->lzd_f + (-ka1);
1240  float *qyzd = msm->lyzd_f + (-ka1*nj1 + -ja1);
1241  float qsum;
1242 
1243  const float *phi = NULL;
1244 
1245  int i2, j2, k2;
1246  int im, jm, km;
1247  int i, j, k;
1248  int index_plane_q2, index_q2;
1249  int index_jk, offset_k;
1250  int offset;
1251 
1252  const float *phi_factored = PhiStencilFactored[msm->approx];
1253  const int r_stencil = PolyDegree[msm->approx]; /* "radius" of stencil */
1254  const int diam_stencil = 2*r_stencil + 1; /* "diameter" of stencil */
1255 
1256  for (i2 = ia2; i2 <= ib2; i2++) {
1257 
1258  for (k = ka1; k <= kb1; k++) {
1259  offset_k = k * nj1;
1260 
1261  for (j = ja1; j <= jb1; j++) {
1262  index_jk = offset_k + j;
1263  offset = index_jk * ni1;
1264  im = (i2 << 1); /* = 2*i2 */
1265  qsum = 0;
1266  if ( ! ispx ) { /* nonperiodic */
1267  int lower = im - r_stencil;
1268  int upper = im + r_stencil;
1269  if (lower < ia1) lower = ia1;
1270  if (upper > ib1) upper = ib1; /* clip edges */
1271  phi = phi_factored + r_stencil; /* center of stencil */
1272  for (i = lower; i <= upper; i++) {
1273  qsum += phi[i-im] * qh[offset + i];
1274  }
1275  }
1276  else { /* periodic */
1277  int ip = im - r_stencil; /* index at left end of stencil */
1278  if (ip < ia1) do { ip += ni1; } while (ip < ia1); /* start inside */
1279  phi = phi_factored; /* left end of stencil */
1280  for (i = 0; i < diam_stencil; i++, ip++) {
1281  if (ip > ib1) ip = ia1; /* wrap around edge of grid */
1282  qsum += phi[i] * qh[offset + ip];
1283  }
1284  }
1285  qyzd[index_jk] = qsum;
1286  } /* for j */
1287 
1288  } /* for k */
1289 
1290  for (j2 = ja2; j2 <= jb2; j2++) {
1291  index_plane_q2 = j2 * nrow_q2 + i2;
1292 
1293  for (k = ka1; k <= kb1; k++) {
1294  offset = k * nj1;
1295  jm = (j2 << 1); /* = 2*j2 */
1296  qsum = 0;
1297  if ( ! ispy ) { /* nonperiodic */
1298  int lower = jm - r_stencil;
1299  int upper = jm + r_stencil;
1300  if (lower < ja1) lower = ja1;
1301  if (upper > jb1) upper = jb1; /* clip edges */
1302  phi = phi_factored + r_stencil; /* center of stencil */
1303  for (j = lower; j <= upper; j++) {
1304  qsum += phi[j-jm] * qyzd[offset + j];
1305  }
1306  }
1307  else { /* periodic */
1308  int jp = jm - r_stencil; /* index at left end of stencil */
1309  if (jp < ja1) do { jp += nj1; } while (jp < ja1); /* start inside */
1310  phi = phi_factored; /* left end of stencil */
1311  for (j = 0; j < diam_stencil; j++, jp++) {
1312  if (jp > jb1) jp = ja1; /* wrap around edge of grid */
1313  qsum += phi[j] * qyzd[offset + jp];
1314  }
1315  }
1316  qzd[k] = qsum;
1317  } /* for k */
1318 
1319  for (k2 = ka2; k2 <= kb2; k2++) {
1320  index_q2 = k2 * nstride_q2 + index_plane_q2;
1321  km = (k2 << 1); /* = 2*k2 */
1322  qsum = 0;
1323  if ( ! ispz ) { /* nonperiodic */
1324  int lower = km - r_stencil;
1325  int upper = km + r_stencil;
1326  if (lower < ka1) lower = ka1;
1327  if (upper > kb1) upper = kb1; /* clip edges */
1328  phi = phi_factored + r_stencil; /* center of stencil */
1329  for (k = lower; k <= upper; k++) {
1330  qsum += phi[k-km] * qzd[k];
1331  }
1332  }
1333  else { /* periodic */
1334  int kp = km - r_stencil; /* index at left end of stencil */
1335  if (kp < ka1) do { kp += nk1; } while (kp < ka1); /* start inside */
1336  phi = phi_factored; /* left end of stencil */
1337  for (k = 0; k < diam_stencil; k++, kp++) {
1338  if (kp > kb1) kp = ka1; /* wrap around edge of grid */
1339  qsum += phi[k] * qzd[kp];
1340  }
1341  }
1342  q2h[index_q2] = qsum;
1343  } /* for k2 */
1344 
1345  } /* for j2 */
1346 
1347  } /* for i2 */
1348 
1349  return NL_MSM_SUCCESS;
1350 } /* restriction_factored */
int msmflags
Definition: msm_defn.h:590
static const int PolyDegree[NL_MSM_APPROX_END]
int approx
Definition: msm_defn.h:642
static const float PhiStencilFactored[NL_MSM_APPROX_END][2 *MAX_POLY_DEGREE+1]
float * lyzd_f
Definition: msm_defn.h:681
NL_Msmgrid_float * qh_f
Definition: msm_defn.h:670
float * lzd_f
Definition: msm_defn.h:680

Variable Documentation

const int IndexOffset[NL_MSM_APPROX_END][MAX_NSTENCIL]
static
Initial value:
= {
{-3, -1, 0, 1, 3},
{-5, -3, -1, 0, 1, 3, 5},
{-5, -3, -1, 0, 1, 3, 5},
{-7, -5, -3, -1, 0, 1, 3, 5, 7},
{-7, -5, -3, -1, 0, 1, 3, 5, 7},
{-9, -7, -5, -3, -1, 0, 1, 3, 5, 7, 9},
{-9, -7, -5, -3, -1, 0, 1, 3, 5, 7, 9},
{-3, -2, -1, 0, 1, 2, 3},
}

Index offsets from the stencil-centered grid element, to get to the correct contributing grid element.

Definition at line 214 of file msm_longrng_sprec.c.

Referenced by prolongation(), and restriction().

const int Nstencil[NL_MSM_APPROX_END]
static
Initial value:
= {
5, 7, 7, 9, 9, 11, 11, 7
}

The stencil array lengths below.

Definition at line 208 of file msm_longrng_sprec.c.

Referenced by prolongation(), and restriction().

const float PhiStencil[NL_MSM_APPROX_END][MAX_NSTENCIL]
static

The grid transfer stencils for the non-factored restriction and prolongation procedures.

Definition at line 242 of file msm_longrng_sprec.c.

Referenced by prolongation(), and restriction().

const float PhiStencilFactored[NL_MSM_APPROX_END][2 *MAX_POLY_DEGREE+1]
static

The grid transfer stencils for factored restriction and prolongation. Must be listed in same order as APPROX enum from msm.h

Definition at line 170 of file msm_longrng_sprec.c.

Referenced by prolongation_factored(), and restriction_factored().

const int PolyDegree[NL_MSM_APPROX_END]
static
Initial value:
= {
3, 5, 5, 7, 7, 9, 9, 3,
}

Degree of polynomial basis function Phi. Must be listed in same order as APPROX enum from msm.h

Definition at line 163 of file msm_longrng_sprec.c.

Referenced by anterpolation(), interpolation(), prolongation_factored(), and restriction_factored().