|Tags||Date||Developer||Last 200 Commit Logs|
| ||2021-01-15||johns||Cranked version|
* configure 1.1519 (changed +1 -1)
| ||2021-01-05||johns||Added another cpuid project reference|
* src/WKFThreads.C 1.34 (changed +3 -2)
| ||2020-12-30||johns||Added runtime startup message to clearly indicate builds that have|
runtime CPU dispatch enabled vs. those that do not.
* src/VMDApp.C 1.580 (changed +7 -2)
| ||2020-12-27||johns||Added conditional compilation check for CPU info bailout.|
* src/WKFThreads.C 1.33 (changed +5 -2)
Corrected conditional compilation checks for non-x86 CPU runtime dispatch
* src/util_simd.C 1.30 (changed +2 -2)
| ||2020-12-24||johns||Added Python bindings to permit querying and setting color scale reversal|
and posterization parameters.
* src/py_color.C 1.38 (changed +39 -1)
* configure 1.1518 (changed +1 -1)
Revised color scale internals and the Color window GUI to implement
posterization of color scales down to a user-specified number of
individual color bands.
* src/py_color.C 1.37 (changed +16 -13)
* src/cmd_color.C 1.43 (changed +14 -4)
* src/VMDApp.h 1.257 (changed +5 -3)
* src/VMDApp.C 1.579 (changed +8 -6)
* src/Scene.h 1.69 (changed +7 -3)
* src/Scene.C 1.97 (changed +17 -6)
* src/ColorInfo.C 1.38 (changed +7 -3)
* src/ColorFltkMenu.h 1.25 (changed +2 -0)
* src/ColorFltkMenu.C 1.44 (changed +37 -5)
* src/CmdColor.h 1.35 (changed +3 -3)
* src/CmdColor.C 1.48 (changed +4 -3)
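The posterization described in this entry can be illustrated with a minimal sketch. It assumes the color scale is addressed by a normalized coordinate in [0,1] and that each band is represented by its center value; the function name `posterize_scale_coord` is hypothetical and is not taken from the VMD sources.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not VMD's implementation): quantize a normalized
// color scale coordinate t in [0,1] into one of nbands equal-width bands
// and return the coordinate of that band's center.
inline float posterize_scale_coord(float t, int nbands) {
  assert(nbands > 0);
  if (t < 0.0f) t = 0.0f;  // clamp out-of-range inputs
  if (t > 1.0f) t = 1.0f;
  int band = static_cast<int>(std::floor(t * nbands));
  if (band == nbands) band = nbands - 1;  // t == 1.0 belongs to the last band
  return (band + 0.5f) / nbands;
}
```

With four bands, every coordinate between 0.25 and 0.5 maps to the same band center, so a smooth gradient collapses into four flat color steps.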
| ||2020-12-23||johns||Added a live color swatch box to the redesigned color menu.|
Continued adjustment of widget sizes and placement.
* src/ColorFltkMenu.h 1.23 (changed +2 -0)
* src/ColorFltkMenu.C 1.42 (changed +14 -7)
Revised the color scale internals to facilitate optional color scale order
reversal for both built-in divergent midpoint/offset color scales as
well as tabulated color scales.
* src/py_color.C 1.36 (changed +14 -13)
* src/cmd_color.C 1.42 (changed +5 -3)
* src/VMDApp.h 1.256 (changed +11 -8)
* src/VMDApp.C 1.578 (changed +6 -6)
* src/Scene.h 1.68 (changed +7 -3)
* src/Scene.C 1.96 (changed +7 -3)
* src/ColorInfo.C 1.37 (changed +7 -2)
* src/ColorFltkMenu.h 1.24 (changed +2 -0)
* src/ColorFltkMenu.C 1.43 (changed +26 -15)
* src/CmdColor.h 1.34 (changed +4 -2)
* src/CmdColor.C 1.47 (changed +3 -3)
| ||2020-12-21||johns||Added an implementation of the "cividis" color scale which improves|
upon the popular "viridis" color scale for viewers that have
color vision deficiencies.
* src/Scene.C 1.95 (changed +5 -2)
* src/ColorScaleTables.h 1.2 (changed +265 -3)
Changed _mm_set_pd1() to _mm_set1_pd() which seems more portable
* src/util_simd.C 1.29 (changed +3 -2)
Changed default colorscale "Offset" from 0.10 to 0.06 after comparisons
with high spatial frequency test grating images for both original
parameter and the new value. The built-in VMD divergent color scales,
while still much better than the old Matlab "rainbow" scale,
leave much to be desired in terms of their ability to show
high frequency fine spatial details, luminance linearity, and other factors.
* src/Scene.C 1.93 (changed +4 -4)
Changed the conditional compilation tests for safe use of the
FLTK Fl_Color_Chooser class in place of the classic VMD color sliders.
* src/ColorFltkMenu.h 1.22 (changed +1 -1)
Corrected width of color category name itembrowser widget.
* src/ColorFltkMenu.C 1.37 (changed +1 -1)
* configure 1.1517 (changed +1 -1)
First steps in a major revision to the VMD color scale infrastructure.
The existing implementation has been extended with support for tabulated
color scales, new internal data structures and APIs to facilitate correct
GUI interaction for non-editable tabulated color scales.
Due to the significant increase in the total number of color scales
now available, the GUI has been revised to support forward
and reverse mapping of color scale menu names that include both scale
type categories and leaf node color scale names.
Redesigned the Color window layout to support a much larger color scale
test image, and added a high spatial frequency test grating image
based on the color scale test images developed by Peter Kovesi.
* src/VMDApp.h 1.255 (changed +2 -1)
* src/VMDApp.C 1.577 (changed +5 -1)
* src/Scene.h 1.67 (changed +37 -26)
* src/Scene.C 1.94 (changed +191 -11)
* src/ColorFltkMenu.h 1.20 (changed +21 -19)
* src/ColorFltkMenu.C 1.38 (changed +110 -31)
Freely licensed perceptually uniform color scales in tabulated form with
256 color entries per table. These are adapted from four of the popular
sequential color scales in Matplotlib, and a large selection of the
linear, cyclic, isoluminance, and rainbow color scales from the set of
CET perceptually uniform color scales published by Peter Kovesi.
* src/ColorScaleTables.h 1.1 (added +5774 -0)
Improved color scale plot title to "CIELAB L* perceptual lightness"
which is more informative and technically correct.
Completed the remaining math for RGB to CIELAB color conversion.
* src/ColorFltkMenu.C 1.40 (changed +7 -4)
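For reference, the lightness part of an RGB to CIELAB conversion can be sketched as follows. This assumes sRGB-encoded input components in [0,1] and a D65 reference white with Yn = 1; the log does not state which RGB space or white point VMD uses, and the function names here are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// sRGB-encoded component (0..1) to linear light, per the sRGB transfer curve.
inline double srgb_to_linear(double c) {
  assert(c >= 0.0 && c <= 1.0);
  return (c <= 0.04045) ? c / 12.92 : std::pow((c + 0.055) / 1.055, 2.4);
}

// CIELAB L* (0..100) from sRGB components, assuming D65 white (Yn = 1).
// Only the lightness channel is computed; a* and b* are omitted.
inline double srgb_to_cielab_lightness(double r, double g, double b) {
  // Relative luminance Y from linear RGB (Rec. 709 / sRGB primaries)
  const double y = 0.2126 * srgb_to_linear(r) +
                   0.7152 * srgb_to_linear(g) +
                   0.0722 * srgb_to_linear(b);
  const double eps   = 216.0 / 24389.0;  // (6/29)^3
  const double kappa = 24389.0 / 27.0;   // (29/3)^3
  const double f = (y > eps) ? std::cbrt(y) : (kappa * y + 16.0) / 116.0;
  return 116.0 * f - 16.0;
}
```

Plotting this value along a color scale is one way to check whether its perceived lightness rises evenly, which is what the L* plot title refers to.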
Mention color editing GUI revisions.
* README 1.412 (changed +1 -0)
Pulled in latest colvars module from the git master.
* src/colvarscript.C 1.24 (changed +25 -4)
* src/colvars_version.h 1.15 (changed +1 -1)
* src/colvarproxy_vmd.C 1.21 (changed +1 -1)
* src/colvarcomp_rotations.C 1.12 (changed +332 -0)
* src/colvarcomp.h 1.16 (changed +65 -0)
* src/colvaratoms.C 1.23 (changed +2 -6)
* src/colvar.h 1.18 (changed +3 -0)
* src/colvar.C 1.29 (changed +6 -1)
Shifted the L* 100 lightness marker down 10 pixels to match Fl_Chart layout
* src/ColorFltkMenu.C 1.41 (changed +1 -1)
Significantly redesigned the Color window layout, resizing
the color definitions and color scale tabs to occupy the full
window, and migrating the entirety of the color definition browsers
into the color definitions tab. This change to the window layout
better separates controls that are relevant for color definition from
those used for color scale selection and editing. The larger space
made available within the color scale tab enables the addition of
a plot of color scale CIELAB L* luminance just below the color scale
test grating image.
* src/ColorFltkMenu.h 1.21 (changed +2 -0)
* src/ColorFltkMenu.C 1.39 (changed +228 -35)
* README 1.411 (changed +1 -1)
| ||2020-12-19||johns||Major revision of the VMD color menu to optionally make use of the|
FLTK-provided Fl_Color_Chooser instead of the classic VMD color sliders.
By default, when compiling with FLTK versions >= 1.1.10 VMD will use
Fl_Color_Chooser, and will revert to the classic sliders otherwise.
The window size is significantly increased, to match the width of the
VMD main window, and with additional height to provide an easy-to-use
widget size for the FLTK color selector and associated controls.
The FLTK color control supports floating point RGB, Hex, integer,
and HSV color value ranges, with mouse-based color component scrolling,
so all of the original features have been preserved while adding
* src/ColorFltkMenu.h 1.19 (changed +25 -2)
* src/ColorFltkMenu.C 1.36 (changed +92 -14)
| ||2020-12-18||johns||Added VMD icon to VS2017 project, and Win64 builds.|
* msvc/vs2017/vmd.vcxproj.filters 1.8 (changed +13 -0)
* msvc/vs2017/vmd.vcxproj 1.12 (changed +7 -0)
* msvc/vs2017/resource.h 1.1 (added +21 -0)
* msvc/vs2017/icon1.ico 1.1 (added binary)
* msvc/vs2017/Resource.rc 1.1 (added binary)
Added built-in OptiX and OSPRay ray tracing engines to the Win64 builds.
* msvc/vs2017/vmd.vcxproj.filters 1.7 (changed +22 -1)
* msvc/vs2017/vmd.vcxproj 1.11 (changed +20 -8)
Added x64 installation script based on V5 of VMD 1.9.4a50 installer build.
The installer scripts are re-generated using the "HM NIS Edit" 2.x,
followed by hand-patching the installer code to add/delete VMD registry
keys and deal with x64-specific installer steps and Administrative
privilege level setting. Ideally some of this would be done with macros
or some sort of script template, but these steps have to be inserted
deep into installer subsections making it an annoying manual editing
process at present. This script can be used as a point of reference to
recover the manual hand editing by carefully diffing a freshly generated
installer script with the previous version. It may eventually be possible
to automate this completely by writing an installer generation script or
by devising patches that apply relative to subsection starting points
that the "patch" utility has a hope of recognizing automatically.
* msvc/inst-nsis/vmd-win64-nsis.nsi 1.1 (added +7655 -0)
| ||2020-12-15||johns||Prevent glwin window destruction from killing the parent app|
in Win32/Win64 builds.
* src/glwin.c 1.32 (changed +28 -15)
| ||2020-12-14||johns||Eliminated startup test messages for WIN64 builds|
* src/win32vmdstart.c 1.52 (changed +2 -2)
MacOS 10.15 and 11.0 builds require Tcl/Tk 8.6 or later
* configure 1.1516 (changed +9 -5)
Updated WIN32 and WIN64 registry query code and software keys to add
support for 64-bit builds.
* src/win32vmdstart.c 1.51 (changed +5 -13)
Updated version dependency comments for FLTK builds for MacOS X
10.15 (Catalina) and 11.0 (Big Sur).
* configure 1.1515 (changed +6 -7)
| ||2020-12-13||johns||Added MacOS X ARM64 targets|
* Makefile 1.127 (changed +11 -1)
Added build configuration for MACOSXARM64
* configure 1.1514 (changed +115 -9)
Added conditional compilation tests for ARM64 MacOS X builds
checking for the compile-time macro ARCH_MACOSXARM64.
* src/vmdsock.c 1.31 (changed +2 -2)
* src/macosxvmdstart.C 1.29 (changed +13 -7)
* src/VMDApp.C 1.576 (changed +3 -3)
* src/Stride.C 1.48 (changed +2 -2)
* src/Spaceball.C 1.64 (changed +3 -1)
* src/QuickSurf.C 1.131 (changed +2 -2)
* src/MeasureVolInterior.C 1.17 (changed +3 -3)
Updated OptiX renderer to add support for ray statistics reporting for
interactive display runs.
* src/OptiXRenderer.C 1.378 (changed +12 -9)
| ||2020-12-12||johns||Changed the OptiX renderer's ray statistics buffer allocation code to specify|
RT_BUFFER_OUTPUT instead of RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL,
since we never write to these buffers on the host side.
* src/OptiXRenderer.C 1.376 (changed +14 -12)
Implemented slight optimizations for OptiX ray statistics gathering among
the various primary ray generation and shading kernels.
* src/OptiXShaders.cu 1.175 (changed +22 -22)
Increased the default threshold for forcing VMD to perform
OptiX renderings in multiple accumulation buffer passes by a factor
of 4x, and permit user override of the default threshold and behavior.
The new code also uses the total number of primary aa samples
and AO shadow feeler rays rather than only the number of primary ray
aa samples, to provide better performance scaling on RTX
hardware-accelerated GPUs going forward. Increasing the number of
rays per launch significantly improves VMD's utilization of the RT cores on
the latest hardware. With too small a ray batch size per pass, VMD
can't exploit the full hardware performance on the latest RTX cards.
* src/OptiXRenderer.C 1.377 (changed +8 -2)
Revised the low level VMDDisplayList clipping plane methods and
higher-level Displayable clipping plane methods to eliminate
Displayable methods from triggering _needUpdate scene regen/redraw
updates unless: the clipping plane is being changed; or
one of the clipping plane properties is changed and the
active clipping plane mode is currently active (non-zero).
* src/VMDDisplayList.h 1.46 (changed +3 -2)
* src/VMDDisplayList.C 1.42 (changed +6 -1)
* src/Displayable.h 1.91 (changed +44 -1)
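The update-suppression rule in this entry can be sketched roughly as below. The class and member names (`DisplayableSketch`, `set_clip_mode`, `set_clip_color`, `need_update`) are hypothetical stand-ins, and the sketch reads "the clipping plane is being changed" as a change of the clipping mode, which is an assumption.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a displayable with one clipping plane.
class DisplayableSketch {
public:
  bool need_update = false;  // stands in for the scene regen/redraw flag

  // Changing the clipping mode always requires a redraw.
  void set_clip_mode(int mode) {
    if (mode == clip_mode) return;  // no actual change, no redraw
    clip_mode = mode;
    need_update = true;
  }

  // Changing a plane property only matters while clipping is active.
  void set_clip_color(const float *rgb) {
    assert(rgb != nullptr);
    bool changed = false;
    for (int i = 0; i < 3; i++) {
      if (clip_color[i] != rgb[i]) { clip_color[i] = rgb[i]; changed = true; }
    }
    if (changed && clip_mode != 0) need_update = true;
  }

private:
  int clip_mode = 0;  // 0 means the clipping plane is inactive
  float clip_color[3] = {0.5f, 0.5f, 0.5f};
};
```

Editing the color of an inactive plane stores the new value but leaves the redraw flag untouched, which avoids needless scene regeneration.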
| ||2020-12-09||johns||Added placeholder clipping group implementation.|
* src/OSPRayDisplayDevice.h 1.18 (changed +22 -5)
* src/OSPRayDisplayDevice.C 1.17 (changed +64 -1)
* src/OSPRay2DisplayDevice.h 1.3 (changed +23 -6)
* src/OSPRay2DisplayDevice.C 1.3 (changed +63 -1)
| ||2020-11-30||johns||Corrected behavior of 'radius' flag per Barry's testing.|
* src/TclGraphics.C 1.60 (changed +6 -7)
| ||2020-11-25||johns||Windows platforms return raw key state without processing key modifiers,|
so in order to provide the same behavior as X11, we process key modifiers
ourselves w/ toupper()/tolower() calls, etc.
* src/glwin.c 1.31 (changed +10 -2)
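A rough sketch of the kind of modifier handling described here, assuming a raw alphabetic key code plus a bitmask of modifier states. The `KeyMods` flags and `apply_key_modifiers` are hypothetical names for illustration and do not reflect the actual glwin code.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical modifier bitmask for this sketch only.
enum KeyMods { MOD_NONE = 0, MOD_SHIFT = 1, MOD_CAPSLOCK = 2 };

// Convert a raw key code into the character X11 would have reported,
// applying Shift and Caps Lock to alphabetic keys ourselves.
inline int apply_key_modifiers(int rawkey, int mods) {
  assert((mods & ~(MOD_SHIFT | MOD_CAPSLOCK)) == 0);  // only two modifiers handled
  if (rawkey < 0 || rawkey > 127 || !std::isalpha(rawkey))
    return rawkey;  // non-alphabetic keys pass through unchanged here
  const bool shift = (mods & MOD_SHIFT) != 0;
  const bool caps  = (mods & MOD_CAPSLOCK) != 0;
  // Shift and Caps Lock cancel each other out when both are active.
  return (shift != caps) ? std::toupper(rawkey) : std::tolower(rawkey);
}
```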
| ||2020-11-18||johns||Added conditional definition of NOMINMAX macro when compiling the|
OptiXRenderer code on Windows platform, as required by OptiX-internal headers.
* src/OptiXRenderer.h 1.126 (changed +8 -3)
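The general pattern looks roughly like this. The guard shown is a generic sketch rather than the exact text added to OptiXRenderer.h, and `clamp_index` is a hypothetical helper that only demonstrates that `std::min` and `std::max` stay usable once the Windows `min`/`max` macros are suppressed.

```cpp
// Define NOMINMAX before any Windows header is included, so that
// <windows.h> does not define min() and max() as preprocessor macros.
#if defined(_WIN32)
#if !defined(NOMINMAX)
#define NOMINMAX 1
#endif
#include <windows.h>
#endif

#include <algorithm>
#include <cassert>

// With the macros suppressed, the standard library functions compile cleanly.
inline int clamp_index(int i, int lo, int hi) {
  assert(lo <= hi);
  return std::min(std::max(i, lo), hi);
}
```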
| ||2020-11-17||johns||Added a wrapper for OptiXRenderer::device_count() to limit overly|
broad inclusion of low level OptiX API headers by classes that needed to
make calls into OptiXRenderer. On the Windows platform in particular, there
are some thorny header file ordering and macro definition issues that
must be satisfied, and while that is easy to do in OptiXRenderer and
OptiXDisplayDevice, this rapidly gets out of hand if other classes
start including those headers as well.
* src/OptiXDisplayDevice.h 1.33 (changed +5 -1)
* src/OptiXDisplayDevice.C 1.83 (changed +8 -1)
Changed order of header inclusion to ensure that low-level OptiX
headers are included prior to system-provided headers, as the OptiX
headers incorporate special handling, e.g., of min/max macros and
related functions on the Windows platform, which is very sensitive to
* src/OptiXRenderer.C 1.375 (changed +7 -3)
* src/OptiXDisplayDevice.C 1.84 (changed +7 -3)
Eliminate direct calls to OptiXRenderer class, to limit the scope of
associated low-level OptiX header inclusion, which has particularly
detailed ordering and macro definition requirements for Windows builds.
* src/FileRenderList.C 1.108 (changed +3 -4)
Eliminated inclusion of OptiXRenderer definition since hardware
enumeration is now done as part of FileRenderList and support for
VCA rendering clusters has been removed previously.
* src/VMDApp.C 1.575 (changed +1 -4)
Use VMD_PI macro to simplify Windows builds.
* src/OptiXRenderer.C 1.374 (changed +7 -7)
| ||2020-11-16||johns||Added include of windows.h for interactive RT compilation on Windows platforms.|
* src/OptiXRenderer.C 1.373 (changed +5 -1)
* src/OSPRay2Renderer.C 1.29 (changed +5 -1)
* src/ANARIRenderer.C 1.12 (changed +5 -1)
Added missing include of windows.h when compiling OSPRayRenderer
on Windows platforms.
* src/OSPRayRenderer.C 1.88 (changed +6 -2)
Imported Tachyon glwin updates to permit compilation on win32/win64
* src/glwin.c 1.30 (changed +17 -9)
Use VMD_PI rather than M_PI to please MSVC.
* src/OSPRayRenderer.C 1.87 (changed +2 -2)
| ||2020-11-06||johns||Changed implementation of spheretube color handling for inheritance|
of bulk "draw color colorID" type coloring mode. The implementation
of colorID arrays needs rewriting so it will properly track updates
to color tables ex-post-facto, and that will also impact the
single-color color inheritance mode.
* src/TclGraphics.C 1.59 (changed +10 -8)
* src/MoleculeGraphics.h 1.57 (changed +2 -2)
* src/MoleculeGraphics.C 1.88 (changed +46 -56)
Changed rev to 50 skipping 49 so that special 49 builds don't get confused
with later revs.
* configure 1.1513 (changed +1 -1)
Corrected handling of single-radius spheretube combinations
* src/TclGraphics.C 1.58 (changed +2 -1)
* src/MoleculeGraphics.C 1.87 (changed +13 -7)
* configure 1.1512 (changed +1 -1)
Explicitly include CUDA runtime header with OptiX versions greater than 5.2,
since differences in CUDA runtime headers among versions lead to
cudaGetDeviceCount() being unprototyped in some cases but not others.
This ensures it will always be properly prototyped.
* src/OptiXRenderer.C 1.372 (changed +4 -1)
| ||2020-11-04||johns||Added tcl_get_intarray() to pull in a tcl list as a 1-D flat array of ints|
* src/TclVec.C 1.46 (changed +26 -1)
* src/TclCommands.h 1.59 (changed +4 -1)
Changed the intermediary OptiXDisplayDevice class to accumulate
individual spheres from FileRenderer::sphere() calls into local
buffers the same way that it already did for cylinders and particular
triangle geometry. When the primitive count exceeds a threshold,
the spheres are sent to OptiXRenderer as a sphere_array_color()
primitive, which greatly reduces overheads. By overriding the
FileRenderer::sphere() method, we eliminate sphere triangulation for
scenes that include handfuls of user-drawn spheres, and for spheres
drawn by plugins or the like, leading to higher quality and faster rendering.
* src/OptiXDisplayDevice.h 1.32 (changed +27 -2)
* src/OptiXDisplayDevice.C 1.82 (changed +59 -1)
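The batching idea can be sketched as follows. `RendererSketch` and `SphereBatcher` are hypothetical stand-ins; only the method name `sphere_array_color` comes from the entry above, and its signature here is assumed. The sketch flushes when the count reaches the threshold, which may differ from the exact comparison used in VMD.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the renderer; it only records what it receives.
struct RendererSketch {
  size_t batches = 0;
  size_t spheres = 0;
  void sphere_array_color(const std::vector<float> &xyzr,
                          const std::vector<float> &rgb) {
    assert(xyzr.size() / 4 == rgb.size() / 3);
    batches++;
    spheres += xyzr.size() / 4;
  }
};

// Accumulates individual sphere() calls and forwards them in batches.
class SphereBatcher {
public:
  SphereBatcher(RendererSketch &r, size_t threshold)
    : renderer(r), maxcount(threshold) {}

  void sphere(const float *xyz, float radius, const float *rgb) {
    xyzr.insert(xyzr.end(), xyz, xyz + 3);
    xyzr.push_back(radius);
    colors.insert(colors.end(), rgb, rgb + 3);
    if (xyzr.size() / 4 >= maxcount) flush();
  }

  // Must also be called at end of scene to send any remaining spheres.
  void flush() {
    if (xyzr.empty()) return;
    renderer.sphere_array_color(xyzr, colors);
    xyzr.clear();
    colors.clear();
  }

private:
  RendererSketch &renderer;
  size_t maxcount;
  std::vector<float> xyzr;    // x, y, z, radius per sphere
  std::vector<float> colors;  // r, g, b per sphere
};
```

Seven spheres with a threshold of three reach the renderer as three calls instead of seven, and none of them needs to be triangulated.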
Updated the "spheretubes" draw command to accept a list of colorIDs
rather than only fully-specified RGB colors. This makes it a more
directly usable replacement for existing sphere-at-a-time scripts
that follow a pattern of calling "draw color", then "draw sphere",
"draw cone", etc.
* src/TclGraphics.C 1.57 (changed +46 -3)
* src/MoleculeGraphics.h 1.56 (changed +3 -2)
* src/MoleculeGraphics.C 1.86 (changed +14 -4)
| ||2020-11-02||johns||Added additional DSSP src URLs|
* src/Stride.C 1.47 (changed +2 -1)
Added notes about AVX/AVX-512 clock rate reductions and high vector
registers causing false dependencies except when specifically cleared,
e.g., by calls to _mm256_zeroupper().
* src/WKFThreads.C 1.32 (changed +7 -2)
Added support for the graphics "info" subcommands for spheretube primitives
and eliminated various debugging console messages.
* src/TclGraphics.C 1.56 (changed +3 -1)
* src/MoleculeGraphics.h 1.55 (changed +18 -14)
* src/MoleculeGraphics.C 1.85 (changed +53 -3)
Added updated STRIDE, DSSP/xDSSP/HSSP URLs
* src/Stride.C 1.46 (changed +16 -1)
Synced up cpuid() refs from Tachyon
* src/WKFThreads.C 1.31 (changed +5 -2)
| ||2020-11-01||johns||Added AVX2 atom selection analysis loop|
* src/util_simd_AVX2.C 1.1 (added +238 -0)
Added AVX2-specific atom selection analysis routine.
* configure 1.1511 (changed +1 -0)
Added runtime CPU dispatch for analyze_selection_aligned_avx2()
* src/util_simd.C 1.26 (changed +14 -4)
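Runtime CPU dispatch generally follows the shape sketched below: query the hardware once, then pick a kernel through a function pointer. Everything here is illustrative. `CpuCapsSketch` is filled in by the caller rather than by a real CPUID query, and the "wide" kernel is plain C++ standing in for an AVX2 loop; neither reflects the actual signature of analyze_selection_aligned_avx2().

```cpp
#include <cassert>

// Hypothetical capability record; a real build would fill this in
// from CPUID or an OS API at startup.
struct CpuCapsSketch { bool has_avx2 = false; };

typedef int (*count_selected_fn)(const int *on, int n);

// Baseline kernel: count the nonzero selection flags one at a time.
static int count_selected_scalar(const int *on, int n) {
  assert(n >= 0);
  int count = 0;
  for (int i = 0; i < n; i++) count += (on[i] != 0);
  return count;
}

// Placeholder for a SIMD kernel: four flags per iteration, then a tail loop.
static int count_selected_wide(const int *on, int n) {
  assert(n >= 0);
  int count = 0, i = 0;
  for (; i + 4 <= n; i += 4)
    count += (on[i] != 0) + (on[i + 1] != 0) + (on[i + 2] != 0) + (on[i + 3] != 0);
  for (; i < n; i++) count += (on[i] != 0);
  return count;
}

// Select the kernel once, based on what the hardware reports.
inline count_selected_fn select_count_kernel(const CpuCapsSketch &caps) {
  return caps.has_avx2 ? count_selected_wide : count_selected_scalar;
}
```

A single binary built this way can run on older CPUs while still using the faster kernel where it is available.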
Changed the AVX atom selection loops to use the prior strategy for
finding vector-aligned starting/ending indices due to bugs that showed
up in testing with CPU dispatch enabled. Will revisit later.
* src/util_simd_AVX.C 1.7 (changed +43 -7)
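One common way to find vector-aligned starting and ending indices is sketched below. It is not necessarily the "prior strategy" this entry refers to; `find_aligned_range` and `AlignedRange` are hypothetical names, and the sketch assumes the array pointer is at least int-aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct AlignedRange { ptrdiff_t begin; ptrdiff_t end; };

// For an int array of length n and a vector width in bytes (a power of two),
// return [begin, end) such that &p[begin] is vector-aligned and the span
// holds a whole number of vectors. Elements outside the span are left for
// scalar head and tail loops.
inline AlignedRange find_aligned_range(const int *p, ptrdiff_t n, size_t vecbytes) {
  assert(vecbytes >= sizeof(int) && (vecbytes & (vecbytes - 1)) == 0);
  const ptrdiff_t lanes = static_cast<ptrdiff_t>(vecbytes / sizeof(int));
  const size_t misalign = reinterpret_cast<uintptr_t>(p) & (vecbytes - 1);
  ptrdiff_t begin = (misalign == 0)
      ? 0 : static_cast<ptrdiff_t>((vecbytes - misalign) / sizeof(int));
  if (begin > n) begin = n;  // array too short to reach an aligned address
  const ptrdiff_t end = begin + ((n - begin) / lanes) * lanes;
  return AlignedRange{begin, end};
}
```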
Chopped out conditional compilation of alternative AVX-specific SIMD
loops now that runtime CPU dispatch is beginning to be more extensively
* src/util_simd.C 1.27 (changed +3 -70)
Eliminated remnants of AVX2-specific code which is now in its own source file.
* src/util_simd_AVX.C 1.8 (changed +1 -32)
Implemented MoleculeGraphics::find_sizes() for the new "tube" command.
* src/MoleculeGraphics.C 1.81 (changed +12 -4)
Improved spheretube parameter checking
* src/TclGraphics.C 1.53 (changed +11 -16)
Near-complete implementation of "spheretube" drawing primitive.
Still needs to handle color-per-sphere mode in conjunction with
drawtubes being enabled, and command logging is as-yet unimplemented.
The new code does significantly more parameter checking now.
* src/TclGraphics.C 1.55 (changed +28 -3)
* src/MoleculeGraphics.C 1.84 (changed +19 -6)
Renamed "tube" drawing primitive to "spheretube" and added new flags
and parameters to allow the use of per-sphere user-specified RGB colors,
flag control over tube drawing, and either constant radius or
* src/TclGraphics.C 1.52 (changed +104 -5)
* src/MoleculeGraphics.h 1.54 (changed +6 -4)
* src/MoleculeGraphics.C 1.83 (changed +55 -37)
Renamed MoleculeGraphics::find_sizes() to MoleculeGraphics::find_bounds()
which is a more accurate and meaningful method name.
* src/MoleculeGraphics.h 1.53 (changed +4 -4)
* src/MoleculeGraphics.C 1.82 (changed +2 -2)
Updated conditional compilation tests and logic for runtime CPU dispatched
atom selection analysis code to ensure proper compilation on both
x86 and ARM64 platforms, since VMD now supports runtime dispatch on both.
* src/util_simd.C 1.28 (changed +6 -2)
Updated graphics command to "spheretube" due to its new capabilities
* src/TclGraphics.C 1.54 (changed +5 -5)
Updated the VS2017 VMD project to add util_simd_AVX2.C and nr_jacobi.C
* msvc/vs2017/vmd.vcxproj.filters 1.6 (changed +6 -0)
* msvc/vs2017/vmd.vcxproj 1.10 (changed +2 -0)
Updated the collective variables module to the latest git version, which
also bears the tag "vmd-1.9.4a49".
* src/nr_jacobi.h 1.1 (added +23 -0)
* src/nr_jacobi.C 1.1 (added +144 -0)
* src/colvarvalue.h 1.12 (changed +3 -3)
* src/colvarvalue.C 1.13 (changed +2 -1)
* src/colvartypes.h 1.12 (changed +6 -23)
* src/colvartypes.C 1.11 (changed +109 -157)
* src/colvarscript_commands_colvar.h 1.2 (changed +15 -0)
* src/colvarscript_commands.h 1.2 (changed +14 -0)
* src/colvars_version.h 1.14 (changed +1 -1)
* src/colvars_files.pl 1.5 (changed +4 -2)
* src/colvarproxy_volmaps.h 1.2 (changed +43 -3)
* src/colvarproxy_volmaps.C 1.2 (changed +44 -9)
* src/colvarproxy_vmd_version.h 1.11 (changed +1 -1)
* src/colvarproxy_vmd.h 1.16 (changed +24 -5)
* src/colvarproxy_vmd.C 1.20 (changed +154 -4)
* src/colvarproxy.h 1.20 (changed +19 -0)
* src/colvarproxy.C 1.10 (changed +6 -0)
* src/colvarmodule.h 1.29 (changed +3 -0)
* src/colvarmodule.C 1.27 (changed +53 -23)
* src/colvargrid.h 1.20 (changed +8 -7)
* src/colvargrid.C 1.11 (changed +11 -5)
* src/colvardeps.C 1.16 (changed +18 -16)
* src/colvarcomp_volmaps.C 1.2 (changed +87 -6)
* src/colvarcomp_gpath.C 1.2 (changed +2 -37)
* src/colvarcomp_distances.C 1.21 (changed +4 -4)
* src/colvarcomp_apath.C 1.2 (changed +22 -20)
* src/colvarcomp.h 1.15 (changed +13 -8)
* src/colvarcomp.C 1.18 (changed +2 -2)
* src/colvarbias_meta.C 1.19 (changed +7 -5)
* src/colvarbias_histogram.h 1.7 (changed +0 -1)
* src/colvarbias_histogram.C 1.12 (changed +3 -3)
* src/colvarbias_abf.h 1.10 (changed +6 -4)
* src/colvarbias_abf.C 1.22 (changed +22 -22)
* src/colvarbias.h 1.15 (changed +4 -0)
* src/colvarbias.C 1.19 (changed +11 -0)
* src/colvaratoms.C 1.22 (changed +3 -0)
* src/colvar_arithmeticpath.h 1.2 (changed +13 -1)
* src/colvar.h 1.17 (changed +26 -0)
* src/colvar.C 1.28 (changed +30 -10)
| ||2020-10-31||johns||Added new tcl_get_array() and tcl_get_vecarray() commands to ease|
parsing of large vertex arrays and arrays of per-primitive scalars
for use in implementing new graphics primitives.
* src/TclVec.C 1.45 (changed +56 -5)
* src/TclCommands.h 1.58 (changed +7 -1)
Added support for new "tube" graphics drawing primitive that renders a
series of spheres connected by cones, with arbitrary numbers of vertices,
per-vertex radii, and user specified polygonal representation resolution.
Still an early prototypical implementation.
* src/TclGraphics.C 1.51 (changed +34 -2)
Revised MoleculeGraphics to allow shapes to have an "extradata" field to
permit the implementation of new graphics "draw" commands that accept
arbitrarily large vertex arrays, thereby making most efficient use
of the underlying array-oriented DispCmd primitives, which minimizes
the number of VMDDisplayList linked list nodes, and similarly reduces
the total count of API calls all of the way through the rendering system
to the lowest level rendering layers. The "extradata" field enables
the most efficient storage and management of complex geometry, completely
eliminating internal fragmentation of memory that normally occurs with
the original primitive-at-a-time drawing approach.
Added tracking of the active graphics drawing colorID so that batched
geometry DispCmd calls that require rgb floating point buffers can
be used more easily.
* src/MoleculeGraphics.h 1.52 (changed +18 -6)
* src/MoleculeGraphics.C 1.80 (changed +93 -1)
Updated the cone display command API to take const vertex buffer
* src/DispCmds.h 1.119 (changed +3 -2)
* src/DispCmds.C 1.120 (changed +3 -2)
| ||2020-10-30||johns||Updated comments about the behavior of MoleculeGraphics::info_id()|
* src/MoleculeGraphics.h 1.51 (changed +5 -3)
| ||2020-10-29||johns||Added CUDA support and upgraded VS2017 to the latest VS2019|
platform toolset versions, etc.
* msvc/vs2019/vmd.vcxproj.filters 1.3 (changed +72 -1)
* msvc/vs2019/vmd.vcxproj 1.3 (changed +131 -14)
Added FastPBC to the windows builds, to match the current Unix builds.
* msvc/vs2017/vmd.vcxproj.filters 1.4 (changed +4 -1)
* msvc/vs2017/vmd.vcxproj 1.8 (changed +2 -1)
Added include of stddef.h for ptrdiff_t
* src/util_simd_SVE.C 1.5 (changed +2 -1)
Eliminated non-portable variable sized stack allocated memory buffer.
* src/GaussianBlur.C 1.27 (changed +6 -3)
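As an illustration of the portability issue, a buffer whose size is only known at run time cannot be declared as a stack array in standard C++, and MSVC rejects that form. A minimal sketch of one portable alternative is shown below, using `std::vector`; whether VMD uses `std::vector`, `new[]`, or `malloc()` for this is not stated in the log, and `make_gaussian_kernel` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Build a normalized 1-D Gaussian kernel whose length depends on sigma.
// A declaration like `float kernel[2 * radius + 1];` would be a variable
// length array, which is a compiler extension rather than standard C++.
inline std::vector<float> make_gaussian_kernel(float sigma) {
  assert(sigma > 0.0f);
  const int radius = static_cast<int>(std::ceil(3.0f * sigma));
  std::vector<float> kernel(2 * radius + 1);  // heap-allocated, sized at run time
  float sum = 0.0f;
  for (int i = -radius; i <= radius; i++) {
    const float w = std::exp(-(i * i) / (2.0f * sigma * sigma));
    kernel[i + radius] = w;
    sum += w;
  }
  for (float &w : kernel) w /= sum;  // normalize so the weights sum to 1
  return kernel;
}
```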
Eliminated timer code leftover from standalone PBC implementation
* src/CUDAFastPBC.cu 1.4 (changed +1 -22)
Enabled CUDA on all of the x64 build targets
* msvc/vs2017/vmd.vcxproj.filters 1.5 (changed +72 -1)
* msvc/vs2017/vmd.vcxproj 1.9 (changed +124 -7)
Protect x86-only runtime dispatch paths with appropriate
* src/QuickSurf.C 1.130 (changed +9 -1)
Updated VS2019 project from pre-CUDA VS2017 project to simplify upgrade.
* msvc/vs2019/vmd.vcxproj.user 1.2 (changed +13 -1)
* msvc/vs2019/vmd.vcxproj.filters 1.2 (changed +112 -1)
* msvc/vs2019/vmd.vcxproj 1.2 (changed +271 -23)
* msvc/vs2019/vmd.sln 1.2 (changed +10 -1)
| ||2020-10-28||johns||Added explicit call to _mm_castsi128_ps(mask) to please MSVS compilers.|
* src/util_simd.C 1.24 (changed +7 -7)
Added explicit type conversions from ptrdiff_t sizes to int to greatly
reduce MSVS compiler conversion warnings.
* src/SmallRingLinkages.h 1.10 (changed +4 -4)
* src/SmallRing.h 1.12 (changed +3 -3)
* src/Scene.h 1.66 (changed +2 -2)
* src/PickList.h 1.38 (changed +2 -2)
* src/P_UIVR.h 1.75 (changed +2 -2)
* src/MoleculeList.h 1.72 (changed +2 -2)
* src/MoleculeGraphics.h 1.50 (changed +2 -2)
* src/Molecule.h 1.67 (changed +3 -3)
* src/MeasureSymmetry.h 1.34 (changed +4 -4)
* src/GraphicsFltkReps.h 1.141 (changed +3 -3)
* src/Fragment.h 1.21 (changed +2 -2)
* src/DrawMolecule.h 1.88 (changed +4 -4)
* src/BaseMolecule.h 1.151 (changed +7 -7)
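The kind of explicit narrowing described here might look like the sketch below, shown with an added range check. The helper names are hypothetical, and the log does not say whether the VMD headers check the range or simply cast.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Narrow a ptrdiff_t to int explicitly, so the compiler no longer warns
// about an implicit 64-bit to 32-bit conversion.
inline int to_int_checked(ptrdiff_t n) {
  assert(n >= INT_MIN && n <= INT_MAX);  // debug-build guard against truncation
  return static_cast<int>(n);
}

// Example accessor that keeps a legacy int-returning interface.
inline int element_count(const std::vector<float> &v) {
  return to_int_checked(static_cast<ptrdiff_t>(v.size()));
}
```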
Added links to current ARM ACLE SVE documentation used to develop
the first variable vector length SVE kernels.
* src/util_simd_SVE.C 1.3 (changed +6 -1)
Added more SVE doc references.
* src/util_simd_SVE.C 1.4 (changed +5 -1)
Began implementing ARM64 SVE vectorized loops for high performance
* src/util_simd_SVE.C 1.2 (changed +93 -2)
Eliminate warning for ptrdiff_t to double conversion with explicit cast.
* src/AtomSel.C 1.179 (changed +2 -2)
Enabled runtime CPU dispatch for AVX, AVX2, AVX-512, and AVX-512ER
in the VS2017 build rules.
* msvc/vs2017/vmd.vcxproj.filters 1.3 (changed +10 -1)
* msvc/vs2017/vmd.vcxproj 1.6 (changed +6 -3)
Migrated the AVX-512 specific loops to a new runtime dispatch version of
the QuickSurf kernels and removed them from the statically-launched
* src/QuickSurf.C 1.128 (changed +10 -232)
Protect CPU capability data structure w/ conditional compilation tests
for runtime CPU dispatch.
* src/Orbital.C 1.157 (changed +3 -1)
Pulled in Tachyon update to correct detection of FMA3 via x86 CPUID
* src/WKFThreads.C 1.30 (changed +5 -2)
Updated MSVS build flags and directories for release builds
* msvc/vs2017/vmd.vcxproj.user 1.4 (changed +8 -0)
* msvc/vs2017/vmd.vcxproj 1.7 (changed +10 -2)
Updates for the SSE atom selection and statistics kernels to please current
versions of MSVS, which fail to define SSE macros at compile time.
* src/util_simd.C 1.25 (changed +10 -1)
Updates for the SSE molecular orbital kernels to please current versions
of MSVS, which fail to define SSE macros at compile time, and that
no longer support some of the oldest MMX intrinsics and _m64 types
when compiling in 64-bit mode.
* src/Orbital.C 1.158 (changed +7 -2)
Use __align() variable declaration attribute for portability to MSVS
* src/QuickSurf.C 1.129 (changed +2 -2)
| ||2020-10-27||johns||Added ARM SVE general vectorized kernels and helper routines to the build.|
* src/util_simd_SVE.C 1.1 (added +48 -0)
Added first ARM64 CPU feature detection console diagnostics to indicate
the availability of Neon and SVE vector instructions, among others.
* src/VMDApp.C 1.571 (changed +11 -1)
Added include of stddef.h for definition of ptrdiff_t
* src/utilities.h 1.119 (changed +2 -1)
Added initial platform- and OS-specific conditional compilation macros
for runtime CPU feature detection on ARM64 hardware targets.
The initial ARM64 implementation makes use of the Linux getauxval() API.
* src/WKFThreads.h 1.18 (changed +30 -16)
* src/WKFThreads.C 1.27 (changed +21 -2)
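A minimal sketch of feature detection through getauxval() on Linux/ARM64 is given below. The struct and function names are hypothetical and this is not the WKFThreads code. On other platforms the function simply reports that nothing was detected.

```cpp
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval(), AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_* bit definitions for ARM64
#endif

// Hypothetical capability record for this sketch.
struct Arm64CapsSketch {
  bool neon = false;  // Advanced SIMD
  bool sve  = false;  // Scalable Vector Extension
};

inline Arm64CapsSketch query_arm64_caps() {
  Arm64CapsSketch caps;
#if defined(__linux__) && defined(__aarch64__)
  const unsigned long hwcap = getauxval(AT_HWCAP);
#if defined(HWCAP_ASIMD)
  caps.neon = (hwcap & HWCAP_ASIMD) != 0;
#endif
#if defined(HWCAP_SVE)  // only present in sufficiently recent kernel headers
  caps.sve = (hwcap & HWCAP_SVE) != 0;
#endif
#endif
  return caps;
}
```

The compile-time guards mirror the platform- and OS-specific conditional compilation this entry mentions: the query itself only exists on Linux, so other targets need their own code path.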
Added prototypes for ARM SVE runtime dispatch SVE vector length query
* src/utilities.h 1.120 (changed +9 -1)
Added runtime dispatch reporting of ARM64 SVE hardware vector lengths
for 32-bit and 64-bit types.
* src/VMDApp.C 1.573 (changed +10 -1)
Fix a few typos in the ARM CPU dispatch rules
* configure 1.1510 (changed +15 -11)
Greatly simplified conditional compilation macros, include files,
and ifdefs for runtime CPU dispatch and statically-launched SIMD kernels.
* src/Orbital.C 1.156 (changed +30 -29)
Moved CPU hypervisor detection and reporting into x86 block since we don't
yet have the equivalent capability on ARM hardware.
* src/VMDApp.C 1.574 (changed +6 -6)
Pulled in corrections for ARM64 CPU feature detection from Tachyon.
* src/WKFThreads.C 1.28 (changed +6 -5)
Revised the configure script to support runtime CPU dispatch on
ARM64 SVE platforms. Replaced X86AVXDISPATCH with a generic
CPUDISPATCH option, and rewrote all of the build config macros so they
can work with both x86 and ARM hardware.
* configure 1.1509 (changed +53 -27)
Simplified conditional compilation macros for runtime CPU dispatch
kernel source files.
* src/Orbital_AVX512ER.C 1.3 (changed +6 -5)
* src/Orbital_AVX512.C 1.4 (changed +6 -5)
Updated CPU dispatch build configuration to add compile-time definition of
VMDCPUDISPATCH, to make conditional compilation on a wide variety of hardware
and compiler combinations substantially simpler.
* configure 1.1508 (changed +1 -1)
Updated CPU feature detection reporting code for ARM64 platforms, and pulled
in latest updates from Tachyon.
* src/WKFThreads.h 1.19 (changed +16 -12)
* src/WKFThreads.C 1.29 (changed +16 -3)
* src/VMDApp.C 1.572 (changed +28 -3)
| ||2020-10-26||johns||Eliminated statically-launched QuickSurf AVX2 kernels in favor of|
runtime CPU dispatch.
* src/QuickSurf.C 1.127 (changed +5 -270)
Removed the previous static-launch code paths for AVX-512F and AVX-512ER
molecular orbital kernels.
* src/Orbital.C 1.155 (changed +10 -509)
| ||2020-10-22||johns||Added conditional compilation for _WIN64 (and any other LLP64 platforms)|
to add support for size_t and ptrdiff_t Inform output, but avoid duplication
of long types on LP64 platforms (Unix/Linux) where ptrdiff_t and size_t are
defined as long types.
* src/Inform.h 1.34 (changed +11 -1)
* src/Inform.C 1.46 (changed +19 -1)
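The overload problem can be sketched with a small stand-in class. `MiniInform` is hypothetical and much simpler than the real Inform class. On LP64 platforms `size_t` and `ptrdiff_t` are the same types as `unsigned long` and `long`, so adding overloads for them there would redefine existing functions; on LLP64 (Win64) they are distinct 64-bit types and need overloads of their own.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for an Inform-style output class.
class MiniInform {
public:
  MiniInform &operator<<(int v)           { buf << v; return *this; }
  MiniInform &operator<<(unsigned int v)  { buf << v; return *this; }
  MiniInform &operator<<(long v)          { buf << v; return *this; }
  MiniInform &operator<<(unsigned long v) { buf << v; return *this; }
#if defined(_WIN64)
  // LLP64: long is 32-bit, so size_t and ptrdiff_t are separate types.
  // On LP64 these two lines would duplicate the long overloads above.
  MiniInform &operator<<(ptrdiff_t v)     { buf << v; return *this; }
  MiniInform &operator<<(size_t v)        { buf << v; return *this; }
#endif
  std::string str() const { return buf.str(); }

private:
  std::ostringstream buf;
};
```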
Pulled in CPU feature detection updates from Tachyon.
* src/WKFThreads.C 1.26 (changed +9 -5)
Replaced the use of long types in CPU QuickSurf algorithm with ptrdiff_t
and size_t for x64 64-bit Windows builds.
Updated SIMD routines conditional compilation macros and tests
for use with MSVS 2017.
* src/QuickSurf_AVX2.C 1.6 (changed +6 -6)
* src/QuickSurf.C 1.126 (changed +21 -21)
Replaced the use of long types with ptrdiff_t and size_t to get
correct behavior on LLP64 platforms such as Windows x64 64-bit builds.
* src/ResizeArray.h 1.57 (changed +24 -23)
Updated SIMD routines conditional compilation macros and tests
for use with MSVS 2017, and replaced the use of long types with
ptrdiff_t and size_t for x64 64-bit Windows builds.
* src/utilities.h 1.118 (changed +7 -6)
* src/util_simd_AVX.C 1.6 (changed +18 -13)
* src/util_simd.C 1.23 (changed +28 -27)
Updated VS2017 debugging targets
* msvc/vs2017/vmd.vcxproj.user 1.3 (changed +6 -2)
* msvc/vs2017/vmd.vcxproj 1.5 (changed +2 -6)
Updated all of the volumetric data representations and associated data
structures, replacing the use of long types with ptrdiff_t and size_t to get
correct behavior on LLP64 platforms such as Windows x64 64-bit builds.
* src/VolumetricData.h 1.57 (changed +12 -10)
* src/VolumetricData.C 1.70 (changed +63 -63)
* src/VolumeTexture.h 1.10 (changed +5 -3)
* src/VolumeTexture.C 1.29 (changed +27 -27)
* src/MolFilePlugin.C 1.198 (changed +9 -8)
* src/DrawMolItemVolume.C 1.181 (changed +14 -14)
* src/DispCmds.C 1.119 (changed +15 -15)
| ||2020-10-21||johns||Corrected a C++ism that got into the MSM code while eliminating|
visual studio warnings.
* src/msmpot_setup.c 1.7 (changed +2 -2)
Minor rewrite of default_mass() method for improved performance on
large structures, and to address JC Gumbart's patch to improve default
mass assignments for iron, fluorine, iodine, bromine, potassium, and calcium.
* src/BaseMolecule.C 1.277 (changed +62 -22)
Rewrote command line argument passing for Python 3.x, since it requires
explicit translation to wide characters. The new code processes
incoming argc/argv by converting each argument to null terminated
wchar_t strings using Py_DecodeLocale() and passing them into PySys_SetArgv().
Also added placeholder call to Py_SetProgramName(), however it is disabled
since it doesn't appear to be particularly beneficial or needed at this time.
* src/PythonTextInterp.C 1.76 (changed +15 -2)
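The argument conversion step can be sketched as below. To keep the example compilable without the Python headers, the standard `mbsrtowcs()` is used here as a stand-in for `Py_DecodeLocale()`, and the final hand-off to `PySys_SetArgv()` is omitted; `decode_arg()` and `decode_argv()` are hypothetical names, not the code in PythonTextInterp.C.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Stand-in for Py_DecodeLocale(): convert one locale-encoded argument
// to a wide string. Returns an empty string on undecodable input.
inline std::wstring decode_arg(const char *arg) {
  std::mbstate_t st = std::mbstate_t();
  const char *src = arg;
  std::size_t n = std::mbsrtowcs(nullptr, &src, 0, &st);  // query length
  if (n == static_cast<std::size_t>(-1))
    return std::wstring();

  std::vector<wchar_t> buf(n + 1);   // room for the null terminator
  st = std::mbstate_t();
  src = arg;
  std::mbsrtowcs(buf.data(), &src, n + 1, &st);
  return std::wstring(buf.data(), n);
}

// Convert every incoming argument; a real implementation would then
// pass the wide strings on to the interpreter.
inline std::vector<std::wstring> decode_argv(int argc, const char * const *argv) {
  std::vector<std::wstring> out;
  for (int i = 0; i < argc; i++)
    out.push_back(decode_arg(argv[i]));
  return out;
}
```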
Updated the unsigned integer pointer type conversion macro for 64-bit
* src/Timestep.C 1.74 (changed +6 -8)
| ||2020-10-17||johns||Updated VS2017 x64 project settings for the debug target|
* msvc/vs2017/vmd.vcxproj 1.4 (changed +29 -18)
| ||2020-10-16||johns||Corrected formatting of plugin loader warning messages.|
* src/PluginMgr.C 1.39 (changed +2 -2)
Updated conditional compilation checks for _WIN64 builds.
* src/win32vmdstart.c 1.50 (changed +27 -22)
| ||2020-10-15||johns||Added VS2017 IDE project files|
* msvc/vs2017/vmd.vcxproj.user 1.1 (added +9 -0)
* msvc/vs2017/vmd.vcxproj.filters 1.1 (added +1301 -0)
* msvc/vs2017/vmd.vcxproj 1.1 (added +652 -0)
* msvc/vs2017/vmd.sln 1.1 (added +33 -0)
Added VS2019 project files derived from VS2017 (not yet upgraded)
* msvc/vs2019/vmd.vcxproj.user 1.1 (added +9 -0)
* msvc/vs2019/vmd.vcxproj.filters 1.1 (added +1301 -0)
* msvc/vs2019/vmd.vcxproj 1.1 (added +652 -0)
* msvc/vs2019/vmd.sln 1.1 (added +33 -0)
Added and enabled collective variables in the VS2017 IDE builds
* msvc/vs2017/vmd.vcxproj.filters 1.2 (changed +100 -1)
* msvc/vs2017/vmd.vcxproj 1.2 (changed +36 -3)
Added explicit floating point typecasts to please MSVS
* src/DrawMolItem.C 1.371 (changed +5 -5)
Added explicit type conversion to please MSVS
* src/utilities.h 1.116 (changed +4 -4)
* src/VolumetricData.h 1.55 (changed +7 -7)
* src/Voltool.h 1.12 (changed +8 -8)
Added explicit typecasting of floating point parameters converted from
double-precision Tcl APIs to single-precision internal APIs.
* src/TclVoltool.C 1.10 (changed +18 -13)
* src/TclSegmentation.C 1.15 (changed +5 -3)
* src/TclMeasure.C 1.184 (changed +7 -7)
* src/TclMDFF.C 1.34 (changed +5 -5)
Added explicit typecasts to GLfloat to eliminate MSVS compiler warnings.
* src/Win32OpenGLDisplayDevice.C 1.128 (changed +4 -4)
* src/OpenGLDisplayDevice.C 1.214 (changed +4 -4)
Added explicit typecasts to please MSVS
* src/cmd_videostream.C 1.10 (changed +3 -3)
Added explicit typecasts to the GROUP_T template type to eliminate
* src/Watershed.C 1.38 (changed +3 -3)
* src/GaussianBlur.C 1.26 (changed +7 -7)
Improved consistency of floating point expressions and eliminated MSVS
warnings with explicit typecasts.
* src/utilities.C 1.176 (changed +2 -2)
* src/msmpot_setup.c 1.6 (changed +3 -3)
* src/VolumetricData.h 1.56 (changed +2 -2)
* src/VolumetricData.C 1.69 (changed +14 -12)
* src/Voltool.C 1.11 (changed +3 -3)
* src/DisplayDevice.C 1.149 (changed +5 -5)
Improved consistency of floating point expressions containing
mixed types with explicit type conversions. These changes address
a variety of type conversion warnings from MSVS.
* src/utilities.h 1.117 (changed +6 -1)
* src/utilities.C 1.175 (changed +5 -5)
* src/util_simd.C 1.22 (changed +2 -2)
* src/Watershed.C 1.37 (changed +3 -3)
* src/VolumetricData.C 1.68 (changed +10 -7)
* src/Voltool.C 1.10 (changed +5 -5)
* src/VideoStream.h 1.22 (changed +3 -3)
* src/TclVoltool.C 1.9 (changed +3 -3)
* src/TclMeasure.C 1.183 (changed +3 -3)
* src/TclMDFF.C 1.33 (changed +3 -3)
* src/TclGraphLayout.C 1.5 (changed +6 -6)
* src/Segmentation.C 1.29 (changed +3 -3)
* src/ScaleSpaceFilter.C 1.22 (changed +3 -3)
* src/OpenGLRenderer.C 1.474 (changed +3 -3)
* src/MeasureQCP.C 1.35 (changed +8 -8)
* src/GraphicsFltkReps.h 1.140 (changed +2 -2)
* src/GraphLayout.C 1.8 (changed +2 -2)
* src/BaseMolecule.C 1.276 (changed +2 -2)
Installer script src for the Nullsoft Scriptable Install System (NSIS)
* msvc/inst-nsis/vmd-win32-nsis.nsi 1.1 (added +5483 -0)
Revised the project name to "vmd" from "winvmd" to better match the
expectations of VS2017 and VS2019, so there aren't warnings about
the auto-generated output filename vs. the default macro-generated filenames.
Cleaned out now-unused libraries such as libGLU which has been superseded
by built-in code in VMD itself. Eliminated out-of-date compiler flags
that remained from initial project conversion process.
Added basic x64 targets and assumed dependency paths.
* msvc/vs2017/vmd.vcxproj.user 1.2 (changed +1 -1)
* msvc/vs2017/vmd.vcxproj 1.3 (changed +218 -22)
* msvc/vs2017/vmd.sln 1.2 (changed +10 -1)
| ||2020-10-13||johns||Added implementation of the new drawpixels_rgba4u() method used for|
video streaming etc.
* src/Win32OpenGLDisplayDevice.C 1.127 (changed +93 -0)
* configure 1.1507 (changed +1 -1)
* configure 1.1506 (changed +1 -1)
Updated date tag
* Announcement 1.70 (changed +1 -1)
| ||2020-10-12||johns||Added sincosf() and sincos() compatibility macros for windows compilers.|
* src/utilities.h 1.115 (changed +6 -1)
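One possible shape for such compatibility macros is sketched below. The `compat_` helper names are hypothetical and the `_MSC_VER` guard is an assumption; the actual macros in utilities.h may be written differently.

```cpp
#include <cmath>

// Hypothetical fallbacks that compute sine and cosine separately,
// for compilers whose math library lacks sincosf()/sincos().
inline void compat_sincosf(float x, float *s, float *c) {
  *s = std::sin(x);
  *c = std::cos(x);
}

inline void compat_sincos(double x, double *s, double *c) {
  *s = std::sin(x);
  *c = std::cos(x);
}

#if defined(_MSC_VER)
// Map the familiar names onto the fallbacks on Windows compilers only.
#define sincosf(x, s, c) compat_sincosf((x), (s), (c))
#define sincos(x, s, c)  compat_sincos((x), (s), (c))
#endif
```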
Extra language protection ifdefs
* src/win32vmdstart.h 1.14 (changed +6 -0)
Please current revs of MSVS
* src/win32vmdstart.h 1.13 (changed +3 -1)
Switch to use of VMD_PI instead of M_PI for windows compilers.
* src/GraphLayout.C 1.7 (changed +2 -2)
Updated Win32 display code with the updated framebuffer readback routines.
* src/Win32OpenGLDisplayDevice.C 1.126 (changed +21 -2)
| ||2020-10-01||johns||Added conditional compilation support for assignment of VMD-generated|
representation string tags to ANARI scene hierarchy objects via
"name" tags, as implemented by the current developmental USD back-end.
USD uses these name tags not only to name scene components, but also
to generate underlying pathnames in its scene directory structure,
so there are some noteworthy limitations on what name strings are
legal. In particular, a variety of characters aren't safe for use
in the name tags as they have special meaning in pathnames, e.g.,
"/", ":", and so on. Further, the USD library used by the back-end
has some name uniqueness expectations that may need to be met by appending
or incorporating integer counters as suffixes to ensure that multiple scene
objects are assigned globally unique names. Ultimately, there is likely
an opportunity here for VMD to do better by storing additional GUI-safe
and/or USD-safe strings to meet the restrictions in USD, and to cause
external tools that use such tags to display better in graphical interfaces.
* src/ANARIRenderer.h 1.7 (changed +5 -0)
* src/ANARIRenderer.C 1.11 (changed +47 -0)
* src/ANARIDisplayDevice.h 1.2 (changed +4 -1)
* src/ANARIDisplayDevice.C 1.2 (changed +6 -1)
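The two naming concerns raised in this entry, unsafe characters and uniqueness, can be sketched as follows. The set of characters treated as safe and the suffix format are assumptions made for illustration; `sanitize_name()` and `UniqueNamer` are hypothetical and do not reflect the actual ANARI or USD back-end rules.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: replace every character other than a letter,
// digit, or underscore (e.g. "/" and ":") with an underscore, and
// avoid a leading digit.
inline std::string sanitize_name(const std::string &tag) {
  std::string out;
  for (unsigned char ch : tag)
    out += (std::isalnum(ch) || ch == '_') ? static_cast<char>(ch) : '_';
  if (out.empty() || std::isdigit(static_cast<unsigned char>(out[0])))
    out.insert(0, "_");
  return out;
}

// Hypothetical helper: append a per-name integer counter so that
// repeated tags still yield distinct names.
class UniqueNamer {
public:
  std::string make(const std::string &tag) {
    std::string base = sanitize_name(tag);
    int n = counts[base]++;
    return base + "_" + std::to_string(n);
  }

private:
  std::map<std::string, int> counts;
};
```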
Allow VMD to set background color for the USD back-end.
* src/ANARIRenderer.C 1.10 (changed +7 -9)
Updated ANARI renderer with initial support and workarounds for the early
limitations of the ANARI back-end targeting Pixar's USD (universal scene
* src/ANARIRenderer.h 1.6 (changed +3 -2)
* src/ANARIRenderer.C 1.9 (changed +116 -64)
| ||2020-09-09||johns||Ensure release of the active device so that OSPRay 2.x doesn't crash during|
shutdown in the OpenVKL layer.
* src/OSPRay2Renderer.C 1.28 (changed +2 -1)
Updated to current ANARI API, eliminating the renderer parameter to
* src/ANARIRenderer.C 1.8 (changed +2 -10)
| ||2020-09-05||johns||Swapped out hard-coded sleep for environment variable based debugger attach|
* src/OSPRay2Renderer.C 1.27 (changed +7 -3)
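A minimal sketch of an environment-variable-controlled attach delay, as opposed to a hard-coded sleep. The function names and the variable name passed in by the caller are hypothetical; the variable actually used by the renderer is not stated in this log.

```cpp
#include <chrono>
#include <cstdlib>
#include <thread>

// Hypothetical helper: interpret the variable's value as a number of
// seconds. Unset, non-numeric, or non-positive values mean "no wait".
inline int parse_wait_seconds(const char *val) {
  if (val == nullptr) return 0;
  int secs = std::atoi(val);
  return (secs > 0) ? secs : 0;
}

// Pause only when the named environment variable requests it, giving
// a developer time to attach a debugger to the running process.
inline void wait_for_debugger(const char *envname) {
  int secs = parse_wait_seconds(std::getenv(envname));
  if (secs > 0)
    std::this_thread::sleep_for(std::chrono::seconds(secs));
}
```

With this arrangement normal runs pay no startup penalty, since the delay is zero whenever the variable is unset.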
| ||2020-08-16||johns||Added comments about the behavior of OSPRay 1.x ospSetData() since|
the OSPRay 2.x and ANARI approaches used shared data that the application
side must maintain for persistent access until scene/context destruction.
* src/OSPRayRenderer.C 1.86 (changed +6 -6)
Continued improvement of ANARI scene/context teardown logic.
* src/ANARIRenderer.h 1.5 (changed +1 -3)
* src/ANARIRenderer.C 1.7 (changed +61 -23)
Continued improvements to OSPRay 2.x scene/context teardown code and
* src/OSPRay2Renderer.h 1.9 (changed +1 -3)
* src/OSPRay2Renderer.C 1.26 (changed +53 -19)
| ||2020-08-15||johns||Added environment variable disable for OSPRay 1.x interactive progressive|
rendering for simplified testing.
* src/FileRenderList.C 1.107 (changed +6 -4)
Eliminated unnecessary code left from prior design work, tests, and debugging.
* src/OSPRayRenderer.C 1.84 (changed +2 -23)
Unified and streamlined all of the mesh methods that currently convert to
a triangle soup memory representation for OSPRay 1.x.
* src/OSPRayRenderer.h 1.34 (changed +5 -2)
* src/OSPRayRenderer.C 1.85 (changed +63 -208)
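For readers unfamiliar with the term, a "triangle soup" stores three full vertices per triangle with no index sharing. The conversion from an indexed mesh can be sketched as below; `to_triangle_soup()` is a hypothetical helper, not one of the OSPRayRenderer mesh methods.

```cpp
#include <cstddef>
#include <vector>

// Expand an indexed mesh into a triangle soup.
//   verts:  x,y,z coordinates, 3 floats per shared vertex
//   facets: vertex indices, 3 per triangle
// The result holds 9 floats per triangle.
inline std::vector<float> to_triangle_soup(const std::vector<float> &verts,
                                           const std::vector<int> &facets) {
  std::vector<float> soup;
  soup.reserve(facets.size() * 3);
  for (std::size_t i = 0; i < facets.size(); i++) {
    std::size_t v = static_cast<std::size_t>(facets[i]) * 3;
    soup.push_back(verts[v]);
    soup.push_back(verts[v + 1]);
    soup.push_back(verts[v + 2]);
  }
  return soup;
}
```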
| ||2020-08-14||johns||Back-ported improvements from the ANARI renderer.|
Eliminate missing ospCommit() and ospRelease() reference count
decrement operations. Fixed a bug in internal framebuffer
state change management.
* src/OSPRay2Renderer.C 1.24 (changed +57 -17)
Corrected OSPRay 1.x framebuffer state management.
* src/OSPRayRenderer.C 1.80 (changed +2 -1)
Corrected destructor loops for light deletion, added logging and error
callbacks available in later 1.x API revs, and avoided adding
a potentially empty (zero-length) sphere array object to the ospModel.
In combination with the final revs of OSPRay 1.x, this commit appears
to fix memory leaks in VMD's use of the OSPRay 1.x APIs, although
OSPRay itself leaks a handful of objects still.
* src/OSPRayRenderer.C 1.83 (changed +39 -16)
Eliminated some HMD-specific state variables unnecessarily
inherited from OptiX.
* src/OSPRayRenderer.h 1.31 (changed +1 -6)
* src/OSPRay2Renderer.h 1.7 (changed +1 -6)
* src/ANARIRenderer.h 1.3 (changed +1 -6)
Ongoing revisions to update the OSPRay 1.x code to follow current
conventions with better adherence to reference count management,
although issues still remain in tests with 1.8.5.
Renamed ort_xxx types to avoid namespace collisions with multiple
uses of ResizeArray that confused Clang's address sanitizer mode.
* src/OSPRayRenderer.h 1.32 (changed +13 -13)
* src/OSPRayRenderer.C 1.81 (changed +196 -63)
Prevent SymbolTableElement objects from leaking when a new atom selection
macro replaces an existing one with the same name, e.g. as when loading
a saved state file.
* src/SymbolTable.h 1.63 (changed +41 -6)
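The kind of leak described here, and its fix, can be sketched as follows. `Element` and `MacroTable` are hypothetical stand-ins for SymbolTableElement and the symbol table, using a `std::map` for brevity; the live-object counter exists only so the behavior can be checked.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for SymbolTableElement, with a live-object
// counter used only to demonstrate that nothing leaks.
struct Element {
  static int live;
  std::string text;
  explicit Element(const std::string &t) : text(t) { live++; }
  Element(const Element &) = delete;
  Element &operator=(const Element &) = delete;
  ~Element() { live--; }
};
int Element::live = 0;

class MacroTable {
public:
  MacroTable() = default;
  MacroTable(const MacroTable &) = delete;
  MacroTable &operator=(const MacroTable &) = delete;
  ~MacroTable() { for (auto &kv : table) delete kv.second; }

  // Add a macro, or replace an existing macro of the same name.
  void add(const std::string &name, const std::string &text) {
    Element *fresh = new Element(text);
    auto it = table.find(name);
    if (it != table.end()) {
      delete it->second;   // free the element being replaced
      it->second = fresh;
    } else {
      table[name] = fresh;
    }
  }

  std::size_t size() const { return table.size(); }

private:
  std::map<std::string, Element *> table;
};
```

Without the `delete` in the replacement branch, every redefinition of an existing macro name would orphan the previous element.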
Renamed renderer-internal ort_xxx types to avoid namespace collisions
with multiple uses of ResizeArray that confused Clang's address sanitizer.
* src/OSPRayRenderer.h 1.33 (changed +3 -3)
* src/OSPRayRenderer.C 1.82 (changed +26 -26)
* src/OSPRay2Renderer.h 1.8 (changed +18 -18)
* src/OSPRay2Renderer.C 1.25 (changed +42 -42)
* src/ANARIRenderer.h 1.4 (changed +18 -18)
* src/ANARIRenderer.C 1.6 (changed +42 -42)
Used the nascent trace capture and debugging framework to identify and
eliminate missing anariCommit() and anariRelease() reference count
decrement operations. Fixed a bug in internal framebuffer
state change management. Ran tests with no-op scenario (construct/destruct),
single frame rendering, and back-to-back frame rendering with a persistent
context across batch rendering operations. Still needs a bit more testing
with interactive cases, more back-ends, and ideally with the aid of
a code coverage tool to ensure all branches have been exercised.
* src/ANARIRenderer.C 1.5 (changed +57 -19)
| ||2020-08-11||johns||Added calls to _mm256_zeroupper() prior to returns to external OS|
ABI-compliant functions to prevent AVX-512 clock limits from remaining
active in subsequently executed scalar or SSE code, caused by false
dependence on upper vector register state. This is a slightly different
variant of the well-known problems with AVX-SSE instruction set encoding
and CPU state transition issues seen on prior hardware generations, but
in this case (on current Xeon and extreme desktop CPU series)
the main impact is a reduction in peak clock rate for as long as the
upper vector register state exists.
This is discussed at some length in various places online:
* src/Orbital_AVX512ER.C 1.2 (changed +12 -2)
* src/Orbital_AVX512.C 1.3 (changed +7 -1)
Added calls to _mm256_zeroupper() prior to returns to external OS
ABI-compliant functions to prevent x86 AVX-SSE instruction set encoding
(VEX vs. non-VEX) transition performance loss due to CPU state transition
penalties (pre-Skylake) or false dependence on upper register state
(Skylake and later CPUs). These vector zeroing calls are fast on
Xeon hardware, but were very poor performers on Xeon Phi hardware,
costing several tens of instructions. For VMD's use of object-file-scope
architecture target compilation, we don't have to worry about calls
within the same file scope since they will all use the same instruction
set encoding. Here, we only need to be sure to zero the upper AVX registers
prior to returning to functions compiled using non-AVX execution modes
(non-VEX instruction encoding, etc). By zeroing the registers prior to
return, we eliminate problems with false dependence on upper vector register
states causing performance loss in subsequently executed SSE or scalar code.
This is discussed at some length in the Intel Optimization Guide and online:
* src/util_simd_AVX.C 1.5 (changed +10 -1)
* src/QuickSurf_AVX2.C 1.5 (changed +5 -1)
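The pattern described in these two entries, zeroing the upper vector register halves just before returning to code compiled for another instruction set, might look like the sketch below. `sum_floats()` is a hypothetical kernel, not VMD code; the AVX path is guarded by `__AVX__` so the example still compiles, as a plain scalar loop, when AVX is not enabled.

```cpp
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical kernel: sum an array of floats.
inline float sum_floats(const float *v, std::size_t n) {
  float total = 0.0f;
  std::size_t i = 0;
#if defined(__AVX__)
  __m256 acc = _mm256_setzero_ps();
  for (; i + 8 <= n; i += 8)
    acc = _mm256_add_ps(acc, _mm256_loadu_ps(v + i));

  float lanes[8];
  _mm256_storeu_ps(lanes, acc);
  for (int k = 0; k < 8; k++)
    total += lanes[k];

  // Clear the upper halves of the vector registers before control
  // returns to callers that may run SSE or scalar code.
  _mm256_zeroupper();
#endif
  for (; i < n; i++)   // scalar tail, or the whole array without AVX
    total += v[i];
  return total;
}
```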
| ||2020-07-31||johns||Updated to latest ANARI master, w/ API callback changes, and added|
support for nascent debugging and tracing layer prototypes.
* src/ANARIRenderer.C 1.4 (changed +24 -5)
| ||2020-07-29||johns||Added FMA3 #ifdef|
* src/QuickSurf_AVX2.C 1.4 (changed +3 -3)
Added compilation flags for FMA3 instructions to AVX2 builds.
* configure 1.1504 (changed +1 -1)
Check for FMA3 instructions before dispatch of the AVX2+FMA QuickSurf kernels.
* src/QuickSurf.C 1.125 (changed +4 -2)
* configure 1.1505 (changed +1 -1)
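The dispatch check can be sketched as follows: a kernel that uses both AVX2 and FMA3 instructions should be selected only when both features are reported, since AVX2 support alone does not guarantee FMA3. The flag values, kernel functions, and `select_kernel()` are hypothetical and do not correspond to the actual identifiers in QuickSurf.C or WKFThreads.C.

```cpp
// Hypothetical CPU feature bits, as a runtime query might report them.
enum CpuFlags { CPU_SSE2 = 1, CPU_AVX2 = 2, CPU_FMA3 = 4 };

typedef int (*kernel_fn)();

// Hypothetical kernels; real ones would do the numerical work.
inline int kernel_fallback() { return 0; }
inline int kernel_avx2_fma() { return 1; }

// Choose the AVX2+FMA kernel only if every required feature is present.
inline kernel_fn select_kernel(unsigned flags) {
  const unsigned need = CPU_AVX2 | CPU_FMA3;
  if ((flags & need) == need)
    return kernel_avx2_fma;
  return kernel_fallback;
}
```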
| ||2020-07-28||johns||Added runtime dispatch variants of AVX512 MO kernels.|
* src/Orbital_AVX512ER.C 1.1 (added +258 -0)
* src/Orbital_AVX512.C 1.1 (added +315 -0)
Added support for runtime CPU dispatch of hand-vectorized MO kernels.
* src/Orbital.h 1.45 (changed +2 -2)
* src/Orbital.C 1.152 (changed +82 -4)
Added support for runtime dispatch of Intel x86 AVX-512 (subsets F,VL,BW,DQ,CD)
suited for descendants of Skylake Xeon, and AVX-512ER (subsets F,CD and ER)
suited for KNL Xeon Phi and any future CPUs supporting the
exponential/reciprocal instructions used therein.
* configure 1.1500 (changed +27 -12)
Completed AVX-512F version of the molecular orbital kernel.
On an Intel i7-9800X, runtime dispatch performance for AVX-512F
(with fully populated memory system) is roughly 3x faster than the
4-way vectorized SSE kernel.
* src/Orbital_AVX512.C 1.2 (changed +8 -7)
Corrected bytes_next_alignment() helper to use the alignment size parameter
rather than the hard-coded 32-byte AVX-specific alignment.
* src/util_simd_AVX.C 1.3 (changed +3 -3)
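A helper of this kind, parameterized on the alignment size rather than hard-coding 32 bytes, might be written as below. The name and signature are assumptions for illustration, not the actual bytes_next_alignment() in util_simd_AVX.C, and the alignment is assumed to be a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: number of bytes from ptr to the next address
// that is a multiple of alignsz (0 if ptr is already aligned).
// alignsz must be a power of two, e.g. 16 (SSE), 32 (AVX), 64 (AVX-512).
inline std::size_t bytes_to_next_alignment(const void *ptr, std::size_t alignsz) {
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
  std::size_t misalign = static_cast<std::size_t>(addr & (alignsz - 1));
  return (alignsz - misalign) & (alignsz - 1);
}
```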
Disable compilation of AVX2 specific routines for the time being.
* src/util_simd_AVX.C 1.4 (changed +4 -2)
Eliminated compiler warnings.
* src/Benchmark.C 1.13 (changed +2 -3)
Enabled runtime dispatch for molecular orbital kernels on AVX-512F CPUs.
* src/Orbital.C 1.154 (changed +3 -5)
Ensure that AVX2 dispatch targets also get to use AVX macros/headers
* configure 1.1503 (changed +1 -1)
Further streamlining, and elimination of mixed vector instruction set code paths
* src/QuickSurf_AVX2.C 1.3 (changed +17 -82)
Lock out AVX-512F orbital kernel until it is fully debugged.
* src/Orbital.C 1.153 (changed +7 -3)
Rewrote analyze_selection_aligned_dispatch_avx() to calculate aligned
array index offsets from low level pointer arithmetic, rather than looping
and testing for alignment within the loop itself. Added and tested an
AVX2 loop for counting selected atoms, for an eventual AVX2 dispatch variant.
* src/util_simd_AVX.C 1.2 (changed +52 -34)
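The idea of deriving aligned index offsets directly from pointer arithmetic, instead of testing alignment inside the loop, can be sketched as follows. `AlignedSplit` and `split_for_alignment()` are hypothetical and are not the code in analyze_selection_aligned_dispatch_avx(); the alignment is assumed to be a power of two and a multiple of `sizeof(int)`.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result: elements [0, head) precede the first aligned
// address, [head, body_end) form whole aligned vectors, and
// [body_end, n) are the leftover tail elements.
struct AlignedSplit {
  std::size_t head;
  std::size_t body_end;
};

inline AlignedSplit split_for_alignment(const int *arr, std::size_t n,
                                        std::size_t alignsz) {
  std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(arr);
  std::size_t misalign = static_cast<std::size_t>(addr & (alignsz - 1));

  // Elements to skip before reaching an aligned address.
  std::size_t head = misalign ? (alignsz - misalign) / sizeof(int) : 0;
  if (head > n) head = n;

  // Round the remaining count down to a whole number of vectors.
  std::size_t lanes = alignsz / sizeof(int);
  std::size_t body_end = head + ((n - head) / lanes) * lanes;
  return {head, body_end};
}
```

A caller would then run a scalar loop over the head and tail ranges and an aligned vector loop over the body, with no alignment test inside any of the loops.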
Silence runtime dispatch debugging messages for the time being
* src/util_simd.C 1.21 (changed +3 -3)
|Other commits are hidden...|