VMD 1.9.4 Development

VMD Development Status

Latest VMD CVS statistics and changelog

Working toward VMD 1.9.4 beta releases
- Updated to the latest colvars rev ffde432e48f88989d4d780e91dcd7dee19a94243
- Added GROMACS tpr file reader plugin by Josh Vermaas https://github.com/jvermaas/vmd-tprreader The initial implementation has only been tested on single-precision tpr files, and alchemical changes have not been tested. Files written by very old GROMACS versions cannot be read; files from GROMACS versions prior to 4.0 may be too old to load.
- Applied topotools bugfix update from Axel Kohlmeyer
- Imported Colvars git rev 627e98d8b91901ae3a1a19a7522a290dc8fdd0e5 containing the latest updates from Giacomo and Jerome.
- vnd: Added morphology_line style to cmd_mod_rep_node_fullsel and removed the 'display resetview' from it. Fixed a typo in show_compart_ranged
- vnd: Added compartment page, similar to activity page. Extended render page to include movie generation. Fixed some bugs regarding reset view.
- vnd: Fixes to properly read compartment segments assigned to each local node_id in a population. Start of procedures to find/display edge/synapse information during compartment activity display.
- vnd: Added compartment-data related queries to cmd_query: compartment_data_min_max, compartment_time_ranges, compartment_var_names.
- vnd: Added procedures for loading, displaying, and animating compartment data. Most relevant procs are: ::neuro::load_hdf5_compart_file, ::neuro::cmd_create_rep_compart_moment_selection_ranged, ::neuro::compart_animate_selection_render_ranged. For future interactive slider/playback controls, ::neuro::show_compart_moment_nodelist_ranged works with an existing molecule and existing sublist of globalNodeIdList.
- Added new files from Colvars git rev ad064bdd011204ea2f5e2e87ebacdb1255292b91
- qwikfold: Updated with latest rev of qwikfold from Diego Gomes and Rafael Bernardi
- topotools: Topotools plugin v1.9 from Axel Kohlmeyer
- signalproc: Applied Axel Kohlmeyer's patch to correct a typo in the signalproc plugin.
- vnd: Workaround to fix the different display origin between general reps and activity reps
- vnd: Fixed color assignment in spike method
- vnd: Changed spikeEnd in GUI from 1800 to 3000, to accommodate animation of datasets with bigger timestamps.
- vnd: Added morphology_draft to spike and added larger spikewaittime
- vnd: Now properly using VND global node id, instead of model population node_id, in show_spike_pop_moment_from_list.
- vnd: Fixes to GUI spacing, connectivity styles, and object draw
- vnd: Removed more 'scale by' commands; they are no longer needed in general display procs now that bounds are working for reset view. Changed morphology-related style names -- Previously: morphology (displays disconnected spheres) and morphology_spheretubes (displays connected spheretubes). Now, respectively: morphology_draft (displays disconnected spheres) and morphology (displays connected spheretubes). In GUI, changed value in representation style combobox to accommodate new morphology_draft name.
- vnd: Removed scale by command from reset view and cmd_mod_rep_node_fullsel
- vnd: Changed soma-only drawing from spheretubes to traditional spheres. A workaround to allow proper bounds checking for display resetview. It looks like VND/VMD is not currently accounting for spheretubes (at least multiple solo spheres with drawtubes 0) in the MoleculeGraphics::find_bounds check.
- vnd: Fixed bug that would read in a population N times if it had N groups. Was hard to notice visually, since it led to creation of overlaid objects.
- vnd: Updated system query for num_neurons to num_neurons_non_virtual
- vnd: Added 'reset VND' option in File menu; for that, reshuffled some of the variable definitions into the 'initialize' proc. Added new implementation of edit rep with new cmd_mod proc in vnd_read.tcl. Fixed some issues that came up during the tutorial preparation. Added morphology_spheretube style.
- vnd: Added proc cmd_mod_rep_node_fullsel based on cmd_create. It works with the GUI to edit reps and keep the same mol id and rep id in nrepList. Eventually we would want to replace this or collapse it into cmd_create, but the previous implementation for edit rep had bugs in the rep order and this fixes that.
- vnd: Fixed issue with lreplace in cmd_show_rep and cmd_hide_rep: the original code was not updating the variable 'shown' in nrepList, causing issues with the save/load visualization state.
- vnd: Added ability to control if a new nrep includes virtuals (neurons with model_type == "virtual", set from their type). Also added whole-system queries num_neurons_non_virtual, num_nodes_non_virtual (synonyms). Details: When cmd_create_rep_node_fullsel is called, if the namespace-wide variable ::neuro::display_virtuals_at_creation is True, the rep will contain the virtuals. If False, the rep will not contain the virtuals. The nRepList entry for this nrep will contain num_neurons that reflects whether virtuals are included. A new flag has been added to each nRepList entry: displayed_virtuals_at_creation. For edge nreps, virtuals for source/target are for now always included, and displayed_virtuals_at_creation is always True.
- vnd: Fixed bug: had failed to load system if no morphologies directory was defined. Not defining morphologies_dir is acceptable in SONATA for systems with no morphologies.
- vnd: Fixed bug that showed giant soma for morphology_spheretube. Corrected single-element morphology section list assignment, so that section ID starts -- properly, by SWC format -- at 1 instead of 0. Works with checks elsewhere that expect 1.
- Added Axel Kohlmeyer's plugin for YAML formatted LAMMPS dump files.
- Molecular Orbitals: Continued revision for 3-D grid launch kernel variant
- vnd: Added ability to use spheretubes for morphology drawing, so morphologies don't appear as disconnected groups of spheres at close inspection. For now, uses special style morphology_spheretube for node and spike animation cmd_create commands. Drawing spheretubes for now as lower-performance pairs of points and over-drawing; in future should draw as complete sections (chains of connected points, between bifurcations and end points).
- vnd: New activity tab with animation controls for play, pause, step, speed and window
- dipwatch: Changed pack to grid to fix issue with newer Tcl 8.6.x
- Added links to AltiVec documents required to maintain the POWER 8/9/10 VSX code
- Removed old OptiX paths associated with NCSA Blue Waters and Indiana U. Big Red II systems since they have been decommissioned.
- Removed CRAY_XK platform configurations now that NCSA Blue Waters has been decommissioned (the last running XK system that we're aware of).
- Removed old configuration options associated with NCSA Blue Waters, since it has been decommissioned.
- Minor housekeeping on chain definitions while auditing for potential issues with new PDBx assembly chain definitions that can contain dashes, etc.
- Added ARM intrinsics docs URL.
- Change CRAY_XC targets to go for SM 6.0 and later
- Revised some Cray gcc/clang compilation flags to avoid trouble with the latest compilers.
- vnd: Added loading and displaying of spike time series for populations. load_hdf5_edge_file_pair, cmd_show_spike_pop_seqeunce_from_fullsel, and show_spike_pop_moment_from_list are the relevant procs for loading, showing a fixed animation onscreen with parameters, and showing a "moment" of the series. The moments are the basis for the current fixed onscreen animation but can be used for interactive animation, movie making, and similar. The zoomscale parameter for cmd_show_spike_pop_seqeunce_from_fullsel is a hack - should be replaced with good control of scaling when drawing multiple molecs (nreps). Also: reading virtual attribute for nodes.
- cranked version
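
The entry above on controlling whether a new nrep includes virtuals describes a simple filtering rule. The following Python sketch illustrates only that rule; it is not the plugin's Tcl implementation, and the node dictionaries, the create_rep helper, and its return layout are hypothetical names chosen for illustration.

```python
def create_rep(nodes, display_virtuals_at_creation):
    """Build a rep entry, optionally leaving out virtual neurons.

    `nodes` is a list of dicts with "id" and "model_type" keys
    (a hypothetical layout, assumed for this sketch only).
    """
    if display_virtuals_at_creation:
        shown = list(nodes)
    else:
        shown = [n for n in nodes if n["model_type"] != "virtual"]
    return {
        "node_ids": [n["id"] for n in shown],
        # num_neurons reflects whether virtuals were included
        "num_neurons": len(shown),
        # flag recorded on the entry, as described in the changelog
        "displayed_virtuals_at_creation": display_virtuals_at_creation,
    }


# Made-up example system: two real neurons and one virtual neuron.
nodes = [
    {"id": 0, "model_type": "biophysical"},
    {"id": 1, "model_type": "virtual"},
    {"id": 2, "model_type": "point_neuron"},
]

with_virtuals = create_rep(nodes, True)
without_virtuals = create_rep(nodes, False)
print(with_virtuals["num_neurons"], without_virtuals["num_neurons"])
```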
VMD 1.9.4 alpha 57 (April 27, 2022)
- psfgen: Force-clear psfgen context/handle data structures during allocations, by using calloc() rather than malloc.
- psfgen: Added additional safety checks in psfgen topo_mol_get_atom() to prevent psfgen from attempting to dereference residue atomArray[]->name strings for NULL atoms. The same safety checks are used in similar loops elsewhere, so hopefully Joao's "NEWPSFGEN" code will behave correctly with fallthrough after the residue atom loop terminates.
- Updated the MacOS X plugin builds to build against libnetcdf.dylib
- Correct the MacOS X ARM64 resource compiler flags
- Add NETCDFDYNAMIC flag for all Mac builds.
- vnd: Replaced createRepArgs code to avoid issue with '-' in selections
- Begin reorganization of molecular orbital kernel launch and data transfer logic to permit the use of 3-D grid launches and alternate strategies to deal with grid padding, etc.
- Added two new thread-coarsened CUDA kernels based on the "packed" input array variant. They use a strategy of increasing the per-thread arithmetic intensity by computing 2 or 4 grid points at a time, but reading only the same number of input operands from the basis array and wavefunction arrays. This comes at the cost of increased register pressure, but the trade-off is well worth it on the newer "Ampere" GA10x GPUs since they have large register files. It is particularly beneficial for the performance of GPUs such as the A40/A6000 that have 2x the single-precision floating point ALU throughput, as it increases instruction level parallelism significantly.
- vnd: Edge rendering now includes morphology target (efferent) information for swc files (later add source (afferent) swc info if this is ever used by files). To display, use one of the following new edge styles of cmd_create_rep_source_target_edges_fullsel: simple_edge_swc, source_sphere_swc, target_sphere_swc, source_target_sphere_swc.
- Added environment variable override to enable use of the new Ampere kernel when set. Early benchmarking shows a 30% performance gain for the A6000 GPUs w/ double performance single-precision floating point and GDDR6 memory, but break-even on higher-end A100 hardware.
- Allow runtime override of the packed cached global memory molecular orbital kernel for Ampere and other recent GPUs.
- Updated the molecular orbital kernel launch logic to encompass the new packed cache kernel for Ampere and other recent GPUs.
- Added a new 4th molecular orbital kernel tuned specifically for Ampere (possibly also Volta and Turing) GPUs.
- Added notes about compiler versions that generate incorrect SVE code from intrinsics used in the molecular orbital kernel.
- cv_dashboard: Updated to colvars main 948003965f79c7266e5e7da39156cf8f1b4c59fe
- colvars: Updated to current colvars main 948003965f79c7266e5e7da39156cf8f1b4c59fe
- Added Python bindings equivalent to the Tcl "mol fromsels" command, contributed by Josh Vermaas.
- Eliminated vector selection operator since we will use a zeroing vector multiply to automatically yield zero values of inactive SIMD lanes without needing any additional instructions.
- Carefully revised the ARM64 SVE orbital code to make safe use of non-selected-undefined "_x" SVE instructions which generate fewer machine instructions as compared with their non-selected-zero "_z" counterparts. By using the zeroing intrinsics for key operations at the end of long arithmetic sequences and in assignment of key variables, we can ensure inactive SIMD lanes end up with zero values, but we use the faster (fewer instructions) undefined variants in long multiply-add sequences that have the same source and target machine register, thereby eliminating the need for the compiler to generate additional MOVPRFX instructions.
- Replaced all of the SVE non-selected-undefined "_x" SVE intrinsics with their non-selected-zero "_z" variants so we're certain to get only zeros in inactive SIMD lanes. This costs some additional MOVPRFX machine instructions in addition to FMULs, but we can work our way backwards and use the _z variants only for the final ops in each sub-sequence when it makes sense.
- Misc corrections and improvements to the ARM64 SVE exponential approximation
- Improved ARM64 SVE vector size console output formatting for error case.
- Improved structure of the molecular orbital runtime CPU dispatch loop to the various SIMD variants.
- Added ARM SVE fall-back implementations to make it easier to workaround some currently-broken ARM64 compiler toolchains that lack functional SVE support.
- Enabling compilation of ARM64 CPU SIMD feature detection w/ NVIDIA HPC SDK.
- Changed ifdef checks to enable ARM64 compilation with the NVIDIA HPC SDK.
- Removed safety checks for old versions of the Portland Group compilers since recent versions seem to accept intrinsics code without so much trouble.
- Eliminated unreachable code warnings by updating the conditional compilation macro structure.
- Ensure the error handling code paths close image output file handles so they aren't leaked.
- Eliminated the convenience union used for bitwise manipulation of the floating point values, since variable length SVE vectors can't be included in structure types.
- MacOS X doesn't support ARM SVE vector instructions yet, so they are disabled within MacOS X ARM64 builds, even for runtime CPU dispatch.
- Added Orbital_SVE.C to the build when compiling for LINUXARM64
- Enable runtime dispatch of SVE kernels on ARM64 builds.
- Added rough draft of hand-vectorized ARM64 SVE molecular orbital kernels.
- Enable ARM64 SVE instruction set runtime vector length detection and reporting.
- Corrected 64-bit double precision SIMD broadcast for ARM64 NEON
- Added ARM64 NEON SIMD implementations of 1-D array stats, atom selection accelerators, bounding box calculations: minmax_1fv_aligned_neon(), minmax_3fv_aligned_neon(), etc.
- Comment out ARM64 Unix kernel auxval2 query via AT_HWCAP2 until we actually need it for runtime queries of ARM CPU optional instruction set features of interest.
- Added ARM64 NEON SIMD instruction kernels for runtime dispatch on MacOS and other ARM64 platforms. The initial implementation still contains some SVE kernels as placeholders.
- Updated ARM64 NEON SIMD molecular orbital implementation to make explicit calls to vcvtq_f32_u32() for conversion of unsigned int comparison flag results to single-precision floating point, and to use the resulting floating point mask array to compute the orbital electron density if enabled.
- Fix XLC VSX conditional compilation rules for POWER8/9/10 builds
- vndplugin: Added piecewise file reading, using for edge files. Now can read large files like v1 without crashing. See new procs hdf5_piecewise_dataset and hdf5_simple_dataset_range. Also: turned off some list sorting not currently needed; now handling "indicies" edge subgroup spelling variant the same as special, optional "indices" edge subgroups -- which for now remains ignoring these subgroups.
- vndplugin: Replaced call to cmd_create_rep for cmd_create_rep_node_fullsel that allows more complex selection syntax and adjusted the tree view example rep creation to include the full selection
- vndplugin: Initial version of tree view with button to create example rep containing one selection
- vndplugin: Made edge searching faster (4.04x speedup for a layer 4 model test). Now creates globalEdgeDataList and does a round of list-based searching. Also made help message for queries more polite.
- vndplugin: Added full selection strings (logic and parentheses) and edge searches. Can a) make a nodes rep from one full selection string b) make an edges (connections) rep from two node selection strings. See new procs cmd_create_rep_node_fullsel, cmd_create_rep_source_target_edges_fullsel. Edge source is in edges_source_nodes_target_nodes. Parser uses tokenize_sel_string and shunting_yard and some utility stack and queue procs. Also, extended some selections so == takes a list of numbers.
- Drastically improve vectorizability of VolumetricData::clamp(), in exchange for always writing back to memory. We could hand-write a SIMD loop that would encompass the best aspects of both approaches with a SIMD-wide true/false if any lane had to be clamped, checked via something like _mm256_testz_si256() in AVX, for example.
- Corrected a latent bug in clamp_int() that had not yet been triggered by any calling code.
- Updated colvars to main revision f2a0dff623a5cc46fc9efa6288ee9ec7230d8b1b
- Capitalize compile time Marching Cubes kernel template specialization parameters for clarity.
- Eliminated the old Marching Cubes code associated with deprecated CUDA texture reference APIs.
- Completed rewrite for CUDA texture object APIs, and checked allocations with compute sanitizer.
- Completed first phase of rewriting the marching cubes texture management to use the modern CUDA texture object APIs that replaced the now-deprecated texture reference APIs. The triangle vertex count and edge combination tables have been updated to use linear texture objects, so the last step will be to convert the 3-D texture mapping as well, although it uses the "array" based APIs which are quite a bit different.
- vnd: Fixed a variable change that had broken several _select functions. Also, fixed an incorrect response to some kinds of error in selection string when creating rep based on a selection: no longer attempts to draw nonexistent node -1.
- vnd: Added tree view GUI to information section. Works with current cmd_query commands; still missing functionality of 'Query' and possible automatic rep generation with toggle button.
- vnd: Added several queries to ::neuro::cmd_query, including node_types_in_group FILESET POPULATION GROUP. Fixed an error that was resetting globalEdgeIdCount for each pop read in. Fixed bugs with node vs edge variables.
- vnd: Now reading in edges (i.e. neuron connections). Current default is to not read edges, since large (GB-scale) edge files can cause memory problems and crashes. To load edges: when calling cmd_load_model_config_file, set reading_edges param to True. Added cartesian Boolean property to nodes, True for nodes with any x,y,z coords. There are no longer skipped groups when loading. See ::neuro::initVars for new namespace variables edge(), globalEdgeIdList, edge_pop_hash, and more. Can now do simple straight-cylinder edge drawing with ::neuro::proto_show_edges_from_list. Need to add edge-oriented (less vitally, cartesian-status-oriented) query and selection commands.
- vnd: Unified createRepArgs procs and revamped syntax for args. Fixed multiple bugs with Representations table. Improved updateRepMenu, editRep and delRep.
- Improved device enumeration column alignment for cases when there are a mix of similar and dissimilar GPUs in the same host.
- hbonds: fixed small bug I introduced when sel2 isn't present.
- Updated comments and constants to match latest FMA benchmark structure and did some brief tuning tests on an RTX A6000.
- Added "vnd" (a neuroscience visualization GUI and associated back-end data parsing, selection, and storage code) to the plugin tree.
- Added compile-time macro check to detect C11, which adds portable C library implementations of threads, atomic operations, memory alignment, and Unicode string support. The current code doesn't do any checking for the optional __STDC_LIB_EXT1__ (which must equal 201112L if set) at present. As of Feb 2022, C11 is listed as having only partial support from the major vendor compilers and GCC/Clang. It is noteworthy that C17/C18 adds no new language features and only addresses inconsistencies etc. It is currently expected that the next C2x language version will be finalized in 2023. Much of C2x revolves around updating items in common with newer C++ specs, absorbing a few POSIX APIs, eliminating K&R function definitions, etc.
- Implemented further optimizations that eliminate branching for 256-atom-block traversals when we have contiguous range of selected atoms within a block.
- Added optimized 256-atom block traversal using packed-byte first/last/count block-specific index selection information.
- Continued implementing sparse selection traversal optimization approach. Based on C++11 lambda expressions to hide acceleration structure traversal complexity and allow simple closure style anonymous lambda syntax at call sites.
- Added a new secondary atom selection acceleration array "on256", that contains selection information for blocks of 256 atoms, permitting tremendously improved performance when processing sparse selections in very large systems. Due to the increased complexity of the selection traversal loop, the new acceleration scheme is primarily intended for use in combination with C++11 lambda expressions, either named, or more likely structured as anonymous closures at the actual call site. The call sites are not polluted with complex traversal code, and they retain the performance benefits of full inlining of the lambda contents into the innermost loop body. Renamed the new method AtomSel::for_selected_lambda() for the time being.
- Added example of anonymous lambda syntax for COM calculation.
- Test use of C++ 2011 lambda expression for measure command atom selection inner loop body.
- Added draft example atom selection mechanism for use with C++ 2011 lambda expressions, to permit much broader use of atom selection processing acceleration techniques.
- timeline: First fix for pack/grid issue with TCL8.6 in the calculation menu
- Added "vmdinfo compilers" command to permit runtime query of the compile-time C/C++ language standard levels. We may extend this further with additional checks for specific compiler toolchains, e.g. GCC, Clang/LLVM, PGCC, Intel C++, IBM XLC, etc.
- Added C++ 2011 compiler flag and machinery for the capture of key details of the compile-time environment both for C and C++ source files.
- Added a C source file to capture details of the host compile-time language standard and/or other C compilation environment details that can't be captured from within the C++ source files. This is mostly to permit capture of the environment that many of the molfile plugins are expected to use, since VMD itself is totally C++ dominant.
- Corrected test and scoping for enabling the use of Address Sanitizer with GCC and Clang based compilations
- Updated OSPRay 2.x build rules to use OSPRay 2.8.0 by default
- Hardened the VMD FileRenderList startup and renderer registration code for Intel OSPRay to ensure that any OSPRay initialization failures don't crash VMD. This should handle both insufficient CPU instruction set extensions as well as runtime errors that have been observed with loading of the ISPC shared libs on Windows systems.
- Revised ANARI renderer implementation to more closely track the provisional spec, but maintain the use of some extensions for particular renderers for the purposes of rendering of high production value images for the ANARI manuscript in the short term.
- Updated default ANARI path for early public SDK structure.
- cranked version
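
The "on256" entries above describe skipping whole 256-atom blocks that contain no selected atoms, with the per-atom work supplied as a closure at the call site. The Python sketch below illustrates that idea in simplified form; it is not VMD's C++ implementation. It keeps only a single any-selected flag per block (not the packed first/last/count bytes mentioned above), and the function names, the selection, and the coordinates are made up for illustration.

```python
BLOCK = 256  # atoms per acceleration block, matching the "on256" name


def build_block_flags(on):
    """Return one flag per 256-atom block: True if any atom in it is selected."""
    return [any(on[start:start + BLOCK]) for start in range(0, len(on), BLOCK)]


def for_selected(on, block_flags, visit):
    """Call visit(index) for each selected atom, skipping empty blocks."""
    for block, has_selected in enumerate(block_flags):
        if not has_selected:
            continue  # whole block is unselected; no per-atom tests needed
        start = block * BLOCK
        stop = min(start + BLOCK, len(on))
        for i in range(start, stop):
            if on[i]:
                visit(i)


# Hypothetical sparse selection: 3 atoms out of 100,000.
num_atoms = 100_000
on = [False] * num_atoms
for i in (5, 70_000, 99_999):
    on[i] = True

coords = [(float(i), 0.0, 0.0) for i in range(num_atoms)]  # made-up positions

block_flags = build_block_flags(on)
visited = []
total = [0.0, 0.0, 0.0]


def accumulate(i):
    """Closure passed at the call site; sums coordinates of selected atoms."""
    visited.append(i)
    for axis in range(3):
        total[axis] += coords[i][axis]


for_selected(on, block_flags, accumulate)

# Center of geometry of the selection (unit masses assumed).
center = [t / len(visited) for t in total]
print(visited, center)
```

The call site only supplies the per-atom body; the block-skipping logic stays inside for_selected, which is the separation of concerns the changelog entries describe.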
VMD 1.9.4 alpha 56 (December 15, 2021)
- plumed: Updated to Plumed plugin version 2.8
- Write atomic element and chain fields to the PDB input for Stride/DSSP since the latest versions of DSSP require these to be present.
- molefacture: Added error check for empty molecule in element combobox selection
- Updated ANARI to default to the "example" device after recent renaming.
- qwikfold: Added QwikFold to the builds
- qwikfold: Added QwikFold placeholder to the script that populates the extensions menu during VMD startup
- sync with colvars git version 6d97a4339092d4f04f0dee0bf91c66cafc3bc406
- multiseq: Changed pack calls to grid
- Added two better designed fallback cases for VMD initialization on Windows systems, to set the temporary directory if TMPDIR is unset at startup.
- qmtool: Fixed pack/grid issue when calling multiplot embed.
- qwikmd: Fixed multiplot embed column order.
- multiplot: Changed pack to grid geometry manager to be compatible with plugins that use multiplot embed in new TCL 8.6. Checked plugins that use 'multiplot embed': QwikMD, QMtool, FFTK and MDFF. Only QwikMD and QMtool required fixes.
- Added use of ptrdiff_t types for key voxel indexing arithmetic, to enable QuickSurf density map generation for volumes containing more than 2 billion voxels, such as the SARS-CoV-2 Delta Aerosol visualization.
- Applied Josh Vermaas' patch to ensure that VMD parses all possible keywords for file writing operations passed by Molecule.py
- Ensure that DSSP emits verbose output if issues arise.
- Fixes for parsing issues with some versions of mkdssp.
- Updated the ANARI renderer to allow the user to specify the behavior of the USD back-end interface at runtime, including whether or not to emit USD as local files or to make a network connection to an Omniverse server, the root level USD path to use for exporting sessions, and to select binary or ASCII USD formatting.
- Inhibit the "interactive" ANARI renderer from showing up as an option when the USD back-end has been enabled.
- Short-term workaround to prevent a crash that can occur at the start of the vectorized texture map accumulation code, likely due to memory alignment limitations.
- Revised the ANARI FileRenderer subclass to ensure that VMD representations lead to unified instance/group nodes containing all of the surfaces, geometry, and materials that make up the representation.
- Replaced old Purify-based memory checking build rules with modern Address Sanitizer based checking.
- Ensure zero-initialization of atomicnumfactor_4 regardless of state variables.
- Updated the per-representation comment DisplayList DCOMMENT token generation code for complete coverage of reps, and the current rep names. The comment token will be used to populate graphical interfaces that show the structure of the molecular scene graph.
- Updated ANARI material handling code to more gracefully adapt to different renderer subtypes and their associated limitations.
- Eliminated completion side of Axes comment buffer output so that all comment buffers are tagged at start only.
- Made the comment buffer output for the reps considerably briefer for better use within the ANARI USD namespace and other scene tree browser displays that break down the scene components into groups of geometry, instances, etc.
- Eliminated old sphere-specific workaround code path for the ANARI USD back-end. The current code attempts to honor the provisional spec APIs, so the workaround scheme is no longer needed, albeit there are still limitations on the Omniverse side unrelated to VMD itself. Added USD naming for the top level ANARI world object created by VMD, and began working on minimizing the number of cases where USD would generate an empty session.
- Updated the ANARI renderer implementation to use more narrowly defined ANARI_ARRAY1D types for the geometry array API calls. Updated the geometry subtype and parameter strings to bring them in sync with the ANARI provisional spec.
- Continued updates to bring VMD in sync with the ANARI provisional specification and recent back-end renderer updates.
- Added support for new ANARI USD back-end, and updated framebuffer format configuration to the current ANARI provisional specification.
- Eliminated Python startup warning reported by Josh Vermaas
- Protect RTRT DSPHERE display list token processing code with a new scope to prevent leaking of variables and state into the parent scope, and to ensure successful compilation with the most recent revs of GCC which are less permissive about skipped initialization scenarios.
- cranked version
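
The ptrdiff_t entry above concerns voxel index arithmetic for volumes with more than 2 billion voxels, which exceeds the range of a signed 32-bit integer. The Python sketch below emulates the problem by wrapping the result to 32 bits with ctypes; the grid dimensions and function names are hypothetical, and the wraparound shown is what typical two's complement hardware produces (in C and C++, signed overflow is formally undefined).

```python
import ctypes


def voxel_index_wide(x, y, z, nx, ny):
    """Linear voxel index computed with arbitrary-precision Python ints."""
    return (z * ny + y) * nx + x


def voxel_index_int32(x, y, z, nx, ny):
    """Same arithmetic, wrapped to a signed 32-bit result to mimic 'int'."""
    return ctypes.c_int32(voxel_index_wide(x, y, z, nx, ny)).value


# Hypothetical 1500^3 grid: 3,375,000,000 voxels, more than 2**31 - 1.
nx = ny = nz = 1500
last = (nx - 1, ny - 1, nz - 1)

wide = voxel_index_wide(*last, nx, ny)
narrow = voxel_index_int32(*last, nx, ny)

print(wide)    # index of the last voxel with a wide integer type
print(narrow)  # negative: the 32-bit result has wrapped around
```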
VMD 1.9.4 alpha 55 (October 19, 2021)
- Roll OptiX PTX compute capability back to SM 5.0 for a while longer, at least for non-RTX RTRT builds.
- Added support for lone DCYLINDER and DSPHERE display command tokens in the full-time RTX RTRT display mode.
- Added OpenGL background color state softcopy tracking to facilitate correct state tracking when we enter/leave full-time RTX RTRT mode.
- Added new QwikFold graphical interface plugin for AlphaFold runs.
- cv_dashboard: Updated to Colvars version 1f1c3c32dcd2cf9fba9457514ce63bf284271fa9.
- Updated to Colvars version 1f1c3c32dcd2cf9fba9457514ce63bf284271fa9.
- Updated README
- cranked version
VMD 1.9.4 alpha 54 (October 7, 2021)
- Added support for modulation of the NewCartoon representation thickness by user-specified per-atom fields, following the existing approach already implemented for VDW and NewRibbons representations.
- Tied runtime QuickSurf texture format selection to the user-specified surface quality parameter assigned when the representation is computed.
- Continued unification and optimization of the QuickSurf density/texture kernels for multiple texture formats. Changed QuickSurf memory allocation routines to track and incorporate the texture format in its calculations so that it can be changed on-demand by the user's surface quality preference at runtime.
- Updated ORNL Summit build configuration for CUDA 11.0.3, removed compilation support for Kepler GPUs of compute capability 3.0 (now more than 10 years old), and updated the OptiX ray tracing engine shader generation to use compute capability 6.0 as the PTX targeting Pascal and later generation GPUs.
- Applied Josh Vermaas' corrections to the built-in FastPBC implementation, replacing the FastPBC unwrapping algorithm with a corrected displacement method based on von Bulow's displacement algorithm.
- Updated the ANARI renderer to use the accumulation buffer, and run multiple passes, since we don't standardize the meaning of sample count parameters among renderers presently.
- Major revision of ANARI renderer subclass to sync up with the ANARI provisional API and associated SDK updates. The updated implementation isn't using the accumulation buffer yet and various other aspects of object ownership and lifecycle still need cleanup. The current code isn't handling the newly required interactions between the ANARIFrame object and its required parameters in a graceful way. The code needs some refactoring to better support the new object dependency graph associated with the updated API.
- VMD previously assumed that all Python objects that can be read by PyLong_FromLong pass the PyLong_Check test. In Python 2, this was true, but in Python 3, numpy.int32 or numpy.int64 datatypes can be validly converted into Python integers, but don't pass the PyLong_Check test. Trying to do something relatively innocuous, setting resids for a selection to a numpy array, was failing with the old Python 2 type approach: values = np.cumsum(np.ones(len(sel))); sel.resid = values. With the revised implementation, this now works correctly with Python 3.
- Corrected the behavior of Stack::clear() to ensure that the curr pointer is reset correctly.
- Add explicit type conversion to please Clang.
- Ensure that the colorscale posterization flag is initialized in the constructor.
- Simplified QuickSurf RGB3F kernel loops with some operator overloading.
- Back-ported some CUDA kernel optimizations from the QuickSurf RGB4U code path to the RGB3F code path. Ultimately these kernels should end up being unified using templates.
- psfgen: Fixed psfgen readmol command for billion-atom systems.
- Fixed a bug in handling of user-specified isovalues for non-color-per-atom mode (e.g. ColorID) in the CUDA QuickSurf code path.
- hbonds: Applied JC Gumbart's patch to correct the hbond analysis plugin to correctly handle cases where segnames aren't set, and to handle cases where multiple chains occur across selections.
- Eliminated check for the X11 composite extension when using off-screen Pbuffers.
- Disabled check/warning for the X11 Composite extension since all of the newer distros have turned it on by default, and conventional planar stereoscopic displays are all but gone from the face of the earth at this point. Those rare sites that still have stereo displays will know that they can't use one of the compositing window managers and/or they will have system software that works correctly in such a case.
- Updated Windows x64 distribution and installer for latest updates.
- fftk: fixed bug when LP section is present but empty in PSF file
- cranked version
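
The Python 3 integer-handling entry above turns on the difference between testing an object's type and asking whether it can be converted to an integer. The pure-Python sketch below illustrates that difference for the integer-scalar case only; it is not VMD's C implementation. FakeNumpyInt is a hypothetical stand-in for a NumPy integer scalar (which in Python 3 is not an int subclass but does provide __index__), and strict_check and to_resid are illustrative helper names.

```python
import operator


class FakeNumpyInt:
    """Hypothetical stand-in for an integer scalar that is not an int subclass."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lossless conversion hook used by operator.index().
        return self._value


def strict_check(obj):
    """Type-based test, similar in spirit to a PyLong_Check-style check."""
    return isinstance(obj, int)


def to_resid(obj):
    """Conversion-based test: accept anything that converts losslessly to int."""
    try:
        return operator.index(obj)
    except TypeError:
        return None  # not usable as an integer


value = FakeNumpyInt(7)
print(strict_check(value))  # the type-based test rejects the stand-in
print(to_resid(value))      # the conversion-based test accepts it
print(to_resid("abc"))      # non-integer input is still rejected
```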
VMD 1.9.4 alpha 53 (June 30, 2021)
- Short-term modifications for initial VND "Visual Neuronal Dynamics" neuroscience tool release with the developmental GUI.
- cranked version
VMD 1.9.4 alpha 52 (June 18, 2021)
- multiplot: Added independent editplot GUI, called from a new 'edit' menubar button. Minor changes to plot's namespace variables to accommodate added functionality in plot customization (e.g. titlefontsize). Added prototype proc to draw grid lines.
- Implemented many updates due to recent changes in the ANARI API design. Still need to completely resolve changes related to having made the framebuffer the object that collects camera and other global scene state that used to be provided on render calls.
- Added default handler for cylinder array primitive needed for better full-time RTRT performance.
- Added cylinder array display primitives which are far higher in performance when using full-time RTRT display mode.
- Added code to generate molecular representations using cylinder array primitives, especially for use in RTRT rendering modes.
- Added OptiX 6 based implementation of full-time RTRT rendering mode back end. The new OptiX 7 version of Tachyon will end up being significantly more efficient with caching and tracking of existing command list buffers and other data in GPU-resident data structures, but this is a usable placeholder implementation despite the various holdovers from its more offline-oriented original design. The new code adds methods to aggregate materials and command list buffers into a complete scene description, and to efficiently render it.
- Added preliminary support code to permit full-time real-time ray tracing as an alternative to OpenGL rasterization, but maintaining the use of OpenGL to provide on-screen image presentation, double buffering, etc. Added a new method to allow the main X11 OpenGL window title to be modified at runtime, to more clearly indicate which renderer is active.
- Ensure that anyhit traversals against opaque geometry end immediately.
- minor tweak to improve ANARI context regen ordering
- Minor cleanup to GLSL shader compilation code while looking into similarities with needs for loading OptiX 7.x compiled PTX.
- Revised ANARI, OptiX, and OSPRay framebuffer methods to match recent updates to the OptiX method names, and to incorporate corrections and improvements already made in the OptiX framebuffer methods, in support of fully interactive ray tracing pass-through to OpenGL DisplayDevice subclasses.
- Print updated internal framebuffer state variables when verbose output is enabled.
- Ensure that framebuffer_config() and framebuffer_resize() always set internal width/height member variables so that framebuffer reads and display are handled correctly for pass-through of full-time ray tracing in OpenGL DisplayDevice subclasses.
- Added calc_matrix_scale_factor() method to compute a uniform scaling factor for cone/cylinder/sphere primitives if/when flattening the scene graph (by pre-transforming all geometry) for best interactive ray tracing performance.
- Added framebuffer map/unmap methods to permit OpenGL DisplayDevice subclasses to implement full-time ray tracing pass-through.
- Improved naming of framebuffer management methods prior to adding new map/unmap calls required for full-time ray tracing pass-through from within the OpenGLDisplayDevice hierarchy.
- Minor optimizations for vmd_DrawConic()
- Continued revision of OptiX geometry methods to improve const correctness.
- Improved const correctness of the materials and trimesh geometry methods.
- Replace stack clearing loop with call to new clear() method.
- Added a clear() method to allow cleanup of loop-based clearing idioms found in the DisplayDevice subclasses that manage matrix stacks.
- Applied Josh's update to CUDAFastPBC to correct a race condition that occurred during tests with 2 million frame trajectories. There was a rare race condition between memcpy and cudaMemcpyAsync. The new CUDAFastPBC implementation has a cleaner memory copy interface for the unwrapping portion of the code that no longer uses the host-side memcpy, and has a distinct pinned buffer for every stream.
- A few cherry picked colvars module bug fixes and updates provided by Jerome and Giacomo.
- autopsf: Prevent out-of-bounds write of NUL string termination character if the length of inres is longer than the capacity of the resname string buffer. By clearing with memset first, we guarantee that the string will be terminated. This is a likely cause of some of the crash/hang behaviors recently observed with autopsf.
- autopsf: Ensure that the PsfAtom constructors clear all of the string fields completely prior to using them.
- Added the Google AI group's "Turbo" rainbow color scale. Began adding infrastructure for conversion between sRGB colors (gamma 2.2) and linear (gamma 1.0) RGB colors, since color scales, particularly those that are tabulated, may not be in linear color spaces. This is complicated by the fact that the VMD FLTK code presently displays the color scale colors as-is, when they should actually be stored linear and rendered in GL using linear colors, but drawn in sRGB in the FLTK GUI. Further changes will be needed to ensure the correct color spaces are always used for tabulated color scales.
- Corrected the behavior of the DisplayDevice base class implementation of the cueing_on() and cueing_off() methods so that batch mode renderings conducted with "-dispdev text" or "-dispdev none" don't force depth cueing on.
- Corrected typo in tablelist distribution macros
- Added support for setting the AO maximum occlusion distance in the CPU versions of Tachyon 0.99.2 and later.
- cv_dashboard: Synced with colvars git master.
- Synced colvars module with colvars git master.
- Enable native cylinder rendering for ongoing testing of the new OSP_DISJOINT curves API mode.
- Revised the OSPRay 2.x renderer to use the new OSPRay 2.5.0 curve APIs with OSP_DISJOINT, for rendering cylinders and cones.
- Updated OSPRay 2.x renderer to make use of the path tracer and add experimental support for luminous materials.
- Updated OSPRay 2.x paths for OSPRay 2.5.0
- Updated ANARIRenderer for recent API+spec revisions. Eliminated ANARIGroup objects, and updated associated calls to anariNewGeometricModel() and anariNewInstance() and their parameter set calls. Added short-lived correction to leak of futures from asynchronous anariRenderFrame() APIs. This will go away when the APIs are revised so that the futures are implicit rather than explicit objects.
- Updated all of the framebuffer creation/mapping and parameter assignment APIs.
- Corrected missing ospRelease() calls on the futures returned from asynchronous ospRenderFrame() calls.
- Added the new ARM64 NEON+FMA QuickSurf kernels to the build.
- Added runtime dispatch for QuickSurf kernels on ARM64 CPUs with NEON and FMA SIMD instructions.
- Wrote a complete first implementation of QuickSurf for ARM64 NEON SIMD instructions, on machines that have FMA. Tested on a Mac "M1" processor thus far.
- Implemented complete molecular orbital kernel for ARM64 NEON SIMD instructions on CPUs that support FMA instructions. A partial implementation exists for machines lacking FMA, but more code is needed in the wavefunction loops to make it complete. The ARM64 NEON exponential approximation uses a different early-exit testing scheme from x86 SIMD variants, due to the lack of equivalent instructions to gather high order bits from multiple SIMD lanes into a single scalar, so this ARM64 NEON code uses a horizontal maximum scheme to exploit the same early-exit opportunity in a slightly different way.
- Completed AVX2 version of the molecular orbital kernels, based on single-precision floating point AVX instructions, 32-bit integer AVX2 instructions needed for the fast exponential approximation, and single-precision fused-multiply add instructions used throughout. At present, this kernel assumes availability of FMA, so it cannot be used on CPUs that have AVX/AVX2 but lack FMA.
- Implemented early exit optimization for the SIMD exponential approximation by computing a horizontal maximum for the input parameters, and returning the clamped value 0.0 when the max of all four inputs is below the cutoff parameter. Corrected all of the NEON vfmaq_f32() intrinsic calls to use the correct parameter ordering.
- Added conditionally compiled ARM NEON SIMD code for CPUs that lack FMA instructions.
- Misc cleanup and corrections to aexpfnxneon() SIMD exponential approximation.
- Enable compilation of the 16-byte-alignment check helper routine for ARM for the time being. This should be replaced globally by the newer N-byte alignment routines, but this is expedient for the moment.
- Added function prototype and runtime dispatch launch code for NEON SIMD version of the molecular orbital kernel.
- Added first draft version of Molecular Orbital kernels hand-coded using ARM NEON and FMA SIMD instructions. This code compiles but hasn't been tested yet. The first draft was adapted from the AVX2 version, but with reference to the older SSE code in a few cases where I could more easily look up equivalent machine instructions from SSE and NEON. All of the 4-element floating point operations should be totally straightforward drop-ins, so the only instructions that are tricky in the conversion to NEON are (un)aligned vector load/store operations, integer bit shifting, and some of the mixed floating point and integer bit field manipulations used for the fast exponential approximation implementation. The current version of the code unsafely assumes the availability of ARM FMA instruction support. It may be necessary to split that out, but since the initial target of this work is the Apple Mac "M1" hardware, we can revisit this more broadly if/when there are other important ARM targets, CPU instructions, or OS kernel APIs that we can use to reliably make the determination of FMA availability at runtime, which is a current limitation on MacOS and ARM in particular.
- Updated the build system to support runtime CPU dispatch for ARM NEON with FMA instructions for the new Apple "M1" Mac hardware.
- Added #ifdef protection for SVE vector length queries on ARM platforms
- psfgen: Fix compile issue for standalone psfgen builds on MacOSX. Clang complained about missing prototype for newhandle_msg_ex().
- Updated to Tcl/Tk tablelist package version 6.11
- Added molecular orbital runtime dispatch logic for the new AVX2 kernel.
- Added x86 runtime dispatch AVX2 molecular orbital kernel to the build.
- Corrected AVX2 bitwise masking operation for each single-precision floating point SIMD lane.
- Added first draft of AVX2 molecular orbital kernels for runtime CPU dispatch.
- cgtools: Corrected an atom selection leak and incremented the cgtools version to 1.3.
- Added runtime startup message to clearly indicate builds that have runtime CPU dispatch enabled vs. those that do not.
- Added conditional compilation check for CPU info bailout.
- Corrected conditional compilation checks for non-x86 CPU runtime dispatch
- Added Python bindings to permit querying and setting color scale reversal and posterization parameters.
- Revised color scale internals and the Color window GUI to implement posterization of color scales down to a user-specified number of individual color bands.
- Revised the color scale internals to facilitate optional color scale order reversal for both built-in divergent midpoint/offset color scales as well as tabulated color scales.
- Added a live color swatch box to the redesigned color menu. Continued adjustment of widget sizes and placement.
- Added dependency on win32molfilelibs for the win32bin targets to prevent a race condition with building molfile plugin dependencies that the standalone binaries expect to link against.
- Updated cv_dashboard from the latest git master.
- Updated the colvars module from the latest git master.
- cranked version
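The "Turbo" color scale entry above mentions conversion between sRGB and linear RGB for tabulated color scales. A minimal sketch of that conversion is shown below, using the standard piecewise sRGB transfer function (which the entry approximates as "gamma 2.2"). This is an illustration only, not VMD's implementation; the function names and the two-entry example table are hypothetical.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear-light channel in [0, 1] to sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def convert_table(table, fn):
    """Apply a per-channel conversion to a list of (r, g, b) tuples."""
    return [tuple(fn(ch) for ch in rgb) for rgb in table]


# Hypothetical two-entry tabulated scale stored in sRGB.
srgb_table = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
linear_table = convert_table(srgb_table, srgb_to_linear)
print(linear_table)
```

Under this scheme a table would be stored and rendered in linear space, and converted back with linear_to_srgb only where a GUI needs to draw it in sRGB.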
VMD 1.9.4 alpha 51 (Dec 21, 2020)
- Changed the conditional compilation tests for safe use of the FLTK Fl_Color_Chooser class in place of the classic VMD color sliders.
- Shifted the CIELAB L* 100 lightness marker down 10 pixels to match Fl_Chart layout slightly better.
- Improved color scale plot title to "CIELAB L* perceptual lightness" which is more informative and technically correct. Completed the remaining math for RGB to CIELAB color conversion.
- Added an implementation of the "cividis" color scale which improves upon the popular "viridis" color scale for viewers that have color vision deficiencies.
- Significantly redesigned the Color window layout, resizing the color definitions and color scale tabs to occupy the full window, and migrating the entirety of the color definition browsers into the color definitions tab. This change to the window layout better separates controls that are relevant for color definition from those used for color scale selection and editing. The larger space made available within the color scale tab enables the addition of a plot of color scale CIELAB L* luminance just below the color scale test grating image.
- First steps in a major revision to the VMD color scale infrastructure. The existing implementation has been extended with support for tabulated color scales, new internal data structures and APIs to facilitate correct GUI interaction for non-editable tabulated color scales. Due to the significant increase in the total number of color scales now available, the GUI has been revised to support forward and reverse mapping of color scale menu names that include both scale type categories and leaf node color scale names. Redesigned the Color window layout to support a much larger color scale test image, and added a high spatial frequency test grating image based on the color scale test images developed by Peter Kovesi.
- Changed _mm_set_pd1() to _mm_set1_pd() which seems more portable
- Added freely licensed perceptually uniform color scales in tabulated form with 256 color entries per table. These are adapted from four of the popular sequential color scales in Matplotlib, and a large selection of the linear, cyclic, isoluminance, and rainbow color scales from the set of CET perceptually uniform color scales published by Peter Kovesi.
- Changed default colorscale "Offset" from 0.10 to 0.06 after comparisons with high spatial frequency test grating images for both the original parameter and the new value. The built-in VMD divergent color scales, while still much better than the old Matlab "rainbow" scale, leave much to be desired in terms of their ability to show high frequency fine spatial details, luminance linearity, and other factors.
- Corrected width of color category name itembrowser widget.
- Major revision of the VMD color menu to optionally make use of the FLTK-provided Fl_Color_Chooser instead of the classic VMD color sliders. By default, when compiling with FLTK versions >= 1.1.10 VMD will use Fl_Color_Chooser, and will revert to the classic sliders otherwise. The window size is significantly increased, to match the width of the VMD main window, and with additional height to provide an easy-to-use widget size for the FLTK color selector and associated controls. The FLTK color control supports floating point RGB, Hex, integer, and HSV color value ranges, with mouse-based color component scrolling, so all of the original features have been preserved while adding significant convenience.
- cranked version
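The CIELAB L* plot described in the entries above rests on a standard formula, sketched here for reference. The sketch assumes linear RGB input with sRGB primaries and a D65 white point normalized to Y = 1; sRGB-encoded table entries would need to be linearized first. The function names are hypothetical and this is not VMD's own conversion code.

```python
def cielab_lightness(r, g, b):
    """Return CIELAB L* (0..100) for linear RGB components in [0, 1]."""
    # Relative luminance Y for linear sRGB primaries, D65 white (Yn = 1).
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    delta = 6.0 / 29.0
    if y > delta ** 3:
        f = y ** (1.0 / 3.0)
    else:
        # Linear segment near black, continuous with the cube root branch.
        f = y / (3.0 * delta ** 2) + 4.0 / 29.0
    return 116.0 * f - 16.0


def lightness_profile(table):
    """Compute L* for every (r, g, b) entry of a color scale table."""
    return [cielab_lightness(*rgb) for rgb in table]


# Hypothetical five-entry gray ramp standing in for a tabulated scale.
gray_ramp = [(i / 4.0, i / 4.0, i / 4.0) for i in range(5)]
print(lightness_profile(gray_ramp))
```

A perceptually uniform sequential scale would show a nearly straight L* profile, which is what a plot of this kind makes visible.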
VMD 1.9.4 alpha 50 (Dec 17, 2020)
- Added x64 installation script based on V5 of VMD 1.9.4a50 installer build. The installer scripts are re-generated using the "HM NIS Edit" 2.x, followed by hand-patching the installer code to add/delete VMD registry keys and deal with x64-specific installer steps and Administrative privilege level setting. Ideally some of this would be done with macros or some sort of script template, but these steps have to be inserted deep into installer subsections making it an annoying manual editing process at present. This script can be used as a point of reference to recover the manual hand editing by carefully diffing a freshly generated installer script with the previous version. It may eventually be possible to automate this completely by writing an installer generation script or by devising patches that apply relative to subsection starting points that the "patch" utility has a hope of recognizing automatically.
- Added VMD icon to VS2017 project, and Win64 builds.
- Prevent failure of stat() on multi-gigabyte trajectory files from causing the DCD open method to return a fatal error when compiled on 64-bit Windows platforms.
- Updated WIN32 and WIN64 registry query code and software keys to add support for 64-bit builds.
- Prevent glwin window destruction from killing the parent app in Win32/Win64 builds.
- Added built-in OptiX and OSPRay ray tracing engines to the Win64 builds.
- plugins: Prevent the use of parallel make for WIN32 and WIN64 targets until we resolve some dependencies that appear to trigger an output race among make workers with particular build targets.
- fftk: Updated image files from 2015 version that had camel case filenames, which causes problems on case-insensitive filesystems.
- qmtool: Fixed bug for FFTK usage
- cv_dashboard: Corrected inconsistent cv_dashboard version numbers.
- MacOS 10.15 and 11.0 builds require Tcl/Tk 8.6 or later
- Updated version dependency comments for FLTK builds for MacOS X 10.15 (Catalina) and 11.0 (Big Sur).
- Added conditional compilation tests for ARM64 MacOS X builds checking for the compile-time macro ARCH_MACOSXARM64.
- Added MacOS X ARM64 targets
- Added build configuration for MACOSXARM64
- rnaview: Fix missing function prototype in rnaview
- Added build config for MacOS X hosts with ARM64 CPU architecture.
- Updated OptiX renderer to add support for ray statistics reporting for interactive display runs.
- Revised the low level VMDDisplayList clipping plane methods and higher-level Displayable clipping plane methods to prevent Displayable methods from triggering _needUpdate scene regen/redraw updates unless: the clipping plane is being changed; or one of the clipping plane properties is changed and the active clipping plane mode is currently active (non-zero).
- Increased the default threshold for forcing VMD to perform OptiX renderings in multiple accumulation buffer passes by a factor of 4x, and permit user override of the default threshold and behavior. The new code also uses the total number of primary aa samples and AO shadow feeler rays rather than only the number of primary ray aa samples, to provide better performance scaling on RTX hardware-accelerated GPUs going forward. Increasing the number of rays per launch significantly improves VMD's utilization of the RT cores on the latest hardware. With too small a ray batch size per pass, VMD can't exploit the full hardware performance on the latest RTX cards.
- Changed the OptiX renderer's ray statistics buffer allocation code to specify RT_BUFFER_OUTPUT instead of (RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL), since we never write to these buffers on the host side.
- Implemented slight optimizations for OptiX ray statistics gathering among the various primary ray generation and shading kernels.
- Windows platforms return raw key state without processing key modifiers, so in order to provide the same behavior as X11, we process key modifiers ourselves w/ toupper()/tolower() calls, etc.
- psfgen: psfgen file format tag logic correction from Josh Vermaas.
- qmtool: Fixed bug with FFTK atomname Ow. The bug originated from a fix implemented in qmtool, regarding two-letter atomnames. In the original version they were being recognized by the first letter. For example, Chlorine (Cl) was considered as Carbon (C).
- Added conditional definition of NOMINMAX macro when compiling the OptiXRenderer code on Windows platform, as required by OptiX-internal headers.
- Changed order of header inclusion to ensure that low-level OptiX headers are included prior to system-provided headers, as the OptiX headers incorporate special handling, e.g., of min/max macros and related functions on the Windows platform, which is very sensitive to header ordering.
- Eliminate direct FileRenderList calls to OptiXRenderer class, to limit the scope of associated low-level OptiX header inclusion, which has particularly detailed ordering and macro definition requirements for Windows builds.
- Added a wrapper for OptiXRenderer::device_count() to limit overly broad inclusion of low level OptiX API headers by classes that needed to make calls into OptiXRenderer. On the Windows platform in particular, there are some thorny header file ordering and macro definition issues that must be satisfied, and while that is easy to do in OptiXRenderer and OptiXDisplayDevice, this rapidly gets out of hand if other classes start including those headers as well.
- Eliminated inclusion of OptiXRenderer definition since hardware enumeration is now done as part of FileRenderList and support for VCA rendering clusters has been removed previously.
- Added include of windows.h for interactive RT compilation on Windows platforms.
- Imported Tachyon glwin updates to permit compilation on win32/win64
- cranked version
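The qmtool entry above describes a tension between one-letter and two-letter atom name recognition (Cl must not become C, but Ow must still resolve to O). One possible way to reconcile the two is sketched below: try the first two letters only if they form a known element symbol, and otherwise fall back to the first letter. This is an assumption made for illustration, not the logic used in qmtool; guess_element and the deliberately small ELEMENTS set are hypothetical.

```python
# Small illustrative subset of element symbols; a real table would be complete.
ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "Na", "Mg", "Fe", "Zn"}


def guess_element(atom_name):
    """Guess an element symbol from an atom name, or return None."""
    # Ignore digits and punctuation such as the "1" in "C1".
    letters = "".join(ch for ch in atom_name if ch.isalpha())
    if not letters:
        return None
    # Prefer a two-letter symbol, but only when it is a known element.
    two = letters[:2].capitalize()
    if len(letters) >= 2 and two in ELEMENTS:
        return two
    # Otherwise fall back to the first letter alone.
    one = letters[0].upper()
    return one if one in ELEMENTS else None


for name in ("Cl", "Ow", "C1", "HW1"):
    print(name, guess_element(name))
```

Name-based guessing remains ambiguous for names such as CA (alpha carbon versus calcium), so a scheme like this would still need context such as residue type or mass to be reliable.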
VMD 1.9.4 alpha 49 (Nov 5, 2020)
- Changed implementation of spheretube color handling for inheritance of bulk "draw color colorID" type coloring mode. The implementation of colorID arrays needs rewriting so it will properly track updates to color tables ex-post-facto, and that will also impact the single-color color inheritance mode.
- Corrected handling of single-radius spheretube combinations
- Explicitly include CUDA runtime header with OptiX versions greater than 5.2, since differences in CUDA runtime headers among versions lead to cudaGetDeviceCount() being unprototyped in some cases but not others. This ensures it will always be properly prototyped.
- Updated the "spheretubes" draw command to accept a list of colorIDs rather than only fully-specified RGB colors. This makes it a more directly usable replacement for existing sphere-at-a-time scripts that follow a pattern of calling "draw color", then "draw sphere", "draw cone", etc.
- Added tcl_get_intarray() to pull in a tcl list as a 1-D flat array of ints
- Changed the intermediary OptiXDisplayDevice class to accumulate individual spheres from FileRenderer::sphere() calls into local buffers the same way that it already did for cylinders and particular triangle geometry. When the primitive count exceeds a threshold, the spheres are sent to OptiXRenderer as a sphere_array_color() primitive, which greatly reduces overheads. By overriding the FileRenderer::sphere() method, we eliminate sphere triangulation for scenes that include handfuls of user-drawn spheres, and for spheres drawn by plugins or the like, leading to higher quality and faster rendering.
- Added additional DSSP src URLs
- Added notes about AVX/AVX-512 clock rate reductions and high vector registers causing false dependencies except when specifically cleared, e.g., by calls to _mm256_zeroupper().
- Added support for the graphics "info" subcommands for spheretube primitives and eliminated various debugging console messages.
- Renamed "tube" drawing primitive to "spheretube" and added new flags and parameters to allow the use of per-sphere user-specified RGB colors, flag control over tube drawing, and either constant radius or radius-per-sphere drawing.
- Updated the VS2017 VMD project to add util_simd_AVX2.C and nr_jacobi.C
- Updated conditional compilation tests and logic for runtime CPU dispatched atom selection analysis code to ensure proper compilation on both x86 and ARM64 platforms, since VMD now supports runtime dispatch on both.
- cv_dashboard plugin: Updated the cv_dashboard plugin from the git master tagged as "vmd-1.9.4a49".
- colvars: Updated the collective variables module to the latest git version, which also bears the tag "vmd-1.9.4a49".
- rst7plugin: Implemented changes from Josh Vermaas to address incorrect parsing of some combinations of AMBER restart files, e.g. as written by AMBER 18. The new code should now properly parse restart files that either have velocities or do not, and that either have timestamp information, or not. All four of those combinations have been tested successfully with the new code, and a little bit of further code cleanup and commenting has been done since the new logic is a little more complex.
- Added runtime CPU dispatch for analyze_selection_aligned_avx2()
- Renamed MoleculeGraphics::find_sizes() to MoleculeGraphics::find_bounds() which is a more accurate and meaningful method name.
- Implemented MoleculeGraphics::find_sizes() for the new "tube" command.
- Changed the AVX atom selection loops to use the prior strategy for finding vector-aligned starting/ending indices due to bugs that showed up in testing with CPU dispatch enabled. Will revisit later.
- Added support for new "tube" graphics drawing primitive that renders a series of spheres connected by cones, with arbitrary numbers of vertices, per-vertex radii, and user specified polygonal representation resolution. Still an early prototypical implementation.
- Revised MoleculeGraphics to allow shapes to have an "extradata" field to permit the implementation of new graphics "draw" commands that accept arbitrarily large vertex arrays, thereby making most efficient use of the underlying array-oriented DispCmd primitives, which minimizes the number of VMDDisplayList linked list nodes, and similarly reduces the total count of API calls all of the way through the rendering system to the lowest level rendering layers. The "extradata" field enables the most efficient storage and management of complex geometry, completely eliminating internal fragmentation of memory that normally occurs with the original primitive-at-a-time drawing approach. Added tracking of the active graphics drawing colorID so that batched geometry DispCmd calls that require rgb floating point buffers can be used more easily.
- Added new tcl_get_array() and tcl_get_vecarray() commands to ease parsing of large vertex arrays and arrays of per-primitive scalars for use in implementing new graphics primitives.
- Updated the cone display command API to take const vertex buffer parameters.
- Updated comments about the behavior of MoleculeGraphics::info_id()
- Protect x86-only runtime dispatch paths with appropriate compile-time ifdefs
- Added CUDA support and upgraded VS2017 to the latest VS2019 platform toolset versions, etc.
- Updated VS2019 project from pre-CUDA VS2017 project to simplify upgrade.
- Enabled CUDA on all of the Windows x64 build targets
- Added FastPBC to the windows builds, to match the current Unix builds.
- Eliminated non-portable variable sized stack allocated memory buffer.
- Eliminated timer code leftover from standalone PBC implementation
- Added explicit type conversions from ptrdiff_t sizes to int to greatly reduce MSVS compiler conversion warnings.
- Updated MSVS build flags and directories for release builds
- Pulled in Tachyon update to correct detection of FMA3 via x86 CPUID instruction tests.
- Use __align() variable declaration attribute for portability to MSVS
- Migrated the AVX-512 specific loops to a new runtime dispatch version of the QuickSurf kernels and removed them from the statically-launched code path.
- Updates for the SSE atom selection and statistics kernels to please current versions of MSVS, which fail to define SSE macros at compile time.
- Added explicit call to _mm_castsi128_ps(mask) to please MSVS compilers.
- Updates for the SSE molecular orbital kernels to please current versions of MSVS, which fail to define SSE macros at compile time, and that no longer support some of the oldest MMX intrinsics and _m64 types when compiling in 64-bit mode.
- Enabled runtime CPU dispatch for AVX, AVX2, AVX-512, and AVX-512ER in the VS2017 build rules.
- Added links to current ARM ACLE SVE documentation used to develop the first variable vector length SVE kernels.
- Began implementing ARM64 SVE vectorized loops for high performance analytical routines.
- Protect CPU capability data structure w/ conditional compilation tests for runtime CPU dispatch.
- Moved CPU hypervisor detection and reporting into x86 block since we don't yet have the equivalent capability on ARM hardware.
- Added runtime dispatch reporting of ARM64 SVE hardware vector lengths for 32-bit and 64-bit types.
- Added prototypes for ARM SVE runtime dispatch SVE vector length query helper routines.
- Added ARM SVE general vectorized kernels and helper routines to the build.
- Revised the configure script to support runtime CPU dispatch on ARM64 SVE platforms. Replaced X86AVXDISPATCH with a generic CPUDISPATCH option, and rewrote all of the build config macros so they can work with both x86 and ARM hardware.
- Updated CPU feature detection reporting code for ARM64 platforms, and pulled in latest updates from Tachyon.
- Added first ARM64 CPU feature detection console diagnostics to indicate the availability of Neon and SVE vector instructions, among others.
- Pulled in corrections for ARM64 CPU feature detection from Tachyon.
- Added initial platform- and OS-specific conditional compilation macros for runtime CPU feature detection on ARM64 hardware targets. The initial ARM64 implementation makes use of the Linux getauxval() API.
- Added include of stddef.h for definition of ptrdiff_t
- Simplified conditional compilation macros for runtime CPU dispatch kernel source files.
- Greatly simplified conditional compilation macros, include files, and ifdefs for runtime CPU dispatch and statically-launched SIMD kernels.
- Updated CPU dispatch build configuration to add compile-time definition of VMDCPUDISPATCH, to make conditional compilation on a wide variety of hardware and compiler combinations substantially simpler.
- Eliminated statically-launched QuickSurf AVX2 kernels in favor of runtime CPU dispatch.
- Removed the previous static-launch code paths for AVX-512F and AVX-512ER molecular orbital kernels.
- Replaced the use of long types in CPU QuickSurf algorithm with ptrdiff_t and size_t for x64 64-bit Windows builds. Updated SIMD routines conditional compilation macros and tests for use with MSVS 2017.
- Updated SIMD routines conditional compilation macros and tests for use with MSVS 2017, and replaced the use of long types with ptrdiff_t and size_t for x64 64-bit Windows builds.
- Pulled in CPU feature detection updates from Tachyon.
- Updated VS2017 debugging targets
- Updated all of the volumetric data representations and associated data structures, replacing the use of long types with ptrdiff_t and size_t to get correct behavior on LLP64 platforms such as Windows x64 64-bit builds.
- Added conditional compilation for _WIN64 (and any other LLP64 platforms) to add support for size_t and ptrdiff_t Inform output, but avoid duplication of long types on LP64 platforms (Unix/Linux) where ptrdiff_t and size_t are defined as long types.
- Corrected a C++ism that got into the MSM code while eliminating visual studio warnings.
- Rewrote command line argument passing for Python 3.x, since it requires explicit translation to wide characters. The new code processes incoming argc/argv by converting each argument to null terminated wchar_t strings using Py_DecodeLocale() and passing them into PySys_SetArgv(). Added placeholder call to Py_SetProgramName(), however it is disabled since it doesn't appear to be particularly beneficial or needed at this time.
- Minor rewrite of default_mass() method for improved performance on large structures, and to address JC Gumbart's patch to improve default mass assignments for iron, fluorine, iodine, bromine, potassium, etc.
- endianswap: Changed use of "long" integer types to "ptrdiff_t" for portability to 64-bit Windows.
- dcdplugin: Changed use of "long" integer types to "ptrdiff_t" for portability to 64-bit Windows. Since this plugin uses fastio.h, there were no changes for _WIN64 conditional compilation for calls to fseek/ftell type APIs, but we'll need to test and do any updates in fastio.h if the Windows-specific I/O APIs aren't sufficient as-is.
- jsplugin: Changed use of "long" integer types to "ptrdiff_t" for portability to 64-bit Windows. Since this plugin uses fastio.h, there were no changes for _WIN64 conditional compilation for calls to fseek/ftell type APIs, but we'll need to test and do any updates in fastio.h if the Windows-specific I/O APIs aren't sufficient as-is.
- ccp4plugin: Ensure initialization of some local variables.
- ccp4plugin: Changed use of "long" integer types to "ptrdiff_t" for portability to 64-bit Windows. Added conditional compilation checks for _WIN64 to force the use of Windows-specific _fseeki64() and _ftelli64() when working with potentially large files.
- Updated unsigned integer pointer type conditional compilation rule for 64-bit Windows builds.
- Use parallel make on 'sphere' MSVC builds.
- Set correct VS2017 paths for both x86 and x64 targets on 'sphere'
- Updates for standalone cygwin-based plugin compilation w/ VS2017 on 'sphere'
- Updated WIN64 target compilation and linkage flags.
- Updated plugin build rules to properly compile VMD C/C++ plugins for 64-bit Windows platforms.
- Updated conditional compilation checks for _WIN64 builds.
- Corrected formatting of plugin loader warning messages.
- Installer script src for the Nullsoft Scriptable Install System (NSIS)
- Revised the VS2017+VS2019 project name to "vmd" from "winvmd" to better match the expectations of VS2017 and VS2019, so there aren't warnings about the auto-generated output filename vs. the default macro-generated filenames. Cleaned out now-unused libraries such as libGLU which has been superseded by built-in code in VMD itself. Eliminated out-of-date compiler flags that remained from initial project conversion process. Added basic x64 targets and assumed dependency paths.
- Added explicit typecasts to the GROUP_T template type to eliminate MSVS warnings.
- Improved consistency of floating point expressions containing mixed types with explicit type conversions. These changes address a variety of type conversion warnings from MSVS.
- Added and enabled collective variables in the VS2017 IDE builds
- Added VS2017 and VS2019 IDE project files
- cranked version
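The LLP64 entries above rest on one fact: on 64-bit Windows a C `long` stays 32 bits wide, while pointers, `size_t`, and `ptrdiff_t` are 64 bits. The following is a minimal Python sketch using `ctypes`, not VMD code; the 1500^3 voxel count is a made-up example, and `as_32bit_long` is a hypothetical helper that mimics storing an index in a 32-bit `long`.

```python
import ctypes

# Pointer-sized integer types keep their width on every data model.
PTR_BYTES = ctypes.sizeof(ctypes.c_void_p)
SIZE_T_BYTES = ctypes.sizeof(ctypes.c_size_t)
SSIZE_T_BYTES = ctypes.sizeof(ctypes.c_ssize_t)  # stands in for ptrdiff_t
LONG_BYTES = ctypes.sizeof(ctypes.c_long)        # 4 on LLP64, 8 on LP64


def as_32bit_long(value):
    """Return value as it would be stored in a 32-bit signed long (LLP64)."""
    return ctypes.c_int32(value).value


# Hypothetical example: linear voxel count of a 1500^3 density map.
voxels = 1500 * 1500 * 1500
truncated = as_32bit_long(voxels)

print("long:", LONG_BYTES, "size_t:", SIZE_T_BYTES, "pointer:", PTR_BYTES)
print("voxels:", voxels, "as 32-bit long:", truncated)
```

The truncated value comes out negative, which is the kind of index error that switching to `ptrdiff_t` and `size_t` avoids.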
VMD 1.9.4 alpha 48 (October 13, 2020)
- Corrected a comment line that was interfering with the correct behavior of the TkCon initialization routine.
- cranked version
VMD 1.9.4 alpha 47 (October 12, 2020)
- Added implementation of the new drawpixels_rgba4u() method used for video streaming etc.
- extra language protection ifdefs
- Changes to satisfy current revs of MSVS
- Added sincosf() and sincos() compatibility macros for windows compilers.
- Updated Win32 display code with the updated framebuffer readback routines.
- Switch to use of VMD_PI instead of M_PI for windows compilers.
- jsplugin: Added missing ifdef for conditional compilation of standalone test binary with GPU-Direct Storage support.
- Added conditional compilation support for assignment of VMD-generated representation string tags to ANARI scene hierarchy objects via "name" tags, as implemented by the current developmental USD back-end. USD uses these name tags not only to name scene components, but also to generate underlying pathnames in its scene directory structure, so there are some noteworthy limitations on what name strings are legal. In particular, a variety of characters aren't safe for use in the name tags as they have special meaning in pathnames, e.g., "/", ":", and so on. Further, the USD library used by the back-end has some name uniqueness expectations that may need to be met by appending or incorporating integer counters as suffixes to ensure that multiple scene objects are assigned globally unique names. Ultimately, there is likely an opportunity here for VMD to do better by storing additional GUI-safe and/or USD-safe strings to meet the restrictions in USD, and to cause external tools that use such tags to display better in graphical interfaces.
- Allow VMD to set background color for the ANARI/USD back-end.
- molfile plugins: Added dependency on molfilelibs for the non-win32 targets to prevent a race condition with building the libmolfile_plugin.a dependency that the standalone binaries expect to link against.
- fftk: tiny fix if segname is missing when clicking "analyze input" in BuildPar
- Updated ANARI renderer with initial support and workarounds for the early limitations of the ANARI back-end targeting Pixar's USD (Universal Scene Description) format.
- Updated to current ANARI API, eliminating the renderer parameter to anariNewMaterial().
- Ensure release of the active device so that OSPRay 2.x doesn't crash during shutdown in the OpenVKL layer.
- bfeestimator: Updated binding free energy estimator version tags.
- cranked version
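The USD name-tag entry above describes two constraints: characters with special meaning in pathnames are unsafe, and names may need integer suffixes to stay globally unique. A minimal Python sketch of both ideas follows. It is not VMD's implementation: `usd_safe_name` and `UniqueNamer` are hypothetical names, and the allowed character set (letters, digits, underscore) is an assumption that the real USD back-end may not share.

```python
import re


def usd_safe_name(tag):
    """Replace characters outside [A-Za-z0-9_] with underscores.

    Assumption: the real USD back-end may accept a different character set.
    """
    safe = re.sub(r"[^A-Za-z0-9_]", "_", tag)
    # Names that are empty or start with a digit get a leading underscore.
    if not safe or safe[0].isdigit():
        safe = "_" + safe
    return safe


class UniqueNamer:
    """Hand out unique names by appending an integer counter when needed."""

    def __init__(self):
        self._used = set()

    def name_for(self, tag):
        base = usd_safe_name(tag)
        candidate = base
        suffix = 0
        # Keep bumping the suffix until the name has not been issued before.
        while candidate in self._used:
            suffix += 1
            candidate = "%s_%d" % (base, suffix)
        self._used.add(candidate)
        return candidate


namer = UniqueNamer()
print(namer.name_for("mol0/rep1: NewCartoon"))
print(namer.name_for("mol0/rep1: NewCartoon"))
```

Tracking the set of issued names, rather than a per-base counter, also covers the case where a sanitized tag happens to collide with an earlier suffixed name.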
VMD 1.9.4 alpha 46 (September 5, 2020)
- vmdtkcon: Disable multi-tab associated interpreter creation/deletion buttons since these are totally disjoint from the VMD Tcl interpreter and would just cause trouble otherwise. Hid other buttons that are either confusing or not particularly functional in the context of the VMD plugin use case. Updated the 'vmdtkcon' plugin version to 2.0, and updated build scripts and plumbing to correctly launch the new code built on TkCon 2.7. Added extra initialization check since the new TkCon 2.7 ::AtSource method doesn't trigger initialization when run as a VMD plugin. Reapplied previous VMD plugin modifications to the base Tcl/Tk 8.6-compatible version of TkCon 2.7.3. Began import of Tcl/Tk 8.6-compatible version of TkCon 2.7 to ensure correct behavior on MacOS X Catalina, and other current VMD builds. Added the original Tcl/Tk 8.6-compatible TkCon 2.7.3 to the tree for long-term reference. Updated comments with more details about modifications required for operation within the VMD plugin system.
- fftk: Incorporated major FFTK updates from Andrew Pang and J.C. Gumbart, extending it with support for lone pair (LP) particles, as used in recent versions of CGenFF. The new FFTK also adds support for improper fitting, a highly user-requested feature. Since the prior commits had not yet taken on major version number update, the major version was updated to 2, and as of these changes, the minor version is now 1, so all of the Makefile and Tcl package version tags and documentation notes are now updated to 2.1 as they should be.
- namdplot: Updated NAMDPlot plugin for NAMD 3.0 (and corrected typos from earlier versions)
- fftk: Fixed bug with bond parametrization using ORCA
- multiplot: Added new procs based on the qwikmd molecular orbital plot: plotOrbitals, addQMOrbDeltaLine, and deletePlotLines. At the moment, only QMtool uses them.
- molefacture: Changes to include QMtool compatibility. Added "File | Export to QMtool" functionality based on the "File | Apply changes to parent" case. Modified "Apply changes to parent" to include call from QMtool to edit structure. In both cases, Molefacture closes and raises QMtool.
- qmtool: New qmtool 2.0 plugin. Separated the GUI code into the qmtool_gui.tcl file. Added specific QM code that deals with input generation to the files qmtool_gamess.tcl, qmtool_gaussian.tcl, and qmtool_orca.tcl. Molecular orbital analysis was taken from QwikMD and rewritten into multiplot (i.e. MultiPlot::plotOrbitals).
- Check for FMA3 instructions before dispatch of the AVX2+FMA QuickSurf kernels.
- Enabled runtime dispatch for molecular orbital kernels on AVX-512F CPUs.
- Completed AVX-512F version of the molecular orbital kernel. On an Intel i7-9800X, runtime dispatch performance for AVX-512F (with fully populated memory system) is roughly 3x faster than the 4-way vectorized SSE kernel.
- Streamlined QuickSurf vmd_gaussdensity_avx2() AVX2 code path for CPU runtime dispatch, and corrected compilation macro tests to respect runtime dispatch mechanism.
- Corrected bytes_next_alignment() helper to use the alignment size parameter rather than the hard-coded 32-byte AVX-specific alignment.
- Added support for runtime CPU dispatch of hand-vectorized MO kernels.
- Added support for runtime dispatch of Intel x86 AVX-512 (subsets F,VL,BW,DQ,CD) suited for descendants of Skylake Xeon, and AVX-512ER (subsets F,CD and ER) suited for KNL Xeon Phi and any future CPUs supporting the exponential/reciprocal instructions used therein.
- Rewrote analyze_selection_aligned_dispatch_avx() to calculate aligned array index offsets from low level pointer arithmetic, rather than looping and testing for alignment within the loop itself. Added and tested an AVX2 loop for counting selected atoms, for an eventual AVX2 dispatch variant.
- Began importing standalone benchmark routines for atom selection and volume analysis routines, particularly the SIMD acceleration routines for which runtime CPU dispatch algorithms exist, to allow in-situ verification of performance.
- Updated clean target to eliminate SIMD runtime-dispatch object files.
- Added initial build system support for runtime dispatch of AVX, AVX2, and AVX-512 SIMD instruction set extensions on x86 platforms, with the new configuration flag X86AVXDISPATCH.
- Updated main QuickSurf CPU routines to support runtime dispatch to hand-coded AVX2 kernels on CPUs with AVX2 SIMD instruction support.
- Added runtime-dispatch versions of SIMD kernels for atom selection and QuickSurf / MDFF density map computation.
- Added analyze_selection_aligned_dispatch() with support for runtime dispatch for AVX-capable CPUs.
- Added AVX 8-way horizontal add helper routine
- Slight optimization of SSE2 selection count loop in analyze_selection_aligned(), to favor faster vertical additions within the innermost loop, followed by a horizontal add at the very end.
- Modified SymbolTable and ParseTree classes to cache the VMDApp pointer to access global CPU capability bit flags, enabling them to be used for runtime dispatch of CPU-specific versions of performance critical routines, such as x86 SSE, AVX, or AVX-512 vector instructions for atom selections. By caching CPU capability flags globally, there's no need to re-query on-the-fly, thereby ensuring near zero-overhead launch cost, so that the fast CPU-specific routines benefit both large and small problem sizes alike, without the overhead of waking thread pools or launching additional CPU threads.
- Eliminated the old atom selection bound determination loop that has been replaced by hand-coded SIMD routines for some time now.
- Migrated enumeration of host CPU count, memory availability and capacity, and optional instruction set extensions and SIMD vector support to VMDApp initialization where similar things are also done for CUDA and the various renderers. The new CPU capability queries are now functional across multiple platforms and compilers, enabling them to be used for runtime dispatch of CPU-specific versions of performance critical routines, such as x86 SSE, AVX, or AVX-512 vector instructions for atom selections, volumetric data analysis, MDFF simulated density map calculations, and QuickSurf representations, just as a few examples. The fastest CPU-specific atom selection routines provide performance levels that are bound by memory bandwidth, even with only a single thread. By caching CPU capability flags globally, there's no need to re-query on-the-fly, thereby ensuring near zero-overhead launch cost, so that the fast CPU-specific routines benefit both large and small problem sizes alike, without the overhead of waking thread pools or launching additional CPU threads.
- Explicitly initialize spline matrix to zeros.
- added struct tag for improved forward declarations in C++
- Added vmd_mpi_nodeinfo() to query only the nodecount and node rank without performing the full hardware scan. This allows VMD to make good choices about CPU capability console messages when initialized as part of VMDApp rather than in earlier startup phases.
- Added VMDApp pointer to atom selection constructor and subclasses, to facilitate access to global data structures that enumerate CPU instruction set extensions and/or other hardware special features required for effective use of runtime dispatch for hand-coded SIMD loops for performance critical atom selection operations. This allows previously cached CPUID results and data structures to be provided as parameters to leaf node atom selection operators.
- Added detection and reporting of IEEE 16-bit floating point conversion instructions, and VM/hypervisor execution environment.
- Replaced individual calls to find_first_selection_aligned() and find_last_selection_aligned() with analyze_selection_aligned() to facilitate implementation of runtime CPU dispatch optimizations, e.g., for x86 SSE2, AVX, AVX-512 vector instructions.
- Modified analyze_selection_aligned() to allow it to universally replace individual calls to find_first_selection_aligned() and find_last_selection_aligned(), thereby making it far simpler to implement runtime CPU dispatch, e.g., for specific AVX-512, AVX, and SSE implementations of the selection loops.
- Updated vector intrinsic conditional compilation checks for Clang versions below 5.x, after checking with compiler explorer (http://godbolt.org/).
- Make SSE _mm_load_pd1 workaround Clang-specific since it's not really related to MacOS X specifically.
- Handle older revs of clang on both MacOS X and other platforms
- Pulled in Tachyon updates to ensure x86 max CPUID function codes parameters are queried and checked in case a primitive CPU is encountered at runtime.
- Pulled in Tachyon CPU ID routines to use for runtime dispatch of CPU-specific SIMD vector code, detection of SMT depth, etc. New version favors inline x86 assembly instead of using compiler/runtime provided feature queries, since the direct use and parsing of x86 CPUID assembly output provides much more detail.
- Report availability of x86 SSE 4.1 instructions and hyperthreading.
- Added conditional compilation checks for CPU capability flag queries protecting OSPRay for its SSE 4.1 dependency.
- Changed the ANARI startup to fall back to the "reference" device by default.
- Synced with latest CPU capability query updates from Tachyon
- Automatically disable OSPRay renderers when SSE 4.1 instructions are known not to be available. This prevents runtime crashes on host machines that lack SSE 4.1, particularly some low-end netbooks, etc.
- Updates for SSE 4.1 detection from Tachyon threads lib. Assist with runtime reqs for Intel RT libs.
- cranked version
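The runtime-dispatch entries above combine two ideas: CPU capability flags are queried once and cached, and a single selection-analysis routine returns the first index, last index, and count so that only one routine needs CPU-specific variants. The Python sketch below illustrates the control flow only. The capability bit values and function names are hypothetical, and the "blocked" kernel merely imitates the lane-wise vertical adds plus one final horizontal add that the SIMD kernels are described as using.

```python
CPU_SSE2 = 1 << 0   # hypothetical capability bits, not VMD's actual values
CPU_AVX2 = 1 << 1


def analyze_scalar(flags):
    """One pass over 0/1 selection flags: (first, last, count)."""
    first, last, count = -1, -1, 0
    for i, on in enumerate(flags):
        if on:
            if first < 0:
                first = i
            last = i
            count += 1
    return first, last, count


def analyze_blocked(flags, width=8):
    """Same result, but counts in blocks of `width` flags, the way a SIMD
    kernel accumulates per-lane sums and reduces them once at the end."""
    n = len(flags)
    first = next((i for i in range(n) if flags[i]), -1)
    if first < 0:
        return -1, -1, 0
    last = next(i for i in range(n - 1, -1, -1) if flags[i])
    lanes = [0] * width
    for start in range(first, last + 1, width):
        block = flags[start:min(start + width, last + 1)]
        for lane, on in enumerate(block):
            lanes[lane] += 1 if on else 0   # "vertical" adds
    return first, last, sum(lanes)          # one "horizontal" add


def select_kernel(cpu_caps):
    """Pick the kernel once from cached capability bits."""
    if cpu_caps & CPU_AVX2:
        return analyze_blocked
    return analyze_scalar


# Capability bits would be queried once at startup and cached.
cached_caps = CPU_SSE2 | CPU_AVX2
analyze = select_kernel(cached_caps)
selection = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0]
print(analyze(selection))
```

Because the kernel is chosen once from cached flags, each later call costs only an ordinary function call, which matches the near zero-overhead launch cost the entries describe.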
VMD 1.9.4 alpha 45 (July 16, 2020)
- When _CONDA_ROOT is detected, VMD will use it to locate Python 3.7 libs.
- Added OSPRay 2.x renderer mode flag to switch between sci-vis and pathtracer
- Updated OSPRay 2.x renderer error/status callbacks for OSPRay 2.2.0
- Allow runtime override of ANARI renderer registration in the GUI, for debugging of issues with the OSPRay 2.x pass-through back-end, and others that may similarly encounter issues with multiple instantiation init/shutdown.
- Updated the ANARI renderer for the latest API in version 58abf849. This update adds the device parameter to all APIs, and corrects the device release behavior. Basic testing was done with the OSPRay 2.x pass-through device.
- bfeestimator: Updated binding free energy estimator plugin w/ new version from Chris Chipot and collaborators.
- ccp4plugin: Updated MRC/CCP4 plugin to emit extended header tags stored by some commercial microscopy tools in the console messages to help with identifying unusual proprietary file formats. Inhibit printing of MRC symmetry records when the IMOD stamp has been identified.
- topotools: Merged Axel's latest topotools updates.
- Propagated unification of multi-frame and single-frame I/O code paths back to the QCP code.
- Explicitly clear stack allocated string buffers.
- Added runtime override of frame/file distribution strategy for GDS I/O benchmarking.
- Unified common parts of single- and multi-frame trajectory I/O offset and size calculations to streamline the code and improve readability.
- Ensure counts of residues/fragments/etc are initialized to zero since the current colvars code lacks safety checks for existence or initialization of atomic coordinates, structure analysis results, etc.
- Added __restrict__ keyword to QCP inner product kernel parameters to prevent compiler assumed pointer aliasing.
- Incorporated GPU-Direct Storage benchmarking loops into the 'vmdbench' commands. Currently performance data are only printed on the console and are not yet emitted as script-accessible output. Needs a little more cleanup since it was originally situated within the CUDA QCP algorithm for testing and verification of GDS performance early on.
- Allow jsplugin to be included into other source files with file scope linkage for early testing of out-of-core APIs in multiple parts of VMD.
- Eliminated old GPU-Direct Storage alpha APIs that have been superseded in subsequent beta releases.
- An initial implementation of a VMD FileRenderer subclass for the first prototype Khronos ANARI rendering interface. Ongoing revisions to the interface will require significant revisions until the ANARI standard is finalized.
- Declare trimesh_c4u_n3b_v3f() normal parameters explicitly as signed char since some compilers (ARM, POWER) default to unsigned types for char when not declared explicitly.
- Moved ANARI FileRenderer initialization prior to OptiX 6.x initialization to prevent conflicts with ANARI back-end OpenGL/EGL context creation performed within the main VMD thread. This should become unnecessary beginning with OptiX 7.x.
- OSPRay 2.x: Commented out "ka" material property currently ignored by the OBJ material.
- Enable use of __restrict__ in MDFF, QuickSurf, MC CUDA kernel parameters.
- topotools: Updated Josh Vermaas' current email and corrected spelling typos.
- jsplugin: Corrected missing check for verbose console output state.
- Corrected placement of __restrict__ keyword for OptiX hardware triangle degeneracy safety check hwtri_test_calc_Ngeom()
- Added runtime checks for ARM64 NVML shared library installation locations
- Free OptiX GPU device list returned from device_list().
- Added linux.arm64.egl target for ORNL Wombat and similar systems
- qwikmd: Fixed 'pack' vs 'grid' problem. Requires MacOS testing
- Eliminated detailed timers from OSPRay 2.x rendering loop.
- Updated OSPRay 2.x renderer to use the 'pathtracer' renderer by default until such time as the 'scivis' renderer implements lighting and material properties completely.
- Corrected type enums passed to ospSetParam() for several data items and for materials.
- Ensure use of correct OSP_DATA parameter to ospSetParam() calls that set geometry buffers.
- Corrected OSPRay 2.x camera parameter assignments in interactive renderer
- Added ANARI renderer to the UIs when available.
- Hardened the grid search routines to ensure graceful error handling in cases where an atomic coordinate takes on the value of NaN, causing grid bounds and other calculations to yield unusable results.
- Changed NVML wrapper to try locating both libnvidia-ml.so as well as libnvidia-ml.so.1, since some container images are missing the standard symlinks to the .so.xxx versions.
- cranked version
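The NVML wrapper entry above describes trying more than one shared library name, because some container images lack the usual symlinks. A minimal Python sketch of that fallback pattern, using `ctypes`, is shown here. The helper name `load_first_available` and the order in which the two names are tried are assumptions; the changelog does not state the order VMD uses.

```python
import ctypes


def load_first_available(names, loader=ctypes.CDLL):
    """Try each shared library name in order; return (name, handle) for the
    first one that loads, or (None, None) if none can be loaded."""
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            continue
    return None, None


# Candidate names taken from the changelog entry. Trying the unversioned
# name first is an assumption; the versioned name covers container images
# that lack the unversioned symlink.
NVML_CANDIDATES = ["libnvidia-ml.so", "libnvidia-ml.so.1"]

name, handle = load_first_available(NVML_CANDIDATES)
print("NVML loaded from:", name)
```

On a machine without the NVIDIA driver installed, both attempts fail and the sketch reports `None` rather than raising.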
VMD 1.9.4 alpha 44 (June 18, 2020)
- Hardened QuickSurf against NaN-valued atomic coordinates by checking the results of the bounding box and grid size determination and terminating early if bogus grid sizes or bounding coordinates are found.
- autopsf: Modified GUI to remove 'pack' and replace it with 'grid'. Tested on Linux; still needs testing on newer macOS with Tcl 8.6.
- Corrected a filename string leak in VMDTempFile class used for NanoShaper and MSMS.
- colvars: Updated to version 4036bfd947763ec015b3150e22624213341548a0
- fftk: Fixed pattern matching for Gaussian 09.
- cranked version
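The QuickSurf hardening entry above describes validating the bounding box and grid size and terminating early when they are bogus. The Python sketch below shows one way such a guard can look. It is a variation under stated assumptions rather than VMD's code: it checks each coordinate directly instead of checking the computed results, it also rejects infinities, and `bounding_box` and `grid_dimensions` are hypothetical names.

```python
import math


def bounding_box(coords):
    """Return (mins, maxs) for a list of (x, y, z) tuples, or None if the
    list is empty or any coordinate is NaN or infinite."""
    if not coords:
        return None
    mins = list(coords[0])
    maxs = list(coords[0])
    for xyz in coords:
        for k in range(3):
            v = xyz[k]
            if not math.isfinite(v):
                return None          # terminate early on unusable input
            mins[k] = min(mins[k], v)
            maxs[k] = max(maxs[k], v)
    return mins, maxs


def grid_dimensions(coords, spacing):
    """Grid size along each axis, or None when the bounds are unusable."""
    box = bounding_box(coords)
    if box is None or not spacing > 0.0:
        return None
    mins, maxs = box
    return [int(math.ceil((maxs[k] - mins[k]) / spacing)) + 1
            for k in range(3)]


atoms = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
print(grid_dimensions(atoms, 1.0))
print(grid_dimensions(atoms + [(float("nan"), 0.0, 0.0)], 1.0))
```

Returning `None` gives the caller a single value to test before allocating any grid memory.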
VMD 1.9.4 alpha 43 (June 11, 2020)
- Silence OptiX shared library loading errors in cases where CUDA is also unavailable. Going forward with OptiX 7, CUDA becomes a prerequisite, so there's no point in emitting an OptiX-specific error if CUDA is also unavailable.
- Ensure initialization of page-aligned I/O flags in CoorPluginData, and correctly set the page alignment flag to indicate byte-aligned rather than page-aligned I/O for molfile plugins that don't implement page alignment APIs.
- Localized scope of intermediate variables used for scanning CUDA NVLink P2P connectivity and topology determination during startup.
- Corrected conditional compilation tests for RTX triangle geometry destruction code path, and added destruction handling for a loose geometrytrianglesgroup object.
- Disabled the use of OptiX RTX-specific AS build optimization that frees the incoming vertex/index buffers during AS builds. Performance had been observed to suffer during tight rendering loops after a few tens of iterations. Conditional compilation now triggers the use of either rtGeometryTrianglesSetBuildFlags(..., RT_GEOMETRY_BUILD_FLAG_RELEASE_BUFFERS) or VMD's own buffer deallocation management. The new code defaults to manual buffer deallocation, which has a much more repeatable performance over thousands of iterations. Corrected a missing call to destroy the RTX triangle-specific acceleration structure.
- Added build configuration for ANARI renderer implementation.
- Ongoing cleanup of OSPRay 2.x renderer back-end. Updated scivis renderer parameter strings for changes from v1.x to v2.x API. Eliminated scene epsilon since it has been eliminated in OSPRay 2.x. Updated console status messages regarding shadow rendering since OSPRay 2.x has no way to render scenes without shadows presently.
- Eliminated shaderpath member since current OSPRay rendering back-ends aren't using it. If/when we have OSPRay extension modules this may need to be revived.
- Eliminated slim context management methods not needed for OSPRay 2.x.
- Eliminated slim context management methods not needed for OSPRay 1.x.
- Moved the OSPRay 2.x error callback immediately following context creation.
- Updated OSPRay 2.x error/status callbacks and log levels, commented out material parameters unsupported by current 'scivis' renderer, etc.
- Made a placeholder cylinder implementation with creative abuse of the OSPRay 2.x "curve" geometry. Unfortunately, while the OSP_LINEAR type works for generating cylindrical spans from pairs of consecutive control points, when combined with the OSP_ROUND curve type, OSPRay inserts spheres at the ends when they are not wanted.
- Misc cleanup and updates to OSPRay2Renderer, with an attempt at using the OSPRay "curve" primitive as a cylinder stand-in.
- Found that OSPRay 2.x allows color-per-primitive (and material-per-prim) by setting properties on the containing GeometricModel object.
- Added direct implementation of ospSetObjectAsData() equivalents for potential use later in ANARI prototyping.
- Favor calling ospSetParam() rather than ospSetObject() wrapper. It might be desirable to cut the use of the various wrappers altogether and use ospSetParam() everywhere.
- Added early draft implementation of OSPRay 2.x renderer. OSPRay versions 2.1.1 and earlier are deficient for molecular graphics due to the lack of per-sphere colors, lack of cylinder geometry, and a few other minor issues. Hopefully these can be addressed in subsequent versions, allowing this early draft implementation to be turned into something truly usable.
- OSPRay versions 2.1.1 and earlier lack cylinder primitives, so VMD has to use the fall-back geometry path for cylinders for the time being.
- Added first version of OSPRay 2.x DisplayDevice subclass.
- Added conditional compilation and registration of OSPRay 2.x renderer.
- Updated the configure script to properly handle OSPRay w/ MacOS X.
- Added OSPRay 2.x build configuration support.
- cranked version
VMD 1.9.4 alpha 42 (May 29, 2020)
- Drop support for the use of NVTX v2 found in older versions of CUDA prior to CUDA 10.0
- Added include of utilities.h in GraphLayout to take care of linking the right sincosf() math routines on MacOS X.
- Updated topotools plugin with Axel's latest v1.8 package.
- Prevent division-by-zero for coincident points in non-optimized Fruchterman-Reingold code path.
- Added Tcl bindings to allow user control over the distance_epsilon parameter for the Fruchterman-Reingold implementation. For reference and precision experiments, added a less optimized version of the force calculations via conditional compilation.
- Continued cleanup of F-R graph layout implementation, and added Tcl bindings to override default area, K scaling factor, and temperature scaling factor. Apply linear temperature scale factor after applying quadratic curve.
- Allow user-generated weight matrix to be passed in with optional -weights parameter, currently only supporting fully-connected graphs needed for clustering analysis.
- Added graph layout class and commands to the build.
- Added Tcl bindings for graph layout commands.
- Added graph layout commands used for plotting clustering analysis results.
- A basic implementation of a variation of the Fruchterman-Reingold graph layout algorithm, initially intended for interactive visualization of results from molecular dynamics clustering analysis. The modified algorithm accepts a weight matrix, and has special case efficient handling for fully-connected graphs arising from clustering analysis with implicit rather than explicit edge connectivity.
- Added fallback pure-Tcl version of the built-in "lmap" command for Tcl versions prior to 8.6 that lack it.
- Added further conditional compilation and source comments for the latest GDS beta.
- Updated the GPU-Direct Storage test code for beta API 0.7.
- Added docs for the new cellaxes, volume, integral, and mean suboptions to the voltools command.
- Misc cleanup of linenoise and tecla line editor constructor implementations
- Added voltool subcommands to obtain the cell axes, volume, and integral per Giacomo's patch.
- Added double precision version of cell_axes() to facilitate a double precision implementation of cell_volume().
- Added cell_volume() and integral() methods per Giacomo's patch, for improved simulation analysis.
- Use forward declaration to eliminate type conversion hack, since the tecla object is a pointer anyway.
- Completed the command completion callback for Tecla command line editing.
- Added skeleton of tecla command completion callback
- Changed the default behavior of VMD such that if 'rlwrap' was detected and used externally, any built-in line editor(s) are automatically disabled, without the need to enumerate each one independently.
- Added rudimentary line editing support using tecla, a very featureful library that supports more customizability and incidentally implements fully non-blocking character-at-a-time I/O with special TTY handling routines to set and reset raw TTY modes. It also has caller-provided context pointers in its callback APIs and is designed to allow it to be driven by an external event loop like VMD's.
- Further modifications were made to the linenoise Unix TTY handling code to allow VMD to control TTY buffer flushes when switching between raw and cooked TTY mode. This is required so that VMD can switch to raw mode for character-at-a-time input, so that linenoise doesn't have blocking behavior that prevents the VMD main loop from free-running. With these changes, VMD free runs except when in actual command editing. Correct handling of VMD console output is made more complex by entry and return from raw TTY mode, but it is a much more usable scenario for the end user.
- Eliminate spurious compiler warnings from certain GCC versions by moving the Inform class temporary buffer onto the stack.
- wrap unused VMDApp pointer with conditional compilation until it's needed for profiling bindings.
- Eliminate warning in verbose console output/debug code for video streaming.
- Eliminated disused state variables that were previously required to support remote rendering and compositing using the VCA APIs.
- Implemented runtime-generated command completion lists for VMD's Tcl bindings through Tcl introspection on the list of commands and procs in the global namespace. Subsequent list filtering is used to eliminate candidates such as Tk GUI registration procs that are only called once during startup, and are not intended to be called by the user. Modified linenoise from the standard distribution by adding user context parameters for each of the callbacks, so that the caller can pass in runtime-generated lists of command completion strings without the use of global variables or other undesirable methods.
- Added a new function to auto-generate a complete list of potential tab-completion strings for use in interactive line editing. The function uses Tcl introspection to fetch the current command list in the global namespace, and filters out special commands such as those associated with one-time Tk GUI registration operations that shouldn't be matched.
- Emit startup messages to indicate when internal console command line editing is enabled or disabled, and whether by user or by virtue of external rlwrap usage.
- additionally set environment variable VMDRLWRAPINUSE so VMD can emit more informative startup messages when internal line editing is auto-disabled.
- Added support for optionally built-in linenoise line editor
- Added minimalistic 'linenoise' line editing as an optional built-in VMD line editing component.
- Allow users to disable the autodetection and use of rlwrap by checking for the existence of a VMDNORLWRAP environment variable.
- Integrated interactive command line editing via 'linenoise' into the pre-existing Tcl command processing loop, re-enabling lockouts for non-TTY console, MPI, and other special cases, along with runtime user override and an appropriate console startup info message.
- Initial support for built-in command line editing similar to GNU readline, but based off of the BSD-licensed 'linenoise' library, which is a clean, minimalistic library providing just the core line editing features most VMD users want/expect, with support for all of the mainstream OS versions and terminals. For Windows and/or better UTF-8 support, we may need to look at the 'linenoise-ng' fork, or other libs.
- Added comment about Tcl command history handling
- Eliminated use of old long-deprecated Tcl_Write() APIs, and similarly anachronistic code paths.
- Eliminated old Tk 8.4.x code path for creating image objects.
- Updated source comments to clarify the expectations for volumetric processing routines best included in the VolumetricData class (fully-general support for non-orthorhombic volumes with non-uniform spacing along the three axes), vs. those that only support orthorhombic volumes with uniform grid spacing on all three axes, and task-specific functions (e.g. for MDFF), which are included in the Voltools class.
- Misc auditing and cleanup of "measure volinterior", eliminated per-voxel division by explicitly multiplying by reciprocal.
- heatmapper: Minor stylistic cleanup revisions in preparation for Tcl 8.6 grid vs. pack widget layout corrections.
- Added initial Python bindings for qcp_rmsdmat
- Corrected self-RMSD values since they are currently included in the output.
- Updated configure script for NVTX v3.
- Assume CUDA 10.2 minimum, soon we will roll forward to CUDA 11.0.
- Switch build scripts to OSPRay 1.8.5 and later.
- torsionplot: Revised torsionplot plugin GUI to eliminate grid/pack conflicts in Tk 8.6. Cranked plugin version to 1.2. Further testing and GUI optimizations should be done, but this version now runs on Tk 8.6 without errors thus far.
- chirality: Revised chirality plugin GUI to eliminate grid/pack conflicts in Tk 8.6. Cranked plugin version to 1.4. Further testing and GUI optimizations should be done, but this version now runs on Tk 8.6 without errors thus far.
- cispeptide: Revised cispeptide plugin GUI to eliminate grid/pack conflicts in Tk 8.6. Cranked plugin version to 1.4. Further testing and GUI optimizations should be done, but this version now runs on Tk 8.6 without errors thus far.
- clonerep: Revised the clonerep plugin to correct conflicting use of Tk pack/grid layout managers on the same master container. It was never technically legal to mix the use of grid and pack within the same master as they can fight over layout control potentially leading to deadlock. Tk 8.6.x and later versions now explicitly detect this type of misuse of grid/pack and throw a runtime error. The problem can most easily be rectified by ensuring that pack and grid are applied to independent frame containers.
- Updated membranemixer docs for the updated 'grid' based widget layout manager.
- Localize the only use of tmpnam() and tempnam() calls into a single helper function.
- Eliminate compilation of the old CoreFoundation / Carbon APIs for MacOS X application bundle introspection when compiling for 64-bit MacOS X.
- docs: Updated docs with new section for 'measure volinterior'
- Added membranemixer to the top level plugin docs page
- membranemixer: Updated membranemixer plugin to address the problem with mixing the "pack" and "grid" Tk widget managers by rewriting using only "grid".
- psfgen: Fix hyperlinks for importing into NAMD User's Guide
- psfgen: Needed to remove the newline in PSFGENAUTHORS definition to build NAMD ug
- Updated console output for SUMMIT plugin compilations
- Compiled STRIDE for Catalina, patched shared library dependency paths in CatDCD, Cionize, Tcl/Tk 8.6.10 script updates, etc.
- Added membranemixer to the extension menu GUI
- membranemixer: Added membranemixer to the build
- membranemixer: Added docs for membranemixer plugin
- membranemixer: first rev of membranemixer plugin added to the tree.
- Corrected the FltkOpenGLDisplayDevice reshape() and resize() methods for High-DPI usage with Apple Retina displays.
- Added static OSPRay_Global_Shutdown() methods to ensure clean OSPRay shutdown. On MacOS X Catalina this prevents a crash at shutdown.
- Added conditional compilation for the interactive OSPRay renderer, since the interactive OSPRay display window doesn't have a MacOS X-native or FLTK-specific variant.
- qwikmd: Added arrow_right.gif and arrow_down.gif icons to the build.
- cranked version
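
The "measure volinterior" cleanup above replaces a per-voxel division with a multiplication by a precomputed reciprocal. A minimal C++ sketch of that general idiom follows. The function name normalize_voxels and the flat std::vector<float> layout are assumptions made for illustration and are not taken from the VMD source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not VMD code): scale every voxel by 1/denom.
// The division is hoisted out of the loop and performed exactly once.
void normalize_voxels(std::vector<float> &voxels, float denom) {
  assert(denom != 0.0f);           // caller must supply a nonzero divisor
  const float inv = 1.0f / denom;  // the only division
  for (std::size_t i = 0; i < voxels.size(); ++i) {
    voxels[i] *= inv;              // multiply per voxel instead of dividing
  }
}
```

Multiplying by a reciprocal can differ from true division in the last bit of the result, so the idiom suits loops where that rounding difference is acceptable.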
VMD 1.9.4 alpha 41 (April 17, 2020)
- Allow runtime user-override of FLTK high-DPI OpenGL/Widget support.
- Added High-DPI support startup message when available.
- Updated plugin build script for local compilations on 'tijuana' test box.
- Use framework linkage of Tcl/Tk for 64-bit MacOS X builds
- Improved use of ResizeArray extend() and set_size() methods to achieve zero-copy semantics and minimal control overhead in tight loops that interactively build geometry buffers for display.
- Added "expert use only" annotations for the new extend() and set_size() methods that facilitate a more efficient idiom for appending huge numbers of coordinates, vertices, colors, and face index sets to ResizeArrays that store interactively generated geometry buffers for graphics display, with zero-copy semantics. When used correctly, these expert-only methods yield zero-copy semantics and avoid per-item control overhead, increasing performance of tight loops by up to 3x or more vs. append() methods. While batched appendN() methods outperform a single append, the bounds and allocation checking overhead and copy semantics of the append methods are too slow for performance-critical loops. Updated the append2() method to extend by the minimal size.
- Corrected FLTK version test macros for high-DPI / retina display support, and screen size pixel calculations.
- Protect CUDA QCP kernel path with conditional compilation. The code still needs proper CPU fallback logic.
- orcaplugin: Added parsing of SCF energies. Based on gamess plugin, required for new QMtool plugin.
- Eliminated all parameters, methods, and OptiX context management code associated with the old VCA APIs that are no longer supported in current revs of OptiX.
- Corrected qcp rmsdmat framecount when no last frame is provided explicitly, per Josh Vermaas' patch.
- Implemented per-warp reduction scheme for find_max_values() kernel to accelerate cases where the number of segment groups is smaller than the number of voxels, but larger than the maximum number that can be handled by the kernels that do reductions into shared memory.
- Eliminated the last remnants of the old VMDThreads implementation now that the wkf_thread_barrier_init_proc_shared() API has been added to the WKFThreads implementation.
- Updated doxygen tags to exploit new features.
- cranked version
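
The extend() / set_size() idiom described above can be sketched as follows. FloatBuffer is a hypothetical stand-in written for this example; it is not VMD's ResizeArray, and the real method signatures may differ. The sketch only shows the shape of the pattern: reserve room once, write generated values directly into their final position, then publish the new element count.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a growable array; NOT VMD's ResizeArray.
class FloatBuffer {
public:
  FloatBuffer() : data_(nullptr), size_(0), capacity_(0) {}
  ~FloatBuffer() { std::free(data_); }
  FloatBuffer(const FloatBuffer &) = delete;
  FloatBuffer &operator=(const FloatBuffer &) = delete;

  // Checked append of one element: capacity is tested on every call.
  void append(float v) {
    if (size_ == capacity_) extend(capacity_ ? capacity_ : 16);
    data_[size_++] = v;
  }

  // Expert-only: guarantee room for 'extra' more elements beyond size().
  void extend(std::size_t extra) {
    const std::size_t need = size_ + extra;
    if (need <= capacity_) return;
    float *p = static_cast<float *>(std::realloc(data_, need * sizeof(float)));
    if (p == nullptr) std::abort();
    data_ = p;
    capacity_ = need;
  }

  // Expert-only: declare how many elements are valid after direct writes.
  void set_size(std::size_t n) {
    assert(n <= capacity_);
    size_ = n;
  }

  float *data() { return data_; }
  std::size_t size() const { return size_; }

private:
  float *data_;
  std::size_t size_, capacity_;
};

// Generate 'n' points along the x axis and write them straight into the
// tail of the buffer, with one capacity check for the whole batch.
void append_grid_points(FloatBuffer &buf, std::size_t n, float spacing) {
  const std::size_t old = buf.size();
  buf.extend(3 * n);               // single allocation check
  float *dst = buf.data() + old;   // final resting place of the new data
  for (std::size_t i = 0; i < n; ++i) {
    dst[3 * i]     = static_cast<float>(i) * spacing;
    dst[3 * i + 1] = 0.0f;
    dst[3 * i + 2] = 0.0f;
  }
  buf.set_size(old + 3 * n);       // publish the new element count
}
```

The inner loop contains no bounds test, no reallocation check, and no intermediate copy, which is where the savings over per-item append calls would come from. The cost is that the caller is responsible for never writing past the reserved region.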
VMD 1.9.4 alpha 40 (March 14, 2020)
- Merged build script section changes for LINUXARM64 from ORNL 'wombat' cluster and re-tested.
- psfgen: Moved PSFGENLOGFILE and VPBONDS to the psfgen_data data structure. PSFGENLOGFILE and VPBONDS, like the counters of this data structure, are only used and defined in the first position. This way, only one definition is valid for all psfgen_context instances.
- Fix the new psfgen code to allow multiple patches to the same residue. This makes constant-ph run correctly. A new test, test_constantPH.tcl, was added to the automated suite to check these capabilities.
- multiseq: Corrected multiseq distribution version number to match the internal versioning scheme which is also used for access to the web-based versions.txt files that index all of the other databases.
- Ensure that all buffers allocated by the CUDAWrapNVML APIs are freed when the context handle is destroyed.
- Corrected buffer extension logic for lit points volume rendering rep.
- Corrected buffer resize logic for volumetric point rendering rep.
- orcaplugin: Correct flawed string handling in Orca plugin.
- psfgen: Print the message "Info: generating structure..." in a new line
- psfgen: Recognize the "PATCH" and "DRUDE" in the CHARMM topology command "AUTOGENERATE ANGLES DIHEDRALS PATCH DRUDE" to avoid the printing of the error during the topology parsing.
- molefacture: Update to the new syntax of the cgenffcaller package.
- cgenffcaller: Clean up the command-line interface, so the user doesn't need to deal with namespaces. Added documentation.
- Added notes about MS Windows high-DPI screen handling, particularly as it applies to Tcl/Tk.
- molefacture: Updated Molefacture to use cgenffcaller plugin without requiring a password. Molefacture still checks if the plugin exists. After some iterations of alpha releases, or even beta, one can assume that the cgenffcaller will always exist, and all the checks and fallbacks can be deleted.
- cgenffcaller: New CGenFFCaller plugin automates the fetching of CHARMM topology and parameters from the CGenFF server.
- Added notes about enabling Apple Retina support for Tcl/Tk widgets through the use of the NSHighResolutionCapable key in the application bundle Info.plist file.
- Added notes about FLTK 1.4.x and High-DPI display support for non-MacOS systems (Win32, Win64, X-Windows).
- Added high-DPI support for Apple "retina" displays using Apple-specific extensions for FLTK versions >= 1.3.5, 1.4.x and later.
- autopsf: Improvements to the sort_to_writepdb procedure to support large systems. Avoid breaking the chain into one selection per residue if re-ordering is not needed.
- cranked version
VMD 1.9.4 alpha 39 (December 5, 2019)
- Added documentation for OPTIX_CACHE_PATH environment variable, and removed old documentation for the old VCA-specific OptiX features.
- OptiX 6.5.0 and driver 440.44 no longer crash upon deallocation of the triangle geometry instance objects used in the RTX-specific code path. Changed the workaround macro test to key on the OptiX version number.
- Added src comments to indicate that the CUDA char3 type is explicitly declared as a signed type.
- Corrected function signatures of VMD-internal rendering APIs and associated data structures to explicitly specify "signed char" rather than "char" since signedness of the "char" type is left by language specs as a compiler implementation-defined choice.
- jsplugin: Modified read_js_timestep_index_offsets() such that various parameters are treated as optional since CPU-based and GPU-accelerated I/O cases need different subsets of the parameters in practice.
- netcdfplugin: Cranked minor version for the fix for an AMBER cell angles string leak.
- netcdfplugin: patch from Thomas Holder to prevent leaking the AMBER cell angles units string, when present.
- molefacture: include the option "-a" to output all the parameters when executing the local installation of cgenff
- maeffplugin: Applied Thomas Holder's patch for maeffplugin to support insertion codes with CMS file import.
- Added support for both buffered and direct I/O on the host-CPU code path, and increased OOC I/O reporting messages to clarify which code path is active.
- Promoted more I/O paths to be controllable via runtime parameters.
- Eliminated hard-coded OOC file count limits in the CPU fallback path.
- Fixed host-side GDS alternate path for OOC analysis.
- Added prototype for GPU threadpool worker affinity mapping setup routine.
- Emit messages indicating how many worker threads are being created per-GPU, etc. Eliminated excessive console output messages.
- Added GPU worker thread affinitizing code, currently optimized for the DGX-2. This needs to perform runtime mapping and topology queries using NVML rather than being hard coded, but this is useful at the outset.
- Made multi-frame I/O and multiple worker threads per-GPU runtime controllable for benchmarking purposes.
- Added verbose console message support to vmd_cuda_devpool_setdeviceonly()
- Added support for initialization of fully-connected NVLink P2P early on in device initialization. The P2P flags need to be set once per VMD process for each GPU device pair during startup.
- Migrate GPU peer-to-peer initialization to separate thread pool helper routine specifically for that purpose, since this is a special device-specific operation that shouldn't be performed more than once per process, even if many threads are bound to the same devices.
- Added new vmd_cuda_devpool_enable_P2P() for use on the DGX-2, and vmd_cuda_devpool_setdeviceonly(), needed when building extra-large GPU device management pools for GDS or other special case needs.
- Implemented a basic multi-frame read optimization to improve performance when processing small-sized atomic structures. Ideally we'd have a readv() type API to allow gathers from a sparse atom selection etc, but at present none exists, so the current multi-frame strategy does a contiguous read.
- Continued generalization of the multi-file GDS I/O path in qcp_soa_gpu_ooc() and measure_rmsdmat_qcp_ooc_thread().
- Generalized the framecount hacks for multi-file analysis calcs
- Added QCP reduction buffer allocation, various cleanup. Updated rmsdmat_qcp_ooc() parameters to abstract some of the back-end implementation and keep the scripting language bindings free of low-level I/O operations and GPU considerations.
- Updated startup scripts to correctly identify ARM64 platforms
- Added LINUXARM64 target to make_distrib
- Added CUDA kernels and async launches for single-pass AOS to SOA conversion, atom selection compaction, and QCP inner product computation.
- Added an extra NULL string safety check to please GCC 8/9.
- Corrected max string length bounds for strncpy() calls that were off by one.
- Corrected a potentially undersized string buffer.
- Corrected potentially undersized string buffer in TclVolMap console output code.
- Added static_cast to cope with GCC 8.x compiler warnings about shallow copies in particular cases, e.g. Matrix4 classes in ResizeArray
- Added an ARM64 build configuration based on the HPE Apollo 70 nodes
- Updated QCP out-of-core GPU-Direct Storage implementation to allow worker threads to share CUfileHandle_t objects, for greatly simplified setup, management, and teardown of out-of-core calculations.
- Added a workaround for early NVIDIA ARM64 driver and toolkit combinations where the driver version is lower than the toolkit version.
- Rewrote the video streaming endianism handling code to use a union rather than a typecast to eliminate compiler warnings. Added a safety check to report errors on the console if we get a VS_IMAGE message and decodesz ends up being zero.
- Corrected long int format specifier in QuickSurf debugging console output code.
- Added a safety check to emit errors if decodesz is zero.
- Updated the QMData macros with a string format specifier for use with the QM string buffers processed by TclMolInfo. This ensures there aren't buffer overruns during sprintf() operations.
- Added an extra safety check to the Stride output parser.
- Added more error checking to the NanoShaper output parser code.
- Ensure initialization of all elements in the spline matrix in all cases.
- qwikmd: Prevent the user from using the "segname" to define QM Regions when preparing the simulation from a PDB and not from a previous simulation prepared with QwikMD. Autopsf changes the segname during the preparation phase.
- Silence various g++ 7.x compiler warnings with minor code tweaks
- Tweak EGL headers to make it simpler to build headless with no X11
- Added placeholder out-of-core QCP implementation fctns for testing GDS.
- qwikmd: Bugfix for the QM Orbitals Analysis and for the code to generate the mergefile.sh to merge the QM output files.
- Eliminated compiler warnings about signed vs. unsigned comparisons
- Eliminated VCA checks from video streaming code path
- Eliminated warnings about use of boolean non-zero value test.
- Corrected indentation to silence compiler warning
- Ensure variable initialization to silence compiler warnings.
- Corrected bad if-break scoping in the "bonds" rep
- Added braces to make if scope explicit vs. the Python macros that might expand in an unprotected/dangerous way.
- Tweaks to Python bindings to please g++ 7.x which has some minor brain damage when it comes to for-scoped variable initialization and error handling goto statements.
- jsplugin: Updated jsplugin standalone tests for CUDA. Added double-buffering and asynchronous host-device DMAs for the reference CUDA test case. Added test loops for GPU-Direct Storage into the JS plugin standalone test framework.
- jsplugin: Updated the standalone jsplugin test code to make correct use of the new read_timestep_pagealign_size() entry point so that page-aligned I/Os are used as expected during standalone performance tests.
- doc: Updated compilation notes to encompass Python 3.x as well.
- psfgen: Eliminated C++isms that aren't tolerated by MSVC C compilers.
- cranked version
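
Several entries above correct string buffer bounds, including strncpy() length arguments that were off by one. The changelog does not show the corrected code, so the following is only a sketch of the conventional safe pattern for a fixed-size destination. The helper name copy_bounded is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not VMD code): copy 'src' into a fixed-size buffer
// 'dst' of 'dstsize' bytes, truncating if needed and always terminating.
void copy_bounded(char *dst, std::size_t dstsize, const char *src) {
  assert(dst != nullptr && src != nullptr && dstsize > 0);
  // Copy at most dstsize - 1 characters so one byte remains for the NUL.
  // Passing dstsize here would leave 'dst' unterminated when 'src' is
  // at least dstsize characters long.
  std::strncpy(dst, src, dstsize - 1);
  dst[dstsize - 1] = '\0';
}
```

With an 8-byte buffer, a 10-character source is truncated to its first 7 characters plus the terminator, and shorter strings are copied unchanged.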
VMD 1.9.4 alpha 38 (October 17, 2019)
- Added draft experimental direct I/O plugin timestep read APIs for high performance random access out-of-core trajectory I/O supporting both OS kernel bypass and GPU-Direct I/O implementations.
- Added Vulkan 1.x headers to the distribution for the time being.
- Added Turing-specific GPU compute capability 7.5 to default build flags
- 64-bit MacOS X builds for Catalina and later don't support CUDA or OpenCL since Apple has killed off both of them.
- Implemented a macro workaround for MacOS X failing to provide the C99 math routines such as sincosf() in their headers on MacOS X 10.10 and earlier, when compiling in 64-bit mode.
- molefacture: Delete legacy bond list updating call. Reformatted and corrected some comments. Improvements to the hydrogen atoms deletion when the bond order is changed. Rewrote the text message in the info window shown when trying to select or modify a molecule's anchor atoms.
- qwikmd: Delete charmmprev unused variable that was causing some issues with the deletion of topology files.
- Added User's Guide bib refs for various RMSD and clustering algorithms. Revised the text in the "measure cluster" documentation to begin addressing a critique on the description of the lineage of the algorithm implementation in VMD to specify more clearly that it is most closely a variant of Daura's algorithm.
- Updated comments about isosurface mesh generation and texturing.
- Pre-extend per-vertex color arrays for uniform color meshes to reduce overhead. Added comments about eliminating per-vertex color storage to improve performance when we are using volumetric texturing or the like.
- Optimized the volume gradient calculation
- Updated the volume side length/axes assignment and scaling methods to force recomputation of the volume gradient since the gradients are computed with respect to the voxel side length ratios
- qwikmd: Bugfix for the creation of mutation/protonation state combobox selector to clean the array, and not unset the array.
- Corrected voxel accessor methods to promote member variables to long integer types for indexing calculations to prevent integer wraparound with very large volumes.
- Began improvement of the volume gradient calculation loops
- Applied the same optimizations from the point volume rendering method to the lit points method, and improved various other bits of code.
- namdgui: Fix typo in the name of the lipid parameter file name.
- qwikmd: Add GIF images to be used as motion indicators (arrows to expand and collapse GUI frames). These images should be used instead of the Unicode characters to prevent missing characters. Update QwikMD code to use the arrow GIF images.
- Continued streamlining "mol volXXX" implementations and added new "volorigin" command to allow the user to overwrite the origin of the volume.
- psfgen: Delete commented-out code and update the comments for the previous bugfix (adding atoms to two different residues)
- psfgen: Bugfix for the patches adding atoms to two residues and the atomArray resizing routine. The solution was to keep a record of the patch name applied to the residue. If the record is empty or the patch name is different from the record, then resize the atoms array (if needed).
- Reduced ribbon loop control overhead and improved floating point constant consistency.
- Don't call gridsize() method in loop bound test, use cached value instead.
- Improved color scale texture map generation performance by another 20%
- Updated integer color index range clamping code to ensure branchless code generation on most compilers and hardware platforms.
- Improved performance of color-by-volume texture generation loops
- psfgen: Bugfix for the calculation of the scaling factor used to place the colinear lone pairs.
- Corrected indexing arithmetic for 3-D texture maps larger than 4 gigavoxels.
- Improved efficiency of the "points" volume rendering methods with an improved voxel value test criteria.
- Eliminated usage of the old "register" keyword deprecated in C++17 and beyond
- Eliminated unused display command object.
- Make use of sincos() rather than separate sin() and cos() calls.
- Updated the measure volinterior code to the latest revision from Alex and did updates for comments that have already been addressed, and further revision and cleanup for a few of the easy issues that still needed attention.
- Streamlining and optimization of VMD display command tokens and reduced dependence on persistent renderer state in favor of self-contained display command state. These changes will permit higher performance back-end renderer implementations and will eventually allow independent parallel/concurrent rendering of multiple display command tokens.
- Eliminated single DPICKPOINT primitives from all of the display command processing loops.
- Eliminated use of individual pick point primitives in favor of pick point arrays, so the individual pick point primitives can be completely eliminated from VMD display command processing loops.
- Fixed handling of trajectory reader plugins that require the special MOLFILE_NUMATOMS_UNKNOWN flag to indicate that the trajectory format does not include atom count information and that it must be paired with a structure file that provides this information. The matching structure file must be loaded first as always, and then the trajectory read code path will allow the MOLFILE_NUMATOMS_UNKNOWN result to permit reading of the trajectory.
- Updated Win32 registry key string
- fftk: ORCA mode corrections. Fixed parsing of hess.log to correct for systems larger than 10 atoms. Fixed output file name to be 'hess_job2_internal.hess'.
- psfgen: Added a comment about psfgen_static_init function called by NAMD during the start-up process
- Updated psfplugin to correctly write out 10-character records for CHARMM EXTended format, and corrected some C++isms that worked their way into src comments, etc. The new code uses ternary expressions to select the right fprintf() format specifier when the CHARMM EXTended output mode is enabled.
- psfgen: Fix the writing of extended psf files to be compatible with CHARMM. The patch implements the 10-character numbers in the "!N" header sections for the extended psf file format. Patch sent by Joshua Vermaas.
- Substantial optimizations added to DrawMolItem::draw_volume_isosurface_points() to enable visualization of large tomography volumes. Tested up to 33 gigavoxels so far. Revised the innermost loop memory traversal behavior and improved the isovalue test logic to greatly reduce overheads. At this point to get further performance we need a different display command that assumes uniform point coloring so we can avoid generating color-per-vertex input arrays that presently double the required memory storage and bandwidth.
- Changed the VMD volume loading code to defer calculation of volume gradient maps on-demand, when dealing with huge volumes where the gradient map would consume more than 3GB of RAM.
- Changed the point-based volume rendering code to automatically break up geometry batches into chunks of 10 million elements or less to avoid various types of OpenGL vertex indexing and buffer size limits when rendering multi-gigavoxel volumes.
- Added comments about isosurface extraction for multi-gigavoxel volumes, OpenGL triangle mesh indexing issues, multi-pass MC isosurface extraction, and thread pool usage.
- Cleanup and optimization of CPU marching cubes isosurface extraction code while working on improved handling for huge volumes.
- Corrected MolAtom forward declaration to be a class as it should be.
- Made ResizeArray::extend() method public so that performance-critical graphics loops can be optimized to use a zero-copy approach, eliminating use of the appendX() methods in their innermost loops, and enabling direct writes to the final buffer position. This is particularly valuable for marching cubes, sphere, and point-based rendering methods where the appendX() methods can incur significant overhead relative to the rest of the (simple) arithmetic in the innermost loops of such code.
- Improved the performance and volume size scalability for both the point-based volume representations and the FieldLines representation seed calculations.
- Corrected point-based volume representation indexing arithmetic to handle multi-billion voxel volumes. Somehow the point methods weren't updated when the others were.
- fftk: updated the version number to 2.0
- molefacture: updated the version number to 2.0
- cranked version
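
Several entries above fix indexing arithmetic for volumes with more voxels than a 32-bit integer can count. A minimal sketch of the promotion idea follows. The function name voxel_index, the x-fastest storage order, and the use of std::int64_t (rather than the "long" type named in the entry) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical accessor (not VMD code): flat index of voxel (x, y, z) in a
// volume stored with x varying fastest, then y, then z.
std::int64_t voxel_index(int x, int y, int z, int xsize, int ysize) {
  // Promote the dimensions BEFORE multiplying so that every product is
  // evaluated in 64 bits. Writing z * xsize * ysize with plain int
  // operands would overflow for multi-gigavoxel volumes.
  const std::int64_t xs = xsize;
  const std::int64_t ys = ysize;
  return static_cast<std::int64_t>(z) * xs * ys
       + static_cast<std::int64_t>(y) * xs
       + x;
}
```

For a 4096 x 4096 x 4096 volume, the last voxel has index 2^36 - 1 = 68719476735, which is far beyond the 32-bit signed limit of 2147483647.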
VMD 1.9.4 alpha 37 (August 27, 2019)
- solvate: Corrected hard-coded psfgen version dependency to 2.0
- networkview: Updated hard-coded psfgen version dependency for psfgen 2.0
- Corrected ffTK Makefile to add files to the distrib target.
- fftk: Updated ffTK plugin. Mariano and Giuseppe have revised it to add abstractions that support multiple tools for quantum chemistry calculations used in parameter development.
- Hoisted and precalculated a few trivial quantities, and added significant comments about places where we can further improve performance.
- Added performance related source code comments for future improvements.
- Eliminated all of the unnecessary (const VolumetricData *) casts.
- Cleaned up various minor issues in the new volinterior code, removing and/or annotating a few bogus constructs that had been left in the code, marking functions that should be migrated out of this code which is just meant to create bindings for the scripting language, and eliminated poorly formatted code such as bad indentation and use of tab characters.
- Pulled in U. Del team's revs for the volinterior code from their latest version, including probability maps.
- mdffplugin: changed ReMDFF map generation to create initial unblurred map separately, avoiding issues with trying to apply a Gaussian with sigma 0
- mdffplugin: changed ssrestraint balloon tooltip proc name to avoid conflicts with actual ssrestraint proc
- psfgen: Set the psfgen version to 2.0
- psfgen: Switched psfgen to use the new code paths by default.
- psfgen: Implemented safety checks to prevent the use of the molfile plugin to write structures with lone pairs and drude particles until the plugins are updated. Also prevented the use of writenamdbin for structures containing drude particles. Since the lone pairs are treated as atoms at the pdb level, and there is no extra field to add to the file (alpha and thole), we can use this command to write structures with lone pairs.
- psfgen: Deleted the declaration of topo_mol_lonepair_t *lonepairList in the topo_mol_segment_t structure since it was not being used.
- psfgen: Use the proper definition of atomtype with drude in the case of charmm psf file format
- psfgen: Bug fix on the command delatom and fixed all the references to the dihedral atoms during the writing process.
- psfgen: Include prototype declaration of the topo_mol_set_drude_xyz function in the topo_mol.h. Fix the compiler warning messages on psf_file_extract functions.
- psfgen: Include prototype declaration of the topo_defs_anisotropy function in the topo_defs.h. Deletion of the unused static_assert_int_is_32_bits variable.
- psfgen: Elimination of the compiler warnings from Hydrogen mass repartition code, and assume default values correctly. The command now takes the form "hmassrepart dowater 1/0 mass [float number]". The arguments are optional, and the default values are 0 for dowater and 3.024 for (hydrogen target) mass. Elimination of the variable "i" as a flag for whether the mass value was defined in the command.
- psfgen: Set drude coordinates from the pdb when loading a psf and pdb containing drude particles. The coordinates from the drude are loaded from pdb either using the command coordpdb or the "readpsf ... pdb ..." command. The other mechanisms to load structures are not yet up to date as molfile, psfgen, and NAMD binary files need to agree on how to store all this information. This is valid for reading and writing. Safety checks still need to be implemented to avoid segmentation fault if the user decides to use those mechanisms with drude.
- psfgen: Anisotropy section of the psf files is now being parsed and loaded in the readpsf command.
- psfgen: Fixed some code formatting issues
- psfgen: Fix to the parsing and units conversion of the angles and dihedrals from the lone pair section. The distance, angles, and dihedrals are now float as opposed to double, as this precision is enough for these quantities. The same change should be employed throughout the code in the areas where double is not necessary.
- psfgen: Continue to add read-in functionality for drude forcefield and lone pairs. The atoms are now filtered in the case of being lone pairs or drude particles. The bonds between these particles and their hosts may be present in the psf file. In this case, the bonds are not read from the psf, but they can be restored if the vpbonds option is set to 1 (default value) in the psfgen script.
- psfgen: Include the chain ID info extracted from the pdb when using the "readpsf ... pdb ..." command. Fix the out-of-bounds references when deleting dihedrals and impropers. Beginning of the implementation of the read of structures containing drude particles and lone pairs. The compiler messages will be addressed after the read-in functionality is complete.
- psfgen: Fix the printout of the dihedrals and impropers when using the molfile plugin.
- amiraplugin: change plugin name for amira plugin as pointed out by Joao
- cranked version
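
The hmassrepart entry above describes optional "dowater" and "mass" arguments with defaults of 0 and 3.024. The following sketch shows one way such optional keyword/value pairs could be parsed with defaults. It is not psfgen's implementation: the struct, the function name parse_hmassrepart, and the rejection of non-positive masses are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical option holder; the default values come from the changelog.
struct HMassRepartOptions {
  int dowater = 0;
  double mass = 3.024;
};

// Parse optional "dowater <0|1>" and "mass <float>" pairs (hypothetical).
// Options not mentioned in 'args' keep their defaults.
// Returns false on an unknown keyword, a missing value, or a bad value.
bool parse_hmassrepart(const std::vector<std::string> &args,
                       HMassRepartOptions &opts) {
  for (std::size_t i = 0; i < args.size(); i += 2) {
    if (i + 1 >= args.size()) return false;  // keyword without a value
    const std::string &key = args[i];
    const std::string &val = args[i + 1];
    if (key == "dowater") {
      if (val != "0" && val != "1") return false;
      opts.dowater = (val == "1") ? 1 : 0;
    } else if (key == "mass") {
      char *end = nullptr;
      const double m = std::strtod(val.c_str(), &end);
      // Assumption: require a fully parsed, strictly positive number.
      if (end == val.c_str() || *end != '\0' || m <= 0.0) return false;
      opts.mass = m;
    } else {
      return false;
    }
  }
  return true;
}
```

Because the defaults live in the options struct, no separate flag is needed to record whether a value was supplied on the command line.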
VMD 1.9.4 alpha 36 (July 27, 2019)
- Revised the recursive ray generation logic for the transparency peeling implementation in the internal OptiX ray tracing engine. For correct operation with the RTX runtime strategy and its associated stack management scheme, we MUST increment the ray recursion depth counter when performing transparent surface peeling, otherwise we could go beyond the max ray recursion depth that we previously requested from OptiX. This will work less well than the former approach in terms of worst-case visual outcomes, but we presently have no alternative and must avoid more serious issues with runtime stack overruns. This may be a good time to revisit whether or not we can more efficiently "continue" the ray ignoring the current intersection by some other API, more like we do for shadow handling for transparent surfaces with any-hit rays. If we could perform a "continue" that would eliminate recursion-related stack management questions for transparency peeling entirely.
- mdff plugin: removed old -allframes option from sim, a holdover from when this command used volmap. This functionality should be added back in the future.
- psfgen: Further corrections of the is_xxx() helper routine fctn prototypes. Corrected misspelled macro #ifdef check that led to bad function prototypes.
- psfgen: Commented problematic code in the hydrogen mass repartitioning command processing (potentially uninitialized data) and eliminated unused variables causing compiler warnings. Clearly more valgrind runs are needed to find uninitialized data issues like these.
- psfgen: Replaced the old hash table implementation in psfgen with the current one from VMD, which is also used by all of the plugins, for consistency. The new version has better-doxygenized source comments (though still in need of improvement), and is also updated with macros to handle the namespace management needs of statically-linked plugins, e.g., the molfile plugins so we don't encounter namespace collisions from multiple plugins or other libs that contain the same hash table symbols.
- json: Added pure-script JSON parser package from older revs of TclLib
- psfgen: Added copyright and revision control headers to psfgen source files. They should have been there all along, but they were missing. The headers make it obvious what version was the basis for user-submitted patches and the like, so they are helpful even aside from documentation/comments that describe the source code purpose/structure etc.
- psfgen: Corrected many portability, syntactic, and code style issues.
- psfgen: replaced non-portable/illegal dynamic array size construct with portable code.
- psfgen: Corrected many portability, syntactic, and code style issues. Commented on many locations where integer constants are used for return codes without comments, these should all get replaced by symbolic constants/enums/macros instead.
- psfgen: corrected C++ syntax that worked its way into code that should be strict ANSI C.
- autoionize/cionize/solvate plugins: changed PSFGEN readpsf / coordpdb lines to one line readpsf ___ pdb __ which respects long atomnames
- moved write_file method for writing VolumetricData to file using MolFilePlugin from TclVoltool.C to Voltool.C since it is a general use method. Updated mdff_sim to use new write_file method instead of old volmap code, bringing it up to date with newer voltool commands.
- Added check to make sure if vmd_cuda_calc_dens returns an error that we don't try to init_new_volume or write map to file with bad VolumetricData
- Revised MDFF map simulation GPU call chain to correctly propagate GPU runtime errors to ensure that CPU fallback is performed.
- molefacture: Skip residues evaluation without topology to be submitted to the cgenff until the cgenffcaller plugin is available. This allows the user to export the topologies, psf, and pdb files without being prompted with the error that the cgenff (server or local copy) is not defined correctly.
- psfgen: Initialize atoms' lone pair and drude fields
- psfgen: The first psfgen version to pass the test suite, the hydrogen mass repartition tests (on both a small structure and a half-million-atom structure), and the build of the additive charmm36 and drude force fields with lone pairs and drude particles. As noted before, there is still room for improvement in the data structure (avoiding the use of doubles and of lone pair and drude information for all atoms), and also for speedup. Psfgen now compiles with the molfile plugin and uses it to generate js files (the other file formats can also be produced this way). The read-in of structures with lone pairs and drude particles will be developed next. Need to check whether VMD still adds the space in three-character atom names when generating js files. This forces a while loop in the functions is_hydrogen and is_oxygen.
- cranked version
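The GPU error handling described in the MDFF map simulation entries above (check the status returned by the density calculation before initializing a volume or writing a map, and fall back to the CPU on failure) can be sketched as follows. This is a minimal Python illustration, not VMD's C++/CUDA code; the function names, the status-code convention, and the placeholder computation are all assumptions made for the example.

```python
def cpu_calc_density(coords):
    """Hypothetical CPU fallback; the per-atom sum is only a placeholder."""
    return [x + y + z for (x, y, z) in coords]


def failing_gpu(coords):
    """Hypothetical GPU path that reports a runtime error.

    Returns (status, data); a nonzero status signals failure.
    """
    return -1, None


def working_gpu(coords):
    """Hypothetical GPU path that succeeds, for comparison."""
    return 0, cpu_calc_density(coords)


def calc_density(coords, gpu=failing_gpu, cpu=cpu_calc_density):
    """Try the GPU path first and act on the status it reports.

    The GPU result is only used when the call reports success; otherwise
    the CPU path runs, so no volume is ever built from bad data.
    """
    status, data = gpu(coords)
    if status != 0 or data is None:
        return "cpu", cpu(coords)
    return "gpu", data
```

With the failing stand-in, calc_density([(1.0, 2.0, 3.0)]) takes the CPU path; passing gpu=working_gpu takes the GPU path. The point of the pattern is that the error status travels up the whole call chain instead of being dropped at an intermediate layer.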
VMD 1.9.4 alpha 35 (July 10, 2019)
- molefacture: adding missing molefacture_balloontext.tcl to the VMFILES list of the Makefile
- Added comments about the lack of recursion safety checks for shadow feeler rays associated with lighting calculations. Since we don't take this level of recursion into account in the renderer's internal recursion tracking, we must ask OptiX to allow for one extra level of recursion beyond the depth we're internally tracking to ensure that there's enough stack space available.
- Improved both shading code and comments about the enforcement of the maximum ray recursion depth since the interaction between the renderer and the OptiX >= 6.x runtime system's memory allocation for the recursion stack critically depends on absolute agreement between what the renderer asks for and what it does during a run. The way we had tracked the recursion depth previously was really only tracking recursion of the last surface hit for shading and it wasn't taking into account shadow feeler rays or the like, whereas the OptiX runtime has to accurately compute the stack space for any recursive rtTrace() call. The recursion tracking logic in the renderer used zero-based counting and also had an incorrect comparison test, so the original code had allowed one more recursion than requested in practice, which, while harmless in older OptiX APIs that used application-provided recursion stack memory request sizes, is intolerable with the new implementation since it computes the necessary stack size directly from the maximum recursion depth. The revised code now asks OptiX for a maximum recursion depth one greater than the surface hit depth to allow lighting calculations to complete, and the internal max recursion counter tracking has been corrected for its use of zero-based indexing. Further comments should probably be added to the lighting code about the necessity for this extra level of recursion beyond what the shader is tracking internally.
- molefacture: Remove the JSON parser package dependency from molefacture.
- molefacture: Allow the import of topologies with halogens, but only if no lone pairs are detected. This ensures compatibility with CHARMM36 versions that predate lone pairs on halogens.
- readcharmmpar: Created a new mechanism to store and retrieve the parameters read by the plugin. Now there is a global list that stores all the parameters fed by the user, and upon reading, a handlerID is returned. This handlerID points to the position of the topology in the global list. In this way, the program calling this plugin does not need to store the topologies itself, but queries the readcharmmpar plugin for individual components of the topologies. The overhaul is not complete. Further modifications will need to happen, especially changing the arguments of the procs to accept the handlerID instead of the full digested parameters.
- molefacture: Small changes and corrections to the documentation page
- Modified the RTX accelerated ray tracing path to be extra conservative about its maximum recursion depth and associated stack size requirements, and added a VMDOPTIXMAXSTACKSIZE environment variable that causes VMD to set the RTX runtime for maximum supported recursion depth and associated stack size allocation.
- molefacture: Update the index.html of the Molefacture web page
- molefacture: Update the plugin webpage according to the new version
- molefacture: Fix bug in gui after detecting the presence of the cgenffcaller plugin
- Added rough draft initial placeholder data structures for alchemical free energy information in trajectory files.
- molefacture: Fix protein builder bug
- molefacture: Fix hotkey definition for move fragments
- qwikmd: Add message of "*Terminal Patches not supported" in the patches entry label
- Temporarily increase per-atom bond storage to 256 for a special user-requested build. While doing this, I've gone ahead and added comments about rearchitecting the code to eliminate this limitation with dynamic allocation outside of MolAtom in the containing molecule class, resulting in a likely-acceptable performance trade-off given the propensity for an improved (reduced) average case memory use for bond storage and traversal.
- Misc cleanup in MolAtom. Time to revisit whether it would be wise to pull the bond data structures out of the atom class to permit them to be dynamically resized for better average-case memory efficiency while still permitting loading of unusual molecular structures or models with huge per-particle bond counts.
- Updated format specifier for 64-bit Python APIs that return longs.
- cranked version
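The off-by-one described in the OptiX recursion depth entries above can be illustrated with a small Python sketch. It is not the VMD shader code: the comparison tests shown are assumptions chosen to reproduce the described behavior (a zero-based depth counter that lets one extra level through), and the helper names are hypothetical.

```python
def deepest_level(max_depth, off_by_one):
    """Count how many nested trace calls run when every hit spawns a new ray.

    `depth` is zero-based, as in the changelog entry. The off-by-one
    variant compares the zero-based depth directly against max_depth,
    which lets one extra level through.
    """
    reached = 0

    def trace(depth):
        nonlocal reached
        reached = max(reached, depth + 1)  # levels in use, counted from 1
        if off_by_one:
            stop = depth >= max_depth      # hypothetical faulty test
        else:
            stop = depth + 1 >= max_depth  # accounts for zero-based depth
        if not stop:
            trace(depth + 1)

    trace(0)
    return reached


def stack_levels_to_request(max_surface_depth):
    """One level beyond the surface-hit depth, reserved for shadow rays."""
    return max_surface_depth + 1
```

For a requested depth of 4, the faulty test reaches 5 nested levels while the corrected test reaches 4, and the runtime is then asked for 5 levels so that a shadow feeler ray cast at the deepest surface hit still fits in the stack.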
VMD 1.9.4 alpha 34 (June 26, 2019)
- Updated structure preparation plugin version numbers after changes associated with molefacture were integrated.
- molefacture: Molefacture redesign. The molefacture interface was completely redesigned to become more interactive and easy to use. Many features, like FEP and Nucleic Acid Builder, were hidden to reduce complexity while the new interface was being developed. These features will be re-implemented one at a time in subsequent versions. The new version is not based on xbgf files, as the molecule is designed in memory and exported to files only when necessary. Instead of 3 different lists (atom, bonds, and angles), there is now only one table listing the atoms and their information (name, type, element, etc.). The bonds, angles, and dihedrals are detected automatically while the atoms are selected, and can be manipulated using the sliders in the respective GUI sections. Molefacture can use Open Babel (obminimize command) or NAMD to minimize the structure being modeled. In the case of NAMD, the CHARMM36 Force Field is used, and the user can submit the new molecule to the CGenFF server to fetch the topologies and parameters. TODO: The direct connection between VMD and the CGenFF server (cgenffcaller plugin) will not be distributed until the user authentication mechanism is defined. Once this is established, the plugin will be included in the VMD distribution, and the user will be able to submit to the server directly from within VMD. Until then, one has to do this manually, or use the local version of CGenFF. Many new functions were added, like adding independent fragments and bonding them together; creating ring and chain "skeletons" made up of carbon atoms that can be decorated and modified for faster drafting of molecules; and an "Undo" button to go back to previous states in the modeling process. This plugin will have more purposes than just creating molecules, and the interface is planned to be modularized using the left panel (atom's table) by making this panel a notebook with tabs, in which each tab serves a different purpose. 
An example of one of these purposes is the selection of the different partitions of a drug that is part of an alchemical transformation.
- qwikmd: Declare missing array at the namespace level
- molefacture: Update the definition of the amino acid fragments to ensure compatibility with the new Molefacture version.
- molefacture: Update the definition of the fragments to ensure compatibility with the new Molefacture version. Add the carbonyl fragment.
- molefacture: Add the icons for the buttons of the new Molefacture interface. There are two versions of the same image, one for Linux and Windows, and the other "_mac.gif" for MacOS. This is necessary as tk 8.5 does not show the text of the radiobuttons when the "-indicatoron false" option is set. The Mac version of the icons has the text included in the image itself.
- Revised write_js_timestep() to ensure that it will always succeed, regardless of whether the caller provides page-aligned block-multiple-sized memory buffers or not. Since write performance is generally far less critical than read performance, two I/O calls (each) are used to write out the Timestep coordinate and PBC unit cell blocks in page-multiple-sized records on disk. We use a persistent zero-filled buffer in the jshandle data structure to write the necessary padding bytes.
- qwikmd: Prevent QwikMD from deleting molecules loaded before opening QwikMD. Update the GUI to use the tktooltip and infobuttons plugins. Separate the GUI design from the declaration of long commands. Use a condensed atomselection in a single representation to represent the residues selected either in the "Manipulation Window" or in the OpenGL window, instead of creating individual representations per selected residue. Small improvements and bug fixes. Add the interface between QwikMD and Molefacture. QwikMD can now export un-parameterized molecules to Molefacture. When "apply to the parent" is used in Molefacture, the information is loaded into the "Edit Atom" window of QwikMD. Although this works, it needs to be made more robust.
- autopsf: Fix typo: '#' missing from the comment.
- qwikmd: Update the plugin to use the new readcharmmtop plugin interface.
- paratool: Update the plugin to use the new readcharmmtop plugin interface.
- readcharmmtop: Create a new mechanism to store and retrieve the topology read by the plugin. Now there is a global list that stores all the topologies fed by the user, and upon reading, a handlerID is returned. This handlerID points to the position of the topology in the global list. In this way, the program calling this plugin does not need to store the topologies itself, but queries the readcharmmtop plugin for individual components of the topologies. The overhaul is not complete. Further modifications will need to happen, especially changing the arguments of the procs to accept the handlerID instead of the full digested topology.
- autopsf: Protect the filenames from spaces in the path. Enforce the detection of the first residue of the segment, so single residue segments are identified correctly.
- Use calloc() rather than malloc() in Timestep allocations so we don't have to make an additional subsequent memset() call to zero out the coordinate arrays.
- idatm: Improvement of the ring detection using atomselection instead of moltoptools. moltoptools doesn't work for nanotubes. make_bond now uses the bond list and bond order from the atomselection and not the one in memory. It is important to get the correct indexing when the atom's index is not consecutive. As far as I am aware, only the new Molefacture uses this plugin.
- mdffplugin: tooltips now use the new tktooltip plugin.
- Added the initial versions of the tktooltips and infobutton plugins
- Added further URL references about Python 3 module initialization practices that improve portability on recent compilers, particularly those that support C++11.
- Changed the atom selection module initialization approach to address problems that arise with Clang++ 8.x, which gets upset about the use of PyObject_HEAD_INIT() and suggests adding braces etc. This is discussed in significant detail as part of PEP 3123: https://www.python.org/dev/peps/pep-3123/ The discussion of PEP 3123 suggests the use of PyVarObject_HEAD_INIT() instead of PyObject_HEAD_INIT().
- Eliminate duplicated extern "C" linkage type qualifier used to ensure that name mangling is disabled.
- Further variable initialization changes to avoid compiler diagnostics from Clang++
- Perform allocation/initialization of the result Python list early enough that it isn't undefined if subsequent errors occur, but we're already past basic initialization phases for the method in question. This is required to please Clang++ in its analysis of code initialization and exception handling.
- Eliminated warnings from clang++ related to potentially uninitialized local variable state when the error handling branches are taken, by improving locality of variable scoping for PyObject pointers so they don't cross the boundary between normal execution and error condition cases.
- Set numvalues prior to the first Python error handling branch to prevent compiler errors related to jumping past the initialization of numvalues, even though we're not really using it within the error handling block.
- Many more const correctness fixes for Python 3.x APIs, particularly as fallout from the use of PyUnicode_AsUTF8().
- Rewrote the build_set_values() helper in the Python atom selection implementation to eliminate redundant passing of atom counts and selection flag arrays in favor of passing a const pointer to the atom selection itself, which allows the internal loops within build_set_values() to make use of atom selection accelerator loop bounds.
- We need to revise the build_set_values() helper before we can exploit atom selection accelerators.
- Const correctness fix for Python 3.x
- Removed the py_atomselection.C source file that contained the original (Oct 2000) VMD python bindings for performing atom selections. The original atomselection interface has been deprecated since 2007, so at this point we can safely remove it altogether as part of modernization for Python 3.x. The newer implementations have now been further improved and this reduces the code we're maintaining going forward.
- Re-merged the Python atom selection interface changes to make use of firstsel/lastsel to improve atom selection traversal performance. The Python 3.x changes led to a loss of this optimization, so this revision puts it back in place.
- Modify the VMDApp pointer checks to unify with the other implementations.
- Merged Robin Betz's Python API changes to support both Python 2.x and 3.x, changes to Python module initialization, and module naming hierarchy.
- Updated Python 2.x/3.x initialization and fixes to prevent C++ name mangling.
- Merged Robin Betz's Python API changes: All functions take keyword arguments. Clearer keyword arguments for some functions, but backwards-compatible, too. All functions and modules have docstrings. More comprehensive error checking of invalid input. Careful reference counting of all Python objects to avoid memory leaks. Test cases for all Python modules + bug fixes found while writing test cases in python/test. Tested builds with address sanitizer and caught several memory leaks. Compiled and tested with Python 2.7 and 3.6. Greatly improved atomsel attribute access: instead of atomsel().get("x") you can just say atomsel().x, etc. Renamed atomsel module to selection to clarify difference between it and atomsel type. Python module initialization functions appear only in one spot (py_commands.h), making it easier to add new modules.
- Continued cleanup of the Python bindings. Merged the last parts of Robin Betz's Python 3 initialization changes along with improved error handling and improved Python type checking for atom selections in particular. A few of the changes have to be held back via conditional compilation until the matching changes have been completed, e.g., in the atom selection code.
- mdffplugin: Added tooltip text for labels of main tab of MDFF GUI, MDFF Setup, to aid users in understanding the purpose of each option.
- cv_dashboard: Updated cv_dashboard with a fix for the launch issue I previously reported.
- Updated to latest colvars source with the STL algorithm fix permanently implemented.
- Re-eliminated duplicated registration of the _sphere method accidentally reintroduced when merging in Robin Betz's changes which were based on an older version of the code. Eliminated the "graphics_" prefix in favor of just "py_" leading to shorter function names and improved code readability.
- A noteworthy deficiency in our Python interface parsing implementations arises in cases (as in py_display) when we need to match parsed keywords against a long list of potential candidates, triggering the correct assignments, type conversions, and subsequent VMD actions to be taken. At present, the parsing code performs what boils down to a variant of linear search through the series of candidate keywords, either by looping and/or branching with string compares. This makes the code ugly and slower than it should be. We should instead create persistent hash tables (that live for the life of the Python module, interpreter, or application) that contain all of the keywords, and associate them with enumerations or function pointers to directly trigger the right actions. That would eliminate the linear search behavior (both variants), and if properly generalized, this pattern could be used in many other parts of the VMD Tcl/Python bindings where we have to parse keywords from large lists of candidates.
- Renamed keyword arguments parameter from "keywds" to "kwargs" for clarity and consistency with the other python bindings that do keyword handling.
- consistency among Python module init routines
- Merged Robin Betz's changes for keyword-based Python parameter parsing, improved doc strings, and Python 3.x initialization.
- Revised all of the VMD Python initialization routines to match the function signature needed for tabulated module initialization per Robin Betz's patch to support Python 3.x.
- Applied part of Robin Betz's patch adding new wrapper functions for Python/C++ intrinsic type conversions that abstract key API differences between Python 2.x and Python 3.x.
- Eliminated unnecessary calls to PyArg_ParseTuple() for Python methods that don't need to accept any arguments, and changed their registration to METH_NOARGS.
- Added safety checks for VMDApp pointers, per Robin Betz's patch. Const correctness improvements for modern Python APIs.
- Const correctness for Python doc strings. Revised Python doc strings per Robin Betz's suggestions.
- Renamed registered Python method fctns, adding py_ prefix for clarity
- Eliminated ancient const correctness workarounds for old versions of Python
- Modernized all of the Python bindings. Since the Python APIs have been const-correct for quite a few years now, there's no longer a reason to hold onto old typecasts that were only needed by very old Python revs.
- Rewrote VMD Python bindings to use the standard PyCFunction function pointer typedef rather than our own vmdPyMethod function pointer typedef. Python itself now provides what we need, so there's no longer any reason to maintain our own type. The function pointer typedef originally made it easier for us to use static file scope linkage to avoid namespace collisions. Since the Python provided typedef is identical to ours, there should be no impact from this change other than improved standardization/style.
- Applied Robin Betz's patch to eliminate stray cr/lf pairs from the end of atom selection macros using strcspn().
- Pulled in Robin Betz's change to more gracefully handle mismatched atom counts in the topology vs. the coordinate file buffers. Needs testing.
- Updated Tkinter class for Python 3.x APIs.
- Added API to query menu index from its string name, needed for Python 3 APIs
- cranked version
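The handlerID mechanism described in the readcharmmtop entry above (a global list of parsed topologies, with the position in that list returned to the caller as a handle) can be sketched in Python as follows. The real plugin is written in Tcl; the class, method names, and toy residue data here are hypothetical and only illustrate the pattern.

```python
class TopologyRegistry:
    """Hypothetical registry: stores parsed topologies, hands out integer IDs."""

    def __init__(self):
        self._topologies = []  # plays the role of the "global list"

    def read(self, parsed_topology):
        """Store a parsed topology and return its handler ID (list position)."""
        self._topologies.append(parsed_topology)
        return len(self._topologies) - 1

    def residue_names(self, handler_id):
        """Query one component of a stored topology by handle."""
        return sorted(self._topologies[handler_id]["residues"])

    def atoms(self, handler_id, resname):
        """Return the atom names of one residue in a stored topology."""
        return self._topologies[handler_id]["residues"][resname]


# Toy data (heavy atoms only), standing in for a parsed topology file.
registry = TopologyRegistry()
hid = registry.read({
    "residues": {
        "ALA": ["N", "CA", "C", "O", "CB"],
        "GLY": ["N", "CA", "C", "O"],
    }
})
```

A caller keeps only the small integer hid and asks the registry for what it needs, for example registry.atoms(hid, "GLY"), rather than holding and passing around the full digested topology.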
VMD 1.9.4 alpha 33 (May 22, 2019)
- Added the colvars cv_dashboard plugin to the extension menus
- Added new colvars cv_dashboard plugin to the plugin tree.
- comment out hooks for as-yet-unimplemented features in the video streaming client implementation.
- eliminated unused resname string in the Stride interface.
- Eliminated unused variables from the video streaming interactive RT client.
- Protect new GPU device query APIs and P2P GPU topology routines with conditional compilation.
- Misc fstype cleanup to eliminate compiler warnings about unreachable code due to return statements in each of several different OS/platform-specific ifdef blocks.
- Added missing include of the STL algorithm header to colvars src files
- Updated colvars docs
- Updated colvars to the latest git trunk.
- Corrected internal debug reporting of P2P GPU link states/features.
- Updated device enumeration comments to prepare for incorporation of user-specified GPU device masks along with P2P PCIe/NVLink topology info.
- ensure we get all of the fctn pointers we need for NVML
- Added access to the NVML CPU affinity mask setting routine given a specific caller-supplied GPU index. It is noted that the NVML docs currently state that the back-end implementation is limited to 64 CPUs, so I have contacted NVIDIA engineering to ask about this vis-a-vis systems like the DGX-2 and the ORNL Summit compute nodes.
- Added commentary to vmd_cuda_devpool_setdevice() about where and how host-GPU management thread affinity assignment should be done.
- Updated src comments and eliminated a redundant safety check that's now handled much earlier in the GPU initialization code.
- Added first phase of NVML open/query to make use of CPU-GPU affinity mask information at the top level.
- Added CUDAWrapNVML to the build.
- Added code (John's BSD-licensed src) to access the NVML shared library at runtime using a simple set of wrapper functions that ensure safety. The NVML shared library provides access to important host platform GPU hardware details such as the best CPU affinity mask associated with each GPU, taking into account the NUMA node, PCIe topology, and NVLink topology that exist on the system.
- Imported thread affinity reporting/analysis code from ongoing work on Summit to make VMD play nice with batch systems like IBM's 'jsrun' tool that lock-down CPU thread affinity prior to launching the child process. The new VMD CPU affinity code tries to detect the situation where the OS or batch system are externally enforcing CPU-thread affinity and "do the right thing" during startup. This implies that we must accept the external CPU thread affinity assignment and put all of the brains for CPU-GPU thread/context affinity in the CUDA startup code since we won't be able to influence which CPU core or socket the host threads run on.
- Corrected logic for runtime determination of DSSP / XDSSP location.
- Corrected a few bits of logic in the console output deduplication code for GPU device enumeration during startup.
- Track number of physical GPU devices as well as the number that are available, for use in detailed formatting of console status output. Draft implementation of condensed GPU console output inspired by the approach taken by Solaris 'dmesg' to reduce repetitive syslog output.
- Revised description of the CUDAAccel class to make it clearer to the uninitiated why it needs to exist and why all of the CUDA code in VMD needs to use the various device query routines it provides rather than making direct use of the underlying CUDA runtime or driver APIs themselves.
- Added further comments for GPU P2P topology analysis and reporting routines in preparation for adding abstractions to fully support user-defined GPU device selection masks in combination with P2P kernels that need to be able to access P2P topology info to enable and exploit P2P transfers, direct GPU P2P memory access, etc.
- Revised the VMD CUDA startup code to enable analysis of the PCIe/NVLink peer-to-peer communication topology, along with determination and reporting of the total number of GPU P2P links and islands.
- Revised internal APIs for query of CUDA peer-to-peer NVLink connectivity and GPU peer-to-peer link capabilities (performance, native atomic support, etc).
- Doxygenized filesystem locality APIs
- updated documentation for voltool to include new mask, write, hist, and info commands.
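The counting of GPU P2P links and islands mentioned in the P2P topology entries above can be sketched as a connected-components computation over a peer-access matrix. This Python sketch is not the VMD implementation: it assumes that an "island" means a group of GPUs connected to each other through peer links, and the matrix is hypothetical input of the kind that could be filled in from per-pair queries such as the CUDA runtime's cudaDeviceCanAccessPeer().

```python
def count_links_and_islands(peer):
    """Count P2P links and islands in a peer-access matrix.

    peer[i][j] is True when GPU i can directly access GPU j's memory.
    A link is counted once per GPU pair if either direction is enabled.
    """
    n = len(peer)

    def linked(i, j):
        return peer[i][j] or peer[j][i]

    links = sum(1 for i in range(n) for j in range(i + 1, n) if linked(i, j))

    # Depth-first search over the undirected link graph to find islands.
    seen = [False] * n
    islands = 0
    for start in range(n):
        if seen[start]:
            continue
        islands += 1
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if not seen[j] and linked(i, j):
                    seen[j] = True
                    stack.append(j)
    return links, islands


# Hypothetical 4-GPU system: GPUs 0-1 are linked and GPUs 2-3 are linked.
example = [
    [False, True, False, False],
    [True, False, False, False],
    [False, False, False, True],
    [False, False, True, False],
]
```

For the example matrix, count_links_and_islands(example) reports two links and two islands; a system with no peer access at all would report zero links and one island per GPU.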
VMD 1.9.4 alpha 32 (April 30, 2019)
- Added VMD info message when kernel-bypass I/O is disabled when operating on files located on a remote filesystem.
- Determine whether a given file is on a local or remote filesystem. We don't engage kernel-bypass direct-I/O when reading from remote filesystems, as it turns out that filesystem caching mechanisms like the Linux 'cachefilesd' daemon interpret Unix direct-I/O (e.g. O_DIRECT) calls to be NFS-cache-bypassing I/O operations, which is not what we want in that particular case. We want local NFS caches to be utilized rather than hitting the network, so in the case of remote filesystems we now revert to normal I/O. We only engage kernel-bypass I/O when the file of interest resides on a local filesystem where there are no surprising side effects from direct-I/O. This is particularly beneficial on the NVIDIA DGX-2 which incorporates a large 30+TB RAID-0 NFS cache, as an example. The implementation modifies the behavior of VMD block-sized memory alignment queries to cause the desired behavior on the side of the molfile plugins. If we don't query the page alignment size, the associated molfile plugin will have to assume that we can't support kernel-bypassing block-based direct-I/O, and will revert to using normal I/O.
- Added vmdfsinfo sources to the build.
- Added thin cross-platform wrappers to help VMD determine whether a given file is located on a local or remote filesystem, to help determine whether or not to engage kernel-bypass direct-I/O when reading terabytes of trajectory data. This is particularly helpful on very large RAID-0 SSD-cached filesystems such as the NFS cache mechanism on the NVIDIA DGX-2.
- Benchmarked VMD startup time on the NVIDIA DGX-2 and annotated the source code with comments about the costs associated with each startup phase. The overall CUDA-associated startup time is roughly 9.7 seconds on the DGX-2 w/ 16 GPUs.
- Added comment about CUDA driver version check calls and their startup perf costs on a DGX-2.
- Replaced a missing conditional macro definition test needed to maintain support for OptiX 4.x on ORNL Titan and NCSA Blue Waters, for the time being.
- Began adaptation of the VMD startup code to add a bunch of new hardware queries about GPU-Host memory coherency, page table access, and wrote an early draft query for determining the system-wide NVLink connectivity and observed topological features such as the number of connected NVLink islands, etc.
- psfgen: As the drude particles are never stored as atoms in the atomArray, it is necessary to make a first pass through all atoms in the residue to update the atomid (additional increment for drude particles on the host) before printing any information to the psf file. This was not being done previously, and it is now fixed. The need to duplicate the atomid update before printing to any file (psf, pdb or namdbin) would be avoided if the write commands somehow communicated and set a global variable when the atomids were first assigned. Not implemented yet. Fix the exclusion declaration, as every particle must have an entry, even drude particles and lone pairs (0 as default).
- psfgen: Fix the units of the colinear lone pairs. Initialize lone pair counter. Fix the line counter of the improper psf section
- psfgen: Fix lone pairs initialization
- Significantly revised the CUDA startup device query code to eliminate ancient #ifdef blocks for CUDA versions 4.0 and below. Added device properties query for single-precision to double-precision perf ratio, which can be useful to know for selection of particular arithmetic strategies at runtime.
- psfgen: Fix to velnamdbin log message by Brian Radak from the NAMD repository: Fix output error when writing velnamdbin. A modest output error was added when psfgen was modified to write velnamdbin files as a secondary option for writenamdbin. The buffer was simply being overwritten by the velnamdbin message rather than first being printed to stdout. This is now fixed. To reiterate, this only affected the message sent to stdout; the actual files were of course being written as intended. I just happened to be checking this carefully when I noticed the message was not appearing.
- psfgen: Re-do the initialization of the atom's lone pair flag and pointer
- psfgen: Implementation of the parsing and assignment of the anisotropy definitions for the drude force field. These definitions are stored at the residue level, as opposed to the storage of bonds (e.g.) at the atom level. Although there is no explicit DELETE ANISOTROPY in CHARMM, every time an atom is deleted from a residue containing anisotropy definitions, all entries are checked. If the deleted atom is declared in an anisotropy definition, this entry is eliminated as well. There is only one exception, which is present in toppar_drude_nucleic_acid_2017c.str, where one DELETE ANISOTROPY is found, but the same entry is re-defined a few lines below this call; it is not clear whether this is a mistake in the str file. When comparing the psf files generated by psfgen and CHARMM, psfgen adds the anisotropy defined in patches at the beginning of the aniso list of the first residue called in the patch, whereas CHARMM places them at the end of the NUMANISO section. Not much effort was put into the evaluation of the quality of the rtf and str files being read, as the drude force field doesn't have a lot of people developing their own topologies. This needs to be worked out. Regarding the drude force field and lone pairs, the read-in of psf/pdb files containing drude particles and lone pairs still needs to be implemented. The molfile plugin cannot write psf files containing drude particles and lone pairs yet.
- psfgen: Fix missing initialization of the numaniso counter (counter of the anisotropy definitions) and the aniso pointer of the residues.
- psfgen: Set the VPBONDS (virtual particle bonds) = 1 by default, so the bonds between the drude particles and their host, and the lone pairs and their host, are printed in the bonds section of the psf file. Fix the traversal of the atomArray array in the tcl_segment to increment the index i instead of incrementing the pointer itself. Added the psfgen_kill_mol function call before returning TCL_ERROR.
- psfgen: Fix the inclusion of psfgen.h so the PSFGENLOGFILE is defined and the messages can be saved to a log file.
- psfgen: Fix the dihedral and improper definitions based on the new data structure. TODO: the communication between psfgen and the molfile plugin is still to be tested and developed. The molfile plugin is still not capable of writing psf files for the drude force field. The atoms section's columns Alpha and Thole are missing, as well as the sections NUMLP and NUMANISO.
- Added support for computing secondary structure using DSSP, either the original DSSP code or the newer descendant "xdssp" tools produced by different authors.
- psfgen: Implementation of hydrogen mass repartition in psfgen (patch by Brian Radak). This command assumes that a segment was already built, either by the segment command or by reading in psf and pdb/namdbin files. The mass of the hydrogen atoms is increased to the target mass (default 3.024), and the mass of the heavy atom is reduced (heavy_atom_mass - hydrogen_mass - target_mass). Hydrogen mass repartition syntax: "hmassrepart dowater/nowater (default nowater) target mass (default 3.024)", e.g. hmassrepart dowater 3.024. Command not documented yet.
- psfgen: Fix the charge of the drude particle host atom
- psfgen: Delete duplicated output line with number of atoms
- psfgen: Initialize lone-pairs and drude fields when reading a psf file. The read-in functions are still not finalized but needed to fix this part to implement the mass repartition sent by Brian Radak
- psfgen: Bug fix in the extraction of dihedral from psf
- psfgen: Initial implementation of lone pairs (lp) and drude particles. Here the lone pairs, as in the topology files (rtf/str), are treated as atoms with an additional structure, topo_mol_lonepair_t. The structure holds the values for distance, angle (for a linear lp, this field is used as the scale), and dihedral, and an array of atoms: the lp's host, to which the lp is bonded, and two other atoms that form the angle and dihedral that allow placing the particle. The linear lp has only one host and one additional atom to define the direction. As the declaration of bonds in lps is optional, psfgen ignores the ones read from the rtf and generates them during the writing process if the command "vpbonds 1" (command not documented yet) is executed before the command writepsf. This same command also prints the bonds between the drude particles and their hosts. If "vpbonds 0" is set, neither drude nor lone pair bonds are printed to the psf file. The drude particles are just stored as additional info (alpha, thole, and type) in the host atom. The particle is generated when writing to the psf file, and it is never stored in the atomArray array. The function to guess the lone pair's coordinates is the same as the one used to guess the atom's coordinates from the IC card of atoms, only the order of the atoms is reversed. This follows the same logic as CHARMM and NAMD (which have the same functions to place particles). The implementation is not finished yet. So far, the !NUMLP and !NUMANISO sections are not being printed, and the read-in of psf and PDB files is not implemented. The read-in will actually be the last one to be implemented, once the data structure changes are stable. The data structure may be changed rather soon to avoid having 2 doubles (alpha and thole) and 1 char* (drude type) that are never used, which is the case for the additive force field. 
The variable lonepairList at the segment level (topo_mol_segment_t struct) will store lp pointers in the segment, in the additive force-field mode (if no drude particle is detected). This is quite useful, as in the additive case we can have millions of atoms and only a few lps. In this case, storing the pointers to these particles makes it faster to address them than searching the atomArray. This is not implemented yet.
- psfgen: These commits start a series of changes that aim to improve psfgen's performance and implement CHARMM lone pairs and drude particles. The data structure of the atoms in the molecule was changed from a linked list to an array (atomArray). Using an array brings a lot of advantages, including accessing an atom by the index at which it appears in the topology, and not having to walk the whole list, comparing atom names, every time an atom is needed. The index is valid as long as the topology is not defining patches and the residue was not changed (e.g., with deletions of a patch). In the cases where the residue was changed, we need to rely on the previous way to find atoms, by searching for the name in the residue. The atoms are still contiguous, without gaps even when an atom is deleted, and the array has a null pointer at the end. When a patch is applied, the final number is tested, and if needed, the size of the array is expanded using realloc. To delete atoms, the pointers of the array are re-arranged to close the gap left by the deleted atom. The dihedral and improper data structures were also changed: instead of saving the same information in all the atoms that compose the dihedral or improper, including the atom used as index 0 of the definition, the atom being defined (e.g. index 0 of the dihedral) is not stored in the variable, and only the rest of the atoms are stored. This speeds up going through the dihedrals and impropers to be written to the psf file. The same changes in the angle data structure didn't return considerable time improvements, so it was left the same. One may want to apply the same logic to all data structures (angle, cmap, and exclusions) in the future. The bonds data structure was kept the same, as the atoms need to be accessible in both orders randomly, which makes having the information about both atoms all the time beneficial. The dihedral detection (auto-dihedral) also changed.
Instead of using the angles to find the dihedrals, the bonds are now used (fewer atoms to check, so faster), and the atom ID is used as in VMD: the bonded atom has to have an ID bigger than the previous one to be considered. To compile psfgen with the new changes, one has to add NEWPSFGEN as a compiler flag. Another flag, NOIO, was added for profiling purposes. Add the NOIO flag to the compiler, and the file printing events will be avoided. Several comments were added throughout the code.
- psfgen: New command to save the psfgen messages to a file (psfgen_logfile ) and close the file (psfgen_logfile close), instead of printing to the console. This command allows several log files to be used in the same psfgen session, but only one file can be open at a time. This might be useful for storing the messages of the different sections of the script in separate files. Command not documented yet.
- changed the MDFF histogram code to use longs to avoid integer overflow issues on large density maps
- qwikmd: Bug fix during the check for spaces in the definition of QM package installation path
- Generalized the naming and description of the secondary structure input generator for Stride since the same approach also works for DSSP.
- Misc cleanup of some ancient and hideous code in the STRIDE interface, so that it doesn't provide a poor example for implementing DSSP and other secondary structure assignment interfaces.
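The gap-closing deletion described above for the psfgen atomArray can be sketched as follows. This is only an illustration in Python, not psfgen's C code: the function names delete_atom and find_atom are hypothetical, atoms are represented by their names, and a trailing None stands in for the null pointer that terminates the array.

```python
def delete_atom(atom_array, index):
    """Remove the atom at `index`, shifting later entries down by one slot.

    `atom_array` is a list of atom names terminated by a single None,
    mimicking a null-terminated pointer array (an assumption of this sketch).
    """
    if atom_array[index] is None:
        raise IndexError("index points at the terminator, not an atom")
    i = index
    while atom_array[i] is not None:
        # Pull the next entry (possibly the terminator) into this slot.
        atom_array[i] = atom_array[i + 1]
        i += 1
    # The terminator was copied down one slot; drop the stale last slot.
    atom_array.pop()


def find_atom(atom_array, name):
    """Fallback lookup by name, for residues whose indices are no longer valid."""
    for i, atom in enumerate(atom_array):
        if atom is None:
            break
        if atom == name:
            return i
    return -1
```

After a deletion the remaining atoms stay contiguous, so direct indexing still works for the entries that follow, while find_atom shows the slower name-based search that is needed once a residue has been modified.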
VMD 1.9.4 alpha 31 (March 14, 2019)
- Replaced the original dome master implementation that was used for the CADENS dome show renderings with the more optimal formulation that resulted from the more thorough mathematical treatment that was done for the Ray Tracing Gems book chapter. This version uses the same variable name nomenclature found in the RTG chapter, and it eliminates a few normalization operations that are avoidable due to the details of the underlying projection arithmetic. For the final CADENS production renderings it was more important to ensure 100% consistency of the image content despite multiple years of development time and revisions to VMD intervening between the earliest test shots and final rendering. Now that the "Birth of Planet Earth" dome masters are finalized, there's no longer a reason to maintain the verbatim original code. Any future CADENS dome renderings will now be done with the latest RTG-descendant code instead.
- Removed the pedagogical variants of the dome master camera projection implementations that were really intended just for the Ray Tracing Gems book chapter. They were included in VMD only to ensure that there were no significant problems in the decomposed and simplified implementations that were included inline within the book chapter. Now that the book has been finalized, there is no longer a reason to maintain these alternative pedagogical implementations within the VMD source code itself, and we can revert to the fully inlined and more mathematically streamlined implementations that have been in use for the CADENS project up to now.
- Tagged the current OptiX renderer code revisions for future reference with respect to the Ray Tracing Gems book chapters.
- Updated the Ray Tracing Gems book references in the OptiX code to point to the Apress web site, and added the final chapter and page numbers.
- Revised comments pertaining to global context-wide RTX initialization for final OptiX 6.0.0 API behavior.
- The rtContext[GS]etStackSize() APIs are not supported when using the RTX execution strategy. Although they don't (yet) return errors in that case, we protect against calling them except when running in a non-RTX mode. Presumably these APIs will be deprecated in future OptiX releases.
- Updated and added details after re-testing the RTX-specific hardware triangle instance list destruction code path with OptiX 6.0 and the 418.30 drivers.
- Updated the VMD OptiX renderer to make use of the new OptiX 6.0.0 APIs required for runtime calculation of stack size required for a given max ray tracing recursion depth. Revised the internal shader state recursion depth and transmission ray depth counter variables to use unsigned types so they are consistent with the new OptiX APIs and the new context-wide recursion limit state. The change from signed to unsigned types for the shader-internal counters comes with slight complexity in handling of transmission ray surface crossing counting since we can no longer blindly decrement the surface crossing counter and compare using less than or equal to 1, since we can now have integer wraparound. Using the expression (new.transcnt = max(1, old.transcnt) - 1) resolves the basic problem albeit with extra complexity. It may be desirable to reformulate the transmission ray surface crossing implementation to count up rather than down, which deviates from the original Tachyon CPU implementation, but is probably the right thing to do in light of the new OptiX 6.0 APIs favoring unsigned types for recursion counts and the like.
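The wraparound hazard with unsigned counters mentioned above can be illustrated with a small sketch. Python integers do not wrap, so 32-bit unsigned arithmetic is emulated here with a mask; the function names are hypothetical, and the 32-bit width is an assumption made for illustration.

```python
UINT32_MAX = 0xFFFFFFFF


def naive_decrement(transcnt):
    """Emulated unsigned 32-bit `transcnt - 1`, wrapping modulo 2**32."""
    return (transcnt - 1) & UINT32_MAX


def clamped_decrement(transcnt):
    """The expression from the entry above: max(1, transcnt) - 1.

    Clamping to 1 before subtracting means the result bottoms out at 0
    instead of wrapping around to a huge positive value.
    """
    return max(1, transcnt) - 1
```

With a counter already at zero, naive_decrement(0) yields 4294967295, which would pass any "greater than 1" style test, whereas clamped_decrement(0) stays at 0.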
VMD 1.9.4 alpha 30 (March 8, 2019)
- mdff: Added basic segmentation interface to MapTools GUI. Since we use the -separate_groups option there to load the segmented maps as subvolumes of the original map, new volid selection menus have been added for both the main map and secondary map (for binary ops). Every command has been updated to use the sub volume volid where applicable.
- mdff: added a save button for the main Map Tools mapmol which calls the new voltool write command to save the map to a file. Removed the auto-save dialog box popup from the commands with output options, since now a user can save the map directly.
- created new method for writing volumetric data to files using molfile plugin interface instead of the hacky code in Volmap. The method is now used by all the voltool commands when the output option is used. The write_file method parses the output file name to determine the extension and load the correct molfile plugin. However, currently only the dx plugin is capable of writing. As writing capability is added to other volumetric data molfile plugins, the existing voltool code should work correctly as is.
- cranked version
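The extension-based plugin selection described for the new volumetric write method in this release can be sketched roughly as below. This is not the VMD implementation: the lookup table, its contents, and the function name pick_volumetric_writer are hypothetical, and the only fact taken from the notes above is that the dx plugin is currently the one able to write.

```python
import os

# Hypothetical table: file extension -> whether that plugin can write maps.
# Only the "dx" entry reflects the notes above; the others are placeholders.
PLUGIN_CAN_WRITE = {"dx": True, "ccp4": False, "situs": False}


def pick_volumetric_writer(filename):
    """Return the plugin key chosen from the file extension, or raise."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if not ext:
        raise ValueError("output file name has no extension")
    if ext not in PLUGIN_CAN_WRITE:
        raise ValueError("no plugin registered for extension: " + ext)
    if not PLUGIN_CAN_WRITE[ext]:
        raise NotImplementedError("plugin cannot write volumetric data: " + ext)
    return ext
```

Keeping the capability check separate from the extension lookup means that a plugin gaining write support only requires a change to the table, which mirrors the intent that the existing voltool code keeps working as plugins add writers.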
VMD 1.9.4 alpha 29 (February 25, 2019)
- Updated all of the VMD OptiX API calls for the final OptiX 6.0.0 API.
- VMD versions built with OptiX 6.0.0 and greater no longer support remote cluster-based rendering since the cluster rendering APIs are deprecated as of OptiX 6.
- OptiX 6.0 has dropped support for the VCA remote device APIs, so they are now only conditionally compiled into VMD based on the OptiX version number in use.
- Added special shader modifications used for the final dome master renderings for the NSF CADENS "Birth of Planet Earth" fulldome movie.
- Changed conditional compilation to require OptiX version 6.0.0 or later for RTX rather than prior versions, since we still have developmental builds with old APIs hanging around on other platforms like ppc64le.
- mdff: added threshold to saved settings and moved all Map Tools settings to their own namespace, since we don't have to worry about saving them because of their interactive nature.
- mdff: added threshold option to the density load box to be used with the griddx command. This is especially useful now that the Map Tools GUI gives an interactive histogram plot, making it easier for threshold determination.
- chirality: fixed possibly long-standing bug where top was being used in an atomselect command instead of the molid variable, resulting in errors if the -mol option was used during restrain
- mdff: fixed incorrectly copied variable name, causing trace headaches
- qwikmd: Bug fix when QwikMD window was closed and an analysis tab is selected
- pdbxplugin: Started rewriting the pdbxplugin low-level parsing routines to allow them to handle the much more general formatting used by the IHM extensions. Began implementing skeletal parsing of IHM cross link and cross link restraint records. The previous code skipped record types that contained string data of various kinds, but this is not really an acceptable situation since it breaks parsing of various types of IHM data, and it would mean that parsing of the more general variants of mmCIF won't work either. A general note about the previous code is that it wasn't written in a very modular or extensible way, so some redesign is going to be required to eliminate the sort of "cut and paste" parser code growth that occurred during early development.
- Began revising the experimental Integrative Hybrid Modeling (IHM) features of the PDBx plugin to facilitate adding support for several more of the IHM record types and to prepare for revisions to the molfile plugin data structures to begin facilitating native handling of IHM data going forward.
- pdbxplugin: Minor revisions for self-consistency, code formatting, and readability.
- cranked version
VMD 1.9.4 alpha 27 (February 8, 2019)
- pdbxplugin: Changed default behavior of the PDBx plugin to assign the PDBx type field to the VMD atom name field so we get CA atoms and others labelled as VMD expects them to be per the original PDB file format and variants thereof.
- dcdplugin: Allow the DCD plugin to optionally skip checking the Fortran record lengths against the system size, to permit easy recovery of files that only have a single corrupted record length value. This is now unified with the existing code to disable record length checks for systems with more than 2^30 atoms for similar reasons, to allow user-disabling of checking via the VMDDCDNOCHECKRECLEN environment variable.
- Added lmplugin and orcaplugin to the optional plugin compilation list. In principle, the ORCA plugin should be able to be enabled by default after further testing/revisions have made it bulletproof. The Lattice Microbes plugin depends on HDF5 and, transitively, on some compression libraries, so it can't be compiled unless those dependencies are compiled and available.
- Applied patch from Robin Betz to correct mismatched calloc()/delete [] with a corrected call to free().
- Don't attempt to free the FileSpec volume setids list if it's not allocated.
- Patch from Robin Betz prevents leak of per-node hardware stat/info records during call to vmd_mpi_nodescan().
- qwikmd: Prevent errors when no protein, nucleic or glycan molecules are present during structure check
- updated "mdffi sim" to load density to new molecule and make output to file optional
- Added new hist and info commands to Voltool: "hist" calculates a histogram of the density map and returns a list of bin midpoint and count pairs, "info" returns the specified information about the density map (gridsizes, origin, minmax).
- orcaplugin: continued correcting ORCA plugin helper functions that had the wrong linkage scoping.
- orcaplugin: Continued rewrite of ORCA plugin to address portability and robustness issues. Corrected uninitialized state flag used to track whether the ORCA log files were in units of Angstroms or Bohr.
- cranked version
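The Voltool "hist" output described in this release, a list of bin midpoint and count pairs, can be sketched as follows. The binning convention (equal-width bins spanning the data min to max, with the maximum value placed in the last bin) is an assumption of this sketch and not taken from the Voltool source; the function name is hypothetical.

```python
def density_histogram(values, nbins):
    """Return [(bin_midpoint, count), ...] for equal-width bins over the data."""
    lo, hi = min(values), max(values)
    # Guard against a constant map, where hi == lo would give zero width.
    width = (hi - lo) / nbins if hi > lo else 1.0
    counts = [0] * nbins
    for v in values:
        b = int((v - lo) / width)
        if b >= nbins:
            b = nbins - 1  # the maximum value falls into the last bin
        counts[b] += 1
    return [(lo + (i + 0.5) * width, c) for i, c in enumerate(counts)]
```

For example, density_histogram([0.0, 1.0, 2.0, 3.0, 4.0], 2) gives [(1.0, 2), (3.0, 3)].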
VMD 1.9.4 alpha 26 (January 29, 2019)
- Began updating VMD profiling infrastructure for NVTX V3 APIs in CUDA 10 and later.
- Added compilation rules for NVTX V3 profiling APIs in CUDA >= 10
- Added new mask command in Voltool, which masks a map around an atom selection. This command combines the previous two-step process between volmap and volutil into a single command that is executed entirely in memory and foregoes file i/o.
- Updated the UIVR tool methods to improve atom selection traversal with the use of firstsel/lastsel
- Updated the Python atom selection interface to make use of firstsel/lastsel to improve atom selection traversal performance.
- Revised color lookup update routine to exploit firstsel/lastsel to improve atom selection traversal performance.
- Accelerate atom selection traversal using sel->firstsel and sel->lastsel to split traversal loops into three separate stages, eliminating all conditional assignments and branching from the beginning and end loops.
- Applied Giacomo Fiorin's patch to correct the behavior of the volmap commands that accept weights, when processing MD trajectories. Previously, the code worked correctly only by coincidence, as there were no safety checks to determine if the number of weights could vary due to selections having dependencies on time-varying properties. Discussing with Giacomo, the most logical approach was to develop a new scheme for handling this case rather than maintaining the old approach that used the tc_get_weights() routine in an unsafe manner.
- Removed old Voltool code (unary ops) that was moved into VolumetricData and properly integrated. The old Voltool code was no longer in use anywhere.
- Modified new histogram command to use voltool and fixed bug with multiplot x axis. Changed every remaining use of volutil to voltool. Removed -deprecate options for sim and cc. mdff plugin is now entirely independent from volutil.
- cranked version
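The three-stage traversal enabled by firstsel/lastsel in this release can be sketched as below. This is an illustration in Python rather than the VMD C++ code; the function names are hypothetical, and the convention used for an empty selection (firstsel = n, lastsel = n - 1) is an assumption of this sketch.

```python
def selection_bounds(on):
    """Return (firstsel, lastsel): indices of the first and last selected atoms."""
    n = len(on)
    first = 0
    while first < n and not on[first]:
        first += 1
    if first == n:
        return n, n - 1  # empty selection (sketch convention)
    last = n - 1
    while not on[last]:
        last -= 1
    return first, last


def fill_selected(on, firstsel, lastsel, value, default):
    """Write `value` for selected atoms and `default` for all others."""
    n = len(on)
    out = [None] * n
    # Stage 1: everything before firstsel is unselected, so no flag test.
    for i in range(0, firstsel):
        out[i] = default
    # Stage 2: only this middle range needs a per-atom test of the flag.
    for i in range(firstsel, lastsel + 1):
        out[i] = value if on[i] else default
    # Stage 3: everything after lastsel is unselected, so no flag test.
    for i in range(lastsel + 1, n):
        out[i] = default
    return out
```

The benefit grows when a selection covers a small contiguous region of a large structure, since the first and last loops then do most of the work without any branching.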
VMD 1.9.4 alpha 25 (January 15, 2019)
- psipred plugin: Updated comments to clarify that the Psipred plugin has been successfully tested with Psipred versions 3.21 and 4.02 after the latest update.
- psipred plugin: Applied Thomas Albers' diff to eliminate references to a fourth set of weights. The first pass uses as input parameters the scoring matrix and three weights, but multiseq (in psipred::calculateSecondaryStructure) hands it four. Elsewhere (in psipred::checkPackageConfiguration) Multiseq explicitly checks for the fourth weight. With the change to eliminate the fourth set of weights, Psipred-3.21 works as expected.
- Allow compilation of the ORCA plugin on Linux and MacOS for testing of recent revisions to make it more C++03 friendly.
- Added HDF5 build configs for LINUXAMD64 target. Revised HDF5 linkage parameter passing.
- lmplugin: Added build rules for the latest Lattice Microbes plugin. Since it depends on the HDF5 libraries, it will get compiled whenever the HDF5 build variables are set, but not otherwise, just like the plugins that have dependencies on NetCDF, Tcl, Expat, etc...
- orcaplugin: Added build rules for the Orca plugin. It won't get added to the standard list of dependencies until the last portability issues have been resolved.
- lmplugin: Tyler's heavily revised/updated version of the Lattice Microbes plugin with reduced external library dependencies, misc refactoring, corrected RDME particle positioning, and better error handling.
- orcaplugin: Pulled in Max's matrix class for the time being, and started dealing with all of the compilation problems related to use of particular C++11 features that are not universally available.
- Misc cleanup of qmplugin.h to allow clean inclusion by C++ plugins like the Orca plugin.
- fftk plugin: updated a few variable names to use the Configuration namespace.
- Ensure that the OptiXRenderer::render_to_videostream() method always gets compiled, so that even headless VMD builds for server hardware can do video streaming w/ interactive ray tracing.
- Did some overdue cleanup on various "measure" routines.
- Pulled in the latest colvars implementation.
- Eliminated a local variable declaration that inhibited the proper handling of server socket management. Misc cleanup.
- Eliminate unused parameter to please Solaris compilers.
- Overcome template instantiation limitations associated with macros on older C++ compilers.
- prefer truncation rather than case-specific rounding until we get past old C++ compiler limitations with templates.
- Eliminated non-portable C++ code using runtime-sized stack-allocated arrays.
- Eliminated extra local cmdSphere object.
- Updated the dome master camera to allow elevation modulated stereo eye separation akin to what I implemented previously in the omnidirectional stereoscopic projection camera.
- updated voxel_coord routine to use member variables instead of passed in VolumetricData
- Updated list of published refs to include the VR Developer Gems chapter
- Tiny dome camera readability tweak, and added VR developer gems reference
- Updated all of the OptiX renderer src files to refer to the new Ray Tracing Gems chapters that describe various of the implementation details herein.
- Revised the camera implementation for the dome master projection to eliminate unnecessary normalization operations, improve the DoF circle of confusion disk basis vector calculation, and rename all of the variables to follow the new naming conventions I established for the Ray Tracing Gems book chapter.
- Don't store return code from the pagealigned size plugin APIs since we don't yet have a well-defined scenario where it's ever going to be non-zero.
- commented out videostream cmdQueue until it gets used by text commands that need to be queued and logged, etc.
- Added an #ifdef and workaround for broken Clang SSE intrinsics on MacOS X
- Added Siggraph/SC'18 demo code for alternative window titles when RTX is on.
- readcharmmtop: Parsing the charge penalties from stream files coming from CGenFF server
VMD 1.9.4 alpha 24 (December 1, 2018)
- Added ref to DECH2018 in NanoShaper text.
- Added DECH2018 reference so we can refer to it in the Nanoshaper docs
- Updated copyright dates, NIH grant number formatting, and added Ryan McGreevy to the list of authors.
- mdff: Updated griddx command to use new voltool pot command to generate potentials in a single step, speeding up the potential generation especially for large maps.
- Corrected an internal state change in the code to allow RTX mode to be disabled when the OptiX context is created.
- Clear video frame pending flag by default so we don't pull a zero-sized frame when running in pure-text mode with no attached GLX/EGL framebuffer.
- Added videostream text commands to set the target "bitrate" and "framerate".
- Added handling of on-the-fly reconfiguration of video streaming codec parameters on both client and server side.
- Cranked version number.
VMD 1.9.4 alpha 23 (November 6, 2018)
- Updated the Multiseq help URLs to point to Zan's new lab page URLs.
- Changed CUDA builds to use CUDA 9.0 so that they support NGC containers.
- Changed the video streaming implementation to push OpenGL frames out at GLX/EGL front/back buffer swap time rather than pulling them in at an unspecified future time during event loop processing.
- Set the last_xxx timers upon establishing video streaming connections to eliminate spurious warnings about heartbeat timeouts.
- Revised the video streaming constructor to explicitly clear internal "pending frame" state variables. Added several new safety checks in the video encoder chain to ensure that any bogus inputs can't make it to back-end encoders, since their default behavior is sometimes implemented such that they terminate the entire process by calling exit() rather than cleaning up error state. Added a first round of code to set last_xxx timer values during setup to prevent spurious console warnings about heartbeat timeouts; still need to update the connection setup code to do the same.
- Improved remote streaming console startup messages
- corrected indentation following an if/else construct that could have been confusing previously.
- Ensure that the videostream server exits the interactive ray tracing loop if we get a dropped client connection while it is running.
- Address a compiler warning about a misleading indentation in the context of an if/else.
- Added basic usage information for voltool commands
- Updated TclVoltool methods to call the improved unary ops in VolumetricData. Cleaned up some of the usage information for understandability. Added new mdff_potential command for creating MDFF potentials in one iteration for improved performance, especially on larger maps.
- Fixed binary ops to use gridsize() and longs for indexing of multi-billion voxel tomograms. Added new code for creating new Molecules and loading new VolumetricData into them, used by the binary ops.
- fftk: added NewDefinition flag to Geom keyword for Gaussian, which is needed to delete old ICs before setting new ones
- Updated the video streaming implementation to forward all client-side key events from the local windowing system to the remote server, for remote interpretation. This has the advantage of maintaining support for user-defined key macros and other general VMD UI functionality. To be complete we still have to add pick event handling.
- Completed initial keyboard event forwarding implementation for remote video streaming of interactive OptiX renderings.
- Added high-level support for remote video streaming into FileRenderList and OptiXDisplayDevice, which launch the appropriate back-end OptixRenderer methods depending on whether or not the active DisplayDevice subclasses support a GUI or are using off-screen rendering. This works for both compile-time off-screen-only builds such as EGL, as well as builds that can be run both with a GUI and in off-screen mode, such as OpenGL Pbuffers.
- Added video streaming-specific OptiXRenderer method that has no GL drawing or other distractions, intended for remote viz from supercomputers, clusters, or other "big iron" back-end visualization servers.
- Negated video streaming UI scene transformation events since within OptiXRenderer they are applied to the camera rather than onto the scene/objects.
- Added remote rendering UI event handling in the core OptiXRenderer loops.
- Implemented new methods to allow server-side polling of incoming UI events within the innermost OptiXRenderer event loops and for use in MPI-based distributed memory time series visualizations.
- Eliminated old anim_interp ray tracing state variables, and protected the definition of the hwtri_enabled flag with conditional compilation tests for RT_USERTXAPIS.
- commented out [under|un]utilized video streaming header-related helper fctns
- GCC appears to handle _mm_set_pd1() okay, so we enable it too.
- Protected SSE code path with a test on defined(__INTEL_COMPILER) since Clang is broken with respect to SIMD here.
- not terribly interested in tracking total network bytes sent for now.
- Eliminated some dead code remnants from VolumetricData
- Added both push, and poll-pull frame sending modes into video stream implementation while experimenting with latency, etc.
- Mouse auto-rotation is disabled when running in video streaming client mode.
- When video streaming is active in client mode, the VMDApp methods for rotating, translating, and scaling result in messages being sent to the remote video stream server for interpretation and injection into the VMD mouse operations so that all of the standard mouse interactions behave as expected.
- Ensure that glViewport() gets called when we bypass the normal event handling loops, otherwise the coordinate system used for video streaming can get out-of-sync with the window resize operations.
- prevent excessive videostream console mesg tests
- Added extra error checking in the video streaming encoder simulator and corrected the order of operations so that the remaining network payload is zeroed out only after the frame decoding process is complete.
- Eliminated old hard-coded video stream resolution initialization
- Updated pending frame API for const correctness in the OptiX "push" case, and misc other cleanup.
- modified the constructors for OptiXRenderer to require a VMDApp pointer needed for use by the video streaming implementation.
- Revised video stream pending frame API to allow both push/pull type stream drive+encode.
- Added special case client-side video stream event handling bypass for local display updates.
- Encapsulated video streaming encoder bookkeeping in the routine for sending frames so that we can use it both in a frame "pull" mode with OpenGL, and in frame "push" mode coming from OptiX. Added an optional implementation that makes use of persistent fixed-size memory buffers for both the compressed and uncompressed RGBA buffers used in the client and server. Disabled enforcement of exact codec-block-multiple image sizes and added comments about the use of direct calls into the DisplayDevice classes for handling resize events vs. the use of the normal VMDApp APIs which can lead to excessive recursion.
- Added a bit more paranoid state save/restore and testing for bugs that cropped up during display within DCV during testing.
- Added texture map based image display code for use by the video streaming implementation.
- added prepare3D() call for video streaming draw handler
- videostream: corrected width/height mismatch in resize request handling
- Added a drawpixels_rgba4u() method to the DisplayDevice classes, for short-term use by the video streaming implementation. Reduced console output during video streaming, and addressed a few small flaws in the video encoder-specific wrapper functions.
- Improved the modularity of the video encoder/decoder implementation in the video streaming system. All of the NvPipe-specific encoder components are now wrapped in their own abstractions. Implemented a "simulated" video encoder and associated abstraction functions that essentially operate as a passthrough presently. With some further work this could be improved to encompass the most common H.264/H.265 preprocessing stages such as colorspace conversion, and 4:4:4 to 4:2:0 chrominance downsampling.
- Significantly reworked the way VolumetricData uses and exposes its internal gradient map so that it can be invalidated and recomputed on-demand when any of the unary map processing operators are applied to an existing volume. The new code no longer has to recompute the gradient as part of the map operations, and instead it uses an invalidation scheme in the same way that we do for min/max/mean/sigma values.
- Corrected cached_min/cached_max remapping code for cases when the scaleby method is called with a negative scale factor and the min/max values need to be swapped rather than merely scaled.
- Further scoping changes to please XLC on Power9
- moved the video stream encoder reconfiguration minutiae into a helper function to resolve compilation problems with XLC on Power9 platforms, where the optimizer would otherwise get upset about transient variable initialization within one of the switch cases.
- Implemented a max-frame-rate throttling mechanism into the video streaming event loop so that the video encoder loop never tries to push more than a max number of frames per second, even if the bitrate is low enough and the network bandwidth is high enough to allow it. The frame rate throttle will be particularly helpful to prevent server-side network overflows on unreliable long-haul networks. Added code to calculate and report running averages of frame rates, compression ratios, and other video streaming statistics.
- Protect the "videostream" commands with an up-front test to see if the VideoStream UIObject exists or not. If the UIObject hasn't been created, then all of the "videostream" commands are unavailable. This could happen both due to conditional compilation, and also in cases where we might have an initialization failure in a particular video encoder back-end implementation. We'll need to add additional safety checks to block scenarios where the back-end video codecs might fail in some other unrecoverable way.
- Implemented a second RGBA format readpixels() method for the DisplayDevice classes, required for writing alpha-channel output to file formats like PNG, and for use in testing video streaming encoders that require 32-bit per pixel RGBA image structures.
- Added necessary event handling plumbing so that the main VMD event loop informs the video streaming class when new OpenGL frames are pending compression and transmission. The client side implementation sends window resize events to the server, and the server destroys and recreates the encoder instance with new video parameters, and sends the necessary resize/updates to the OpenGL window.
- Implemented reverse-direction videostream connections, where the client is the listener and the server creates an outbound connection. This is required in many cases to overcome firewall limitations at supercomputer sites.
- Updated the video streaming implementation to allow the client to dynamically resize the server's video stream, restarting the encoder with new parameters (and I-frames) as-needed.
- Updated the video stream network code to gracefully handle unexpected disconnects at both endpoints.
- When video streaming is active (in either server or client mode), the check_event() method calls app->background_processing_set() to force the main VMD event loop to run at maximum performance without any millisleep calls or other schemes to moderate CPU usage. This ensures minimum latency to pump network sockets, drive the video encode/decode streams, and so on. Reduced console output and made network timeout status checks much less aggressive for now.
- Completed heartbeat/keepalive implementation and handling loop for the video streaming "incoming server, outgoing client" case.
- Corrected use of the NvPipe_Format enumeration in light of the fact that it is generated differently on ppc64le targets vs. x86 targets, even though both platforms are little-endian byte ordering.
- Added in skeletal video streaming implementation
- Added a hand-coded AVX/AVX2 loop for minmaxmean_1fv_aligned()
- Completed testing of hand-coded SIMD implementation of single-pass minmaxmean_1fv_aligned() routine that computes the min/max/mean through an array of values using CPU SIMD vector instructions in a single pass. Adapted the VolumetricData class to invoke the single-pass min/max/mean in any case where both the extrema values and mean are all invalid at the same time and one or the other are needed.
- dipwatch: Axel's updated version of the dipole watcher plugin to facilitate optionally updating atom selections at each new timestep
- Revised the VolumetricData class to prevent direct access to the datamin/datamax member variables, in favor of the same on-demand caching scheme used for computation of mean and standard deviation. This eliminates the necessity to synchronously recompute min/max values after the voxel data has been modified. A significant outcome of this change is that the datarange() method for referencing the min/max voxel values has to be non-const, since the call to obtain the min/max values may trigger the actual min/max calculation itself. Callers throughout VMD that previously called get_volume_data() now need a writeable object in order to be able to call the datarange() method.
- Added an alternative ray-sphere intersection approach based on the Hearn-Baker technique that optimizes accuracy for the case of spheres that are small relative to the distance from the ray origin.
- Updated comments to dome master code path and swapped left/right eye stacking to match the more prevalent left-eye-on-top format used by YouTube, Vimeo, etc.
- Since the RTX-enabled versions of OptiX won't be generally available until 2019, the VMD 1.9.4 release is reverted to using OptiX 5.1. We will choose either CUDA 9.2 or CUDA 10.0 depending on how the current driver series behaves with NICE DCV. Previous tests with the 41x driver series encountered incompatibilities with DCV, so we wouldn't want the VMD-required driver version to become an impediment to use of DCV.
- Cranked version number.
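The on-demand min/max/mean caching described in the VolumetricData entries above can be illustrated with a small sketch. This is not the VMD implementation (which is C++ and SIMD-accelerated); the class name `VolumeStats`, the `passes` counter, and the tuple cache are hypothetical and exist only to show the idea: one traversal fills all three statistics, any voxel modification invalidates them, and a query recomputes them only when needed.

```python
import math


class VolumeStats:
    """Toy stand-in for a density map with lazily cached min/max/mean.

    Hypothetical illustration only; names do not match VMD's classes.
    """

    def __init__(self, voxels):
        self._voxels = list(voxels)
        if not self._voxels:
            raise ValueError("need at least one voxel")
        self._cache = None  # None means "invalid, recompute on demand"
        self.passes = 0     # number of full O(N) traversals performed

    def _minmaxmean(self):
        # Recompute only when the cache has been invalidated.
        if self._cache is None:
            lo, hi, total = math.inf, -math.inf, 0.0
            # One traversal tracks min, max, and the running sum together.
            for v in self._voxels:
                if v < lo:
                    lo = v
                if v > hi:
                    hi = v
                total += v
            self.passes += 1
            self._cache = (lo, hi, total / len(self._voxels))
        return self._cache

    def datarange(self):
        # Not a pure read: the first call after a change fills the cache.
        lo, hi, _ = self._minmaxmean()
        return lo, hi

    def mean(self):
        return self._minmaxmean()[2]

    def scale_by(self, factor):
        # Modifying voxels only invalidates the cache; nothing is recomputed
        # until a caller asks for the statistics again.
        self._voxels = [v * factor for v in self._voxels]
        self._cache = None
```

Because `datarange()` may fill the cache, it mutates the object, which is the same reason the changelog gives for the C++ method having to be non-const.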
VMD 1.9.4 alpha 22 (October 17, 2018)
- Updated Summit builds to use CUDA 9.2.148
- Corrected the non-SIMD code path for minmaxmean_1fv_aligned()
- colvars updates.
- Only reallocate the gradient map array when we actually change the map dimensions rather than just the contents of the voxels themselves
- Revised unary operator methods in VolumetricData to ensure that we destroy and recreate the volume gradient anytime the underlying voxel data has been modified. There are a few cases where we should be able to scale the existing gradients rather than recomputing them from scratch, but this change is a good starting point.
- Added minmaxmean_1fv_aligned() for use by VolumetricData class.
- Largely rewrote both the downsample and supersample methods that were recently pulled into the VolumetricData class from code that originated in the volutils plugin. The original implementations were not performant and didn't properly handle indexing arithmetic as required for multi-billion voxel maps. The original code was traversing memory with non-unit stride due to incorrect loop nesting order.
- Rewrote VolumetricData::pad() to correct the nesting order of copy loops to generate consecutive memory accesses and corrected indexing arithmetic to allow handling of multi-billion voxel density maps
- Rewrote the VolumetricData::binmask() method to accept a caller-provided threshold, with a default value of zero. Corrected the indexing math for gigavoxel maps and eliminated the spurious clamp() call in the old implementation. This implementation does not preserve NaNs.
- Rewrote the sigma_scale() method to use the cached mean and sigma values and replaced division by multiplication in the performance critical loop.
- Added methods for computing and caching the values of the mean and standard deviation (sigma) for a map, along with methods for invalidating them and forcing recomputation on-demand when any of the voxel data is changed.
- Added a private compute_mean() method for VolumetricData to use internally. This does the full O(N) mean calculation reading all of the voxels. In general we will plan to avoid this by maintaining a precomputed mean value available both within and via public methods that are O(1). The idea will be to prevent callers from needlessly recomputing the quantity, and performing the costly O(N) calculation only when absolutely necessary. The current implementation doesn't do anything special to improve accuracy or to prevent NaN-valued voxels from causing problems. Subsequent revisions will have to add handling of problematic cases.
- Updated the VolumetricData methods clamp(), scale_by(), scalar_add(), and rescale_voxel_value_range() to exploit algebraic knowledge to recompute datamin/datamax values rather than using brute force calls to minmax_1fv_aligned(). Added notes to the pad() method on specific cases where we can teach it to use fast algebraic updates rather than having to call minmax_1fv_aligned() (e.g. when padding with zeros, rather than cropping). Rewrote the arithmetic in the main loop in rescale_voxel_value_range() to avoid redundant operations and to replace division with multiplication by a precomputed ratio.
- First round of migration of unary operator methods from Voltool into VolumetricData itself. Several of the new methods absorbed from Voltool need to be revised to be able to handle multi-billion voxel maps properly (avoiding integer index wraparound). Several of the methods currently recompute datamin/datamax by brute force rather than exploiting algebraic knowledge of the transformation being performed, so these can be made much more performant.
- Revised CUDA radix sorting API to add support for caller-supplied persistent key/value work area arrays to eliminate the need for transient allocations on-the-fly. With some additional minor revisions to the API and its callers, this will eventually enable CUB to be able to use the double-buffered sort methods at minimum overhead.
- Revised the CUDA radix sort API to permit the caller to specify optional minimum and maximum key values to use in computing tighter bounds on the number of bit columns that the back-end radix sort has to operate on. One of the benefits of the bitwise nature of the algorithm stages in a radix sort is that we can trim the range of bits to only those that are used within the unsigned input values. This reduces the number of radix sort stages, thereby improving performance. Since this implementation works on unsigned integral key types, we can determine the starting and ending bit positions to sort on if we know the range of the input values. If we don't know the precise range of input values, we can still benefit from any upper or lower bound value. In the worst case we use all of the bits in the key type, and in that case it is helpful to use a narrow-bit-width key type.
- Added API support for persistent memory allocations used by CUDA Key-Value pair sorting in device-side memory buffers. Added draft implementation for CUB as well as Thrust.
- Began preparing the top level QuickSurf code to be able to exploit persistent sort/scan work areas across calls to CUDAQuickSurf::calc_surf() to achieve higher performance, particularly when CUB is used for the back-end scan and sort implementations. Refactored redundant CUDA device index and compute capability checking out of CUDAQuickSurf::calc_surf(), moving the hardware queries into the constructor, leaving only the very simplest tests in the calc_surf() method. Removed support for SM 1.x from CUDAQuickSurf. We now support only GPUs with compute capability 2.x or higher (3-D grids).
- Updated to CUDA 10.0 and added CUDASort to the build.
- Migrated the low-level sorting ops out of CUDASpatialSearch into CUDASort to enable a higher level of abstraction to be used to facilitate using CUB, Thrust, or any other back-end sorting implementation without changes to the caller, while also creating the opportunity to reuse persistent memory allocations, thereby increasing performance.
- Revised the CUDA parallel prefix sum (scan) API and implementations to ensure support for exclusive scans on vector types like uint2 with both Thrust and CUB, as needed for Marching Cubes. The new APIs enable the use of persistent temporary work buffers among many back-to-back prefix sum calls, improving performance significantly vs. the original Thrust implementation that does internal temporary GPU memory allocations on-the-fly.
- Updated comments about Thrust versions included with CUDA 9.0 and later still being broken with respect to vector types such as uint2.
- dcdplugin: Revised the DCD plugin to allow reading trajectories containing more than 2^30 atoms, which requires ignoring the Fortran I/O markers that are only 32-bits in size, and therefore contain I/O size values that are either truncated or wrapped around. Revised some of the arithmetic associated with fixed and unfixed atom memory allocations to use long integer types to avoid integer wraparound.
- dcdplugin: Corrected console diagnostic message and eliminated overly broad scoping of "readlen" variable in favor of direct numeric comparison of bytes read from low level readv() call.
- utilities: Extend the search of dihedral angles to the combinatorial number of possible angles formed by more than two atoms bonded to a central atom. Addition of the proc to calculate the factorial number necessary to support the dihedral search.
- qwikmd: corrected handling of temporary directory environment variables on the Windows platform.
- Enable OptiX RTX support by default, except if the user sets an environment variable VMDOPTIXNORTX. RTX mode is only supported on Maxwell and later GPUs, so further testing is required here.
- Added new "flag0" through "flag7" per-atom bit-fields specifically for efficient storage of per-atom true/false values, pre-computed compound selections, and similar, at minimum memory cost. The current implementation stores up to 8 flags per atom in unsigned char types, but we could trivially increase that. We might also consider permitting the user to access the complete set of per-atom bit fields as integers or with bitwise arithmetic operations to facilitate rapid selection of atoms meeting multiple bitwise criteria. The per-field binding functions required by the existing API are burdensome and demonstrate that it is time to consider revising the internal atom selection keyword API to add integer parameters to be provided to callback functions, so that only a single pair of function bindings is required, rather than one per-field-instance. Such a change would be most helpful for the bit field case, but the user fields and others might also benefit from such an API change.
- Updated collective variables documentation.
- colvars: Updated the collective variables module to the latest version.
- Updated the Voltools.h header to correct comments and enforce proper doxygen comment formatting.
- Corrected VMD OptiX version test macros for version numbers 4.0.0 and greater, where a new numeric encoding scheme is used. The encoding for OptiX version numbers prior to 4.0.0 is: major*1000 + minor*10 + micro. For OptiX versions 4.0.0 and higher, the encoding is: major*10000 + minor*100 + micro. For example, for version 3.5.1 the encoding yields 3051, and for version 4.5.1 it yields 40501.
- Added a comment about the OSPRay path tracer's support for "Tf" filtered transparency. The SciVis renderer ignores this presently, so we don't try to use it yet.
- Updated to OSPRay 1.7.0.
- mdff plugin: Updated all relevant MDFF commands to use new Voltool commands instead of volutil internally. Front end usage remains unchanged
- Added Tcl bindings for the new volumetric processing routines.
- Added volumetric processing routines required by the new MDFF related routines for performing alignment and cross correlations on cryo-EM density maps.
- Added new methods for VolumetricData that explicitly handle cases where we may request out-of-bounds voxels, which can occur for example when computing cross correlations of aligned partially overlapping density maps, when resampling from one map's coordinate system to another, and so on. We should consider improving the nomenclature we use for these new methods, currently voxel_value_from_coord_safe() and voxel_value_interpolate_from_coord_safe(), to clarify their explicit handling of out-of-bounds voxels, and we may also want to adopt functionality that is more directly analogous to what GPU texture mapping units provide with handling for infinite periodic tiling along each of the coordinate axes, and to permit the caller to specify what value to use for out-of-bounds references rather than hard-coding this into the method.
- dcdplugin: Cranked DCD plugin minor version in light of periodic cell handling changes.
- dcdplugin: DCD plugin will now express non-periodic unit cells with cell side lengths of zero. In the distant past there was some sort of issue with PBC side lengths of zero, but this must be handled properly by now or fixed if not, since we will eventually also need to support partially-periodic unit cells arising from the newer revs of NAMD's MSM.
- Added double-precision variants of various vector fctns needed to manipulate density map basis vectors in full precision.
- Added conditional compilation protection for half-precision arithmetic in the image segmentation routines to prevent compilation problems on CUDA 7.5 and older versions still in the field.
- Protect one of the RTX-specific triangle mesh member functions with conditional compilation tests.
- Sync up compile-time OptiX RTX macros with Tachyon GPU back-end
- Sync up OptiX ray statistics buffer macros with Tachyon GPU back-end
- fftk: fixed a potential memory leak in bond/angle optimization
- Cranked version number.
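The two OptiX version encodings quoted in the entry above can be checked with a short sketch. The function names are hypothetical (VMD uses C preprocessor macros for this test), and the decoder assumes that any code below 10000 uses the pre-4.0.0 scheme, which holds as long as pre-4.0.0 minor numbers stay below 100 and micro numbers below 10.

```python
def encode_optix_version(major, minor, micro):
    """Pack a version triple using the scheme that applies to its major version."""
    if major < 4:
        # Scheme for versions prior to 4.0.0.
        return major * 1000 + minor * 10 + micro
    # Scheme for versions 4.0.0 and higher.
    return major * 10000 + minor * 100 + micro


def decode_optix_version(code):
    """Unpack an encoded version back into (major, minor, micro).

    Assumption: codes below 10000 were produced by the pre-4.0.0 scheme.
    """
    if code < 10000:
        return code // 1000, (code % 1000) // 10, code % 10
    return code // 10000, (code % 10000) // 100, code % 100
```

With these definitions, `encode_optix_version(3, 5, 1)` gives 3051 and `encode_optix_version(4, 5, 1)` gives 40501, matching the examples in the changelog entry. A version test that compares raw codes has to account for the scheme change, since the two encodings use different scale factors.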
VMD 1.9.4 alpha 21 (August 16, 2018)
- First implementation of OptiX support for new NVIDIA "RTX" GPUs with hardware-accelerated BVH traversal and ray-triangle intersection.
- Compile against OptiX 5.2DEV or later for OptiX RTX development and testing
- Added code for runtime control over OptiX 5.2 execution strategy.
- Added conditional compilation of extra OptiX 5.x runtime error handling due to restructuring of the way OptiX functionality is distributed among shared libraries and new runtime driver loading stages that didn't previously exist.
- solvateplugin: Improve numerical behavior of the solvate overlap check for custom solvents.
- Replaced deprecated calls to ospNewLight() with ospNewLight2() starting with OSPRay 1.6.x
- Added #ifdefs for OSPRay APIs deprecated by later revisions.
- Updated to OSPRay 1.6.1.
- Added NanoShaper to the main VMD docs, updated discussion of molecular surface representation methods to include basic guidance to direct users to use the right tool for the right situation. Further revision will be necessary to capture all of the details of what is now possible.
- Started web site updates of VMD 1.9.4 documentation required for release
- Updated compilation documentation.
- Cranked version number.
VMD 1.9.4 alpha 20 (July 15, 2018)
- torsionplot: Prevent the use of torsion plot beyond 5000 residues for performance reasons.
- Added ZLIB flags for x86_64 builds
- Updated colvars module to 2018-07-02
- Cranked version number.
VMD 1.9.4 alpha 19 (June 8, 2018)
- Added comments explaining a way to init watershed that minimizes the max group number, and refactored instantiation of templates in GaussianBlur
- Modify CPU thread affinity handling for ORNL Summit, so that VMD behaves better with the IBM 'jsrun' job launcher and its external assignment of thread affinity. The new implementation will select affinities for worker threads using the incoming list obtained when the main VMD process/thread first starts. There's still an issue with using the physical CPU count to spawn worker threads rather than the count of CPU cores that are accessible by the initial CPU affinity list obtained during startup. We should be able to revise the thread management code to do a little better job for us in that respect, although at present this is a somewhat Linux-specific detail.
- Return the newly created molecule ID when producing a new molecule using the new "mol fromsels" command.
- Use double precision for measure_center() to prevent problems with operations on all-atom selections in billion atom systems. We could instead do short sub-summations in single-precision, periodically summing into a double-precision accumulator only every few thousand iterations, which would maintain the speed of the original loop on hardware with poor double-precision conversion/summation performance, but the loop conditions would get more complex, so it would need benchmarking on multiple processor types before we would be sure to get any speed benefit out of it. For the time being this should help address destructive floating point cancellation with huge selections, and we can revisit performance later.
- Cranked version number.
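The accumulation trade-off discussed in the measure_center() entry above can be sketched as follows. This is an illustration in Python, not the VMD code: single precision is emulated by rounding through `struct`, and the helper names and the block size of 1024 are assumptions chosen for the example.

```python
import struct


def f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_single(values):
    """Accumulate entirely in emulated single precision."""
    acc = 0.0
    for v in values:
        acc = f32(acc + f32(v))
    return acc


def sum_double(values):
    """Accumulate in double precision, as the revised measure_center() does."""
    acc = 0.0
    for v in values:
        acc += v
    return acc


def sum_blocked(values, block=1024):
    """Short single-precision sub-sums, folded into a double accumulator.

    This is the alternative the changelog mentions but does not adopt;
    the block size here is an arbitrary illustrative choice.
    """
    total = 0.0
    for start in range(0, len(values), block):
        partial = 0.0
        for v in values[start:start + block]:
            partial = f32(partial + f32(v))
        total += partial
    return total
```

For 100,000 copies of the coordinate 1000.5, the double and blocked sums both reproduce the exact total, while the pure single-precision sum drifts once the running total outgrows the 24-bit significand. The blocked variant stays accurate only while each sub-sum remains small enough for single precision, which is why the block length matters.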
VMD 1.9.4 alpha 18 (May 4, 2018)
- Enable the new LatticeCubes representation GUI controls and graphical representation text commands by default now that it has been tested by the ZLS group with the latest versions of Lattice Microbes.
- Corrected the non-page-aligned fallback safety check test condition for callers that fail to query the new molfile plugin API read_timestep_pagealign_size() before their first call to read trajectory timesteps.
- Added two new segmentation merging schemes that work with integer image types. MERGE_HILL_CLIMB is the default scheme that was previously used, and produces the best results with FP images. MERGE_WATERSHED_HILL_CLIMB produces similar results to the previous scheme but takes much longer to run and works well with integer types. MERGE_WATERSHED_OVERLAP is an implementation of the scheme that the original Chimera paper describes.
- Enabled full function pointer checking for implementation of MolFilePlugin::can_read_pagealigned_timesteps()
- Revised VMD to implement the necessary logic for the new molfile plugin API read_timestep_pagealign_size(), used to negotiate the required memory alignment block/page size for high performance kernel-bypass direct I/O, e.g. for NVME flash arrays and the like. The new plugin API enables plugins to return whatever alignment is needed on a per-file basis. When a plugin is called by a caller that doesn't call read_timestep_pagealign_size() prior to reading the first timestep, it is expected to automatically provide a fall-back I/O implementation that will work with a non-aligned non-page-multiple padded destination memory buffer. This allows the same plugin to transparently provide peak performance to sophisticated callers, but full backward compatibility for tools that lack the required internal support for kernel-bypass direct I/O.
- jsplugin: Revised jsplugin to automatically use or disuse block/page-aligned kernel-bypass direct I/O depending on whether the caller has made use of the new read_timestep_pagealign_size() entry point in the VMD molfile plugin API, beginning after ABI version 17
- orcaplugin: Added first rev of molfile plugin for reading ORCA files. Still needs to be made more cross-platform portable and more broadly compilable with varying C++ standards.
- Added plugin API read_timestep_pagealign_size(), and revved ABI version to 18.
- corrected output messages for each of the integer group types used by the image segmentation class.
- Updated default VMD optional feature build flags for ORNL Summit
- Updated build configurations for new system software on ORNL Summit.
- DoF is now implemented for orthographic projections, so we don't restrict it from being toggled in the interactive OptiX renderer anymore.
- Draft implementation of read_timestep_pagealign_size() APIs for the molfile plugin interface.
- Draft molfile plugin API for read_timestep_pagealign_size()
- Validated the performance of the new 18-neighbor version of the segmentation code. The latest version runs at comparable performance to the original under the same test conditions that the previous code used (5x initial blur, no scale-space blur, etc). The performance of the new code ends up being 4x to 6x faster than the previous code when it runs with 1x initial blur and the expected 2x scale space blur. Removed all of the old conditional compilation tests used for performance testing and Nsight Systems profiling test runs presented at GTC2018.
- Rewrote DispCmdPointArray::putdata() to limit the maximum vertex buffer size so that multi-billion atom renderings don't overwhelm internal vertex indexing arithmetic in back-end renderers. This prevents crashes in OpenGL libs/drivers caused by integer wraparound, which can occur if massive vertex buffers are rendered in a single pass. The revised code breaks up over-large vertex buffers into multiple display command buffers that are individually small enough to avoid any trouble in the back-end renderers.
- Revised DrawMolItem::draw_solid_spheres() and draw_solid_cubes() and DrawMolItem::draw_lines() to prevent vertex buffer sizes from growing beyond the maximum number of elements that OpenGL implementations can handle due to internal 32-bit indexing arithmetic that is liable to encounter integer wraparound with large vertex array sizes.
- Added constructors in Segmentation for different image types for the standalone segmentation test code. Fixed some bugs in the short and char image type segmentation path. Cleaned up some variable names relating to image and group types. Fixed some bugs in the GaussianBlur code that could occur if the kernel_size became unreasonably large.
- Removed unneeded test code from ScaleSpaceFilter and fixed bug that caused the blur sigma to remain unchanged after each round of merging.
- modelmaker: abinitio now restarts from prior failed or completed runs and appends all results to look like one run
- Added templates to GaussianBlur and ScaleSpaceFilter (and corresponding CUDA code) for the image voxel type. Currently implemented types are float, unsigned short, and unsigned char, but it is trivial to add/remove more. There is a known issue with integer image types that may prevent groups from merging. Also modified the group handling so that both signed and unsigned types are supported. Previously we used negative values as flags in some places which prevented unsigned types from working.
- modelmaker: fixed bug when calculating per residue cc
- autoionize: Typo correction in detecting Cesium (CES)
- qwikmd: Addition of the optional salts for structure preparation: CsCl MgCl2 CaCl2 ZnCl2 (already supported by the autoionize)
- Additional .qwikmd files loading mechanism using the "initial structure" selection entry and load button
- Make the deletion of the previous output folders more precise. Delete only the files of the system to be replaced and leave the rest intact. Lock the save button in the case of the preparation a QM/MM simulation from a classical simulations
- qwikmd: Change the QM Options from global (same options for all protocols) to QM options for each protocol. A combobox on the top of "QM Options" controls the target protocol. Add the capability to load the orbitals for orca. The print commands are exemplary, working when Orca ran with PM3. These commands need to be changed if another theory level is used. General bugfixes and improvements.
- Finished implementing update_type function in Segmentation.C so that the datatype used to represent groups can be switched while the segmentation algorithm is running. Fixed various bugs with the templatized group types that did not appear before when only using floats.
- Eliminated two tiny memory leaks in the CPU fallback code path for the MDFF cross correlation code.
- eliminated unused counter to please Clang/LLVM
- Fixed segfault in the CPU gaussian blur code path caused by an uninitialized pointer.
- Changed constant from double to float to prevent pointless double promotion
- Added a new experimental "LatticeCubes" representation to draw axis-aligned particle-radius-scaled lattice site cubes for use when visualizing LM lattice sites, particularly for multi-modal simulations that combine lattice-based cell simulation with Brownian dynamics based particle/bead simulation. The first implementation is absolutely minimalistic and borrows heavily from the old GLSL sphere rendering code path. In principle, the OpenGL path would best be done using an additional GLSL geometry shader stage to multiply the incoming vertex out to build the cubes. Similarly, the initial implementation for the FileRenderer hierarchy is also very minimalistic and doesn't batch up the triangles into higher level meshes, so there's a lot of inefficient triangle-at-a-time processing occurring when feeding the cubes to OptiX. Lastly, for renderers like Tachyon and OptiX where we have freedom to introduce a purpose-built primitive, we could create an actual cube array primitive like we do for spheres, and this would boost performance tremendously. For now, by default, the new experimental "LatticeCubes" representation is only available in the graphical interfaces when the code is compiled with -DVMDLATTICECUBES, otherwise the display commands and associated rendering paths exist, but the graphical representation and GUI controls are all disabled.
- pdbxplugin: Added ability to read PDB_model_num (the alt_loc field in vmd) from PDBx files.
- modelmaker: workaround for rosetta bug where abinitio doesn't keep fixed part's sidechains fixed. for now, take fixed part from template and merge that with rosetta generated part.
- misc cleanup of CPU-based QCP RMSD algorithms before building up the GPU side further.
- Modified the behavior of VMDApp::molecule_load() to accept PDB-Dev mmCIF hybrid model files that in some cases contain graphical objects such as spheres, but may or may not contain atomic structure information. Previously, in any case when a molfile plugin advertised the ability to read atomic structure information, we expected it to follow through with a successful read, but now we have to check whether or not the specific file in question happens to contain atomic structure information before we try and read it, and in either case we must still check for volumetric density map information, graphics objects, and similar. Similarly, the logic for loading simulation trajectory information has also been updated to check for a non-zero atom count as a necessary precondition. It should be noted that if we further revise VMD to enable time-varying density map information or other non-atomic information to be read from a trajectory file that this will have to be revised further.
- pdbxplugin: Added ability to read spheres from PDB-Dev files. Fixed a string handling bug that could cause some fields to be truncated.
- modelmaker: analysis alignment now uses exact selection text of user and does not add backbone, so that users can align with only CA as an example. If aligned pdb already exists, skips doing alignment again. Will now overwrite formatted dcd so subsequent analysis runs can be done in same folder.
- Continued work on APIs to improve saved state restore performance for massive biomolecular complexes.
- pdbxplugin: Added a check for PDB-Dev files and a warning that support is experimental. Also added warnings for missing PDBx fields that are required by the PDBx spec. Fixed a memory leak and other memory related errors in delete_pdbxParser. Improved string handling when reading headers to prevent buffer overflows. Fixed errors in the way we were parsing special bonds. Removed unused variables and renamed/reformatted parts of the code for better readability.
- Added detailed comments about the two CUDA QCP kernel designs and their current approach and a couple of special alternative schemes that may come closer to GPU "speed of light" under a limited range of circumstances.
- Added a CUDA thread-block-per-structure-pair QCP RMSD implementation optimized for small structures.
- Added support for CUDA 7.5 so that the CUDA-accelerated QCP code can be compiled on NCSA Blue Waters and ORNL Titan.
- Added a GPU-accelerated quaternion characteristic polynomial (QCP) RMSD alignment inner product kernel designed for device-wide calculations (all thread blocks collectively compute QCP inner product sums) suited for very large structures of macromolecular complexes, viruses, etc.
- Added VMDApp::color_change_namelist() method to drastically reduce the number of display update cycles that occur when loading saved states associated with massive molecular complexes. Large molecular complexes with hundreds of millions of atoms tend to have color categories (e.g. segment names, chains, etc) that have a much larger number of entries than would occur for small structures with only a few million atoms. The huge item lists associated with each color category lead to a form of quadratically scaling complexity when processing saved state files, because the previous approach set color category items one at a time, and each such change has the side effect of triggering representation updates, which then trigger display updates. This is inconsequential for small systems, but for systems in the >= 100M atoms range such as the Chromatophore, Flu viruses, or the protocell, it is a major performance issue. The simplest resolution (without significant changes to the way that saved state restoration is performed wrt/ file loading and color category initialization) is to batch all of the color category updates together at once, thereby triggering only a single cycle of DrawMolItem updates and subsequent display updates. This approach would also make it feasible, with a minimal amount of code, to continue to enable a saved state written by VMD 1.9.4 to be loaded by VMD 1.9.3 or perhaps VMD 1.9.2, at least for the time being, although only VMD 1.9.4 would get a performance gain.
- pdbxplugin: Commit includes multiple bug fixes, and experimental support for PDB-Dev. Fixed an error handling bug that occurred when we couldn't find the number of atoms in a file. Fixed a buffer overflow that occurred when special bond column names were larger than expected, and added length checks to internal string manipulation code. Fixed multiple memory leaks and uninitialized reads. Added comments and renamed variables to improve readability.
- modelmaker: new model command options for sorting and modifying output structures
- modelmaker: python clustering now creates a pdb of each medoid
- modelmaker: python clustering now creates dcds for each cluster
- modelmaker: added kmin and kmax options for python clustering
- modelmaker: fixed naming for tmp input pdb and seq files in model command
- Cranked version number.
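The vertex buffer splitting described in the DispCmdPointArray::putdata() and DrawMolItem entries above amounts to covering one very large vertex range with several bounded ranges. The sketch below shows that pattern only; the function name and the per-buffer limit used in the usage note are hypothetical, and the actual limit VMD applies is not stated in the changelog.

```python
def split_vertex_buffer(num_vertices, max_per_buffer):
    """Yield (start, count) ranges that cover num_vertices vertices.

    Each range holds at most max_per_buffer vertices, so a back-end that
    indexes within one buffer using 32-bit arithmetic never sees an
    over-large count. Python integers do not wrap; a C++ version would
    need a 64-bit type for `start`.
    """
    if max_per_buffer <= 0:
        raise ValueError("max_per_buffer must be positive")
    start = 0
    while start < num_vertices:
        count = min(max_per_buffer, num_vertices - start)
        yield start, count
        start += count
```

For example, `list(split_vertex_buffer(10, 4))` produces `[(0, 4), (4, 4), (8, 2)]`. With an assumed limit of 2^27 vertices per buffer, a 3-billion-vertex point array would be issued as 23 separate display command buffers.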
VMD 1.9.4 alpha 17 (March 22, 2018)
- Updated molecular orbital code to automatically select the L1 cache kernel when running on Volta hardware.
- Updated build logic for LIBPNG and ZLIB so that we can redistribute these with the VMD build and ensure we avoid conflicts with older and incompatible platform-supplied versions of these libraries.
- Added plumbing for VMD builds using a Vulkan rendering path rather than OpenGL, although there's a long way to go before the Vulkan path is ready for prime-time.
- Added significant additional error checking to EGL surface creation and context binding during startup. Added checks for EGL implementations that do very late memory allocations and emit more informative error strings when things go awry.
- Continued adding debugging checks and optional diagnostic outputs into OptiXRenderer initialization routines.
- Added detailed error checking/reporting during the earliest phase of EGL context creation.
- Continued adding debugging checks and optional diagnostic outputs into OptiXRenderer initialization routines.
- Added much more extensive console output during device enumeration logic to help track down issues with running OptiX inside containers on certain hardware/software platforms. When set for debug output, the VMDOPTIXVERBOSE environment variable now causes the OptiXRenderer device enumeration code to emit intermediate results as OptiX APIs are queried for device counts, compute capabilities, and so on.
- Added support for LLVM/clang for Linux builds
- Added typecasts to eliminate LLVM/clang warnings for Python keyword lists
- Ensure that Orbital class member function descriptions show up in Doxygen
- Trimmed unused enum member variable from CUDAAccel class.
- Eliminated unused input/unput functions from the AtomLexer atom selection scanner implementation
- Removed old special case SC'15 ray tracing demo code paths that were enabled by a SC15ANIMSPHERESHACK compile-time macro.
- Added conditional compilation for vmd_input_hook() which is only used when compiling VMD as a shared object for use within a Python module.
- Eliminated unused state variables in the Python atom selection bindings
- Miscellaneous cleanup and conditional compilation completeness improvements to please LLVM/clang, which is more attentive to unreferenced class member variables and the like.
- Cranked version number.
VMD 1.9.4 alpha 16 (March 19, 2018)
- Switched builds to OptiX 5.0.1
- Started updating the VMD 1.9.4 README files
- Started testing and revisiting the default setting for the auto-sample-count adjustment loop when used in combination with the progressive API
- Updated the comments related to 256-way shader specialization and its impact on VMD startup performance.
- Added a wrapper for nvtxInitialize() since we didn't have one yet.
- Added ORNL Summit EGL config into the defaults Makefile as summit.egl
- Corrected image segmentation code so it can still be compiled when CUDA is disabled.
- Added an extra internal flag so that the QuickSurf GPU-side object allocation and caching strategies don't impact cases where we've forced the use of the CPU code path for other reasons.
- Added NVTX tags to top of Segmentation algorithm hierarchy to ensure that the front-end constructor and final GPU-host copy of the group map are both included as they should be.
- Eliminated explicit CUDA synchronizations where they are not critically needed.
- Eliminated unnecessary GPU-host-GPU copy cycle for 3-D density map when initializing the Watershed algorithm from a previously Gaussian-blurred density map image that already resides on the GPU.
- Revised the segmentation algorithms to make use of cudaMemcpyAsync(), cudaMemcpyToSymbolAsync(), cudaMemsetAsync(), and so on, where it is safe to do so.
- Updated conditional compilation macros on various branches of the image segmentation algorithm to facilitate easy compilation of the 10+ variants of the code (and profiles) required for the GTC2018 Nsight system profiler talk.
- Added support for LLVM (clang) compilation in the ORNL Summit builds
- Updated ORNL Summit XLC compiler flags to add -qsuppress=1500-036 to suppress warnings about NOSTRICT (the usual worries about the impact of non-associativity of floating point math). The -qsuppress flag is supported for XLC compiler versions after 20180208.
- Eliminate compiler warnings on signed/unsigned comparison for MOLFILE_BADOPTIONS (which is 0xffffffff or -1)
- revised implementation of the K nearest neighbor selection to avoid signed/unsigned comparisons
- Added measure commands "centerperresidue" "rmsfperresidue" "rmsdperresidue" for testing.
- Added console message when image segmentation diagnostic output is enabled
- Updated built in help for 'profile' script commands
- Added -Wno-unknown-pragmas to all of the GCC compiler flag sets to prevent warnings associated with OpenACC #pragmas in several source files when compiled with versions of GCC that predate support for OpenACC.
- Commented out unused yellow2 variable in DrawRingsUtils
- Continue revision of CPU affinity logic to support special case handling for batch systems that externally manage affinity masks etc.
- Restructured the CPU thread core/socket affinity code in prep to add special handling for batch systems that restrict our ability to set thread affinity at runtime.
- Added cleanup rules for .lst files produced by IBM XLC compilers
- Added initial tags to support compilation with LLVM (clang)
- The current GCC versions on Summit (4.8.5) don't have -mtune=power9 yet, so we revert to power8 for now.
- Modified the stock Khronos EGL header eglplatform.h to cope with systems like the ORNL Summit machine that don't have any X11 headers installed.
- Further revision of the Watershed algorithm to handle large volumes, while reducing register footprint of the CUDA kernels
- Eliminated -ll link flags leftover from commercial Unix lex linkage
- Took the proverbial flamethrower to all of the ancient hacks that related to old compilers that used explicit directories for C++ template caching that used to pollute the filesystem. We are long since rid of these now so there's no longer a reason to keep any of the build steps associated with compilers of that era.
- Added explicit type conversions to eliminate spurious compiler warnings for the idx2sub() variants
- Eliminated orphaned calculation of curr_diff in the CPU Watershed code path
- Silence spurious compiler warnings about float to int conversions
- More colvars updates to correct some remaining mismatched integer types among some colvars APIs
- Disable CPU thread pool affinitization on the ORNL Summit system since the job scheduling tools currently box in CPU threads according to the details of the parameters passed to 'jsrun'. If VMD tries to reset the CPU affinity of the threads in the pool it gets runtime errors, so for the time being we'll accept what the job scheduler assigns and make the best of what we're given.
- Updated the configure script to work around various teething issues on the ORNL Summit system.
- Updated to colvars version 2018-03-09
- Updated the MRC/CCP4 plugin to correctly interpret non-IMOD tomograms that use signed-byte voxel formats by checking the sign of 'amin' in the header.
- Updated to latest Khronos KHR/khrplatform.h header
- Updated to latest Khronos EGL headers
- Cranked version number.
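
The MRC/CCP4 entry above describes choosing between signed and unsigned interpretation of byte voxels based on the sign of 'amin' in the header. The following is a minimal Python sketch of that idea only. The function name decode_byte_voxels is hypothetical, and the plugin's actual logic (including how IMOD files are recognized) is not shown in this changelog.

```python
def decode_byte_voxels(raw: bytes, amin: float) -> list:
    """Decode 8-bit voxels from a density map.

    Assumption for illustration: a negative header minimum (amin)
    means the bytes hold signed values in [-128, 127]; otherwise
    they are treated as unsigned values in [0, 255].
    """
    signed = amin < 0
    voxels = []
    for b in raw:
        # Reinterpret the upper half of the byte range as negative
        # values only when the header says the data is signed.
        if signed and b > 127:
            voxels.append(b - 256)
        else:
            voxels.append(b)
    return voxels


sample = bytes([0, 127, 128, 255])
signed_view = decode_byte_voxels(sample, amin=-128.0)
unsigned_view = decode_byte_voxels(sample, amin=0.0)
```

The same four bytes decode to different voxel values depending on the header minimum, which is why a wrong guess produces visibly corrupted density.
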
VMD 1.9.4 alpha 15 (March 7, 2018)
- multiseq: Updated multiseq to accept any current version of the colorscalebar plugin
- colorscalebar: Rewrote the internals of the color scale bar plugin to generate color bars that will render with high quality in ray tracing engines, by replacing line primitives with cylinders and color-per-vertex triangles.
- Added #ifdefs for socklen_t case for Summit and OpenPOWER platforms.
- Improve Segmentation error handling warnings for illegal voxel type
- Updated to colvars version 2018-02-24
- Corrected redefinition of default parms on Thrust branch of #ifdef in the CUDAParPrefixOps code.
- Corrected "mol fromsels" PBC unit cell transfer logic to depend on 'selidx'.
- Updated built-in help to document flags that allow the user to override the default Gaussian blur sigma for the initial Watershed pass, as well as the initial sigma and blur-multiple coefficients for each iteration of the scale-space segmentation algorithm.
- Corrected the CUDA parallel prefix sum wrapper builds to correctly set conditional compilation flags for CUB vs. Thrust for callers that need to know whether to make use of the optional caller-provided workspace allocations parameters by providing pre-allocated buffers of appropriate size.
- Updated CCP4/MRC map reader plugin to properly handle density maps with several billion voxels in them.
- misc cleanup of various test code and leftover test scaffolding
- Promoted image segmentation indexing and size related intermediate variables to long integer types to allow processing of volumes containing tens of billions of voxels
- Corrected the VMD-internal side of the molfile plugin interface for volumetric data to prevent problems when loading tomograms or density maps containing tens of billions of voxels. The previous code was extensively using plain integer types that are insufficient for indexing calculations, which have been replaced by long integer types for now.
- Corrected console output when loading massive tomograms containing tens of billions of voxels so that the correct voxel counts and memory use are displayed, by performing arithmetic with long integer types and by replacing a few more instances of hand-written size arithmetic with calls to the VolumetricData::gridsize() method.
- Changed SIMD-vectorized min/max functions to accept long types to allow them to process cryo-EM density maps with tens of billions of voxels and to facilitate atom selections w/ long integer types at a later time.
- Typecast volume dimension parameters to ensure that huge tomograms don't overflow intermediate arithmetic when computing total voxel counts after getting volume metadata from the reader plugin.
- Begin addressing DCD plugin limitations with respect to 2-billion atom simulations
- Improved color mapping for NVTX profile tags
- Added NVTX tags to clearly show startup script execution and command line or user .vmdrc startup processing activity vs. the interactive command interpreter loop.
- Add profiler tags for the longer running OptiX class initialization steps that involve shader compilations, etc.
- Added NVTX markers for CUDA device pool initialization time ranges
- Improved use of NVTX markers for file I/O activity and VMD text command execution. Text command completion is still vague since some operations are not triggered until the next VMD display update cycle.
- Eliminated unnecessary inclusion of tcl.h and TclCommands.h in MDFF.C
- qwikmd: Allow creation of QM regions with charges between -1 and +1 with MOPAC, but emit warnings
- Added inclusive scan to CUDAParPrefixOps. Noted that we're going to have an issue with some of the intrinsic vector types like uint2 which are used in Marching Cubes, and that we may need to explicitly supply a zero-value initializer for type T to handle the lack of a built-in automatic type conversion from int to types like uint2. It may be simpler to expose an extra zero parameter here rather than baking it into the implementation the way it has been done at present.
- Added NVTX profiler push/pop operations into CUDAWatershed::init_gpu_on_device()
- Added in conditional compilation for GPU-based initialization to ease collection of pre-optimization profiles for the GTC presentations.
- Added comments and improved formatting of the explicit template instantiation block
- Added CUDAParPrefixOps to the build.
- Migrated parallel prefix sums routines out of the CUDA image segmentation code so that the CUB-based code paths and the more explicit handling of persistent allocations of temporary workspace buffers can be generalized in other parts of VMD that make extensive time-critical use of parallel prefix operations, such as QuickSurf, Marching Cubes, etc.
- Added an init_gpu_on_device function to CUDAWatershed that allocates arrays on the GPU and then calls the calc_neighbors_kernel. This is called instead of the previous init_gpu function, and performs the initial watershed neighbor initialization entirely on the GPU. It is a large performance win over the previous CPU-based initialization.
- modelmaker: gnuplot fix
- modelmaker: fixed bug in ssanalysis if only 1 struct. Turn off evaluate_ss_analysis for now because gnuplot is causing a variety of issues and may no longer be needed.
- modelmaker: quick_mdff now works with multiple maps and new gridpdb selection text. Removed last mention of Juan's old cluster code paths.
- Cranked version number.
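
Several entries above replace plain integer types with long integer types so that volumes with tens of billions of voxels can be indexed. The Python sketch below emulates what happens to a voxel count when every intermediate product is truncated to 32 bits. The helper names are hypothetical, and the wraparound shown is only the typical two's-complement outcome (signed overflow is formally undefined behavior in C and C++).

```python
def wrap_int32(value: int) -> int:
    """Emulate two's-complement truncation to a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def voxel_count_int32(xsize: int, ysize: int, zsize: int) -> int:
    """Voxel count with each intermediate product truncated to 32 bits."""
    return wrap_int32(wrap_int32(xsize * ysize) * zsize)


def voxel_count_wide(xsize: int, ysize: int, zsize: int) -> int:
    """Voxel count with wide arithmetic (Python ints never overflow)."""
    return xsize * ysize * zsize


# A 3000 x 3000 x 3000 volume has 27 billion voxels, which does not
# fit in 32 bits, so the narrow computation silently returns a
# different (much smaller) number.
narrow = voxel_count_int32(3000, 3000, 3000)
wide = voxel_count_wide(3000, 3000, 3000)
overflowed = narrow != wide
```

Any buffer size or loop bound derived from the narrow result would be wrong, which is the class of problem the long integer promotion addresses.
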
VMD 1.9.4 alpha 14 (March 1, 2018)
- modelmaker: ss analysis no longer launches another vmd instance
- modelmaker: added optional output prefix for fragment files
- qwikmd: Add qmbondscheme option to QwikMD GUI
- Added notes on key GPU segmentation algorithm control flags
- Added notes about the tight loop over GPU update_kernel() launches and checks on changes_d state after each run.
- disable CUB until we've formalized the usage of CUB for scan etc.
- Refactored the CUDA image segmentation classes to permit the initialization phase neighbor calculations to be performed on the GPU rather than the host. Added a first draft of CUDA calc_neighbors_kernel() and calling routines that will eventually replace the CPU-side initialization routines.
- Added notes about pthread_setname_np() on Linux
- Refactored the CUDA-accelerated scale space filter and segmentation classes to permit the use of persistent temporary work buffers for the parallel prefix sum (exclusive scan) operations. This change permits elimination of the CUDA global memory allocation/deallocation pairs within the scale space filter loops. Thrust has a mechanism for ensuring reuse of temporary work area allocations, but it requires very new versions of GCC. In contrast, the CUB device-wide exclusive scan implementation provides explicit workspace parameters that can easily be managed by the caller.
- Added notes about Thrust cached allocations supported w/ GCC 4.4 and later
- Added comments about thrust-internal allocation/free calls that show up in detailed profiler traces
- Corrected NVTX tag color alpha channel values to 0xff so that tags display properly both in 'nvvp' and the system profiler
- Began revision of the entire image segmentation pipeline to enable the use of persistent GPU-side temporary buffers across multiple iterations of high-level segmentation algorithm steps. By incrementally modifying the affected classes with some optional temporary workspace parameters, we can also do A/B comparisons with profiling tools to evaluate the overall performance impact. At present the CUDASegmentation classes have been updated to use this approach, but the entire collection of segmentation classes needs a more unified scheme for persistent GPU object management. This is just a starting point for further revision and refactoring.
- Renamed get_cuda_array() helper routine to alloc_cuda_array() for clarity. The segmentation pipeline APIs need cleanup so we don't have to create and expose wrappers for low level memory allocations like this among the different segmentation related classes.
- Added the BFE estimator plugin to the startup scripts
- Added Chris Chipot's binding free energy estimator plugin.
- Eliminated debugging code with conditional compilation and removed unnecessary extra calls to cudaDeviceSynchronize()
- Eliminated more debugging code and calls to cudaDeviceSynchronize()
- Continued cleanup, shave off unnecessary calls to cudaDeviceSynchronize()
- Refactored the Watershed constructor to eliminate the separate helper method.
- Updated comment about runtime associated with the CPU-side init() method.
- Cause the GaussianBlur class to avoid GPU-host copies after blur() operations until absolutely necessary, which is only the case if get_image() is called to get the results in a host-side memory buffer. This cuts the runtime by 30% or so, particularly for large size maps.
- Added more profiling hooks in the Segmentation and ScaleSpaceFilter classes to track down unnecessary host-GPU data movement calls.
- Added more profiling hooks to the Gaussian blur class
- Added profiler hooks to key CUDA Gaussian blur routines
- Improved usage of NVTX profiling markers in the Segmentation and ScaleSpaceFilter classes.
- Continued improvement of MeasureVolInterior loop nests and indexing arithmetic.
- Corrected the loop nest traversal order in MeasureVolInterior markIsoGrid() and RaycastGrid() for best memory performance. Eliminated leftover unused temporaries and did more cleanup.
- use VolumetricData::gridsize() rather than computing voxel counts within the measure volinterior implementation
- rewrote the countIsoGrids() routine for measure volinterior
- Cleanup and de-tabification of 'measure volinterior'.
- Misc cleanup of measure volinterior Tcl bindings
- Added Tcl bindings for "measure volinterior" commands, lightly revised from Juan's original implementation.
- Added MeasureVolInterior.[Ch] source files associated with Juan's ray casting based vesicle/capsid interior volume routines.
- Added Juan Perilla's routines to measure the interior volume of a vesicle or capsid, based on inside/outside tests performed on a simulated density map (e.g. from QuickSurf, 'volmap', or similar) in combination with an isovalue boundary value threshold parameter, and a ray casting approach for marking voxels.
- modelmaker: added fragments command for using local fragment picker and removing Robetta webserver dependence
- Added a profiler timestamp marker for the CUDA Watershed::init_gpu() method
- Migrated Watershed segmentation profiler timestamp marks into the specific CPU/GPU methods
- docs: Updated documentation for the collective variables module.
- docs: Imported documentation for the many Python bindings contributed by Josh Vermaas.
- Further optimization of the shuffle-based MDFF GPU kernels for SM >= 3.x
- Revised the molecular orbital GPU kernels to eliminate use of the old umul24() intrinsics that were beneficial for performance on the first few generations of GPUs.
- Revised the direct Coulomb electrostatics GPU kernels to eliminate use of the old umul24() intrinsics that were beneficial for performance on the first few generations of GPUs.
- Revised the implicit ligand sampling GPU kernels to eliminate use of the old umul24() intrinsics that were beneficial for performance on the first few generations of GPUs.
- Updated the built-in CUDA multiply-add benchmark to eliminate use of the old umul24() intrinsics that were beneficial for old GPUs. Increased the default workload size by 40x to do a better job on the latest Volta GPUs.
- Rewrote the MDFF cross correlation intra-warp parallel sum reduction loops to exploit shuffle instructions for Kepler and later GPUs. The use of shuffle instructions eliminates much of the need for shared memory. The new reductions are written against the latest __shfl_xxx_sync() routines from CUDA 9.x and beyond, since the previous shuffle instruction variants are now deprecated. There is no noticeable impact on performance thus far, which is logical since the reduction scheme we use was highly efficient even on the older generation GPU hardware.
- Made 2xl and 4xl variants of the find_max_values_shm() kernel for improved convergence rate with very large numbers of segments.
- Corrected image segmentation diagnostic timing output.
- Wrote a new CUDA kernel to significantly speed up the find_max_values() kernel by creating a variant that limits the scope of potential atomic update collisions for each thread block to SM-local shared memory, followed by a block-wide update to global memory. The new find_max_values_shm() kernel is only used when the number of segments is smaller than or equal to the number of voxels per thread block, such that each thread handles one potential group's max value update.
- modelmaker: changed default cccolor selection from all to protein and noh
- Added profiler markers for CUDA segmentation algorithm steps, and noted that we currently spend 20% of our overall GPU-accelerated segmentation runtime in the Watershed::init() routine.
- Added comment about memory bandwidth limitations associated with sampleVolume() use within the MC code.
- QuickSurf: Added comments about key kernel latency hot spots
- Added -lineinfo compiler flag to ensure that CUDA kernel disassemblies shown in profiling tools can be correlated with source
- Re-tuned QuickSurf CUDA launch bounds for Kepler and later GPUs with CUDA 9.x
- Migrated main VMD thread profiler hook to the earliest point of execution, and special-cased the marker so it is used even when running with the system profiler.
- Added profiling hooks in the cryo-EM segmentation routines
- Added profiling tags in the MDFF cross correlation code path, and ensured unique tag colors for timeline profiler views.
- Cleaned up MDFF error propagation and CPU-fallback handling if/when cases arise that the CUDA code path can't continue. Added a new parameter to the default QuickSurf() constructor to allow forced CPU fallback and override of the use of the CUDA-accelerated code path if we already have a cascading error situation.
- Added comment about MDFF internal environment manipulation logic
- Removed early test code from the MDFF CPU/GPU cross correlation launch path
- Added new "profile" text commands to allow VMD scripts to enable/disable collection of profiling data with external tools like the NVIDIA visual profiler, system profiler, etc.
- Added cmd_profile.C to the build
- Added ref to NVTX sample
- Added profiling hooks into QuickSurf and Molecular Orbital routines
- Added NVTX profile markers for VMD text commands where possible
- Implemented initial use of NVTX tags with the NVIDIA system profiler
- Added PROFILE_NAME_THREAD macro for use in labeling host threads
- Added header-based profiling hooks for NVTX and similar profiling APIs.
- Added NVTX configure flag to enable use of the NVTX profiling APIs for VMD builds compiled against the NVIDIA development tools.
- modelmaker: model usage info no longer shows get_empty_density info
- Updated plugin build scripts for final ORNL Summit system
- fixed pdb2seq output and model usage info
- Split the compilation target for Summit to be separate from the more generic OPENPOWER so we can do extra (potentially not generalizable) compile-time optimizations for VMD.
- modelmaker: fixed cluster alignment code path and added modelmaker interface for modeller scripts
- Added first version of the ModelMaker plugin
- Cranked version number.
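
The segmentation entries above rely on parallel prefix sum (exclusive scan) operations. As a reminder of what that operation computes, here is a serial Python reference sketch. It is not the CUB or Thrust implementation used on the GPU, and the function names are illustrative only.

```python
from itertools import accumulate


def inclusive_scan(values):
    """out[i] is the sum of values[0..i], including element i."""
    return list(accumulate(values))


def exclusive_scan(values):
    """out[i] is the sum of values[0..i-1]; out[0] is zero."""
    out = []
    running = 0
    for v in values:
        out.append(running)  # record the total before adding element i
        running += v
    return out


# Typical use: turn per-item counts into starting offsets in a
# densely packed output array.
counts = [3, 1, 4, 1]
offsets = exclusive_scan(counts)
totals = inclusive_scan(counts)
```

A GPU scan produces the same result in parallel but needs temporary workspace, which is why keeping that workspace allocated across iterations avoids repeated allocation and deallocation.
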
VMD 1.9.4 alpha 13 (January 24, 2018)
- Corrected ownership of density map memory allocation in the segmentation -separate_groups subcommand.
- Added a -separate_groups subcommand to the density map segmentation implementation, to automatically mask the reference density map by group IDs, emitting a new density map corresponding to each group ID, with all non-member voxel values set to zero.
- Improved argument parsing for segmentation routines.
- Standardize the console output routines for timing output
- updated built-in help for segmentation commands
- Linked up the "segmentation" Tcl command bindings.
- Updated default segmentation parameters and optional argument parsing.
- the 'mdffi' commands are no longer experimental
- Organized Tcl command binding initializations so they occur in alphabetical order (there are no other initializations or side effects at this point in the code, only the actual command binding).
- Added initial Tcl bindings for the scale-space density map segmentation algorithms.
- Updated the Segmentation class to add a constructor that works directly from the VolumetricData class.
- Changed volumetric data API to use long return type for the gridsize() method to prevent integer overflow for large datasets.
- Refactored Gaussian blur 1D kernel generation and added 3D kernel generation in preparation for adding 3D Gaussian blur convolutions.
- Added comments about the display update callback implementation and the limitations imposed by older FLTK APIs vs. the latest.
- Began revisions to remove/hide VMD GUI controls that aren't particularly relevant to rendering performance in 2018: both backface culling and display list cache mode are (today) anachronistic schemes for gaining rendering performance, which were even at the time really meant as solutions for special rendering systems like the CAVE, tiled display walls, and similar. Since most users will never have contact with the kinds of hardware where these approaches were useful, there's no longer a reason to incorporate them in the GUI. The underlying features still exist and can be manipulated with text commands however. While modifying the GUI, I also took the opportunity to eliminate hard-coded menu indexing in favor of the use of a simple enum type, which simplifies conditional compilation of menu items via ifdefs.
- Added top level image segmentation and scale-space filtering classes to the build.
- Accommodate header file expectations of some old C++ compilers.
- Corrected Gaussian blur class to call integer abs()
- Protect CUDA GPU related teardown call with #ifdefs so that it is not called on CPU-only builds, such as 32-bit MacOS X.
- More corrections to minor issues in the latest colvars update.
- workaround for type-ambiguous calls to pow() in colvars
- Updated colvars module to the current git version 3bf0ded
- Added the new top level scale-space 3-D image segmentation classes to the build.
- Added the 3-D scale-space Gaussian filtering to the build. This is used for scale-space variants of the Watershed image segmentation algorithm in VMD.
- Began type-generic templatizing all of the image segmentation related classes so they are capable of operating on arbitrary pixel/voxel types, to allow us to exploit high performance hardware features for half-precision IEEE floating point, as well as a variety of fixed-point and integer voxel/pixel types for better memory bandwidth and in some cases significantly higher arithmetic throughput. This is particularly valuable on the GPU, as it will eventually enable the so-called "tensor core" in NVIDIA's latest Volta GPUs to be used for some of the convolution operations used in scale-space filtering.
- Updated to OSPRay 1.4.3.
- Added CUDA-accelerated Gaussian blur kernels to the build.
- qwikmd: Bug Fix on the declaration of the Temperature when preparing simulations from previous simulations performed with QwikMD
- qwikmd: Use vec operations to calculate mean and standard deviation
- nanotube: Corrected command line parameter count check on the graphene plugin to accept 6 parameters rather than a minimum of 8.
- qwikmd: Fix issue with copying files on Mac returned by the glob command
- qwikmd: Prevent issues finding the original pdb file when stored in folders with space characters
- Cranked version number.
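
The -separate_groups entry above describes masking a reference density map by group ID so that each output map keeps only the voxels of one group, with all other voxels set to zero. A minimal Python sketch of that masking step follows. It uses flat lists in place of VMD's volumetric data structures, and the function name is hypothetical.

```python
def separate_groups(density, group_ids):
    """Return {group_id: masked_density} for every group present.

    density   -- flat list of voxel values
    group_ids -- flat list of the same length giving each voxel's group
    """
    if len(density) != len(group_ids):
        raise ValueError("density and group_ids must have the same length")
    masked = {}
    for gid in sorted(set(group_ids)):
        # Keep member voxels unchanged and zero out all non-members.
        masked[gid] = [d if g == gid else 0.0
                       for d, g in zip(density, group_ids)]
    return masked


density = [0.5, 1.0, 2.0, 0.25]
group_ids = [0, 1, 1, 0]
maps = separate_groups(density, group_ids)
```

Each resulting map has the same dimensions as the reference map, so the per-group maps can be overlaid or summed back together without resampling.
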
VMD 1.9.4 alpha 12 (December 21, 2017)
- qwikmd: Add mdProtInfo array to the save file, when the qwikmd file is saved without preparing the simulation.
- watershed: Protect call to destroy_gpu() with ifdefs so that it doesn't cause linkage issues on non-CUDA builds (e.g. 32-bit MacOS X).
- alascan: updated alascan plugin dependency version number for new parsefep
- fftk: Added fftk_ChargeOpt_ESP.tcl to the distrib target
- ParseFEP: Updated version of ParseFEP from Chris Chipot and colleagues. The new version improves FEP analysis performance by a factor of ten or so. This version also corrects a bug in the Bennett-Acceptance Ratio (BAR) module that had affected prior versions.
- Corrected a bug in the use of toupper() in the atom name string matching code used to assign default atomic masses when they are not explicitly provided by input files.
- fftk: added link to parmed for conversion to Amber-style files
- fftk: Added support for RESP calculations used in the Amber force field.
- qwikmd: Typo Correction
- Updated VMD to use OptiX 5.0.0 test builds.
- runante: Add missing close brackets
- qmtool: fixed a parsing error for Gaussian
- Cranked version number.
VMD 1.9.4 alpha 11 (November 8, 2017)
- Continued development of "mol fromsels": added handling of cross-term maps
- Continued development of "mol fromsels": Ensure we don't create duplicate bonds when building a new molecule from multiple selections.
- Changed OptiX cylinder count comparison logic so we don't emit an odd buffer size.
- Modified the OptiX renderer to break up sphere and cylinder array buffers containing more than 5 million primitives into multiple smaller buffers. This helps OptiX exploit NVLink-based distributed memory ray tracing through round-robin allocation of geometry buffers over multiple GPUs. Previously VMD was accumulating as much geometry as possible into each buffer to minimize API overheads. The buffer size cutoff chosen is mostly arbitrary, and larger arrays containing 10 million elements or more may be sufficient to achieve the desired result.
- Continued implementation of "mol fromsels": Ensure that after the new molecule has been constructed from the list of selections, we call newmol->analyze() so that per-atom fields populated by VMD's internal structure analysis are assigned before we return to the caller. When building the new molecule, we also combine the molecule datasetflag from each of the selected molecules, ORing in the fields provided by each, since the new molecule is a superset of the originals. The user can of course override this behavior by manually setting or clearing the molecule dataset flags themselves ex post facto.
- Cranked version number.
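
The OptiX entry above splits large sphere and cylinder arrays into buffers of at most 5 million primitives. The Python sketch below shows one straightforward way to compute such a split as (start, count) ranges. The function name and the exact chunking policy are assumptions for illustration, not the renderer's actual code.

```python
def split_into_buffers(num_primitives, max_per_buffer=5_000_000):
    """Return (start, count) pairs that cover num_primitives in order."""
    if max_per_buffer <= 0:
        raise ValueError("max_per_buffer must be positive")
    return [(start, min(max_per_buffer, num_primitives - start))
            for start in range(0, num_primitives, max_per_buffer)]


# 12 million spheres become two full buffers and one partial buffer.
ranges = split_into_buffers(12_000_000)
```

With several independent buffers, a runtime that places geometry buffers on GPUs in round-robin order has something to distribute, whereas a single monolithic buffer would land on one device.
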
VMD 1.9.4 alpha 10 (October 27, 2017)
- Corrected mismatched parameter to strstr() in vaspoutcarplugin, reported by Thomas Holder.
- Corrected leaked title string and temporary atom parsing strings in vaspposcarplugin, reported by Thomas Holder.
- msmsplugin: Fix a memory leak of temporary path strings in the MSMS plugin reported by Thomas Holder.
- Improved MolFilePlugin indexing efficiency for angle/dihedral/improper traversal
- Continued development of "mol fromsels": Added handling for filtering angles/dihedrals/impropers from the selection list into the new molecule.
- Continued implementation of "mol fromsels": Added initialization path to copy periodic cell information from first selection into the new molecule. It doesn't make sense to copy in quantities like timestep index, physical time, or energies, and even PBC unit cell info is only useful in some cases. The implementation should use the merge mode flag to determine whether to copy PBC cell information or not. Added comments related to copying other per-timestep data such as velocities.
- Continued work on 'mol fromsels': added code to copy atomic coordinates from multiple selections into a new timestep, when one or more of the selections have coordinates associated with them. The code does not replicate timesteps, but rather only takes the currently active coordinates for each selection, enabling the caller to mix and match whichever timesteps they might want to use by using '$sel frame' judiciously. The new code only handles coordinates and does not yet handle other time-varying quantities such as velocities, forces, "user" fields, or PBC info.
- Initial implementation of "mol fromsels" Tcl binding.
- Fix missing free in 'measure sasalist'
- Implemented a prototype of "mol fromsel" to create a new molecule from a list of selections. The code presently copies the required per-atom fields from each selection, and ignores the merge mode flag. The merge behavior and data consistency requirements among the list of selections are not yet implemented or enforced. Different flags will be used to indicate whether or not to copy over data such as bond/angle/dihedral/improper arrays and similar. QM data will not be copied. It is conceivable that we might want to copy trajectory data under some circumstances, but for now we will copy only timestep 0.
- Corrected the shift in the time axis when plotting the QMEnergies, caused by NAMD printing the energies twice. Added printing of the QMEnergies analysis info to the infoMD file.
- Added functions and a checkbutton to plot electrostatic energies. Corrected the shift in the time axis when plotting the energies, caused by NAMD printing some timesteps twice in the log file.
- Calculate convolution boundary conditions ahead of time instead of performing boundary checks for each read.
- Moved watershed timing code from CPU and GPU specific functions to the generic watershed function, so the same timing code is used for both.
- Regenerate the pdb structure and the tempMol molecule if a mutation is performed. If a mutation is performed on the N-terminal of a protein chain, re-evaluate the N-terminal patch
- New class to perform repeated volumetric Gaussian blurs, needed for watershed filtering.
- Updated Watershed class to remove storage of original image float array. This is only needed to initialize Watershed, so it makes sense to store it elsewhere.
- Cranked version number.
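
The "mol fromsels" entries above describe filtering angles, dihedrals, and impropers from the selection list into the new molecule. One way to picture that step is as keeping only the terms whose atoms are all selected, then renumbering them to the new molecule's atom indices. The Python sketch below illustrates this under that assumption. The function name and data layout are hypothetical and do not reflect VMD's internal C++ implementation.

```python
def remap_terms(terms, selected):
    """Filter and renumber bonded terms for a new molecule.

    terms    -- list of tuples of old atom indices (angles, dihedrals, ...)
    selected -- old atom indices in the order they appear in the new molecule
    """
    new_index = {old: new for new, old in enumerate(selected)}
    kept = []
    for term in terms:
        # Drop any term that references an atom outside the selection.
        if all(atom in new_index for atom in term):
            kept.append(tuple(new_index[atom] for atom in term))
    return kept


selected = [2, 3, 5, 7]
angles = [(2, 3, 5), (3, 5, 6), (5, 7, 2)]
new_angles = remap_terms(angles, selected)
```

The angle that touches atom 6 is dropped because that atom is not part of the selection, and the remaining angles refer to positions in the new atom list.
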
VMD 1.9.4 alpha 9 (October 19, 2017)
- Updated docs for colvars module
- Updated builds to use OSPRay version 1.4.0
- Changed CUDA device reporting format string to allow for longer GPU names, such as the new "Tesla V100-PCIE-16GB"
- Further cleanup, commenting, and doxygenization of the Watershed class. Added comments about proposed code refactoring and changes to GPU handling.
- Moved some macros from Watershed header to C++ code, and moved some struct and enums into the WATERSHED_INTERNAL ifdef
- Removed use of device variables for watershed cuda code and pass them as kernel parameters instead.
- Added a separate function to copy the output of the cuda watershed algorithm from GPU to the host, instead of copying it automatically when the algorithm finishes. This is in preparation for doing the rest of the filtering on the GPU.
- Refactored part of the watershed update kernel to remove redundant if statement.
- Removed use of the global variable neighbor_offset lookup table in the CPU and GPU Watershed implementations by including it as a class member variable.
- fftk: Changed ffTK GUI scripts to use Unix linefeed text formatting to prevent future issues with CVS diffing changes.
- fftk: set elements using topotools if not found in PDB
- Added more socklen_t #ifdef cases.
- Add typecast to Tcl_SetResult() calls on literal strings
- Modern putenv() implementations do not copy their environment input string, so putenv() calls need a strdup() for strings in automatic or otherwise transient string variables.
- Improved doxygen comments for MobileInterface and continued cleanup.
- Significant cleanup of MobileInterface class implementation to eliminate type-punned pointers and aliasing, improve portability, and improve code style uniformity. Also got rid of a couple of small typos and bugs that the compiler hadn't ever complained about.
- Guard the OptiX device enumeration message helper routine with #ifdefs used by fully interactive builds, since the routine isn't used for batch mode runs.
- Added typecasts for string literals to be used in PNG headers.
- Misc cleanup of unreferenced variables in the CUDA QCP algorithm.
- Updated Cray builds to use the socklen_t variant of accept()
- Updated to colvars version 5005669abc5a97ed497411af7e027e9f2a77578e
- Eliminate compiler warnings for Linux headers that use the socklen_t variants of accept().
- Misc cleanup in the SpringTool class.
- Added typecasts on simulated command line parameters to eliminate compilation warnings on Android
- FastPBC: Ensure initialization of optional array pointers to NULL.
- Added (char*) typecasts to Tcl_SetResult() calls that pass in literal strings
- Eliminate compiler warnings about unused helper functions for CPU vectorization when compiling on ARM platforms such as Android, when NEON instructions are not enabled.
- eliminate unused thread launch result code
- #if out the old internal HMDMgr quaternion code since we're using the VMD quaternion implementation now.
- Eliminate compiler warnings for unused variables etc in the CUDA-related startup code when compiling for Android.
- qwikmd: Complete the addition of the validation process of the "Ignore Interactive Forces" button
- Updated the Watershed host code to compile successfully without always assuming that CUDA support is enabled.
- Promoted Watershed index offsets from char to int to improve performance and portability across compilers that choose signed/unsigned char types differently when left unspecified. Updated timer output to report total in one case that was incorrect.
- Eliminated problems with ambiguous use of ceil() in integer arithmetic used for block sizing/padding in Watershed implementation.
- Protect internally-used macros in Watershed implementations by WATERSHED_INTERNAL, so that callers don't get any of them in their namespace.
- QM/MM bug fix in the declaration of Temperature when the first protocol is not a Minimization (binvelocities and temperature declared at the same time). Added new status validation for the "ignore interactive forces" button (disable) if MDFF is selected, since MDFF does not use this option.
- Added CPU- and GPU-accelerated Watershed image segmentation implementations to the standard build.
- android: Added missing revision info header
- namdgui: Updated the list of parameter and stream files to be included by default to favor CHARMM36 since the structure preparation plugins such as AutoPSF are now using CHARMM36 by default.
- qwikmd: Add the "Ignore Interactive Forces" checkbutton to control the IMD keyword "IMDignoreForces". Allow or ignore forces applied during a "Live View" simulation
- colvars: Synced with colvars tree, to COLVARS_VERSION "2017-09-14"
- Updated location of Rez binary for new revs of Apple's dev tools.
- Enable compilation using CUDA 9.0RC and later now that workarounds are in place for problems with Thrust scan() prefix sums on uint2 types.
- Implemented a temporary workaround for problems with CUDA 9.0RC that break the use of the Thrust parallel prefix scan() routines when used in conjunction with vector integer types such as uint2 due to lack of a conversion constructor in the CUDA 9.0 toolkit headers.
- Force inclusion of cuda_fp16.h when compiling with CUDA 9.0 due to changes in the default sub-inclusions for cuda.h
- Replaced OptiX 4.1.1 shared lib with a patched library that should cure segfault problems when running on machines with no GPUs or associated driver software.
- structurecheck: Use unique number to flag the selected atoms (-9999) to be checked
- Incorporated a patch for the AMBER parm7 parser from Robin Betz that makes the parm7 plugin use the atomic number field in the input prmtop file instead of using the mass field to guess what element each atom is. This previously caused problems when VMD was passed a prmtop where the hydrogens have had their masses repartitioned (for a faster simulation timestep). Updated the plugin minor version number.
- Updated the AMBER rst7 writing code per suggestions from Josh Vermaas about the behavior of the Fortran reader on the AMBER side when used with relatively small atomic structures. Changing the format specifier from %10d to %6d for writes cures read-side problem for the Fortran code. Updated the rst7plugin minor version number.
- Cranked version number.
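The Watershed entry above about ceil() in integer block sizing/padding can be illustrated with a generic sketch. This is not the Watershed code: the helper names `blocks_needed` and `padded_size` and the sizes used are hypothetical, and the sketch only shows one common way to round up a block count without going through floating point at all.

```python
def blocks_needed(n_items, block_size):
    """Number of fixed-size blocks needed to cover n_items (rounds up)."""
    if n_items < 0 or block_size <= 0:
        raise ValueError("n_items must be >= 0 and block_size must be > 0")
    # Adding block_size - 1 before the floor division rounds up while
    # staying entirely in integer arithmetic.
    return (n_items + block_size - 1) // block_size


def padded_size(n_items, block_size):
    """Smallest multiple of block_size that is >= n_items."""
    return blocks_needed(n_items, block_size) * block_size


if __name__ == "__main__":
    # Hypothetical sizes: 1000 voxels along one axis, blocks of 256.
    print(blocks_needed(1000, 256), padded_size(1000, 256))
```

Keeping the whole computation in integers avoids both the choice of a floating-point ceil() overload and any rounding surprises for very large sizes.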
VMD 1.9.4 alpha 8 (August 30, 2017)
- psfgen: Updated psfgen to meet requirements for out-of-core I/O with trajectory file formats that need page-aligned page-multiple memory buffers to support direct unbuffered I/O for atomic coordinates. Updated the psfgen version number to 1.7.
- catdcd: correct logic for freeing output buffers when index-selected coordinate frames are used and page-aligned page-multiple buffers are used by plugins.
- Updated VMD configure script to use CUDA 9.0, add support for Volta GPUs (adding compute_70 and sm_70 compilation targets), and OptiX 4.1.1.
- Added additional exception handling for Thrust/CUB failures that lead to thrust::system::system_error being thrown at runtime. This arose in early testing on Volta GPUs with CUDA 9.0RC and with binaries produced by CUDA 8.0 when run on Volta.
- Force both binary and src distributions to incorporate the colvars_files.pl script into the resulting tar file.
- Updated distribution scripts to encompass new colvars configure sub-script
- Migrated listing of collective variables module source files into separately maintained per script section to ease ongoing updates.
- Updated collective variables documentation from the latest git repo pull.
- Updated collective variables module to match latest version from Git: ae4375492605ba82f2be3058a3c8e95fcf56ee68
- plumed: Updated the PLUMED plugin to match the latest version in Git.
- NanoShaper: Corrected bug in NanoShaper vertex file parser
- NanoShaper: Eliminate bad references to an already-closed NanoShaper facet file during the vertex parsing block.
- Continued misc cleanup of old code to please valgrind by pre-clearing various temporary buffers to prevent uninitialized reads/conditionals here and there.
- initialize temporary memory buffer to all NUL chars to please valgrind
- qwikmd: Disabled MD and SMD options after preparation/load on both Easy and Advanced Run
- qmtool: slight tweak to support gaussian 16
- namdenergy: added support for Drude forcefield
- fftk: added fix from Po-Chao; changed "Optimization completed\." to "Optimization completed"
- qwikmd: Initialize the smd protocol check button variable in the easy run tab
- qwikmd: Added text for the infoMD file regarding the QM/MM simulations and temporary topology generation.
- qwikmd: Bug fix in the preparation of the QM/MM simulation, prepared from a previous MM simulation with a temporary topology containing iron atoms (needed a temporary stream file). Bug fix for live, advanced SMD simulations (generation of the smd file from an unfinished simulation).
- qwikmd: Bug fix for the display of the list of residues defined as "QM" type (macro)
- rmsdvt: Applied patch from Thomas Albers to make RMSDVT cope with failure to create its working directory when it happens to try a non-writable location.
- Updated build scripts for OSPRay 1.3.1
- structurecheck: Ensures the Reset of torsionplot even if the plugin returns an error
- Released test version for QM/MM paper submission/review
VMD 1.9.4 alpha 7 (July 12, 2017)
- cranked version for QM/MM paper submission/review
- Migrate MacOS X plugin compiles to clang on 'malaga'.
- qwikmd: Added the QMMM configuration files to the Makefile distrib target
- pdbtool: Updated PDB download URLs and updated version number.
- qwikmd: Added check procedure to find TMPDIR folder (new QWIKMDTMPDIR env variable). Added procedure to allow different QM charges from the ones defined in the topology files. Bug fix for atom selection evaluation.
- autopsf: Updated AutoPSF plugin to new RCSB PDB web site layout.
- structurecheck: update structurecheck version to 1.1
- structurecheck: Add error handling for the case where torsionplot fails, usually due to an absent or wrongly declared TMPDIR.
- qwikmd: update qwikmd version to 1.2
- vmdprefs: Cranked vmdprefs version number for recent bug fix.
- readcharmmpar: Updated version number of readcharmmpar for recent stream file changes.
- Change default behavior when TMPDIR is unset to try /tmp and if that fails, scream loudly to the terminal console and tell the user to take corrective action.
- qwikmd: Skip pdbalias if called by qwikmd
- qwikmd: Changes on the QM/MM configuration files according to Marcelo's comments
- qwikmd: Add the info button text for the QM regions (empty for now)
- qwikmd: Improvements to the energies, pressure, volume, and temperature log text
- qwikmd: QwikMD 1.3 alpha version containing QM/MM interface for the NAMD QM/MM paper. Addition of the "fake" topology functions. Function to select charged residues when using Mopac via vmd_pick_event. Automatic renaming of residues and atoms upon pdb loading process (replacing autopsf psfaliases command), including the /toppar/pdbaliastable.txt file containing all the pdbaliases. Bug fix for plotting energies, temperature, pressure, volume and smd. General bug fixes and GUI improvements.
- Enable molecular orbital representation grid reuse optimization by default since no crashes or problems have been seen yet.
- Preliminary test implementation of scheme to allow multiple representations to reuse the same molecular orbital grid when they refer to the same orbital ID, wavefunction type, excitation, spin, grid spacing, and other grid-specific parameters. This short-circuits a number of common cases that otherwise trigger duplicative orbital grid calculations. With the optimization in place, the animation speed for QM/MM simulation trajectories with large QM regions can be up to 2X faster in common cases, and more if multiple density isovalues are shown superimposed. In cases where orbital calculation time is non-dominant, the performance gain is much more moderate, often in the range of 20% to 30%, but depending heavily on the speed of the remaining isosurface extraction step. When combined with a GPU-side marching cubes implementation, performance gains could be much higher so long as entire orbital grids remain completely GPU-resident as all of the graphical representations are processed.
- qwikmd: Reflect the changes to download pdbs directly from webpdb
- cranked version
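The molecular orbital grid reuse entries above describe sharing one computed grid among representations that agree on every grid-specific parameter. A minimal sketch of that idea follows. It is not VMD's implementation: `GridKey`, `OrbitalGridCache`, and the `compute_grid` callback are hypothetical names, and the key lists only the parameters the entry names explicitly.

```python
from collections import namedtuple

# Hypothetical key built from the grid-specific parameters named in the entry.
GridKey = namedtuple(
    "GridKey",
    ["orbital_id", "wavefunction_type", "excitation", "spin", "grid_spacing"],
)


class OrbitalGridCache:
    """Reuse a computed grid when every grid-specific parameter matches."""

    def __init__(self, compute_grid):
        self._compute_grid = compute_grid  # expensive function: GridKey -> grid
        self._grids = {}
        self.computations = 0  # number of times the expensive function ran

    def get(self, key):
        # Only compute when no representation has asked for this key yet.
        if key not in self._grids:
            self._grids[key] = self._compute_grid(key)
            self.computations += 1
        return self._grids[key]

    def clear(self):
        """Drop all cached grids, e.g. when the displayed frame changes."""
        self._grids.clear()


if __name__ == "__main__":
    cache = OrbitalGridCache(lambda key: [0.0] * 8)  # stand-in for real work
    key = GridKey(5, "canonical", 0, "alpha", 0.075)
    first = cache.get(key)   # first representation: grid is computed
    second = cache.get(key)  # second isovalue, same orbital: grid is reused
    print(first is second, cache.computations)
```

In this sketch two representations that differ only in isovalue share a key, so the second one skips the grid calculation and pays only for its own isosurface extraction.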
VMD 1.9.4 alpha 6 (June 23, 2017)
- autopsf: Updated AutoPSF plugin to new RCSB PDB web site layout, adapting the automatic PDB download to the new layout of the RCSB PDB web site, since the previous URLs no longer function.
- save_state: Avoid using loop control idioms that break when deleting objects from resizable lists.
- webpdbplugin: Updated the webpdb plugin to adapt the automatic PDB download to the new layout of the RCSB PDB web site, since the previous URLs no longer function.
- plugins: Eliminate historical build configurations and associated host configs.
- Merged in bug fix for handling of non-orthogonal unit cells in the ::TopoTools::replicatemol function.
- Ensure that when a Pickable gets deleted, that any pending pick events are canceled so that we don't get a crash later on in pick_end(). This cures problems with user-defined callbacks on mouse pick events that delete representations associated with the in-progress pick event. The new code will null out any in-progress pick when the Pickable is removed from the PickList.
- vmdprefs: fixed a bug in the gui control of aoambient and aodirect settings
- qwikmd: New functions to generate topologies for the QM region
- qwikmd: Widgets to generate topologies for the QM region ONLY. Modification of the Edit Atoms window to become the Generate Missing QM Region Topology window.
- readcharmmpar: Remove comment of the read para flag to make psfgen aware of the beginning of the parameters section
- Corrected handling of volmap output filenames that lack the .dx extension.
- gromacsplugin: Changed the Gromacs plugin TRX timestep read logic so that if we get a NULL pointer for mdio_ts.pos caused by a zero-sized coordinate frame (possibly occurring as the result of a corrupt file or some other reason?) we skip the afflicted timestep and continue reading.
- mdff: fixed temperature setting for remdff
- mdff: turned temperature control back on in ReMDFF. added resetmaps.sh for ReMDFF.
- qwikmd: Update the configuration files in the user qwikmd Library folder (copy the older ones to the backup folder). Add the QMMM files to the user qwikmd library folder
- Update Colvars to fd7445328d5f670863690869bf95716dce8a1dd8
- qwikmd: First implementation of QM/MM and functions to prepare QM/MM structure files and namd configuration files. Allow preparing new structures and simulations starting from a previous simulation prepared with QwikMD. General improvements and bug fixes.
- qwikmd: Declare steps per cycle explicitly. First version of the Configuration files for QM/MM calculations. Add text for the info buttons of the QM/MM tab. Fix text for materials and colors info buttons. Update solvent, salt ions and concentration variables in the text. Correct the balloon text of the materials resolution label.
- autopsf: Bug fix for the undeclared Ccoords (broken for coarse grain models). Bug fix for the mapping of chains to segments (chaintoseg variable). The chains were always one cycle increment ahead, breaking in the last residue of the pdb.
- Ensure that the VCA cluster device is initialized to NULL in all cases.
- Implemented maximum transparent surface display feature that matches the behavior of the original CPU Tachyon -trans_max_surfaces command line flag. This is particularly useful when rendering images of complex cryo-EM density maps that have a very large transparent surface depth complexity which can result in overly-detailed or confusing images. By dropping display of the Nth and greater transparent surfaces a scientist can make an effective "illustrative" rendering by removing the excessive background detail.
- Prevent an uninitialized read due to order of constructor initialization steps.
- cranked version
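The save_state entry above mentions loop control idioms that break when objects are deleted from resizable lists. The changelog does not say which idiom was involved, so the following is only a generic illustration with hypothetical function names: a forward index loop that skips elements after each deletion, next to a reverse loop that does not.

```python
def remove_matching_buggy(items, should_remove):
    """Forward index loop: skips the element that follows each deletion."""
    items = list(items)
    i = 0
    while i < len(items):
        if should_remove(items[i]):
            del items[i]
        # Bug: after a deletion the next element has shifted into slot i,
        # but the index still advances, so that element is never checked.
        i += 1
    return items


def remove_matching(items, should_remove):
    """Reverse index loop: deletions never shift elements not yet visited."""
    items = list(items)
    for i in range(len(items) - 1, -1, -1):
        if should_remove(items[i]):
            del items[i]
    return items


if __name__ == "__main__":
    reps = [1, 2, 2, 3, 2, 2]
    print(remove_matching_buggy(reps, lambda r: r == 2))  # leaves some 2s behind
    print(remove_matching(reps, lambda r: r == 2))
```

Walking the list from the end is one of several safe choices; building a new list of the survivors works equally well when in-place deletion is not required.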
VMD 1.9.4 alpha 5 (April 26, 2017)
- Updated ffTK Makefile to add missing installation of fftk_Configuration.tcl source file, and updated version number to 1.2.
- Allow people to use transmode as a means of hacking shaders in pre-built binaries by passing values all the way down into OptiX.
- autoionize: Added an extra safety check so that autoionize will only try and create Tk dialogs for error output when the user is running in a graphics-enabled session and has launched the GUI for autoionize rather than through the scripting interface. Updated autoionize version numbers to 1.5.
- xmdff: added -ignorestderr to exec to avoid xmdff aborting if vmd reports any libgl warnings
- Implemented an alpha transparency channel for perspective, orthographic, and dome master camera modes in the TachyonL-OptiX ray tracing engines for use when compositing images rendered by VMD with externally rendered content or background materials produced by other means, as needed for the NSF CADENS movie project: "The Birth of Planet Earth".
- Added alpha channel tracking infrastructure to the low level OptiX ray tracing code.
- corrected rgba4f to rgba4u conversion loop
- Added alpha channel image output support to OptiXRenderer image output routines
- Bolstered error handling for OptiX image output after adding alpha channel paths
- Commented out png_set_alpha_mode() call that apparently isn't implemented in all revs of libpng. Corrected missing file close.
- Initial implementation of alpha channel image file output
- Added in PNG alpha channel output implementation based on Tachyon src
- cranked version
VMD 1.9.4 alpha 4 (April 19, 2017)
- psfgen: Replace use of int64_t with long since older MS compilers didn't support int64_t anyway, and long works fine on 64-bit platforms where it would be possible to generate multi-billion atom structures.
- psfgen: The stdint.h header doesn't exist on older MSVC compilers. Since we don't actually need it on any existing platform we compile VMD on, don't include it, at least for the time being.
- mergepdbs: updated script to get the right topology file location
- Corrected the old buffer-based OptiX lighting code path which hadn't yet been correctly synced up with changes for positional lights etc.
- Prevent un-normalized directional light directions from making it into the core OptiX rendering code.
- Removed all conditional compilation checks for VMD plugin ABI versions prior to 15 since plenty of time and use has shown the oldest API entry points to be useful by now. A current question is whether it still makes sense to retain some of the APIs that were added circa 2010/2011 for the purposes of paratool-based parameterization and similar, which may no longer be necessary given the molecular file formats that currently exist in the field.
- mdffplugin: fixed bug in checking parameter file location (qwikmd issue) and replaced code with much more reasonable check
- namdenergy: Include changes to the "Parameter Files (-par FILE):" description sent by Peter Freddolino
- topotools: Applied Axel's latest Topotools bug fixes that address small problems reported by One Sun Lee and Aric Newton, and updated the DOI for topotools.
- plugins: Added compilation rules for KTH PDC Cray XC40 'Beskow'
- Added ref to NvPipe API
- qwikmd: Call catdcd using the explicit path to the folder inside the VMD plugins tree
- colvars: Updated to colvars version 20170321
- psfgen: Minimal change to allow writing tiled billion-atom systems for testing.
- psfgen: Increase version to 1.6.7.
- psfgen: Support 2 GB hasharrays to enable 2^25 residues per segment.
- fftk: converted geometry optimized PDB and charge optimized PSF files to multi-use variables accessed through the configurations namespace
- fftk: general configurations changed to be self-initializing on startup
- fftk: the configuration namespace is now a dependency of script-based bond/angle and dihedral optimizations
- fftk: implementing general configuration namespace and hooked up to variable storing namd binary path
- readcharmmpar: uncomment "read para *" lines to be detected by psfgen and ignore until the next "read rtf" flag appears again
- Added "Delphi Force" plugin to the externally-hosted plugin list.
- Emit a console message when the user has overridden the VMD OptiX shader path
- Correct a bug in the shadow ray transparent surface light attenuation
- Updated collective variables module with changes that migrated large method implementations out of headers (previously declared inline) and into the class implementation, which also cures problems with the Intel C/C++ compiler and its ability to generate code for multiple target architectures in the same binary. The previously inlined methods were not being properly generated as runtime-dispatched functions by the Intel 2015 compilers.
- Further streamlining of hand-vectorized CPU kernels, removing old non-FMA implementations for AVX/AVX2, and misc cleanup.
- Prototypical implementation of AVX-512F molecular orbital kernel. Uses exponential approximation approach, AVX-512 mask registers, and FMA instructions. Needs further tuning, but good enough for starting work.
- Put in thread affinity logic for the persistent thread pool used by the molecular orbital CPU kernels. While not important for standard x86, this is of significant importance for Power8 and KNL that both have significant SMT depths, to eliminate run-to-run variation caused by Linux shuffling threads around with large thread counts.
- cranked version
VMD 1.9.4 alpha 3 (March 8, 2017)
- psfgen: corrected psfgen version number to match compiled code
- psfgen: Fix C++-style variable declarations after first statement.
- Bugfixes and optimizations in the Colvars module
- Enforce the checking of the protocol's temperature
- qwikmd: Force the initialization of the velocities after the Minimization steps.
- Collective variables updates and a fix for a bug reported by Chris Mayne.
- Added comment to make it clearer what low-level array storage format is used for people reading the headers.
- Prevent the VMD viewport height factor from influencing OptiX dome master camera behavior. We should instead add an explicit fov control API for this.
- Update VMD builds to require OSPRay 1.2.0 or later.
- Replace the use of the global "aoWeight" factor from OSPRay 1.1.x with the new special-purpose "ambient" light implemented in OSPRay 1.2.0.
- Updated VMD to require OSPRay version 1.2.0 or later, since that version fixes the known problems with lighting through transparent surfaces. VMD enables the OSPRay "aoTransparencyEnabled" flag by default since that's the normal behavior for the other ray tracing engines that support it.
- cranked version
VMD 1.9.4 alpha 2 (February 7, 2017)
- psfgen: Sync with NAMD. Bump version to 1.6.6. In topology file PRES (patch) entries support ATOM record with name but no type or charge to specify subsequent atom insertion order. From Brian Radak: new psfset command to replace vel, bfactor, and renameatom commands, extend segment command to query charge.
- Latest collective variables module updates from Giacomo and Jerome.
- Added source files for the prototypical FastPBC implementation.
- Added Python bindings for per-residue RMSD, RMSF, QCP etc, and Hbonds.
- Added routines for per-residue RMSD, RMSF, CoM, and refactored Hbond code to facilitate access through native Python interface. The Hbond changes need further revision for robustness.
- tinkerplugin: Applied patch from James Graham to correctly handle parsing of Tinker arc files that contain simulation periodic cell information.
- Added Tcl "mol voldelete" command and updated the behavior of "mol volmove" so that both require the volume index parameter for the sake of encouraging consistent usage.
- Added prototype fast PBC wrapping code written by Josh Vermaas as part of preparing for the ORNL/NCSA OpenACC hackathon.
- Added per-residue variants of several existing VMD "measure" routines: measure_center_perresidue(), measure_rmsd_perresidue(), and measure_rmsf_perresidue(). Added an initial refactored variant of the hbonds determination code that has been separated out of the Tcl hbonds implementation so that it can be called directly from Python also.
- Added Python interface to allow volumetric data objects to be deleted from a specified molecule. The Tcl variant of this still needs to be added.
- cranked version
VMD 1.9.4 alpha 1 (January 31, 2017)
- Initial prototype implementation of molecule instancing feature, with Tcl bindings for new commands "mol addinstance" (to add one), "mol showinstances" (to show instances for a selected representation), and "mol instances" (to return the count of instances). At present, the instance list is global to a molecule and they are not individually selectable or modifiable, but this is a short-term limitation. Ultimately, the instances should be indexed, and multiple lists of instances should be maintained per molecule, accessible by indices or string name keys. These next steps will enable development of hierarchies of instances and other advances in a prototypical form for evaluation and testing for a period of time before they are ultimately committed into a low-level implementation.
- gromacsplugin: Use modular arithmetic to prevent formatting problems on GRO files that contain more than 100,000 atoms. Since GROMACS ignores the atom indices, causing the indices to wrap around is a simple workaround to prevent GRO files for large structures from becoming unreadable.
- psfgen: Warn and ignore on self and duplicate bonds when reading topology files.
- Replace the use of literal constants for trajectory loading "waitfor" sentinel values with a new FileSpec::waitfor enum for this purpose. Corrected an old error in the save trajectory FLTK menu.
- qwikmd: Bug fix in the declaration of the default material
- qwikmd: Bug fix on atom name changes
- Corrected VMD AVX-512 CPU feature test reporting so that it correctly identifies non-KNL Xeon CPUs that have AVX-512F, but not AVX-512ER etc. The previous VMD startup code, when run on next-gen Skylake Xeon E5 CPUs misidentified them as KNL in the startup messages. At present, only KNL supports all four of the AVX-512F/CD/ER/PF and other scientific computing instruction subsets. The Skylake Xeon E5 supports just AVX-512F/CD so far.
- Corrected incorrectly terminated Tcl_AppendResult() calls in error handling cases that arise in animate, display, and measure commands. Eliminated tabs in the source code, and did minor code reformatting cleanup where necessary.
- mdff: moved autosmoothing code from mdff_setup to mdff_gui since mdff_setup doesn't normally make potentials.
- topotools: Prevent un-escaped brackets from being interpreted as a Tcl command.
- mdff: initial commit of new BETA feature for setting up ReMDFF simulations
- Promote integer volume dimensions to long to avoid integer overflow issues in the Tachyon renderer interface.
- Fixed 3-D texture map allocation and management APIs in VMD to allow handling of large byte-per-voxel tomograms that would break 32-bit indexing arithmetic that inadvertently remained due to lack of manual integer type promotion in a few places where sizes were computed in the VolumeTexture class. This was corrected by adding a new size calculation method that does the required (ugly) manual type promotion internally.
- Corrected various minor issues that cropped up during compilation of the OSPRay renderer with GCC rather than Intel C++.
- torsionplot: Bug fix for windows, where the tmpdir folder is defined only as "c:" and not "c:/"
- psfgen: Recognize READ statements even if previous END statement is missing.
- mdff: fixed file overwrite bug on Windows when MDFF called from QwikMD
- Wrote skeletal prototype for hand-written GPU-accelerated QCP kernels.
- Updated the QCP code to avoid on-demand thread pool creation to avoid performance loss on Xeon Phi hardware with large core counts and current Linux kernels that don't seem to initially migrate large numbers of threads well.
- Revised the QCP RMSD measurement APIs to pass in the VMDApp pointer so we have access to both the CPU and GPU persistent thread pools. This matches the structure we used for the RDF implementation, and it also sets the stage for eliminating the CPU-side on-demand thread launch approach we previously had to use, which is a particularly poor performer on Xeon Phi hardware. In the case of CUDA the thread pool is also a must have since we'd otherwise have to create fresh threads/contexts.
- psfgen: Sync with NAMD 2.12, mostly from Brian Radak: vel, bfactor, and renameatom commands; query velocities, mass, and atomid; writepsf nopatches option.
- Revised the VMD startup script to better accommodate the special needs of EGL-enabled VMD builds running on SLURM-based Cray XC50 machines such as the newly-upgraded Piz Daint system at CSCS.
- moldenplugin: Corrected buffer parameter in sscanf() call, from Thomas Holder, Schrodinger
- plyplugin: eliminate unused string list, from Thomas Holder, Schrodinger
- jsplugin: Corrected function return type, from Thomas Holder, Schrodinger
- qwikmd: Remove Render related text. Add resolution info.
- qwikmd: Add MDFF text info. Add color scheme text.
- qwikmd: Remove Render commands. Label corrections. Add check resolution command after adding representations.
- qwikmd: Addition of color schemes. Combobox corrections to avoid getting stuck in selections. Minor GUI corrections.
- qwikmd: Added images for new QwikMD docs, updated CSS page formatting, and updated the documentation page to match the current QwikMD software.
- Added more URLs for reference for the Amira plugin
- Added Amira plugin to the build
- Added early prototypical plugin for reading Amira mesh files
- Force the fully-general shader to enable runtime toggling of the VR clipping plane and the VR headlight.
- Slightly improved the shader safety check for bogus surface normals resulting from degenerate triangles in isosurfaces reconstructed from noisy data.
- Added a short-term workaround for scenes that contain some really bad surface geometry from marching cubes surfaces that result from noisy and coarse resolution cryo-ET
- Improved console debugging messages
- Added code to read and report amin/amax/amean when reading MRC maps. Improved MRC/CCP4 plugin handling with a new special case for orthorhombic cell axes, which solves a numerical precision problem with very large tomograms where there was previously a possibility of computing a NaN for two of the Z-axis basis vector components when running the code path that previously assumed a non-orthorhombic map. Improved code organization.
- Updated URLs for MRC/CCP4 file format specifications for the variations used by EMDB, the IMOD program, and recently published format extension proposals in JSB.
- Cranked version
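The gromacsplugin entry above describes wrapping atom indices with modular arithmetic so that large GRO files stay readable. A minimal sketch of the idea, not the plugin's actual code: the function name `gro_atom_line` is hypothetical, the fixed-width layout is the standard GRO atom line format, and wrapping the residue number as well is an assumption made here for symmetry.

```python
GRO_INDEX_MODULUS = 100000  # a five-character field holds 0..99999


def gro_atom_line(resid, resname, atomname, serial, x, y, z):
    """Format one GRO atom line, wrapping indices to fit five columns."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (
        resid % GRO_INDEX_MODULUS,   # assumption: wrap residue numbers too
        resname,
        atomname,
        serial % GRO_INDEX_MODULUS,  # wrap the atom index as the entry describes
        x, y, z,
    )


if __name__ == "__main__":
    # Atom 123456 is written with index 23456; the columns stay aligned.
    print(gro_atom_line(1, "SOL", "OW", 123456, 1.0, 2.0, 3.0))
```

Without the modulus, a six-digit index would overflow its field and push every later column one character to the right, which is what makes unwrapped files unreadable to fixed-width parsers.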
VMD 1.9.3 Final Release (November 30, 2016)

Please email any questions to vmd@ks.uiuc.edu.
