From: FX (fxcoudert_at_gmail.com)
Date: Mon Apr 06 2020 - 05:01:55 CDT

Hi everyone,

I was thinking about a way to add cell information to files read or written in XYZ format. As you know, the XYZ file format records no cell information, but it is actually pretty easy to add some: each structure has a comment line, which tools are free to use for additional information. We can add the cell information there, in a format that is unambiguous to anyone reading the file:

1188
 generated by VMD, with cell: a=27.565001 b=38.624001 c=10.748000 alpha=93.489998 beta=90.900002 gamma=117.150002
 O 10.659605 14.038974 6.458052
 H 10.304418 14.500156 7.211330

and yet will be ignored by any tool that does not expect it.
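
For a user's own conversion script, producing such a comment line could look roughly like the sketch below. The function name `format_xyz_comment` and the tool name `my_script` are hypothetical and not part of VMD; only the layout of the `a=` ... `gamma=` fields is taken from the example above.

```c
#include <stdio.h>

/* Hypothetical helper for a user's own script; not part of VMD.
 * Formats the XYZ comment line into buf. The cell is only written when
 * all three cell lengths are non-zero, as in the example above.
 * Returns 0 on success, -1 if the line did not fit or formatting failed. */
static int format_xyz_comment(char *buf, size_t size,
                              double a, double b, double c,
                              double alpha, double beta, double gamma) {
  int n;

  if (a != 0 && b != 0 && c != 0) {
    n = snprintf(buf, size,
                 " generated by my_script, with cell: "
                 "a=%.6f b=%.6f c=%.6f alpha=%.6f beta=%.6f gamma=%.6f\n",
                 a, b, c, alpha, beta, gamma);
  } else {
    /* No cell known: write a plain comment line */
    n = snprintf(buf, size, " generated by my_script\n");
  }

  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

The resulting line goes right after the atom count of each frame, followed by the usual coordinate lines.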

I also added the code to read the information back, so that XYZ files written by VMD can be round-tripped without losing the cell. This also allows other tools to start adding cell information in the future, in particular users’ own conversion/analysis scripts. The reading code is deliberately strict about what it accepts, to avoid picking up wrong information: it only uses the cell if all of the a= b= c= alpha= beta= gamma= fields shown above are present and non-zero. Otherwise, we fall back to the current behaviour.
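
One caveat on the reading side: the patch below locates each field with a plain substring search. For lines written in the format above this is enough, since a= comes first. However, the string "a=" also occurs at the end of "alpha=", "beta=" and "gamma=", so a comment line listing the fields in another order could be misread. A stricter variant might accept a key only at the start of a whitespace-separated token. The sketch below is one possible way to do that; `read_field` and `parse_cell` are hypothetical names, not part of VMD or of the patch.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of VMD: looks for "key=<number>" where key
 * starts the string or follows whitespace. Returns 1 and stores the value
 * on success, 0 if the key is missing or not followed by a number. */
static int read_field(const char *str, const char *key, double *out) {
  size_t klen = strlen(key);
  const char *p = str;

  while ((p = strstr(p, key)) != NULL) {
    int at_token_start = (p == str) || isspace((unsigned char) p[-1]);
    if (at_token_start && p[klen] == '=') {
      char *end;
      double v = strtod(p + klen + 1, &end);
      if (end == p + klen + 1)
        return 0;               /* nothing numeric after '=' */
      *out = v;
      return 1;
    }
    p++;                        /* keep searching after this match */
  }
  return 0;
}

/* Fills cell[] with a, b, c, alpha, beta, gamma and returns 1 only if all
 * six fields are present and non-zero; otherwise returns 0 and leaves
 * cell[] untouched, so the caller can fall back to its default. */
static int parse_cell(const char *comment, double cell[6]) {
  static const char *keys[6] = { "a", "b", "c", "alpha", "beta", "gamma" };
  double tmp[6];
  int i;

  for (i = 0; i < 6; i++) {
    if (!read_field(comment, keys[i], &tmp[i]) || tmp[i] == 0)
      return 0;
  }
  for (i = 0; i < 6; i++)
    cell[i] = tmp[i];
  return 1;
}
```

Values are first parsed into a temporary array, so a partially valid comment line never leaves a half-filled cell behind.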

I hope this can be useful, and welcome feedback.

Best regards,
FX

-- 
Dr. François-Xavier Coudert
Senior Researcher / Directeur de Recherche CNRS
at the Institut de Recherche de Chimie Paris
Professeur attaché ENS / PSL University
Webpage: https://www.coudert.name/

--- vmd-1.9.4a38-orig/plugins/molfile_plugin/src/xyzplugin.c 2016-11-28 06:01:55.000000000 +0100
+++ vmd-1.9.4a38/plugins/molfile_plugin/src/xyzplugin.c 2020-04-06 11:45:37.000000000 +0200
@@ -143,6 +143,49 @@ static int read_xyz_structure(void *myda
   return MOLFILE_SUCCESS;
 }
 
+static void read_cell_from_string(char *str, molfile_timestep_t *ts) {
+  float a = 0, b = 0, c = 0, alpha = 0, beta = 0, gamma = 0;
+  char *p;
+
+  if ((p = strstr(str, "a=")))
+    a = atof(p+2);
+  if (a == 0)
+    return;
+
+  if ((p = strstr(str, "b=")))
+    b = atof(p+2);
+  if (b == 0)
+    return;
+
+  if ((p = strstr(str, "c=")))
+    c = atof(p+2);
+  if (c == 0)
+    return;
+
+  if ((p = strstr(str, "alpha=")))
+    alpha = atof(p+6);
+  if (alpha == 0)
+    return;
+
+  if ((p = strstr(str, "beta=")))
+    beta = atof(p+5);
+  if (beta == 0)
+    return;
+
+  if ((p = strstr(str, "gamma=")))
+    gamma = atof(p+6);
+  if (gamma == 0)
+    return;
+
+  /* We found all cell parameters */
+  ts->A = a;
+  ts->B = b;
+  ts->C = c;
+  ts->alpha = alpha;
+  ts->beta = beta;
+  ts->gamma = gamma;
+}
+
 static int read_xyz_timestep(void *mydata, int natoms, molfile_timestep_t *ts) {
   int i, j;
   char atom_name[1024], fbuffer[1024], *k;
@@ -154,6 +197,9 @@ static int read_xyz_timestep(void *mydat
   if (NULL == fgets(fbuffer, 1024, data->file)) return MOLFILE_ERROR;
   if (NULL == fgets(fbuffer, 1024, data->file)) return MOLFILE_ERROR;
 
+  /* read optional cell parameters in comment line */
+  read_cell_from_string(fbuffer, ts);
+
   /* read the coordinates */
   for (i=0; i<natoms; i++) {
     k = fgets(fbuffer, 1024, data->file);
@@ -224,7 +270,13 @@ static int write_xyz_timestep(void *myda
   int i;
 
   fprintf(data->file, "%d\n", data->numatoms);
- fprintf(data->file, " generated by VMD\n");
+
+  if (ts->A && ts->B && ts->C) {
+    fprintf(data->file, " generated by VMD, with cell: a=%.6f b=%.6f c=%.6f alpha=%.6f beta=%.6f gamma=%.6f\n",
+            ts->A, ts->B, ts->C, ts->alpha, ts->beta, ts->gamma);
+  } else {
+    fprintf(data->file, " generated by VMD\n");
+  }
   
   atom = data->atomlist;
   pos = ts->coords;