From: Mayne, Christopher G
Date: Wed Apr 30 2014 - 13:32:39 CDT


Oh, I wish you had asked about that in the first place. A few months back I wrote some VMD code to construct a PSF/PDB for an arbitrary generation of PAMAM. I needed to build out to generation 6 or 7, so hand-building in a molecular editor wasn't going to cut it (I tried for a few hours). It's a challenging problem, so I'm glad you found a suitable solution.

Christopher Mayne

On Apr 30, 2014, at 1:18 PM, Jean-Patrick Francoia wrote:

On 30/04/2014 15:05, Mayne, Christopher G wrote:

As I mentioned before, SMILES only encodes molecular graph data, not coordinates. So to convert SMILES to PDB for more than a couple of residues, you would have to solve the protein folding problem to start from connectivity data and arrive at a useful 3D structure. SMILES is really not the appropriate format for protein structures, in general.
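
To illustrate the point about connectivity versus coordinates, here is a minimal sketch of a parser for a tiny SMILES subset (only the atoms C, N, O, branches in parentheses, and `=` for double bonds; no rings, charges, or aromaticity). The function name `smiles_to_graph` is hypothetical and this is not how any real toolkit parses SMILES. It only shows that what comes out is a list of atoms and bonds, with no x, y, z anywhere.

```python
def smiles_to_graph(smiles):
    """Parse a tiny SMILES subset into (atoms, bonds).

    Supported: atoms C, N, O; branches "(...)"; double bond "=".
    Returns element symbols and (i, j, order) tuples: a graph, no coordinates.
    """
    atoms, bonds = [], []
    stack = []      # atoms to return to when a branch closes
    prev = None     # index of the atom the next atom bonds to
    order = 1       # bond order for the next bond
    for ch in smiles:
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch == "=":
            order = 2
        elif ch in "CNO":
            atoms.append(ch)
            idx = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, idx, order))
            prev, order = idx, 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds


# Neutral lysine, written starting from the carboxyl oxygen.
atoms, bonds = smiles_to_graph("OC(=O)C(N)CCCCN")
print(len(atoms), "heavy atoms,", len(bonds), "bonds")
```

Everything a 3D builder needs beyond this (bond lengths, angles, torsions) has to come from templates or an embedding algorithm, which is why the tools mentioned in this thread target small molecules.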

Perhaps if you told us what you are trying to do, we could be more helpful. Your original question was quite simple--are there tools to go from SMILES to 3D coordinates? Yes, there are, but they are designed for small molecules. You seem to be more interested in large protein structures.

Christopher Mayne

On Apr 30, 2014, at 1:37 AM, Jean-Patrick Francoia wrote:

On 30/04/2014 08:23, Eduard Schreiner wrote:
There are examples that come with it. Still, I do not think it is a good idea to build a protein with 15000 amino acids with EMC. You will get some amorphous mass.


On Tue, Apr 29, 2014 at 10:51 PM, Jean-Patrick Francoia wrote:
On 29/04/2014 21:11, Eduard Schreiner wrote:
Hi all,

I would also like to mention the free tool EMC

which I use a lot to build any type of amorphous system or polymer without a specific secondary structure, all based on SMILES definition of all the components.


On Tue, Apr 29, 2014 at 6:56 PM, Davide Provasi wrote:
Chemaxon has tools and Java libraries to convert SMILES to formats VMD can visualize.
It requires a license, but it's free for non-commercial use.
The Corina web demo is also free and generates high-quality 3D molecular structures.
These tools, however, are optimized for small molecules;
I'm not sure how well they would perform on peptides, let alone long proteins.
Good luck


On Tue, Apr 29, 2014 at 11:01 AM, Norman Geist wrote:
I don't think this is possible even with scripting, as "building" tools
usually have a database of coordinate templates. How else could you
determine the positions of the atoms while adding them? VMD is more a
visualization tool than a building tool, although it can be used for

Norman Geist.

> -----Original Message-----
> From:<> [<>] On
> Behalf Of Jean-Patrick Francoia
> Sent: Tuesday, April 29, 2014 14:49
> To:<>
> Subject: vmd-l: Visualize a SMILES string

> Hello,
> I wonder if there is a way to visualize a SMILES string in VMD, and
> possibly in 3D. I did a bit of digging but found nothing clear.
> How do you do that?
> Regards
> JP

Davide Provasi
Dept. of Structural and Chemical Biology
Mount Sinai School of Medicine
Icahn Medical Institute Building
1425 Madison Avenue, Box 1677
New York, NY 10029-6574
Fax: 212-849-2456
EMC seems to be exactly what I need. How many atoms do you handle per SMILES string? It seems to be a lot, from what I saw on some websites.
Would you have any useful links to get started? The doc provided in the package is just documentation.
No, there are "only" 960 amino acids :) So, how would you translate a SMILES string that long into a usable PDB?
OK, I will close the topic by sharing the solution I finally found, along with some explanations, in case somebody needs them in the future.
I'm trying to model a complex homopolymer of lysine. Some of you might have heard of PAMAM or polylysine; it's much the same kind of thing (mine are called DGL). Mine is a dendrigraft, so basically the lysines are polymerized either on their epsilon amine (pseudo-peptide bond) or on their alpha amine. It's a branched polymer, and there are several generations: from 1 to 5. There are 8 amino acids in G1 and 960 amino acids in G5. What we know about them is the average number of epsilon bonds per DGL and the average number of alpha bonds. So each molecule of DGL is unique.
To model this kind of molecule, I first needed to generate them randomly. So I wrote a Python program to produce a SMILES string (strings are easy to use in programs) from the previously mentioned parameters. In the end, I obtain a HUGE string for a DGL G5, and traditional programs like Avogadro just crash when they try to convert a SMILES string that long to 3D coordinates.
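
For readers who want to try something similar, here is a minimal sketch of how such a random SMILES generator might look. It is not the program described above, which was not posted. The names `Lys`, `grow`, and `to_smiles`, the parameter `p_epsilon`, and the growth rule (each new lysine is attached to a randomly chosen free amine, epsilon with probability `p_epsilon`) are all assumptions made for illustration. Amines are written neutral and stereochemistry is ignored.

```python
import random


class Lys:
    """One lysine residue; each of its two amines can carry one child residue."""

    def __init__(self):
        self.alpha = None    # residue attached to the alpha amine
        self.epsilon = None  # residue attached to the epsilon amine


def grow(n_residues, p_epsilon=0.5, seed=None):
    """Build a random branched polylysine tree with n_residues residues.

    p_epsilon is the (assumed) probability that a new residue is attached
    to an epsilon amine rather than an alpha amine.
    """
    rng = random.Random(seed)
    root = Lys()
    free = {"alpha": [root], "epsilon": [root]}  # residues with a free amine
    for _ in range(n_residues - 1):
        site = "epsilon" if rng.random() < p_epsilon else "alpha"
        # Each list loses one entry and gains one per step, so it is never empty.
        parent = free[site].pop(rng.randrange(len(free[site])))
        child = Lys()
        setattr(parent, site, child)
        free["alpha"].append(child)
        free["epsilon"].append(child)
    return root


def to_smiles(res, is_root=True):
    """Write the tree as SMILES, each residue starting at its carbonyl carbon.

    Recursive, so a very deep tree could hit Python's recursion limit.
    """
    alpha = "N" + (to_smiles(res.alpha, False) if res.alpha else "")
    eps = "N" + (to_smiles(res.epsilon, False) if res.epsilon else "")
    body = f"C(=O)C({alpha})CCCC{eps}"
    # Only the root keeps its free carboxylic acid oxygen.
    return "O" + body if is_root else body


# A G1-sized example (8 residues); the string differs for every seed.
print(to_smiles(grow(8, p_epsilon=0.5, seed=1)))
```

With a single residue, `to_smiles(grow(1))` gives `OC(=O)C(N)CCCCN`, which is plain lysine. To match measured averages of alpha and epsilon bonds, one would tune `p_epsilon` or replace the growth rule with a generation-by-generation scheme.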
Because yes, I understand I need a file with 3D coordinates to do the modeling. What I needed was a program capable of generating 3D coordinates from a huge SMILES string.
I finally found what I needed: Discovery Studio Visualizer. Only the visualization program is free. With it, I was able to generate a PDB and a mol2 file for all the generations, whereas I failed with other programs for G > 3.
So, thank you very much, all of you, for the help. Even if that wasn't what I was looking for, I learned some useful stuff.