From: John Stone (johns_at_ks.uiuc.edu)
Date: Mon Oct 05 2009 - 21:14:10 CDT

Hi Vlad,
  Could you send us a tar.gz file containing one of the sets of files
you're working with so we can more easily reproduce the problem
you're experiencing? The description you gave below is a little
bit terse, but if you send us both the files and the sequence of
steps you're taking, that should make it much easier to sort out.

Cheers,
  John Stone
  vmd_at_ks.uiuc.edu

On Thu, Oct 01, 2009 at 11:26:01AM +0200, Vlad Cojocaru wrote:
> Dear VMD users (Multiseq developers),
>
> I am looking into using multiseq for some alignment projects ...
> Multiseq seems a very nice interface, however there are a couple of
> issues I would like to discuss.
>
> I am following the steps:
> 1. Load a multi-sequence fasta file whose entries correspond to a list of
> pdbids:chainids. The fasta file is downloaded from the PDB.
> 2. For each sequence, automatically download the corresponding chain from
> the corresponding pdb file
> 3. Align the sequences based on the loaded structures
> 4. Save the alignment profile
> 5. Use the profile further
>
> The reason I would like to load the fasta file before the structures is
> simple: some structures have missing residues, thus if multiseq reads
> the sequence directly from the structural residues, it would load an
> incomplete sequence. The problem is that upon loading the fasta file
> each sequence gets the name "SEQUENCE" in the multiseq lines. The word
> "SEQUENCE" is the last column in the fasta headers downloaded from PDB.
> The first column is "PDBID:CHAINID". As a result, automatically
> retrieving the pdb chains corresponding to the sequences in the fasta
> file is currently not possible. I would imagine that if each
> loaded sequence were recorded with the name taken from the first
> column of the fasta header, the automatic download of the corresponding
> chain from the PDB would be possible.
>
> Of course I know that the fasta files from UNIPROT have the sequence
> name in the last column, rather than the first.
>
> But maybe it would be useful to follow the convention of the PDB fasta
> files ...
>
> Best wishes
> Vlad
>
> --
> ----------------------------------------------------------------------------
> Dr. Vlad Cojocaru
>
> EML Research gGmbH
> Schloss-Wolfsbrunnenweg 33
> 69118 Heidelberg
>
> Tel: ++49-6221-533202
> Fax: ++49-6221-533298
>
> e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de
>
> http://projects.villa-bosch.de/mcm/people/cojocaru/
>
> ----------------------------------------------------------------------------
> EML Research gGmbH
> Amtsgericht Mannheim / HRB 337446
> Managing Partner: Dr. h.c. Klaus Tschira
> Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
> http://www.eml-r.org
> ----------------------------------------------------------------------------
>
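
The naming scheme suggested in the quoted message (take each sequence's
name from the first column of the fasta header rather than the last) can
be sketched roughly as follows. This is only an illustration in Python,
not MultiSeq code; the function names are hypothetical, and the "|"
column separator and the example headers are assumptions about the PDB
fasta layout, which the message describes only as having "PDBID:CHAINID"
first and "SEQUENCE" last.

```python
def _make_record(header, chunks, sep):
    """Build (pdb_id, chain_id, sequence) from one fasta entry."""
    # Take the FIRST header column (assumed to be "PDBID:CHAINID"),
    # not the last one (which would be the literal word "SEQUENCE").
    first = header.split(sep)[0].strip()
    pdb_id, _, chain_id = first.partition(":")
    return pdb_id, chain_id, "".join(chunks)


def parse_pdb_fasta(text, sep="|"):
    """Parse multi-sequence fasta text into (pdb_id, chain_id, sequence) tuples.

    `sep` is the assumed column separator in the header line.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                records.append(_make_record(header, chunks, sep))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        records.append(_make_record(header, chunks, sep))
    return records


# Hypothetical input; the IDs and sequences are made up for illustration.
example = (
    ">1ABC:A|PDBID|CHAIN|SEQUENCE\n"
    "MKT\n"
    "AYI\n"
    ">2XYZ:B|PDBID|CHAIN|SEQUENCE\n"
    "GSH\n"
)

for pdb_id, chain_id, seq in parse_pdb_fasta(example):
    print(pdb_id, chain_id, seq)
```

With names of this form, each entry carries the PDB ID and chain needed
to look up the matching structure, while the full sequence (including
residues missing from the structure) still comes from the fasta file.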

-- 
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349
  WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078