Re: unable to open binary file

From: Kwee Hong (joyssstan0202_at_gmail.com)
Date: Tue Oct 26 2010 - 01:26:10 CDT

Hmm..

I'm still getting the error below, this time with some extra notes.

ERROR: Error on renaming file ZN_wb_md.restart.coor to ZN_wb_md.restart.coor.old: Invalid cross-device link
FATAL ERROR: Unable to open binary file ZN_wb_md.restart.coor: File exists
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Unable to open binary file ZN_wb_md.restart.coor: File exists

[0] Stack Traceback:
  [0:0] CmiAbort+0x5c [0xb4521c]
  [0:1] _Z8NAMD_errPKc+0x9d [0x520c99]
  [0:2] _ZN6Output17write_binary_fileEPciP6Vector+0x17e [0x98619e]
  [0:3] _ZN6Output26output_restart_coordinatesEP6Vectorii+0x1b5 [0x986003]
  [0:4] _ZN6Output10coordinateEiiP6VectorP11FloatVectorR7Lattice+0x12b [0x985c57]
  [0:5] _ZN24CkIndex_CollectionMaster39_call_receivePositions_CollectVectorMsgEPvP16CollectionMaster+0x18f [0x533603]
  [0:6] CkDeliverMessageFree+0x21 [0xa863df]
Charmrun: error on request socket--
Socket closed before recv.

This time I doubt the problem has to do with the file's permissions. We are
using NFS as the parallel file system on the cluster. We export it with the
(rw,sync,no_subtree_check,no_root_squash) options.
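
A small probe along these lines might help narrow it down. It is only a sketch, and the function name check_rename is made up for illustration (it is not part of NAMD). It creates a scratch file in the given directory, attempts the same rename-to-.old step outside NAMD, and reports the errno name if the rename fails. EXDEV is the errno behind "Invalid cross-device link".

```python
import errno
import os
import tempfile


def check_rename(directory):
    """Try a rename-to-.old step in `directory` and report the outcome.

    Returns a dict with the device IDs reported by stat() and the errno
    name of the rename failure (None if the rename succeeded).
    """
    # Create a scratch file where the restart files would be written.
    fd, src = tempfile.mkstemp(prefix="rename_probe_", dir=directory)
    os.close(fd)
    dst = src + ".old"
    result = {
        "dir_dev": os.stat(directory).st_dev,
        "file_dev": os.stat(src).st_dev,
        "error": None,
    }
    try:
        # os.rename wraps the rename(2) system call.
        os.rename(src, dst)
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. "EXDEV".
        result["error"] = errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        # Clean up whichever of the two scratch files still exists.
        for path in (src, dst):
            if os.path.exists(path):
                os.remove(path)
    return result


# Example (the path is a placeholder for the actual run directory):
# print(check_rename("/path/to/run/directory"))
```

If the probe reports EXDEV on a compute node but not on the head node, that would point at how the directory is mounted on that node rather than at permissions.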

Is there any way to tackle this?

Thanks

Regards,
Joyce

On Tue, Oct 5, 2010 at 11:55 AM, Kwee Hong <joyssstan0202_at_gmail.com> wrote:

> Hi Axel.
>
> Thanks for your reply. We got the problem solved. It had to do with the
> file's permissions.
>
> But I ran into another problem when I tried to extend the simulation of
> another structure.
>
> I got this message:
> FATAL ERROR: Bad global angle count!
>
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
> ------------- Processor 0 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: Bad global angle count!
>
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> After reading the NAMD troubleshooting page and some posts on the mailing
> list, I'm sure that I am using the correct .xsc file for the restart, with
> the margin set to 2.5 in the config file. When I visualised my system in
> VMD, I did not find any long bonds. I don't understand why this is happening.
>
> Then I tried to run the simulation from the beginning (without any
> restart file), and it returned the same error. Could there be a problem
> with my cluster, given that the same system ran on a single workstation for
> 1ns and completed successfully?
>
> Regards,
> Joyce
>
> On Mon, Oct 4, 2010 at 11:03 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
>
>> On Mon, Oct 4, 2010 at 10:00 AM, Kwee Hong <joyssstan0202_at_gmail.com>
>> wrote:
>> > Hi.
>> >
>> > I had my system run on a single workstation for 1ns and it completed
>> > successfully. Then I extended the simulation to 5ns using a 14-node
>> > cluster.
>> >
>> > Halfway through the simulation, I got this error:
>> >
>> > WRITING COORDINATES TO DCD FILE AT STEP 1605500
>> > WRITING COORDINATES TO RESTART FILE AT STEP 1605500
>> > ERROR: Error on renaming file 2mrt_md_extend.restart.coor to 2mrt_md_extend.restart.coor.old: Invalid cross-device link
>> > FATAL ERROR: Unable to open binary file 2mrt_md_extend.restart.coor: File exists
>> > ------------- Processor 0 Exiting: Called CmiAbort ------------
>> > Reason: FATAL ERROR: Unable to open binary file 2mrt_md_extend.restart.coor: File exists
>> >
>> > After reading the mailing list, I realised it might be caused by NAMD
>> > being computer-architecture dependent. Is there a way to solve it, or
>> > would I need to rerun the whole 5ns simulation on the cluster?
>>
>> joyce,
>>
>> no. this has nothing to do with it.
>>
>> the problem is the call to rename(2) preceding the attempt to
>> open the file. for some reason, you cannot overwrite the old
>> backup file. do you have some kind of parallel filesystem on
>> that new cluster? you should carefully check the status, ownership
>> and permissions of the existing and old restart files.
>>
>> cheers,
>> axel.
>>
>>
>> >
>> > Thanks.
>> >
>> > Regards,
>> > Joyce
>> >
>>
>>
>>
>> --
>> Dr. Axel Kohlmeyer akohlmey_at_gmail.com
>> http://sites.google.com/site/akohlmey/
>>
>> Institute for Computational Molecular Science
>> Temple University, Philadelphia PA, USA.
>>
>
>

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 05:23:20 CST