Re: AW: Low CPU usage with NAMD running on linux cluster

From: amin_at_imtech.res.in
Date: Tue Jul 31 2012 - 00:01:02 CDT

Thanks for the suggestions. Here is what I got using lshw -short. Please have a
look at it.
Regards.

Amin.

WARNING: you should run this program as super-user.
H/W path          Device           Class      Description
==========================================================================================
                                   system     Computer
/0                                 bus        Motherboard
/0/0                               memory     13GiB System memory
/0/1                               processor  Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
/0/2                               processor  Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
/0/100                             bridge     5000P Chipset Memory Controller Hub
/0/100/2                           bridge     5000 Series Chipset PCI Express x8 Port 2-3
/0/100/2/0                         bridge     6311ESB/6321ESB PCI Express Upstream Port
/0/100/2/0/0                       bridge     6311ESB/6321ESB PCI Express Downstream Port E1
/0/100/2/0/0/0    ib0              bus        MT25204 [InfiniHost III Lx HCA]
/0/100/2/0/2                       bridge     6311ESB/6321ESB PCI Express Downstream Port E3
/0/100/2/0/2/0    __tmp1817455902  network    80003ES2LAN Gigabit Ethernet Controller (Copper)
/0/100/2/0/2/0.1  eth3             network    80003ES2LAN Gigabit Ethernet Controller (Copper)
/0/100/2/0.3                       bridge     6311ESB/6321ESB PCI Express to PCI-X Bridge
/0/100/2/0.3/2    scsi0            storage    SAS1064 PCI-X Fusion-MPT SAS
/0/100/3                           bridge     5000 Series Chipset PCI Express x4 Port 3
/0/100/4                           bridge     5000 Series Chipset PCI Express x8 Port 4-5
/0/100/5                           bridge     5000 Series Chipset PCI Express x4 Port 5
/0/100/6                           bridge     5000 Series Chipset PCI Express x8 Port 6-7
/0/100/7                           bridge     5000 Series Chipset PCI Express x4 Port 7
/0/100/8                           system     5000 Series Chipset DMA Engine
/0/100/1c                          bridge     631xESB/632xESB/3100 Chipset PCI Express Root Port 1
/0/100/1c/0       eth0             network    82571EB Gigabit Ethernet Controller
/0/100/1c/0.1     eth1             network    82571EB Gigabit Ethernet Controller
/0/100/1d                          bus        631xESB/632xESB/3100 Chipset UHCI USB Controller #1
/0/100/1d/1       usb2             bus        UHCI Host Controller
/0/100/1d.1                        bus        631xESB/632xESB/3100 Chipset UHCI USB Controller #2
/0/100/1d.1/1     usb3             bus        UHCI Host Controller
/0/100/1d.2                        bus        631xESB/632xESB/3100 Chipset UHCI USB Controller #3
/0/100/1d.2/1     usb4             bus        UHCI Host Controller
/0/100/1d.3                        bus        631xESB/632xESB/3100 Chipset UHCI USB Controller #4
/0/100/1d.3/1     usb5             bus        UHCI Host Controller
/0/100/1d.7                        bus        631xESB/632xESB/3100 Chipset EHCI USB2 Controller
/0/100/1d.7/1     usb1             bus        EHCI Host Controller
/0/100/1d.7/1/8                    storage    Multidevice
/0/100/1e                          bridge     82801 PCI Bridge
/0/100/1e/c                        display    ES1000
/0/100/1f                          bridge     631xESB/632xESB/3100 Chipset LPC Interface Controller
/0/100/1f.1                        storage    631xESB/632xESB IDE Controller
/0/100/1f.1/0     ide0             bus        IDE Channel 0
/0/100/1f.1/0/1   /dev/hdb         disk       Optiarc DVD RW AD-7560A
/0/100/1f.2                        storage    631xESB/632xESB/3100 Chipset SATA IDE Controller
/0/100/1f.3                        bus        631xESB/632xESB/3100 Chipset SMBus Controller
/0/101                             bridge     5000 Series Chipset FSB Registers
/0/102                             bridge     5000 Series Chipset FSB Registers
/0/103                             bridge     5000 Series Chipset FSB Registers
/0/104                             bridge     5000 Series Chipset Reserved Registers
/0/105                             bridge     5000 Series Chipset Reserved Registers
/0/106                             bridge     5000 Series Chipset FBD Registers
/0/107                             bridge     5000 Series Chipset FBD Registers
/1                scsi3            storage
/2                scsi4            storage
/3                scsi5            storage
/4                scsi6            storage
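
For anyone repeating this check across several nodes, the listing can be filtered for interconnect devices with a short script. This is only a minimal sketch and not part of lshw: the names parse_lshw_short, interconnect_devices and KNOWN_CLASSES are hypothetical, the class list is taken from the classes visible in the output above, and the parser assumes the whitespace-separated text that lshw -short prints. Since the InfiniBand adapter (ib0) is reported under class "bus" rather than "network", the filter also keeps devices whose name starts with "ib".

```python
# Sketch: pull interconnect rows out of `lshw -short` text.
# All names here are illustrative; none of them come from lshw itself.
KNOWN_CLASSES = {
    "system", "bus", "memory", "processor", "bridge", "network",
    "storage", "display", "disk",
}

def parse_lshw_short(text):
    """Return a list of dicts with keys path, device, cls, description."""
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("=", "H/W path", "WARNING")):
            continue
        tokens = line.split()
        if tokens[0].startswith("/"):
            # The Device column may be empty, so the class is token 1 or 2.
            if len(tokens) > 1 and tokens[1] in KNOWN_CLASSES:
                device, cls, rest = "", tokens[1], tokens[2:]
            elif len(tokens) > 2 and tokens[2] in KNOWN_CLASSES:
                device, cls, rest = tokens[1], tokens[2], tokens[3:]
            else:
                continue  # unrecognised layout; skip rather than guess
            rows.append({"path": tokens[0], "device": device,
                         "cls": cls, "description": " ".join(rest)})
        elif rows and tokens[0] not in KNOWN_CLASSES:
            # A wrapped description line belongs to the previous row.
            rows[-1]["description"] += " " + line
    return rows

def interconnect_devices(rows):
    """Keep network-class rows plus devices named like InfiniBand ports."""
    return [r for r in rows
            if r["cls"] == "network" or r["device"].startswith("ib")]

# A few lines copied from the listing above, including one wrapped row.
SAMPLE = """\
/0/100/2/0/0/0 ib0 bus MT25204 [InfiniHost III Lx HCA]
/0/100/2/0/2/0.1 eth3 network 80003ES2LAN Gigabit Ethernet
Controller (Copper)
/0/100/1c/0 eth0 network 82571EB Gigabit Ethernet Controller
"""

for row in interconnect_devices(parse_lshw_short(SAMPLE)):
    print(row["device"], "|", row["description"])
```

On a node, the same functions could be fed the saved output of lshw -short instead of SAMPLE.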
> Hi all,
>
> To me this sounds more like a problem with the node interconnect. By the way,
> as long as the scaling is linear, you shouldn't worry too much about low CPU
> utilization. If the scaling is bad, meaning the speedup does not keep pace
> with the number of nodes used, then you should worry more.
>
> First of all, we need to know what interconnect you have. You already said
> Ethernet, but there are 10/100/1000/10000 Mbit/s versions out there, so
> please tell us which one it is.
>
> You can check that by running lshw on the nodes and looking for the Ethernet
> adapters; post the device names here.
> The name or model of the switch is of course important too.
>
> Let us know.
>
> Norman Geist.
>
>> -----Original Message-----
>> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
>> Behalf Of amin_at_imtech.res.in
>> Sent: Sunday, 29 July 2012 07:10
>> To: Rajan Vatassery
>> Cc: namd-l_at_ks.uiuc.edu
>> Subject: Re: namd-l: Low CPU usage with NAMD running on linux cluster
>>
>> What I meant was that I am distributing 24 processes over 4 nodes with 8
>> processors each, so each node will have 2 processors free. I have not
>> completed any run so far, so I can't give the "TIMING INFORMATION", because
>> the simulation is running extremely slowly. My 2 ns simulation has been
>> running for more than 2 days now. I will post that information as soon as
>> the run is completed.
>> Thanks.
>>
>> Amin.
>>
>>
>>
>>
>> > Amin,
>> > Do you really mean that you have requested 4 nodes x 8 processors each
>> > = 32 processors? I'm wondering because you said you have only 24
>> > processors. Also, Branko is not asking about output frequencies, but
>> > rather the details of the output that your simulation is giving. For
>> > example, the "TIMING" information from the log file is indicative of the
>> > seconds/step that your simulation is running at.
>> >
>> > rajan
>> >
>> > On Sat, 2012-07-28 at 20:40 +0530, amin_at_imtech.res.in wrote:
>> >> Thanks. I have read the link. I have a PBC system with 20,000 atoms with
>> >> time step=1, dcdFreq=500, outputEnergies=1000. I am trying to run an NPT
>> >> simulation using 24 processors on 4 nodes having 8 processors each.
>> >>
>> >> Regards.
>> >> Amin.
>> >>
>> >>
>> >> > Amin,
>> >> >
>> >> > Provide more data about the size of your system, output data, and see:
>> >> >
>> >> > http://www.ks.uiuc.edu/Research/namd/wiki/?NamdPerformanceTuning
>> >> >
>> >> > Branko
>> >> >
>> >> > On 7/28/2012 8:33 AM, amin_at_imtech.res.in wrote:
>> >> >> Dear all,
>> >> >> I am trying to run NAMD on a linux cluster. I am using NAMD Linux-x86_64
>> >> >> (64-bit Intel/AMD with ethernet). While I am able to run the program on
>> >> >> the nodes listed in the nodelist file, I find that all the processes are
>> >> >> running at only 8-12% CPU usage. Can someone please guide me?
>> >> >>
>> >> >> Regards.
>> >> >> Amin.
>> >> >>
>> >> >>
>
>
>
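
The figures quoted in the thread already allow a rough bound on the seconds/step that the "TIMING" lines would report. The sketch below works that arithmetic through, assuming the time step of 1 is in femtoseconds (the unit NAMD uses for its timestep parameter) and that the two days were spent on this one 2 ns run. The function names are hypothetical. Because the run had not finished, fewer than all steps were done in that time, so the value is a lower bound on seconds/step. The last function expresses the scaling check described above as speedup divided by node count; it needs measured timings, which the thread does not contain.

```python
# Sketch: bound the seconds/step from the numbers quoted in the thread.
# Assumes the timestep of 1 is in femtoseconds; all names are illustrative.
FS_PER_NS = 1_000_000
SECONDS_PER_DAY = 86_400

def total_steps(sim_ns, timestep_fs):
    """Number of integration steps needed for sim_ns nanoseconds."""
    return round(sim_ns * FS_PER_NS / timestep_fs)

def min_seconds_per_step(elapsed_days, steps):
    """Lower bound on s/step for a run still unfinished after elapsed_days."""
    return elapsed_days * SECONDS_PER_DAY / steps

def parallel_efficiency(t_one_node, t_n_nodes, n_nodes):
    """Speedup over a one-node run divided by node count (1.0 = linear)."""
    return (t_one_node / t_n_nodes) / n_nodes

steps = total_steps(2, 1)               # 2 ns at 1 fs per step
bound = min_seconds_per_step(2, steps)  # more than 2 days elapsed
print(f"{steps} steps, at least {bound:.4f} s/step")
```

With real seconds/step values from a one-node and a four-node run, parallel_efficiency would show directly whether the scaling is close to linear.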

______________________________________________________________________
सूक्ष्मजीव प्रौद्योगिकी संस्थान (वैज्ञानिक औद्योगिक अनुसंधान परिषद)
Institute of Microbial Technology (A CONSTITUENT ESTABLISHMENT OF CSIR)
सैक्टर 39 ए, चण्डीगढ़ / Sector 39-A, Chandigarh
पिन कोड/PIN CODE :160036
दूरभाष/EPABX :0172 6665 201-202

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:21:52 CST