
cloud - PVM terminates after Adding Host

Question:

On Ubuntu 9.10 using PVM 3.4.5-12 (the PVM package when you use apt-get)

The program terminates after adding a host.

laptop> pvm

pvm> add bowtie-slave

add bowtie-slave

terminated

laptop>

Current configuration: only $PVM_RSH is set, to /usr/bin/ssh


I can ssh perfectly fine into the slave without a password, and run commands on it.

Any ideas?

Thanks in advance!

Here are the sample logs:

Laptop log

[t80040000] 02/11 10:23:32 laptop (127.0.1.1:xxxxx) LINUX 3.4.5

[t80040000] 02/11 10:23:32 ready Thu Feb 11 10:23:32 2010

[t80040000] 02/11 10:23:32 netoutput() sendto: errno=22

[t80040000] 02/11 10:23:32 em=0x2c24f0

[t80040000] 02/11 10:23:32 [49/à][6e/à][76/à][61/à][6c/à][69/à][64/à][20/à][61/à][72/à]

[t80040000] 02/11 10:23:32 netoutput() sendto: Invalid argument

[t80040000] 02/11 10:23:32 pvmbailout(0)

bowtie-log

[t80080000] 02/11 10:23:25 bowtie-slave (xxx.x.x.xxx:xxxxx) LINUX64 3.4.5

[t80080000] 02/11 10:23:25 ready Thu Feb 11 10:23:25 2010

[t80080000] 02/11 10:28:26 work() run = STARTUP, timed out waiting for master

[t80080000] 02/11 10:28:26 pvmbailout(0)

Answer:

I've also been struggling with this problem. I just found a couple of the things that were failing for me.

First, my master host was starting with a node-name that was not recognized by the slave host. That is, it was calling itself "foobar" but it really should have been "foobar.example.com" so that the slave knew how to talk to it. You specify this by starting the master console like this:

pvm -nfoobar.example.com

I also specified the full name of the slave. So in the console:

add baz.mumble.example.com

Then I had a problem where the console would hang when I added the slave. Hey, at least it's not just stopping! It turned out to be the firewall on the slave host: the traffic was being dropped, because the pvmds don't communicate over ssh after setup; they talk to each other over a separate port. Unfortunately, running without a firewall is not an option for that host. By default, pvmd picks a random port number, which is not what I want. Apparently there's an undocumented environment variable, PVMNETSOCKPORT, that controls which ports it uses. Right now I'm working on getting that set correctly so that I can poke the right hole in my firewall.

Good luck! I'll try to update this answer if I get any further.

Answer:

Ahh... the joys of starting up PVM! I use PVM via an external library, InterComm. Getting PVM to start nicely on any platform is always a fun exercise. Here are some things you can try:

If you can rsh to your compute nodes, set $PVM_RSH=/path/to/rsh. Otherwise, to configure via ssh:

Set up passwordless SSH and manually verify that it works.

Then, create $PVM_ROOT/ssh, containing something like:

#!/bin/sh

host=$1
shift
/usr/bin/ssh $host ". ~/.pvmprofile; $@"

Once that's taken care of:

Set some environment variables (this is machine-dependent):

setenv PVM_ARCH LINUX64
setenv PVM_ROOT /users/ps14/opt-intel/pvm3
setenv PVM_BIN ${PVM_ROOT}/bin

# Set the following accordingly:    
setenv PVM_RSH ${PVM_ROOT}/ssh
#setenv PVM_RSH rsh

Now, create a ".pvmprofile" file containing these variables:

rm -f ~/.pvmprofile
env | grep PVM_ > ~/.pvmprofile
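
One thing to watch: env prints plain NAME=value lines, and when a Bourne-style shell sources such a file with ".", it sets the variables without exporting them (unless they were already in its environment), so a pvmd started from that shell may not see them. A variant that writes export lines is sketched below. It is written for sh rather than csh, it assumes the remote side reads the file with sh (as the "." in the wrapper script implies) and that the values contain no single quotes, and the function name write_pvmprofile is made up for illustration.

```shell
#!/bin/sh
# Write every PVM_* variable from the current environment to a profile
# file as "export NAME='value'" lines, so that sourcing the file with "."
# also passes the variables on to child processes such as pvmd.
# Assumption: values contain no single quotes.
write_pvmprofile() {
    # First argument is the output file; defaults to ~/.pvmprofile.
    out=${1:-"$HOME/.pvmprofile"}
    env | grep '^PVM_' | sed "s/^\([^=]*\)=\(.*\)\$/export \1='\2'/" > "$out"
}
```

Running write_pvmprofile with no argument from an sh session writes ~/.pvmprofile; pass a path as the first argument to write somewhere else.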

Create a hostfile containing unique hostnames:

sort -k 1,1 -u ${PBS_NODEFILE} >!  pvm_hostfile

Now, start PVM & add nodes. I like to do this as a one-liner:

printf "%s\n%s\n" conf quit|${PVM_ROOT}/lib/pvm pvm_hostfile

Answer:

I didn't realize I could answer my own question until now. It failed because of the entries in /etc/hosts.

Ubuntu's default /etc/hosts has 127.0.0.1 localhost, and the machine name also resolved to a loopback address (127.0.1.1 in the laptop log above), but PVM needs the machine name to resolve to a real IP address. So I put a line with the actual IP address followed by my machine name above the localhost line, so that PVM reads that line first. After that everything worked. I don't know why it never gave me the loopback error message, though.
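
A quick way to check for this before starting pvm is to see what address the machine's own name resolves to. The sketch below assumes a Linux system where getent and hostname are available and where /etc/hosts is consulted first (the usual Ubuntu default, but it depends on nsswitch.conf). The helper name is_loopback is made up for illustration.

```shell
#!/bin/sh
# Return success (0) if the given IPv4 address is in the 127.0.0.0/8
# loopback range, failure (1) otherwise.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Resolve this machine's own name and keep the first address reported.
addr=$(getent hosts "$(hostname)" | awk '{print $1; exit}')

if is_loopback "$addr"; then
    echo "$(hostname) resolves to loopback ($addr); PVM needs a real IP here"
else
    echo "$(hostname) resolves to: $addr"
fi
```

If the first message appears, reordering /etc/hosts as described above (or starting the console with an explicit name) is the thing to try.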

As rescdsk also pointed out, giving the host name explicitly when starting the master console would work too, but I wanted to be lazy and just type pvm.

I haven't addressed the security issues yet... maybe rescdsk or Pete will have some nice suggestions for security holes. Although, my host/clusters will not be connected to the internet. Are there any concerns?
