Fixing Mr. Bayes, MPI, and SSH keys

Here are obscure notes about solving a problem with Mr. Bayes, MPI, and SSH:

PROBLEM: Mr. Bayes (or some other MPI application) fails. When we execute this command:

mpirun -v -machinefile .bhosts -np 8 mb < script.nex

. . . we get the following output:

running /common/bin/mb on 8 freebsd_ppc ch_p4 processors
Created /Users/victor/PI26710
Password:
Parallel version of
p0_26516: p4_error: Child process exited while making connection to remote process on node003.cluster.private: 0
p0_26516: (15.092200) net_send: could not write to fd=5, errno = 32
DIAGNOSIS: Your SSH keys are not set up correctly to allow MPI to communicate with the other nodes.
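One quick way to confirm this diagnosis is to attempt a non-interactive login to each node. The sketch below is only an illustration: it assumes the machinefile lists one host per line (optionally as host:count), and check_nodes and the SSH override variable are hypothetical names, not part of MPI or OpenSSH. BatchMode=yes makes ssh fail instead of prompting for a password, so a node that would show "Password:" is reported as FAILED.

```shell
#!/bin/sh
# check_nodes HOSTSFILE
# Try a key-only login to every host in HOSTSFILE and report the result.
# Assumption: one host per line, optionally followed by ":count".
# SSH may be set to another command for a dry run; it defaults to ssh.
check_nodes() {
    hostsfile=$1
    failed=0
    while IFS= read -r line || [ -n "$line" ]; do
        host=${line%%:*}            # drop an optional ":count" suffix
        [ -n "$host" ] || continue  # skip blank lines
        # -n keeps ssh from swallowing the rest of the hosts file on stdin
        if ${SSH:-ssh} -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "ok $host"
        else
            echo "FAILED $host"
            failed=1
        fi
    done < "$hostsfile"
    return $failed
}
```

Calling "check_nodes .bhosts" before and after the fix should show which nodes still ask for a password.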
SOLUTION: Follow these steps. . .
  1. cd ~/.ssh
  2. ssh-keygen -t dsa -f id_dsa
  3. cat id_dsa.pub >> authorized_keys
  4. chmod 640 authorized_keys
  5. Open authorized_keys with your favorite text editor. The first line should contain a key for you@your.awesome.cluster.
  6. Copy the first line and paste it once for each node in the cluster, changing the hostname at the end to match the name of the node. For example, the first few lines of my authorized_keys file look like this (where “. . .” marks pieces I’ve abridged for security reasons; each key is a single line in the real file and is only wrapped here for display):

ssh-dss AAAAB3NzaC1kc3MAAACBAO6K5GKxrd2UO. . .
. . .
b8R7y6RJCTDRDw6iOJK8xKSvnC
X8= victor@my.awesome.cluster.edu
ssh-dss AAAAB3NzaC1kc3MAAACBAO6K5GKxrd2UO. . .
. . .
b8R7y6RJCTDRDw6iOJK8xKSvnC
X8= victor@node002.cluster.private
ssh-dss AAAAB3NzaC1kc3MAAACBAO6K5GKxrd2UO. . .
. . .
b8R7y6RJCTDRDw6iOJK8xKSvnC
X8= victor@node003.cluster.private

. . . and now your MPI application should work.
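Step 6 gets tedious on a cluster with many nodes, so here is a rough sketch of how it could be scripted. It assumes that the first line of authorized_keys is your key and ends in user@host, and that the machinefile lists one host per line (optionally as host:count). The function name replicate_key is hypothetical.

```shell
#!/bin/sh
# replicate_key KEYFILE HOSTSFILE
# Append one copy of KEYFILE's first line per host in HOSTSFILE,
# replacing the hostname after the final "@" with that host.
replicate_key() {
    keyfile=$1
    hostsfile=$2
    first=$(head -n 1 "$keyfile")
    prefix=${first%@*}              # everything before the final "@"
    while IFS= read -r line || [ -n "$line" ]; do
        host=${line%%:*}            # drop an optional ":count" suffix
        [ -n "$host" ] || continue  # skip blank lines
        printf '%s@%s\n' "$prefix" "$host" >> "$keyfile"
    done < "$hostsfile"
}
```

For example, "replicate_key ~/.ssh/authorized_keys .bhosts" would append one line per node. Back up authorized_keys first, since the function appends to it in place.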

If you’re fixing this problem for someone else (assuming you have root privileges), do the following additional steps:

  1. All the keys you generate will be for root@my.awesome.cluster. In authorized_keys and id_dsa.pub, change root@my.awesome.cluster to someone.else@my.awesome.cluster, where someone.else is the appropriate username.
  2. All the keyfiles you generate will be owned by root, which is not what we want. Run “chown USERNAME authorized_keys id_dsa*” to hand them over to the user.
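The username change in step 1 can also be done with sed rather than by hand. This is a minimal sketch, assuming the key comments have the form " root@host" at the end of each line; retag_root_keys is a hypothetical helper name.

```shell
#!/bin/sh
# retag_root_keys USERNAME FILE...
# Rewrite " root@" to " USERNAME@" in each FILE, keeping the file's
# existing permissions by writing the result back into the same file.
retag_root_keys() {
    user=$1
    shift
    for f in "$@"; do
        sed "s/ root@/ ${user}@/" "$f" > "$f.tmp" &&
            cat "$f.tmp" > "$f" &&
            rm -f "$f.tmp"
    done
}
```

Run as root from inside the user's .ssh directory, this would be "retag_root_keys someone.else authorized_keys id_dsa.pub", followed by the chown from step 2.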
