The TCP/UNIX version of CRL 1.0 was developed and debugged on a network of uniprocessor Sun SPARCstations (of various kinds) running SunOS 4.1.3. Interprocessor communication is effected over standard 10 Mbit/second ethernet using TCP sockets. We expect that the core elements of the TCP/UNIX version of CRL will work on other UNIX-like hardware/software platforms with only modest porting effort.
To use the TCP/UNIX version of CRL, PVM [5] must be installed on the system; this allows CRL to use the group and process management facilities provided by PVM instead of reimplementing them from scratch. It also allows applications running under the TCP/UNIX version of CRL to freely intermix invocations of CRL and PVM functionality (with one exception; see the description of crl_init in Section 6.2 below). We have tested CRL 1.0 against versions 3.3.7 and 3.3.8 of PVM.
The TCP/UNIX implementation included in CRL 1.0 is not intended to be a high-performance distributed shared memory implementation delivering speedups similar to those obtained with the Alewife and the CM-5 implementations. It is provided to allow development and experimentation by those without access to a CM-5 and to display the ease with which CRL may be ported.
To build the TCP/UNIX version of CRL 1.0:
unix% gunzip crl-1.0.tar.gz
This will produce an uncompressed version of crl-1.0.tar.gz named crl-1.0.tar (and will also remove the compressed version).
unix% tar xf crl-1.0.tar
This will create a subdirectory of the current working directory called crl-1.0 and unpack the CRL 1.0 distribution into it.
unix% cd crl-1.0/src
This directory contains the sources for the CRL library.
unix% make -f Makefile.TCPUNIX
Once this completes, you are done building CRL. Applications intended for use with the TCP/UNIX version of CRL should be linked against the resulting object file (libcrl.o).
To build and run the example application shown in Appendix A:
unix% cd ../apps/example
unix% make -f Makefile.TCPUNIX
unix% pvm
This should put you in the PVM console application. The help command will give a short description of each console command. The most useful for us are
if (! $?prompt) exit
in your .cshrc before anything is output. This line causes non-interactive shells to stop executing .cshrc.
setenv PVM_ROOT /home/crl/pvm3
setenv PVM_ARCH `$PVM_ROOT/lib/pvmgetarch`
set path=($path $PVM_ROOT/lib/$PVM_ARCH \
          ~/pvm3/bin/$PVM_ARCH $PVM_ROOT/bin/$PVM_ARCH)
The first setenv should set PVM_ROOT to wherever PVM is installed on your local system.
If the executable is reachable through $path (e.g., by adding a symbolic link to it from ~/pvm3/bin/$PVM_ARCH and running rehash), then the sample applications can be started simply by executing the application. Because of the way that pvm_app.c is written, no relative pathnames can be used when invoking the executable (e.g., ./example won't work); only executable names or absolute pathnames are acceptable.
unix% example
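The naming rule for invoking the executable can be sketched as a small check. This is only an illustration under assumptions: invocation_name_ok is a hypothetical helper, not something taken from pvm_app.c, CRL, or PVM, and it assumes that any name containing a slash but not starting with one counts as a relative pathname.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of CRL or PVM): returns 1 if `name` is a
 * bare executable name or an absolute pathname, and 0 if it is empty or a
 * relative pathname such as "./example" or "apps/example".
 */
int invocation_name_ok(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return 0;                         /* nothing to run */
    if (name[0] == '/')
        return 1;                         /* absolute pathname */
    return strchr(name, '/') == NULL;     /* bare name: no directory part */
}
```

Under these assumptions, "example" and "/home/crl/pvm3/bin/example" pass the check, while "./example" does not.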
Applications in the other subdirectories of crl-1.0/apps can be compiled and run in a similar manner.
To take advantage of PVM's group and process management functionality, the interface to crl_init is slightly different under the TCP/UNIX version of CRL:
void crl_init(char *groupname)
crl_init should be called by all member processes simultaneously. crl_init should be called before any CRL functionality is utilized.
No node should call crl_init before all the other nodes have joined the PVM group. After CRL has been initialized, no process is allowed to leave the group (nor are new processes allowed to join it). Also, the instance numbers assigned to the processes must run consecutively from 0 to crl_num_nodes - 1. This can only be a problem if processes have previously left the group.
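As a rough illustration of the numbering requirement, the following sketch checks that a set of instance numbers covers 0 through n - 1 exactly once. The helper name instances_are_consecutive and the idea of collecting the instance numbers into an array are assumptions made for this example; neither CRL nor PVM is known to provide such a function.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: returns 1 if inums[0..n-1] contains each of the
 * values 0, 1, ..., n-1 exactly once (in any order), and 0 otherwise.
 */
int instances_are_consecutive(const int *inums, int n)
{
    unsigned char *seen;
    int i, ok = 1;

    if (inums == NULL || n <= 0)
        return 0;
    seen = calloc((size_t)n, 1);          /* seen[k] != 0 once k has appeared */
    if (seen == NULL)
        return 0;
    for (i = 0; i < n && ok; i++) {
        if (inums[i] < 0 || inums[i] >= n || seen[inums[i]])
            ok = 0;                       /* out of range or duplicate */
        else
            seen[inums[i]] = 1;
    }
    free(seen);
    return ok;
}
```

For example, the numbers 0, 1, 3 in a three-process group fail this check, which is the kind of gap that could appear after a process has left the group.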
In the TCP/UNIX version of CRL, crl_init will block until network connections have been established with all other processes in the PVM group. If insufficient network resources are available to establish these connections, CRL will probably die catastrophically instead of doing something useful (e.g., returning an error code or printing a useful error message).
To allow graceful and coordinated termination of a group of PVM processes running CRL, the TCP/UNIX implementation also provides a crl_exit function, which ensures that all nodes exit cleanly and helps eliminate mysterious errors.
void crl_exit(void)
crl_exit should be called before a process exits (but after it has finished using all CRL functionality).
The sample main() in apps/pvm.common/pvm_app.c is an example of how to start an application using the TCP/UNIX version of CRL. All of the applications included in the apps directory use this sample main(). See the README file in apps/pvm.common for information on what main() is doing.
In general, an application using the TCP/UNIX implementation of CRL must do the following during startup:
The applications (or symbolic links to the applications) spawned by the PVM process management code must be available in the path searched by PVM.
Even though an application may be started from an arbitrary place in the filesystem, each of the processes spawned by PVM initially executes in the user's home directory. Unless the working directory of the spawned application is changed (using standard C library calls), any files accessed by the application must have pathnames relative to the home directory.
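One way to avoid depending on the home directory is to change the working directory early in each spawned process. A minimal sketch using the standard chdir call follows; enter_work_dir is a hypothetical helper name chosen for this example, and the directory passed to it is whatever your application needs.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: make `dir` the working directory of the calling
 * process. Returns 0 on success and -1 on failure, after printing a
 * diagnostic to stderr.
 */
int enter_work_dir(const char *dir)
{
    if (dir == NULL) {
        fprintf(stderr, "enter_work_dir: no directory given\n");
        return -1;
    }
    if (chdir(dir) != 0) {
        perror(dir);                      /* e.g. directory does not exist */
        return -1;
    }
    return 0;
}
```

Calling such a helper with an absolute pathname before opening any files lets the rest of the application use pathnames relative to that directory instead of the home directory.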
The Makefiles supplied for use with the TCP/UNIX version of CRL 1.0 assume that the GNU C compiler (gcc) is available. If gcc is not available in your local environment or you would like to use a different C compiler, you will need to edit the Makefile.TCPUNIX files in crl-1.0/src and crl-1.0/apps/* and change the CC = gcc lines to name the C compiler that should be used instead.
The TCP/UNIX version of CRL 1.0 uses SIGIO to implement active-message-like functionality using TCP sockets. Mixing CRL with source code that manipulates signals for other reasons should only be done with the greatest of care.
Further information about PVM (including sources) can be obtained via the PVM World Wide Web Home Page (http://www.epm.ornl.gov/pvm/).