Next: 4. Evaluation
Up: PGMAKE: A Portable Distributed
Previous: 2. Background
3. Design
For gmake to faithfully execute a process remotely, it is necessary
to duplicate the caller's environment on the ``callee'' side.
The following concerns must be resolved when attempting to
remotely execute a (compilation) process.
- 1.
- Architecture and Operating System.
For most operations, it is important that the actual make commands are
executed on a machine of the same architecture and operating system as the
target. PVM contains knowledge of the particular host
architecture of each CPU in the virtual machine and can be directed to
launch jobs only on machines of a given architecture.
- 2.
- Filesystems.
All commands generated by make are executed relative to the working
directory from which the make was issued. In a networked environment,
it is important to be able to ``root'' the remote execution unit in
the correct directory before issuing commands. This brings up the
problem of a uniform global naming scheme for file hierarchies.
The most widely used networked filesystem, ONC-NFS from Sun
Microsystems [9], does not enforce a global namespace for
filesystems, which presents a problem in our model. Therefore,
directories in which pgmake will be invoked must
have the following properties:
  - The directory must be available for NFS mount to all remote nodes, and
  - the name by which it is referred to must be globally defined.
Public domain automounters such as amd can
assist in maintaining global names and maps of shared
filesystems [7].
- 3.
- Time.
The success of any make utility lies in its ability to check the time
stamps of files and determine which ones need rebuilding. Therefore,
it is necessary to run some type of time synchronization
protocol between nodes in the virtual machine. The popular Network
Time Protocol, ntp, for example, is available in the public
domain [6].
- 4.
- Shell Environment Variables.
Executables launched from make may use the user's environment variables as
parameter settings. Therefore, it is important to duplicate the caller's
shell environment variables on the remote execution side.
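The time concern above can be illustrated with a minimal sketch of the kind of timestamp comparison a make utility performs. This is not gmake's actual code: the function names is_stale and needs_rebuild are hypothetical, and the rule shown (rebuild when the prerequisite is strictly newer than the target) is a simplifying assumption.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: a target is stale when its prerequisite
 * carries a strictly newer modification time. */
int is_stale(time_t target_mtime, time_t prereq_mtime)
{
    return prereq_mtime > target_mtime;
}

/* Hypothetical helper, not taken from gmake.
 * Returns 1 if `target` is missing or older than `prereq`,
 * 0 if it is up to date, and -1 if `prereq` cannot be examined. */
int needs_rebuild(const char *target, const char *prereq)
{
    struct stat t, p;

    if (stat(prereq, &p) != 0)
        return -1;              /* prerequisite missing: cannot decide */
    if (stat(target, &t) != 0)
        return 1;               /* no target yet: it must be built */
    return is_stale(t.st_mtime, p.st_mtime);
}
```

The comparison only makes sense if both time stamps come from clocks that agree. For example, if a target's time stamp were written by a clock running 60 seconds behind, is_stale(940, 1000) would report the target as stale even though it was built after the prerequisite was edited.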
Figure 1: Flow of execution with pgmake
3.1 gmake Modifications and Execution Agent
In order to minimize the number of locations and software packages
that have to be modified to provide distributed operation,
PVM was left unmodified, and modifications were made only to
gmake. This was also done to make pgmake more attractive to potential
users, without adversely affecting existing installations of PVM.
Pgmake currently works with GNU make version 3.64, and the
latest known version of PVM, version 3.1.
Because GNU make already has stub provisions for remote jobs, the
bulk of our work was to construct a new module for GNU make. Four
main interface functions were supplied to provide support for remote
jobs. The primary flow of execution is depicted in Figure
1.
The following GNU make functions were provided by us to interface
with PVM:
- start_remote_job_p.
This predicate function determines if the next job should be run
locally or remotely. If the user provides the ``remote'' switch to
pgmake, jobs are allowed to run remotely, and pvm_mytid()
is called; its first call enrolls the process with PVM.
- start_remote_job.
This function is called by GNU make's jobs.c module to execute a job
remotely. The function gets passed an argument vector, environment
pointer, and a standard-input file-descriptor. Pgmake forms a
message with these pieces of information along with the current
working directory and calls the PVM function pvm_spawn() to
initiate the remote job. Pgmake records the task ID, tid,
returned from the call in a table for future reference.
- remote_status.
This call is invoked by pgmake when it has determined that it cannot
issue any more jobs, and needs to wait for a job slot to empty, so that it
can either fill it with a new job, or end the entire compilation. At the
heart of this call is the non-blocking PVM call pvm_nrecv(), which
is very similar to the UNIX select() system call. When a remote job
terminates for any reason, its return status is collected and sent to the
parent, in addition to any output it generated on stdout or stderr.
- remote_kill.
Sends remote pvmd's a notice to terminate the tasks they
are currently managing (which are the execution agents).
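As an illustration of the message that start_remote_job forms, the following minimal sketch flattens a working directory, an argument vector, and an environment into one byte buffer. The layout (NUL-terminated strings, each vector preceded by its element count in decimal) and the names pack_job and next_str are assumptions made for this example. The actual pgmake code would use PVM's own packing routines, which are not shown here.

```c
#include <stdio.h>
#include <string.h>

/* Append one NUL-terminated string at *off; returns 0, or -1 if full. */
static int put_str(char *buf, size_t cap, size_t *off, const char *s)
{
    size_t n = strlen(s) + 1;

    if (n > cap - *off)
        return -1;
    memcpy(buf + *off, s, n);
    *off += n;
    return 0;
}

/* Append a NULL-terminated vector: its count, then each element. */
static int put_vec(char *buf, size_t cap, size_t *off, char **v)
{
    char num[32];
    size_t i, n = 0;

    while (v[n] != NULL)
        n++;
    snprintf(num, sizeof num, "%lu", (unsigned long)n);
    if (put_str(buf, cap, off, num) != 0)
        return -1;
    for (i = 0; i < n; i++)
        if (put_str(buf, cap, off, v[i]) != 0)
            return -1;
    return 0;
}

/* Hypothetical layout: cwd, argv vector, environment vector.
 * Returns the number of bytes written, or -1 if buf is too small. */
long pack_job(const char *cwd, char **argv, char **envp,
              char *buf, size_t cap)
{
    size_t off = 0;

    if (put_str(buf, cap, &off, cwd) != 0 ||
        put_vec(buf, cap, &off, argv) != 0 ||
        put_vec(buf, cap, &off, envp) != 0)
        return -1;
    return (long)off;
}

/* Receiver side: return the string at *off and advance past its NUL.
 * Returns NULL at the end of the message or if it is truncated. */
const char *next_str(const char *buf, size_t len, size_t *off)
{
    const char *s, *end;

    if (*off >= len)
        return NULL;
    s = buf + *off;
    end = memchr(s, '\0', len - *off);
    if (end == NULL)
        return NULL;
    *off += (size_t)(end - s) + 1;
    return s;
}
```

Sending explicit counts rather than an empty-string terminator keeps the format unambiguous when an argument is itself an empty string.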
Our code changes to GNU make number roughly 400 lines, and reside
almost completely in one separate and new module, remote-pvm3.c.
A minor change was made outside this module for supporting two new GNU
make options: -R tells pgmake to run all of its jobs
remotely if possible, and -D turns on verbose debugging of
pvmd and pvm3_ad for pgmake.
3.2 Agent-Daemon: pvm3_ad
The pvm3_ad sits between the remote pvmd's and the actual jobs that
are executing to provide the appropriate execution environment.
The paradoxical name ``Agent-Daemon'' reflects its dual role in the
pgmake system. Pvm3_ad serves as an agent: it assists the
remote pvmd to fork a job, collect its status and output, and return
them to pgmake. Each time pgmake needs to start a job, it spawns a
pvm3_ad, which in turn forks the actual job.
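The agent role described above can be sketched with plain POSIX calls. This is a minimal illustration under stated assumptions, not the pvm3_ad source: run_job is a hypothetical name, the job description is passed as ordinary arguments instead of arriving in a PVM message, and standard input handling is omitted.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical agent step: run argv[0] in directory cwd with the
 * environment envp, capture stdout and stderr into out (truncated to
 * cap - 1 bytes), and return the exit status, or -1 on failure. */
int run_job(const char *cwd, char *const argv[], char *const envp[],
            char *out, size_t cap)
{
    int fds[2], status;
    size_t used = 0;
    ssize_t n;
    char tmp[256];
    pid_t pid;

    if (cap == 0 || pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: becomes the job */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[1], STDERR_FILENO);
        close(fds[1]);
        if (chdir(cwd) != 0)            /* "root" the job in the caller's directory */
            _exit(126);
        execve(argv[0], argv, envp);    /* run with the caller's environment */
        _exit(127);                     /* reached only if exec failed */
    }
    close(fds[1]);                      /* parent: acts as the agent */
    while ((n = read(fds[0], tmp, sizeof tmp)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;

        memcpy(out + used, tmp, take);  /* keep what fits, drain the rest */
        used += take;
    }
    out[used] = '\0';
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In this sketch the returned status and the captured output are the two items an agent would send back to the parent. Draining the pipe even after the buffer is full keeps a verbose job from blocking on a full pipe.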
Erez Zadok
1999-02-17