Process abstraction and management
Required reading: Chapter 7 and the remainder of Chapter 12
Overview
In the next set of lectures we will discuss the implementation of kernel
services for process management, interprocess communication, I/O, and
file systems. Today's topic is process management and the case study
will be UNIX's process management services; all operating systems
support something similar to what UNIX supports.
A process is one address space combined with one thread. Process
management in UNIX includes the functions fork, wait, exec, exit, kill,
getpid, brk, nice, sleep, and trace. (All system calls in v6 are in
the sysent table on sheet 29; pick out the ones related to process
management.)
To recall how they are used, remember the structure of the shell:
while (1) {
    printf ("$");
    readcommand (command, args);
    if ((pid = fork ()) == 0) {        // child?
        exec (command, args, 0);       // arg 0 is name of program
    } else if (pid > 0) {              // parent?
        wait (pid);
    } else {
        printf ("Unable to fork\n");
    }
}
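The loop above is pseudocode. A minimal runnable sketch of its fork/exec/wait core, assuming the modern POSIX calls execvp and waitpid (whose signatures differ from the v6 ones), might look like this. The name run_command is hypothetical and does not come from v6 or any shell source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with arguments argv and wait for it.
 * Returns the child's exit status, or -1 if fork/wait failed or the
 * child was killed by a signal. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {                 /* child */
        execvp(argv[0], argv);      /* returns only on failure */
        perror("exec");
        _exit(127);                 /* do not fall back into the shell loop */
    } else if (pid > 0) {           /* parent */
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    perror("Unable to fork");
    return -1;
}
```

Note the _exit after a failed exec: without it the child would continue around the loop as a second copy of the shell.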
As a side note, how does the shell get started? (Init fork+execs
login, one per terminal; login execs the shell listed in the passwd file.)
How can a shell implement background jobs (a process group)?
$ compute &
The shell just doesn't call wait, but instead reads the next command.
The shell periodically polls jobs to find out what their status is by
calling wait (passing a flag not to block). (Jobs are not supported
by v6.)
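On modern POSIX systems the non-blocking poll described above is spelled waitpid with the WNOHANG flag (v6 has no such flag). A minimal sketch, where reap_finished_jobs is a hypothetical name:

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: reap every child that has already exited,
 * without blocking. Returns the number of children reaped. */
int reap_finished_jobs(void)
{
    int reaped = 0;
    int status;

    /* waitpid returns 0 when children exist but none has exited yet,
     * and -1 when there are no children left at all. */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    return reaped;
}
```

A shell could call this just before printing each prompt, so that finished background jobs do not linger as zombies.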
A process terminates fully when (1) the process has exited (perhaps
because of kill) and (2) the parent has called wait. A process that
has exited but whose parent has not yet called wait is in the zombie state.
If a parent process terminates without waiting for all of its child
processes to terminate, the remaining child processes are reparented
to process 1 (the init process); init waits for processes to
terminate, and thus will clean them up.
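The reparenting rule can be observed from user space. The sketch below is a hypothetical demo (orphan_new_parent is not a real API): it creates a grandchild, lets the intermediate child exit without waiting, and reports which pid the orphan then sees as its parent. On v6 and traditional UNIX that pid is 1; some modern systems hand orphans to a different designated "subreaper" process, so the sketch returns the pid rather than assuming 1.

```c
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the pid that adopts an orphaned grandchild,
 * or -1 on error. */
pid_t orphan_new_parent(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        pid_t self = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Grandchild: spin until the original parent is gone. */
            while (getppid() == self)
                sched_yield();
            pid_t adopter = getppid();
            if (write(fds[1], &adopter, sizeof adopter) < 0)
                _exit(1);
            _exit(0);
        }
        _exit(0);   /* exit without waiting: the grandchild is orphaned */
    }

    close(fds[1]);
    waitpid(child, NULL, 0);    /* reap the child, a zombie until now */
    pid_t adopter = -1;
    if (read(fds[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
        adopter = -1;
    close(fds[0]);
    return adopter;
}
```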
V6 code examples
- Exec. Key challenge: set up the address space and set up the stack with
the arguments to the program. See Lions' chapter 7 for the layout of a user
address space.
- 3026: what is uchar? it is a function that copies a byte from
user space, at the address in u.u_dirp (see nami.c).
- 3034: inode for arg[0].
- 3052: this code depends on how arguments are laid out. see
icode in main.c for an example; icode is explained at the end of
Lions' chapter 6.
- 3058: copy a word from the previous mode; the previous address
space must be user address space, since we came to exec through a trap
instruction.
- 3085: read the first 020 bytes of the file named by argument 0 of exec
into the area that starts with u_arg[0]; this code won't get you an A in
6.170! (argument 0 of exec has already been translated, and ip is the
inode for argument 0. furthermore, all arguments to exec have been copied
into an internal buffer.) the 020 bytes should be the a.out header.
- 3089: u.u_base is a kernel address.
- 3091: u.u_base is now interpreted as a user address.
- 3095: on PDP-11/40, the executable could be 407 (text, data, and
stack back-to-back in main memory) or 410 (text is separated from
data+stack in physical memory). u-area is right before text in 407
and right before data in 410.
- 3129: contract the current address space to just the u area; gets rid of
the program that was loaded in this address space.
- 3130: xalloc reads in the text segment, if it is not in memory.
- 3132: grow address space to have enough space to contain new program.
- 3140: skip 020 bytes, the a.out header.
- 3155: set the user stack pointer. when returning from the system
call, the kernel will copy this value into the user stack pointer
register.
- 3155: ap holds a negative value, -v (see 3154). is a negative value
loaded in the user stack pointer register? (Answer: no. the register holds
an unsigned 16-bit address, thus the value is 2^16 - v, pointing exactly in
the stack at the top of the address space; e.g., for v = 20 the stack
pointer is 65516.)
- 3161: where are we copying the content of the buffer? (answer: to
the user stack.) at a negative address in the previous address space?
(Answer: no, exactly in the right place in the stack.)
- 3188: set the saved pc, to which rtt will return, to 0. thus, when
returning to user space, the processor will start executing at address 0
in the user address space.
- Why are there 3 calls to estabur? Answer: call 1 just checks
whether there is space; call 2 ensures that u_base = 0 points to the
beginning of the data segment to make the readi call work correctly;
call 3 sets up the address space.
- What is the content of the prototype segment address registers
after the third call to estabur? Let's assume ts = 180 blocks (a block
is 64 bytes), data size is 370 blocks, ns = SSIZE = 20 blocks, and the
a.out is a 410 executable.
segment  ISA            ISD
0        0              w=0,ed=0,len=127
1        128            w=0,ed=0,len=51
2        16             w=1,ed=0,len=127   // skip uarea
3        144            w=1,ed=0,len=127
4        272            w=1,ed=0,len=113
5        0              w=0,ed=0,len=0
6        0              w=0,ed=0,len=0
7        278 (406-128)  w=1,ed=1,len=108 (128-SSIZE)
sureg adds the offset for where the text and data segment are
stored in physical memory. for text the offset is
u.u_procp->p_textp->x_caddr. for data the offset is u.u_procp->p_addr
(the address where u-area is).
- why is sureg separate from estabur? when we swap in a program
again, we have to call only sureg, because the program might be
swapped into a different location in physical memory.
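The prototype register table above can be reproduced with a small calculation. The following is a simplified model written for these notes, not the v6 estabur source: it handles only the non-separated case, stores the fields in a plain struct instead of packed descriptor bits, and takes USIZE = 16 blocks and 128 blocks per segment from the table. The names proto_segments and struct seg are hypothetical.

```c
#include <string.h>

#define NSEG   8      /* segmentation registers per address space */
#define SEGBLK 128    /* 64-byte blocks per 8 KB segment */
#define USIZE  16     /* blocks occupied by the u-area (assumed from the table) */

struct seg {
    int addr;   /* prototype address in blocks, before sureg relocates it */
    int len;    /* length field as shown in the table */
    int w;      /* 1 = writable */
    int ed;     /* 1 = expands downward (stack) */
};

/* Fill seg[0..7] from text, data, and stack sizes given in 64-byte blocks.
 * Returns 0 on success, -1 if more than 8 segments would be needed. */
int proto_segments(int nt, int nd, int ns, struct seg seg[NSEG])
{
    int need = (nt + SEGBLK - 1) / SEGBLK
             + (nd + SEGBLK - 1) / SEGBLK
             + (ns + SEGBLK - 1) / SEGBLK;
    if (need > NSEG)
        return -1;

    memset(seg, 0, NSEG * sizeof seg[0]);
    int i = 0;
    int a = 0;

    /* Text: read-only, addresses relative to the start of the text area. */
    for (; nt > 0; nt -= SEGBLK, a += SEGBLK, i++) {
        int n = nt < SEGBLK ? nt : SEGBLK;
        seg[i] = (struct seg){ a, n - 1, 0, 0 };
    }

    /* Data: writable, starting just past the u-area. */
    a = USIZE;
    for (; nd > 0; nd -= SEGBLK, i++) {
        int n = nd < SEGBLK ? nd : SEGBLK;
        seg[i] = (struct seg){ a, n - 1, 1, 0 };
        a += n;
    }

    /* Stack: filled from segment 7 downward; a partial segment grows down. */
    a += ns;
    int j = NSEG;
    for (; ns >= SEGBLK; ns -= SEGBLK) {
        a -= SEGBLK;
        seg[--j] = (struct seg){ a, SEGBLK - 1, 1, 0 };
    }
    if (ns > 0)
        seg[--j] = (struct seg){ a - SEGBLK, SEGBLK - ns, 1, 1 };
    return 0;
}
```

Calling proto_segments(180, 370, 20, s) fills s with the rows of the table; the stack entry comes out as address 406 - 128 = 278 with length field 128 - 20 = 108.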
- Fork. Duplicate address space, which is done by newproc()!
- newproc() returns 0 to parent; and 1 to child.
- fork returns child pid to parent. fork returns parent pid to
child; the user space library changes this to a zero (to make fork
conform to the specs of fork).
- Wait.
- 3280: look for a zombie child, and clean it up.
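The zombie search in wait can be modeled as a scan over a process table. This is a simplified sketch under assumed names (struct proc here has only four made-up fields, and wait_scan is hypothetical); the real kernel sleeps instead of returning when children exist but none has exited.

```c
#include <stddef.h>

enum pstat { UNUSED = 0, RUNNING, ZOMBIE };   /* simplified states */

struct proc {
    enum pstat stat;
    int pid;
    int ppid;      /* parent's pid */
    int xstat;     /* exit status saved for the parent */
};

/* Scan the table for a zombie child of `parent`. If one is found, store its
 * exit status in *status, free its slot, and return its pid.
 * Returns 0 if children exist but none is a zombie yet,
 * and -1 if `parent` has no children at all. */
int wait_scan(struct proc *table, size_t n, int parent, int *status)
{
    int have_child = 0;
    for (size_t i = 0; i < n; i++) {
        struct proc *p = &table[i];
        if (p->stat == UNUSED || p->ppid != parent)
            continue;
        have_child = 1;
        if (p->stat == ZOMBIE) {
            int pid = p->pid;
            *status = p->xstat;
            p->stat = UNUSED;      /* slot can now be reused */
            return pid;
        }
    }
    return have_child ? 0 : -1;
}
```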
- Sbreak (set break point). grow the address space by n bytes by
allocating physical memory of old size + n. copy old memory, if any,
into the new area.
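The grow-by-copy strategy can be illustrated with a user-space analogy. The kernel uses its own memory allocator and block copy routines; the sketch below substitutes malloc and memcpy, and the name grow_by_copy is hypothetical. Zeroing the newly added bytes is an assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogy of growing a process image by copying:
 * allocate old_size + n bytes, copy the old contents, release the old area.
 * Returns the new area, or NULL (old area left intact) on failure. */
void *grow_by_copy(void *old, size_t old_size, size_t n)
{
    void *new_area = malloc(old_size + n);
    if (new_area == NULL)
        return NULL;
    if (old != NULL && old_size > 0)
        memcpy(new_area, old, old_size);        /* copy old memory, if any */
    memset((char *)new_area + old_size, 0, n);  /* assumption: new bytes zeroed */
    free(old);
    return new_area;
}
```

The cost of this approach is that memory for both the old and the new image must be available at the same time, and every grow copies the whole image.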