    anguish% wget -nc http://pdos.lcs.mit.edu.ezproxy.canberra.edu.au/6.824/labs/lab-4.tgz
    anguish% tar xzf lab-4.tgz
    anguish% cd lab-4
    anguish% cp ../lab-3/fs.[Ch] .
    anguish% cp ../lab-1/lock_server.[Ch] .

Edit fs.h (in the lab-4 directory) and add the following early in the file:

    #include "blockdbc.h"
    #define LOCKS 1             // ADD
    #include "lock_client.h"    // ADD

Now change the fs constructor to take a lock client argument, and add a new class member to hold it:

    class fs {
    public:
      fs(blockdbc *db, lock_client *lc);   // CHANGE
      ...
    private:
      lock_client *lc;                     // ADD

Now change fs.C:

    fs::fs(blockdbc *xdb, lock_client *xlc)   // CHANGE
    {
      db = xdb;
      this->lc = xlc;                         // ADD

At this point you should be able to run gmake and successfully compile your file server and a block server. Operations that worked in Lab 3 should still work. You have to run this new block server with your new ccfs, since now ccfs expects a lock server to be resident in the block server process.
The rest of this lab has two parts.
Your server should update a file's mtime when SETATTR changes its size, during a WRITE, and when the file is created. The server should update a directory's mtime whenever an entry is added or removed from the directory. In all cases the server should set the mtime to the current time with nfstime().
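The mtime rules above can be sketched as follows. This is only an illustration under stated assumptions: Attr, Dir, now(), set_size(), write_bytes(), add_entry(), and remove_entry() are hypothetical stand-ins, not names from the lab code. The real server works with fattr3 structures and gets the time from nfstime(). The sketch also assumes that a failed add or remove leaves the directory's mtime alone.

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>
#include <map>
#include <string>

// Hypothetical stand-in for the attributes the lab keeps in fattr3.
struct Attr {
  uint64_t size = 0;
  std::time_t mtime = 0;
};

// Hypothetical stand-in for a directory: attributes plus name -> handle map.
struct Dir {
  Attr attr;
  std::map<std::string, std::string> entries;
};

// Stand-in for nfstime(): returns the current time.
inline std::time_t now() { return std::time(nullptr); }

// SETATTR that sets the size: record the new size and touch mtime.
void set_size(Attr &a, uint64_t new_size, std::time_t t = now()) {
  a.size = new_size;
  a.mtime = t;
}

// WRITE: grow the file if the write extends past the end, and touch mtime.
void write_bytes(Attr &a, uint64_t offset, uint64_t count,
                 std::time_t t = now()) {
  if (offset + count > a.size) a.size = offset + count;
  a.mtime = t;
}

// Adding an entry touches the directory's mtime only if the add succeeds.
bool add_entry(Dir &d, const std::string &name, const std::string &fh,
               std::time_t t = now()) {
  if (!d.entries.emplace(name, fh).second) return false;  // name exists
  d.attr.mtime = t;
  return true;
}

// Removing an entry touches the directory's mtime only if it existed.
bool remove_entry(Dir &d, const std::string &name, std::time_t t = now()) {
  if (d.entries.erase(name) == 0) return false;  // no such entry
  d.attr.mtime = t;
  return true;
}
```

The time is passed as a parameter with a default so that the helpers can be exercised with fixed timestamps.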
If your server passes the tester (see below), then you are done. Please modify only fs.C and fs.h. You can make any changes you like to these files.
When you're done with Part 1, the following should work:
    suffering% mkdir /classfs/dir1/newdir
    suffering% echo hi > /classfs/dir1/newdir/newfile
    suffering% ls /classfs/dir1/newdir
    newfile
    suffering% rm /classfs/dir1/newdir/newfile
    suffering% ls /classfs/dir1/newdir
    suffering%

    suffering% ./test-lab-4-a.pl /classfs/dir1
    mkdir /classfs/dir1/d3319
    create x-0
    delete x-0
    create x-1
    checkmtime x-1
    ...
    delete x-33
    dircheck
    Passed all tests!
A convenient way to fix this "race condition" is for the file servers to use locks to force the two CREATE operations to happen one at a time. That is, a server would acquire a lock before starting the CREATE, and only release the lock after finishing the write of the new information back to the block server. If there are concurrent operations, the locks force one of the two operations to delay until the other one has completed.
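One interleaving that such a race can produce is sketched below. It assumes each server handles CREATE by reading the directory from the block server, modifying its own copy, and writing the whole copy back. DirBlock, create_unlocked(), and create_locked() are hypothetical names used only for this illustration, and the "servers" are simulated in a single function rather than as real processes.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a directory as stored in the block server.
using DirBlock = std::set<std::string>;

// No locking: both servers read the directory before either writes it back.
DirBlock create_unlocked(DirBlock stored, const std::string &a,
                         const std::string &b) {
  DirBlock copy1 = stored;  // server 1 reads
  DirBlock copy2 = stored;  // server 2 reads the same old contents
  copy1.insert(a);          // server 1 adds its entry to its copy
  copy2.insert(b);          // server 2 adds its entry to its copy
  stored = copy1;           // server 1 writes back
  stored = copy2;           // server 2 overwrites: entry a is lost
  return stored;
}

// With a lock: the second read happens only after the first write-back.
DirBlock create_locked(DirBlock stored, const std::string &a,
                       const std::string &b) {
  {  // server 1 holds the directory lock for this whole block
    DirBlock copy1 = stored;
    copy1.insert(a);
    stored = copy1;
  }
  {  // server 2 acquires the lock only after server 1 released it
    DirBlock copy2 = stored;
    copy2.insert(b);
    stored = copy2;
  }
  return stored;
}
```

In the unlocked interleaving the final directory contains only the second server's entry. In the locked version it contains both.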
You must choose what the locks refer to. At one extreme you could have a single lock for the whole file system, so that operations never proceed in parallel. At the other extreme you could lock each entry in a directory, or each field in the attributes structure. Neither of these is a good idea! A single global lock prevents concurrency that would have been OK, for example CREATEs in different directories. Fine-grained locks have high overhead and make deadlock likely, since you often need to hold more than one fine-grained lock.
Your best bet is to associate one lock with each file handle. Use the file handle as the name of the lock (i.e. pass the file handle to acquire and release). The convention should be that any NFS operation should acquire the lock on the file or directory it uses, perform the operation, finish updating the block server (if the operation has side-effects), and then release the lock on the file handle. One reason this is convenient is that you can add locking relatively simply by calling acquire in fs::get_fh(); thus all of your RPC implementations will acquire file handle locks without you having to modify your RPC handler code. Similarly, you can modify fs::put_fh() to release the file handle lock. You'll still need to add explicit releases for RPC handlers such as GETATTR that don't write the block store, and for error cases.
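The acquire-in-get_fh, release-in-put_fh convention might look roughly like the sketch below. Everything here is a hypothetical in-process mock: MockLockClient and MockFs are not the lab's lock_client or fs classes, std::function stands in for the library's callbacks, and file handles are plain strings. The real signatures of fs::get_fh() and fs::put_fh() come from your Lab 3 code and may differ.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for lock_client: one lock per name, callback on grant.
class MockLockClient {
 public:
  void acquire(const std::string &name, std::function<void()> cb) {
    Lock &l = locks_[name];
    if (l.held) {  // someone holds it: wait in line
      l.waiters.push_back(std::move(cb));
      return;
    }
    l.held = true;
    cb();
  }
  void release(const std::string &name) {
    Lock &l = locks_[name];
    if (l.waiters.empty()) {
      l.held = false;
      return;
    }
    std::function<void()> next = std::move(l.waiters.front());
    l.waiters.pop_front();
    next();  // the lock passes directly to the next waiter
  }

 private:
  struct Lock {
    bool held = false;
    std::deque<std::function<void()>> waiters;
  };
  std::map<std::string, Lock> locks_;
};

// Hypothetical stand-in for fs: the lock name is the file handle itself.
class MockFs {
 public:
  explicit MockFs(MockLockClient *lc) : lc_(lc) {}

  // Acquire the handle's lock, then fetch the data and pass it to cb.
  void get_fh(const std::string &fh, std::function<void(std::string)> cb) {
    lc_->acquire(fh, [this, fh, cb]() { cb(store_[fh]); });
  }
  // Write the data back, then release the handle's lock.
  void put_fh(const std::string &fh, const std::string &data) {
    store_[fh] = data;
    lc_->release(fh);
  }
  // Explicit release for read-only handlers and error paths.
  void release(const std::string &fh) { lc_->release(fh); }

 private:
  MockLockClient *lc_;
  std::map<std::string, std::string> store_;
};
```

With this shape, a second get_fh() on the same handle does not run its callback until the first holder calls put_fh() or release(), so the second operation always sees the first one's write.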
You'll use your lock server from Lab 1. lab-4.tgz includes a blockdbd.C that incorporates your lock server code, so that you don't have to start a separate server. blockdbd looks at the RPC program number to figure out whether it's a locking RPC or a block RPC. We've also included a new ccfs.C that instantiates a lock_client and connects it to your lock server. You can call acquire() and release() from fs.C as in this example:
    void
    fs::acquire(nfs_fh3 fh, callback<void,void>::ref cb)
    {
      lc->acquire(fh2s(fh), cb);
    }

    void
    fs::release(nfs_fh3 fh)
    {
      lc->release(fh2s(fh));
    }

    suffering% ./ccfs dir1 blood 5566 &
    root file handle: f7dbbbadfd082018
    suffering% ./ccfs dir2 blood 5566 f7dbbbadfd082018 &
    root file handle: f7dbbbadfd082018
    suffering%
    suffering% ./test-lab-4-b /classfs/dir1 /classfs/dir2
    Create then read: OK
    Unlink: OK
    Append: OK
    Readdir: OK
    Many sequential creates: OK
    Write 20000 bytes: OK
    Concurrent creates: OK
    Concurrent creates of the same file: OK
    Concurrent create/delete: OK
    Concurrent creates, same file, same server: OK
    test-lab-4-b: Passed all tests.

If you try this before you add locking, your server will probably fail the "Write 20000 bytes" test or the "Concurrent creates" test. The reason the "Write 20000 bytes" test might fail is that it produces three concurrent WRITE RPCs (of 8192, 8192, and 3616 bytes).
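The three sizes follow from splitting 20000 bytes into pieces of at most 8192 bytes: 8192 + 8192 + 3616 = 20000. The sketch below reproduces that arithmetic. split_write() is a hypothetical helper written for this illustration, and the 8192-byte limit is inferred from the sizes quoted above rather than taken from the client's configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split a write of `total` bytes into chunks of at most `max_chunk` bytes.
std::vector<std::size_t> split_write(std::size_t total,
                                     std::size_t max_chunk) {
  std::vector<std::size_t> chunks;
  if (max_chunk == 0) return chunks;  // avoid looping forever
  while (total > 0) {
    std::size_t n = total < max_chunk ? total : max_chunk;
    chunks.push_back(n);
    total -= n;
  }
  return chunks;
}
```

Calling split_write(20000, 8192) yields the chunks 8192, 8192, and 3616. Because all three WRITEs target the same file and can be in flight at once, each one reads and rewrites the same attributes, which is why the file handle lock matters here.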
You might want to test your solution to Part 2 with test-lab-4-a first, to make sure you didn't break anything. You might then test with test-lab-4-b, but giving it the same directory twice, to make sure you handle concurrent operations in one server before you go on to concurrent operations in two servers.
You can learn more about NFS loopback servers and asynchronous programming in the loop-back NFS paper. You can find the sources for this software at www.fs.net or in /u/6.824/sfs-0.7.2 and /u/6.824/classfs-0.0. You can see the NFS RPC definitions in /u/6.824/sfs-0.7.2/svc/nfs3_prot.x. The output of the RPC compiler is in /u/6.824/sfs-0.7.2-build/svc/nfs3_prot.h. You can learn more about the asynchronous programming library (wrap, callbacks, str, and strbuf) by reading the Libasync Tutorial.
Don't use an nfscall structure after you send a reply or error. The RPC library de-allocates the structure when it sends the reply. You may be tempted to use the arguments in the nfscall structure after replying in order to release locks: don't. Instead, make a copy of the arguments, or release the locks before replying.
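The two safe orderings can be sketched with a mock. MockCall below is a hypothetical stand-in for nfscall that frees its arguments on reply, which models the de-allocation described above. The `released` vector stands in for calls to fs::release(), and the function names are invented for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for nfscall: the arguments die when reply() runs.
struct MockCall {
  std::unique_ptr<std::string> arg_fh;
  bool replied = false;

  explicit MockCall(std::string fh)
      : arg_fh(new std::string(std::move(fh))) {}

  const std::string *getarg() const { return arg_fh.get(); }

  void reply() {
    replied = true;
    arg_fh.reset();  // arguments are gone after this point
  }
};

// Option 1: copy the file handle, reply, then release using the copy.
void finish_with_copy(MockCall &nc, std::vector<std::string> &released) {
  std::string fh = *nc.getarg();  // copy while the call is still valid
  nc.reply();
  released.push_back(fh);  // stands in for fs::release(fh)
}

// Option 2: release the lock first, then reply.
void finish_release_first(MockCall &nc, std::vector<std::string> &released) {
  released.push_back(*nc.getarg());  // stands in for fs::release(fh)
  nc.reply();
}
```

Neither function touches the call's arguments after reply() has run.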
Here's how to extract the arguments from a MKDIR RPC:
    mkdir3args *a = nc->template getarg<mkdir3args> ();
    // a->where.dir : existing directory
    // a->where.name : name of new directory
    // a->attributes.mode.set : if non-zero, set new dir's fattr3.mode to...
    // *a->attributes.mode.val : the new directory's mode

Here's how to reply to a MKDIR RPC. before_attrs and after_attrs are the attributes of the containing directory before and after the MKDIR.

    diropres3 *res = nc->template getres<diropres3> ();
    res->set_status(NFS3_OK);
    res->resok->obj.set_present(true);
    *res->resok->obj.handle = fh;
    res->resok->obj_attributes.set_present(false);
    res->resok->dir_wcc = make_wcc(before_attrs, after_attrs);
    nc->reply(nc->getvoidres());

You may need to create, or fake, a "." entry in each new directory, referring to the directory itself.
Here's how to extract the arguments from a REMOVE RPC:
    diropargs3 *a = nc->template getarg<diropargs3> ();
    // a->dir : the directory
    // a->name : the file name
Here's how to reply to a REMOVE RPC:
    wccstat3 *res = nc->template getres<wccstat3> ();
    res->set_status(NFS3_OK);
    *res->wcc = make_wcc(before_attrs, after_attrs);
    nc->reply(nc->getvoidres());

    % tar czf lab-4-handin.tgz *.[Cchx] Makefile
    % chmod og= lab-4-handin.tgz

Copy this file to ~/handin/lab-4-handin.tgz. We will use the first copy of the file that we find after the deadline.