6.824 2002 Final Project Assignment

Due date for team list: October 3
Due date for project proposal: October 10
Project conferences: October 16/17
Due date for draft report: November 7
Due date for completed project and paper: December 5 at 23:59
Mock program committee meeting: December 10

Introduction

For the final project in 6.824 you'll form groups of three or four students, pick a system you want to build, design it, implement it, and write a paper about it. The final project is structured in three parts:

Project proposal. The proposal is a short (at most two pages) description of what your project will be. It should state what problem you are solving, why you are solving it, what software you will write, and what the expected results will be. You won't be judged on your proposal; it is there to help you get started. We'll give you written feedback about your proposal and then meet with you to discuss it.
Draft report. This should include a draft of your paper's abstract, introduction, related work, and design sections. These sections should be in good shape, close to what they would look like in the final report. You should also include a short implementation section, describing how far you've gotten implementing your software. We'd like your draft report in Postscript or PDF format; you can e-mail us either the draft or a link to it. We won't grade these drafts; we'll just e-mail you comments intended to help you write a good final report.
Project paper. Your paper should be patterned after the research papers we have read in class. It should contain a problem description and motivation, a review of related work, a description of the design of your solution, a description of your implementation, and an evaluation of how well your system solved the original problem. The paper must be ten or fewer pages in length (see below for formatting details). Your project grade will be based on the quality of your paper.

In addition, on the last day of class we will run a mock program committee meeting, in which you will evaluate each other's papers and choose the ones most likely to be accepted at a good conference.

Doing a good project is a daunting task. The most successful projects tend to be very well defined and modest in scope. We (the 6.824 staff) are very happy to be involved in all stages of your project. Please, come talk to us about your project ideas, how you should execute the project, what you should write about in your final paper, etc.

The project is to be executed in teams of 3 or 4 students. Find team-mates and send their names by e-mail to the TA. This e-mail is due soon (see the list of deadlines at the top of this page).

Suggestions for projects

You should feel free to propose any project you like, as long as it is related to operating systems or distributed systems and has a substantial system-building and evaluation component.

If you are in the PhD program, we expect your proposal to involve some new idea; that is, it should be a research project.

We suggest that you base your implementation on the asynchronous programming library you used for some of the labs. In past years students have found sfsusrv and their web proxies to be particularly useful starting points for projects.

If you're having trouble thinking of a project idea, some of the ideas below might help get you started. You might also want to look at projects from previous years: 2000 and 2001. You could also look at what's hot in the on-line proceedings of recent SOSP, OSDI and Usenix conferences.

Make a distributed shared memory (DSM) system, so that processes running on different machines can share an address space. The Ivy DSM and Treadmarks papers describe two existing DSM systems. Like those systems, you would need a plan to allow caching but maintain consistency. You would also want to find at least one program that could take good advantage of DSM, to help you evaluate your system.
Design and implement a disk scheduler that enforces priority. The point would be to give high priority to disk reads that interactive processes are waiting for. Lower priority would be given to reads by non-interactive programs, background page-outs, read-ahead, delayed writes, etc. This might make your Emacs and X Windows faster at the expense of background compilation. You would need to demonstrate that the scheduler actually improved some aspect of system performance. The danger is that there is probably a tradeoff between enforcing priority and scheduling the disk efficiently.
Build a service that maintains consistent replicated data. You could build a general-purpose service (like DDS) or an application that replicates in a way tailored to that application's needs (like the Porcupine mailbox service).
Build a distributed spam filter, perhaps using a distributed hash table (DHT) such as CAN, Chord, or Pastry. You might use the DHT to store and share condensed descriptions of known spam e-mail messages, or of other information closely tied to known spammers (perhaps e-mail addresses or words or URLs that occur in spam). We can help you find large volumes of known spam (and known non-spam) e-mail to help you evaluate your system.
Implement a system like Network Objects in C++.
Perhaps all computers will soon have built-in secure computing hardware such as XOM. Such hardware can certainly be used to restrict what computers can do, for example by enforcing copy protection. It's also possible that secure execution hardware could be used to make computers more useful; for example, it might allow secure execution of Java applets or Web browser plug-ins or SETI@Home software, or store your passwords or RSA private keys or credit card numbers securely, or help players of multi-user network games convince each other they are not cheating, or let you walk up to anyone's computer and use it (and trust it) as if it were your own. Design an application in this space and implement it as realistically as you can. Depending on your ambition you may have to simulate the required hardware and operating system support.
Build a more full-featured version of your Semantic File System lab. You needn't preserve any of the specifics of the original semantic file system proposal, just the spirit.
Design and build a proxy that allows access via SFS to resources other than files on the server's disk. For example, build an SFS front end to a database. This would be a useful tool for making Athena resources such as Hesiod and Moira accessible with a file-system interface. Access to FTP servers via SFS may also be an interesting project. In all cases the challenge is to figure out how to provide a sensible interface to objects that don't act like standard UNIX files. You may be able to learn from the Plan 9 9P protocol.
Design and build an on-disk file system representation consisting of just a B-tree. You probably want to modify sfsusrv to make calls to a B-tree package such as Berkeley DB rather than (as currently) to the UNIX file system. Your challenges are to figure out (1) how to make the NFS operations efficient using the B-tree and (2) how to make crash recovery work well. You can view this as an elegant simplification of the SGI XFS file system.
Improve NFS performance by adding support for batched commit of arbitrary operations. This might let the client cache a sequence of operations (such as creates, renames, and writes), and then commit them to the server all at once. The server could then write the whole batch to disk at once. This arrangement would be particularly attractive if the server's file system used a log, like SGI XFS or Hagmann's Cedar file system, or performed checkpointing like Netapp's WAFL. You might need to provide a way for applications to indicate the start and end of a batch of operations. The challenge here is to achieve higher performance while retaining reasonable behavior after failures.
Design a disk layout for a file system and implement it in an SFS server. Make sure your layout and update algorithms have good crash recovery properties. You may want to look at this somewhat antique 6.033 lab assignment; the goal is the same though the SFS tools are now different.
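
As a rough illustration of the priority disk scheduler suggestion above, and of the tradeoff it warns about, here is a minimal Python sketch. It is not part of the assignment or of any lab code; the class name, the two priority classes, and the use of block-number distance as a stand-in for seek cost are all assumptions made for illustration.

```python
# Hypothetical priority classes (not from the assignment): lower value = more urgent.
INTERACTIVE, BACKGROUND = 0, 1


class PriorityDiskScheduler:
    """Toy scheduler: strict priority between classes,
    shortest-seek-first within a class."""

    def __init__(self, head=0):
        self.head = head  # current head position, as a block number
        self.queues = {INTERACTIVE: [], BACKGROUND: []}

    def submit(self, block, priority):
        """Queue a request for the given block at the given priority."""
        self.queues[priority].append(block)

    def next_request(self):
        """Return the next block to service, or None if nothing is pending."""
        for priority in sorted(self.queues):
            pending = self.queues[priority]
            if pending:
                # Within a class, pick the block closest to the head.
                block = min(pending, key=lambda b: abs(b - self.head))
                pending.remove(block)
                self.head = block
                return block
        return None


def drain(scheduler):
    """Service every pending request and return the order of blocks visited."""
    order = []
    while (block := scheduler.next_request()) is not None:
        order.append(block)
    return order


def total_seek(order, start=0):
    """Total head movement (in blocks) to visit `order` starting from `start`."""
    distance, position = 0, start
    for block in order:
        distance += abs(block - position)
        position = block
    return distance


if __name__ == "__main__":
    s = PriorityDiskScheduler(head=0)
    for b in (10, 20):
        s.submit(b, BACKGROUND)
    s.submit(900, INTERACTIVE)
    s.submit(30, BACKGROUND)
    order = drain(s)
    print(order, total_seek(order))
```

In this toy run the interactive read at block 900 is served first, giving the order 900, 30, 20, 10 and 1790 blocks of head movement, versus 900 blocks for visiting the same requests in ascending order. A real project would need to measure this cost on an actual disk and workload.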

What to Hand In

Check the top of this page for due dates.

Team list: e-mail your team list (three or four members) to 6.824-staff@pdos.lcs.mit.edu.

Proposal: e-mail your proposal to 6.824-staff@pdos.lcs.mit.edu. The proposal should be no more than two pages. It should be ordinary ASCII text, not an attachment or word processor file.

Draft report: hand in your draft report by putting a PostScript file in ~/handin/final/draft.ps.

Final report: hand in your final report by putting a PostScript file in ~/handin/final/paper.ps, and a tar file containing your project source code in ~/handin/final/source.tar. Please also put an anonymized copy of your report (without your names on it) in ~/handin/final/paper-anon.ps; the class will use this for blind review in the mock program committee meeting. Your project grade will be based on the paper, not on the source.

Make sure you save enough time to write a good paper, since that's what will determine your grade!

Suggestions on Writing Style

Your paper should be as long as is necessary to explain the problem, your solution, the reasons for your choices, and your analysis of your solution. It should be no longer than that. Your paper must not exceed ten 11-point, single-spaced pages in length. Please use 1-inch margins. In general, your paper's style and arrangement should be similar to the papers we've read in class.

A good paper begins with an abstract. The abstract should summarize what a reader will learn by reading the paper. It should not be an outline of the organization of the paper. It should describe the problem to be addressed, the essential points of your solution, and any conclusions you have drawn. It should be about 150 words long.

The body of your paper should expand the points made in the abstract. Here you should:

Introduce the problem and the externally imposed constraints, and explain why the problem is worth solving.
State the goals of your solution clearly.
Describe the design of your solution. You may wish to divide the description into a high level architecture and a set of lower-level implementation decisions. This would be a good place for pictures and diagrams.
Analyze how well the system you built fulfills your goals. Depending on your system, the analysis might deal with performance in the sense of throughput or running time; but keep in mind that for some systems, factors such as reliability, functionality, and usability may be as important as performance, or more so.
Briefly review related work in the area of your project. The goal is to show either how you extended existing work or how you improved on it.
Conclude with a review of lessons learned from your work.
Cite your sources as you mention them in the text of your paper, and list all references at the end of the paper; the format and style should be similar to the technical papers we read in class. When in doubt, cite the source; use "personal communication" citations if you have to (e.g. for ideas given to you by fellow students).

Write for an audience that understands basic O/S and network concepts and has a fair amount of experience applying them in various situations, but has not thought carefully about the particular problem you are dealing with.

How will we evaluate your paper?

When evaluating your paper, we will look at both content and writing.

Some content considerations:

Do you provide motivation for why the problem you chose is worthwhile or interesting?
Does your solution address the goals you stated?
Do you explain your decisions and the trade-offs?
How complex is your solution? Simple is better, yet sometimes simple won't do the job. But unnecessary complexity is bad.
Does your solution fit well with the rest of the system? If your solution requires modifying every piece of hardware, software, and data in sight, it won't be credible, unless you can come up with a very good story why everything needs to be changed.
Is your analysis clear?

Some writing considerations:

Is the report easy to understand?
Is it well organized and coherent?
Does it use diagrams where appropriate?
Is there a good abstract and bibliography?

You can find other helpful suggestions on writing this kind of report in the M.I.T. Writing Program's on-line guide to writing Design and Feasibility Reports. You may also want to look at the Mayfield Handbook's explanation of IEEE documentation style. A very good book on writing style is: "The Elements of Style," by William Strunk Jr. and E. B. White, Third Ed., Macmillan Publishing Co., New York, NY, 1979.