6.824 Spring 2005 Paper Questions
Please remember to print your answer for each day's paper question on a sheet
of paper and hand it in at the beginning of class.
Thu Apr 28: Frangipani: A Scalable Distributed File System, SOSP 1997.
Suppose a server modifies an i-node, appends the
modification to its log, then another server modifies the same i-node,
then the first server crashes. The recovery system will see
the i-node modification in the crashed server's log, but should
not apply that log entry to the i-node, because that would
un-do the second server's change.
How does Frangipani avoid or cope with this situation?
Tue Apr 26: Recovery Management in QuickSilver, 1988. (You can skip Section 4.3 and Section 5.)
Section 4.1 says that the one-phase commit protocol is used by servers that
maintain only volatile state. What additional action(s) do servers
with non-volatile (persistent) state do that requires them to participate
in two-phase commit?
Thu Apr 21: FAB: building distributed enterprise disk arrays from commodity components. (Skip Sections 4.2 and 6.)
To perform reads and writes of replicated blocks, FAB uses two types
of timestamps, ordTs and valTs, as described in Section 4.1. Can FAB
still guarantee strict linearizability without using ordTs? If yes,
explain why; otherwise, describe a scenario where this consistency
model breaks if FAB only uses valTs during reads and writes. According
to strict linearizability, a read of a block must return the latest
write to the block.
Wed Apr 13: Optimistic Replication Using
Vector Time Pairs.
The paper's Section 2.1.4 says that replicas need not keep deletion
records. Suppose that replica R1 creates file F, then synchronizes
with R2, then R1 deletes its copy of the file and discards the
deletion notice, then R2 modifies its copy of the file, and then R1
and R2 synchronize again. What reasoning will R1 use to conclude that
there is a conflict?
Thu Apr 8: TreadMarks: Distributed
Shared Memory on Standard Workstations and Operating systems, USENIX 1994.
Suppose that a simplified version of Treadmarks, called Dreadmarks,
simply sent all modifications of variables between an acquire and a
release to the next processor to acquire the same lock. Deadmarks
doesn't otherwise communicate any changes to memory. Outline a
specific simple situation in which Treadmarks would provide more
useful or intuitive memory behavior than Dreadmarks.
Thu Apr 6: Ivy, PODC 1986.
The write fault handler in Section 3.1 ends by
sending a confirmation to the manager, and that the "Write server"
code on the manager waits for this confirmation. Suppose you
eliminated this confirmation (both the send and the wait)
from Ivy. Describe a scenario in
which lack of the confirmation would cause Ivy to behave incorrectly.
You can assume that the network delivers all messages, and that
none of the computers fail.
Thu Mar 31: Hypervisor-based
Fault Tolerance, SOSP 1995.
Suppose the primary fails due to a power failure, the backup takes
over, and then power to the primary is restored. You would like the
primary to resume replicated execution so that the system could
tolerate a subsequent failure of the backup. Is this likely to
be easy with hypervisor-based fault-tolerance? Why or why not?
Tue Mar 15: XOM, 2000.
Suppose that every PC had a XOM CPU, and that a vendor of expensive CAD
software wanted to use XOM for copy protection. The idea would be that
a customer would provide his or her XOM public key to the vendor,
which would encrypt the CAD program so that only a XOM CPU with the
corresponding private key could read the program. A malicious
customer might compute a new public/private key pair and send the
public key to the vendor. If the vendor sent back a copy of the CAD
program encrypted with this public key, the customer could easily
decrypt it, thus breaking the copy protection scheme. How do you
suppose the XOM designers intended that the CAD vendor prevent this
attack?
Thu Mar 10: No question on OKWS.
Thu Mar 3: Backtracking Intrusions, 2003.
What specific purpose does the BackTracker system serve? For example:
if an intruder breaks in, modifies a bunch of files and leaves unnoticed,
has this system been defeated? Why or why not?
Tue Mar 1: Manageability,
availability, and performance in Porcupine: a highly scalable,
cluster-based mail service, 1999.
Suppose a Porcupine node crashes, then a mail message M1
arrives that should be stored in a fragment on the crashed node, then
the user reads and deletes M1, then message M2 arrives that should be
stored in the fragment, then the node reboots. The fragment is
replicated (which is why the read and delete work). Ideally the node
would end up with a copy of M2 but not M1 after the reboot. What
specific Porcupine mechanism(s) ensure that the node gets a copy of
M2? And what mechanism(s) ensure that it does not end up with a copy
of M1?
Thu Feb 26: A Coherent Distributed File
Cache with Directory Write-Behind. Read just Sections 1, 2, 3, and
4. Echo's guarantee B on page 6 says that writes to a given object
become stable in the same order that they are performed. Does FSD
provide ordering semantics at least as strict as this rule? That is,
in every situation in which Echo enforces an update order because of
this rule, would FSD also write the updates to disk in the same order?
If yes, briefly say why; if not, give a counter-example.
Thu Feb 17: Reimplementing the Cedar File System Using Logging
and Group Commit, 1987.
At the end of Section 4, the paper says that during a one-byte
file create FSD writes the leader+data page synchronously to the disk,
but records the update to the file name table in memory and only
writes it back to disk later. Why do you suppose the FSD designers
decided to write the data page synchronously? What (if anything) might
go wrong if FSD instead wrote the file's data in the in-memory disk
cache, and only wrote it to disk later?
Tue Feb 15: Flash: An efficient and portable Web server, 1999.
How would you expect the relative performance of MP, MT and AMPED servers improve if the Flash experiments were run on a multiprocessor machine? (with no modifications to the code)
Thu Feb 10: The Interaction of Architecture
and Operating System Design, 1991. We're reading this paper in order to dig
into some details of important O/S abstractions, and to talk about how
they affect performance.
You can skip Sections 3 and 4, but do read Sections 5 and 6.
The Question: Table 1 shows that the MIPS
R2000 runs applications 4.2 times faster than the CVAX, and that the
newer R3000 runs applications 6.7 times faster than the CVAX. What do
you suppose is the main reason why the R3000 is faster than the R2000?
Tue Feb 8: A toolkit for user-level file systems
, 2001.
Refer to the description of the automounter in Section 4.3.
Suppose a user level file system has asked the automounter to mount it
on /sfs/newfs. The paper says that when an application first accesses
/sfs/newfs, the automounter returns a sequence of two symbolic links.
The automounter delays the response to the second READLINK, and eventually
returns "/sfs/newfs".
What would go wrong if the automounter returned only one symbolic link,
delayed the response to the first and only READLINK, and then returned
"/sfs/newfs"?
Thu Feb 3: Design and
Implementation of the Sun Network Filesystem, 1995.
NFS file handles are intended be be file identifiers. Would it be reasonable
for an NFS server implementation to use a file's name (e.g. "/usr/rtm/webproxy1.
C")
in the file handle, instead of the inode number?
If not, give an example of a serious problem that would
result. Ignore the question of whether file handles are big enough to hold long
file names.