Select one of two possible short reports
For your second (and final) assignment in this class, I would like you to design an experiment on one of two subjects and to submit a written report describing your experiment and your conclusions. The two subjects are given below; choose one of them and make it the subject of your report. Each subject is presented in the form of a speculative question. Your goal in this assignment is to think about the question posed and describe how you would answer it.
You may actually conduct your proposed experiment (which would typically involve writing and running some test code), in which case your report may be very brief and to the point, as long as you include your code and results and both are clearly documented. Alternatively, you may present an argument and offer a more speculative answer (e.g., in the form of "if [X] conditions are true, then I would expect behavior/results [Y], and my reasoning for this is [Z]"). If you opt for this more narrative and anecdotal approach, your report must describe in more detail how you would go about testing your hypothesis (verifying that the statements you are making are true), and so the report would be lengthier.
A good report should not need to be longer than two to four pages. Once again, I would like to remind everyone that all work must be your own, and any external sources you use to support your conclusions, or to build your code upon, must be cited as such in your report. Please submit your report as a PDF or plain-text file.
Subject 1. Page Replacement Algorithms and Block Caching
Page replacement algorithms select which page (a fixed-size region of memory) to remove from RAM, usually to make room for a page that has been requested and is either new or currently stored on disk. Block caches, meanwhile, hold blocks (fixed-size portions of storage) that have been brought into RAM from disk, and the big decision in managing a block cache is which block to replace when a new one is requested. In this respect, page replacement and block cache management are very similar endeavors. However, in block caching we typically have a lot more time to make our decisions, and so LRU is feasible, unlike in virtual memory page replacement, where exact LRU is too expensive to implement. The question I would like you to answer is this: Should you use LRU for a block cache that receives its requests from another LRU cache? If yes, why? If no, why not, and what else should you use?
The arrangement looks as follows:
BLOCK REQUESTS -> [ CACHE 1 ] -> CACHE 1 MISSES -> [ CACHE 2 ] -> CACHE 2 MISSES
The original requests are presented to "cache 1." If a block is not found in "cache 1," we have a cache miss, and the request is forwarded to "cache 2." If the block is not found in "cache 2" either, that is another cache miss, and the block is retrieved from the remote device. When a miss occurs in a cache, a block already in that cache is selected and replaced by the requested block. As far as "cache 2" is concerned, the list of blocks requested is the list of all the cache misses from "cache 1." You should assume that there is no communication between cache 1 and cache 2. The question you are addressing is whether it is useful to use LRU in cache 2. Make whatever assumptions you feel are reasonable regarding the caching algorithm used in cache 1.
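If you choose to run an experiment, the arrangement above could be simulated along the following lines. This is only a minimal sketch under stated assumptions, not a required or complete design: the names `LRUCache` and `simulate` are hypothetical, both caches use LRU, every cache 1 miss is forwarded to cache 2, and the request trace and cache sizes in the example are arbitrary choices of mine.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity LRU cache of block ids (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block comes first
        self.hits = 0
        self.misses = 0

    def access(self, block):
        """Return True on a hit. On a miss, insert the block, evicting the LRU block if full."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = None
        return False

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def simulate(requests, size1, size2):
    """Feed requests to cache 1; forward only cache 1 misses to cache 2."""
    cache1, cache2 = LRUCache(size1), LRUCache(size2)
    for block in requests:
        if not cache1.access(block):
            cache2.access(block)  # cache 2 sees nothing but cache 1's misses
    return cache1, cache2


if __name__ == "__main__":
    # Assumed workload: a skewed trace in which low-numbered blocks are more popular.
    rng = random.Random(0)
    trace = [int(1000 * rng.random() ** 2) for _ in range(50_000)]
    c1, c2 = simulate(trace, size1=50, size2=200)
    print("cache 1 hit ratio:", round(c1.hit_ratio(), 3))
    print("cache 2 hit ratio:", round(c2.hit_ratio(), 3))
```

To compare policies, you could swap the eviction rule used for cache 2 (for example, evicting a random block instead) and rerun the same trace, varying the two cache sizes and the workload.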
Subject 2. Storing Blocks and Remembering Where You Stored Them
The second subject is related to file systems and allocation policies. A big part of designing a file system is the allocation policy and the addressing scheme you construct for finding the blocks that form a file. For the final exam, you are expected to understand the workings of contiguous, linked, allocation-table, and index-node allocation schemes. If you elect to write about this subject, I would ask that you address the following question (regarding where to place a new block's worth of data in a storage system composed of block-sized locations): Is it better to assign a written data block to a location based on a uniform random distribution, or a random distribution with a heavy skew?
In this hypothetical, you are faced with a data storage device that is the size of the planet, and you are being given data to store for later retrieval. You are not told the performance goal (so that is an assumption, or range of assumptions, you will need to make), but you are told that the application requesting data blocks will do its best to make your allocation policy seem bad. In other words, the only thing you know about the future reader of your stored data blocks is that it is an adversary to your allocation policy. It has therefore been suggested that you place data randomly, making it difficult for this future adversary to select a pattern of reads that results in poor performance. So once again the question is: should the random selection of location be performed using a uniform distribution, or a skewed distribution? The answer may seem obvious, or impossible to determine, but either way please remember the following as you write your report: the performance goals of the system have not been specified.
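If you would rather experiment than speculate on this subject, one possible starting point is to sample locations from both kinds of distribution and compare them under a metric of your choosing. The sketch below is only an illustration under my own assumptions: the function names are hypothetical, the skew is produced by raising a uniform draw to a power `alpha` (one of many possible skewed distributions), and the two metrics shown (collisions and the load on the busiest location) are examples, since the performance goal is yours to assume.

```python
import random
from collections import Counter


def place_uniform(rng, num_locations):
    """Pick a location uniformly at random."""
    return rng.randrange(num_locations)


def place_skewed(rng, num_locations, alpha=3.0):
    """Pick a location with a heavy skew toward low-numbered locations.

    Assumption: raising a uniform draw in [0, 1) to the power alpha > 1
    concentrates probability near 0. Other skewed distributions would work too.
    """
    location = int(num_locations * rng.random() ** alpha)
    return min(location, num_locations - 1)  # guard against float rounding


def sample_locations(place, num_blocks, num_locations, seed=0):
    """Choose a location for each of num_blocks written blocks."""
    rng = random.Random(seed)
    return [place(rng, num_locations) for _ in range(num_blocks)]


def collisions(locations):
    """Number of writes that landed on a location already chosen earlier."""
    return len(locations) - len(set(locations))


def max_load(locations):
    """Number of writes that chose the single most popular location."""
    return max(Counter(locations).values())


if __name__ == "__main__":
    for name, place in [("uniform", place_uniform), ("skewed", place_skewed)]:
        locs = sample_locations(place, num_blocks=1000, num_locations=10_000)
        print(name, "collisions:", collisions(locs), "max load:", max_load(locs))
```

These two metrics capture only one view of the problem. A report could just as reasonably measure the distance between blocks that the adversary reads back to back, or how predictable a block's location is, depending on the performance goal you assume.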
As you write your answer to either question, you are free to experiment and speculate regarding the scope of the problem. I am looking more for your process of reasoning towards any suggested answer(s) than for the specific answer you reach. Your goal is not to pick the right answer, but to demonstrate in a short report that you have thought about the question.