HW2 Answers

PART 1: COMPLETE

PART 2:

Question 2.1: Why we put cached documents in directories of different level as opposed to put them all in a single cache directory? The main reason is lookup speed: it is faster to search a hierarchical structure, where each directory holds relatively few entries, than a flat structure with every cached file in one directory. Separate directories also make it possible to cache different versions of the same file, and we keep different versions because we do not want the cache to be stale.
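
As a rough illustration of spreading cached files over several directory levels, the sketch below turns a URL into a nested path. The function name cache_path, the "cache" root, and the levels and length parameters are hypothetical; they are loosely modeled on Apache's CacheDirLevels and CacheDirLength directives, and the sketch does not reproduce Apache's exact on-disk layout.

```python
import hashlib
from pathlib import PurePosixPath


def cache_path(url: str, levels: int = 2, length: int = 1) -> PurePosixPath:
    """Map a URL to a nested cache path (hypothetical layout, not Apache's)."""
    # MD5 is used here only as a convenient, evenly distributed key (assumption).
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    # The leading characters of the digest name the subdirectories.
    parts = [digest[i * length:(i + 1) * length] for i in range(levels)]
    # The rest of the digest is the file name inside the deepest directory.
    rest = digest[levels * length:]
    return PurePosixPath("cache", *parts, rest)


print(cache_path("http://example.com/image.jpg"))
```

With two levels of one hexadecimal character each, the files are spread over up to 256 leaf directories, so each directory stays small even when the cache holds many documents.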

Question 2.2: Why the cached document is saved with a file name generated by applying hash on the url? Hashing the URL produces a fixed-length identifier that is effectively unique to that URL, so the cache server can quickly tell whether a document is already in the cache. For example, a browser makes a request for image.jpg. The server hashes the URL and checks whether a cached file with that hash exists. If it does, the server can use the cached copy of image.jpg, after checking that it is still fresh, instead of fetching it again. The hash by itself does not show whether image.jpg has been modified; that check relies on headers such as Last-Modified. The characters of the hash also provide a natural way to build a directory hierarchy, which is faster to search.
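
A minimal sketch of naming a cached file by hashing its URL is shown below. The function name cache_file_name and the example URLs are hypothetical, and MD5 is an assumption chosen for illustration rather than a statement of what the cache server uses internally.

```python
import hashlib


def cache_file_name(url: str) -> str:
    """Return a fixed-length, file-system-safe name for a URL (MD5 is an assumption)."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


short_name = cache_file_name("http://example.com/image.jpg")
long_name = cache_file_name("http://example.com/search?q=" + "a/b&c=" * 50)

# Both names are 32 hexadecimal characters, with no '/', '?', or '&'
# that could be invalid or ambiguous in a file name.
print(len(short_name), len(long_name))
```

However long the URL is and whatever characters it contains, the resulting name has the same length and uses only characters that are safe in a file name.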

Question 2.3: One of the HTTP headers in the cached document file is the last modified. How the cache server uses the value of this header? The server first looks at the file that the browser is trying to access. It then checks when that file was created or last modified and compares that time against the last modified date that the browser sent. If the file has not been modified since that date, the server only needs to send back a response saying that the file has not been modified (HTTP status 304 Not Modified). This is much faster and cheaper in terms of bandwidth.
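
The comparison described above can be sketched as follows. The function name conditional_status and the sample date are hypothetical, and the sketch covers only the date check, not the rest of a real server's request handling.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(last_modified: datetime, if_modified_since: str) -> int:
    """Return 304 if the file is unchanged since the client's date, else 200."""
    client_date = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= client_date:
        return 304  # Not Modified: no body needs to be sent
    return 200  # Modified: send the full document


# Hypothetical file modification time, expressed in UTC.
file_mtime = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
header = format_datetime(file_mtime, usegmt=True)  # HTTP-style date string
print(conditional_status(file_mtime, header))
```

When the date sent by the client is at or after the file's modification time, only a short 304 response is needed, which is where the bandwidth saving comes from.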

Based on your benchmark results, rank the performance of those five web servers:

From fastest to slowest, my results were Run 4 windom.txt (0.030 s), Run 3 blanca2.txt (0.041 s), Run 2 windomRP.txt (5.876 s), Run 5 www.uccs.edu.txt (15.468 s), and Run 1 blanca.txt (16.842 s).

Discuss:

* Why there is such a difference between run1 (your customized apache web server) and run2 (windom cluster)

The windom cluster took longer to make connections, but the request and reply rates were faster because the cluster had help in resolving where the page resides. In windom, we put in a configuration pointing to where the page could be quickly resolved. I also think that the blanca server is overloaded, and the windom server is much faster.

* Why there is such a difference between run1 (your customized apache web server) and run3 (blanca default apache web server)

I think the big difference is that the customized web server needs to find and return the content in /eng/eng.html, which is several directory levels deep. The default Apache web server has its content at the root, so it does not need to navigate through any directories to return the correct content. The extra time in run 1 comes from the server needing to find the requested content.

The servers can be accessed at the following URLs:

http://blanca.uccs.edu:8251

http://sanluis.uccs.edu:8251

http://shavano.uccs.edu:8251

The access logs can be found at the following URLs:

http://cs.uccs.edu/~nsundqui/apache/ws1/logs/access_log

http://cs.uccs.edu/~nsundqui/apache/ws2/logs/access_log

http://cs.uccs.edu/~nsundqui/apache/ws3/logs/access_log

http://cs.uccs.edu/~nsundqui/apache/proxy/logs/access_log

http://cs.uccs.edu/~nsundqui/apache/reverseproxy/logs/access_log