The explosive growth of the World Wide Web, along
with the evolution of the HTTP, HTML, CGI, and other Internet
standards, has enabled many applications which would not have
been feasible just a few years ago. There is little doubt that
this evolution will eventually lead to support for general-purpose
distributed computing over the Web. Such computation minimally
requires access to local and remote files, private and public
data, and local and remote computation. Issues like transparency,
architectural and operating system compatibility, speed, and security
(as well as other current Internet problems) will need to be addressed
to support a robust and secure distributed computing environment,
presented in the form of a distributed file system.
In recent years, this problem has received considerable
attention in the academic community, resulting in several
research projects, most notably the Andrew File System (AFS)
at CMU and the Web File System (WebFS) at Berkeley. Without diminishing
the practical significance of these projects, I would like
to concentrate on the commercial implementations of distributed
file systems, specifically, Sun's WebNFS, TransArc's AFS, and
Microsoft's CIFS. Currently, these are the most important players
on the market, each striving to establish their system as a standard
for file management across the Web.
In this paper I will attempt to analyze particular
approaches adopted by each system, their strengths and weaknesses,
and point out some differences and similarities in the methods
they adopt. This task is complicated by the fact that these file
systems differ significantly not just in their implementation,
but also in their functionality and even purposes. While AFS accentuates
security and "extreme" transparency at the cost of losing
its UNIX compatibility, WebNFS is a mere extension of UNIX-based
NFS 3.0, and CIFS simply builds upon native Windows file-sharing
protocols. Nonetheless, there exists a common denominator to all
these systems, which is the ability to carry out Internet-wide
file transactions and file sharing. In the next few sections,
I will summarize the most important aspects of each of the file systems
and compare them based on their effectiveness, speed, security,
compatibility with other common Internet protocols, and other
relevant issues.
Origins:
WebNFS is an enhanced version of SunSoft's NFS (Network
File System) 3.0 protocol. Taking NFS as a basis, SunSoft extended
it to be usable over both intranets and the Internet, allowing
users to read from and write to files over the Web instead of merely
viewing them through a browser.
There are currently about 10 to 12 million nodes
connected to various NFS servers throughout the world. WebNFS
allows access to the information on any one of these servers.
Instead of having to use a Web browser to retrieve files,
switch tasks, and cut and paste the data into applications, WebNFS
allows direct and transparent access to Web data from within
the applications themselves, consistent with the way applications
now access local disks.
NFS was originally designed at Sun Microsystems as
a file access protocol for local area networks. The basic idea
behind NFS is that clients can "mount" a server filesystem
to appear as if on the local hard drive. Instead of manually transferring
entire files across the network, the user can rely on NFS to move
the data as needed (originally in 8K blocks, now as negotiated
between the client and the server), and take advantage of local
caching.
Initially, NFS used the UDP transport protocol, which performs well on local area networks (LANs), but is terrible on low-bandwidth, high-latency WANs like the Internet. TCP, on the other hand, is well suited for WANs, and its recent implementations perform reasonably well on LANs too. Thus, WebNFS has adopted TCP as its transport, making it more suitable for widely distributed data transactions. UDP is still supported and can be used for local network transactions.
Another limitation of the earlier versions of NFS
was the restriction of file offsets to 32 bit quantities (thus
confining the maximum file size to 4.2 GB). WebNFS extends file
offsets and a number of other parameters to 64 bits, which (temporarily?)
eliminates the problem.
Firewalls:
For a long time, UDP-based NFS could not get through most
firewalls, since UDP datagrams are considered insecure due to their
vulnerability to replay attacks. The use of TCP allows WebNFS
to avoid this problem.
Traditionally, NFS implementations use port 2049
for the TCP or UDP connection. However, as an RPC-based protocol,
NFS must first obtain a port number by registering
at port 111, where the Portmap service maps a port dynamically. In
the majority of cases, NFS still gets port 2049, as it
is the port designated for NFS transactions. WebNFS avoids this
extra step by skipping Portmap whenever possible and going directly
to port 2049. This improvement, however, is relative, since it
poses some potential danger by eliminating an extra security
step.
Optimizing file access:
The MOUNT protocol is used to generate filehandles that
uniquely identify files and directories on the server. Originally,
all filehandles were fixed 32-byte sequences. WebNFS uses
variable-length filehandles - between 0 and 64 bytes - which allows
for more flexibility, yet saves memory and reduces network traffic
in most cases.
Another WebNFS addition was the introduction of public
filehandles and Multi-Component Lookup. A public filehandle
is a zero-length filehandle, usually associated with the
root directory that is open to outside NFS connections. Thus,
instead of having to mount that directory to receive
its handle, all potential clients already have the handle and
can eliminate the MOUNT step altogether. Another slowdown for
remote NFS calls has been path evaluation: first, the filesystem
had to be mounted on the server to obtain the root's filehandle;
then, for each step in the path, the client had to issue a separate
LOOKUP request. The WebNFS protocol has the server perform all the
successive lookups once it receives a handle and a path to a file
or directory, using a single Multi-Component Lookup request.
All these optimizations significantly reduce network
traffic and latency. Consider the following example: a
client needs to get a handle for file "/foo/bar/yury.file"
from the server. Originally, this would require the following
requests:
PORTMAP, MOUNT, PORTMAP, LOOKUP foo, LOOKUP bar, LOOKUP yury.file.
With WebNFS we only need to issue one request:
LOOKUP /foo/bar/yury.file
Considering high latency for some network connections,
this seems to be a very convenient feature.
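The round-trip savings described above can be sketched as follows. This is an illustrative count of client requests, not the actual NFS wire protocol:

```python
# Hypothetical sketch: counting the client requests needed to resolve
# "/foo/bar/yury.file" under classic NFS vs. WebNFS.

def classic_nfs_requests(path):
    """Classic NFS: two PORTMAP calls plus a MOUNT to get the root
    filehandle, then one LOOKUP per path component."""
    components = [c for c in path.strip("/").split("/") if c]
    setup = ["PORTMAP", "MOUNT", "PORTMAP"]
    lookups = ["LOOKUP " + c for c in components]
    return setup + lookups

def webnfs_requests(path):
    """WebNFS: the public filehandle (0x0) is known in advance, so a
    single Multi-Component Lookup resolves the whole path."""
    return ["LOOKUP " + path]

print(len(classic_nfs_requests("/foo/bar/yury.file")))  # 6 requests
print(len(webnfs_requests("/foo/bar/yury.file")))       # 1 request
```

On a high-latency link, each eliminated request saves a full round trip, which is where the practical benefit comes from.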
Another optimization adopted by WebNFS is the use of a
sliding-window approach in sending requests (similar to the batching
used by CIFS). Instead of waiting for a response to each particular
request, the WebNFS client sends several requests, based on the size
of the sliding window as negotiated with the server. This minimizes
network latency delays.
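A toy model of the windowing idea, assuming (for illustration only) that each batch of outstanding requests costs roughly one network round trip:

```python
# Illustrative sketch, not the actual WebNFS wire protocol: with a
# sliding window of size `window`, the client keeps up to `window`
# requests in flight instead of waiting for each reply in turn.

def batches(requests, window):
    """Group requests into batches of at most `window` outstanding
    requests; each batch costs roughly one round trip."""
    return [requests[i:i + window] for i in range(0, len(requests), window)]

reqs = ["READ block %d" % i for i in range(8)]
print(len(batches(reqs, 1)))  # stop-and-wait: 8 round trips
print(len(batches(reqs, 4)))  # window of 4: 2 round trips
```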
WebNFS URL:
NFS requests can be used to browse files on the World Wide Web. The syntax of NFS URL requests is similar to that of the HTTP and FTP protocols and has the following form:
nfs://server[:port #]/[/]path - corresponds to 'Lookup 0x0 "[/]path"'
The default for port number is 2049. The second (optional) slash before the path corresponds to the absolute path on the server specified. Its omission implies a relative path to the directory with the public filehandle (0x0). The following is an example of NFS URL:
nfs://www.cs.utah.edu//foo/bar
which requests a filehandle for "/foo/bar",
given the absolute (relative to the root) path.
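The URL form above can be parsed mechanically; here is a minimal sketch (the helper name and return shape are my own, not part of the WebNFS specification):

```python
# Hedged sketch of parsing the NFS URL form described above:
#   nfs://server[:port]/[/]path
# A second slash before the path marks an absolute path; otherwise the
# path is relative to the directory with the public filehandle (0x0).

def parse_nfs_url(url):
    assert url.startswith("nfs://")
    rest = url[len("nfs://"):]
    host, _, path = rest.partition("/")
    if ":" in host:
        host, _, port = host.partition(":")
        port = int(port)
    else:
        port = 2049  # the default WebNFS port
    absolute = path.startswith("/")
    return host, port, path, absolute

print(parse_nfs_url("nfs://www.cs.utah.edu//foo/bar"))
# ('www.cs.utah.edu', 2049, '/foo/bar', True)
```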
WebNFS vs. FTP:
Both protocols support similar functionality, although WebNFS seems to be more efficient for most tasks. Consider a simple task that involves transferring data from N files. The FTP protocol would require establishing N+1 TCP connections to the server (one control connection, plus a data connection for each file's GET request) and transferring N entire files, even if only a portion of their data is needed. If a TCP connection breaks while transferring file data, a new one has to be established, and the file is transferred from scratch.
WebNFS, just like FTP, also establishes a control
TCP connection once it connects to the server. However, all subsequent
file transfers go through the same connection, which eliminates
the overhead of opening and closing connections for each file.
WebNFS also keeps track of file offsets, so that only the needed
data from the files is transferred across the network. Also, if
the TCP connection is broken, once it is restarted the retransmission
resumes based on the most recent file offset received, so that
no data has to be transmitted more than once.
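The offset-based resume can be illustrated with a toy transfer model (the function and its failure simulation are invented for this sketch):

```python
# Toy sketch of offset-based resume: if the connection drops
# mid-transfer, the client re-requests data starting at the last
# offset it received, so no byte crosses the wire twice.

def transfer(data, block, fail_after=None):
    """Return ((offset, chunk) pairs, next offset); optionally 'drop'
    the connection after `fail_after` blocks."""
    sent, offset = [], 0
    while offset < len(data):
        chunk = data[offset:offset + block]
        sent.append((offset, chunk))
        offset += len(chunk)
        if fail_after is not None and len(sent) == fail_after:
            break  # simulated broken TCP connection
    return sent, offset

data = b"0123456789" * 3                                  # 30 bytes
first, resume_at = transfer(data, block=8, fail_after=2)  # drops at offset 16
second, _ = transfer(data[resume_at:], block=8)           # resume; no re-send
assert resume_at == 16
assert b"".join(c for _, c in first + second) == data
```

Under FTP, by contrast, the second call would have to start again from offset 0 and retransmit the first 16 bytes.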
WebNFS vs. HTTP 1.0:
HTTP 1.0 is a simple network protocol for transferring documents, designed
primarily for Web browsing. It is very similar to FTP and imposes
the same limitations: multiple TCP connections for control and
for each subsequent file requested. Nor does it support file offsets.
It should be noted, however, that the HTTP 1.1 protocol has tackled
most of these issues.
HTTP offers support for MIME headers for the browser
to distinguish document format. WebNFS does not support MIME tags,
only raw binary data.
WebNFS seems to perform better than HTTP, especially
under heavy loads. A graph in the WebNFS white paper shows the effect of increasing
client load (measured in Web operations per second) on the server's
response time: the HTTP server's
response time hits its limit at about 200 WebOps/sec, while the WebNFS
server performs reasonably even under a 600 WebOps/sec load.
(Source: WebNFS white paper; for more information, please see the
online references.)
Security:
The basis of WebNFS security is the Remote Procedure Call (RPC) layer, which supports several authentication flavors, ranging from simple UNIX-style credentials to DES- and Kerberos-based schemes.
File and directory access is controlled using
UNIX permission bits.
Proxy and Caching:
NFS clients usually cache data in memory, and disk caching is also
quite common for some applications. As yet, WebNFS servers do
not have any built-in proxy or proxy/cache mechanisms - definitely
some room for improvement.
Summary:
WebNFS may turn out to be a very helpful solution for many Internet
and intranet based applications, whenever data has to be shared
across distant servers. It offers a relatively fast, convenient,
and somewhat transparent way of data communication and may significantly
facilitate the way many problems are approached and solved in
the modern computer world. Moreover, it is fully compatible with
earlier versions of NFS servers, as are its usage and
maintenance, and it offers a number of advantages over
other common Internet protocols.
However, there is still a number of actual and potential
problems associated with it. For example, it does not support
server-side proxying or caching, and data is recognized in raw binary form
only. Perhaps the biggest problem of WebNFS is its flawed security,
aggravated even further by the elimination of a number of steps for
efficiency reasons, such as skipping PORTMAP or doing everything
through one persistent TCP connection. Although WebNFS supports
higher-level identification and encryption protocols, it usually
defaults to lower-level ones.
Origins:
AFS is a distributed UNIX-like filesystem for both local area and wide area networks. Originally, it was developed at CMU as part of the Andrew project. Later, it was sold to, and is currently marketed by, Transarc Corp. of Pittsburgh, PA. Although the system has significantly evolved since its days at CMU, Transarc decided to retain its name and global root (/afs).
As a distributed file system, AFS enables cooperating
hosts (both clients and servers) to share filesystem resources across
the network. Currently it is installed on and used by approximately
1500 nodes, four orders of magnitude fewer than
NFS. There are fewer than 100 AFS cells (assigned according to
Internet domain names), mostly corresponding to universities and
research laboratories.
Global Structure and Naming Convention:
The root of the global AFS tree is /afs. AFS supports
global naming convention, which means that any file on
any server appears the same from any other AFS location. In fact,
the requesting client or server does not even need to know which specific
server the file belongs to. Example:
/afs/yurycell/yurypath/yuryfile
Moving entire filesystems within the same cell
is easier than ever before - there is no need to update the "/etc/filesystems"
file on each client that uses them, because all filesystems mounted
in the same cell are transparent.
If the string `@sys' appears in a file name to be
used in AFS, it is automatically replaced with AFS's concept of
the type of the machine that the file name is being expanded on.
For example, on a Sun 4 running SunOS 4.1.3, the pathname "/afs/cs.utah.edu/@sys/bin/"
becomes "/afs/cs.utah.edu/sun4c_413/bin/".
This feature may be used in pathnames or symbolic
links to allow them to be machine independent by choosing the
correct path according to the machine's architecture.
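A minimal sketch of the @sys expansion described above; the table mapping machine types to sysname strings is assumed for illustration:

```python
# Illustrative model of AFS @sys expansion. The sysname table below is
# an assumption for this sketch, seeded with the example from the text.

SYSNAME = {("sun4", "SunOS 4.1.3"): "sun4c_413"}  # hypothetical lookup table

def expand_at_sys(path, machine, os_release):
    """Replace '@sys' with the client machine's AFS sysname, making the
    same symbolic link resolve differently per architecture."""
    return path.replace("@sys", SYSNAME[(machine, os_release)])

print(expand_at_sys("/afs/cs.utah.edu/@sys/bin/", "sun4", "SunOS 4.1.3"))
# /afs/cs.utah.edu/sun4c_413/bin/
```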
Cells:
The next level of hierarchy in the AFS are cells.
An AFS cell is a collection of servers grouped together administratively
and presenting a single filesystem. Usually, a cell corresponds
to a set of hosts under the same Internet domain name, such as
cs.utah.edu. However, it is up to the central AFS administration
(i.e., Transarc) to assign you to a particular cell.
If one of the file servers crashes, its clients are not affected as much, since they can still operate on other data in the same cell and on the remote cells they have permissions for. Moving files from one server to another within the same cell is transparent and does not affect filenames.
Volumes and Replication:
AFS volumes represent the next level in the
AFS hierarchy. UNIX divides disks into partitions; AFS further
subdivides partitions into volumes. Volumes are size-limited (usually
between several megabytes and several hundred megabytes) atomic
units of server replication, usually representing a directory
subtree belonging to a single user. The command `fs listquota` can
be used to obtain information about the name, quota, and usage
of a particular user's volume.
If a user needs to temporarily exceed the limit of
her volume, she cannot simply grab some more available space
from the disk. Instead, she would either have to store the data on some
other volume or be assigned another volume. This is
great for enforcing disk quotas, but it causes a lot of grief for
someone who has to exceed the space limit in a given subtree, and it
causes users to overestimate their needs and waste disk space.
AFS replicates its data by volume. Secure
data is typically replicated within the same cell, while publicly accessible,
frequently requested data is replicated in multiple locations
across the Internet. This keeps the
data accessible even if the owner's server is down or the network is slow.
For convenience, some volumes are declared read-only. This saves
a lot of replication effort, since read-only volumes cannot be
routinely modified.
AFS vs. Common Internet Protocols:
AFS is a TCP-based protocol that handles network
communication in a manner very similar to that of WebNFS (i.e.
it utilizes a persistent TCP connection for both control and data,
only recognizes raw binary data, etc.). Initial connection overhead
is slightly higher for AFS than it is for WebNFS since AFS has
to go through a number of security checks that WebNFS usually
skips. However, once the connection is established, AFS takes
advantage of its caching mechanism and usually performs faster
than NFS (25% faster than NFS 3.0 based on Andrew benchmark; no
comparison is available between AFS and WebNFS).
AFS seems to perform a lot better than FTP or HTTP 1.0 (although I could not find any numbers to support this) when transferring data across the network, since it offers advantages over these protocols similar to those of WebNFS: maintaining a single connection, keeping track of file offsets, etc.
Caching:
AFS supports client caching. All client machines run a Cache Manager process, which maintains information about the users logged in, finds and requests data on their behalf, and keeps chunks of retrieved data on local disk. Unlike WebNFS, which is stateless, AFS is stateful, i.e., it maintains the current state of each client.
Client caching reduces network traffic and speeds up "warm reads." After a client is done using a file, any writes it performed are automatically propagated to all replicated volumes.
To maintain consistency for read-write files, AFS uses the callback mechanism, which ensures that the cached copy of a file is up to date. When a file is modified, the fileserver breaks the callback; when the user accesses the file again, the Cache Manager fetches a new copy if the callback has been broken.
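The callback protocol can be modeled with two small classes. This is a simplified sketch of the idea, not Transarc's implementation; the class and method names are invented:

```python
# Simplified model of AFS callbacks: the server promises ("callback")
# to notify a client if a file changes; the client may serve cached
# data until that promise is broken.

class FileServer:
    def __init__(self):
        self.files = {}      # name -> data
        self.callbacks = {}  # name -> set of clients holding a callback

    def fetch(self, client, name):
        """Return file data and register a callback for the client."""
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, name, data):
        """Update a file and break all outstanding callbacks on it."""
        self.files[name] = data
        for client in self.callbacks.pop(name, set()):
            client.broken.add(name)

class CacheManager:
    def __init__(self, server):
        self.server, self.cache, self.broken = server, {}, set()

    def read(self, name):
        """Serve from cache unless the copy is missing or stale."""
        if name not in self.cache or name in self.broken:
            self.cache[name] = self.server.fetch(self, name)
            self.broken.discard(name)
        return self.cache[name]

server = FileServer()
server.files["paper.txt"] = "v1"
client = CacheManager(server)
assert client.read("paper.txt") == "v1"  # fetched; callback registered
server.store("paper.txt", "v2")          # callback broken
assert client.read("paper.txt") == "v2"  # stale copy refetched
```

Note that between the store and the next read, the client serves no stale data to applications precisely because the broken callback marks its copy invalid.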
Scalability:
The scalability claimed by AFS is rather impressive, with server-to-client ratios of up to 1:200 (the AFS goal) and a recommended ratio of 1:50. Dynamic cells (where clients and servers are easy to add or remove) can host tens of servers and thousands of clients. However, it is rather difficult to estimate the true (versus claimed) limits of AFS scalability, since the entire installed base of AFS is currently only about 1500 clients.
Security:
AFS uses Kerberos encryption to secure data traveling
across the network. Mutual authentication is supported (both the service
provider and the requester identify themselves). No weaker or stronger
encryption algorithms are provided, and no plug-ins are allowed. This
means that one cannot browse AFS files across the Web, since
no Web browser (to the best of my knowledge) currently supports
the Kerberos encryption framework. On the other hand, this could be
viewed as an advantage, since (unlike WebNFS) AFS does not default
to lower levels of security under any circumstances.
AFS utilizes three system-defined protection groups: system:anyuser, system:authuser, and system:administrators.
AFS supports access control lists (ACLs). An ACL contains a list of groups and users (up to 20 total) authorized to access a directory. ACLs work at the directory (not file) level. Subdirectories automatically get a copy of the parent's ACL upon creation, but the copy can later be modified independently.
There are a total of seven access rights per directory: lookup, read,
insert, write, delete, lock, and administer (change the ACL).
Only administrators can execute the `chgrp` and `chown`
commands. The traditional UNIX permission bits can be ignored
for all practical purposes.
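An ACL check over these seven rights can be sketched as follows; the entry names, group membership, and one-letter right abbreviations here are illustrative assumptions:

```python
# Sketch of an AFS-style directory ACL check using the seven rights
# listed above: lookup, read, insert, write, delete, lock, administer.
# The one-letter codes and the sample ACL below are assumptions.

RIGHTS = {"l", "r", "i", "w", "d", "k", "a"}

def has_right(acl, user, groups, right):
    """ACLs apply per directory; a user holds a right if any matching
    user or group entry grants it."""
    assert right in RIGHTS
    for principal, granted in acl.items():
        if principal == user or principal in groups:
            if right in granted:
                return True
    return False

acl = {"izrailev": "rlidwka", "system:anyuser": "rl"}
assert has_right(acl, "izrailev", set(), "a")                # owner: all rights
assert has_right(acl, "guest", {"system:anyuser"}, "r")      # anyone may read
assert not has_right(acl, "guest", {"system:anyuser"}, "w")  # but not write
```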
Summary:
Using AFS offers a number of benefits. It supports client caching and volume replication, which significantly improve AFS performance, especially under heavy network loads. It also reduces dependence on data originating in remote cells (or even on other servers within the same cell), since that data can be cached or replicated locally.
Global naming convention and location independence allow the AFS server to support transparency when accessing data both locally and remotely.
High security standards, uniform across all AFS cells,
grant the ability to maintain a high level of security in any
transaction, whether initiated locally or across the network.
Yet there are a number of notable disadvantages. Perhaps the greatest one is that AFS is NOT a UNIX filesystem and is not compatible with any other filesystem. People who are used to administering UNIX file systems like NFS often find AFS installation and maintenance cumbersome and inconvenient.
Origins:
CIFS, or the Common Internet File System, is Microsoft's protocol for data and resource sharing and communication across both local and wide area networks. It is an extension of Windows' native Server Message Block (SMB) protocol, which allows file and printer sharing on local networks, optimized for the Internet.
CIFS is claimed to be a "platform-independent" protocol. However, it is heavily adjusted to fit the communication standards built into Microsoft Windows. Then again, the same is true of WebNFS and AFS favoring UNIX-based standards. It is noteworthy that SMB has been successfully adopted across various platforms in the past, and there is strong reason to believe that CIFS will be just as successful in this respect.
There are currently several different versions of the CIFS protocol; the server and client negotiate which particular version is to be used in their communication.
CIFS can be run over both TCP and UDP, although TCP
has recently been prevailing on LANs and is predominant for
Internet communication. In this respect, CIFS is similar to WebNFS
and AFS, which also support both TCP and UDP as local network
transports but use TCP exclusively for communication across
the Internet.
File and Printer Access:
CIFS supports all standard file operations: open, close, read, write, etc. Printers are treated just like files: they are opened and written to, causing a print job to be queued.
File and record locking is also supported by CIFS.
Once a file has been locked by an application, it cannot be accessed
by non-locking applications.
Caching and Data Consistency:
CIFS supports caching, read-ahead, and write-behind for all files,
including unlocked ones. This scheme is used whenever there
is only one client accessing a file, or several clients reading
from a file. If several clients are accessing a file and one (or
more) of them request writes, caching is considered unsafe. The
server then notifies all clients of the unsafe state, and other
(non-caching) safer methods are adopted.
Applications can register with the server to be notified whenever
a file or directory is changed. Such updates help to avoid the
problem of clients having to constantly poll the server in case
they need consistent information.
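The safety rule the server applies can be captured in a few lines. This is a toy model of the policy described above, not the actual SMB opportunistic-locking machinery:

```python
# Toy model of the CIFS caching-safety rule: client-side caching
# (read-ahead/write-behind) is allowed only while at most one client
# accesses the file, or all accessing clients merely read.

def caching_safe(readers, writers):
    """Return True if clients may keep caching; False means the server
    notifies them to fall back to uncached (safe) operation."""
    if writers == 0:
        return True                       # any number of readers may cache
    return writers == 1 and readers == 0  # a single lone writer may cache

assert caching_safe(readers=5, writers=0)      # shared read: cache freely
assert caching_safe(readers=0, writers=1)      # exclusive writer: cache
assert not caching_safe(readers=2, writers=1)  # mixed access: unsafe
```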
It is clear that CIFS, just like AFS and unlike WebNFS,
is a stateful file sharing protocol. Instead of simply
processing requests as they come along, CIFS servers keep track
of the state of each client. This determines which methods should
be used and when, and allows for higher efficiency and security.
Extended Attributes:
CIFS allows non-file-system attributes, such as a content
description, author name, or expiration date, to be added to files.
Extended attributes are optional and supplement standard file
attributes like filename, length, creation time, etc. This feature
somewhat resembles MIME tags used by the HTTP protocol. CIFS is
a pioneer in this respect, since neither WebNFS nor AFS support
this property, rather treating all files as binary data.
Directory Subtree Mounting and Replication:
CIFS supports mounting multiple servers
and disk volumes onto subtrees of a client's directory hierarchy so that they
appear to reside on a single server and volume. This is done
in a fashion very similar to the way WebNFS and AFS mount remote
servers. Changes in the physical location of the data caused by
server reconfiguration are transparent to the client (as
long as the names remain consistent).
Just like AFS, CIFS supports subtree replication
(which, unlike in AFS, is not limited to discrete volumes).
Such replication is transparent to the client and helps improve
network load and fault tolerance (in case a remote file server
decides to crash). However, obvious problems arise in trying to
maintain consistency among replicated data. In general, this is
dealt with similarly to the way servers treat locally cached copies
of single files.
Global Name Resolution:
The need to mount remote servers
is reduced by the possibility of using global file names.
The syntax is the same as that for addressing local files. Consider
the following example: a remote client wants to access the file "yury.doc"
residing in the "\home\ugrad\izrailev\" directory on
the server "labnt0.cs.utah.edu". There are two different
ways the client could approach this task.
If it plans to access the server on a regular basis, it would be a good idea to add an entry to a table mapping drive letters to server names and file prefixes, say, mapping Z to "home" on "labnt0.cs.utah.edu". The call would then look as follows:
Z:\ugrad\izrailev\yury.doc
This is similar to the way AFS handles global name resolution, except that instead of adding an entry to a table, AFS would create a symbolic link.
The URL format for remote file access is also available. For the example above, it would be the following:
file://labnt0.cs.utah.edu/home/ugrad/izrailev/yury.doc
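The prefix-table lookup from the first approach can be sketched as follows; the table contents follow the example in the text, while the function itself is an illustration rather than the actual CIFS redirector:

```python
# Sketch of drive-letter prefix resolution: a drive letter is mapped
# once to a (server, share) pair, after which local-looking paths
# resolve to UNC-style remote names.

prefix_table = {"Z:": ("labnt0.cs.utah.edu", "home")}  # example mapping

def resolve(path):
    """Translate 'Z:\\ugrad\\...' into a UNC-style remote path."""
    drive, rest = path[:2], path[2:]
    server, share = prefix_table[drive]
    return "\\\\%s\\%s%s" % (server, share, rest)

print(resolve(r"Z:\ugrad\izrailev\yury.doc"))
# \\labnt0.cs.utah.edu\home\ugrad\izrailev\yury.doc
```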
Security and Authentication:
A set of resources is available for clients to access on each server, which may include files, subdirectories, printers, etc. CIFS supports two different ways of controlling access to these resources: share level and user level.
The share-level method assigns passwords to each particular resource. Several passwords for the same resource may denote different levels of access privileges for it. Any client on the network who can supply the server name, resource name, and password will be granted access.
The user-level method, instead of assigning passwords to resources, keeps track of all user IDs and passwords. When requesting a resource, a client must identify the server, the resource, its ID, and its password. User level offers a convenient way of maintaining and modifying a list of trusted clients without having to reassign a resource's password each time the list changes. It is also more convenient for keeping track of resource utilization statistics based on clients' IDs.
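The contrast between the two modes can be sketched as below; the credentials and resource names are invented, and this is only the access decision, not the actual SMB authentication exchange:

```python
# Toy contrast of CIFS share-level vs. user-level access control.
# All passwords and names below are made-up illustrations.

shares = {"printer1": {"s3cret": "read-write", "guestpw": "read-only"}}
users = {"izrailev": "mypassword"}

def share_level(resource, password):
    """Anyone who knows a resource password gets the matching access;
    identity is never checked."""
    return shares.get(resource, {}).get(password)

def user_level(user, password):
    """Access follows the user's identity, not a per-resource password."""
    return users.get(user) == password

assert share_level("printer1", "guestpw") == "read-only"
assert share_level("printer1", "wrong") is None
assert user_level("izrailev", "mypassword")
```

Revoking one person's access under share level means changing the resource password for everyone; under user level it means deleting one entry, which is exactly the maintenance advantage described above.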
Therefore, the user-level mechanism is preferred to share
level whenever possible. However, a number of servers and clients
that use older versions of SMB do not support user-level security.
To maintain compatibility, a server has to default to whatever
mechanism the client requests. This is a major security
problem for CIFS, similar to that of WebNFS, which also has to default
to the level of security requested by the client - a problem natural
to all protocols that try to maintain backward compatibility.
For AFS, on the other hand, this is not an issue since it deliberately
makes itself incompatible with other protocols to maintain higher
security levels.
To encrypt data traveling across the net, CIFS uses
the DES encryption protocol. To verify each other, both client and
server use an 8-byte key humbly provided by Microsoft upon request
(see the CIFS Internet-draft, section 2.10.1).
Summary:
The CIFS protocol shares features with both WebNFS and AFS,
as well as having a number of unique characteristics. Like AFS, it allows
for data caching and replication, which is crucial under heavy
network loads. The way it mounts data from remote servers into
the local hierarchy is similar to that of WebNFS. Its global naming
somewhat resembles the AFS approach (although it can hardly be
called location independent), while the CIFS URL is a clone of WebNFS's
URL notation.
High levels of security are not required when connecting to a CIFS server, which makes it just as vulnerable as WebNFS (or perhaps even more so). Both WebNFS and CIFS use DES encryption, while AFS utilizes the far more secure Kerberos framework. Although WebNFS could potentially use Kerberos, it almost always defaults down to DES anyway.
Resource access control is handled differently in each system. CIFS uses either passwords for the resource (NOT a good idea) or user IDs and passwords. CIFS also provides access to printers by treating them as files. WebNFS uses traditional UNIX permission bits, which heavily limit the possible combinations of users and types of access to a file. AFS uses ACLs to allow any combination of users (up to 20 entries total) and seven types of access privileges per subdirectory (while WebNFS and CIFS work on a per-file basis). Yet both CIFS and WebNFS maintain compatibility with their predecessors, something that AFS lacks.
All three systems use "standard" optimizations over the "veteran" Internet protocols HTTP and FTP. They all batch their requests, maintain a persistent TCP connection, and keep track of file offsets for the data that has already been transferred. While both WebNFS and AFS treat all files as binary data, CIFS appends extended file attributes. They may not be as explicit as MIME tags, but they are definitely better than plain file-system-type file attributes.
CIFS supports stateful servers in a manner similar
to AFS, something WebNFS is in need of. Both CIFS and AFS keep
track of all their clients' cached and replicated data and (try to)
maintain consistency among parallel copies.
All three protocols seem to have comparable fault tolerance levels,
expressed in resuming operations terminated by network problems.
WebNFS uses a sliding window approach for batching requests, which
seems to work rather well under changing network conditions. Local
server crashes would halt both WebNFS and CIFS clients, while
AFS clients could be switched transparently to a different file
server under the same cell.
It is extremely difficult to draw any performance comparisons since very little information is available. AFS showed a 25% better performance than NFS 3.0 (WebNFS predecessor) based on Andrew benchmark, probably caused by the lack of disk caching on the part of NFS, but this could be easily compensated for by the optimizations that WebNFS introduced into connection setup and maintenance. It is hard to estimate performance without actual numbers, but based on the available information and my experience with NFS and SMB, WebNFS and CIFS should perform comparably.
As has hopefully been shown in this paper, each of
the file sharing protocols presents certain advantages over the
other ones, and is likely to be selected based on a customer's particular
needs. While AFS is better suited for data sharing protected by
higher levels of security, WebNFS and CIFS simply extend already
existent protocols to be more efficient for Internet-wide transactions.
The predominant distributed file system solution will emerge based upon many conditions, where the outcome of the current architectural and operating systems competition will play a decisive role. Nevertheless, I am positive that the combined design and trial experience of many different systems will help to identify the right approaches to solving the problem of widely distributed general-purpose computing.
WebNFS:
AFS:
CIFS:
izrailev@eng.utah.edu