CLOUD COMPUTING: GOOGLE FILE SYSTEM
GOOGLE FILE SYSTEM (GFS)
• The Google File System (GFS) was developed in the late 1990s. It uses thousands of storage systems built from inexpensive commodity components to provide petabytes of storage to a large user community with diverse needs.
• The main concern of the GFS designers was to ensure the reliability of a system exposed to hardware failures, system software errors, application errors, and, last but not least, human errors.
• The system was designed after a careful analysis of the file characteristics and of the access models. Some of the most important aspects of this analysis reflected in the GFS design are:
✓ Scalability and reliability are critical features of the system; they must be considered from the beginning rather than at a later stage of the design.
✓ The vast majority of files range in size from a few GB to hundreds of TB.
✓ The most common operation is to append to an existing file; random write operations to a file are extremely infrequent.
✓ Sequential read operations are the norm.
✓ The users process the data in bulk and are less concerned with the response time.
✓ The consistency model should be relaxed to simplify the system implementation, but without placing an additional burden on the application developers.
Several design decisions were made as a result of this analysis:
✓ Segment a file into large chunks.
✓ Implement an atomic file append operation allowing multiple applications operating concurrently to append to the same file.
✓ Build the cluster around a high-bandwidth rather than low-latency interconnection network. Separate the flow of control from the data flow; schedule the high-bandwidth data flow by pipelining the data transfer over TCP connections to reduce the response time. Exploit network topology by sending data to the closest node in the network.
✓ Eliminate caching at the client site. Caching increases the overhead for maintaining consistency among cached copies at multiple client sites, and it is not likely to improve performance.
✓ Ensure consistency by channeling critical file operations through a master, a component of the cluster that controls the entire system.
✓ Minimize the involvement of the master in file access operations to avoid hot-spot contention and to ensure scalability.
✓ Support efficient checkpointing and fast recovery mechanisms.
✓ Support an efficient garbage-collection mechanism.
• GFS files are collections of fixed-size segments called chunks; at the time of file creation each chunk is assigned a unique chunk handle.
• A chunk consists of 64 KB blocks, and each block has a 32-bit checksum. Chunks are stored on Linux file systems and are replicated on multiple sites; a user may change the number of replicas from the standard value of three to any desired value. The chunk size is 64 MB.
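A minimal sketch (in Python) of this layout, assuming the 64 MB chunk and 64 KB block sizes given above; the constant names and the use of CRC32 as the 32-bit checksum are illustrative assumptions, not necessarily what GFS itself uses:

    import zlib

    CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks, as described above
    BLOCK_SIZE = 64 * 1024          # 64 KB blocks within a chunk

    def locate(offset):
        """Map a byte offset within a file to (chunk index, offset within that chunk)."""
        return offset // CHUNK_SIZE, offset % CHUNK_SIZE

    def block_checksums(chunk_data):
        """Compute a 32-bit checksum (CRC32 here, for illustration) for each 64 KB block."""
        return [zlib.crc32(chunk_data[i:i + BLOCK_SIZE])
                for i in range(0, len(chunk_data), BLOCK_SIZE)]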
• The architecture of a GFS cluster is illustrated in the figure below. A master controls a large number of chunk servers; it maintains metadata such as filenames, access control information, the location of all the replicas for every chunk of each file, and the state of individual chunk servers.
Figure: Architecture of a GFS cluster
• The locations of the chunks are stored only in the control structure of the master’s memory and are updated at system startup or when a new chunk server joins the cluster. This strategy allows the master to have up-to-date information about the location of the chunks.
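A sketch of how the master’s metadata might be organized, based on the two bullets above; the class and field names are hypothetical, and only the replica-location map is kept purely in memory (rebuilt at startup or when a chunk server joins):

    from dataclasses import dataclass, field

    @dataclass
    class MasterMetadata:
        """Hypothetical layout of the master's metadata; names are illustrative."""
        # Namespace metadata: file names, access control, chunk handles of each file.
        file_chunks: dict = field(default_factory=dict)      # filename -> list of chunk handles
        access_control: dict = field(default_factory=dict)   # filename -> access control info
        # Volatile metadata kept only in memory: chunk handle -> set of chunk server IDs.
        chunk_locations: dict = field(default_factory=dict)

        def register_chunk_server(self, server_id, chunks_held):
            """Called when a chunk server joins; refreshes the location map for its chunks."""
            for handle in chunks_held:
                self.chunk_locations.setdefault(handle, set()).add(server_id)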
• System reliability is a major concern; the operation log maintains a historical record of metadata changes, and the master recovers from a failure by replaying this log. To minimize the recovery time, the master periodically checkpoints its state and, at recovery time, replays only the log records written after the last checkpoint.
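A toy illustration of this recovery path, under the assumption that each log record carries a serial number and that the checkpoint records the serial number of the last mutation it captured (both are assumptions about the record format):

    from dataclasses import dataclass

    @dataclass
    class LogRecord:
        serial: int   # monotonically increasing sequence number of the mutation
        key: str      # metadata item being changed (illustrative)
        value: str    # new value

    def recover(checkpoint_state, checkpoint_serial, log_records):
        """Restore the checkpointed state, then replay only the log records
        written after the checkpoint."""
        state = dict(checkpoint_state)
        for rec in log_records:
            if rec.serial > checkpoint_serial:   # older records are already in the checkpoint
                state[rec.key] = rec.value
        return state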
• Each chunk server is a commodity Linux system; it receives instructions from the master and responds with status information.
• The consistency model is very effective and scalable. Operations such as file creation are atomic and are handled by the master. To ensure scalability, the master has minimal involvement in frequently occurring file mutations, such as write or append operations.
• When the data for a write straddles a chunk boundary, two operations are carried out, one for each chunk, as sketched below.
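A small sketch of this splitting, reusing the 64 MB chunk size from above; the function name and the (chunk index, offset in chunk, payload) tuple format are illustrative:

    CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB, as above

    def split_write(offset, data):
        """Split a write into per-chunk operations (chunk index, offset in chunk, payload).
        A write that straddles a chunk boundary yields one operation per chunk touched."""
        ops = []
        while data:
            chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
            take = min(len(data), CHUNK_SIZE - chunk_offset)
            ops.append((chunk_index, chunk_offset, data[:take]))
            offset += take
            data = data[take:]
        return ops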
The steps for a write request illustrate a process that buffers data and decouples the control flow from the data flow for efficiency (a sketch of this flow follows the list):
1. The client contacts the master, which assigns a lease to one of the chunk servers for a particular chunk if no lease for that chunk exists; then the master replies with the ID of the primary as well as the secondary chunk servers holding replicas of the chunk. The client caches this information.
2. The client sends the data to all chunk servers holding replicas of the chunk; each one of the chunk servers stores the data in an internal LRU buffer and then sends an acknowledgment to the client.
3. The client sends a write request to the primary once it has received the acknowledgments from all chunk servers holding replicas of the chunk. The primary identifies mutations by consecutive sequence numbers.
4. The primary sends the write requests to all secondaries.
5. Each secondary applies the mutations in the order of the sequence numbers and then sends an acknowledgment to the primary.
6. Finally, after receiving the acknowledgments from all secondaries, the primary informs the client.
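A toy end-to-end sketch of these steps; the ChunkServer and Primary classes, their method names, and the sequence-number bookkeeping are all illustrative assumptions meant only to mirror steps 2-6 above:

    class ChunkServer:
        """Toy chunk server: buffers pushed data, applies mutations in sequence order."""
        def __init__(self, name):
            self.name = name
            self.buffer = {}    # chunk handle -> data pushed by the client (step 2)
            self.applied = []   # (sequence number, chunk handle) pairs applied (step 5)

        def push_data(self, handle, data):
            self.buffer[handle] = data   # buffer the data and acknowledge
            return True

        def apply(self, seq, handle):
            self.applied.append((seq, handle))   # apply in sequence-number order
            return True

    class Primary(ChunkServer):
        """Toy primary: assigns consecutive sequence numbers and forwards to secondaries."""
        def __init__(self, name, secondaries):
            super().__init__(name)
            self.secondaries = secondaries
            self.next_seq = 0

        def write(self, handle):
            seq = self.next_seq          # step 3: the primary orders the mutation
            self.next_seq += 1
            self.apply(seq, handle)
            ok = all(s.apply(seq, handle) for s in self.secondaries)   # steps 4 and 5
            return ok                    # step 6: report the outcome to the client

    # Client side: push the data to every replica (step 2), then write to the primary.
    secondaries = [ChunkServer("cs1"), ChunkServer("cs2")]
    primary = Primary("cs0", secondaries)
    handle, data = 42, b"record"
    assert all(s.push_data(handle, data) for s in [primary, *secondaries])
    assert primary.write(handle)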