Distributed Shared Memory
Introduction
Distributed Shared Memory (DSM) is a resource management component of a
distributed operating system that implements the shared memory model in
distributed systems, which have no physically shared memory. The shared
memory model provides a virtual address space that is shared among all
computers in a distributed system. An example of this layout can be
seen in the following diagram taken from Advanced Concepts in Operating
Systems by Singhal & Shivaratri.
[Diagram: Distributed Shared Memory]
Motivation
In DSM, data is accessed from a shared address space similar to the way
that virtual memory is accessed. Data moves between secondary and main
memory, as well as between the distributed main memories of different
nodes. Ownership of pages in memory starts out in some pre-defined
state but changes during the course of normal operation. Ownership
changes take place when data moves from one node to another due to an
access by a particular process.
Advantages of Distributed Shared Memory:
- Hide data movement and provide a simpler abstraction for
sharing data. Programmers don't need to worry about memory transfers
between machines, as they do when using the message passing model.
- Allows the passing of complex structures by reference,
simplifying algorithm development for distributed applications.
- Takes advantage of "locality of reference" by moving the entire page
containing the data referenced rather than just the piece of data.
- Cheaper to build than multiprocessor systems. The ideas can be
implemented using ordinary hardware and do not require anything complex
to connect the shared memory to the processors.
- Larger memory sizes are available to programs, by combining the
physical memory of all nodes. This large memory will not incur disk
latency due to swapping, as in traditional distributed systems.
- An unlimited number of nodes can be used, unlike multiprocessor
systems, where main memory is accessed via a common bus that limits
the size of the system.
- Programs written for shared memory multiprocessors can be run on DSM
systems.
There are two different ways that nodes can be informed of who owns what
page: invalidation and broadcast. With invalidation, when some process
asks for write access to a page, all other copies of that page are
invalidated and the writer becomes the page's new owner. The next time
some other process tries to read or write the copy of the page it
thought it had, the page will not be available and the process will
have to re-request access to that page. Broadcasting automatically
updates all copies of a memory page when a process writes to it. This
is also called write-update. This method is less efficient and more
difficult to implement because the new value has to be sent to every
copy holder instead of a single invalidation message.
Organization
Here is an example layout of a distributed memory management system
implemented using fault handlers and servers:
Fault Handlers
A fault handler is a proccess or potrion of a process that sits and
waits for memory faults. When there is a memory access that it cannot
deal with locally, the fault handler will make a request to a server
on some other machine in the DSM environment. It is in charge of making
sure an application or program is given the memory pages it needs
without knowing what is going on underneath and where the page is
actually coming from.
Servers
The servers from the above diagram actually service the fault handlers
requests. They know which machines own the memory page that is being
accessed and can fetch the page and deliver it to the asking process.
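This division of labor can be roughly sketched as follows. The `PageServer` and `FaultHandler` names and the in-process "network" are invented for illustration, not taken from the text:

```python
# Sketch of the fault-handler/server split: the handler only knows how to
# ask its local server; the server knows which node owns each page.

class PageServer:
    """Knows the owner of each page and fetches pages for fault handlers."""
    def __init__(self, owners, memories):
        self.owners = owners      # page id -> owning node id
        self.memories = memories  # node id -> {page id: data}

    def fetch(self, page_id):
        owner = self.owners[page_id]
        return self.memories[owner][page_id]

class FaultHandler:
    """Serves local accesses; on a miss it asks the server, so the
    application never learns where the page actually came from."""
    def __init__(self, node_id, server):
        self.node_id = node_id
        self.server = server
        self.local = {}           # locally cached pages

    def access(self, page_id):
        if page_id not in self.local:              # memory fault
            self.local[page_id] = self.server.fetch(page_id)
        return self.local[page_id]

# Two nodes; node "B" owns page 7.
memories = {"A": {}, "B": {7: "data-on-B"}}
server = PageServer(owners={7: "B"}, memories=memories)
handler_a = FaultHandler("A", server)
print(handler_a.access(7))   # faults, then fetched transparently from B
```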
Example Code of Distributed Shared Memory Servers
Monitor Central Manager
Read Processes
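The read-path pseudocode is not inlined in this copy of the notes. As a hedged sketch of what the central manager's read handling does (a Python simulation with invented names, based on the standard centralized-manager scheme in which the manager records the reader in the page's copyset and the owner sends a read copy):

```python
# Sketch of the centralized manager's read path: the manager tracks the
# owner and copyset of each page; the owner supplies the read copy.

class CentralManager:
    def __init__(self):
        self.owner = {}     # page -> owning node id
        self.copyset = {}   # page -> set of nodes with read copies

    def read_request(self, page, requester, nodes):
        self.copyset.setdefault(page, set()).add(requester)
        owner = self.owner[page]
        # Owner sends a copy of the page to the requester.
        nodes[requester].pages[page] = nodes[owner].pages[page]

class Node:
    def __init__(self):
        self.pages = {}

manager = CentralManager()
nodes = {"A": Node(), "B": Node()}
nodes["B"].pages[3] = "v1"
manager.owner[3] = "B"
manager.read_request(3, "A", nodes)
print(nodes["A"].pages[3], manager.copyset[3])  # -> v1 {'A'}
```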
Write Processes
Write fault handler:
    Lock(PTable[p].lock);
    IF I am manager THEN BEGIN
        Lock(Info[p].lock);
        Invalidate(p, Info[p].copyset);
        Info[p].copyset := { };
        ask Info[p].owner to send p to self;
        receive p;
        Info[p].owner := self;
        Unlock(Info[p].lock);
    END
    ELSE BEGIN
        ask manager for write access to p;
        receive p;
        send confirmation to manager;
    END;
    PTable[p].access := write;
    Unlock(PTable[p].lock);
Write server:
    Lock(PTable[p].lock);
    IF I am owner THEN BEGIN
        send copy of p;
        PTable[p].access := nil;
    END;
    Unlock(PTable[p].lock);

    IF I am manager THEN BEGIN
        Lock(Info[p].lock);
        Invalidate(p, Info[p].copyset);
        Info[p].copyset := { };
        ask Info[p].owner to send p to RequestNode;
        receive confirmation from RequestNode;
        Info[p].owner := RequestNode;
        Unlock(Info[p].lock);
    END;
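Putting the write fault handler and write server together, the net effect at the manager can be simulated as follows (a sketch with invented names, not the book's code): a write fault invalidates every read copy, empties the copyset, and transfers ownership to the faulting node.

```python
# Sketch of the manager-side effect of a write fault in the
# central-manager scheme, mirroring the pseudocode above.

class ManagerInfo:
    def __init__(self, owner):
        self.owner = owner
        self.copyset = set()

def write_fault(info, requester, invalidated):
    for node in info.copyset:
        invalidated.append(node)   # Invalidate(p, Info[p].copyset)
    info.copyset = set()           # Info[p].copyset := { }
    info.owner = requester         # Info[p].owner := RequestNode

info = ManagerInfo(owner="B")
info.copyset = {"A", "C"}          # A and C hold read copies
invalidated = []
write_fault(info, "A", invalidated)
# Ownership moves to A; the copyset is empty; both copies were invalidated.
print(info.owner, info.copyset, sorted(invalidated))
```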
Dynamic Distributed Manager
Read Processes
Read fault handler:
Lock(PTable[p].lock);
ask PTable[p].probOwner for read access to p;
receive p and PTable[p].copyset;
PTable[p].probOwner = self;
PTable[p].access :=read;
Unlock(PTable[p].lock);
Read server:
    Lock(PTable[p].lock);
    IF I am owner THEN BEGIN
        PTable[p].copyset := PTable[p].copyset U {self};
        PTable[p].access := read;
        send p and PTable[p].copyset to RequestNode;
        PTable[p].probOwner := RequestNode;
    END
    ELSE BEGIN
        forward request to PTable[p].probOwner;
        PTable[p].probOwner := RequestNode;
    END;
    Unlock(PTable[p].lock);
Write Processes
Write fault handler:
    Lock(PTable[p].lock);
    ask PTable[p].probOwner for write access to page p;
    receive p and PTable[p].copyset;
    Invalidate(p, PTable[p].copyset);
    PTable[p].probOwner := self;
    PTable[p].access := write;
    PTable[p].copyset := { };
    Unlock(PTable[p].lock);
Write server:
    Lock(PTable[p].lock);
    IF I am owner THEN BEGIN
        PTable[p].access := nil;
        send copy of p and PTable[p].copyset to RequestNode;
        PTable[p].probOwner := RequestNode;
    END
    ELSE BEGIN
        forward request to PTable[p].probOwner;
        PTable[p].probOwner := RequestNode;
    END;
    Unlock(PTable[p].lock);
Invalidate server:
    PTable[p].access := nil;
    PTable[p].probOwner := RequestNode;
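The key idea in the dynamic distributed manager is the probOwner chain: a request is forwarded from probable owner to probable owner until it reaches the real owner, and each node along the way repoints its probOwner hint at the requester, shortening future chains. A rough simulation of that forwarding (invented names, not from the book):

```python
# Sketch of probOwner forwarding for a single page. Each node keeps a
# probOwner hint; requests chase hints until they reach the real owner.

prob_owner = {"A": "B", "B": "C", "C": "C"}   # C is the real owner
owner = "C"

def write_fault(requester):
    """Follow probOwner hints to the owner; every node on the path and
    the old owner repoint their hints at the requester (chain shortening,
    as in the read/write servers above)."""
    global owner
    node = prob_owner[requester]
    hops = 0
    while node != owner:
        nxt = prob_owner[node]
        prob_owner[node] = requester   # forwardee repoints its hint
        node = nxt
        hops += 1
    prob_owner[owner] = requester      # old owner repoints
    owner = requester                  # ownership transfers
    prob_owner[requester] = requester  # new owner points at itself
    return hops

hops_first = write_fault("A")   # A's hint -> B, B forwards to C
print(hops_first, owner, prob_owner)  # -> 1 A {'A': 'A', 'B': 'A', 'C': 'A'}
```

After the fault, every hint points at the new owner A, so the next fault on this page reaches the owner without any forwarding.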
Other Web resources related to DSM
Papers written about DSM
brian@discus.ise.vt.edu