LRPC: Overview of Bottlenecks


Server Locality

In these systems, a process often does not know where another process resides, or on which processor it runs, because the operating system hides this information. Hiding it provides strong assurance of system integrity. Unfortunately, using an RPC mechanism to achieve this transparency can impose a great deal of overhead. Researchers at the University of Washington found that, on the multiprocessing platforms they studied, most RPC calls were actually directed to the local machine (94.5% under Taos on the DEC Firefly). In other words, the RPC mechanism did not merely carry some overhead; for these calls, most of the mechanism was overhead. Multiprocessor systems are not priced to be wasted resources!
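
Since the point is that the full cross-machine machinery is wasted on same-machine calls, the following minimal sketch in C shows how a runtime might short-circuit a call once it knows the callee is local. The binding structure and names are purely illustrative assumptions, not the Taos or LRPC code.

    /* Hypothetical sketch: short-circuit calls whose target is local
       instead of running the full network path.  Names are illustrative. */
    #include <stdio.h>

    struct rpc_binding {
        int   is_local;                        /* decided at bind time           */
        int (*local_entry)(const void *args);  /* direct entry for a local server */
        int   remote_channel;                  /* stand-in for a network channel  */
    };

    static int send_over_network(int channel, const void *args)
    {
        /* Full path: marshal, trap to the kernel, queue, network ... (omitted). */
        (void)channel; (void)args;
        return 0;
    }

    static int rpc_call(struct rpc_binding *b, const void *args)
    {
        if (b->is_local)
            return b->local_entry(args);       /* cheap same-machine path */
        return send_over_network(b->remote_channel, args);
    }

    static int add_one(const void *args)       /* example local "server" */
    {
        return *(const int *)args + 1;
    }

    int main(void)
    {
        struct rpc_binding b = { 1, add_one, -1 };
        int x = 41;
        printf("%d\n", rpc_call(&b, &x));      /* prints 42 via the local path */
        return 0;
    }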

Data Transfer Requirements

Analysis of the procedure calls handled through the RPC mechanism showed that most calls were to a small number of procedures, and that only byte copying was needed to transfer the data passed as parameters. Most calls also passed very little data: less than 200 bytes for the majority of calls, and less than 50 bytes for most of those. Other research has suggested optimizations for the case of small parameters, as well as compiler-based mechanisms for passing parameters directly in machine registers.
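
The common case these measurements describe is worth seeing concretely: when the arguments are small and flat, "marshalling" amounts to a single byte copy into a small fixed buffer. The structure, sizes, and names below are illustrative assumptions, not figures taken from the measurements.

    /* Marshalling a small, flat argument record: one memcpy into a buffer
       comfortably larger than a typical call needs. */
    #include <stdio.h>
    #include <string.h>

    #define MSG_BYTES 256

    struct read_args {            /* a typical small, flat argument set */
        int  handle;
        long offset;
        int  count;
    };

    static size_t marshal(unsigned char msg[MSG_BYTES], const struct read_args *a)
    {
        memcpy(msg, a, sizeof *a);    /* no pointer chasing, no translation */
        return sizeof *a;
    }

    int main(void)
    {
        unsigned char msg[MSG_BYTES];
        struct read_args a = { 3, 1024L, 50 };
        printf("marshalled %zu bytes\n", marshal(msg, &a));  /* a few dozen bytes */
        return 0;
    }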

Required Activities

Much of the overhead in the RPC mechanism is required to maintain system integrity and access protection. For any such call, a theoretical minimum dispatch cost can be computed: a procedure call and return, two kernel traps, and two context switches, where each context switch changes both the process state and the virtual memory mappings. Any additional work required to implement RPC is considered overhead of the mechanism rather than useful work. The authors of the reference cited below identify the following issues, which must be addressed under all circumstances (a sketch following the list walks through the user-level portion of these steps):

Stub overhead
The RPC implementation must call a run-time stub that performs call indirection and parameter marshalling. This cost is virtually impossible to avoid, since the stub is the mechanism by which transparency is achieved; for the same reason, avoiding it is not necessarily desirable.

Message buffer overhead
Traditionally, the marshalled parameters are passed as a message through the kernel, requiring two message copies for the call and two for the return.

Access validation
The kernel must verify that the called process exists at the time of the call, and that the caller is still valid when the call returns.

Message transfer
Message queues require management, which can be fast but must be performed within the kernel.

Scheduling
The kernel must interpose on the call and manipulate data structures in the scheduling mechanism to ensure that the caller and callee are each active at the appropriate times and do not gain or relinquish control out of order. This is required to map the logical thread of control onto the two actual threads.

Context switch
The context switches from the caller to the callee and back again are hard to avoid.

Dispatch
Within the server process, a thread must be dispatched to fulfill the execution request.
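
To make the list concrete, here is a single-process sketch of the user-level steps: the client stub marshals and selects the procedure, the message is copied (standing in for the copies through the kernel), and a dispatch table in the "server" runs the requested procedure. The traps, access validation, scheduling, and real context switches are kernel work and appear only as comments; every name here is an illustrative assumption.

    #include <stdio.h>
    #include <string.h>

    #define MSG_BYTES 256
    struct msg { int proc; size_t len; unsigned char data[MSG_BYTES]; };

    /* ---- server side ------------------------------------------------ */
    static int proc_add(const unsigned char *in, unsigned char *out)
    {
        int a, b, r;
        memcpy(&a, in, sizeof a);
        memcpy(&b, in + sizeof a, sizeof b);
        r = a + b;
        memcpy(out, &r, sizeof r);
        return (int)sizeof r;
    }

    typedef int (*proc_fn)(const unsigned char *, unsigned char *);
    static proc_fn dispatch_table[] = { proc_add };       /* "Dispatch" */

    static void server_handle(const struct msg *req, struct msg *rep)
    {
        rep->proc = req->proc;
        rep->len  = (size_t)dispatch_table[req->proc](req->data, rep->data);
    }

    /* ---- client side ------------------------------------------------ */
    static int stub_add(int a, int b)                     /* "Stub overhead" */
    {
        struct msg req, in_kernel, at_server, rep;

        req.proc = 0;                                     /* call indirection */
        memcpy(req.data, &a, sizeof a);                   /* marshalling      */
        memcpy(req.data + sizeof a, &b, sizeof b);
        req.len = sizeof a + sizeof b;

        /* "Message buffer overhead": copies standing in for the
           client-to-kernel and kernel-to-server transfers; a trap and a
           context switch would occur between them in a real system.     */
        in_kernel = req;
        at_server = in_kernel;

        server_handle(&at_server, &rep);                  /* server thread runs */

        /* The return path mirrors the call: copies, trap, context switch. */
        memcpy(&a, rep.data, rep.len);
        return a;
    }

    int main(void)
    {
        printf("%d\n", stub_add(19, 23));                 /* prints 42 */
        return 0;
    }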

As mentioned above, some implementations use optimization techniques to reduce particular aspects of this overhead, but no single research project had achieved a significant reduction across the board. Optimizations have included eliminating message copies, handoff scheduling, and shared message buffers; heavyweight stubs remain a significant bottleneck.
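
The shared-buffer idea can also be sketched briefly: if the caller and callee both map the same argument region, the kernel never has to copy the message at all. The POSIX-flavoured example below uses a shared anonymous mapping and fork() only to demonstrate the effect; a real system would instead map a buffer pair into two existing address spaces, and the availability of MAP_ANONYMOUS is assumed.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct shared_args { int a, b, result; };

    int main(void)
    {
        /* One region visible to both the "client" and the "server". */
        struct shared_args *buf = mmap(NULL, sizeof *buf,
                                       PROT_READ | PROT_WRITE,
                                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        buf->a = 19;                 /* "marshalling" is writing in place */
        buf->b = 23;

        pid_t pid = fork();
        if (pid == 0) {              /* the "server": same bytes, no copy */
            buf->result = buf->a + buf->b;
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        printf("%d\n", buf->result); /* prints 42 */
        return 0;
    }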


Fred L. Drake, Jr., fdrake@csgrad.cs.vt.edu
Last modified: Saturday, 25 March 1995