The Synthetix MemGuard Kernel Programmer's Interface
Qian Zhang, qian@cse.ogi.edu
v1.0, June 1997

This document is an overview and reference manual of the Synthetix MemGuard Kernel Programmer's Interface.
______________________________________________________________________

Table of Contents:

1. Motivation

2. Overview
   2.1. Functionality
   2.2. Architecture
   2.3. Algorithms

3. Kernel Extension

4. Installation

5. Man Page
   5.1. NAME
   5.2. SYNOPSIS
   5.3. OVERVIEW
   5.4. DESCRIPTION
   5.5. EXAMPLES
   5.6. SEE ALSO
   5.7. BUGS
______________________________________________________________________

1. Motivation

In your operating system kernel, you have a fully generalized procedure, P, that is prepared to deal with any situation. It is slow because it is always testing for conditions that are seldom true; being a general piece of code, it always has to test for them. Now suppose you know that, for periods of time, certain of those conditions are false, so that P does not have to check for them. You can then specialize P for those situations.

Making assumptions about asynchronous, transitory conditions is called optimism. Finding all points in all the code of all the processes that can invalidate the assumption is the discipline of specifying Quasi-Invariants: "Invariant", because they are the assumptions that make the code of the specialization of P correct, and "Quasi", because invalidation can happen at any time.

Any invalidation of a QI (Quasi-Invariant) is represented by writes to some data structures, called Quasi-Invariant Terms. For example, if a QI is the condition that two variables are equal, updating either of the two variables may violate the condition. Guarding against writes to the two variables therefore locates all potential violations.

The Synthetix Guarding Tools consist of TypeGuard, a compile-time guarding tool, and MemGuard, a virtual-memory-based run-time guarding tool.
MemGuard protects a quasi-invariant term by write-protecting the physical page that contains the term. Any write to the page triggers a page protection fault, so the system can catch every write to the quasi-invariant term in that page and report an error to the system programmer. The system programmer can then insert proper code at the reported places to make sure that the specialized system always runs under correct assumptions. By iterating this procedure, the system programmer eventually completes the testing of the specialized system.

2. Overview

2.1. Functionality

MemGuard allows the kernel programmer to enable and disable the MemGuard facility dynamically. This is useful for focusing debugging on a particular section of the kernel.

MemGuard also provides a protect function to mark a piece of data memory as a quasi-invariant term. MemGuard reports an error if there is any unexpected write to that piece of memory. Correspondingly, there is a release function that tells MemGuard to stop protecting a quasi-invariant term.

When a quasi-invariant term is unexpectedly updated, MemGuard reports an error to the system programmer, who can then insert proper code at that place to ensure the correctness of the specialized system. Even if the programmer decides to insert nothing because the write does not violate the quasi-invariant, the update still has to be completed somehow. MemGuard therefore provides a write function that lets the kernel write to a protected quasi-invariant term explicitly without MemGuard complaining.

2.2. Architecture

MemGuard is a kernel library. For portability, it is organized in two levels. The lower level is the Memory Protection level, which is MemGuard's architecture-dependent part. The higher level is the MemGuard level, which is MemGuard's architecture-independent part. The Memory Protection level maps the underlying architecture to an abstraction machine, which provides the memory protection functionality that MemGuard requires.
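To make the two-level split concrete, here is a minimal user-space sketch of how such layering could look in C. It is not taken from the MemGuard sources: the names mp_ops, mp_prot, fake_arch, mg_page, mg_page_add_term and mg_page_drop_term are all hypothetical, and the "architecture" is just an array standing in for real page tables. The upper-level functions touch page protection only through the mp_ops table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical abstraction machine: the only view the upper level has
 * of the hardware. MP_READ_WRITE is 0 so zeroed tables start writable. */
enum mp_prot { MP_READ_WRITE, MP_READ_ONLY };

struct mp_ops {
    enum mp_prot (*get_prot)(size_t page);
    void (*set_prot)(size_t page, enum mp_prot prot);
};

/* ---- Lower level: a fake "architecture" backed by a plain array. ---- */
#define FAKE_PAGES 8
enum mp_prot fake_table[FAKE_PAGES]; /* zero-initialized: all read-write */

enum mp_prot fake_get(size_t page)
{
    assert(page < FAKE_PAGES);
    return fake_table[page];
}

void fake_set(size_t page, enum mp_prot prot)
{
    assert(page < FAKE_PAGES);
    fake_table[page] = prot;
}

const struct mp_ops fake_arch = { fake_get, fake_set };

/* ---- Upper level: architecture-independent bookkeeping per page. ---- */
struct mg_page {
    size_t page;        /* page number on the abstraction machine */
    enum mp_prot saved; /* protection in force before the first term */
    int terms;          /* quasi-invariant terms living in this page */
};

/* First term in a page: save the old protection, then write-protect. */
void mg_page_add_term(const struct mp_ops *mp, struct mg_page *pg)
{
    if (pg->terms++ == 0) {
        pg->saved = mp->get_prot(pg->page);
        mp->set_prot(pg->page, MP_READ_ONLY);
    }
}

/* Last term removed from a page: restore the saved protection. */
void mg_page_drop_term(const struct mp_ops *mp, struct mg_page *pg)
{
    assert(pg->terms > 0);
    if (--pg->terms == 0)
        mp->set_prot(pg->page, pg->saved);
}
```

Porting such a design to another architecture would, under these assumptions, mean supplying a different mp_ops table while leaving the upper-level functions unchanged.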
The MemGuard level is implemented on this abstraction machine.

2.3. Algorithms

These algorithms are highly simplified here. Many details (especially synchronization details) are ignored.

Protect a quasi-invariant term:

  1. Save the physical page's original protection.
  2. Add the quasi-invariant term to the MemGuard data structures.
  3. Set the page's protection to read-only.

Release a quasi-invariant term:

  1. Remove the quasi-invariant term from the MemGuard data structures.
  2. If there is no quasi-invariant term left in the page, restore the original page protection.

Explicitly write to a quasi-invariant term:

  1. Restore the original page protection.
  2. Write to the quasi-invariant term.
  3. Set the page's protection to read-only.

A write to a protected page triggers a page protection fault. If the fault is caused by MemGuard protection, the MemGuard page protection fault handler takes over control. If the write is to a quasi-invariant term, the handler reports an error. If the write is to another data structure that happens to be in the same physical page as a quasi-invariant term, the handler tries to finish the write transparently.

The MemGuard Page Protection Fault Handler:

  1. Report an error if the write is to a quasi-invariant term.
  2. Update the stacked system flag register (set single-step mode upon returning from the handler).
  3. Restore the original page protection.

When execution returns from the page protection fault handler, the physical page has been set back to read-write and single-step mode is set. The faulting instruction is restarted automatically and finishes its write. Since the system is running in single-step mode, the completion of the faulting instruction triggers a single-step trap. The MemGuard single-step trap handler then protects the page again.

The MemGuard Single-Step Trap Handler:

  1. Protect the page again.
  2. Restore the stacked system flag register (clear single-step mode upon returning from the handler).

3. Kernel Extension

(The MemGuard prototype works on the Linux 2.0.27 kernel only.)

The architecture-dependent part of MemGuard is put under:

  /usr/src/linux-2.0.27/include/asm-i386/memguard
  /usr/src/linux-2.0.27/arch/i386/memguard

These source files provide the abstraction machine for the higher level of MemGuard.

The architecture-independent part of MemGuard is put under:

  /usr/src/linux-2.0.27/include/memguard
  /usr/src/linux-2.0.27/memguard

These source files implement MemGuard on top of the abstraction machine.

The MemGuard initialization is done by patching:

  /usr/src/linux-2.0.27/init/main.c

Since a quasi-invariant term may happen to be in the same physical page as a task structure, task switches may result in page protection faults. The Pentium processor requires a task gate to call the page-fault handler during a task switch, but the page-fault handler in Linux is not designed this way. Page protection faults are therefore avoided entirely during context switches by patching:

  /usr/src/linux-2.0.27/kernel/sched.c

The Pentium's 4MB page facility greatly increases the rate of false sharing (a normal data structure happening to be in the same page as a quasi-invariant term) and hurts MemGuard performance. This facility is disabled by patching:

  /usr/src/linux-2.0.27/arch/i386/kernel/setup.c

The Linux page fault handler is modified to identify MemGuard page protection faults and call the MemGuard page protection fault handler. The modification is done by patching:

  /usr/src/linux-2.0.27/arch/i386/mm/fault.c

The Linux debug trap handler is modified to restore the system state after the single-stepped faulting instruction. The modification is done by patching:

  /usr/src/linux-2.0.27/arch/i386/kernel/traps.c

4. Installation

The Linux Kernel 2.0.27 patch for MemGuard comes in RPM (Red Hat Package Manager) format and in GNU compressed tar format.

The installation steps:

  1. Install the Linux kernel 2.0.27 source package.
  2. Install the MemGuard package.
  3. Patch the original kernel source with the patch file, memguard.patch.
  4. In the kernel configuration, answer "Yes" to "OGI Synthetix MemGuard" in the "General Setup" group to use MemGuard, or answer "No" not to use MemGuard.
  5. Refer to the "README" file in the kernel source directory (/usr/src/linux).

5. Man Page

This section also appears in the MemGuard man page (/usr/man/man9/memguard.9).

5.1. NAME

The MemGuard API - Programming interface to the MemGuard function library for MemGuard users.

5.2. SYNOPSIS

    #include

    void mgReset(void);
    int mgEnable(void);
    int mgDisable(void);
    int mgAdd( char *name, caddr_t vaddr, u_int size );
    int mgDelete( caddr_t vaddr );
    int mgSet( caddr_t vaddr, u_int size, char *content );
    int mgGet( caddr_t vaddr, u_int size, char *content );
    int mgAssert( caddr_t vaddr, u_int size, char *content );
    int mgPrintErr(void);

5.3. OVERVIEW

MemGuard is a memory-protection-based run-time guarding tool for the Synthetix project, which is aimed at operating system specialization. A specialized system tries to run specialized components according to particular properties of the system state. To ensure correctness, a system should always run a specialized component under correct assumptions about the system state. Since any change to a state property shows up as writes to some particular data structure, guarding against these writes helps find all potential changes to system state properties.

MemGuard uses memory protection (page protection) hardware to write-protect the data structures that we are concerned with. These data structures are called quasi-invariant terms. Any unexpected write to a quasi-invariant term triggers a page protection fault, so that the system can catch the write and report an error to the programmer. By using MemGuard, the programmer can ensure that a specialized system runs under correct specialization assumptions.
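As a rough user-space analogue of this mechanism (not MemGuard's kernel code), the following sketch uses the POSIX calls mmap, mprotect, sigaction, sigsetjmp and siglongjmp to write-protect a page and catch a write to it. The function names page_alloc, page_set_writable and try_write are hypothetical, and the sketch assumes a POSIX system that delivers SIGSEGV (or SIGBUS) for a write to a read-only page.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

sigjmp_buf fault_env;

/* Fault handler: abandon the faulting write and jump back to try_write. */
void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/* Map one anonymous read-write page; returns NULL on failure. */
void *page_alloc(void)
{
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make the page read-only (writable == 0) or read-write again. */
int page_set_writable(void *page, int writable)
{
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    int prot = writable ? (PROT_READ | PROT_WRITE) : PROT_READ;
    return mprotect(page, size, prot);
}

/* Attempt "*slot = value". Returns 1 if the write was caught as a
 * protection fault, 0 if it went through. */
int try_write(volatile int *slot, int value)
{
    struct sigaction sa, old_segv, old_bus;
    volatile int faulted;

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus); /* some systems raise SIGBUS instead */

    if (sigsetjmp(fault_env, 1) == 0) {
        *slot = value; /* faults if the page is read-only */
        faulted = 0;
    } else {
        faulted = 1;   /* reached via siglongjmp from on_fault */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}
```

A caller would typically allocate a page, place the guarded data in it, call page_set_writable(page, 0), and treat a return value of 1 from try_write as the report of an unexpected write. Unlike MemGuard's handler, this sketch drops the faulting write instead of single-stepping it to completion.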
MemGuard is a function library integrated into the kernel. It provides an API as the programming interface for MemGuard users.

5.4. DESCRIPTION

void mgReset(void);

    mgReset removes the MemGuard protection and resets the MemGuard data structures to their initial state. MemGuard is also disabled. Nothing is returned.

int mgEnable(void);

    mgEnable enables the MemGuard protection to protect quasi-invariant terms. Among the return values, MG_ENABLE_OK means success; MG_ENABLE_ENABLED means that MemGuard was already enabled.

int mgDisable(void);

    mgDisable disables the MemGuard protection so that quasi-invariant terms are unprotected until MemGuard is enabled again. However, the user can continue to delete old quasi-invariant terms and add new ones. Newly added quasi-invariant terms will be protected as soon as MemGuard is re-enabled. Among the return values, MG_DISABLE_OK means success; MG_DISABLE_DISABLED means that MemGuard was already disabled.

int mgAdd( char *name, caddr_t vaddr, u_int size );

    mgAdd protects a quasi-invariant term whose virtual address and size are vaddr and size. The name of the quasi-invariant term helps debugging. Among the return values, MG_ADD_OK means success; MG_ADD_VADDR means invalid vaddr; MG_ADD_SIZE means invalid size; MG_ADD_NAME means invalid name; MG_ADD_CONFLICT means that the prospective quasi-invariant term conflicts with existing quasi-invariant terms; MG_ADD_FULL means that there are already too many quasi-invariant terms.

int mgDelete( caddr_t vaddr );

    mgDelete removes the MemGuard protection from the quasi-invariant term whose virtual address is vaddr. Among the return values, MG_DELETE_OK means success; MG_DELETE_VADDR means that the quasi-invariant term does not exist.

int mgSet( caddr_t vaddr, u_int size, char *content );

    mgSet writes to the quasi-invariant term whose virtual address and size are vaddr and size, using the new value stored in the buffer starting at content.
    Among the return values, MG_SET_OK means success; MG_SET_VADDR means that the quasi-invariant term does not exist; MG_SET_SIZE means invalid size; MG_SET_CONTENT means that there was an error during the copy.

int mgGet( caddr_t vaddr, u_int size, char *content );

    mgGet reads the content of the quasi-invariant term whose virtual address and size are vaddr and size into the buffer starting at content. Among the return values, MG_GET_OK means success; MG_GET_VADDR means that the quasi-invariant term does not exist; MG_GET_SIZE means invalid size; MG_GET_CONTENT means that there was an error during the copy.

int mgAssert( caddr_t vaddr, u_int size, char *content );

    mgAssert asserts that the content of the quasi-invariant term whose virtual address and size are vaddr and size is equal to the content of the buffer starting at content. Among the return values, MG_ASSERT_OK means success; MG_ASSERT_VADDR means that the quasi-invariant term does not exist; MG_ASSERT_SIZE means invalid size; MG_ASSERT_CONTENT means that the contents are different or that there was an error during the copy.

int mgPrintErr(void);

    mgPrintErr prints the error message for the MemGuard function that has just finished. This function is similar to the user-level perror function. Among the return values, MG_PERROR_OK means success; MG_PERROR_ERROR means that the previous error is not understandable. The user should report this error to the author of MemGuard.

5.5. EXAMPLES

5.6. SEE ALSO

memguard_internal(9)

5.7. BUGS

Current error messages from MemGuard are sometimes incomplete and conservative because the faulting instruction is not analyzed accurately.

The current MemGuard does not support multiprocessor systems.

Since a kernel variable may be in the same page as a variable that is written during page protection fault handling, protecting the former may result in strange behavior of the kernel.
For example, some of our specialization experiments need to mark some fields of a task structure as quasi-invariant terms, but unfortunately, protecting task structures crashes the system. In these situations, thorough system testing without MemGuard is required. Future versions of MemGuard will try to avoid this case as much as possible. As with the copy-on-write mechanism, a future user-level MemGuard will not have this problem when it is used for memory protection in user address space.
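Since the EXAMPLES section of the man page is empty, the following sketch shows what a typical call sequence against the API of section 5.4 might look like. It is only an illustration: the real functions live in the kernel, so the mgAdd, mgSet, mgAssert and mgDelete bodies below are user-space stand-ins that merely keep a table of registered terms and do no page protection. The numeric values of the MG_* constants, the table size MOCK_MAX_TERMS, the exact size checks, and the variable fast_path_enabled are all assumptions.

```c
#define _DEFAULT_SOURCE /* for caddr_t and u_int in <sys/types.h> */
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Return codes: names from the man page, numeric values assumed. */
enum { MG_ADD_OK, MG_ADD_SIZE, MG_ADD_CONFLICT, MG_ADD_FULL };
enum { MG_SET_OK, MG_SET_VADDR, MG_SET_SIZE };
enum { MG_ASSERT_OK, MG_ASSERT_VADDR, MG_ASSERT_SIZE, MG_ASSERT_CONTENT };
enum { MG_DELETE_OK, MG_DELETE_VADDR };

/* Mock term table; a slot with size 0 is free. */
#define MOCK_MAX_TERMS 4
struct { caddr_t vaddr; u_int size; } mock_terms[MOCK_MAX_TERMS];

int mock_find(caddr_t vaddr)
{
    int i;
    for (i = 0; i < MOCK_MAX_TERMS; i++)
        if (mock_terms[i].size != 0 && mock_terms[i].vaddr == vaddr)
            return i;
    return -1;
}

int mgAdd(char *name, caddr_t vaddr, u_int size)
{
    int i;
    (void)name; /* the real library keeps the name for error reports */
    if (size == 0)
        return MG_ADD_SIZE;
    if (mock_find(vaddr) >= 0)
        return MG_ADD_CONFLICT;
    for (i = 0; i < MOCK_MAX_TERMS; i++) {
        if (mock_terms[i].size == 0) {
            mock_terms[i].vaddr = vaddr;
            mock_terms[i].size = size;
            return MG_ADD_OK;
        }
    }
    return MG_ADD_FULL;
}

int mgSet(caddr_t vaddr, u_int size, char *content)
{
    int i = mock_find(vaddr);
    if (i < 0)
        return MG_SET_VADDR;
    if (size == 0 || size > mock_terms[i].size)
        return MG_SET_SIZE;
    memcpy(vaddr, content, size); /* the sanctioned, explicit write */
    return MG_SET_OK;
}

int mgAssert(caddr_t vaddr, u_int size, char *content)
{
    int i = mock_find(vaddr);
    if (i < 0)
        return MG_ASSERT_VADDR;
    if (size == 0 || size > mock_terms[i].size)
        return MG_ASSERT_SIZE;
    return memcmp(vaddr, content, size) == 0 ? MG_ASSERT_OK
                                             : MG_ASSERT_CONTENT;
}

int mgDelete(caddr_t vaddr)
{
    int i = mock_find(vaddr);
    if (i < 0)
        return MG_DELETE_VADDR;
    mock_terms[i].size = 0;
    return MG_DELETE_OK;
}

/* Hypothetical quasi-invariant term: a flag a specialized path relies on. */
int fast_path_enabled = 1;

/* Typical sequence: guard, check, update explicitly, release.
 * Returns the final flag value, or a negative step number on failure. */
int demo(void)
{
    int on = 1, off = 0;
    caddr_t term = (caddr_t)&fast_path_enabled;
    u_int size = sizeof fast_path_enabled;

    if (mgAdd("fast_path_enabled", term, size) != MG_ADD_OK)
        return -1;
    if (mgAssert(term, size, (char *)&on) != MG_ASSERT_OK)
        return -2;
    if (mgSet(term, size, (char *)&off) != MG_SET_OK)
        return -3;
    if (mgDelete(term) != MG_DELETE_OK)
        return -4;
    return fast_path_enabled;
}
```

In kernel code the same sequence would normally be bracketed by mgEnable and mgDisable, and a failed call could be followed by mgPrintErr to print the reason.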