Reducing Minor Page Fault Overheads through Enhanced Page Walker

abstract

  • Application virtual memory footprints are growing rapidly in all systems, from servers down to smartphones. To address this growing demand, system integrators are incorporating ever larger amounts of main memory, warranting a rethinking of memory management. In current systems, applications incur page fault exceptions whenever they access virtual memory regions that are not backed by a physical page. As application memory footprints grow, they induce more and more minor page faults. Handling each minor page fault can take a few thousand CPU cycles and blocks the application until the OS kernel finds a free physical frame. These page faults can be detrimental to performance when they occur frequently and are spread across the application's runtime. In particular, minor page faults induced by lazy allocation increasingly impact application performance. Our evaluation of several workloads indicates that the overhead due to minor page faults can be as high as 29% of execution time. In this article, we propose to mitigate this problem through a hardware/software co-design approach. Specifically, we first propose to parallelize portions of kernel page allocation to run ahead of fault time in a separate thread. We then propose the Minor Fault Offload Engine (MFOE), a per-core hardware accelerator for minor fault handling. MFOE is equipped with a pre-allocated page frame table that it uses to service a page fault. On a page fault, MFOE quickly picks a pre-allocated page frame from this table, makes an entry for it in the TLB, and updates the page table entry to satisfy the fault. The pre-allocated frame tables are periodically refreshed by a background kernel thread, which also updates kernel data structures to account for the handled page faults. We evaluate this system in the gem5 architectural simulator with a modified Linux kernel running on simulated hardware containing the MFOE accelerator. Our results show that MFOE improves the average critical-path fault handling latency by 33× and the tail critical-path latency by 51×. Across the evaluated applications, we observed an average runtime improvement of 6.6%.
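
The core idea in the abstract is a fast fault path that consumes frames from a per-core table of pre-allocated page frames, with allocation and kernel bookkeeping moved off the critical path to a background refill thread. The following is a minimal, illustrative user-space C sketch of that idea under stated assumptions; it is not the paper's implementation, and all names (mfoe_frame_table, mfoe_handle_fault, refill_frames), the table size, and the flat toy page table are hypothetical.

```c
/* Illustrative sketch only: models the MFOE concept from the abstract
 * (a per-core table of pre-allocated frames consumed on a minor fault,
 * refilled in the background). The real design is a hardware accelerator
 * plus a modified Linux kernel; everything here is simplified. */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define FRAME_TABLE_SLOTS 16            /* hypothetical per-core table size */
#define NUM_VIRT_PAGES    64            /* toy address space for the demo   */

struct mfoe_frame_table {
    uint64_t frames[FRAME_TABLE_SLOTS]; /* pre-allocated physical frames    */
    int head, tail, count;              /* simple ring-buffer bookkeeping   */
};

static uint64_t next_free_frame = 1000;      /* stand-in for the allocator  */
static uint64_t page_table[NUM_VIRT_PAGES];  /* 0 = not backed by a frame   */

/* Background refill: in the paper this is a kernel thread that also
 * reconciles kernel data structures for faults already handled by MFOE. */
static void refill_frames(struct mfoe_frame_table *t)
{
    while (t->count < FRAME_TABLE_SLOTS) {
        t->frames[t->tail] = next_free_frame++;
        t->tail = (t->tail + 1) % FRAME_TABLE_SLOTS;
        t->count++;
    }
}

/* Fast path on a minor fault: pop a pre-allocated frame and install it in
 * the page table (the hardware would also insert a TLB entry). */
static bool mfoe_handle_fault(struct mfoe_frame_table *t, unsigned vpn)
{
    if (t->count == 0)
        return false;               /* fall back to the normal kernel path */
    page_table[vpn] = t->frames[t->head];
    t->head = (t->head + 1) % FRAME_TABLE_SLOTS;
    t->count--;
    return true;
}

int main(void)
{
    struct mfoe_frame_table table = {0};

    refill_frames(&table);          /* periodic refresh in the real design */

    for (unsigned vpn = 0; vpn < 4; vpn++)
        if (mfoe_handle_fault(&table, vpn))
            printf("vpn %u -> frame %llu\n", vpn,
                   (unsigned long long)page_table[vpn]);
    return 0;
}
```

The property the sketch tries to capture is that fault-time work shrinks to a table pop and a page-table update, while frame allocation and kernel accounting happen off the critical path.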

published proceedings

  • ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION

author list (cited authors)

  • Tirumalasetty, C., Chou, C. C., Reddy, N., Gratz, P., & Abouelwafa, A.

citation count

  • 1

complete list of authors

  • Tirumalasetty, Chandrahas||Chou, Chih Chieh||Reddy, Narasimha||Gratz, Paul||Abouelwafa, Ayman

publication date

  • December 2022