Online Timescale Theory for Concurrent Memory Allocation
- Pengcheng Li, Google
- Time: 2019-04-04 10:15
- Host: Prof. Xiaotie Deng
- Venue: Room 204, Courtyard No.5, Jingyuan
Concurrent memory allocation is increasingly important to parallel performance, but it is challenging because a program has data of many sizes and demand changes dynamically from thread to thread. Modern allocators use highly tuned heuristics but do not provide uniformly good performance when the level of concurrency increases from a few threads to hundreds of threads.
This paper presents a new timescale theory to model memory demand in real time. Using this theory, an allocator can adjust its synchronization frequency with a single parameter called allocations per fetch (APF). The paper presents the online timescale theory, the design of the tunable APF allocator, and an evaluation of its speed and memory efficiency. Compared with four state-of-the-art concurrent allocators, the new allocator improves the throughput of MongoDB by 55%, reduces the tail latency of a Web server by over 60%, and increases the speed of a selection of synthetic benchmarks by up to 24x while using the same amount of memory.
My name is Pengcheng Li. I joined Google in Dec. 2016, where I work mainly on program analysis and optimization, Google-wide profiling, tcmalloc, etc. Prior to this position, I was at the University of Rochester from fall 2012 for my Ph.D. My research focuses on memory management using timescale statistics, cache locality modeling, and program performance optimization. During my Ph.D., I published 20 papers and patents, including 3 US patents. Before that, I worked at Baidu for almost two years, from 2010 to 2012, on load balancer systems. I received my master's degree from the Institute of Computing Technology, Chinese Academy of Sciences, in 2010 and my bachelor's degree from the University of Science and Technology of China in 2007.