Simple & Optimal Quantile Sketch---Combining Greenwald-Khanna with Khanna-Greenwald
- Hongxun Wu, UC Berkeley
- Time: 2023-12-20 16:00
- Host: Dr. Tongyang Li
- Venue: Room 204, Courtyard No.5, Jingyuan
Abstract
Finding the approximate median (or more generally, quantiles) in a stream is a fundamental problem in data science. 20 years ago, Greenwald and Khanna proposed a deterministic algorithm for approximate streaming quantiles, whose space complexity was later proven to be optimal. This algorithm is very popular and gets "implemented" in spark-SQL and other software packages.
However, since it is complicated, these "implementations" are all oversimplified heuristics without theoretical guarantees. In this talk, I will present an algorithm that achieves both simplicity and optimality.
Biography
Hongxun Wu is a second-year PhD student at the theory group of UC Berkeley, advised by Jelani Nelson and Avishay Tal.His research interest is in low space computation, especially its derandomization and streaming algorithms.