A Production Quality Sketching Library for the Analysis of Big Data
- Time slot: 9/10 09:35~10:10 (scheduled to accommodate the US Pacific time zone)
- Speaker: Lee Rhodes
Distinguished Architect / Verizon Media
Lee Rhodes is a Distinguished Architect at Yahoo (now Verizon Media). He created the DataSketches project in 2012 to address analysis challenges in Yahoo’s large data processing pipelines. DataSketches was open-sourced in 2015 and is now in incubation at the Apache Software Foundation. He has authored or coauthored sketching work published in ICDT, IMC, and JCGS. He holds a master’s degree in electrical engineering from Stanford University and a bachelor’s degree in physics from San Diego State University.
The first part of the talk will discuss common problematic queries of big data where traditional analysis methods don’t work well. The second part will introduce the fundamental concepts of sketching and show how sketches’ use of stochastic processes and probabilistic analysis, combined with an appropriate system architecture, can achieve orders-of-magnitude improvements in system performance. Finally, the talk will give a quick overview of the open-source Apache DataSketches Library, which is dedicated to production systems that must process big data.