Faster Is Better: The Future of Analytics Is 1 Trillion Rows per Second
Today's world is always connected and constantly creating data. As a result, intelligent applications need to react immediately to drive business value.
Interactive response time matters: when a response arrives in about a quarter of a second or less, the result seems instantaneous to users, and that immediacy is highly satisfying.
With large data sets and many concurrent users, giving every customer that level of speed can seem beyond reach. As a result, developers sometimes take shortcuts, such as precomputing summary aggregates. Those shortcuts make for a rigid user experience: tweak a query a little, for example by adding an extra grouping column, and it suddenly runs orders of magnitude slower. They also mean that answers are not real time, because they do not reflect the latest data.
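The rigidity of precomputed aggregates can be illustrated with a minimal sketch. The table contents, column layout, and function names below are hypothetical and chosen only for illustration; they do not come from any particular product. A summary built for one grouping cannot answer a query that adds a grouping column, whereas a scan over the raw rows can.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount). Illustrative data only.
ROWS = [
    ("east", "widget", 10),
    ("east", "gadget", 5),
    ("west", "widget", 7),
    ("west", "widget", 3),
]

def group_sum(rows, key_columns):
    """Scan the raw rows and sum the amount, grouped by the given column positions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        totals[key] += row[2]
    return dict(totals)

# Precomputed summary: totals by region only, built ahead of time.
SUMMARY_BY_REGION = group_sum(ROWS, key_columns=(0,))

def answer_from_summary(key_columns):
    """Serve a query from the summary, or return None if the summary cannot answer it."""
    if key_columns == (0,):
        return SUMMARY_BY_REGION
    # The per-product detail was discarded when the summary was built.
    return None

by_region = answer_from_summary((0,))          # served from the summary
by_region_product = answer_from_summary((0, 1))  # summary cannot answer this
fresh = group_sum(ROWS, (0, 1))                # a scan of the raw rows can
```

In this sketch, the second query has to fall back to scanning the raw rows, which is the path that a fast scan engine aims to make interactive.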
To push the frontier of real-time insight, MemSQL and Intel have worked together to run a single SQL query that scans over a trillion rows per second with grouping and aggregation.
This result was achieved with the latest MemSQL release running on a cluster with 24 Intel Xeon Platinum 8180 processors with 28 cores each. MemSQL uses AVX2 SIMD extensions and vectorized operations directly on encoded columnstore data to enable this result.
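The idea of operating directly on encoded column data can be sketched as follows. This is a simplified illustration, not MemSQL's implementation: it assumes a dictionary-encoded grouping column, uses hypothetical function names and sample data, and uses a plain Python loop where a real engine would process batches of values with SIMD instructions. The point it shows is that the aggregation works on small integer codes and decodes each group's value only once, at the end.

```python
def dictionary_encode(values):
    """Return (dictionary, codes): the distinct values and one small integer per row."""
    dictionary = []
    index = {}
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes

def grouped_count_and_sum(codes, amounts, num_groups):
    """Aggregate using the integer codes as array offsets; no strings are touched."""
    counts = [0] * num_groups
    sums = [0] * num_groups
    for code, amount in zip(codes, amounts):
        counts[code] += 1
        sums[code] += amount
    return counts, sums

# Hypothetical trade data, for illustration only.
symbols = ["AAPL", "MSFT", "AAPL", "GOOG", "MSFT", "AAPL"]
shares = [100, 50, 25, 10, 50, 75]

dictionary, codes = dictionary_encode(symbols)
counts, sums = grouped_count_and_sum(codes, shares, len(dictionary))

# Decode once per group rather than once per row.
result = {dictionary[c]: (counts[c], sums[c]) for c in range(len(dictionary))}
```

Because the inner loop touches only integers and fixed-size arrays, it is the kind of loop that lends itself to vectorized execution.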
In this session, we will discuss how running queries this fast on industry-standard hardware can enhance analytics in your applications.
Eric Hanson is a principal product manager at MemSQL, responsible for the query processing, extensibility, and geospatial feature areas.
He holds a PhD from UC Berkeley and has been an Air Force officer, a professor of computer science at the University of Florida during the 1990s, and a principal program manager and developer on the SQL Server team at Microsoft from 2002 to 2016.
Eric was named a Hive committer for contributions to Stinger. He is a technology expert on data warehousing, column stores, and vectorized query execution.