Uber uses Presto to improve flexibility in data warehousing and querying

As enterprise data grows in size and complexity, the technology used to manage it keeps evolving. The distributed query engine PrestoDB is poised to provide a swath of improvements in large-scale business intelligence and analytics, according to Girish Baliga (pictured, right), governing board chair at the Presto Foundation and senior engineering manager at Uber Technologies Inc.

“Presto is an invaluable engine that can connect to all these different storage and data formats,” Baliga said. “It also allows us to have a single entry point for our users to run their SQL engines and get insights rather quickly compared to some of the other engines that we have at Uber.”

Uber runs its own internal deployments and leans heavily on Presto because the company has chosen an open data stack. Presto also plays nicely with other open-source data technologies such as Hadoop, Hive and Spark, Baliga added.

Baliga and Steven Mih (pictured, left), co-founder and chief executive officer of Ahana Cloud Inc., spoke with theCUBE industry analyst Lisa Martin in advance of the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event, airing on April 5. They discussed open data lakes, warehousing, and exciting updates from the Presto Foundation. (* Disclosure below.)

Traditional data warehousing examined

For years, organizations have relied on a mostly unchanged data warehousing style for their business intelligence and analytics. In its current form, that operating style is often ill-equipped to handle the complex data types and sources that exist today. Open cloud data lakes, like Ahana’s, built atop SQL engines like Presto, are giving organizations a more flexible, relatively inexpensive option at scale, according to Mih.

“What’s happening is that people are putting semistructured and unstructured data, for example, in cloud data lakes or other data lakes, and they find that they can query directly with a SQL query engine like Presto,” Mih explained. “And that lets you have a much more flexible approach to dealing with getting insights out of your data. That’s why companies are moving to a more modern architecture.”

Another big use case that solutions like Presto enable is ad-hoc, interactive querying, according to Mih.

“There’s so much data that’s being generated and stored, and you need to be able to query that data in place with very, very high performance, meaning that you can get answers back in seconds. That lets you have the interactive ability to drill into the data and innovate your business,” he said.

Standing under the Linux umbrella

Both Mih and Baliga are noteworthy members of the Presto Foundation, and that commonality underlies a commitment to improving the technology, which both Uber and Ahana Cloud rely on heavily. These two companies, along with a host of others, make up the “consortium of companies that all want to see Presto continue to get bigger and bigger,” according to Mih.

Although it sits under the Linux Foundation today, PrestoDB began as a project inside Facebook. Once it matured, it was open-sourced and donated to the Linux Foundation, where it still resides. Unsurprisingly, the Presto project is steadily being improved, with a wealth of new features already being tested or in the development pipeline.

“RaptorX is a multilevel caching system that has been fantastic,” Mih said. “Aria optimizations are another area. We at Ahana have also developed some security features; we’re donating the integrations with Apache Ranger, and that’s the type of things that we do to help the community.”

Beyond the new features being built for Presto-reliant organizations, big companies like Uber, with their high-capacity data needs, are showing immense trust in Presto’s burgeoning developer community. This community includes names like Facebook and Ahana Cloud, which help maintain the ecosystem.

The participating organizations within Presto fall into two broad categories: those (like Uber) that use it internally and others (like Ahana) that offer it as a service to other companies. The former, according to Baliga, bring scale and reliability, while the latter provide flexibility and extensibility. The interplay between these two stakeholder types is largely responsible for making the project what it is today.

Because Presto is an open-source project under the guardianship of the Linux Foundation, its users are also assured of the ecosystem’s sustained transparency, without any sudden licensing changes that could bring added costs or usage limitations, Mih and Baliga pointed out.

Here’s the complete video interview, part of SiliconANGLE’s and theCUBE’s pre-event coverage of the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event.

(* Disclosure: TheCUBE is a paid media partner for the AWS Startup Showcase: “Data as Code — The Future of Enterprise Data and Analytics” event. Neither Ahana Cloud Inc., the sponsor for theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Photo: SiliconANGLE
