Databricks’ CEO: Data Silos A Nightmare For Companies’ Ability To Benefit From Big Data

Data is king, but the vast majority of companies have no clue how to get a handle on all the information they’ve amassed. At this point they know they are supposed to collect reams of data on their customers, but what to do with it all is still largely unknown. That’s even with Google and Facebook rolling out all sorts of tools to help companies analyze and manage their big data.

The reason for all this confusion? Silos. That’s according to Databricks executives Ali Ghodsi, CEO, and Matei Zaharia, co-founder and chief technologist, who said at a recent summit that companies have long kept data walled off within their organizations, hurting their ability to use it to gain a competitive edge. Speaking at the European Spark+AI Summit in London last week, the executives argued that a divide between the IT department and the rest of the enterprise leaves companies unable to capitalize on the promises of big data. Within the data are patterns that, if interpreted correctly, could unearth business secrets that will help the organization grow and provide greater personalization for its customers. But the data can’t be walled off if it is to be effective. "Data engineers have liability; they need to make sure systems are secure and will work for decades to come," The Register quoted Ghodsi as saying during the conference. "In the line of business, where data scientists sit, they know the business and know what to do to move the needle, but need the data."

That is where Databricks comes in. Aiming to accelerate innovation for customers using data, it built an open source platform that lets data science teams collaborate with other lines of the business to build products around the data they have amassed. Databricks is backed by Andreessen Horowitz, NEA and Battery Ventures, among others, and counts Viacom, Shell and HP as customers. The company has raised $247 million since its inception five years ago.

During the same presentation, Databricks co-founder Zaharia argued that a huge bottleneck for machine learning systems and data analysis is that there are too many cooks in the kitchen. On one side are the data scientists who create the models, and on the other the engineers who prepare the data for the product teams. Bringing all the experts together can be difficult, requiring a lot of back and forth between the disparate teams. Changes and updates can take a long time, and mistakes and bugs can delay a project, making machine learning efforts frustrating for companies. Databricks wants to get more customers using machine learning rather than just talking about it, and thinks its open source platform is the way to go. With it, they can track the code and data used for machine learning tests and put them together in reusable packages. "Everyone who has tried to do machine learning knows it's complex," Zaharia told The Register. "People are excited about having an open-source project in this space."