I recently attended the Strata+Hadoop World conference, held in New York City from September 27-29. I have been to the conference in previous years, and in the past you could say that Hadoop World was a bit of a geek-fest. The conference focused on things like core Hadoop infrastructure, exciting new software like Spark, and innovations in the open source world. This year, while all of that was still there, many of the keynotes discussed bringing data analytics to the enterprise and focused on how we can use data to drive business outcomes.
There has been a major shift in the target market for analytics over the past few years. Early efforts focused on consumers and how to properly deliver advertisements. These areas were low-hanging fruit, as companies like Google and Twitter had the advanced technical staff capable of seeing complex data projects to completion. We are now looking at ways to bring data analytics to Middle America and large enterprises that drive the rest of the economy. Significant focus has been placed on the areas of finance, manufacturing, and healthcare. The latter is being driven by advancements in biotechnology, which I will review later in this series.
Bringing data analytics to the enterprise requires a shift in how the infrastructure is approached. Enterprises expect turn-key solutions and generally have little interest in investing in experiments. Those promoting the Hadoop ecosystem realize this, and much effort is being undertaken to simplify the stack and make it easier to deploy. Cloudera, Hortonworks, and MapR are all trying to bring enterprise features to their Hadoop distributions. There is still a long way to go, though, as using most tools within the Hadoop stack still requires a significant technical investment.
Newer tools outside of the Hadoop ecosystem are arriving on the scene to solve these problems more quickly. They may not have the full swath of capabilities that Hadoop offers, but they make it possible to gain value from data much sooner. As one presenter put it, "simple is often better than complete."
Outside of the Hadoop space, we saw presentations from companies (IBM, Dell/EMC, SAS, and Google, to name a few) that are all attempting to deliver "big data" products in a manner that is consumable for the enterprise. Dell/EMC (through the Pivotal subsidiary) is promoting Platform-as-a-Service offerings, where a full data analytics stack – from storage to servers to software – is sold as a complete offering. Google presented BigQuery, its cloud-based enterprise data warehouse solution. IBM and SAS focused on convincing data scientists that the two vendors could be more open with their software offerings.
There were many other presenters and vendors speaking about simplifying, automating, and bringing enterprise-ready features to the data analytics stack. While Hadoop is still a major part of the "big data" space, it is now far from being the only player. Enterprises are going to target specific applications and use cases that can be deployed easily and will drive business outcomes quickly.
One of the hottest topics in relation to bringing analytics to the enterprise was machine learning. With machine learning, companies can analyze their data, look for patterns, and build models for predicting future business outcomes. In Part 2 of this series, I will discuss machine learning in the enterprise, including RoundTower's own machine learning product: Fast Answers.
Part 2: Data Analytics for the Enterprise
Machine Learning and Driving Business Outcomes
As discussed in Part 1 of this series, a major focus coming out of Hadoop World was the drive to bring data analytics to the enterprise. Effort is underway to simplify the analytics stack – particularly the Hadoop ecosystem – and to deliver it in the turn-key way that enterprises expect. In addition, new analytics products and tools are popping up to simplify data processing and automate complicated data science processes.
POWERED BY NUTONIAN EUREQA®
RoundTower Technologies and Nutonian have partnered to deliver the industry’s leading data science solution, Fast Answers Powered by Eureqa.
This collaboration brings together Nutonian’s industry-leading expertise in data science and machine learning with RoundTower’s capabilities in delivering data analytics platforms.