
Hortonworks’s New Vision for Connected Data Platforms

Courtesy of Hortonworks
On March 1, I had the opportunity to attend this year’s Hortonworks Analyst Summit in San Francisco, where Hortonworks announced several product enhancements, new versions, and a new definition of its strategy going forward.

Hortonworks seems to be making a serious attempt to take over the data management space while maintaining its commitment to open source, and especially to the Apache Software Foundation. As Hortonworks keeps gaining momentum, it is also consolidating its corporate strategy and bringing a new balance to its message, one that combines technology and business.

By reinforcing alliances, and at the same time moving further towards the business mainstream with more concise messaging around enterprise readiness, Hortonworks is declaring itself ready to win the battle for the big data management space.

The big question is whether the company’s strategy will be effective enough to achieve this goal, especially in a market that is already overpopulated and fiercely defended by big software providers.

Digesting Hortonworks’s Announcements
The announcements at the Hortonworks Analyst Summit included news on both the product and partner fronts. With regard to products, Hortonworks announced new versions of both its Hortonworks Data Platform (HDP) and Hortonworks DataFlow (HDF) platforms.

HDP—New Release, New Cycle
Alongside specific features to improve performance and reinforce ease of use, the latest release, HDP 2.4 (figure 1), includes Spark 1.6, the latest generation of Apache’s large-scale data processing framework, along with Ambari 2.2, Apache’s project for making Hadoop management easier and more efficient.

The inclusion of Ambari seems to be key to providing a solid, centralized management and monitoring tool for Hadoop clusters.


Figure 1. Hortonworks emphasizes enterprise readiness for its HDP version
(Image courtesy of Hortonworks)

Another key announcement is a new release cycle for HDP, which aims to provide users with a consistent product built on a stable core. Under the new cycle, core HDP services such as HDFS, YARN, and MapReduce, as well as Apache ZooKeeper, will be released yearly and aligned with a version of Apache Hadoop compatible with the “ODPi Core,” currently at version 2.7.1. This should provide standardization and ensure a stable software base for mission-critical workloads.

On the flip side, the extended services that run on top of the Hadoop core, including Spark, Hive, HBase, Ambari, and others, will be released continually throughout the year so that these projects stay up to date.

Last but not least, HDP’s new version also comes with the new SmartSense 1.2, Hortonworks’s issue resolution application, featuring automatic scheduling and uploading, as well as over 250 new recommendations and guidelines.


Growing NiFi to an Enterprise Level
Along with HDP, Hortonworks also announced version 1.2 of HDF, Hortonworks’s offering for managing data in motion by collecting, manipulating, and curating data in real time. The new version includes new streaming analytics capabilities for Apache NiFi, which powers HDF at its core, and support for Apache Storm and Apache Kafka (figure 2).

Another noteworthy feature coming to HDF is support for integration with Kerberos, which will enable and simplify the management of centralized authentication across the platform and other applications. According to Hortonworks, HDF 1.2 will be available to customers in Q1 of 2016.

Figure 2. Improved security and control added to Hortonworks’s new HDF version
(Image courtesy of Hortonworks)


Hortonworks Adds New Partners to its List
The third announcement from Hortonworks at the conference was a partnership with Hewlett Packard Labs, the central research organization of Hewlett Packard Enterprise (HPE).

The collaboration is a joint effort to enhance the performance and capabilities of Apache Spark. According to Hortonworks and HPE, it will focus mainly on the development and analysis of a new class of analytic workloads that benefit from using large pools of shared memory.

Says Scott Gnau, Hortonworks’s chief technology officer, with regard to the collaboration agreement:

This collaboration indicates our mutual support of and commitment to the growing Spark community and its solutions. We will continue to focus on the integration of Spark into broad data architectures supported by Apache YARN as well as enhancements for performance and functionality and better access points for applications like Apache Zeppelin.

According to both companies, this collaboration has already generated interesting results for Spark, including more efficient memory usage, faster sorting, and faster in-memory computations.

The results of this collaboration will be delivered as new technology contributions to the Apache Spark community, and should thus benefit this important piece of the Apache Hadoop ecosystem.

Commenting on the new collaborations, Martin Fink, executive vice president and chief technology officer of HPE and board member of Hortonworks, said:

We’re hoping to enable the Spark community to derive insight more rapidly from much larger data sets without having to change a single line of code. We’re very pleased to be able to work with Hortonworks to broaden the range of challenges that Spark can address.

Additionally, Hortonworks signed a partnership with Impetus Technologies, Inc., another solution provider based on open source technology. The agreement includes collaboration around StreamAnalytix™, an application that provides tools for rapid development of real-time analytics applications with less code, using Storm and Spark. Both companies aim for HDF and StreamAnalytix, used together, to give companies a complete and stable platform for the efficient development and delivery of real-time analytics applications.

But The Real News Is …
Hortonworks is rapidly evolving its vision of data management and integration, and this was in my opinion the biggest news of the analyst event. Hortonworks’s strategy is to integrate the management of both data at rest (data residing in HDP) and data in motion (data HDF collects and curates in real time), as being able to manage both can power actionable intelligence. It is in this context that Hortonworks is working to increase integration between the two platforms.

Hortonworks is now taking a new go-to-market approach to increase the quality and enterprise readiness of its platforms. Along with ensuring that ease of use removes barriers to end-user adoption, the company is changing its marketing message. The Hadoop-based company now sees the need to take a step further and convince businesses that open source does more than just do the job; it is in fact becoming the quintessential tool for any important data management initiative (and, of course, Hortonworks is the best vendor for the job). Along these lines, Hortonworks is taking steps to provide Spark with enterprise-ready governance, security, and operations to ensure readiness for rapid enterprise integration. This is to be achieved with the inclusion of Apache Ambari and other Apache projects.

One additional yet important aspect of this strategy is Hortonworks’s work on enterprise readiness, especially regarding issue tracking (figure 3), monitoring for mission-critical workloads, and security reinforcement.


Figure 3. SmartSense 1.2 includes more than 250 recommendations
(Image courtesy of Hortonworks)


It will be interesting to see how this new strategy works for Hortonworks, especially within the big data market, where competition is fierce and where many other vendors, including important partners of Hortonworks, are pushing hard to get a piece of the pie.

Taking its data management strategy to a new level is indeed bringing many opportunities for Hortonworks, but these are not without challenges as the company introduces itself into the bigger enterprise footprint of the data management industry.

What do you think about Hortonworks’s new strategy in data management? If you have any comments, please drop me a line below and I’ll respond as soon as I can.

(Originally published)

