
Machine Learning and Cognitive Systems, Part 3: An ML Vendor Landscape


In Parts One and Two of this series, I briefly explained what machine learning is and described some of its potential benefits, uses, and challenges within the scope of business intelligence and analytics.

In this installment, the last devoted to machine learning before we step into cognitive systems, I will give a general overview of the machine learning (ML) market landscape, describing some (yes, only some) of the vendors and software products that use ML for analytics and intelligence.

Machine learning: a common guest with no invitation

It is quite surprising to find how widely machine learning is present in today’s analytics applications. Its use is driven by:


  • The increasing need to crunch data that is more complex and more voluminous, at greater speed and with more accuracy (I mean really big data)
  • The need to solve increasingly complex business problems that require methods beyond conventional data analysis


An increasing number of traditional and new software providers, forced by specific market needs to radically evolve their existing solutions or moved by the pure spirit of innovation, have incorporated new data analytics techniques into their analytics stacks, either explicitly or hidden behind the scenes.

For software providers that already offer advanced analytics tools such as data mining, incorporating machine learning functionality into their existing capabilities stack is an opportunity to evolve their current solutions and take analytics to the next level.

So, it is quite possible that if you are using an advanced business analytics application, especially for Big Data, you are already using some machine learning technology, whether you know it or not.

The machine learning software landscape, in brief 

One of the interesting aspects of this seemingly new need to deal with increasingly large and complex sets of information is that many of the machine learning techniques originally used in pure research labs have already found their way into the business world through analytics offerings. New vendors may incorporate machine learning as the core of their analytics offering, or as just another functional feature in their stack.

Taking this into consideration, we can find a great many software products that offer machine learning functionality, to different degrees. Consider the following products, loosely grouped by type:

From the lab to the business

In this group we can find a number of products, most of them based on an open-source licensing model, that can help organizations test machine learning and perhaps take their first steps with it.

Weka

A collection of machine learning algorithms written in Java that can be applied directly to a dataset or called from a custom Java program, Weka is one of the most popular machine learning tools used in research and academia. It is distributed under the GNU General Public License, so it can be downloaded and used freely, as long as you comply with the license terms.

Because of its popularity, a lot of information is available about using and developing with Weka. It can still prove challenging for users unfamiliar with machine learning, but it’s quite good for those who want to explore the bits and bytes of applying machine learning analysis to large datasets.

R

Probably the most popular language and environment for statistical computing and graphics, R is a GNU project that comprises a wide variety of statistical and graphical techniques with a high degree of extensibility. No wonder R is one of the statistical tools most widely used by students.

The R project is designed around a core, or base, system of statistical features and functions that can be extended with a large set of function libraries available from the Comprehensive R Archive Network (CRAN).

From CRAN, it is possible to download the functions needed for multivariate analysis, data mining, and machine learning. But it is fair to assume that it takes a bit of effort to put machine learning to work with R.

Note: R is also of special interest owing to its increasing popularity and its adoption via Revolution Analytics, a company with a commercial offering for R that I discuss below.

Jubatus

Jubatus is an online distributed machine learning framework. It is distributed under the GNU Lesser General Public License version 2.1, which makes Jubatus another good option for the learning, trial, and (why not?) exploitation of machine learning techniques on a reduced budget.

The framework can be installed on different flavors of Linux, such as Red Hat, Ubuntu, and others, as well as on Mac OS X. Jubatus includes client libraries for C++, Python, Ruby, and Java. Its functional features include a list of machine learning libraries for applying different techniques such as graph mining, anomaly detection, clustering, classification, regression, and recommendation.

Apache Mahout

Mahout is Apache’s machine learning algorithm library. Distributed under a commercially friendly Apache software license, Mahout comprises a core set of algorithms for clustering, classification and collaborative filtering that can be implemented on distributed systems.

Mahout supports three basic types of algorithms or use cases to enable recommendation, clustering and classification tasks.
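To make the recommendation use case a little more concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It does not use Mahout (which is a Java library) and is not Mahout's API; the ratings data, the cosine helper, and the recommend function are all hypothetical names made up for illustration.

```python
from math import sqrt

# Hypothetical ratings, invented for illustration: user -> {item: rating}.
ratings = {
    "ana":   {"A": 5, "B": 4, "C": 1},
    "ben":   {"A": 4, "B": 5, "D": 5},
    "carla": {"C": 5, "D": 2, "E": 4},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=1):
    """Rank items the user has not rated by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("ana", ratings))
```

In this toy data set, "ana" rates items much as "ben" does, so the item that "ben" rated highly and "ana" has not seen ranks first. A library such as Mahout applies the same idea to far larger data sets on distributed systems.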


One interesting aspect of Mahout is its goal to build a strong community for the development of new and fresh machine learning algorithms.

Apache Spark

Spark is an Apache open source engine for large-scale data processing. It can run on Hadoop clusters and enables users to write applications in Java, Scala, or Python.

Like the tools in the Hadoop ecosystem, Spark is designed to deal with large amounts of data, both structured and unstructured. The Spark design supports cyclic data flow and in-memory computing, making it well suited to processing large data sets at high speed.

In this scenario, one of the engine’s main components is MLlib, Spark’s machine learning library. The library uses the Spark engine to run faster than MapReduce-based implementations and can operate in conjunction with NumPy, Python’s core scientific computing package, which gives developers a great deal of flexibility when designing new applications.

Some of the algorithms included within MLlib are:

  • K-means clustering with K-means|| initialization
  • L1- and L2-regularized linear regression
  • L1- and L2-regularized logistic regression
  • Alternating least squares collaborative filtering, with explicit ratings or implicit feedback
  • Naïve-Bayes multinomial classification
  • Stochastic gradient descent
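As a rough illustration of two items in this list, the following plain-Python sketch fits an L2-regularized linear regression using stochastic gradient descent. It is not MLlib code and makes no claim about how MLlib implements these algorithms; the function name, parameters, and toy data are assumptions chosen for the example.

```python
import random

def sgd_ridge_regression(xs, ys, l2=0.01, learning_rate=0.1, epochs=500, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent with an L2 penalty on w."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Per-sample gradient of 0.5 * error**2 + 0.5 * l2 * w**2.
            w -= learning_rate * (error * xs[i] + l2 * w)
            b -= learning_rate * error
    return w, b

# Toy data, invented for illustration: points on the line y = 1 + 2x.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [1.0 + 2.0 * x for x in xs]

w, b = sgd_ridge_regression(xs, ys, l2=0.0)
print(w, b)
```

With the penalty set to zero, the fit should recover a slope near 2 and an intercept near 1. Raising l2 shrinks the slope toward zero, which is the trade-off regularization makes to reduce overfitting.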


While these tools give users hands-on access to machine learning at no cost, putting them to work can still be somewhat challenging. Many of them require special skills in machine learning, Java, or MapReduce to fully develop a business solution.

Still, these applications can enable new teams to start working on machine learning and experienced ones to develop complex solutions for both small and big data.

Machine learning by the existing players

As we mentioned earlier in this series, the evolution of Business Intelligence is demanding an increasing incorporation of machine learning techniques into existing BI and Analytics tools.

A number of popular enterprise software applications have already expanded their functional coverage to include machine learning—a useful ally—within their stacks.

Here are just a couple of the vast number of software vendors that have added machine learning either to their core functionality or as an additional feature-product of their stack.

IBM

It is no secret that IBM is betting heavily on advanced analytics and cognitive computing, especially with Watson, IBM’s cognitive computing initiative, which we will examine in the cognitive computing part of this series. IBM also enables users to develop machine learning approaches via its SPSS product stack, in which SPSS Modeler provides the ability to work with some specific machine learning algorithms.

SAS

Indubitably, SAS is one of the key players in the advanced analytics arena, with a solid platform for performing mining and predictive analysis, for both general and industry vertical purposes. It has incorporated key machine learning techniques that can be adopted for different uses. Several ML techniques can be found within SAS’ vast analytics platform, from the SAS Enterprise Miner and SAS Text Miner products to its SAS High-Performance Optimization offering.

An interesting fact to consider is SAS’ ability to provide industry and line-of-business approaches for many of its software offerings, packaging functionality into prebuilt vertical solutions.

Embedded machine learning

Significantly, machine learning techniques are reaching the core of many existing powerhouses as well as newcomers in the data warehouse and Big Data spaces. Some analytics and data warehouse providers have now embedded machine learning techniques, to varying degrees, directly within their database technologies.

1010Data

The New York-based company, a provider of Big Data and discovery software solutions, offers a set of what it calls in-database analytics in which a set of analytics capabilities is built right into 1010Data’s database management engine. Machine learning is included along with a set of in-database analytics such as clustering, forecasting, optimization, and others.

Teradata

Among its multiple offerings for enterprise data warehouse and Big Data environments, Teradata offers the Teradata Warehouse Miner, an application that packages a set of data profiling and mining functions that includes machine learning algorithms alongside predictive and mining ones. The Warehouse Miner is able to perform analysis directly in the database without a data movement operation, which eases the process of data preparation.

SAP

SAP HANA, which may be SAP’s most important technology initiative ever, will now support almost all (if not actually all) of SAP’s analytics initiatives, and its advanced analytics portfolio is no exception.

Within HANA, SAP originally launched SAP HANA Advanced Analytics, which provides a number of functions for performing mining and prediction. This set of solutions includes specific algorithms for performing machine learning operations.

Additionally, SAP has expanded its reach into predictive analysis and machine learning via the SAP InfiniteInsight predictive analytics and mining suite, a product developed by KXEN, which SAP recently acquired.

Revolution Analytics

As mentioned previously, the open source R language is becoming one of the most important resources for statistics and mining available in the market. Revolution Analytics, a company founded in 2007, has been able to foster the work done by the huge R community and at the same time develop a commercial offering that exploits R’s benefits, giving R more power and performance through technology that enables its use in data-intensive enterprise applications.

Revolution R Enterprise is Revolution Analytics’ main offering. It contains the wide range of libraries provided by R, enriched with major technology improvements that enable the construction of enterprise-ready analytics applications. The application is available for download in both workstation and server versions, as well as on demand via the AWS Marketplace.

The new breed of advanced analytics

The advent and hype of Big Data has also become a sweet spot for innovation in many areas of the data management spectrum, especially in the area of providing analytics for large volumes of complex data.

A new wave of fresh and innovative software providers is emerging with solutions that enable businesses to perform advanced analytics over Big Data, using machine learning as a key component or enabler of this analysis.

A couple of interesting aspects of these solutions:


  1. Their unique approach to providing specific solutions to complex problems, especially adapted for business environments, combining flexibility and ease of use to make it possible for business users with a certain degree of statistical and mathematical preparation to address complex problems in the business.
  2. Many of them have already, at least partially, configured and prepared specific solutions for common business problems within lines of business and industries via templates or predefined models, easing the preparation, development, and deployment process.


Here is a sampling of some of these vendors and their solutions:

Skytree

Given that Skytree’s tagline is “The Machine Learning Company,” it’s pretty obvious that the company has machine learning in its veins. Skytree has entered the Big Data analytics space with what it describes as an enterprise-grade machine learning platform for performing mining, prediction, and recommendations.

Skytree Server is its main offering. A Hadoop-ready machine learning platform with high-performance analytics capabilities, it can also connect to diverse data streams and can compute real-time queries, enabling high-performance analytics services for churn prediction, fraud detection, and lead scoring, among others.

Skytree also offers a series of plug-ins to be connected to the Skytree Server Foundation to improve Skytree’s existing capabilities with specific and more advanced machine learning models and techniques.

BigML

If you Google BigML, you will find that “BigML is Machine Learning for everyone.”

The company, founded in 2011 in Corvallis, Oregon, offers a cloud-based, large-scale machine learning platform centered on business usability. It aims to keep costs highly competitive by providing advanced analytics via a subscription-based offering.

The application enables users to prepare complete analytics solutions for a wide range of analysis scenarios, from collecting the data and designing the model to creating special analytics ensembles.

Since it is a cloud-based platform, users can start using BigML services via a number of subscription-based and/or dedicated options. This is an attractive approach for organizations trying to make the most of advanced analytics with fewer technical and monetary resources.

Yottamine Analytics

Founded in 2009 by Dr. David Huang, Yottamine has put Dr. Huang’s contributions to the theory of machine learning into practice and reflected them in the Yottamine Predictive Service (YPS).

YPS is an on-demand advanced analytics solution based on the use of web services, which allows users to build, deploy, and develop advanced big data analytics solutions.

As an on-demand solution it offers a series of subscription models based on clusters and nodes, with payment based on the usage of the service in terms of node hours—a pretty interesting quota approach. 

Machine learning is pervasive

Of course, this is just a sample of the many advanced analytics offerings that exist. Others are emerging. They use machine learning techniques to different degrees and for many different purposes, specific or general. New companies such as BuildingIQ, emcien, BayNote, Recommind, and others are taking advantage of machine learning to provide unique offerings in a wide range of industry and business sectors.

So what?

One of the interesting effects of companies dealing with increasing volumes of data and, of course, a growing number of problems to solve is that techniques such as machine learning and other artificial intelligence and cognitive computing methods are gaining ground in the business world.

Companies and information workers are being forced to learn about these new disciplines and use them to find ways to improve analysis accuracy, the ability to react and decide, and prediction, encouraging the rise of what some call the data science discipline.

Many of the obscure tools for advanced analytics traditionally used in the science lab or at pure research centers are now surprisingly popular within many business organizations—not just within their research and development departments, but within all their lines of business.

On the other hand, new software is increasingly able not only to help in the decision-making process, but also to proactively reproduce and automatically improve complex analysis models, recommendations, and complex scenario analyses, enabling early detection and prediction and, potentially, data-based decisions.

Whether measuring social media campaign effectiveness, effectively predicting sales, detecting fraud, or performing churn analysis, these tools are remaking the way data analysis is done within many organizations.

But this might be just the beginning of a major revolution in the way software serves and interacts with humans. An increasing number of Artificial Intelligence disciplines, of which machine learning is a part, are rapidly evolving and reaching mainstream spaces in the business software world in the form of next-generation cognitive computing systems.

Offerings such as Watson from IBM might be the instigators of a new breed of solutions that go well beyond what we have so far experienced with regard to computers and the analysis process. So, I dare you to stay tuned for my next installment on cognitive systems and walk with me to discover these new offerings.

