Machine Learning and Cognitive Systems, Part 3: A ML Vendor Landscape
June 09, 2014
In parts One and Two of this series I gave a brief explanation of what Machine Learning is and some of its potential benefits, uses, and challenges within the scope of Business Intelligence and Analytics.
In this installment, the last devoted to machine learning before we step into cognitive systems, I will attempt a general overview of the Machine Learning (ML) market landscape, describing some (and only some) of the vendors and software products that use ML for analytics and intelligence.
Machine learning: a common guest with no invitation
It is quite surprising to find Machine Learning has a vast presence in many of today’s modern analytics applications. Its use is driven by:
The increasing need to crunch data that is more complex and more voluminous, at greater speed and with more accuracy; in other words, really big data.
The need to solve increasingly complex business problems that require methods beyond conventional data analysis.
An increasing number of traditional and new software providers, forced by specific market needs to radically evolve their existing solutions or moved by the pure spirit of innovation, have incorporated new data analytics techniques into their analytics offerings, either explicitly or hidden behind the curtains.
For software providers that already offer advanced analytics tools such as data mining, incorporating machine learning functionality into their existing capabilities stack is an opportunity to evolve their current solutions and take analytics to the next level.
So, it is quite possible that if you are using an advanced business analytics application, especially for Big Data, you are already using some machine learning technology, whether you know it or not.
The machine learning software landscape, in brief
One of the interesting aspects of this seemingly new need to deal with increasingly large and complex sets of information is that many of the machine learning techniques originally used in pure research labs have already entered the business world through their incorporation into analytics offerings. New vendors may make machine learning the core of their analytics offering, or simply one more functional feature in their stack.
Taking this into consideration, we can find a great deal of software products that offer machine learning functionality, to different degrees. Consider the following products, loosely grouped by type:
From the lab to the business
In this group we find a number of products, most of them open source, that can help organizations test machine learning and perhaps take their first steps.
Weka
A collection of machine learning algorithms written in Java that can be applied directly to a dataset or called from a custom Java program, Weka is one of the most popular machine learning tools used in research and academia. It is distributed under the GNU General Public License, so it can be downloaded and used freely, as long as you comply with the license terms.
Because of its popularity, a lot of information is available about using and developing with Weka. It can still prove challenging for users not familiar with machine learning, but it is quite good for those who want to explore the nuts and bolts of machine learning analysis on large datasets.
R
Probably the most popular language and environment for statistical computing and graphics, R is a GNU project that comprises a wide variety of statistical and graphical techniques with a high degree of extensibility. No wonder that R is one of the most widely used statistical tools among students.
The R project is designed around a base system of statistical features and functions that can be extended with a large set of function libraries available from the Comprehensive R Archive Network (CRAN).
From CRAN it is possible to download the necessary functions for multivariate analysis, data mining, and machine learning. But it is fair to assume that it takes a bit of effort to put machine learning to work with R.
Note: R is also of special interest owing to its increasing popularity and adoption via a commercial offering from Revolution Analytics, which I discuss below.
Jubatus
Jubatus is an online distributed machine learning framework. It is distributed under the GNU Lesser General Public License version 2.1, which makes Jubatus another good option for learning, trying out, and (why not) exploiting machine learning techniques on a reduced budget.
The framework can be installed on different flavors of Linux, such as Red Hat and Ubuntu, as well as on Mac OS X. Jubatus includes client libraries for C++, Python, Ruby, and Java. Its functional features include machine learning libraries for techniques such as graph mining, anomaly detection, clustering, classification, regression, and recommendation.
Apache Mahout
Mahout is Apache’s machine learning algorithm library. Distributed under a commercially friendly Apache software license, Mahout comprises a core set of algorithms for clustering, classification and collaborative filtering that can be implemented on distributed systems.
Mahout supports three basic types of algorithms or use cases to enable recommendation, clustering and classification tasks.
One interesting aspect of Mahout is its goal to build a strong community for the development of new and fresh machine learning algorithms.
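To make the recommendation use case more concrete, here is a minimal sketch of user-based collaborative filtering in plain Python. It is not Mahout code and does not use Mahout's API; the ratings data, the user and item names, and the `cosine` and `recommend` helpers are all invented for illustration.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data, invented for illustration.
ratings = {
    "ana":   {"book_a": 5, "book_b": 4, "book_c": 1},
    "ben":   {"book_a": 4, "book_b": 5},
    "carla": {"book_b": 1, "book_c": 5, "book_d": 4},
    "dev":   {"book_a": 5, "book_b": 4, "book_d": 5},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=1):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # skip items the user already rated
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)
    return ranked[:top_n]

print(recommend("ben", ratings))
```

Users whose tastes resemble the target user's get more say in the ranking. A library such as Mahout applies the same idea to data sets far too large for a single in-memory dictionary.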
Apache Spark
Spark is an open source Apache engine for processing large-scale data sets that works alongside the Hadoop ecosystem. It enables users to write applications in Java, Scala, or Python.
Just like the rest of the Hadoop family, Spark is designed to deal with large amounts of data, both structured and unstructured. The Spark design supports cyclic data flow and in-memory computing, making it ideal for processing large data sets at high speed.
In this scenario, one of the engine’s main components is MLlib, which is Spark’s machine learning library. The library works using the Spark engine to perform faster than MapReduce and can operate in conjunction with NumPy, Python’s core scientific computing package, giving MLlib a great deal of flexibility to design new applications in these languages.
Some of the algorithms included within MLlib are:
K-means clustering with K-means|| initialization
L1- and L2-regularized linear regression
L1- and L2-regularized logistic regression
Alternating least squares collaborative filtering, with explicit ratings or implicit feedback
Naïve-Bayes multinomial classification
Stochastic gradient descent
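Two items in this list, L2-regularized linear regression and stochastic gradient descent, combine naturally. The sketch below shows the idea in plain Python on a single feature. It is not MLlib code; the `sgd_ridge` function, its parameters, and the synthetic data are assumptions made for illustration only.

```python
import random

def sgd_ridge(points, l2=0.01, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w*x + b by stochastic gradient descent with an L2 penalty on w."""
    rng = random.Random(seed)
    data = list(points)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the samples in a new random order each epoch
        for x, y in data:
            err = (w * x + b) - y
            # Gradient of 0.5*err^2 + 0.5*l2*w^2 with respect to w and b.
            w -= lr * (err * x + l2 * w)
            b -= lr * err
    return w, b

# Synthetic, noise-free data drawn from y = 2x + 1 (invented for illustration).
points = [(x / 10, 2 * (x / 10) + 1) for x in range(11)]

w, b = sgd_ridge(points, l2=0.0)      # no penalty: recovers slope and intercept
w_reg, _ = sgd_ridge(points, l2=0.5)  # strong penalty: slope shrinks toward 0
print(round(w, 3), round(b, 3), w_reg < w)
```

Each update looks at one sample at a time, which is what lets this family of methods scale to data that cannot be processed in a single pass in memory. The L2 term trades some fit for smaller, more stable coefficients.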
While these applications give users hands-on machine learning at no cost, putting them to work can still be somewhat challenging. Many of them require special skills in machine learning, Java, or MapReduce to fully develop a business solution.
Still, these applications can enable new teams to start working on machine learning and experienced ones to develop complex solutions for both small and big data.
Machine learning by the existing players
As we mentioned earlier in this series, the evolution of Business Intelligence is demanding an increasing incorporation of machine learning techniques into existing BI and Analytics tools.
A number of popular enterprise software applications have already expanded their functional coverage to include machine learning—a useful ally—within their stacks.
Here are just a couple of the vast number of software vendors that have added machine learning either to their core functionality or as an additional feature-product of their stack.
IBM
It is no secret that IBM is betting heavily on advanced analytics and cognitive computing, especially with Watson, IBM’s cognitive computing initiative, which we will examine in the cognitive computing part of this series. IBM can potentially enable users to develop machine learning analytics approaches via its SPSS product stack, which incorporates the ability to develop some specific machine learning algorithms via the SPSS Modeler.
SAS
Indubitably, SAS is one of the key players in the advanced analytics arena, with a solid platform for performing mining and predictive analysis, for both general and industry vertical purposes. It has incorporated key machine learning techniques to be adopted for different uses. Several ML techniques can be found within SAS’ vast analytics platform, from the SAS Enterprise Miner and Text Miner products to its SAS High-Performance Optimization offering.
An interesting fact to consider is SAS’ ability to provide industry and line-of-business approaches for many of its software offerings, packaging their functionality into prebuilt vertical solutions.
Embedded machine learning
Significantly, machine learning techniques are reaching the core of many existing powerhouses, as well as newcomers, in the data warehouse and Big Data spaces. Some analytic and data warehouse providers have now embedded machine learning techniques, to varying degrees, directly within their database technologies.
1010Data
The New York-based company, a provider of Big Data and discovery software solutions, offers a set of what it calls in-database analytics, in which a set of analytics capabilities is built right into 1010Data’s database management engine. Machine learning is included along with a set of in-database analytics such as clustering, forecasting, optimization, and others.
Teradata
Among its multiple offerings for enterprise data warehouse and Big Data environments, Teradata offers the Teradata Warehouse Miner, an application that packages a set of data profiling and mining functions that includes machine learning algorithms alongside predictive and mining ones. The Warehouse Miner is able to perform analysis directly in the database without undergoing a data movement operation, which eases the process of data preparation.
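The benefit of analyzing data where it lives, rather than moving it first, can be shown with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in. It does not use Teradata or 1010Data software, and the `sales` table and its values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("east", 80.0),
     ("west", 200.0), ("west", 100.0), ("west", 60.0)],
)

# Client-side approach: pull every row out, then aggregate in application code.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals = {}
for region, amount in rows:
    total, count = totals.get(region, (0.0, 0))
    totals[region] = (total + amount, count + 1)
client_side = {region: total / count for region, (total, count) in totals.items()}

# In-database approach: the engine aggregates; only the summary leaves the database.
in_database = dict(
    conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region").fetchall()
)
conn.close()

print(in_database)
```

Both paths give the same averages, but the second returns one row per region instead of every record. With five rows the difference is invisible; at warehouse scale, avoiding that data movement is the point of in-database analytics.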
SAP
SAP HANA, which may be SAP’s most important technology initiative ever, will now support almost all (if not actually all) of SAP’s analytics initiatives, and its advanced analytics portfolio is no exception.
Within HANA, SAP originally launched SAP HANA Advanced Analytics, in which a number of functions for performing mining and prediction take place. Under this set of solutions it is possible to find a set of specific algorithms for performing machine learning operations.
Additionally, SAP has expanded its reach into predictive analysis and machine learning via the SAP InfiniteInsight predictive analytics and mining suite, a product developed by KXEN, which SAP recently acquired.
Revolution Analytics
As mentioned previously, the open source R language is becoming one of the most important resources for statistics and mining available in the market. Revolution Analytics, a company founded in 2007, has been able to foster the work done by the huge R community and at the same time develop a commercial offering for exploiting R benefits, giving R more power and performance resources via technology that enables the use of R for enterprise data intensive applications.
Revolution R Enterprise is Revolution Analytics’ main offering and contains the wide range of libraries provided by R enriched with major technology improvements for enabling the construction of enterprise-ready analytics applications. The application is available for download both as workstation and server versions as well as on demand via the AWS Marketplace.
The new breed of advanced analytics
The advent and hype of Big Data has also become a sweet spot for innovation in many areas of the data management spectrum, especially in the area of providing analytics for large volumes of complex data.
A new wave of fresh and innovative software providers is emerging with solutions that enable businesses to perform advanced analytics over Big Data, using machine learning as a key component or enabler for this analysis.
A couple of interesting aspects of these solutions:
Their unique approach to providing specific solutions to complex problems, especially adapted for business environments. They combine flexibility and ease of use so that business users with a certain degree of statistical and mathematical preparation can address complex business problems.
Many of them have already, at least partially, configured and prepared specific solutions for common business problems within lines of business and industries via templates or predefined models, easing the preparation, development, and deployment process.
Here is a sampling of some of these vendors and their solutions:
Skytree
Given that Skytree’s tagline is “The Machine Learning Company,” it’s pretty obvious that the company has machine learning in its veins. Skytree has entered the Big Data Analytics space with what it describes as an enterprise-grade machine learning platform for performing mining, prediction, and recommendations.
Skytree Server is its main offering. A Hadoop-ready machine learning platform with high-performance analytics capabilities, it can also connect to diverse data streams and can compute real-time queries, enabling high-performance analytics services for churn prediction, fraud detection, and lead scoring, among others.
Skytree also offers a series of plug-ins to be connected to the Skytree Server Foundation to improve Skytree’s existing capabilities with specific and more advanced machine learning models and techniques.
BigML
If you Google BigML, you will find that “BigML is Machine Learning for everyone.”
The company, founded in 2011 in Corvallis, Oregon, offers a cloud-based, large-scale machine learning platform centered on business usability. It keeps costs highly competitive by providing advanced analytics via a subscription-based offering.
The application enables users to prepare complete analytics solutions for a wide range of analysis scenarios, from collecting the data and designing the model to creating special analytics ensembles.
Since it is a cloud-based platform, users can start using BigML services via a number of subscription-based and/or dedicated options. This is an attractive approach for organizations trying to make the best of advanced analytics with less use of technical and monetary resources.
Yottamine Analytics
Founded in 2009 by Dr. David Huang, Yottamine has put Dr. Huang’s contributions to the theory of machine learning into practice in the Yottamine Predictive Service (YPS).
YPS is an on-demand advanced analytics solution based on the use of web services, which allows users to build, deploy, and develop advanced big data analytics solutions.
As an on-demand solution it offers a series of subscription models based on clusters and nodes, with payment based on the usage of the service in terms of node hours—a pretty interesting quota approach.
Machine learning is pervasive
Of course, this is just a sample of the many advanced analytics offerings that exist. Others are emerging. They use machine learning techniques to different degrees and for many different purposes, specific or general. New companies such as BuildingIQ, emcien, BayNote, Recommind, and others are taking advantage of the use of machine learning to provide unique offerings in a wide range of industry and business sectors.
So what?
One of the interesting effects of companies dealing with increasing volumes of data and, of course, increasing problems to solve is that techniques such as Machine Learning and other Artificial Intelligence and Cognitive Computing methods are gaining ground in the business world.
Companies and information workers are being forced to learn about these new disciplines and use them to find ways to improve analysis accuracy, the ability to react and decide, and prediction, encouraging the rise of what some call the data science discipline.
Many of the obscure tools for advanced analytics traditionally used in the science lab or at pure research centers are now surprisingly popular within many business organizations—not just within their research and development departments, but within all their lines of business.
At the same time, new software is increasingly able not only to help in the decision-making process, but also to be proactive: reproducing and automatically improving complex analysis models, recommendations, and scenario analyses to enable early detection, prediction and, potentially, data-based decisions.
Whether measuring social media campaign effectiveness, effectively predicting sales, detecting fraud, or performing churn analysis, these tools are remaking the way data analysis is done within many organizations.
But this might be just the beginning of a major revolution in the way software serves and interacts with humans. An increasing number of Artificial Intelligence disciplines, of which machine learning is a part, are rapidly evolving and reaching mainstream spaces in the business software world in the form of next-generation cognitive computing systems.
Offerings such as Watson from IBM might be the instigators of a new breed of solutions that go well beyond what we have so far experienced with regard to computers and the analysis process. So, I dare you to stay tuned for my next installment on Cognitive Systems and walk with me to discover these new offerings.