Gartner’s Magic Quadrant for Advanced Analytics Platforms 2014

Predictive analytics and other categories of advanced analytics are becoming a major factor in the analytics market. We evaluate the leading providers of advanced analytics platforms that are used to build solutions from scratch.

“It’s such a pivotal moment for data scientists and the growing open-source R community that Gartner has embarked on its first ever Magic Quadrant for Advanced Analytics Platforms. Gartner estimates advanced analytics to be a $2 billion market that spans a broad array of industries globally, and ‘Gartner predicts business intelligence and analytics will remain top focus for CIOs Through 2017.’ We believe that this new Magic Quadrant puts a spotlight on big data as the great analytics disruptor, which we feel highlights the need for solutions like Revolution Analytics’ that are built upon a flexible, open platform and designed for today’s Big Data Big Analytics challenges.” – Dave Rich

Magic Quadrant for Advanced Analytics Platforms 2014

What is Big Data?

Big Data – Definition

There is no universal definition of what constitutes “Big Data” and Wikipedia offers only a very weak and incomplete one: “Big data is a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time”.

IBM offers a good, simple overview:

Big data spans three dimensions: Volume, Velocity and Variety.

Volume – Big data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.
Velocity – Often time-sensitive, big data must be used as it is streaming in to the enterprise in order to maximize its value to the business.
Variety – Big data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more.

Bryan Smith of MSDN adds a fourth V:

Variability – Defined as the differing ways in which the data may be interpreted. Differing questions require differing interpretations.

Google Trends on Big Data:
Below is a figure from Google Trends showing the growth of search interest for “big data” as compared to “web analytics” and “business intelligence”:

Source article

Want to read more about Big Data?
Marc Smith from Social Media Research Foundation Speaks on Big Data

Predictive Analytics vs Data Mining

I just read a very good article about predictive analytics; the source can be found here.

Technology Cycle:
Data warehousing is a mature technology, with approximately 70 percent of Forrester Research survey respondents indicating they have one in production. Data mining has endured significant consolidation of products since 2000, in spite of initial high-profile success stories, and has sought shelter in encapsulating its algorithms in the recommendation engines of marketing and campaign management software. Statistical inference has been transformed into predictive modelling. As we shall see, the emerging trend in predictive analytics has been enabled by the convergence of a variety of factors.

Technology Hierarchy:
In the technology hierarchy, data warehousing is generally considered an architecture for data management. Of course, when implemented, a data warehouse is a database providing information about (among many other things) which customers are buying or using which products or services, and when and where they are doing so. Data mining is a process for knowledge discovery, primarily relying on generalizations of the “law of large numbers” and the principles of statistics applied to them. Predictive analytics emerges as an application that both builds on and delimits these two predecessor technologies, exploiting large volumes of data and forward-looking inference engines and, by definition, providing predictions about diverse domains.
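As a rough illustration of the “law of large numbers” mentioned above, the following minimal sketch simulates yes/no events with a fixed probability and shows that the observed rate settles toward that probability as the sample grows. The function name `sample_mean`, the probability 0.3 and the seed are assumptions made for this example only; they do not come from the article.

```python
import random


def sample_mean(n, p=0.3, seed=0):
    """Return the observed rate of 'yes' outcomes in n simulated events.

    Each event is 'yes' with probability p (hypothetical value for illustration).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if rng.random() < p)
    return hits / n


# Compare small and large samples: the larger the sample,
# the closer the observed rate tends to be to the true rate p.
for n in (10, 1_000, 200_000):
    print(n, sample_mean(n))
```

The point is only that estimates drawn from large volumes of data become stable, which is what makes patterns found in a data warehouse usable as a basis for inference.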

The method of data warehousing is structured query language (SQL) and its various extensions. Data mining employs the “law of large numbers” and the principles of statistics and probability that address decision making under uncertainty. Predictive analytics carries forward the work of the two predecessor domains. Though not a silver bullet, better algorithms in operations research, risk minimization and parallel processing, when combined with hardware improvements and the lessons of usability testing, have resulted in successful new predictive applications emerging in the market. (Again, see Figure 1 on predictive analytics enabling technologies.) Widely diverging domains such as the behaviour of consumers, stocks and bonds, and fraud detection have been attacked with significant success by predictive analytics on a progressively incremental scale and scope. The work of the past decade in building the data warehouse, and especially in its closely related techniques, particularly parallel processing, is a key enabling factor. Statistical processing has been useful in data preparation, model construction and model validation. However, it is only with predictive analytics that the inference and knowledge are actually encoded into the model that, in turn, is encapsulated in a business application.

This results in the following definition of predictive analytics: Methods of directed and undirected knowledge discovery, relying on statistical algorithms, neural networks and optimization research to prescribe (recommend) and predict (future) actions based on discovering, verifying and applying patterns in data to predict the behavior of customers, products, services, market dynamics and other critical business transactions. In general, tools in predictive analytics employ methods to identify and relate independent and dependent variables – the independent variable being “responsible for” the dependent one and the way in which the variables “relate,” providing a pattern and a model for the behavior of the downstream variables.
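To make the idea of relating an independent and a dependent variable concrete, here is a minimal sketch using ordinary least squares on a single independent variable. This is one common technique chosen for illustration, not the method the article prescribes, and the data (store visits and purchases) and the names `fit_line` and `predict` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: monthly visits (independent) and purchases (dependent).
visits = [1, 2, 3, 4, 5]
purchases = [3, 5, 7, 9, 11]

model = fit_line(visits, purchases)
forecast = predict(model, 6)  # predicted purchases for 6 visits
print(model, forecast)
```

The fitted slope and intercept are the “pattern” relating the two variables, and `predict` is the step where that pattern is applied to a case that has not been observed yet.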

In data warehousing, the analyst asks a question of the data set with a predefined set of conditions and qualifications, and a known output structure. The traditional data cube addresses: Which customers are buying or using which product or service, and when and where are they doing so? Typically, the question is represented in a piece of SQL against a relational database. The business insight needed to craft the question to be answered by the data warehouse remains hidden in a black box – the analyst’s head. Data mining gives us tools with which to engage in question formulation based primarily on the “law of large numbers” of classic statistics. Predictive analytics has introduced decision trees, neural networks and other pattern-matching algorithms constrained by data percolation. It is true that in doing so, technologies such as neural networks have themselves become a black box. However, neural networks and related technologies have enabled significant progress in automating, formulating and answering questions not previously envisioned. In science, such a practice is called “hypothesis formation,” where the hypothesis is treated as a question to be defined, validated and refuted or confirmed by the data.
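The data warehouse style of question described above, with predefined conditions and a known output structure, might look like the following minimal sketch. It uses Python's built-in `sqlite3` module with an in-memory database; the `sales` table, its columns and its rows are all hypothetical and stand in for a real warehouse schema.

```python
import sqlite3

# In-memory database standing in for a data warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "customer TEXT, product TEXT, region TEXT, sale_month TEXT, quantity INTEGER)"
)
rows = [
    ("alice", "widget", "north", "2014-01", 2),
    ("alice", "widget", "north", "2014-02", 1),
    ("bob", "gadget", "south", "2014-01", 5),
    ("carol", "widget", "south", "2014-01", 3),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)

# Who is buying which product, where and when? The conditions (product = 'widget')
# and the output columns are fixed in advance by the analyst.
query = """
SELECT customer, product, region, sale_month, SUM(quantity) AS units
FROM sales
WHERE product = 'widget'
GROUP BY customer, product, region, sale_month
ORDER BY customer, sale_month
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that the decision to ask about widgets, grouped by region and month, lives entirely with the analyst; nothing in the query discovers that this was the interesting question to ask.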
