IBM Watson Analytics: Automating Visualization, Descriptive, and Predictive Statistics

JMIR Public Health Surveill. 2016 Oct 11;2(2):e157. doi: 10.2196/publichealth.5810.

Abstract

Background: We live in an era of explosive data generation that will continue to grow and involve all industries. One of the results of this explosion is the need for newer and more efficient data analytics procedures. Traditionally, data analytics required a substantial background in statistics and computer science. In 2015, International Business Machines Corporation (IBM) released the IBM Watson Analytics (IBMWA) software that delivered advanced statistical procedures based on the Statistical Package for the Social Sciences (SPSS). The latest entry of Watson Analytics into the field of analytical software products provides users with enhanced functions that are not available in many existing programs. For example, Watson Analytics automatically analyzes datasets, examines data quality, and determines the optimal statistical approach. Users can request exploratory, predictive, and visual analytics. Using natural language processing (NLP), users are able to submit additional questions for analyses in a quick response format. This analytical package is available free to academic institutions (faculty and students) that plan to use the tools for noncommercial purposes.

Objective: To report the features of IBMWA and discuss how this software subjectively and objectively compares to other data mining programs.

Methods: The salient features of the IBMWA program were examined and compared with other common analytical platforms, using validated health datasets.

Results: Using a validated dataset, IBMWA delivered similar predictions compared with several commercial and open source data mining software applications. The visual analytics generated by IBMWA were similar to results from programs such as Microsoft Excel and Tableau Software. In addition, assistance with data preprocessing and data exploration was an inherent component of the IBMWA application. Sensitivity and specificity were not included in the IBMWA predictive analytics results, nor were odds ratios, confidence intervals, or a confusion matrix.

Conclusions: IBMWA is a new alternative for data analytics software that automates descriptive, predictive, and visual analytics. This program is very user-friendly but requires data preprocessing, statistical conceptual understanding, and domain expertise.

Keywords: data analysis; data mining; machine learning; natural language processing; statistical data analysis.