Achieving real-time machine learning and deep learning with in-memory computing
An in-memory computing platform with continuous learning capabilities enables a range of real-time decision-making use cases. What might some of these use cases be, and how will they affect the future of machine learning and deep learning?
Machine learning (ML) and artificial intelligence (AI) applications – based on deep learning (DL) technologies – are driving advances across industries and within organizations.
According to a report issued by Capgemini Consulting, 75 percent or more of organizations responding to a survey are implementing or using AI to increase sales of new products and services, increase operational efficiency, enhance customer satisfaction, and generate new insights and better analysis. Meanwhile, IDC predicts spending on AI and ML will grow from $12 billion in 2017 to $57.6 billion by 2021, and Deloitte Global predicts the number of machine learning pilots and implementations will double in 2018 compared to 2017 and double again by 2020.
Real-time continuous learning use cases powered by ML
However, there is an emerging demand for ML and DL to support continuous learning applications, in which a system’s learning model is updated whenever new data is added to the system, and this demand presents technical challenges for IT. Consider the following real-time continuous learning use cases powered by ML:
- New credit card approval during the checkout process. A shopper fills out the required information, which is immediately submitted to the credit card company. The company must then “score” the customer against a variety of factors: it pulls historical data, potentially from multiple systems, to identify the customer’s credit score, demographics, previous purchases, and so on, and assigns a weight to each factor to arrive at a risk score that determines whether to approve or reject the application. All of this must happen in a few seconds or less.
- Fraud prevention. A bank has developed a historical model of the indicators that a loan application is likely fraudulent. As the system ingests new applications, it continually updates the machine learning model with the new data so that it can identify, in real time, any emerging trends that might indicate a new concerted effort to acquire credit fraudulently. Any related fraudulent activity can then be identified immediately.
- Ecommerce recommendations. Online shopping recommendation engines are based on historical data such as web page visits and purchase patterns, but they are far more powerful, and deliver an increased ROI, if they incorporate real-time continuous learning. Feeding the latest web page information, referral information, and purchase patterns into the machine learning model updates the recommendation engine in real time, so its recommendations reflect the latest data available.
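The scoring and continuous-update pattern shared by these use cases can be illustrated with a minimal sketch. It assumes a simple logistic-regression scorer that is updated one record at a time by stochastic gradient descent; the class name, the two input factors, and the sample records are all hypothetical and are not taken from any vendor's implementation.

```python
import math


class OnlineRiskModel:
    """Logistic-regression risk scorer updated one record at a time (SGD)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features  # one weight per factor
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, features):
        """Return a risk score in (0, 1): a weighted sum passed through a sigmoid."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """Nudge the weights toward one newly observed outcome (label is 0 or 1)."""
        error = self.score(features) - label
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error


# Hypothetical, made-up records: (normalized factors, 1 = fraudulent, 0 = legitimate).
stream = [
    ([0.9, 0.8], 1),
    ([0.1, 0.2], 0),
    ([0.8, 0.9], 1),
    ([0.2, 0.1], 0),
] * 50

model = OnlineRiskModel(n_features=2)
for features, label in stream:
    model.update(features, label)  # the model changes as each record arrives

high_risk = model.score([0.85, 0.9])
low_risk = model.score([0.1, 0.1])
print(f"high-risk applicant: {high_risk:.2f}, low-risk applicant: {low_risk:.2f}")
```

Because each update touches only the current record, the model never has to be retrained from scratch when new data arrives, which is the property the use cases above depend on.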
Other common machine learning use cases include applications related to mortgage approvals, logistics, transportation system maintenance and better real-time business decision-making.
Though currently not as prevalent as machine learning, deep learning continues to generate interest. DL applications essentially analyze large numbers of examples with known outcomes in order to identify patterns that can be used to interpret new incoming data. A simple example would be image or voice recognition: a DL system might be fed thousands of pictures of cats in order to develop the ability to identify whether the object in a new picture is a cat. This capability, at real-time speeds and with continuous learning, will clearly be essential for the development of voice recognition systems, self-driving cars, and eventually autonomous robots.
Achieving real-time ML and DL with in-memory computing
Companies attempting to implement continuous learning ML or DL use cases may run into an extreme processing bottleneck because the data is typically stored in a standard disk-based online transaction processing (OLTP) database. Performing analysis on this data using traditional ML and DL methods typically requires an extract, transform, and load (ETL) process to move the data into an online analytical processing (OLAP) database, where it can be analyzed. However, ETL is a batch process that may take considerable time to complete because of the size of the data, which means the data in the OLAP system is already stale before the analysis takes place.
A solution to this challenge was recently introduced in Apache® Ignite™ 2.4. Apache Ignite, an open source in-memory computing platform, accelerates existing relational and NoSQL databases and provides horizontal scalability, high availability and strong consistency with distributed SQL. The platform includes an in-memory data grid, in-memory database, streaming analytics, and now artificial intelligence and continuous learning capabilities (including integrated Machine Learning and multilayer perceptron Deep Learning features).
Built on Apache Ignite, GridGain® product versions add various enterprise-grade features to Apache Ignite. GridGain solutions also benefit from ongoing QA testing by GridGain engineers, the incorporation of software updates that have not yet been released in the Apache Ignite code base, and expert professional services, making the GridGain in-memory computing platform an enterprise-grade solution for production environments. The Ignite ML and DL capabilities are supported in GridGain Professional Edition 2.4 and higher and GridGain Enterprise and Ultimate Editions 8.4 and higher.
GridGain continuous learning framework
The GridGain Continuous Learning Framework serves as a building block for in-process HTAP (hybrid transactional/analytical processing) for operational data sets up to petabytes in scale. The ML and DL libraries have been optimized for massively parallel processing, which allows the system to run each ML or DL algorithm locally against the distributed data residing in memory on each node of the GridGain cluster. This allows continuous recalculation of the model to reflect the most recent data in the system without impacting database performance.
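The idea of running an algorithm locally against the data on each node can be sketched in plain Python. This is an illustration only, under stated assumptions: it does not use the Apache Ignite or GridGain APIs, the "nodes" are ordinary lists, the data is made up, and the algorithm (gradient descent for a one-variable linear model) is chosen for brevity.

```python
# Each inner list stands in for the rows held in memory on one cluster node.
# Hypothetical data generated from y = 2x + 1.
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0)],
]


def local_gradient(rows, w, b):
    """Runs 'on the node': gradient sums and row count for its own rows only."""
    grad_w = grad_b = 0.0
    for x, y in rows:
        error = (w * x + b) - y
        grad_w += error * x
        grad_b += error
    return grad_w, grad_b, len(rows)


def train(partitions, steps=2000, learning_rate=0.05):
    """Fit y = w*x + b by combining per-partition partial gradients."""
    w = b = 0.0
    for _ in range(steps):
        # Map: every node computes a small partial result next to its data.
        partials = [local_gradient(rows, w, b) for rows in partitions]
        # Reduce: only the partial sums are combined; the rows never move.
        grad_w = sum(p[0] for p in partials)
        grad_b = sum(p[1] for p in partials)
        n = sum(p[2] for p in partials)
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


w, b = train(partitions)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On this toy data the fit recovers a slope close to 2 and an intercept close to 1. When new rows are appended to any partition, calling `train` again recomputes the model from the data where it already lives, without an ETL step into a separate analytical store.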
Some companies, such as Amazon and Google, may have huge budgets for their real-time machine learning and deep learning applications. For companies that don’t, GridGain, based on the open source Apache Ignite project, offers a cost-effective path to implementing real-time ML and DL capabilities. These capabilities can power the next generation of digital transformation and omnichannel customer experience applications that will drive future success.