At DataTorrent, we focus on enabling our customers to achieve successful business outcomes with fast data products. Our customers are able to develop and launch big data applications within a few weeks. Moreover, post launch, these applications have low support costs. We have achieved this by relentlessly focusing on ways to lower our customers' time to market and total cost of ownership.
Our product includes Apoxi™, a framework that helps assemble applications by stitching together data services. RTS 3.10 adds significant new features to enable data services for machine learning and artificial intelligence. Analytics are available for both data-in-motion and data-at-rest. Additionally, all Apoxi-enabled applications automatically get to use common IT data services. Notable IT data services include store and replay, support for heterogeneous grids, cloud-agnostic operators, an online analytics service, and a machine scoring service. The list of the latest data services and applications is available in DataTorrent's AppFactory.
In this blog, I discuss a reference architecture for such an application, found in AppFactory. The use case covered is retail recommendation using machine learning. We have partnered with MindStix for this application template. MindStix has expertise in data science, and they bring this expertise to bear for our customers. DataTorrent provides the product that enables customers to develop similar applications. RTS 3.10 includes the Apoxi framework features customers need to reduce their time to value.
Retail Recommender Use Case: Today, enterprises are no longer only web, only mobile, or only brick-and-mortar based. Even Amazon has brick-and-mortar shops. In this ecosystem, enterprises receive data streams from multiple sources, i.e., omni-channels. Getting to know the retail customer requires processing all of these data streams. This is inherently a fast big data use case.
Historically, retailers have relied heavily on recommendation engines to provide personalized recommendations for both online and offline services. These recommendation engines are an integral part of retailers' IT stacks. They help customers with various aspects of purchase decisions and help retailers boost profits. The bulk of first-generation recommendation engines were scale-up systems and were not ready for big data. With the advent of omni-channel, scale-up recommendation engines no longer work; they have to be scaled out. Additionally, recommendation engines based on basic rules are being outperformed by those that use machine learning or artificial intelligence. In this blog, we will walk through how a combined team from MindStix and DataTorrent created a sample application within a few weeks by stitching together fast big data services.
Online services such as e-commerce, airline ticketing, and dating websites offer overwhelming choices. Recommendation engines help customers dig through these choices. Additionally, they personalize these choices based on user behavior patterns, which improves customer retention. Recommendation applications need to ingest, process (clean, filter, normalize, join), analyze (machine scoring, OLAP), and deliver results back to the front end in real time. Most websites serve millions of users on a daily basis. Understanding the user is thus an adaptive process that has to continually improve, learn, and keep trying new ways to get better conversion. The end result has to be a personalized experience for each individual user that keeps up with that user's behavior. In fact, the most effective recommendation engine is one that can adapt to real-time changes. For example, if a particular sports team suddenly performs outstandingly well, a retail recommender should adapt and recommend that team's merchandise to sports fans and to users from the same geographic location. Handling such scenarios requires robust, efficient, and sophisticated recommendation engines.
Data scientists and machine learning experts employ multiple tools to develop machine learning models for recommendation systems. But without standardizing, they risk never being able to launch a product, reducing their work to a mere science experiment.
In this blog, we will discuss a way for data engineers to take what data scientists build into production, in a way that DevOps can easily launch and support. For that, let's identify a common pattern in recommendation engines: ingest and process multiple streams, run common analytics on that data, and provide customers retail recommendations using machine learning algorithms.
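The ingest, process, analyze, recommend pattern above can be sketched as a chain of stages. This is a minimal illustrative sketch in plain Python, not Apoxi or Apex API code; the event format, function names, and the lookup-table "model" are all assumptions standing in for real stream sources and trained models.

```python
# Sketch of the common pattern: ingest -> process -> score -> recommend.
# All names here are illustrative, not actual Apoxi service APIs.

def ingest(raw_events):
    """Simulate reading raw events from a stream such as Kafka."""
    for line in raw_events:
        yield line

def process(events):
    """Clean, filter, and normalize raw events into (user, item) pairs."""
    for e in events:
        e = e.strip().lower()
        if not e:
            continue                      # filter empty events
        user, item = e.split(",")         # normalize into (user, item)
        yield user, item

def score(events, model):
    """Attach a model score to each normalized event."""
    for user, item in events:
        yield user, item, model.get((user, item), 0.0)

def recommend(scored, threshold=0.5):
    """Keep only events whose score clears the threshold."""
    return [(u, i) for u, i, s in scored if s >= threshold]

# Toy lookup table standing in for a trained machine learning model.
model = {("alice", "shoes"): 0.9, ("bob", "hat"): 0.2}
raw = ["Alice,Shoes", "", "Bob,Hat"]
print(recommend(score(process(ingest(raw)), model)))  # [('alice', 'shoes')]
```

In the real application, each stage is a separate, reusable data service connected over the Apoxi service bus rather than a Python generator chain.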
Reference Architecture for ML-Based Recommendations: There is a common pattern for machine learning-based fast big data applications used for retail recommendations and similar products. The following diagram shows how these steps are implemented in the Apoxi framework. The diagram shows two ingestion services and two machine scoring services, but in practice there will often be more.
Every service in the above application is fully operationalized and ready for production in a very short time. Each service comes with both system and application metrics, along with visualization. Access to real-time and historical metrics is provided in DataTorrent's RTS product, which helps both with live uptime and with historical analysis to continuously lower support costs. All of these come with data visualizations that improve ease of use and make the data humanly consumable. The scores from machine learning models are available in real time so that the recommendation service can provide the best results. The application also makes data available for machine learning model training.
We have observed companies face steep challenges and spend significant amounts of time putting these models into production. Moreover, after deploying models to production, they have to ensure uptime and measure the models' effectiveness, which requires additional infrastructure and engineering work. Apoxi focuses on solving these operational and productization challenges to reduce a customer's time to market. Additionally, Apoxi ensures that our customers can successfully operate recommendations based on machine learning models.
As an example of the above pattern, a joint team of engineers from our partner MindStix and DataTorrent developed a real-time recommendation application template using Spark-based machine learning. DataTorrent RTS 3.10 was used for operational and productization support.
Before going into the details of our implementation, here is a quick overview of different recommendation techniques.
Recommendations: Any recommendation system based on machine learning faces cold-start issues and needs enough data to train its algorithms. In this phase, you need to measure how fast you can train your algorithms and how well they perform.
There are generally three types of recommendation engines:
- Content-Based Filtering: This technique recommends products based on products the user has liked, or products similar to those. It is also known as item similarity. This technique is very useful in cold-start situations where there is not much data available to train algorithms but there is still information about the items. Machine learning efforts can start with product catalogs, as these catalogs capture relationships between items.
- Collaborative Filtering: Collaborative filtering recommends items based on users. Indirectly, this technique recommends products that similar users may like. It faces the cold-start issue, as it needs data about what other users like and what a particular user has liked. It also needs behavioral dimensions to run models on.
- Hybrid Recommendation: The two approaches above can be combined into hybrid recommenders, which try to address the disadvantages of both techniques.
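To make the collaborative filtering idea concrete, here is a toy user-based variant using cosine similarity over a tiny rating matrix. This is a didactic sketch only; the production application uses a Spark ML model, and the data and function names below are assumptions.

```python
# Toy user-based collaborative filtering with cosine similarity.
import math

ratings = {
    "alice": {"shoes": 5, "hat": 3},
    "bob":   {"shoes": 4, "hat": 2, "scarf": 5},
    "carol": {"hat": 4},
}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def recommend(user, k=1):
    """Recommend items rated by similar users but unseen by `user`."""
    sims = [(cosine(ratings[user], ratings[o]), o)
            for o in ratings if o != user]
    seen = set(ratings[user])
    scores = {}
    for sim, other in sims:
        for item, r in ratings[other].items():
            if item not in seen:
                # Weight each neighbor's rating by similarity to `user`.
                scores[item] = scores.get(item, 0.0) + sim * r
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("alice"))  # ['scarf'] -- an item alice has not rated
```

A content-based recommender would instead compare item attribute vectors (category, brand, description terms), which is why it can work at cold start when user histories are empty but the product catalog is not.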
Implementation Details: We have demonstrated the reference architecture shown in figure 1. We selected a few data services from DataTorrent's AppFactory to implement it. These include an ingestion service that connects to Kafka and ingests, parses, cleans, and filters data. The events are then normalized and published on an Apoxi service bus, where they are scored by a machine scoring model derived from Spark ML. Post recommendation, another service enriches the events with additional data such as categories and detailed descriptions. It also tags these events as "RECOMMENDED" and publishes them for DataTorrent's Online Analytics Service to consume. An Apache Superset-based visualization service then gives the business operations team the ability to query and visualize product and user analytics. This implementation also leverages the store and replay service to make the recommendation service's output data available to data scientists, so they can continually train their machine learning model.
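The enrichment-and-tagging step described above can be sketched as a simple join against a product catalog followed by tagging. The event shape, catalog contents, and function name below are illustrative assumptions, not the actual service's schema.

```python
# Sketch of the post-recommendation enrichment step: join a scored event
# with catalog attributes, then tag it before publishing to analytics.

catalog = {
    "shoes": {"category": "footwear", "description": "Running shoes"},
    "hat":   {"category": "apparel",  "description": "Wool hat"},
}

def enrich_and_tag(event, catalog):
    """Enrich a scored event with catalog data and tag it RECOMMENDED."""
    enriched = dict(event)                       # keep the original fields
    enriched.update(catalog.get(event["item"], {}))  # add category, etc.
    enriched["tag"] = "RECOMMENDED"
    return enriched

event = {"user": "alice", "item": "shoes", "score": 0.9}
print(enrich_and_tag(event, catalog))
```

Tagging every event at this stage is what later lets OLAP queries distinguish recommended events from viewed ones while sharing a single schema.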
To measure the effectiveness of our recommendation application, user actions on the recommendations are collected and analyzed. A separate ingestion service collects this data, tags the events as "VIEWED", and makes the data available to the same Online Analytics Service. Note that this second application, which analyzes the success of the recommendation engine, reuses the same (or a similar) ingestion service as well as the store and replay service. This is an example of the service reuse that Apoxi provides, which reduces the time it takes to create new fast data applications. Since "VIEWED" event data shares the same dimensions (schema) as "RECOMMENDED" events, OLAP queries can combine the two, and business operations can perform various dimensional analytics on this data. Both system and application operations metrics are available.
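Because the two event types share a schema, an effectiveness metric reduces to grouping both by a dimension and taking a ratio. A minimal sketch, assuming a views-per-recommendation metric and an in-memory event list (the real system runs such queries in the Online Analytics Service):

```python
# Sketch: combine RECOMMENDED and VIEWED events along a shared dimension
# (here, category) to compute a views-per-recommendation effectiveness ratio.
from collections import Counter

events = [
    {"tag": "RECOMMENDED", "category": "footwear"},
    {"tag": "RECOMMENDED", "category": "footwear"},
    {"tag": "RECOMMENDED", "category": "apparel"},
    {"tag": "VIEWED",      "category": "footwear"},
]

def effectiveness_by(dim, events):
    """Views per recommendation, grouped by the given dimension."""
    rec    = Counter(e[dim] for e in events if e["tag"] == "RECOMMENDED")
    viewed = Counter(e[dim] for e in events if e["tag"] == "VIEWED")
    return {value: viewed[value] / rec[value] for value in rec}

print(effectiveness_by("category", events))
# footwear: 1 view / 2 recommendations = 0.5; apparel: 0 / 1 = 0.0
```

The same grouping applied to a city dimension yields the effectiveness-by-city metric discussed below.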
Measuring the Effectiveness of the Recommender: Visualizations play an important role in assessing the effectiveness of machine learning models. Machine learning models are trained as batch jobs and deployed in a production environment, so it is essential to identify and react to trend changes in events to avoid revenue losses. Real-time visualization and application metrics play a significant role in identifying such changes early. We used the Online Analytics Service and visualizations to measure the effectiveness of the Spark-based recommender. We chose recommendations by category, effectiveness by city, and effectiveness by category as the business metrics for measuring model performance. Measurability, along with the capability to dynamically update machine learning models in the recommender data service, provides an effective and efficient way to iterate over training experiments with minimal time spent on development. This application is operable and has a low support cost.
Figure 3: Recommendations By Category
Figure 4: Effectiveness By Category
Figure 5: Effectiveness By City
Figure 6: System and Application Metrics for Spark Recommender Service
This implementation showcases DataTorrent's RTS product and the Apoxi framework included in it. It demonstrates how to operationalize and productize Spark-based machine learning. Customers can take this reference architecture and expand it to use multiple machine scoring models, with the recommender decision service making use of all the scores to decide on a recommendation. Additionally, by leveraging analytics across recommended and viewed events, enterprises can continually improve their models and ensure that they keep up with future trends. Other ML services include PMML-based models, as well as support for Python-based ML libraries. At DataTorrent, we continue to focus on robust, production-ready frameworks that enable customers to deploy ML models quickly.