What is Online Account Takeover?

In the online world, we all have digital identities that we use to log in to websites. Fraudsters steal our credentials and use them to impersonate us on these sites and commit fraud. Because accounts are centralized in these systems, the liability and impact of fraud have multiplied many times over.

In 2017, Bloomberg reported that Uber’s CSO and other executives were ousted after hackers gained access to over 57 million Uber accounts. Another infamous case was the hacking of email accounts belonging to Hillary Clinton’s presidential campaign staff. These examples are just the tip of the iceberg; online account takeover is far more pervasive. According to this Forbes report, account takeover activity is steadily on the rise.

In today’s internet era, fraudsters use a variety of sophisticated means, including malware, phishing, and wholesale data breaches, to gain access to account information. The organizations you trust with your personally identifiable information need to put strong security measures in place and keep them up to date in order to combat the latest techniques fraudsters use.

With the advent of mobile and other technologies, enterprises now have multiple channels of access: web, mobile, and brick-and-mortar stores. The source of a data breach can be anywhere in the world, and attack patterns have to be identified and defended against across all of these channels. A breach in one channel impacts them all, so an ATO detection in one channel needs to propagate the identity of the fraudster to every other channel in real time. Enterprises that are not equipped with an IT staff to handle these cases are far more susceptible to ATO attacks.

How DataTorrent Can Help Prevent Online Account Takeover

As part of DataTorrent’s AppFactory, we have released an Online Account Takeover (ATO) Prevention Application to address such account takeover detection use cases. More services can be added to the application to help enterprises take preventive action against further damage, and hooks are provided to detect compromised accounts and notify other applications about them in real time. Account breach information is published on DataTorrent’s application backplane (built over Apache Kafka), making it available across the entire enterprise.

Other groups can then act on breached accounts in real time, enabling enterprises to contain the impact of a breach across all channels, services, and products as soon as it is detected. At a high level, enterprises can process, enrich, analyze, and act upon multiple streams of account event information, in real time, to prevent account takeover in an omni-channel environment.
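
To make the backplane integration concrete, below is a minimal sketch, in plain Java, of publishing a hypothetical breach notification to a Kafka topic using the standard Kafka producer client. The broker address, topic name, and JSON payload are illustrative assumptions, not the application’s actual schema.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class BreachNotifier {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Hypothetical topic and payload; the real application defines its own schema.
                String payload = "{\"accountId\":\"u-1001\",\"reason\":\"GEO_MISMATCH\",\"ts\":1514764800000}";
                producer.send(new ProducerRecord<>("ato-breach-alerts", "u-1001", payload));
            }
        }
    }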

The platform provides real-time insights at both the system level (for DevOps) and the business level (for business analysts). The product is fault tolerant, highly available, and scalable, with end-to-end exactly-once semantics and zero data loss. The application can be customized by implementing custom data types and writing rules to suit your business needs, and data engineers can add machine scoring services as needed. Apoxi™ integrates well with machine learning infrastructure, enabling enterprises to set themselves up for adopting machine learning and artificial intelligence.

DataTorrent’s ATO Prevention Application

Let’s walk through the data services and components that are stitched together to construct the ATO application. The application has five major parts, two components and three services: the Ingestion component, the CEP/Rules Engine component, the CEP/Rules Workbench Service, the Online Analytics Service, and the Online Analytical Visualization Service.

Ingestion component: Data needs to be ingested from various sources and prepared for detecting account takeover. During the data ingestion and preparation phase, data from different sources, such as syslog, web server logs, and other user activity records, is published on Kafka, which acts as the input to the rest of the processing pipeline. The ingestion component prepares the data and converts it into a POJO against which account takeover rules are applied (a minimal sketch of such an event POJO follows this list). The tasks include:

  • Parse: In this phase, messages from syslog or any other log entries are gathered. In some cases, events can be emitted straight to a message bus from the front-end servers, bypassing the need to read the logs. This phase parses the input records and generates Java objects to be passed on for enrichment. During the parse phase, data is checked for errors and archived; both bad and good records are available in HDFS, and enterprises can switch to Amazon S3 or Azure Storage with minimal changes. DataTorrent’s AppFactory provides these components in its Microdata Services section. Archiving can also be achieved using the store-and-replay service available within the Apoxi framework. This feature is very valuable for future upgrades: a new version of the application can be certified by running it against replayed data and comparing its results with those of the version currently in production.
  • Enrich: Incoming records do not carry all the information needed for detecting account takeover. For example, a login record may contain the user ID but not user details such as name, address, usual geolocation, IP blacklist status, or typical user behavior. The missing values can be enriched from an external source so that the event record has all the data needed for analytics; these details are required for in-depth pattern analysis and other operations. The component supports multiple data sources for enrichment, such as JSON files and JDBC stores, and includes geo enrichment, which fills in location details based on the source IP address of a record. This information is useful for detecting fraudulent activity.
  • Omni-Channel: At times, enterprises may need to combine data streams from different channels before processing them for account takeover. This can easily be done by adding more ingestion components, and multiple data streams can be joined after processing by emitting enriched POJOs on the application backplane. Fraud processing can therefore be done on individual data streams or on omni-channel data streams with minimal development. Having a dedicated parse-and-enrich component per data stream makes it possible to support the different protocols, formats, and ingestion rates of omni-channel feeds.
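
To illustrate the kind of object the ingestion component produces, here is a minimal, hypothetical sketch in plain Java. The log format, field names, and in-memory lookups are illustrative assumptions; the actual application uses AppFactory parser and enrichment components and defines its own schema.

    import java.util.Map;

    // Hypothetical enriched login event; the real application defines its own POJO schema.
    public class LoginEvent {
        public String userId;
        public String sourceIp;
        public long timestampMs;
        public boolean success;
        public String country;      // filled in by geo enrichment
        public String homeCountry;  // filled in from a user-profile lookup

        // Parse a simple comma-separated log line: userId,sourceIp,timestampMs,SUCCESS|FAILURE
        public static LoginEvent parse(String line) {
            String[] fields = line.split(",");
            LoginEvent e = new LoginEvent();
            e.userId = fields[0];
            e.sourceIp = fields[1];
            e.timestampMs = Long.parseLong(fields[2]);
            e.success = "SUCCESS".equals(fields[3]);
            return e;
        }

        // Enrich the parsed event from in-memory lookups standing in for JDBC/JSON/geo sources.
        public void enrich(Map<String, String> ipToCountry, Map<String, String> userToHomeCountry) {
            this.country = ipToCountry.getOrDefault(sourceIp, "UNKNOWN");
            this.homeCountry = userToHomeCountry.getOrDefault(userId, "UNKNOWN");
        }
    }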

CEP/Rules Engine Component: The previous component prepares the data and makes it ready for detecting account takeover. The detection itself is done using rules provided by the data analyst. The Drools CEP/rules engine, licensed under Apache 2.0, is used for processing static rules. Example rules are provided for behavioral/pattern analysis that can indicate potential account hacking attempts to be acted upon, and enterprises can write custom rules as per business needs; execution is handled by the Rule Execution Operator of this component. For example, a login from a country different from the user’s home country, or multiple login failures within 10 minutes, can be flagged as suspicious. Such rules are written in Drools-supported formats. A strong feature added by the DataTorrent platform is parallel processing to get results in real time: traditionally, most rule execution engines do not support parallel processing and scale up rather than out. The Drools operator can be partitioned, which adds scale-out capability to the CEP/Rules engine component and provides better performance and scale.
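
As an illustration of the second example rule, here is a minimal plain-Java sketch of the condition it encodes, building on the hypothetical LoginEvent above. In the actual application such a rule would be written in a Drools-supported format and executed by the Rule Execution Operator; the class name and threshold here are assumptions.

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical detector for "multiple login failures within 10 minutes" on a single account.
    public class FailedLoginRule {
        private static final int MAX_FAILURES = 5;              // assumed threshold
        private static final long WINDOW_MS = 10 * 60 * 1000L;  // 10-minute sliding window
        private final Deque<Long> failureTimes = new ArrayDeque<>();

        // Returns true when this event pushes the account over the threshold.
        public boolean onEvent(LoginEvent e) {
            if (e.success) {
                return false;
            }
            failureTimes.addLast(e.timestampMs);
            // Drop failures that fall outside the sliding window.
            while (!failureTimes.isEmpty() && e.timestampMs - failureTimes.peekFirst() > WINDOW_MS) {
                failureTimes.removeFirst();
            }
            return failureTimes.size() >= MAX_FAILURES;
        }
    }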

CEP/Rules Workbench Service: The ATO application is integrated with the CEP/Drools Workbench Service, a customized version of the Drools Workbench. This service enables data scientists, data engineers, and business analysts to write rules; engineers can specify and share schemas, launch and upgrade rules, and manage the rule life cycle. The workbench UI is integrated with the rest of the RTS user interface and provides a seamless experience.

Online Analytics Service: The ATO application includes an online analytics service (OAS). Full OLAP analytics are available for business intelligence on fraud data, which can be sliced and diced along various dimensions. It is easy to add raw data, system metrics, and application metrics for OLAP analytics, and developers can add more custom metrics as needed and get OLAP analytics on them. For example, it is easy to add OLAP analytics on data parsing errors, which can pinpoint errors caused by a particular mobile device or app version. OAS leverages the Druid OLAP engine, available under the Apache 2.0 license, and includes tight integration with the data-in-motion platform, fault tolerance, high availability, hardening, and a seamless alignment of Druid scale-out with data-in-motion scale-out. OLAP analytics are available for both real-time and historical data.
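
As a rough illustration of what such a slice-and-dice query can look like, the sketch below issues a Druid SQL query over JDBC to count fraud alerts per country for the last hour. It assumes the Druid SQL (Avatica) endpoint is enabled and the Avatica JDBC driver is on the classpath; the broker URL, datasource name, and column names are hypothetical, and OAS users would normally work through the service and its dashboards rather than query Druid directly.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class FraudByCountry {
        public static void main(String[] args) throws Exception {
            // Assumed Druid broker address with the SQL (Avatica) endpoint enabled.
            String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 // Hypothetical datasource and columns: fraud alerts per country over the last hour.
                 ResultSet rs = stmt.executeQuery(
                     "SELECT country, COUNT(*) AS alerts "
                   + "FROM fraud_alerts "
                   + "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR "
                   + "GROUP BY country ORDER BY alerts DESC")) {
                while (rs.next()) {
                    System.out.println(rs.getString("country") + " -> " + rs.getLong("alerts"));
                }
            }
        }
    }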

Online Analytical Visualization Service: OAS comes integrated with an online analytical visualization service. This service includes both real-time and historical visualization dashboards, and users can create their own custom dashboards or integrate widgets into third-party dashboards as needed.

Developers can add a machine scoring service to the ATO application with minimal effort by leveraging the machine scoring support available in RTS. The results of machine scoring can then either be consumed as-is by other applications or be joined with the fraud results from the CEP/Rules Service. Developers can also write a machine scoring service in Spark. The results of machine scoring and CEP can be fed back to the machine learning infrastructure by using the store-and-replay feature of Apoxi or the HDFS/S3 archival components from AppFactory.
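
As a simple sketch of such a join, the hypothetical Java snippet below combines a per-account model score with a CEP rule verdict. The record shape, field names, and scoring threshold are assumptions, not part of the application’s actual API.

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical combination of a model score with a CEP rule verdict for the same account.
    public class ScoreRuleJoin {
        static class Verdict {
            final String accountId;
            final double modelScore;   // e.g. probability of takeover from the scoring service
            final boolean ruleFired;   // whether any CEP rule flagged the account
            Verdict(String accountId, double modelScore, boolean ruleFired) {
                this.accountId = accountId;
                this.modelScore = modelScore;
                this.ruleFired = ruleFired;
            }
            boolean suspicious() {
                // Assumed policy: flag if a rule fired or the model score crosses a threshold.
                return ruleFired || modelScore >= 0.8;
            }
        }

        public static Map<String, Verdict> join(Map<String, Double> scoresByAccount,
                                                Map<String, Boolean> ruleHitsByAccount) {
            Map<String, Verdict> out = new HashMap<>();
            scoresByAccount.forEach((account, score) ->
                out.put(account, new Verdict(account, score,
                        ruleHitsByAccount.getOrDefault(account, false))));
            return out;
        }
    }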

The short development time for DataTorrent’s ATO application is proof that we can enable customers to develop applications quickly by stitching together data services and components. Enterprises can add their own custom logic and integration points to this application, making it easy to integrate into their current product stack. Operationalizing this omni-channel application does not require major structural changes across the IT stack, so enterprises can achieve the desired business outcome of detecting and defending against account takeover attacks with fast time to market and low total cost of ownership. This matters: Gartner reported in November 2017 that it had underestimated the failure rate of big data projects in 2016 at 60%, and that in reality it is now around 85%. The main reason enterprises fail to realize successful outcomes from all the innovation in big data is a lack of productization and the operational difficulty of getting open source to work.

At DataTorrent, we ask what it would take for enterprise customers to achieve successful business outcomes with fast time to value and low total cost of ownership. With DataTorrent RTS, enterprises have been able to reduce the time it takes to launch a big data product from up to 18 months to 60 days. This increase in productivity also shows up as lower support costs and the ability to upgrade quickly as market needs change. Time to launch makes all the difference between failure and success.

Stay tuned for more components, services, and applications that we continue to add to AppFactory. Do contact us if you are interested in the ATO application or any other reusable component or service in AppFactory.