Apache Apex

Enterprise-grade unified stream and batch processing

Enterprise Grade, Fault Tolerant, Scalable

Apache Apex is a large-scale, high-throughput, low-latency, fault-tolerant platform for processing big data in motion, with correct processing guarantees and straightforward operation. As an enterprise-grade, Hadoop YARN native platform with a unified stream processing architecture, Apex can be used for both real-time and batch processing use cases. It also provides a simple API that lets users write or reuse generic Java code to build big-data applications.

Apache Apex provides features that similar platforms currently do not offer: fine-grained, incremental recovery that resets only the portion of a topology affected by a failure; elastic scaling based on the ability to acquire and release resources as needed; and the ability to alter topology and operator properties on running applications.

Event processing guarantees (end-to-end exactly once)
In-memory performance & scalability
Fault tolerance, state management, automatic recovery for all components
Event-time and out-of-order processing, rolling and tumbling window support
Hadoop-native YARN & HDFS implementation
Security and multi-tenancy
Elasticity with dynamic resource allocation

Why Apache Apex?

It's Enterprise Grade

Apex, a Hadoop YARN native platform, processes big data in motion in a way that is highly scalable, highly performant, fault tolerant, stateful, secure, distributed, and easy to operate. That means you'll have a powerful platform at your fingertips, not a steep learning curve.

It's Low Barrier-to-Entry

Apex provides an API that enables developers to write or reuse generic Java code. That's advantageous for you because you don't need to seek specialty experience to craft big-data applications, yet you'll reap all the benefits of real-time data integration and analytics.

It Speeds Time-to-Market

Accelerate the development of business logic with the Apache Apex platform. Apex streamlines the development and productization of Hadoop applications, shortening the path from concept to reality.

It's Modular

The Apex platform comes with Malhar, a library of operators that can be leveraged to quickly create new and non-trivial applications. This includes the adapters (or connectors) that integrate with many existing technologies as data sources and destinations, like Kafka, HDFS, Cassandra, MongoDB, Oracle, Hive, HBase, Vertica, and many more.