This post recaps the presentation given by the PubMatic team at the Apex Meetup on Nov 19, 2015, at the MapR office in San Jose.

In this meetup, Dev Tagare from PubMatic shared the company's real-time use cases and showed how PubMatic supports real-time and near-real-time business decision making using DataTorrent and Apex. For background, PubMatic is a leading marketing automation software company for publishers that helps them monetize their digital assets. Through real-time analytics, yield management, and workflow automation, PubMatic enables publishers to make smarter inventory decisions and improve revenue performance.

Here is a visual of the ad technology and online advertising value chain. As you can see, it is a fairly complex value chain with many players involved, and depending on which side of the fence you are on (demand side or supply side), different incentives and drivers create value. Just as important, decisions and ad serving happen in under a second.


PubMatic's vision was to build on its leadership and keep adding value for its customers, who are primarily publishers. Before DataTorrent and Apache Apex, PubMatic had no real-time offering. Publishers and buyers made inventory decisions on data that was hours or even days old. In the fast-moving world of ad tech, where insights are highly perishable, near-real-time data is tremendously valuable to publishers trying to optimize and maximize their revenue from advertisers. With DataTorrent, PubMatic was able to bring its first real-time solution to market, which enabled reporting of critical metrics around campaign monetization. As the project and product evolved, the next offering was real-time streaming, which reduced latency from about 20 minutes to under a minute. This again reflects the point that insights in this industry are so perishable that quicker access to data is critical.

[Figure: PubMatic, before and after]

Our guest speaker Dev Tagare, Architect at PubMatic, also walked through the architecture of the Real Time Platform and showed how the different pieces fit together. He talked about the operators that PubMatic uses. Operators are the building blocks that developers use to assemble an application on the DataTorrent / Apache Apex platform. The main operators used in this solution are:

  • S3 reader (File Input Operator)
  • Kafka Input Operator
  • Dimension Store
  • HDHT
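To give a feel for what a dimension store does with the events coming from the input operators, here is a minimal Python sketch. It is not the Apex or DataTorrent API and not PubMatic's implementation. The class name `DimensionStore`, the event fields (`ts`, `publisher`, `advertiser`, `impressions`, `revenue`), and the one-minute bucket size are all assumptions made for illustration. The idea is simply to roll incoming events up by time bucket and a set of dimension keys so that a report can be answered from small pre-aggregated totals.

```python
from collections import defaultdict


def minute_bucket(ts_seconds):
    """Truncate a Unix timestamp (in seconds) to the start of its minute."""
    return ts_seconds - (ts_seconds % 60)


class DimensionStore:
    """Hypothetical in-memory roll-up keyed by (minute bucket, dimension values)."""

    def __init__(self, key_fields):
        # key_fields: names of the event fields used as dimensions (assumed names).
        self.key_fields = list(key_fields)
        self.aggregates = defaultdict(lambda: {"impressions": 0, "revenue": 0.0})

    def add(self, event):
        """Fold one event into the running totals for its bucket and dimensions."""
        key = (minute_bucket(event["ts"]),) + tuple(event[f] for f in self.key_fields)
        totals = self.aggregates[key]
        totals["impressions"] += event["impressions"]
        totals["revenue"] += event["revenue"]

    def query(self, bucket, *dims):
        """Return a copy of the totals for one bucket and dimension combination."""
        empty = {"impressions": 0, "revenue": 0.0}
        return dict(self.aggregates.get((bucket,) + dims, empty))


# Example usage with made-up events; in a real pipeline these would arrive
# from a file reader or a message queue rather than a Python list.
store = DimensionStore(["publisher", "advertiser"])
sample_events = [
    {"ts": 120, "publisher": "pub1", "advertiser": "adv1", "impressions": 1, "revenue": 0.5},
    {"ts": 150, "publisher": "pub1", "advertiser": "adv1", "impressions": 2, "revenue": 0.25},
    {"ts": 185, "publisher": "pub1", "advertiser": "adv1", "impressions": 1, "revenue": 1.0},
]
for event in sample_events:
    store.add(event)

print(store.query(120, "pub1", "adv1"))
```

The first two sample events fall into the same one-minute bucket and are summed together, while the third lands in the next bucket. A production system would also persist these aggregates and expire old buckets, which this sketch leaves out.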

After the meetup, while we were networking, the feedback from attendees was very positive. Some said they had not realized how intricate and complex the ad tech space is. Others found the meetup very educational, and several commented that this was a very good example of how valuable real-time insights are.

I would like to extend my special thanks to Dev Tagare from PubMatic for being so gracious with his time and for coming to present what PubMatic is doing, which is truly disruptive. I would also like to thank the MapR team for being very gracious hosts, and especially Alicia Alvarez for helping make this meetup a success.

Join our Apex Meetup (please join the group for your geographic location)

Join our Apex community

Download DataTorrent Sandbox here

Download DataTorrent Enterprise Edition here