Most companies use an Extract, Transform, Load (ETL) model for analytics that was likely implemented a good handful of years back with the then-established technologies. Throwing everything blindly into a data lake, poking at the data, and calling the job sorted is analogous to doing the robot dance at a family gathering and thinking you’re the bomb.
When I first got to DataTorrent, I saw a blog post about the value of, and preference for, micro-batch and “near real-time” streaming. The author suggested that it was prudent to stay with micro-batch streaming because “speed can kill” and micro-batch was good enough.
But surely “near real-time” is good enough?
Look, there are cases where at-rest, batch, or micro-batch processing does what you need. But in today’s digital economy, when the difference between winning and losing is measured in milliseconds (ms) and not days, we would be kidding ourselves if we believed “near real-time” was always good enough. This fallacy is perpetuated by integrators and vendors pushing to prolong yesterday’s architectures, or by clients subscribing to “if it ain’t broke, don’t fix it”.
Boil business down to its most basic elements and it’s all about Money-In versus Money-Out: you need to make more money than you spend, and you need to continually focus on improving both ends. Beating your competition, getting clients to spend more per purchase, and securing your assets and theirs all rely on technology and analytics to be smarter and faster.
My dad was an avid fisherman. I was dragged to the Lake District to fly fish, to the canals of Halifax, to the pier in Nice, and even to a spot of trout tickling. In each case, during my dismal attempts to apply old techniques, I heard the words “come on son, do it right; it’s as easy as shooting fish in a barrel.”
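To put a rough number on “near real-time”, here is a toy latency model in Python. It is only a sketch under stated assumptions: the 500 ms batch interval and 1 ms processing cost are made-up figures, the function names are hypothetical, and nothing here measures any real engine. It simply shows that in a micro-batch design an event waits for its batch window to close before anything can act on it.

```python
import math

# Assumed, illustrative figures; not measurements of any product.
BATCH_INTERVAL_MS = 500
PROCESSING_MS = 1.0


def per_event_latency_ms(processing_ms: float = PROCESSING_MS) -> float:
    """Event-at-a-time model: the event is handled as soon as it arrives."""
    return processing_ms


def micro_batch_latency_ms(arrival_ms: int,
                           batch_interval_ms: int = BATCH_INTERVAL_MS,
                           processing_ms: float = PROCESSING_MS) -> float:
    """Micro-batch model: the event waits until its batch window closes."""
    window_index = math.floor(arrival_ms / batch_interval_ms)
    batch_close_ms = (window_index + 1) * batch_interval_ms
    return (batch_close_ms - arrival_ms) + processing_ms


if __name__ == "__main__":
    for arrival in (0, 120, 499):
        print(arrival, per_event_latency_ms(), micro_batch_latency_ms(arrival))
```

In this simplified model the wait depends entirely on where in the window the event lands, and for evenly spread arrivals it averages about half the batch interval. Whether that is “good enough” depends on how many milliseconds your use case can afford.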
So why change? Why move my process and actions into the data stream? Surely it won’t gain me much.
Yes and No:
Regarding the use case of monitoring fraudulent credit card usage, with an ETL and poke model you get Fraud Detection. By shifting left you practice Fraud Prevention!
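The difference between detection and prevention can be sketched in a few lines of Python. This is a hedged illustration, not how any particular product implements it: the `Txn` type, the `LIMIT` threshold, and the `is_suspicious` rule are all hypothetical stand-ins for a real fraud model. The point is only where the check sits relative to the approval.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Txn:
    card: str
    amount: float


LIMIT = 1000.0  # hypothetical rule: anything above this is suspicious


def is_suspicious(txn: Txn) -> bool:
    """Stand-in for a real fraud model (assumption for illustration)."""
    return txn.amount > LIMIT


def etl_and_poke(txns: Iterable[Txn]) -> Tuple[List[Txn], List[Txn]]:
    """Store first, analyze later: every transaction is already approved
    by the time the suspicious ones are flagged (detection)."""
    approved = list(txns)
    flagged = [t for t in approved if is_suspicious(t)]
    return approved, flagged


def in_stream(txns: Iterable[Txn]) -> Tuple[List[Txn], List[Txn]]:
    """Check each transaction as it flows past: suspicious ones are
    blocked before approval (prevention)."""
    approved, blocked = [], []
    for t in txns:
        (blocked if is_suspicious(t) else approved).append(t)
    return approved, blocked


if __name__ == "__main__":
    stream = [Txn("a", 20.0), Txn("a", 5000.0), Txn("b", 75.0)]
    print(etl_and_poke(stream))
    print(in_stream(stream))
```

Both functions apply the same rule to the same data. In the first the suspicious purchase has already gone through when it is found; in the second it never does.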
Speaking of point of purchase…
“Mr. Churchward, thank you for purchasing these rather fetching gold toe socks. We did notice that your wife had browsed a pair of Fry Boots which we know she doesn’t have yet and would certainly augment her already splendid shoe collection. We actually have her size and color in stock and I am sure it would be a lovely surprise for her so I wanted to see if you were interested in picking them up now before I conclude this transaction?”
Blending customer loyalty, internet activities, and physical purchases is happening here and now.
“Mr. Churchward, thank you for calling and booking an appointment with the Podiatrist. I also notice that you have a couple of routine preventative visits that are overdue. I have a hold on the relevant doctors’ calendars back to back and if you could possibly extend your visit with us for another 30 minutes then we can get them all squared away in one trip?”
The reality is that “shifting left” isn’t just a nice idea; it enables you to harness the real power of Customer 360° in NOW time.
Here’s a link to an application and a slight homage to my dad’s fateful words. https://www.datatorrent.com/apex-performance-demo
I did take it one step further to illustrate how fundamental the difference is between analytics in motion and analytics after storage. The application abstracts this into an analogy: three data pipelines represent three hungry people fending for themselves by fishing in a stream. The faster they see and catch the fish, the more likely they are to survive and thrive.
The three data pipelines representing these three people fishing are:
1. Apache Apex on Data-In-Motion
2. Apex serving an ETL and poke model (same technology, just applying analytics and actions on stored data)
3. Apache Spark Streaming
Safe harbor statement: DataTorrent fully embraces and deploys the KASH (Kafka, Apex, Spark, HDFS) stack as the du jour building blocks of any great analytics platform. So this isn’t banging on the competition or on engines; rather, it’s articulating what you’re missing by not running analytics and taking actions on data-in-motion.
These are live, real pipelines, all optimized as they would be in production. Think this is staged or not possible? Then ping us: we’ll peel away the covers, and we’ll even assist you in getting an application into production in under 60 days. If we can’t for some reason, then we’ll credit you for all services for the work we have undertaken on said project.
“Shift Left” and take advantage: the difference matters!