A number of months ago I joined one of the most interesting start-ups in Silicon Valley. The company, called DataTorrent, develops a Hadoop-based computation platform for processing real-time streaming data… big data. I lead product management here.

I have worked for companies developing enterprise- and telco-grade products for the last 16 years, so real time, scalability, and even “big data” are not exactly new terms for me. However, here we are doing things at a scale that was unthinkable just a few years ago. I will blog not only about the platform, but also about some of the interesting facts I am learning as part of the exciting experience of building it.

Some time ago, one of our engineers ran a benchmark to see how well the system scaled. After finding and resolving several bottlenecks, he could see the system scale almost linearly before hitting a hardware barrier and flattening out. If you are interested, see more details in the next blog entry. Real-time stream computations are CPU-bound, so we expected to run out of cores at some point. However, we hit the wall much sooner than we expected. It took us some time to figure out what had happened, and I thought it might be interesting to others as well. Essentially, it was about logical and physical CPUs. For those who are not familiar with the difference, here is the story.

More than 10 years ago Intel introduced a new feature in its microprocessors, called Hyper-Threading. It adds transistors that replicate some parts of a CPU core, but these are not “real” or “physical” cores. However, the operating system (Windows or Linux) actually sees two “virtual” or “logical” cores for each processor core that is physically present. You will not see it in the redesigned Windows 8 Task Manager, which shows only composite CPU utilization. To see it, one needs to use either an external utility (like the free CPU-Z) or the command line. On my machine it looks like this:

[Screenshot: Task Manager]

[Screenshot: CPU-Z]

BTW, for the nerds amongst us, it is also possible to check using the ‘WMIC CPU Get’ command in a Command Prompt window.

[Screenshot: CPU]

Of course, the machines we use in our cluster are much more powerful and (surprise!) run Linux, not Windows. On each of them, the Linux commands ‘nproc’ (show the number of processors) and ‘lscpu’ report 24 (yes, twenty-four) CPUs. And this was THE reason for the initial confusion.

Looking a bit deeper, we could see that each machine actually has 2 CPUs with 6 hyper-threaded cores each, so the total number of physical cores is 12 (2 x 6) while the total number of logical cores is 24 (2 x 6 x 2).
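
For anyone who wants to tell the two numbers apart programmatically, here is a minimal Python sketch. It assumes x86 Linux, where /proc/cpuinfo lists a ‘processor’, ‘physical id’ and ‘core id’ field for every logical CPU. The function names count_cpus and sample_cpuinfo are made up for this post, and the sample text only imitates a machine like ours (2 sockets, 6 cores each, 2 threads per core); it is not real output from the cluster.

```python
def count_cpus(cpuinfo_text):
    """Return (logical, physical) core counts from /proc/cpuinfo-style text.

    Assumes the x86 Linux layout, where each logical CPU block carries
    'processor', 'physical id' (socket) and 'core id' fields.
    """
    logical = 0
    physical = set()  # unique (socket, core) pairs
    socket = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            socket = value
        elif key == "core id":
            physical.add((socket, value))
    return logical, len(physical)


def sample_cpuinfo(sockets=2, cores=6, threads=2):
    """Build illustrative cpuinfo-like text (hypothetical, not real output)."""
    blocks = []
    n = 0
    for s in range(sockets):
        for c in range(cores):
            for _ in range(threads):
                blocks.append(
                    f"processor\t: {n}\nphysical id\t: {s}\ncore id\t\t: {c}\n"
                )
                n += 1
    return "\n".join(blocks)


logical, physical = count_cpus(sample_cpuinfo())
print(logical, physical)  # 24 12
```

On a real Linux box one would pass open("/proc/cpuinfo").read() instead of the sample text. Other architectures may not expose these fields, in which case the physical count comes back as 0.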

[Screenshot: cluster]

Coming back to the original story: when we substituted the correct number of physical cores into Excel, we saw that having more (very CPU-intensive) threads than physical cores was still beneficial, until the ratio of threads to physical cores hit a threshold of 1.25. Of course, we had many other background threads (OS, JVM, …), but they were mostly idle.
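
To make the arithmetic concrete, here is a small Python sketch. It assumes that the 1.25 figure is the ratio of CPU-intensive threads to physical cores, and it plugs in a node like the one described above (2 sockets with 6 cores each). The function names are made up for this post, and the resulting thread count is just an illustration of the ratio, not a number taken from the benchmark.

```python
def threads_per_physical_core(cpu_threads, sockets, cores_per_socket):
    """Ratio of CPU-intensive threads to physical cores on one node."""
    return cpu_threads / (sockets * cores_per_socket)


def threads_at_ratio(ratio, sockets, cores_per_socket):
    """Number of CPU-intensive threads that corresponds to a given ratio."""
    return ratio * sockets * cores_per_socket


# Assumed node layout: 2 sockets x 6 physical cores = 12 physical cores.
print(threads_at_ratio(1.25, sockets=2, cores_per_socket=6))        # 15.0
print(threads_per_physical_core(15, sockets=2, cores_per_socket=6))  # 1.25
```

Had we kept dividing by the 24 logical CPUs that ‘nproc’ reports, the same 15 threads would have looked like a ratio of only 0.625, which is how the wall appeared to arrive too early.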

So, does Hyper-Threading really help? It depends. When one searches the internet, there are some users reporting significant gains, usually when the processing is IO-bound. In our benchmark, where the bottleneck is mostly in processing (CPU-bound), the gain is relatively low. We have confirmed that Hyper-Threading is not just “hype” and that it does provide more horsepower, even for CPU-intensive tasks like ours. However, do not expect any wonders: a logical CPU is still not a substitute for a real (read “physical”) one.