Big Data is getting really hot these days, and along with it, some of its pet terms. One of those terms is “In-Memory Analytics.”
All of a sudden EVERYBODY, be they providers of data warehouse appliances, OLAP tools, data visualization tools, or other business intelligence derivatives, is claiming to have “In-Memory Analytics,” even if what they offer is not truly “in memory,” or (for that matter) not really “analytics.”
While I dislike marketing that is deliberately misleading, I see how this situation developed. A lot of customers have hopped on the bandwagon and are specifically asking for “In-Memory Analytics” by name. But the thing is, no matter how they phrase it, the actual customer problem is not “I need in-memory analytics.” Instead, the problem is “I want to analyze my data with near-instantaneous response.” Thus, vendors who have covered at least some part of this problem are inclined to just slap the “In-Memory Analytics” label on their wares, so that they can at least be considered, even if the term is not technically accurate.
Strictly speaking, “In-Memory Analytics” refers to one specific technical approach to the instantaneous analysis problem: the entire dataset is pulled into the server’s RAM in order to greatly speed up analytic queries.
It works great, but In-Memory Analytics is not the end-all-be-all approach by any means. It doesn’t deserve to be synonymous with “instantaneous analysis,” because:
- Other technical approaches may also provide solutions to this problem.
- “In-memory” cannot solve the problem at all for data sets that are too large to fit in memory.
More color on the first point: In the next few years, we’re going to see lots of new approaches and technologies that provide the user with instantaneous insight into large data sets. For example, MPP data warehouses, Hadoop, MapReduce, and other technologies will combine and be refined to give the user (the analyst) instant understanding of their data – regardless of whether all the necessary data is “in-memory” or not.
And on the second point: no matter what, we will always want to analyze data sets that are larger than what can be stored in memory alone. The amount of data in the world is growing faster than Moore’s law and memory capacity improvements. (Don’t believe me? Take your favorite “in-memory” analysis problem and add a requirement for LOTS of historical data.) Further, for the foreseeable future, disk storage will continue to cost a tiny fraction of memory storage. Right now, Amazon Web Services charges about $0.10 per GB on disk, while memory costs 240 times as much, about $24.00 per GB. So, expect lots of situations where the entire data set is huge — so huge that most of it is on cheap disk and only the most frequently accessed data is in-memory. In-Memory Analytics alone won’t cut it.
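To make that price gap concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two per-GB prices quoted above; the 10 TB data set size, the 5% "hot" fraction, and the `storage_cost` helper are assumptions invented for illustration, not figures from any vendor.

```python
# Per-GB prices as quoted in the post (USD). Treat them as illustrative,
# not as current pricing.
DISK_USD_PER_GB = 0.10
MEMORY_USD_PER_GB = 24.00


def storage_cost(total_gb: float, hot_fraction: float) -> float:
    """Cost of holding `hot_fraction` of the data in memory and the rest on disk.

    Hypothetical helper: assumes a simple two-tier split with no replication
    or other overhead.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    hot_gb = total_gb * hot_fraction
    cold_gb = total_gb - hot_gb
    return hot_gb * MEMORY_USD_PER_GB + cold_gb * DISK_USD_PER_GB


if __name__ == "__main__":
    total_gb = 10_000  # assumed 10 TB data set, chosen only for illustration
    scenarios = [
        ("all on disk", 0.0),
        ("5% in memory", 0.05),  # assumed share of frequently accessed data
        ("all in memory", 1.0),
    ]
    for label, fraction in scenarios:
        print(f"{label:>14}: ${storage_cost(total_gb, fraction):,.2f}")
```

At these prices, holding everything in memory costs 240 times as much as holding everything on disk, whatever the data set size. Keeping only a small hot slice in memory lands far closer to the disk-only figure, which is why the hybrid setup is the one to expect.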