Friday 18 September 2015

Hadoop use cases: research underway to identify common themes

Within the next week or so I should conclude a piece of research that summarises the published customer case studies for Hadoop adoption. It's been a fascinating project to work on. With all the hype around Big Data and technologies like Hadoop, it's often difficult to get a clear, objective view of real usage. The research has examined close to 200 customer case studies to identify common themes in adoption, usage and benefits, with the aim of providing a reference for those looking to adopt Hadoop. The initial findings have produced some interesting insights:

  • whilst much of the focus of Hadoop has been around the perception of it providing a low cost analytical platform, driven by a combination of its open source foundation and use of commodity hardware, this is not the most referenced benefit
  • instead, it is scalability that is most commonly quoted, with almost two-thirds (65%) of documented customer stories highlighting this as a key factor in adopting Hadoop. A common driving factor here is that organisations identify a need to retain more history than they could previously handle, or to explore new data sources that have inherently high volumes of data
  • the next most reported benefit, and directly related to scalability, was speed of analytics (the time taken to run queries), with 57% of descriptive case studies highlighting this advantage
  • the 'cost driver' comes in third place, with 39% of customers specifically highlighting the savings in adopting Hadoop

What's always interesting when looking at factors like scalability of analytics and speed of queries is to understand 'in comparison to what'. Many of the use cases undertook comparative benchmarks against other technologies, but many others migrated up from other platforms, most commonly MySQL or SQL Server (by number of customers). In these latter cases, the 'faster' argument is always a bit thin; new commodity hardware is always going to be more performant than older solutions. This is another reason why it's so difficult to get objective information on user adoption of technologies like Hadoop.

I'll add another blog post here once the research summary is available, with a link for download. Alternatively, drop me an email at kevin [at] datafit.co.uk and I'll send it over when it's ready (your email address won't be added to a mailing list or distributed further).
