It could be said that claiming something is ‘big’ borders on the subjective, which is part of what makes a term like ‘big data’ so intriguing. The buzzword appears to be a recent addition to the ICT dialogue, but what does it really mean? Various groups believe they have a handle on it. Comb the internet and you’ll see the term being sponsored by companies such as NetApp, EMC and IBM as vendors that manage big, or large amounts of, data. Does this mean these companies are positioning themselves to capitalise on this slice of the pie, or are they still trying to ascertain what it really means in a commercial sense before it’s more widely adopted by the market?
If we start with Wikipedia’s definition of big data, it is “a collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools”.
So I wonder if this is why that sales order report I ran last week took so long? Was my database considered big data?
The reality is that most SME businesses in Australia have databases not much larger than 100GB. On the grander scale, this is quite tiny.
As it is early days, big data sizes are a constantly moving target, but as of 2012 it is thought that they range from a few dozen terabytes to a few petabytes or even an exabyte of data in a single data set.
This of course is considerably larger than the 100GB referred to earlier. To put it in perspective, one terabyte is 1000GB, a petabyte is 1000 terabytes, and an exabyte is 1000 petabytes. This means that a 100GB dataset is just 0.01 per cent of a petabyte.
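As a quick sanity check, the unit arithmetic above can be verified in a few lines (the figures are the ones quoted in this article, using decimal units where 1TB = 1000GB):

```python
# Decimal storage units, as used in this article
GB = 1
TB = 1000 * GB          # terabyte
PB = 1000 * TB          # petabyte
EB = 1000 * PB          # exabyte

dataset = 100 * GB      # a typical Australian SME database

# What share of a petabyte is a 100GB database?
share_of_petabyte = dataset * 100 / PB
print(f"{share_of_petabyte} per cent of a petabyte")  # prints "0.01 per cent of a petabyte"
```

The multiplication is done before the division so the result comes out as an exact 0.01 rather than a long floating-point tail.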
So in practical terms, what is an example of big data?
If we look at one of the largest retailers in the world, Walmart, which handles more than 1 million transactions per hour, its databases are estimated to be more than 2.5 petabytes in size. At this scale, managing them effectively represents a core processing requirement as well as a massive opportunity for the business – if it has the tools to process the data.
Imagine how valuable this information would be, given it contains customer buying habits across the wider enterprise which could be used for benchmarking or supplier purchasing negotiations.
This data also has the potential to be sliced and diced in a business intelligence tool by territory, brand, season, style, colour, size, quantities, values, discounts – and the list goes on.
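To illustrate the kind of slicing a BI tool performs under the hood, here is a minimal sketch in Python with pandas. The transaction fields and figures are invented for illustration, not any retailer’s actual schema:

```python
import pandas as pd

# Hypothetical point-of-sale transactions (invented data)
sales = pd.DataFrame({
    "territory": ["NSW", "NSW", "VIC", "VIC"],
    "brand":     ["Acme", "Acme", "Acme", "Bolt"],
    "season":    ["Summer", "Winter", "Summer", "Summer"],
    "quantity":  [10, 5, 8, 12],
    "value":     [250.0, 125.0, 200.0, 360.0],
})

# "Slice and dice": total units and revenue by territory and brand
summary = (sales.groupby(["territory", "brand"])[["quantity", "value"]]
                .sum()
                .reset_index())
print(summary)
```

Swapping the grouping columns for season, style or colour gives the other cuts mentioned above; at Walmart’s scale the same operation simply runs over billions of rows on distributed infrastructure rather than an in-memory data frame.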
A business of this size can also launch far more efficient marketing and sales campaigns, and even sell the data back to vendors looking to better understand their sell-through percentages.
Further examples of where big data is common include social data collated via social networks, web logs, internet search indexing, call detail records, and large scale e-commerce.
Some of the industries that will benefit include astronomy, chemistry, genomics, biology and other complex scientific research, military surveillance, medical records, and photography and video, including surveillance archives.
Quoting a recent Cisco forecast, by the end of 2015 global data centre IP traffic will reach 4.8 zettabytes annually, or 402 exabytes per month. In the terminology above, that equates to 4.8 billion terabytes of potentially untapped data out there.
From an Australian perspective, there are probably only a handful of retailers collecting enough data for it to be considered big.
Here we are referring to the data collected via Coles’ FlyBuys, Myer’s Myer One, and Woolworths’ Everyday Rewards card systems.
These programs are generating large and continually expanding data sets, the purpose being to use this data to predict customer buying patterns, which can in turn be used to personalise offers (market) to relevant participants.
You may have seen emails or SMS messages with offers that speak directly to you.
Most retailers I speak to agree they want to understand more about their customers via analytics but are struggling to make sense of their most basic current data, including customer details and purchase history.
Some are working on CRM initiatives to capture more attributes and buying patterns.
Take, for example, the value gained if retailers could capture and analyse, via their websites, which products customers are searching for, and draw a direct correlation to store stockholding to support their omni-channel infrastructure.
It’s probably fair to say that another issue facing SME retailers is that they may lack the capability or the capital, and are struggling to identify the technologies and skills required to convert their existing data into actions.
This year CeBIT held a big data conference in Australia, which indicates to me that there’s a real need to analyse its impact and potential options from a technology perspective.
My take is that the message for all retailers is to explore methods now for capturing and storing more data, and to keep abreast of developments around processing it, even if yours isn’t considered ‘big data’.
* Stephen Duncan is product manager retail and CRM at Pronto Software. He can be contacted at stephen.duncan@pronto.net.