According to EMC's study, conducted along with IDC, big data became a big topic across nearly every area of IT last year. IDC defines big data technologies as a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data by enabling high-velocity capture, discovery, and/or analysis. There are three main characteristics of big data: the data itself, the analytics of the data, and the presentation of the results of the analytics. Then there are the products and services that can be wrapped around one or all of these big data elements.
The digital universe itself, of course, comprises data of all kinds. However, the vast majority of new data being generated is unstructured. This means that, more often than not, we know little about the data unless it is somehow characterized or tagged, a practice that results in metadata. Metadata is one of the fastest-growing subsegments of the digital universe (though metadata itself is a small part of the digital universe overall). We believe that by 2020, a third of the data in the digital universe (more than 13,000 exabytes) will have big data value, but only if it is tagged and analyzed.
Not all data is necessarily useful for big data analytics. However, some data types are particularly ripe for analysis, such as:
- Surveillance footage. Typically, generic metadata (date, time, location, etc.) is automatically attached to a video file. However, as IP cameras continue to proliferate, there is greater opportunity to embed more intelligence into the camera (on the edge) so that footage can be captured, analyzed, and tagged in real time. This type of tagging can expedite crime investigations, enhance retail analytics for consumer traffic patterns, and, of course, improve military intelligence as videos from drones across multiple geographies are compared for pattern correlations, crowd emergence and response, or measuring the effectiveness of counterinsurgency.
- Embedded and medical devices. In the future, sensors of all types (including those that may be implanted into the body) will capture vital and non-vital biometrics, track medicine effectiveness, correlate bodily activity with health, and monitor potential outbreaks of viruses, all in real time.
- Entertainment and social media. Trends based on crowds or massive groups of individuals can be a great source of big data to help bring to market the "next big thing," help pick winners and losers in the stock market, and yes, even predict the outcome of elections, all based on information users freely publish through social outlets.
- Consumer images. We say a lot about ourselves when we post pictures of ourselves or our families or friends. A picture used to be worth a thousand words, but the advent of big data has introduced a significant multiplier. The key will be the introduction of sophisticated tagging algorithms that can analyze images either in real time, when pictures are taken or uploaded, or en masse after they are aggregated from various Web sites.
These are in addition, of course, to the normal transactional data running through enterprise computers in the course of everyday data processing. "Candidates for Big Data" illustrates the opportunity for big data analytics in just these areas alone.
All in all, in 2012, we believe 23% of the information in the digital universe (or 643 exabytes) would be useful for big data if it were tagged and analyzed. However, technology is far from where it needs to be, and in practice, we think only 3% of the potentially useful data is tagged, and even less is analyzed.
Call this the "Big Data gap": information that is untapped, ready for enterprising digital explorers to extract the hidden value in the data. The bad news: This will take hard work and significant investment. The good news: As the digital universe expands, so does the amount of useful data within it.
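
To put rough numbers on that gap, the 2012 estimates quoted above can be run through a few lines of arithmetic. This is only a back-of-the-envelope sketch in Python: the three constants come from the text, while the function names `big_data_gap` and `implied_total` are made up for illustration and are not part of the study.

```python
# Figures quoted in the text (estimates for 2012).
USEFUL_SHARE = 0.23   # share of the digital universe useful for big data
USEFUL_EB = 643       # the same share expressed in exabytes
TAGGED_SHARE = 0.03   # share of the potentially useful data that is tagged


def big_data_gap(useful_eb: float, tagged_share: float) -> dict:
    """Split the potentially useful data into tagged and untapped exabytes.

    Hypothetical helper for illustration only.
    """
    tagged = useful_eb * tagged_share
    return {"tagged_eb": tagged, "untapped_eb": useful_eb - tagged}


def implied_total(useful_eb: float, useful_share: float) -> float:
    """Size of the whole digital universe implied by a share and its volume.

    Hypothetical helper for illustration only.
    """
    return useful_eb / useful_share


gap = big_data_gap(USEFUL_EB, TAGGED_SHARE)
total_2012 = implied_total(USEFUL_EB, USEFUL_SHARE)

print(f"Implied digital universe, 2012: {total_2012:,.0f} EB")
print(f"Tagged: {gap['tagged_eb']:.1f} EB, untapped: {gap['untapped_eb']:.1f} EB")
```

Under these figures, roughly 19 of the 643 potentially useful exabytes are tagged, leaving about 624 exabytes untapped, and the implied size of the whole digital universe in 2012 is about 2,800 exabytes. Because the text says even less is analyzed than is tagged, the tagged figure should be read as an upper bound on what is actually being analyzed.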