Driven by the rising use of rich media and e-mail applications, and by policies that favour storing data in more than one place, unstructured data in organisations is growing at an astounding rate. According to a recent IDC report, the growth of unstructured data in traditional data centres will eclipse that of transaction-based data, which has so far accounted for the bulk of enterprise storage needs. The report further projects that transactional data will see a compound annual growth rate (CAGR) of 21.8%, while unstructured data in data centres will grow at 61.7%.
Moreover, critical data is estimated to grow 52% annually. This has pushed IT departments to work harder to minimise the total storage and network bandwidth required, improve availability, and lower the total cost of ownership (TCO) in terms of hardware, administration, and environmental costs. And that's where tools like data de-duplication come to the rescue. Often referred to as a cost-effective information management tool, the technology helps enterprises address pressing IT challenges related to the effective use of storage.
In simple terms, data de-duplication, also called intelligent compression, refers to the elimination of redundant data. It acts as a catalyst in controlling storage costs and helps enterprises simplify operations and better manage workloads at remote offices, virtual machines, and data centres. By eliminating the need for additional hardware, it brings economic benefits as well. Let's understand how de-dup can solve the puzzle of the ever-growing volume of data stored in an organisation. Consider this: a marketing manager sends a 1MB presentation to each member of his sales team, putting the same presentation in 5-10 different mailboxes. On the enterprise storage and backup systems, the same file may then be copied to as many as 10 locations, occupying 10MB of space, a redundancy that is of no use. Running a de-duplication process deletes the duplicates and leaves only one copy of the data to be stored, freeing up storage space. The process essentially does three things: redundancy identification, fingerprinting and redundancy elimination.
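The presentation example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the SHA-256 fingerprint, the `deduplicate` helper and the `rep0` to `rep9` mailbox names are all hypothetical choices made for the example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Fingerprinting: reduce the content to a fixed-size digest (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(mailboxes: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep one physical copy per distinct fingerprint; every mailbox keeps a reference."""
    store: dict[str, bytes] = {}       # fingerprint -> the single stored copy
    references: dict[str, str] = {}    # mailbox owner -> fingerprint of its attachment
    for owner, attachment in mailboxes.items():
        fp = fingerprint(attachment)
        if fp not in store:            # redundancy identification
            store[fp] = attachment     # first time this content is seen: store it
        references[owner] = fp         # redundancy elimination: point at the stored copy
    return store, references


MB = 1024 * 1024
presentation = b"x" * MB               # stand-in for the 1MB presentation
mailboxes = {f"rep{i}": presentation for i in range(10)}  # hypothetical mailbox names

store, references = deduplicate(mailboxes)
logical_mb = sum(len(data) for data in mailboxes.values()) / MB
physical_mb = sum(len(data) for data in store.values()) / MB
print(f"logical: {logical_mb} MB, physical: {physical_mb} MB")
# prints "logical: 10.0 MB, physical: 1.0 MB"
```

All ten mailboxes still resolve to the presentation through their reference, but only one copy occupies space.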
Data de-duplication helps IT managers to utilise disk archiving platforms, which means increased storage capacity at any given time, says Gaurav Kohli, Consultant, Xebia IT Services. For optimum usage, de-duplication hash calculations are performed on the target device in real time, as the data enters the device. If the device spots a block that is already stored on the system, it does not store the new block and instead simply references the existing block.
"Regardless of the operating system, applications, or the file system, all data is written to storage by using data reference pointers. After that, the catalogue and indices of all data objects are maintained using hashes, and a comparison between the two data objects is made," explains Suresh Kakkar, Practice Manager, Wipro.
Moreover, if you are still using tape as an alternate backup medium, the technology will help you shorten backup windows, which means relatively fewer tapes. This not only saves costs but also eases manageability, Kohli notes.
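The inline approach described earlier, in which the target device hashes each block as it arrives and stores a reference instead of a second copy, can be sketched as follows. The fixed 4KB block size, the SHA-256 hash and the `InlineDedupStore` class are assumptions made purely for illustration; real devices may use variable-size blocks and different index structures.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


class InlineDedupStore:
    """Hypothetical target device that de-duplicates blocks as data enters it."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}  # hash index: digest -> stored block

    def write(self, data: bytes) -> list[str]:
        """Store the data and return its list of block references (digests)."""
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()  # computed in real time
            if digest not in self.blocks:
                self.blocks[digest] = block             # new block: store it once
            pointers.append(digest)                     # known block: reference only
        return pointers

    def read(self, pointers: list[str]) -> bytes:
        """Rebuild the original data by following the block references."""
        return b"".join(self.blocks[digest] for digest in pointers)


device = InlineDedupStore()
first_backup = b"A" * 8192 + b"B" * 4096    # blocks: A, A, B
second_backup = b"A" * 4096 + b"C" * 4096   # blocks: A, C
first_refs = device.write(first_backup)
second_refs = device.write(second_backup)
print(len(first_refs) + len(second_refs), "blocks written,",
      len(device.blocks), "blocks stored")
# prints "5 blocks written, 3 blocks stored"
```

Five blocks are written across the two backups, yet only three distinct blocks are kept, and either backup can still be rebuilt in full from its references.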