Big data is a frequently used term that describes the exponential growth and availability of structured and unstructured data, data so large and complex that it can be difficult to process via traditional data processing applications.
In the Big data era, companies want to make use of large quantities of information because they realise that information is a key business asset, providing opportunities for improvement and future growth. As a result, organisations undertaking Big data projects are at risk of becoming information hoarders, with all of the costs and risks associated with this approach.
Business Intelligence programmes have been around for some years now, extracting valuable information from structured data. However, structured data is only a fraction of the information of value in companies today. The content analytics needed to handle unstructured data are not as mature as Business Intelligence solutions, so many organisations simply hoard their information wherever it resides.
Reports indicate that the amount of information managed by enterprise data centres will grow by a factor of 50 over the next 10 years, which is concerning when you consider that as much as 75% of that information is likely to be of no value to the organisation whatsoever.
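To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration using the two numbers quoted above (a factor of 50 over 10 years, and up to 75% of no value); the starting volume of one unit is an assumption made for convenience.

```python
# Figures quoted in the text; everything else is derived arithmetic.
growth_factor = 50      # total growth over the period
years = 10
no_value_share = 0.75   # upper bound ("as much as 75%")

# Compound annual growth rate implied by a 50x increase over 10 years.
annual_rate = growth_factor ** (1 / years) - 1

# Assume today's volume is 1 unit (illustrative only).
future_volume = 1 * growth_factor
no_value_volume = future_volume * no_value_share

print(f"Implied annual growth: {annual_rate:.1%}")
print(f"Of {future_volume} units, up to {no_value_volume} may have no value")
```

In other words, a fifty-fold increase over a decade implies growth of roughly 48% a year, and up to three quarters of the resulting volume may be information the organisation never needed to keep.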
Information Governance and Big Data
Organisations need to rethink their approach to information in the world of Big data. Currently, most organisations believe the best policy is to retain all information in the hope that they will be able to identify and exploit meaningful information wherever it resides. This is not only highly inefficient, greatly increasing storage costs, but also comes with potentially serious compliance and eDiscovery risks.
Think of email. Surveys have shown that over 50% of email users actively delete their emails. This is usually time-based deletion: messages are removed because of their age, and the user doesn’t know whether the information in the email is of value to the organisation or whether it may be needed in any future litigation procedures.
An Information Governance programme would help here, to better manage information. The goal is to ensure that, as far as possible, only meaningful information resides in corporate systems, whether it is structured or unstructured information or information held in the new communications channels that reside outside the corporate firewall.
A recent survey showed that fewer than 30% of organisations had plans to introduce a Governance programme for Big data in the next 12 months. If you are considering Information Governance for Big data, you should fully understand:
- How information is to be stored, identified, collected and reviewed
- Which communication channels are used within the organisation and how information in those channels is created and communicated
- What the business value of structured and unstructured information is to the organisation
- How your Big data programmes operate in conjunction with regulatory compliance and eDiscovery responsibilities
- When information can be removed from corporate systems and the process for disposal
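
As a rough illustration of the last two points, the sketch below shows how a disposal rule might combine business value, eDiscovery obligations and age, rather than relying on age alone as time-based email deletion does. It is a minimal sketch under stated assumptions: the `Record` fields, the `can_dispose` function and the seven-year retention period are all hypothetical and do not come from any particular product or regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    name: str
    created: date
    has_business_value: bool  # hypothetical flag set by a review process
    on_legal_hold: bool       # hypothetical flag for litigation/eDiscovery holds


# Assumed retention period, for illustration only.
RETENTION_PERIOD = timedelta(days=365 * 7)


def can_dispose(record: Record, today: date) -> bool:
    """Return True only when no governance rule requires keeping the record."""
    if record.on_legal_hold:
        return False  # eDiscovery obligations override everything else
    if record.has_business_value:
        return False  # meaningful information stays in corporate systems
    # Otherwise fall back to age: dispose once the retention period has passed.
    return today - record.created > RETENTION_PERIOD


today = date(2024, 1, 1)
records = [
    Record("old-newsletter", date(2010, 1, 1), False, False),
    Record("disputed-contract-email", date(2010, 1, 1), False, True),
    Record("recent-memo", date(2023, 6, 1), False, False),
]
for r in records:
    print(r.name, "dispose" if can_dispose(r, today) else "retain")
```

The ordering of the checks reflects the priorities described above: a legal hold always wins, business value comes next, and age is only the final test.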