The data landscape has evolved considerably in the last decade, leading to more data creation, retention, and analysis to gain insights and make data-driven decisions. A challenge facing many organisations is not only how to best manage, store, protect and share data, but also how to calculate the true cost of storage management. Quantum’s Eric Bassier explores the hidden costs of storage management and how modern storage solutions can add business value through data insights and automation.
Current data growth trends have amplified the challenges organisations face in storing and managing large amounts of data. In particular, the growth of unstructured data – from video, audio and image files to geospatial, genomic and sensor data – continues to rise, with estimates that it will represent 80% of all the data on the planet by 2025. The World Economic Forum estimated that over 44ZB of data was collected in 2020, while IDC predicts a five-year compound annual growth rate (CAGR) of 26% through 2024.
In this data-driven world, making data work for the business requires a new level of insights and automation. Huge unstructured data growth has exacerbated the challenges organisations already face in storing large volumes of data using pre-cloud, legacy architectures. To help alleviate this burden, organisations are seeking solutions that can provide insights into their ever-growing data.
However, calculating the true cost of storage management isn’t as straightforward as one might think.
One problem is that traditional storage TCO calculators don’t include the cost of managing data over its lifecycle. Everyone from the CIO to the storage admin wants a solution that doesn’t just store data but also helps unlock tangible business value from it through insights such as who owns it, where it should live (on premises or in the cloud, on which tier, and when), how it should be protected, and when it should be deleted.
Another problem with current TCO calculations is the false assumption that the value of data decreases over time. They also fail to account for other business variables such as the need for data viability, resiliency, security, and mobility. Organisations want to automate the use of different classes of storage based on policies defined by business requirements. Without automation, they must move data manually or with custom scripts that are error-prone and require upkeep. Both methods consume additional resources that aren’t accounted for in today’s TCO calculations.
Adding value beyond traditional storage TCO
True TCO must incorporate the value of the data to the business, along with the cost and opportunity loss of managing the data’s lifecycle. Being storage efficient means achieving maximum productivity with minimum wasted effort or expense. The next generation of methodologies for improving system efficiency must be based on data insights and policy-based automation.
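As a rough illustration, a fuller TCO model might look like the sketch below. All the cost categories and figures here are hypothetical placeholders, not figures from Quantum or the analysts cited above; the point is simply that lifecycle labour and opportunity cost belong in the sum.

```python
# Minimal sketch of a fuller storage TCO model. Every category and
# figure below is hypothetical, for illustration only.

def true_tco(hardware, software, facilities,
             admin_hours_per_year, hourly_rate,
             script_upkeep_hours_per_year,
             opportunity_cost_per_year,
             years=5):
    """Multi-year TCO including the lifecycle-management labour and
    opportunity cost that traditional calculators usually omit."""
    acquisition = hardware + software
    labour = (admin_hours_per_year + script_upkeep_hours_per_year) \
             * hourly_rate * years
    running = (facilities + opportunity_cost_per_year) * years
    return acquisition + labour + running

# Example: labour and opportunity costs quickly rival the raw
# acquisition price over a five-year horizon.
print(true_tco(hardware=250_000, software=60_000, facilities=20_000,
               admin_hours_per_year=400, hourly_rate=75,
               script_upkeep_hours_per_year=120,
               opportunity_cost_per_year=30_000))
```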
Through data insights, organisations can eliminate the need for scripts that crawl their entire storage system to gather single point-in-time statistics at the expense of application I/O and human resources. Capturing this intelligence continuously gives organisations granular, relevant views of their data that enable timely access and surface anomalies in data-centric behaviour.
With policy-based automation, organisations can control the data lifecycle by proactively moving data based on application requirements at that point in time – for example, whether it needs a performance tier, or the cloud to leverage elastic compute.
It can also be used to protect and secure data based on compliance needs or delete data after its defined useful life. By automatically and purposefully placing data where it will be most effective, this kind of automation can enable organisations to improve process completion times, increase the number of projects per resource, and reduce wasted effort.
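To make this concrete, here is a minimal sketch of what a policy engine along these lines could look like. The tiers, thresholds, and actions are invented for illustration and aren’t drawn from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical lifecycle policy engine: tiers, thresholds, and
# actions are illustrative assumptions, not a real product's rules.

@dataclass
class FileRecord:
    path: str
    created: datetime
    last_accessed: datetime
    tier: str  # "performance", "capacity", or "cloud-archive"

def apply_lifecycle_policy(record: FileRecord, now: datetime) -> str:
    """Return the action the policy would take for this file."""
    age = now - record.created
    idle = now - record.last_accessed

    if age > timedelta(days=7 * 365):
        return "delete"                     # past its defined useful life
    if idle > timedelta(days=90) and record.tier != "cloud-archive":
        return "move-to-cloud-archive"      # cold data off the expensive tier
    if idle < timedelta(days=1) and record.tier != "performance":
        return "promote-to-performance"     # hot data close to the application
    return "keep"

now = datetime.now()
rec = FileRecord("/projects/renders/shot42.exr",
                 created=now - timedelta(days=400),
                 last_accessed=now - timedelta(days=200),
                 tier="performance")
print(apply_lifecycle_policy(rec, now))     # -> move-to-cloud-archive
```

Run against an inventory of file records on a schedule, rules like these replace the manual moves and custom scripts described earlier.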
Hidden cloud costs
The growth of unstructured data has also led many organisations to rely on private or public cloud to address short- and long-term storage needs. However, the cost of storing data in the cloud can quickly spiral out of control due to ‘hidden’ costs that can be unpredictable and change with usage.
For example, moving data into the public cloud is typically free, but egress can result in unintended costs of multiple dollars per terabyte retrieved. If data is cold and stored only as an insurance policy, egress costs may not be noticeable. If data needs to be retrieved periodically, identifying and pulling only the data that’s needed can minimise egress costs. Every piece of data retrieved that isn’t relevant is money wasted. Organisations need the ability to categorise and organise data to ensure that only relevant data consumes storage and compute resources.
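A back-of-the-envelope comparison shows why targeted retrieval matters. The per-terabyte rate below is a placeholder; actual egress pricing varies by provider, region, and tier.

```python
# Back-of-the-envelope egress comparison. The $/TB rate is a
# placeholder; real pricing varies by provider, region, and tier.

EGRESS_RATE_PER_TB = 90.0    # hypothetical $/TB retrieved

def egress_cost(tb_retrieved: float) -> float:
    return tb_retrieved * EGRESS_RATE_PER_TB

stored_tb = 500      # entire data set held in the cloud
relevant_tb = 12     # only the data the current project needs

print(f"Pull everything:    ${egress_cost(stored_tb):,.0f}")
print(f"Pull only relevant: ${egress_cost(relevant_tb):,.0f}")
# Every terabyte retrieved beyond the relevant 12 is money wasted.
```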
In addition, each public cloud has its own interface, which makes it hard to move data from cloud to cloud. Organisations need to be able to move data across clouds and premises based on metadata and tags without the need to interface with each one separately.
Once data is in the cloud, it’s hard to track. What data is being stored, and why, are difficult questions to answer without knowing who created the data, who owns it, and what value it represents to the organisation.
Every time a search is performed, there’s an associated cost in time and money. Organisations need a repository of metadata and tags so that they can perform real-time searches without having to access the data or strain computing, human, or budget resources.
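A minimal sketch of the idea: an index of metadata and tags that answers queries without ever touching the stored objects. The fields, keys, and tags below are hypothetical.

```python
from collections import defaultdict

# Minimal sketch of a metadata/tag repository: searches hit the
# index, never the stored objects. All fields and tags are
# hypothetical examples.

class MetadataIndex:
    def __init__(self):
        self._by_tag = defaultdict(set)   # tag -> set of object keys
        self._records = {}                # key -> metadata dict

    def add(self, key, owner, size_bytes, tags):
        self._records[key] = {"owner": owner, "size_bytes": size_bytes}
        for tag in tags:
            self._by_tag[tag].add(key)

    def search(self, *tags):
        """Objects carrying all given tags; no object data is read."""
        if not tags:
            return {}
        hits = set.intersection(*(self._by_tag[t] for t in tags))
        return {k: self._records[k] for k in hits}

idx = MetadataIndex()
idx.add("s3://bucket/genome-001.bam", "lab-a", 4_000_000_000,
        ["genomics", "raw"])
idx.add("s3://bucket/genome-001.vcf", "lab-a", 50_000_000,
        ["genomics", "derived"])
print(idx.search("genomics", "raw"))   # answered from metadata alone
```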
Once a file is moved to the cloud, it must be manually retrieved whenever it needs to be accessed. Each operation may incur a cost, which over many operations can add up to a significant financial expense.
Data needs to be tagged based on how applications use it so that the data can be automatically copied or moved to where it’s most efficient to keep it. In some cases, a copy of the data may remain on premises while another copy is stored in the cloud, and data is delivered to the application from wherever is most efficient. Machine learning and read-ahead techniques can be used to minimise access times and data movement.
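As a rough sketch of what such tag-driven placement could look like, the mapping below derives each file’s target locations from its tags. The tags, location names, and rules are hypothetical.

```python
# Hypothetical tag-driven placement: derive where copies of a file
# should live from how applications use it. Tags, locations, and
# rules are illustrative assumptions.

PLACEMENT_RULES = {
    "render-active": ["on-prem-performance", "cloud-standard"],  # dual copy
    "compliance":    ["cloud-archive"],                          # cheap, durable
    "ml-training":   ["cloud-standard"],                         # near elastic compute
}

def placements(tags):
    """Union of target locations implied by a file's tags."""
    targets = []
    for tag in tags:
        for loc in PLACEMENT_RULES.get(tag, []):
            if loc not in targets:
                targets.append(loc)
    return targets or ["on-prem-capacity"]   # default tier

print(placements(["render-active"]))  # ['on-prem-performance', 'cloud-standard']
print(placements(["compliance"]))     # ['cloud-archive']
```

With placements derived this way, the system can serve each read from whichever copy is cheapest or fastest at that moment, which is where read-ahead and machine learning can further trim access times.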
Calculating the true cost of storage management is clearly more complex than it seems. True TCO should account for increases in productivity, enabling organisations to take on more projects, improve customer satisfaction by decreasing time to market, and achieve greater output and higher revenue.
Using data insights and automation, organisations can achieve operational efficiencies and effectiveness while managing their data’s lifecycle, putting them in the best possible position operationally, financially, and competitively.