“Data storage cannot be a set and forget exercise”
We spoke to Kumar Goswami, CEO of Komprise, about data management problems and data storage strategies. Without a thoughtful plan and process for the continual management of data from a cost, performance and business-need perspective, CIOs face impending disaster.
The data-driven business has been gaining prominence for years; CIOs without a practical plan to get there won’t survive. Yet evolving into a data-driven organization requires a multi-faceted strategy, from technology decisions and hiring to setting organizational priorities.
Analytics and AI tools, people, skills and culture are of course necessary ingredients for data-driven operations. What may be overlooked lies deeper: how the data is stored and managed itself. Data storage cannot be a set and forget exercise. Our world has changed too much in the past decade.
JAXenter: Given that setup, take us back to how we got to the current data management problem in the first place. How did it start?
Kumar Goswami: Twenty years ago, all but the very largest companies had just one or two file servers in the data center and the data being stored was largely transactional. Data sprawl was not a concern. Yet with the rise of the smartphone, online business models, IoT and cloud apps, a data deluge ensued. At the same time, compute power became cheap enough, thanks in part to the cloud, that companies could analyze massive volumes of data like never before. Meanwhile, data volumes have grown exponentially in recent years as technology has become pervasive in every aspect of work and home life with smart home devices, industrial automation, medical devices, edge computing and more.
A host of new storage technologies have come into play as a result, including software-defined storage (SDS), high-performing all-flash NAS arrays, HCI, and many flavors of cloud storage. But storage innovation has not solved the problem and in some cases has made it worse by creating more silos. It has become unfeasible to keep pace with the growth and diversity of data, which these days is primarily unstructured.
JAXenter: Despite the data explosion you just described, I’m guessing that IT organizations haven’t necessarily changed storage strategies, correct?
Kumar Goswami: That’s right. They keep buying expensive storage devices because unassailable performance is required for critical or “hot” data. The reality is that not all data is diamonds. Some of it is emeralds and some of it is glass. By treating all data the same way, companies are creating needless cost and complexity.
For example, let’s look at backups. The purpose of regular backups is to protect the hot or critical data, to which departments need reliable, regular access for everyday operations. Yet as hot data continues to grow, the backup process becomes sluggish. So, you purchase expensive, top-of-the-line backup solutions to make this faster, but you still need ever-more storage for all these copies of your data. The ratio of unique data (created and captured) to replicated data (copied and consumed) is roughly 1:9. By 2024, IDC expects this ratio to be 1:10. Most organizations are backing up and replicating data that is in fact rarely accessed and better suited to low-cost archives such as cloud storage.
Beyond backup and storage costs, organizations must also secure all of this data. A one-size-fits-all strategy means that all data is secured to the level of the most sensitive, critically important data. Large companies are spending 15% of IT budgets on security, according to a recent survey.
JAXenter: So, where do large enterprises go from here? What is the best approach to modern enterprise data management?
Kumar Goswami: It’s time for IT execs to create a sustainable enterprise data management model appropriate for the digital age. By doing so, organizations can not only save significantly on storage and backup costs, but they will be able to better leverage “hidden” and cold data for analytical purposes. Here are the tenets of this new model:
- Automation. It is no longer sufficient to do the annual spring-cleaning exercise of data assets. This needs to be a continual, automated process of discovery, using analytics to deliver insight into data (date, location, usage, file type) and then categorize the data into what is hot, warm, cool and cold. Having ad hoc conversations with departmental managers is inefficient and no longer scalable. Get data on your data.
- Segmentation. At a minimum, create two buckets of the data: hot and cold. The cold bucket will always be much larger than the hot one, which should remain relatively stable over time. On average, data becomes cold after 90 days but depending on the industry, this can vary. For instance, a healthcare organization that’s storing large image files may consider a file warm after three days and cold after 60 days. Select a data management solution which can automatically move data to the age-appropriate storage device or cloud service.
- Dead data planning. It can be difficult to know with confidence when and how IT can eventually delete data, especially in a highly regulated organization. Deleting data is part of the full lifecycle management process, even though some organizations never delete anything. Analytics can often indicate which data can be safely deleted. For instance, an excellent use case relates to ex-employees. Companies are often unknowingly storing large amounts of email and file data from employees who have left the company. In many cases, that data can be purged, provided you know where it lives and what it contains.
- A future-proof storage ecosystem. New data storage technologies are always around the corner: DNA storage is just one example. Organizations should work toward a flexible, agile data management strategy so that they can move data from one technology or cloud service to the next without the high cost of vendor lock-in, which can take the form of the excessive cloud egress and rehydration expenses that are common with many proprietary storage systems. This viewpoint can provoke conflict in IT shops with entrenched vendor relationships and the desire for “one throat to choke.” Over time, though, relying on a single vendor can be a limiting strategy with more downsides than upsides.
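As a rough illustration of the automation and segmentation tenets above, the following Python sketch sorts files into hot, warm and cold buckets by the number of days since they were last accessed. The 90-day cold threshold and the healthcare thresholds (warm after three days, cold after 60) come from the discussion above. The 30-day default warm threshold, the names `TierPolicy`, `classify` and `segment`, and the sample file paths are assumptions made for illustration only and do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TierPolicy:
    """Age thresholds, in days since last access, for each tier (hypothetical type)."""
    warm_after_days: int
    cold_after_days: int


# Cold after 90 days is the average cited above; warm after 30 days is an assumption.
DEFAULT_POLICY = TierPolicy(warm_after_days=30, cold_after_days=90)
# Healthcare imaging example from the text: warm after 3 days, cold after 60.
IMAGING_POLICY = TierPolicy(warm_after_days=3, cold_after_days=60)


def classify(last_accessed: datetime, now: datetime, policy: TierPolicy) -> str:
    """Return 'hot', 'warm' or 'cold' based on whole days since last access."""
    age_days = (now - last_accessed).days
    if age_days >= policy.cold_after_days:
        return "cold"
    if age_days >= policy.warm_after_days:
        return "warm"
    return "hot"


def segment(files: dict, now: datetime, policy: TierPolicy) -> dict:
    """Group file paths into buckets; `files` maps path -> last access time."""
    buckets = {"hot": [], "warm": [], "cold": []}
    for path, last_accessed in files.items():
        buckets[classify(last_accessed, now, policy)].append(path)
    return buckets


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    # Hypothetical inventory; a real one would come from scanning storage.
    inventory = {
        "/scans/patient-001.dcm": now - timedelta(days=1),
        "/scans/patient-002.dcm": now - timedelta(days=10),
        "/scans/patient-003.dcm": now - timedelta(days=200),
    }
    for tier, paths in segment(inventory, now, IMAGING_POLICY).items():
        print(tier, paths)
```

The sketch covers only the classification step. Discovering access times across storage silos and moving each bucket to an age-appropriate device or cloud service are the harder parts, and they are what a data management solution would be expected to automate.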
IT leaders have the potential to unleash previously untapped unstructured data stores to enhance employee productivity, improve customer experience and support new revenue streams. Getting there requires evolving from traditional data management practices. It’s time to stop treating all data the same and perpetuating the endless cycle of backups, mirroring, replication and storage technology refresh. By developing a modern information lifecycle management model with analytics, automation, and data segmentation, organizations can have the right data in the right place, at the right time and managed at the best price.