The lifeblood of any application is the data it uses to make decisions. It used to be that all data and applications resided within the confines of the corporate firewall, but public cloud platforms like Amazon Web Services, Google Cloud Platform and Microsoft Azure provide elasticity and cutting-edge features to today’s organizations. Because of that, more and more companies are challenging the conventional thinking that all corporate data should always be stored on premises.
But when is it appropriate to keep data siloed, and when does it make sense to integrate it across cloud platforms?
Low-Hanging Fruit: Dev/Test
The easiest, most cost-effective step an organization can take to leverage the elasticity of a public cloud is to run dev/test workloads there with sanitized data. The sporadic nature of dev/test work is a perfect match for the on-demand provisioning and pay-by-the-hour capacity that public clouds provide.
Even if dev/test workloads run only eight hours a day, five days a week, an organization without a public cloud is still paying capital expense on internal hardware 24 hours a day, seven days a week. By creating a sanitized data set for dev/test that mimics production while lowering the risk should that data be breached, companies can pay only for the 40 hours a week those workloads actually run instead of the full 168, reclaiming the 128-hour difference as hardware cost savings.
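What "sanitized" looks like depends entirely on the schema, but the idea is simple: copy production data and replace anything sensitive with stable, irreversible tokens so dev/test still behaves like production. Below is a minimal sketch in Python using only the standard library; the CSV file names and column names are hypothetical placeholders, not part of any real system.

```python
import csv
import hashlib

# Hypothetical sensitive columns; a real schema will differ.
SENSITIVE_COLUMNS = {"email", "full_name", "phone"}

def mask(value: str) -> str:
    """Replace a sensitive value with a stable, irreversible token so
    joins and lookups still line up across sanitized tables."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    return f"masked-{digest}"

def sanitize_csv(src_path: str, dst_path: str) -> None:
    """Copy a production export, masking sensitive columns on the way."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for column in SENSITIVE_COLUMNS & set(row):
                row[column] = mask(row[column])
            writer.writerow(row)

if __name__ == "__main__":
    # Placeholder file names for the sketch.
    sanitize_csv("customers_prod.csv", "customers_devtest.csv")
```

Because the masking is deterministic, the sanitized data keeps its shape and relationships, which is what makes it useful for realistic testing in the public cloud without carrying production risk along with it.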
Production Marketing Website: Data That’s Already Public
The second easiest place to look for data that can be moved to the public cloud without risk is a production marketing website. The data behind these applications is already public, and their constantly shifting demand, driven by marketing campaigns, makes them great candidates for taking advantage of cloud elasticity.
Harder Decisions: Sensitive Production Data
With the easy choices made, some harder decisions need to be examined. Public cloud platforms can instantly provide features that would take much longer to build inside your own data center. Internet of Things platforms, Big Data tooling, device farms and data streaming services, for example, are difficult to reproduce in-house at a reasonable cost, yet companies typically don't want to risk moving the data those services need outside the corporate firewall.
One option is to set up a Virtual Private Network (VPN) connection to the public cloud provider, so that services running there have secure access to part of the internal network, and therefore to the data on that network. Public cloud providers make this kind of setup easy to accomplish, but for data consumed by both internal and external applications, the network must be segmented carefully so that a breach of the externally facing VPN segment does not also expose the internal consumers of that data.
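As an illustration of how little plumbing this takes, here is a minimal sketch of a site-to-site VPN using Python and the AWS boto3 SDK. AWS is assumed purely as an example; the region, VPC ID, firewall IP, BGP ASN and CIDR block are all placeholders. The final route is the segmentation point: only the subnet holding the shared data is reachable over the tunnel.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Represent the on-premises side: the public IP of the corporate VPN device.
customer_gw = ec2.create_customer_gateway(
    BgpAsn=65000,
    PublicIp="203.0.113.10",   # placeholder corporate firewall address
    Type="ipsec.1",
)

# Create the cloud side of the tunnel and attach it to an existing VPC.
vpn_gw = ec2.create_vpn_gateway(Type="ipsec.1")
ec2.attach_vpn_gateway(
    VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
    VpnGatewayId=vpn_gw["VpnGateway"]["VpnGatewayId"],
)

# Tie the two ends together; static routing keeps the example simple.
vpn_conn = ec2.create_vpn_connection(
    CustomerGatewayId=customer_gw["CustomerGateway"]["CustomerGatewayId"],
    VpnGatewayId=vpn_gw["VpnGateway"]["VpnGatewayId"],
    Type="ipsec.1",
    Options={"StaticRoutesOnly": True},
)

# Route only the specific subnet that holds the shared data, so a breach on
# the VPN segment does not expose the rest of the internal network.
ec2.create_vpn_connection_route(
    VpnConnectionId=vpn_conn["VpnConnection"]["VpnConnectionId"],
    DestinationCidrBlock="10.10.20.0/24",  # placeholder data subnet
)
```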
Another option is to create an Application Programming Interface (API) for the data source that uses HTTPS encryption and API key authentication. This is more costly to build than the VPN approach, but it prevents raw access to the data and creates a cleaner entry point that carries far less network segmentation risk.
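A minimal sketch of that pattern, using Python and Flask, is shown below. The endpoint, header name and canned response are hypothetical; the key would come from a secrets manager rather than an environment variable, and in production TLS would typically terminate at a load balancer or reverse proxy rather than in the app itself.

```python
import hmac
import os

from flask import Flask, abort, jsonify, request

app = Flask(__name__)

# Placeholder for the sketch; a real deployment would pull this from a
# secrets manager rather than an environment variable.
API_KEY = os.environ.get("DATA_API_KEY", "change-me")

@app.before_request
def require_api_key():
    """Reject any request that does not carry a valid key header."""
    supplied = request.headers.get("X-Api-Key", "")
    # Constant-time comparison avoids leaking the key through timing.
    if not hmac.compare_digest(supplied, API_KEY):
        abort(401)

@app.route("/customers/<customer_id>")
def get_customer(customer_id):
    """A hypothetical read-only endpoint in front of the internal data source."""
    # A real system would query the internal database here; a canned
    # response keeps the sketch self-contained.
    return jsonify({"id": customer_id, "status": "active"})

if __name__ == "__main__":
    # ssl_context enables HTTPS for local testing with local cert files.
    app.run(port=8443, ssl_context=("cert.pem", "key.pem"))
```

Because callers only ever see this narrow HTTPS surface, the data store itself never has to be reachable from the cloud network at all, which is what removes most of the segmentation risk the VPN approach carries.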
Consider All Your Options
Any organization looking to augment on-premises infrastructure with cloud platforms needs to take data access seriously. Some data is sensitive enough that it must remain inside internal silos, but there are plenty of opportunities for the rest. Sanitizing data for dev/test workloads can reap huge benefits, letting a company trade internal hardware for public cloud elasticity. Marketing websites already serve public data, and their wildly swinging demand makes them a good fit for public cloud as well. Other data may be needed in both places, and whether you choose the VPN or the API approach, both offer ways to integrate data across internal and public cloud infrastructures.
A 20+ year tech industry veteran, Pete Johnson is the VP of Product Evangelism and Enablement at CliQr Technologies. He can be found on Twitter at @nerdguru.