Using automation to connect the dots in a hybrid cloud
Interview with David Shannon: David Shannon has been a director and consultant in data, analytics, AI, and automation for over 20 years. In his current role as head of hyperautomation at SAS UK and Ireland, he helps organisations drive digital transformation through AI and automation.
While most CIOs put effort into making sure their hybrid cloud environment has secure connections, too many overlook the need for orchestrated processes and seamless data flows.
The IT assets of most organisations have been gravitating towards the cloud for more than a decade. CIOs recognise the advantages of the cloud—especially the ease with which they can scale up and back down again and the fact that offloading the heavy lifting to the cloud frees in-house technologists to focus on more strategic tasks, such as innovating to gain competitive advantage.
But cloud is not necessarily a good fit for everything an organisation already has. Many legacy applications cannot be adapted to the cloud environment, and the compromises required for “lift and shift” don’t always make sense. Sometimes it takes years to move a legacy system to the cloud. Consequently, many organisations wind up with a hybrid cloud environment, which consists of at least one on-premises data centre and at least one cloud service.
Challenges inherent to hybrid cloud environments
These organisations find themselves straddling at least two separate installations, which brings a new set of issues. “The first challenge is security and compliance,” says David Shannon, head of hyperautomation at SAS UK and Ireland. “You have to connect and integrate the two environments in a legally compliant and secure way. The second big challenge is data. You need to ensure reliable data exchange among systems that are not co-located.”
Most of the data usually winds up in the cloud, so analytics platforms and the operational systems that use the data tend to be in the cloud too. But often they need to reach into an on-premises facility to extract data from a legacy system or from an application that is kept on-premises for other reasons. In a multicloud setup, they might also have to reach into a separate cloud environment.
“These exchanges introduce new difficulties around how you orchestrate processing,” says Shannon. “Different types of workloads are run in different locations. And you need timely output from different systems because processes in other locations depend on it. Your ability to coordinate all that from a central location, while complying with your security policy, makes all the difference in how well you maximise your return on IT spend.”
One way to avoid cost overruns is to run analytics and advanced processing where the data is stored, a notion sometimes called “push-down analytics.” This arrangement minimises the cost of large data transfers and speeds up execution. The rest of the processes can be built around that configuration. The main analytics platform can sit in the cloud and connect to data agents located on-premises to ingest the data in that environment. Each data agent is responsible for the analytical processing or querying in the location where it sits, returning summary information to systems in other environments.
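To make the idea concrete, here is a minimal sketch of a push-down query in Python, using SQLite as a stand-in for the on-premises store; the table, columns, and sample data are illustrative, not part of any particular vendor’s API. The point is that only the grouped totals ever cross the environment boundary:

import sqlite3  # stand-in for the on-premises RDBMS driver


def run_pushdown_summary(conn: sqlite3.Connection) -> list[tuple]:
    """Run the aggregation next to the data; return only summary rows.

    The full transactions table never leaves the on-premises environment;
    the cloud platform receives a handful of grouped totals instead.
    """
    return conn.execute(
        "SELECT region, COUNT(*), SUM(amount) FROM transactions GROUP BY region"
    ).fetchall()


if __name__ == "__main__":
    # In-memory stand-in for the on-premises store, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO transactions VALUES (?, ?)",
        [("UK", 120.0), ("UK", 80.0), ("IE", 45.5)],
    )
    for region, orders, revenue in run_pushdown_summary(conn):
        print(f"{region}: {orders} orders, {revenue:.2f} revenue")

In a real deployment the data agent would expose run_pushdown_summary over a secured channel rather than being called in-process, but the economics are the same: a few summary rows travel instead of the whole table.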
Cost is important, but so is visibility. “Getting that consistent visibility of what’s going on where is a constant challenge today,” says Shannon. “But you need the big picture, through catalogues and lineage. This integrated view helps you avoid duplication, such as two processes doing the same thing in different locations, which usually leads to duplicated data.”
“Once you have a view of all the processes in different locations, you can determine dependencies and SLAs,” says Shannon. “What you design in terms of high availability, resilience, timeliness, and latency will be determined by where things occur and who or what depends on the output.”
The need for central control
When different teams with different skill sets, working in different locations, need to join two or more systems, they rarely sit down together to come up with an optimal solution. There may be no central point of control, either in people management or in the engineering systems, and the different teams may not share the same priorities.
So people often resort to manually exporting information to flat files or other formats, and then transferring it by hand. This arrangement opens the possibility for data to diverge. For example, the underlying data may change at the source after the copy is made and before it is used by a process in a different environment.
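The divergence risk can be checked for mechanically: a copy is only trustworthy if the source has not moved on since the export was taken. Below is a minimal sketch, again using SQLite as a stand-in, assuming the source table carries a last-modified column (the table and column names are hypothetical):

import sqlite3


def source_watermark(conn: sqlite3.Connection) -> str:
    """Latest modification timestamp in the source table (ISO 8601 text)."""
    return conn.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM customer_accounts"
    ).fetchone()[0]


def copy_is_stale(conn: sqlite3.Connection, exported_watermark: str) -> bool:
    """True if the source changed after the flat-file export was taken."""
    # ISO 8601 timestamps in a single timezone compare correctly as text.
    return source_watermark(conn) > exported_watermark


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_accounts (id INTEGER, updated_at TEXT)")
    conn.execute(
        "INSERT INTO customer_accounts VALUES (1, '2024-03-02T10:15:00Z')"
    )
    exported_at = "2024-03-01T09:00:00Z"  # recorded when the flat file was cut
    if copy_is_stale(conn, exported_at):
        print("Source changed since export; refresh the copy before using it.")

An automated pipeline would run a check like this (or simply re-extract) on a schedule; a manual process almost never does.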
Automation will solve many of these problems, but only if it operates across environments and is controlled centrally. Robotic process automation (RPA) and DevOps processes can be used for automation within the cloud. But most information security officers are reluctant to allow a process running in the cloud, outside the organisation’s perimeter, to reach into an on-premises data centre and directly invoke a process that returns data back to the cloud.
The goal is to build a set of pipelines or flows that present data to whichever systems need to consume it (or the intelligence derived from it) within the boundaries of security and resilience requirements. Remote data agents are necessary, and the authentication and authorisation credentials needed by those agents must be managed in the environment where they run. You certainly don’t want to store in the cloud the credentials for a database that sits on-premises.
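Here is a sketch of what “managed in the environment where they run” can look like in practice, assuming credentials are injected into the agent’s host as environment variables by a local secrets manager; the variable names are hypothetical:

import os


def load_local_db_credentials() -> dict:
    """Read database credentials from the agent's own host environment.

    The cloud orchestrator only ever addresses the agent, never the
    database, so these values never leave the on-premises machine.
    """
    try:
        return {
            "host": os.environ["ONPREM_DB_HOST"],
            "user": os.environ["ONPREM_DB_USER"],
            "password": os.environ["ONPREM_DB_PASSWORD"],
        }
    except KeyError as missing:
        raise RuntimeError(
            f"Credential {missing} is not set on this host; "
            "refusing to fall back to cloud-held secrets."
        ) from None

The design choice is the important part: the failure mode is a hard stop on the agent’s side, never a silent fallback to secrets stored in the cloud.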
You also need to make sure the different processes can interoperate. Take, for example, a case where a query needs to be pushed down into an old on-premises RDBMS that holds very valuable, highly trusted data. If the process making the request runs much faster than the older database can respond, you need to add a layer to ensure compatibility. You can only do this if you have visibility into what is happening in each location, along with a way to make changes centrally.
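One common form of that compatibility layer is simple request pacing. The sketch below, with an illustrative class name and rate, holds fast cloud-side callers back so the legacy database is never asked to exceed a fixed queries-per-second budget:

import threading
import time


class ThrottledGateway:
    """Space out database calls to at most max_qps queries per second."""

    def __init__(self, max_qps: float):
        self._interval = 1.0 / max_qps
        self._lock = threading.Lock()
        self._next_slot = 0.0  # monotonic time of the next free slot

    def execute(self, run_query, sql: str):
        # Reserve the next slot under the lock, then sleep outside it so
        # waiting callers do not block one another's bookkeeping.
        with self._lock:
            now = time.monotonic()
            wait = max(0.0, self._next_slot - now)
            self._next_slot = max(now, self._next_slot) + self._interval
        if wait:
            time.sleep(wait)
        return run_query(sql)  # run_query wraps the legacy driver's call


if __name__ == "__main__":
    gateway = ThrottledGateway(max_qps=2)  # at most two queries per second
    for i in range(4):
        gateway.execute(lambda sql: print(time.monotonic(), sql), f"SELECT {i}")

Because the throttle sits with whoever owns the central view, its budget can be tuned as the dependency map in the catalogue changes, without touching either environment’s code.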
Visibility is important, but once again, so is cost—especially now. “Many CIOs have been surprised by rising cloud costs over the last year,” says Roger Tunnard, CEO of Ignite Technology.
This interview forms part of the Ignite Insights Series. Our second event, “Managing Hybrid Cloud Costs: The Automation Advantage”, is now available to watch on demand.