Achieving resilience through well-orchestrated data pipelines

Independent Journalist Pat Brans and Ignites Technical Director Martin Hulbert.

There probably isn’t a single organisation in the industrial world that doesn’t have at least one data pipeline underpinning a critical process—for example, reporting, order fulfilment, or payment processing. Often, the chain of events can be improved through automation.

A data pipeline is the stream of information across an organisation from start to finish, with data coming from various systems and moving across many other systems before getting to its destination where it’s aggregated and analysed—and maybe stored in a data lake or warehouse. The information might come from inside the organisation or from outside, generated or collected by suppliers, partners, and even government bodies.

Data pipeline orchestration is about taking all the individual processes and data stores from all the different systems, whether disparate or homogeneous, in-house or outside the organisation, and coordinating them so that information flows smoothly from its source to its destination.

The challenges of resilience

One of the issues when data pipeline orchestration is not centralised is that you may not spot a problem until it’s too late. It’s not until you reach the end goal (for example, a report) that you see something’s wrong. The fault occurred eight hours ago, when the process started, but you had no way of knowing.

You might look first at the data warehouse. If the problem isn’t there, you start moving up the pipeline to try to find the point at which it broke. And of course, each time you go to the next system, you have to bring in a new set of system administrators. The number of people involved grows bigger—and somewhere along the line, the problem gets escalated to senior management, maybe even the C-suite.

Now you have C-level executives on calls with twenty to thirty people, all trying to work out what went wrong. Eventually you find the problem, but several hours too late. You fix it and re-run whatever failed, churning through the data pipeline again.

The pipeline was maybe designed around your website, or maybe it involves getting orders through or getting invoices out to customers. When a problem occurs somewhere along a complex series of interdependent steps, these critical processes come to a screeching halt. Then you have to track back through the pipeline to work out what went wrong and start it over again.

You can save a lot of time and resources by using a service orchestration platform to centrally configure and monitor your critical data pipelines both within your organisation and outside. But until recently, no technologies were available to truly coordinate data flows.

While APIs allowed you to connect different types of systems more easily, both inside and outside the organisation, each connection stood in isolation. Moreover, many of the tools on the market worked on one specific machine or system, lacking the ability to span different systems and technologies. Processes running on different clouds don’t communicate easily with one another, and it’s difficult to get them to work with on-premises systems.

Service orchestration platforms

When companies start to look at workload automation, they begin to find a way of bringing the pieces together. Service orchestration platforms are specifically designed to automate and visualise processes, workflows, and data pipelines across an entire organisation and outside the organisation.

A service orchestration platform is something like a spider. The body of the spider is the central administration point, which is essentially the heart of the tool. The legs are the agents, which reach out into the on-premises network and into different types of cloud systems.

Such a platform puts thousands of different jobs and tasks at your fingertips, creating an overall workflow from end to end that tells you exactly what’s going to happen and when, and which parts depend on which other parts, so you can line up all the pieces. You might have eight systems that each produce one part of the puzzle, running in sync and doing their parts at exactly the right stages. You can have those eight systems do their thing and then bring the results together with the output from another set of systems.
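To make the idea of lining up dependent jobs concrete, here is a minimal sketch in Python using the standard library’s graphlib module. It is not how any particular orchestration platform works internally, and the job names are hypothetical. It only shows how a set of declared dependencies can be turned into stages, where every job in a stage can run at the same time.

```python
from graphlib import TopologicalSorter

# Hypothetical job names. Each job maps to the set of jobs it depends on.
dependencies = {
    "extract_orders": set(),
    "extract_payments": set(),
    "merge_records": {"extract_orders", "extract_payments"},
    "load_warehouse": {"merge_records"},
    "build_report": {"load_warehouse"},
}


def execution_stages(deps):
    """Group jobs into stages; jobs in the same stage have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)  # mark the stage finished to unlock the next one
    return stages


stages = execution_stages(dependencies)
for number, jobs in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(jobs)}")
```

In this sketch the two extract jobs land in the first stage and the report is built last, which mirrors the idea of several systems doing their parts before the results are brought together.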

A centralised service automation tool provides various remediation steps, from simply alerting an administrator that something’s going wrong to taking one or more actions. The tool is aware of things it can do to solve certain problems. If a job is running slowly, for example, the tool might restart it or kill off certain other jobs that are known to cause a problem.
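The escalation from automatic action to alerting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real platform’s API: the function name run_with_remediation, its parameters, and the flaky_job example are all hypothetical. The job is restarted a limited number of times, and an alert is raised only when restarts are exhausted.

```python
import time


def run_with_remediation(job, max_restarts=2, delay_seconds=0.0, alert=print):
    """Run job(); restart it on failure, and escalate once restarts run out."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return job()
        except Exception as error:
            if attempt > max_restarts:
                # Automatic remediation has failed, so hand over to a person.
                alert(f"giving up after {attempt} attempts: {error}")
                raise
            alert(f"attempt {attempt} failed ({error}); restarting")
            time.sleep(delay_seconds)


# Hypothetical job that fails twice before succeeding.
calls = {"count": 0}


def flaky_job():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("upstream system not ready")
    return "ok"


messages = []
result = run_with_remediation(flaky_job, alert=messages.append)
print(result, messages)
```

A real tool would choose among more remediation actions than a simple restart, but the shape is the same: try the known fixes first, then notify someone.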

Normally these things would take a person to do. If a job is running at one o’clock in the morning, you don’t want your administrators keeping an eye on things in the wee hours when you can get a system to do that for you.

The orchestration platform gives you end-to-end visibility. Moreover, it can collect statistics over time, such as how long each job normally takes, which allows you to spot deviations. The platform can flag the deviations and notify you, or you can configure it to perform remediation actions automatically.
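As a rough illustration of spotting deviations from collected statistics, the Python sketch below compares the latest run time of a job with its history. The three-standard-deviation threshold, the function name is_deviation, and the sample durations are assumptions made for the example, not values taken from any product.

```python
from statistics import mean, stdev


def is_deviation(history, latest, threshold=3.0):
    """Flag latest if it lies more than threshold standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough history to judge
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]  # every past run was identical
    return abs(latest - mean(history)) / spread > threshold


# Hypothetical run times in minutes for a nightly job.
durations = [61.0, 58.5, 60.2, 59.8, 62.1, 60.4]

print(is_deviation(durations, 61.0))
print(is_deviation(durations, 95.0))
```

A check like this could feed either a notification or an automatic remediation step, depending on how the platform is configured.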

Data pipeline orchestration can be configured in several ways. One common setup is to have a data pipeline run on a schedule, for example for reports, invoicing, or website maintenance. Another is to let data scientists and analysts, who are becoming increasingly critical to organisations, run a data refresh ad hoc from start to finish to get reports as needed. In either case, you want the system to steer clear of dependency issues and maintenance windows.

Sometimes it isn’t until something goes wrong that you understand the dependencies between processes and systems, and these can be small systems that are off to the side. Someone decides to do some maintenance that takes a system offline, without knowing that data was being pulled from it. This could ruin a critical data pipeline.
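When dependencies are recorded centrally, the question “what breaks if this system goes offline?” can be answered before the maintenance starts. The Python sketch below assumes a hypothetical map of which systems pull data from which others and walks it to list everything downstream of a planned outage.

```python
from collections import deque

# Hypothetical map: each system lists the systems that pull data from it.
consumers = {
    "legacy_crm": ["customer_export"],
    "customer_export": ["warehouse_load"],
    "warehouse_load": ["sales_report", "invoice_run"],
    "hr_system": ["payroll"],
}


def affected_by_outage(system, consumers):
    """Return every system that directly or indirectly depends on system."""
    seen = set()
    queue = deque([system])
    while queue:
        current = queue.popleft()
        for downstream in consumers.get(current, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return sorted(seen)


impacted = affected_by_outage("legacy_crm", consumers)
print(impacted)
```

Here a small system off to the side turns out to feed both a sales report and an invoice run, which is exactly the kind of dependency that is easy to miss without a central view.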

This article was prepared following an interview between Independent Journalist Pat Brans and Ignites Technical Director Martin Hulbert.

The interview forms part of Ignites Insights Series. The first webinar in the series is entitled Building Operational Resilience: Harnessing Automation for Business Success.