Using Modern Data Analytics with Legacy System Data: Yes, You Can
- By Daniel Zagales, 66degrees
- October 09, 2023
Data assets are like houses: They're not always as modern as we'd like them. But just because they're a little outdated doesn't mean you have to scrap them and build replacements from scratch.
In the context of houses, it's common to have properties built decades before anyone talked about smart homes or green energy systems. But if you want to add these technologies to an existing house, you can do it quickly enough. Retrofitting may not be as cheap or easy as installing modern systems in a new-construction building. Still, it's by no means impossible, and the investment of time and effort is well worth it if it leads to a more efficient and comfortable home.
Likewise, some data assets are hosted on legacy application platforms that can't easily integrate with modern data analytics solutions designed for cloud-centric architectures. That does not mean, however, that the only way to modernize your approach to analytics when working with legacy system data is to refactor or rebuild your applications and migrate to the cloud. With the right approach, it's possible to "retrofit" your legacy platform to take advantage of modern analytics techniques and services while leaving your legacy system data in place.
In this article, I'd like to walk through tips for doing just that—applying modern analytics strategies to legacy system data.
What are legacy system data and modern data analytics?
Let me define what I mean when discussing legacy system data and modern data analytics.
Legacy system data is any data asset hosted on a legacy application platform. Examples of legacy application platforms include the classic platforms available from vendors like Oracle and SAP (which now offer modern, cloud-based platforms but continue to support their legacy platforms). Mainframes, which remain in widespread use among large enterprises, also count as legacy platforms.
We apply the "legacy" label to these platforms because, although they still work perfectly well in most cases, they are typically not as scalable or flexible as modern public clouds. Businesses use these platforms because they've depended on them for years to host key workloads, and migrating those workloads to the cloud is often not feasible, at least not in the short term.
As for modern data analytics, I'm referring here to any type of modern, cloud-centric data processing or analytics solution, such as the data analytics services offered by major public cloud providers or third-party analytics tools that expect data to live in a cloud service such as Amazon S3 rather than on on-premises legacy platforms. For the most part, these modern analytics offerings aren't designed with legacy platforms in mind and don't natively integrate with them.
Applying modern analytics to legacy platforms
But just because native integrations aren't available doesn't mean you can't integrate. Here are some ways to connect data assets hosted on legacy platforms to modern analytics tools.
Migrate or replicate your data
Probably the simplest approach in most cases is to migrate or copy your data from the legacy platform to a location where it will be more readily accessible to modern analytics services. For example, there are various tools and procedures available (which Amazon details in a guide to SAP data replication) that can replicate data stored on legacy platforms such that it is available in a public cloud storage service. And once you get your data there, you can connect it to modern data analytics services relatively easily.
The caveat of this approach is that your applications don't move along with the data, so you may end up with two copies of the data: one in the legacy platform and one in the cloud. For that reason, you'll need to keep the copies in sync by, for example, periodically re-replicating data from the legacy platform to the cloud.
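To make the idea of periodic re-replication concrete, here is a minimal sketch of one common way to do it: copying only the rows that changed since the last run, tracked with a "watermark" column. Everything in it is an assumption for illustration. An in-memory SQLite database stands in for the legacy platform, a local directory stands in for cloud object storage, and the function name `replicate_incremental`, the `orders` table, and the `updated_at` column are hypothetical. A real setup would rely on the replication tooling your platform and cloud provider support.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def replicate_incremental(conn, table, watermark_col, last_watermark, target_dir):
    """Copy rows changed since last_watermark into a new CSV file.

    Returns the new watermark, or last_watermark if nothing changed.
    The table and column names are assumed to come from trusted config,
    since they are interpolated directly into the SQL text.
    """
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {watermark_col} > ? ORDER BY {watermark_col}",
        (last_watermark,),
    )
    rows = cur.fetchall()
    if not rows:
        return last_watermark  # nothing new to copy

    columns = [d[0] for d in cur.description]
    # Rows are sorted by the watermark column, so the last row holds the maximum.
    new_watermark = rows[-1][columns.index(watermark_col)]

    # One file per run; target_dir stands in for a cloud storage bucket.
    out_path = Path(target_dir) / f"{table}_{new_watermark}.csv"
    with out_path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)
    return new_watermark


if __name__ == "__main__":
    # Hypothetical stand-in for a legacy data source.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)", [(1, 10.0, 1), (2, 20.0, 2)]
    )
    with tempfile.TemporaryDirectory() as target:
        watermark = replicate_incremental(conn, "orders", "updated_at", 0, target)
        print("watermark after first run:", watermark)
```

Each scheduled run passes in the watermark returned by the previous run, so only new or updated rows are copied. Note that this simple approach does not capture rows deleted from the source, which is one reason dedicated replication tools are often worth using.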
Legacy data processing services
A second approach—more complicated to implement but more flexible and manageable once set up—is to process and analyze data using modern data services capable of supporting on-prem data, including data hosted in legacy platforms. Examples include Fivetran Local Data Processing (formerly HVR 6) and Google Cloud Cortex Framework. These solutions can directly connect to legacy system data and then integrate it into modern analytics pipelines.
These solutions come with a price tag, and they're proprietary, so you may end up locked into specific vendor tooling. However, they offer the significant benefit of allowing you to leave your data in place while analyzing it in a modern way.
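As a rough illustration of what "leaving your data in place" means in practice, the sketch below reads rows from a source database in batches and aggregates them without ever writing a second copy. It is not how Fivetran or Google Cloud Cortex Framework work internally; it is only a simplified picture of the pattern. The SQLite database stands in for a legacy platform, and the `sales` table and the function names `stream_rows` and `revenue_by_region` are hypothetical.

```python
import sqlite3
from collections import defaultdict


def stream_rows(conn, query, batch_size=1000):
    """Yield rows from the source in batches, without persisting a copy."""
    cur = conn.execute(query)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


def revenue_by_region(rows):
    """Aggregate (region, amount) pairs into total revenue per region."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical stand-in for data hosted on a legacy platform.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("EMEA", 10.0), ("APAC", 5.0), ("EMEA", 2.5)],
    )
    rows = stream_rows(conn, "SELECT region, amount FROM sales", batch_size=2)
    print(revenue_by_region(rows))
```

The data never leaves the source system except as a stream of rows passing through the analysis step, which is the property that distinguishes this approach from migration or replication.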
The benefits of legacy data analytics
At this point, you might be wondering: If there are tradeoffs to applying modern analytics to legacy system data, why wouldn't I just move it to the cloud?
In an ideal world, you would move everything to the cloud. But in the real world, cloud migration can consume significant time and resources, especially if you're dealing with legacy applications and data whose architectures don't lend themselves easily to migration to cloud environments.
So, rather than wishing all of your data were in the cloud or undertaking a complete migration to the cloud, you can use modern analytics on legacy data to make the most of a suboptimal reality. Over time, you can slowly migrate your data resources to the cloud, but in the meantime, you can take advantage of the techniques described above to analyze your data in a modern way and derive greater value from it.
Plus, this strategy helps technical stakeholders highlight the benefits of modern analytics approaches, which can build buy-in for increased investment in modern, cloud-centric technology. Businesses with extensive legacy application investments often struggle to overcome the inertia blocking the way to cloud migration. If you can show your executives how much additional value is gained by applying modern analytics to legacy data, it becomes easier to kick-start the process that will eventually get all of your assets into the cloud. And, again, you won't have to wait for that migration to finish before you can modernize your approach to analytics.
Conclusion
Given unlimited time and resources, every business would modernize every facet of its IT resources by moving them to the cloud and working with them using cloud-native approaches. But that's rarely feasible, especially in the present era of budget constraints and staffing shortages. Just as it isn't practical to demolish every existing house and replace it with a fully modern, new-construction alternative, it isn't practical for many organizations to migrate wholesale to the cloud.
Fortunately, you don't have to build everything anew to use modern analytics technologies. You can keep your legacy platforms and assets in place while retrofitting them with modern analytics techniques.
Daniel Zagales is the vice president of data engineering at 66degrees.
The views and opinions expressed in this article are those of the author and do not necessarily reflect those of CDOTrends.