Your Data Infrastructure in the New Normal

By Nick Lim, TIBCO Software
July 21, 2020

The rush to remote work resulted in multiple challenges for organizations around the world, including Singapore, and leading the list of challenges was technology. Business organizations were forced to scale IT infrastructure to support the sudden shift, resulting in a migration to cloud-based applications and solutions, a rush on hardware that can support a remote environment, and challenges scaling VPNs to support remote worker security.

These technology concerns persist while the world gradually opens during the ‘new normal’ period, since the remote work setup won’t go away. In fact, a Gartner survey of 317 CFOs and finance leaders found that 74% will move at least 5% of their previously on-site workforce to permanently remote positions post-COVID-19. Companies have the option to stay fully remote or to adopt a staggered workforce, which means some employees will still have to work from home.

As such, having the right technical infrastructure in place to support remote workers remains critical. Companies will need to continue to upgrade their infrastructure to operate at scale while reducing expenses and limiting performance and security issues. Cloud seems to be the best option. For some organizations, hybrid cloud is the answer: because the model supports both private and public clouds, they can still host their data on-premises while gaining a combination of flexibility and security as they scale their IT infrastructure.

However, in terms of data management in a remote or blended work setting, the diagnostics, resolution and optimization of data infrastructure have emerged as a challenge, considering the vast, dynamic and interconnected nature of the underlying resources. With larger and much more complex sources and datasets, how can the data infrastructure, especially in larger enterprises, cope with the challenges that arise when employees work in remote or hybrid arrangements during these unprecedented times?

The importance of democratizing data

While every business organization wants to be more data-driven, the silos created by the current work conditions driven by the pandemic forced companies, mostly enterprises, to rethink their approach to data access. Before, only a few elite data scientists could perform the task of analyzing complex data. Now, the goal is to empower anybody to use data at any time to make faster decisions with no barriers to access or understanding, allowing no gatekeepers that create a bottleneck at the gateway to the data.

Data democratization seems to be the popular approach, since it is said to be the future of managing big data and realizing its value. It is crucial in allowing data to pass safely from the hands of a few analysts into the hands of the masses within a company. Businesses that have armed their employees with the right tools and understanding are succeeding today because every employee has the knowledge necessary to make smart decisions and provide better customer experiences.

In essence, data democratization starts with breaking down information silos as the first step toward user empowerment. Ideally, the tools will filter the data and visualizations shared with each individual, allowing employees to visualize their data and align it with the organization’s KPIs: the metrics, goals, targets, and objectives that have been set from the top down and that enable data-driven decisions.

With the right visualization and analytics tools in place, training the team becomes the next essential step. Since data democratization depends on the concept of self-service analytics, every team member must be trained up to a minimum level of comfort with the tools, concepts, and processes involved in order to participate.

Lastly, you cannot have a democracy without checks and balances. The final step to sharing data across your teams is data governance. Mismanagement or misinterpretation of data is a genuine concern. Therefore, a center of excellence is recommended to keep the use of data on the straight and narrow. This center of excellence should have a goal to drive adoption of data usage which is made possible by owning data accuracy, curation, sharing, and training. These teams are often most successful when they have a budget, a cross-section of skill sets, and executive approval.

Having a single source of truth

The remote or blended working setup has also left business organizations with multiple data sources that are often siloed in different tools owned by different teams. As with a jigsaw that has missing pieces, it is impossible to create a single source of truth (SSOT), a concept used to ensure that everyone in the organization bases business decisions on the same data, from siloed data. Sales tools don’t easily integrate web or product analytics data. Marketing tools don’t easily integrate subscription data. You need to have a complete, unified profile of every person and company that has ever interacted with your brand.

As a result, workflows slow down when a company does not invest in a single source of truth for its business intelligence. All too often, people's day-to-day work is slowed by not having confident access to business data. Decision making is also impaired by the lack of a single source of truth, since decisions should be driven by as much data as is available. Where there is uncertainty as to the “current-ness” or validity of data, organizations often cannot answer questions that should be relatively trivial, though important. (NOTE: current-ness refers to the value of data for a period of time; this value can fall quickly when the market landscape is uncertain or changes rapidly.)

That’s why deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements (a direct consequence of intentional or unintentional de-normalization of any explicit data model) pose a risk for retrieval of outdated, and therefore incorrect, information.
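To make the idea of a unified profile concrete, here is a minimal Python sketch, not a description of any particular product. It assumes each siloed tool can export records that share an `email` key and carry an `updated` date; the field names, sample values, and the `build_unified_profiles` helper are all hypothetical. For each field it keeps the most recently updated value, which is one simple way to avoid serving an outdated duplicate.

```python
from datetime import date

# Hypothetical records for the same customer held in three siloed tools.
# Field names, values, and dates are invented for illustration.
sales = [{"email": "ana@example.com", "plan": "basic", "updated": date(2020, 3, 1)}]
marketing = [{"email": "Ana@Example.com", "plan": "pro", "segment": "smb",
              "updated": date(2020, 6, 15)}]
product = [{"email": "ana@example.com", "last_login": "2020-07-01",
            "updated": date(2020, 7, 2)}]


def build_unified_profiles(*sources):
    """Merge records by email, keeping the most recently updated value per field."""
    profiles = {}  # email -> {field: value}
    stamps = {}    # email -> {field: date of the value currently kept}
    for source in sources:
        for record in source:
            key = record["email"].strip().lower()  # normalize the join key
            profile = profiles.setdefault(key, {"email": key})
            field_dates = stamps.setdefault(key, {})
            for field, value in record.items():
                if field in ("email", "updated"):
                    continue
                # Keep the value only if it is new or fresher than the one held.
                if field not in field_dates or record["updated"] > field_dates[field]:
                    profile[field] = value
                    field_dates[field] = record["updated"]
    return profiles


profiles = build_unified_profiles(sales, marketing, product)
print(profiles["ana@example.com"])
```

In this sketch the stale "basic" plan from the sales tool is replaced by the fresher "pro" value from marketing, and everyone who reads the merged profile sees the same answer. A real deployment would also need identity resolution beyond a single key, plus governance over which source wins a conflict.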

Removing bias in analytics

As companies rush headlong into data democratization as an answer to remote working or hybrid arrangements, bias in data analytics becomes a major issue, and steps should be taken to reduce it by developing advanced algorithms.

A chief cause of wrong decisions, bias in data analytics can happen either because the humans collecting the data are biased or because the data collected is biased. When we say data is biased, we mean that the sample is not representative of the entire population. But the more people in a business organization who tap into data analysis, the greater the risk that they collect and analyze responses in ways that favor their own research or project. In this regard, data scientists and citizen data scientists should be open to all kinds of viewpoints that would ultimately help them make better decisions.
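One simple check for the sampling bias described above is to compare each group's share of the sample with its share of the population. The following Python sketch is illustrative only: the `representation_gaps` helper, the group names, and all numbers are hypothetical, and it assumes the population shares are already known from a trusted source.

```python
from collections import Counter


def representation_gaps(sample, population_shares):
    """Return (sample share - population share) for each population group."""
    counts = Counter(sample)
    total = len(sample)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical workforce: half on-site, half remote.
population_shares = {"on_site": 0.5, "remote": 0.5}
# Hypothetical survey responses that skew heavily toward on-site staff.
sample = ["on_site"] * 80 + ["remote"] * 20

gaps = representation_gaps(sample, population_shares)
for group, gap in gaps.items():
    direction = "over" if gap > 0 else "under"
    print(f"{group}: {direction}-represented by {abs(gap):.0%}")
```

A large gap for any group is a signal to collect more responses from that group or to reweight the results before drawing conclusions. It does not detect bias introduced by how questions are framed.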

Using deep analysis of data to help you with decision making is a good idea, but it can also backfire if the data is biased. Machine learning algorithms do precisely what they are taught to do and are only as good as their mathematical construction and the data they are trained on. Algorithms that are biased will end up doing things that reflect that bias. But bias in data analytics can be avoided by framing the right questions, which allow respondents to answer without any external influences, and by constantly improving algorithms.

Coping with recent times

Whether companies make remote working part of the new normal or prepare for their workforce's return to work post-COVID-19, it is important to strengthen data infrastructure to allow for a more seamless and unified work process where the whole organization has instant access to the same updated data.

One example of how the power of data is utilized for the benefit of the workforce in the post-COVID-19 situation is the newly launched TIBCO GatherSmart solution. Through an employee mobile app and employer control center, GatherSmart draws on analytics-driven data to help employers and HR give employees a digital daily passport that confirms whether they have the green light to return to the office. GatherSmart helps ascertain if employees live in a “hotspot” and, with one click of a button, they can be advised if it is safer to stay at home on any given day.

This powerful data-driven solution is made possible as GatherSmart pools the technical resources of TIBCO LABS and leverages the TIBCO Connected Intelligence Cloud platform.

Indeed, creating data infrastructure starts with the cloud. Then comes instrumenting and ensuring the telemetry is in place for the data center to aggregate as much data as possible. Essentially, the result will be a registry of all the assets in one place. A consistent asset model across the system will deliver higher value analytics, and that will allow expert data scientists or the democratized citizen data scientists to better control the context to gain insight.

Beyond reducing upfront costs and longer-term investment, analytics will decrease failures and interventions too. It will provide visibility and improve asset performance for higher uptime and a longer mean time between failures. Ultimately, risk will be lower and the life cycle optimized when applying data-driven asset management, i.e. predictive analytics.
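As a rough illustration of how a consistent asset model feeds this kind of metric, the Python sketch below defines a hypothetical `Asset` record and computes mean time between failures (MTBF) as total operating hours divided by the number of failures. The asset names and figures are invented, and this is a descriptive calculation rather than a predictive model; predictive analytics would build on history like this.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical, minimal asset model shared by every entry in the registry."""
    asset_id: str
    operating_hours: float
    failures: int


def mean_time_between_failures(asset):
    """MTBF = total operating hours / number of failures (None if no failures)."""
    if asset.failures == 0:
        return None  # no failures recorded, so MTBF is undefined
    return asset.operating_hours / asset.failures


# Invented registry: all assets in one place, described the same way.
registry = [
    Asset("pump-01", operating_hours=4000.0, failures=4),
    Asset("pump-02", operating_hours=4000.0, failures=1),
    Asset("fan-07", operating_hours=1200.0, failures=0),
]

mtbf_by_asset = {a.asset_id: mean_time_between_failures(a) for a in registry}

# Rank assets that have recorded failures, least reliable first.
ranked = sorted(
    (item for item in mtbf_by_asset.items() if item[1] is not None),
    key=lambda item: item[1],
)
for asset_id, mtbf in ranked:
    print(f"{asset_id}: MTBF {mtbf:.0f} hours")
```

Because every asset uses the same fields, the same calculation applies across the whole registry, which is the practical benefit of a consistent asset model.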

Nick Lim, general manager for APJ at TIBCO Software Inc., wrote this article. The views and opinions expressed in this article are those of the author and do not necessarily reflect those of CDOTrends. Photo credit: iStockphoto/metamorworks