Becoming a “Data Trader”

Following its split from the New York Stock Exchange in 2014, Euronext became the first pan-European exchange in the eurozone, fusing together the stock markets of Amsterdam, Brussels, Dublin, Lisbon, and Paris. Euronext comprises close to 1,300 issuers, reporting a total market capitalization of 3,700 billion euros at the end of March 2018.

In 2016, Euronext began the typical process of migrating its data to the cloud. Except that this migration had nothing typical about it at all. First off, the Euronext database contained 100 TB of data – one of the biggest in Europe. Then there was the fact that this was not just a simple transfer of a database to a hosted platform. The idea was to create a governed data lake with self-service access for business units and clients in an effort to monetize new services and generate additional revenues.

Migrating to a Governed Cloud

“We use Optiq, an incredible trading platform, with systems that practically work in nanoseconds,” said Abderrahmane Belarfaoui, chief data officer (CDO) at Euronext. The huge Euronext database is the active memory of transactions handled directly by the stock exchange operator (1.5 billion messages and one billion transactions per day).

“The database is compressed (at a rate of 400 percent) but some information was not being archived for lack of space,” said Belarfaoui about the problems of the old system.

Before 2016, Euronext stored its data on site, on hardware from one of the big names in the industry. But Euronext’s storage needs continued to grow, especially following several acquisitions, such as the Irish Stock Exchange in Dublin and FastMatch in the U.S.

“Our IT infrastructure had reached the end of its lifecycle in our European operations, where regulators were expecting that Euronext store more and more data,” said Belarfaoui.

“Moreover, sometimes we had to wait six to twelve hours after market close on days with important events, such as the U.K. Brexit vote, before we could send the data to business units and clients.”

The situation prompted the CDO to look at moving to a hybrid cloud model. “We still keep trading platform information on an on-site server because the latency we need is not yet available in the cloud,” explained Belarfaoui. “We also use AWS Managed Services in serverless mode together with Amazon S3 to have access to a data warehouse with unlimited storage capacity. For analysis, we use Amazon Redshift. And taking advantage of the cloud’s great scalability, we can run the whole system while anticipating events that cause high volumes on the markets.”

Still, the transition to a Platform as a Service (PaaS) does require one key condition: remaining independent of the cloud provider.

Why Talend?

Euronext chose Talend Big Data to ingest real-time data into the data lake, including internal data from its own trading platform and external data, such as feeds from Reuters and Bloomberg.

“The core of the data lake is managed by Talend. It was very important for us to keep this ‘independence’ from the layers below Talend. So, if tomorrow Euronext wants to change clouds, they can,” said the CDO, happy about the greater flexibility.

In an ultra-regulated world, Talend has also proven to be highly adept at meeting the challenges of data lake governance and regulatory compliance. Being able to safely open data involves knowing it inside out, keeping track of changes and the history of data feeds, and knowing how to classify them in a granular structure.

“We have an Amazon S3 storage that is shared by everyone. So, I have to know who owns data from the start (the data owner), who has access to what, whom to ask, who can use it, and who has priority over whom. Our data stewards protect the organization of our data,” added Belarfaoui.

This governance strategy is applied in very specific tools, such as the Talend Data Catalog. A dictionary is created together with each technical project for each individual market. These dictionaries are used to find the history of end-to-end data, from the sources to the reporting.

“Now I can see when S3 is the data source, I can add value to the data, combine it with other data, and convert it into other data in Redshift,” said the CDO, who is very satisfied with the new process. “I can also add tags. Typically, we add the storage duration. For example, whether data has to be kept for ten years, or five years (per MiFID II), or if it should be archived.”
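The storage-duration tags described above can be illustrated with a minimal sketch. It is not the Talend Data Catalog API: the CatalogEntry record, the retention_years tag name, and the sample dataset are all hypothetical, chosen only to show how a retention tag could drive an archiving decision.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CatalogEntry:
    """Hypothetical catalog record; not the Talend Data Catalog API."""
    name: str
    owner: str
    created: date
    tags: dict = field(default_factory=dict)


def retention_end(entry: CatalogEntry) -> date:
    """Return the date on which the tagged retention period ends."""
    years = entry.tags["retention_years"]  # assumed tag name
    target_year = entry.created.year + years
    try:
        return entry.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return entry.created.replace(year=target_year, day=28)


def is_due_for_archive(entry: CatalogEntry, today: date) -> bool:
    """True once the retention period has elapsed."""
    return today >= retention_end(entry)


# Illustrative entry: a feed tagged with a five-year retention period.
trades = CatalogEntry(
    name="trades_2018_01_03",
    owner="market-data",
    created=date(2018, 1, 3),
    tags={"retention_years": 5},
)
```

Calling is_due_for_archive(trades, date.today()) would then tell a data steward whether the feed has passed its tagged retention period.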

At the same time, data lineage with Talend drastically reduces impact analysis costs. “One simple example comes to mind: we plan to change the value of an index on the British stock market. Once we integrate it into our systems, it propagates itself pretty much everywhere. Currently, we have to figure on 200 person-days just to find the index in our different systems. But with the dictionary, we are able to run this data lineage with just one click.”
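The impact analysis described here amounts to walking a lineage graph downstream from the item that changes. The sketch below shows the idea with a breadth-first traversal. The dataset names and the LINEAGE mapping are invented for illustration, and this is not how Talend implements lineage internally.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed as its value.
LINEAGE = {
    "s3://raw/index_values": ["redshift.index_daily"],
    "redshift.index_daily": ["report.index_performance", "redshift.index_enriched"],
    "redshift.index_enriched": ["report.client_feed"],
    "s3://raw/trades": ["redshift.trades_daily"],
}


def downstream(source, lineage):
    """Return every dataset reachable from source, in sorted order."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # also guards against cycles
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Everything affected if the raw index values change.
impacted = downstream("s3://raw/index_values", LINEAGE)
```

With the lineage recorded once in a dictionary, finding every system touched by an index change becomes a single lookup rather than a manual search.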

Monetizing Stock Market Data

Two years after its launch, the governed lake project with Talend and AWS is a success. “The initial returns are more than positive,” said Belarfaoui. “On the technical side, we can manage ten times more data on the same budget.”

Beyond the improved architecture, the migration is also positioning Euronext to become a “data trader.” The stock market operator wanted to be able to refine and add to its wealth of data in order to monetize it. In fact, the sale of data already brings in 20 percent of Euronext’s revenues.

“Traders actually sell, buy, and make their investment decisions in milliseconds. They have a huge appetite for aggregated data in real time [especially on] who sells which stock, to whom, at what price and when. We are in the best position to track the performance of the CAC 40 or other indexes and sell that information to investors through our Datashop platform,” said Belarfaoui.

In addition to clients, this project also involves giving data scientists and business units self-service access to this data, which they can analyze in data sandboxes for tasks such as market monitoring. Belarfaoui explained: “We can set up an environment for a data scientist in less than one day, compared to the 40 days it used to take, and we have moved from D+1 analytics to real-time analytics. This is fundamental to understanding markets, clients, competitors, and how they interact.”

This is a real turning point for Euronext. “In 2016, we identified the need, but we didn’t have the capacity to do it. At the time, we could only relay the volumes of market activity to market regulators (MiFID II). Today, we can dig deeper. Under the General Data Protection Regulation (GDPR), I have to know where personal data is stored. If I receive requests for modification or deletion, I can find the data, thanks to the dictionary,” said Belarfaoui. “Similarly, a user who searches for a transaction can instantly see if it is confidential. Once data is identified as being critical, the data steward can deny user access.”

Euronext is just at the beginning of its digital transformation. A study is currently underway on the deployment of Talend’s Master Data Management (MDM) solution.

“We are working on ‘golden sources’ within all of our systems (CRM, trading, billing, finance, various departments, subsidiaries, etc.). The goal is to make all Euronext data even cleaner and of higher quality, such as by being sure that a client is consistently represented across all systems. Such standards will make our data even more usable,” said Belarfaoui.

This case study was contributed by Talend.

