Ping An Trumps Prestigious ML Contest

By CDOTrends editors
March 28, 2020

Ping An keeps flexing its AI muscle. Its latest achievement is taking pole position in Stanford University's Stanford Question Answering Dataset 2.0 (SQuAD 2.0) challenge, for the third time.

SQuAD 2.0 is a test of machine reading comprehension. It is an important benchmark that pits rival ML teams from international players against each other, while comparing their results with human performance. Previous winners include Microsoft, Google and Alibaba.

The test involves a reading comprehension dataset, comprising questions on a set of Wikipedia articles. The answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

SQuAD 2.0 combines the 100,000 questions in SQuAD 1.1 with over 50,000 unanswerable questions that look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible but also determine when no answer is supported by the paragraph and abstain from answering.
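The abstention requirement can be illustrated with a minimal Python sketch. It assumes a model that scores each candidate span and also produces a separate "no answer" score; the names `SpanCandidate`, `choose_answer` and `threshold` are hypothetical and are not taken from Ping An's system or from the SQuAD tooling. An empty string stands for an abstention.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpanCandidate:
    """A candidate answer span and the (hypothetical) model score for it."""
    text: str
    score: float


def choose_answer(candidates: List[SpanCandidate],
                  no_answer_score: float,
                  threshold: float = 0.0) -> str:
    """Return the best span, or "" to abstain.

    Assumption: the system abstains when the no-answer score exceeds the
    best span score by more than `threshold`.
    """
    if not candidates:
        return ""
    best = max(candidates, key=lambda c: c.score)
    if no_answer_score - best.score > threshold:
        return ""  # abstain: the passage does not appear to support an answer
    return best.text


spans = [SpanCandidate("Paris", 2.0), SpanCandidate("France", 1.2)]
print(repr(choose_answer(spans, no_answer_score=0.5)))
print(repr(choose_answer(spans, no_answer_score=5.0)))
```

In a sketch like this, the threshold would typically be tuned on held-out data, trading missed answers against wrongly answered unanswerable questions.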

In this competition, the ensemble model of ALBERT + DAAF + Verifier submitted by Ping An Technology achieved an Exact Match (EM) score of 90.386, which counts only answers that exactly match the standard answers, and an F1 score of 92.777, which also gives credit for partially correct answers.
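For readers unfamiliar with the two metrics, the following Python sketch shows how EM and F1 are commonly computed for a single question. It is modeled on the usual SQuAD-style normalization (lowercasing, stripping punctuation and English articles) but is a simplified illustration, not the official evaluation script: it compares against one reference answer only, and the function names are chosen here for illustration.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between prediction and reference answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # For unanswerable questions the reference is empty: an empty
    # prediction scores 1, and any non-empty prediction scores 0.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "Eiffel Tower"))
print(round(f1_score("Eiffel Tower in Paris", "Eiffel Tower"), 3))
```

A leaderboard figure such as 90.386 is then the average of the per-question values over the whole test set, expressed as a percentage.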

DAAF (Data Augmentation and Auxiliary Feature) is a learning framework developed by Ping An, and it played a key role in the test. The framework contains forward and backward algorithms. The forward algorithm draws augmentation data from external sources, and the backward algorithm filters out data that has a negative impact on the augmentation.

Both the EM and F1 results place Ping An first overall among global competitors. Shanghai Jiao Tong University is second. Google and Qianxin share fourth position. A previous effort by Ping An is also ranked third.

Both Ping An scores beat average human performance, according to SQuAD 2.0. Ping An's EM score of 90.386 was 3.56 percentage points higher. The F1 score of 92.777 was 3.33 percentage points higher.

Photo credit: iStockphoto/nicescene
