IMDA, AI Verify Foundation Launch Gen AI Evaluation Sandbox
- By Paul Mah
- November 01, 2023
The Infocomm Media Development Authority (IMDA) and the AI Verify Foundation this week launched a Generative AI (Gen AI) Evaluation Sandbox, a new initiative to support the development of trusted generative AI products and reveal potential gaps.
Created to establish a common, standardized approach to assessing generative AI, the sandbox will use a new Evaluation Catalogue that sets out baseline methods and recommendations for large language models.
It comes in the wake of a white paper published in June titled “Generative AI: Implications for trust and governance”, which identified risks and proposed ideas for senior leaders in government and businesses on the responsible adoption of generative AI.
Tackling generative AI harm
IMDA noted that efforts to tackle the potential harms of generative AI have been piecemeal. To support the broader adoption of trustworthy Gen AI, IMDA is inviting industry partners to collaboratively build evaluation tools and capabilities in the Gen AI Evaluation Sandbox.
The Sandbox aims to reveal gaps in the current landscape of Gen AI evaluations, particularly in domain-specific areas such as human resources and security, as well as cultural-specific areas, both of which are currently under-developed.
Key model developers such as Google, Microsoft, Anthropic, IBM, Nvidia, Stability.AI, and AWS have already joined, as well as app developers with concrete use cases and third-party testers like Deloitte, EY and TÜV SÜD. All participants in the Sandbox will aid in creating a more robust testing environment.
An IMDA spokesman told Singapore broadsheet The Straits Times: “Large language models today are trained on Internet data, which may not be representative of the nuances of Singapore’s cultural context. For example, in terms of knowledge understanding, it may not appreciate that within racial groups, there is a diversity of faiths and languages.”
“As cultural evaluation is a nascent area, we work with model developers to develop a methodology to identify and weed out these concerns in the models in a systematic manner, which could also be applied to other countries besides Singapore.”
The AI Verify Foundation has seven premier members: the Infocomm Media Development Authority (IMDA), Aicadium (Temasek's AI Centre of Excellence), IBM, Microsoft, Google, Red Hat, and Salesforce. More than 60 general members are also onboard, including well-known brands and tech firms such as Adobe, DBS, Meta, Huawei, SenseTime, and Singapore Airlines.
The white paper “Generative AI: Implications for trust and governance” can be downloaded here (pdf).
Paul Mah is the editor of DSAITrends. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose. You can reach him at [email protected].
Image credit: iStockphoto/Vadym Plysiuk