NUS Researchers Develop Open-Source Tool Against Data Leaks From AI Systems

NUS researchers have developed a tool to help companies check whether their AI services are vulnerable to data leaks. Known as the “Machine Learning Privacy Meter” (ML Privacy Meter), it is touted as the first open-source tool of its kind.

The machine learning (ML) models that enable AI services are typically trained on large datasets. While none of the original data is present after a model has been trained, recent years have seen the rise of techniques that lift the lid on the ML black box.

Adversarial ML

One such technique, known as a membership inference attack, requires only ordinary access to the AI service and uses ML in an adversarial manner. Through “shadow training”, an attacker builds new models that can potentially reveal whether specific records were part of the dataset used to train the AI model.
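The core signal such attacks exploit can be illustrated with a minimal sketch: trained models tend to be more confident on records they were trained on, so an attacker can guess membership by thresholding the model's confidence. All numbers and the threshold below are synthetic placeholders, not output from any real model or from ML Privacy Meter itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained model's confidence scores: artificially
# higher on "member" (training-set) records than on unseen ones --
# the overfitting signal membership inference attacks exploit.
member_conf = rng.normal(loc=0.9, scale=0.05, size=1000).clip(0, 1)
nonmember_conf = rng.normal(loc=0.7, scale=0.15, size=1000).clip(0, 1)

def infer_membership(confidence, threshold=0.85):
    """Guess 'member' whenever model confidence exceeds a threshold."""
    return confidence > threshold

# Attack accuracy: average of the true-positive and true-negative rates.
tpr = infer_membership(member_conf).mean()
tnr = (~infer_membership(nonmember_conf)).mean()
attack_accuracy = (tpr + tnr) / 2
print(f"attack accuracy: {attack_accuracy:.2f}")
```

An accuracy near 0.5 means the attacker is only guessing; the further above 0.5 it climbs, the more the model leaks about its training data.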

This is a problem when the training data covers the activities of individual users, such as their purchases, their health records, or the locations they travel to.

Inference attacks are difficult to detect because the attacker appears to be a regular user. According to Asst Professor Reza Shokri from the National University of Singapore’s School of Computing (NUS Computing), there are currently no full-fledged, readily available tools to help companies determine whether their AI services are at risk.

Developed over the last three years, ML Privacy Meter simulates attacks on the AI service using a standardized general attack formula. This formula provides a framework for AI algorithms to be systematically tested and quantified against various types of membership inference attacks.

Based on this privacy analysis, the tool produces a scorecard detailing how accurately attackers could identify the records used for training. The scorecard helps organizations pinpoint weak spots in their datasets and shows the effect of techniques they can use to mitigate possible inference attacks.
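A scorecard of this kind might rank individual records by estimated attack success and flag the most exposed ones. The sketch below is a hypothetical illustration of that idea only; the per-record probabilities, the risk threshold, and the output format are all invented for this example and are not ML Privacy Meter's actual API or report format.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-record attack success probabilities, standing in for
# what a privacy-auditing tool might estimate from simulated attacks.
record_ids = np.arange(10)
attack_success = rng.uniform(0.4, 0.95, size=10)

# A simple scorecard: rank records by how reliably an attacker could
# identify them as training-set members, and flag the risky ones.
RISK_THRESHOLD = 0.8  # illustrative cut-off, not a standard value
order = np.argsort(attack_success)[::-1]
n_at_risk = int((attack_success > RISK_THRESHOLD).sum())

print("record  attack-success  at-risk")
for i in order:
    flag = "yes" if attack_success[i] > RISK_THRESHOLD else "no"
    print(f"{record_ids[i]:>6}  {attack_success[i]:>14.2f}  {flag:>7}")
print(f"{n_at_risk} of {len(record_ids)} records flagged")
```

Records flagged this way are the "weak spots": candidates for removal, aggregation, or training with stronger regularization or differential privacy.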

“When building AI systems using sensitive data, organizations should ensure that the data processed in such systems are adequately protected. Our tool can help organizations perform internal privacy risk analysis or audits before deploying an AI system,” said Asst Prof Shokri.

“Data protection regulations such as the General Data Protection Regulation mandate the need to assess the privacy risks to data when using machine learning. Our tool can aid companies in achieving regulatory compliance by generating reports for Data Protection Impact Assessments.”

Asst Prof Shokri and his co-authors had previously presented the theoretical work underpinning this tool at the IEEE Symposium on Security and Privacy.

Image credit: iStockphoto/olando_o