NUS Deep-Learning AI System Now an Apache Top-Level Project

Professor Ooi Beng Chin of NUS Computing, who is also Director of the NUS Smart Systems Institute (standing, third from right), led the NUS team that developed Apache SINGA.

The Apache SINGA open-source project, developed by a team of researchers from the National University of Singapore (NUS), is now a top-level project (TLP) under the Apache Software Foundation.

Designed to support deep learning as well as traditional machine learning models, Apache SINGA provides a flexible architecture for scalable distributed training and can run on a wide range of hardware.

Deep learning on the cheap

Led by Professor Ooi Beng Chin, Apache SINGA was initiated in 2014 by the Database System Research Group at the NUS School of Computing, together with Zhejiang University and NetEase. Its first official release under the Apache Incubator came in October 2015, and the project graduated to a TLP a few weeks ago.

“We saw an increasing demand for deep learning and machine learning platforms in 2012, but there was a lack of efficient distributed platforms. The graduation is a mark of recognition for Apache SINGA, but this is just the beginning,” said Prof Ooi.

Prof Ooi hopes that Apache SINGA can do for deep learning what the Apache HTTP Server did for web servers.

Deep learning is a subset of machine learning that uses artificial neural networks to generate meaningful insights from large amounts of data. While a typical centralized deep learning system would need a supercomputer to process such a vast amount of data, Apache SINGA addresses this with a distributed system that works across a large number of regular computers.

Overcoming technical hurdles

The biggest technical hurdle the team faced when developing Apache SINGA was scalability. Scalability is a measure of how a system performs as more computing resources are added, and it is a key technical challenge for all distributed deep learning systems, including Apache SINGA, Prof Ooi told CDOTrends.

“When we increase the number of machines from 1 to 2 (or from 100 to 200), we expect the processing time to be cut by half. This is very important for processing complex machine learning models and big datasets. Due to the communication overhead between machines, however, we may not be able to achieve half reduction,” explained Prof Ooi.

“To overcome this challenge, we carried out various optimizations on Apache SINGA such as running computation and communication in parallel and avoiding transmitting small data which incurs high overhead. Our system can therefore scale well to hundreds of machines.”
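The trade-off Prof Ooi describes can be illustrated with a toy cost model. The Python sketch below is not Apache SINGA code, and every number in it is made up for illustration. It assumes that compute time divides evenly across machines, that every message pays a fixed latency, and that overlapping lets communication hide behind computation. The names `step_time` and `speedup` are hypothetical.

```python
def step_time(n_machines, n_messages=200, overlap=False,
              compute_s=8.0, latency_s=0.002, payload_s=0.4):
    """Toy estimate of one training step, in seconds (illustrative numbers only)."""
    compute = compute_s / n_machines             # assume work splits evenly
    if n_machines == 1:
        return compute                           # a single machine sends nothing
    comm = n_messages * latency_s + payload_s    # fixed cost per message + transfer
    # With overlap, communication runs alongside computation, so only the
    # longer of the two determines the step time; without it, they add up.
    return max(compute, comm) if overlap else compute + comm


def speedup(n_machines, **kwargs):
    """Speedup relative to a single machine under the same toy model."""
    return step_time(1) / step_time(n_machines, **kwargs)


if __name__ == "__main__":
    for n in (2, 16, 32):
        naive = speedup(n)                              # many small messages, no overlap
        tuned = speedup(n, n_messages=4, overlap=True)  # batched messages, overlapped
        print(f"{n:>2} machines: naive {naive:.2f}x, tuned {tuned:.2f}x")
```

In this model, doubling the machines without any optimization gives less than a 2x speedup because the communication cost does not shrink. Sending fewer, larger messages and overlapping them with computation recovers the ideal figure until communication itself becomes the longer task.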

Practical applications

Apache SINGA currently powers applications across multiple sectors including healthcare, banking and finance, software development and cybersecurity. For example, it is used to power the FoodLG app to identify a dish based on a photo uploaded by an end-user.

Under the hood, Apache SINGA was used to build a deep-learning AI system that determines the nutritional value and calorie count of a dish by analyzing its image. This helps patients and doctors monitor food intake, which can support pre-diabetes prevention, personalized diet planning, and weight loss programs.

According to Prof Ooi, five hospitals in Singapore are currently using different versions of FoodLG to promote healthy living and facilitate disease management for ailments such as diabetes, hypertension and high cholesterol.

Elsewhere, the National University Hospital (NUH) and the Singapore General Hospital are also leveraging Apache SINGA to analyze MRI and X-ray images to improve the identification of health problems.

So what’s next for the Apache SINGA project? The next step is to enhance the system so that non-AI experts can use it, and to streamline it so that it can potentially run on edge devices in tomorrow’s hyperconnected 5G world.