The Chinese government recently said it would invest heavily in artificial intelligence to ensure its companies, government and military dominate the field by 2030. Now there's growing evidence that China may not have that far to go to claim the AI crown.
Perhaps there's no better place to note China's rise in AI than with this year's ImageNet competition, an influential AI contest where teams from across the world compete over which algorithms can best recognize images.
Out of the 27 teams competing, more than half were China-based research teams from universities or companies, and all the top performers were from China. The results were something of a repeat of last year, when Chinese scientists also dominated a field of 84 teams from around the world. To be sure, leading AI players like Google, which won top results in 2014, haven't participated in the last couple of ImageNet competitions. But China's dominance over the last two years of the contest shows just how much serious AI work is coming out of the country these days.
In this year's competition, the top result in the closely watched image classification challenge was an error rate of only 2.25%, from a team called WMW, an improvement on the previous year's 2.99%. WMW's team included two researchers from Beijing-based autonomous vehicle startup Momenta -- Jie Hu and Gang Sun -- as well as Li Shen from the University of Oxford. In an email to Forbes, the researchers said they used a technique called "squeeze and excitation," which enhances useful features and suppresses less useful ones in a convolutional neural network.
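The core idea behind squeeze and excitation can be illustrated in a few lines: pool each channel of a feature map down to a single number ("squeeze"), pass those numbers through two tiny fully connected layers to produce a gate per channel ("excitation"), then rescale each channel by its gate. The sketch below is a minimal, dependency-free illustration of that mechanism, not WMW's actual implementation; the function and parameter names are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_block(feature_maps, w_reduce, w_expand):
    """Toy squeeze-and-excitation over a list of 2-D channel maps.

    feature_maps: list of C channels, each a list of rows of floats.
    w_reduce: C x (C/r) weight matrix squeezing C channels to a bottleneck.
    w_expand: (C/r) x C weight matrix expanding back to one gate per channel.
    """
    # Squeeze: global average pooling collapses each channel to one number.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]

    # Excitation: two small fully connected layers produce per-channel gates.
    bottleneck = [max(0.0, sum(s * w for s, w in zip(squeezed, col)))  # ReLU
                  for col in zip(*w_reduce)]
    gates = [sigmoid(sum(b * w for b, w in zip(bottleneck, col)))
             for col in zip(*w_expand)]

    # Rescale: each channel is multiplied by its learned gate, so informative
    # channels are amplified and less useful ones are damped.
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]
```

In a real network the two weight matrices are learned during training, so the gates come to reflect which channels carry useful signal for the task.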
A big jump over the previous year came in object detection, which refers to a computer's ability to recognize objects and identify them in an image -- there are three apples in the picture and one cat, for example. The winning team, called BDAT, lifted accuracy to 73.1% from last year's 66.3%. The team was made up of eight researchers from China's Nanjing University and two from Imperial College London.
Since starting in 2010, ImageNet's Large Scale Visual Recognition Challenge has emerged as an influential event in the AI research community for tracking the latest advances in image recognition systems. The year 2012, in particular, is regarded as a watershed moment for AI and deep learning, when a team from the University of Toronto made a major breakthrough in image recognition accuracy. Led by PhD student Alex Krizhevsky, the team used a deep neural network to train a model and achieved an image classification error rate of 15% -- a giant leap from the previous year's rate of around 25%. His model, called AlexNet, demonstrated the viability of deep learning systems, whose underlying neural network ideas had been around since at least the 1950s but until then hadn't been taken very seriously. (Krizhevsky and his advisor, AI pioneer Geoffrey Hinton, both now work at Google's AI lab.)
“2012 was really the year when there was a massive breakthrough in accuracy, but it was also a proof of concept for deep learning models, which had been around for decades,” said Olga Russakovsky, a computer science professor at Princeton University and an ImageNet organizer. “It really was the first time these models had been shown to work in the context of large-scale image recognition problems.”
Deep learning techniques have since taken off like wildfire in the AI community as well as at nearly every tech company. These AI systems very loosely resemble how the brain functions -- many neurons networked together with synapses. The systems are trained on massive sets of data and are able to pick out patterns in the data.
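At its smallest scale, that "neurons networked together" idea is just a weighted sum of inputs squashed through an activation function, with the weights nudged by gradient descent until the output matches the training data. The toy below is a single such neuron learning the logical OR pattern; it is a bare-bones illustration of the principle, with all names invented for the example, nothing like the scale of a real deep network.

```python
import math
import random

def neuron(inputs, weights, bias):
    # Weighted sum of inputs (the "synapses"), squashed by a sigmoid.
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.5, epochs=2000):
    # Gradient descent on squared error gradually pulls the weights
    # toward values that reproduce the pattern in the data.
    random.seed(0)
    weights = [random.uniform(-1.0, 1.0) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            out = neuron(x, weights, bias)
            grad = (out - y) * out * (1.0 - out)  # chain rule through sigmoid
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias
```

A deep network stacks millions of these units in layers, but the training loop -- forward pass, compare to the label, adjust weights -- is the same idea.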
Following the 2012 contest, large tech companies like Google and Microsoft began taking part in ImageNet to show off their latest advances in deep learning-based image recognition systems. In 2014, Google entered the competition with a team called GoogLeNet and made a big breakthrough in object detection accuracy: 43.9%, up from the previous year's 22.6%. ImageNet makes for good marketing: In 2013, AI researcher Matt Zeiler launched his startup, Clarifai, after achieving top results at ImageNet in image classification -- cutting the error rate to 12% from Krizhevsky's 15% the year before.
ImageNet's organizers wanted to stop running the classification challenge in 2014 and focus more on object localization and detection, as well as video later on, but the tech industry continued to track classification closely in the years since.
Now, ImageNet is shutting down because of performance saturation in challenges like image classification, said Alex Berg, a computer science professor at the University of North Carolina at Chapel Hill and an ImageNet organizer. "There's not a lot of room on the top," he said.
"I think ImageNet is still making massive progress," added Russakovsky. "But it's healthy for the community to start focusing on perhaps other tasks, challenges or datasets."
Some in the AI community are wondering what research-focused AI challenges might take ImageNet's place. One possible contender Russakovsky points to is the COCO (or Common Objects in Context) contest. Berg is also working on putting together a challenge for image recognition based strictly on real-world data from smartphone cameras. One contest, called WebVision, requires teams to train their models on images culled from the internet that, unlike ImageNet's dataset, haven't been exhaustively labeled.
The results of the WebVision challenge were recently announced, and the top performer was Shenzhen-based Malong Technologies, maker of AI developer tools for image recognition tasks. Malong achieved a 94.78% accuracy rate in classifying the web images. Malong is a private business, but it opened a joint AI research lab with Tsinghua University with official sponsorship from the Shenzhen government, which is offering $1 million to AI efforts launched there.
"AI is so fierce now, you need any competitive advantage you can get," said Matt Scott, cofounder and chief technology officer at Malong. "Government support is one of the very helpful things going on in China."