A Chinese language model performs better than OpenAI’s GPT-3 and Google’s PaLM. Huawei shows an alternative to Codex.
Large AI models for language, code and images play a central role in the current spread of artificial intelligence. Researchers at Stanford University have even proposed calling them “foundation models”. The pioneer in the development of very large AI models is the American AI company OpenAI, whose GPT-3 language model was the first to demonstrate the usefulness of such systems.
In addition to handling many text tasks, GPT-3 also showed rudimentary coding capabilities. OpenAI then leveraged its close collaboration with Microsoft to train Codex, a large code model, on GitHub data. Codex also serves as the basis for GitHub’s Copilot.
Chinese AI companies are developing powerful alternatives to Western models
The list of large language models from Western companies and institutions is now long.
In addition to GPT-3, there is PaLM from Google, Jurassic-1 from AI21 Labs, the OPT models from Meta, BLOOM from BigScience and Luminous from Aleph Alpha, for example. Code models are also available from Google, Amazon, Deepmind, and Salesforce. However, these models are mostly trained on Western data and are therefore poorly suited for use in China, assuming access is possible or permitted at all.
Chinese companies and research institutes therefore began building their own alternatives, at the latest once GPT-3 was presented. In 2021, for example, Huawei introduced PanGu-Alpha, a 200-billion-parameter language model trained on 1.1 terabytes of Chinese-language data. The Beijing Academy of Artificial Intelligence (BAAI) unveiled Wu Dao 2.0, a 1.75-trillion-parameter multimodal model, in the same year.
GLM-130B language model outperforms GPT-3
Now, researchers from China’s Tsinghua University have unveiled GLM-130B, a bilingual language model that outperforms Meta’s OPT, BLOOM, and OpenAI’s GPT-3, according to the team’s benchmarks. The model’s few-shot performance in Chinese and English surpassed that of the previous top-performing GPT-3 model on the Massive Multitask Language Understanding (MMLU) benchmark.
The team also tested GLM-130B on LAMBADA, a zero-shot benchmark for predicting the last word of a word sequence. The benchmark is used to evaluate the language modeling capabilities of large language models.
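The idea behind a last-word benchmark of this kind can be illustrated with a minimal sketch. It is not the official LAMBADA evaluation code: the function names, the toy predictor and the two example passages are hypothetical, and real evaluations use the benchmark's own passages and tokenization.

```python
from typing import Callable, Iterable


def last_word_accuracy(passages: Iterable[str], predict: Callable[[str], str]) -> float:
    """Fraction of passages whose final word the predictor gets exactly right."""
    correct = 0
    total = 0
    for passage in passages:
        # Split off the final word; everything before it is the context.
        context, _, target = passage.strip().rpartition(" ")
        if not context:
            continue  # skip passages with fewer than two words
        total += 1
        if predict(context).strip() == target:
            correct += 1
    return correct / total if total else 0.0


# Hypothetical stand-in for a language model: it always guesses "mat".
def toy_predict(context: str) -> str:
    return "mat"


# Made-up passages, used only to show the call.
examples = ["the cat sat on the mat", "the dog slept on the rug"]
print(last_word_accuracy(examples, toy_predict))  # 0.5
```

In a real run, the predictor would be the language model itself, which sees only the context and must produce the final word without any task-specific examples.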
On LAMBADA, the Chinese model outperformed even the previous leader PaLM, despite having 410 billion fewer parameters. For the training, the team relied on a method developed at Tsinghua University (GLM), as well as 400 Nvidia A100 GPUs.
This is the first time that a large Chinese language model has surpassed Western models. GLM-130B is available on GitHub and Hugging Face.
Code Model PanGu-Coder Reaches Codex Performance
As a logical continuation of PanGu, Huawei’s Noah’s Ark Lab and Huawei Cloud also recently introduced a Chinese alternative to Copilot, Codex and other code models. PanGu-Coder completes code much like its Western counterparts and builds on the work done with PanGu. Like Codex, PanGu-Coder follows a training method similar to that of language models. The main difference is the training data: code instead of text.
PanGu-Coder comes in several sizes, ranging from 317 million to 2.6 billion parameters. According to Huawei, the Chinese models are on par with Codex, AlphaCode and other alternatives on the HumanEval benchmark, and in some cases surpass them. The company also presents a variant fine-tuned on a curated dataset (PanGu-Coder-FT), which performs slightly better still.
PanGu-Coder arrives just under a year after the release of OpenAI’s Codex. Huawei thus follows the pattern set by PanGu-Alpha, which was also released a little less than a year after GPT-3.