
The Japanese supercomputer Fugaku was tasked with digesting Japanese text to develop a Japanese version of ChatGPT. Credit: Kyodo News via Getty
Japan is building its own versions of ChatGPT – the artificial intelligence (AI) chatbot made by US firm OpenAI that caused a worldwide stir after its unveiling a year ago.
The Japanese government and big technology firms such as NEC, Fujitsu and SoftBank are sinking millions of dollars into building AI systems based on the same underlying technology, known as large language models (LLMs), but trained on Japanese text rather than on translations from English.
“Current public LLMs, such as GPT, excel in English, but fall short in Japanese due to differences in the alphabet system, limited data and other factors,” says Keisuke Sakaguchi, a researcher at Tohoku University in Japan who specializes in natural language processing.
English bias
LLMs typically use huge amounts of data from publicly available sources to learn the patterns of natural speech and prose. They are trained to predict the next word on the basis of the previous words in a piece of text. The model behind ChatGPT's predecessor, GPT-3, was trained mostly on English-language text.
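The next-word-prediction objective described above can be sketched in miniature. This toy bigram counter is purely illustrative (the corpus and function names are invented, not from the article); real LLMs learn the same objective with neural networks over vastly larger corpora.

```python
from collections import Counter, defaultdict

# Toy sketch of next-word prediction: count which word follows which
# in a tiny corpus, then predict the most frequent follower.
corpus = "the cat sat on the mat the cat ate".split()

follower_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follower_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return follower_counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" (follows "the" twice, vs "mat" once)
```

A model trained mostly on English text has rich follower statistics for English words but sparse ones for Japanese, which is the crux of the bias described here.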
ChatGPT’s extraordinary ability to hold human-like conversations has delighted and concerned researchers. Some see it as a labor-saving tool; others worry that it could be used to falsify scientific papers or data.
In Japan, there are concerns that AI systems trained on data sets in other languages cannot grasp the intricacies of Japan’s language and culture. Sentence structure in Japanese is completely different from that of English, so ChatGPT must effectively translate a Japanese query into English, find the answer and then translate the response back into Japanese.
Whereas English has just 26 letters, written Japanese has two sets of 48 basic characters, plus 2,136 regularly used Chinese characters, or kanji. Most kanji have two or more pronunciations, and a further 50,000 or so rarely used kanji exist. Given that complexity, it is not surprising that ChatGPT can stumble with the language.
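The character-set gap matters at the level a model actually reads text. Byte-level tokenizers of the kind used by GPT models see each kana or kanji as three UTF-8 bytes, whereas each English letter is one byte. A minimal sketch (the example strings are illustrative, not from the article):

```python
# English vs Japanese as a byte-level tokenizer sees them.
english = "hello"
japanese = "こんにちは"  # five hiragana characters, a common greeting

# Both strings are five characters long...
assert len(english) == len(japanese) == 5

# ...but UTF-8 encodes each kana as 3 bytes, so the model must learn
# to merge three times as many raw symbols per character, from far
# less Japanese training data.
print(len(english.encode("utf-8")))   # 5 bytes
print(len(japanese.encode("utf-8")))  # 15 bytes
```

Fewer training examples per raw symbol is one reason a mostly English-trained model can emit the rare or malformed characters Sakaguchi describes below.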
In Japanese, ChatGPT “sometimes produces very rare characters that most people have never seen before, and strange unknown words result”, says Sakaguchi.
Cultural norms
For an LLM to be useful and commercially viable, it must accurately reflect cultural practices as well as language. If ChatGPT is asked to write a job-application e-mail in Japanese, for instance, it may omit the standard expressions of politeness and read like an obvious translation from English.
To measure how sensitive LLMs are to Japanese culture, a group of researchers launched Rakuda, a ranking of how well LLMs can answer open-ended questions on Japanese topics. Rakuda co-founder Sam Passaglia and his colleagues asked GPT-4 to compare the fluency and cultural appropriateness of answers to standard prompts and used it to rank the results, an approach supported by a preprint published in June showing that GPT-4’s judgements agreed with those of human reviewers 87% of the time1. The best open-source Japanese LLM ranks fourth on Rakuda; in first place, perhaps unsurprisingly given that it is also the competition’s judge, is GPT-4.
“Of course the Japanese LLMs are getting better, but they are far behind GPT-4,” says Passaglia, a physicist at the University of Tokyo who studies Japanese language models. But there is no reason, in principle, that a Japanese LLM could not equal or surpass GPT-4 in future, he says. “It’s not technically insurmountable, but just a question of resources.”
One major effort to create a Japanese LLM is using the Japanese supercomputer Fugaku, one of the fastest in the world, to train a model mainly on Japanese-language input. With support from the Tokyo Institute of Technology, Tohoku University, Fujitsu and the government-funded research center RIKEN, the resulting LLM is expected to be released next year. Unlike GPT-4 and other proprietary models, it will join the ranks of open-source LLMs, with its code made available to all users. According to Sakaguchi, who is involved in the project, the team hopes to give it at least 30 billion parameters, the values that influence a model's output and that serve as a rough measure of its size.
However, the Fugaku LLM will probably be succeeded by an even larger one. Japan’s Ministry of Education, Culture, Sports, Science and Technology is funding the creation of a Japanese AI program tailored to scientific needs that will generate scientific hypotheses by learning from published research, accelerating the identification of targets for investigation. The model could start at 100 billion parameters, slightly more than half the size of GPT-3, and would expand over time.
“We hope to dramatically accelerate the scientific research cycle and expand the discovery space,” says Makoto Taiji, deputy director of the RIKEN Center for Biosystems Dynamics Research, of the project. The LLM could cost at least ¥30 billion (US$204 million) to develop and is expected to be publicly released in 2031.
Expanding capabilities
Other Japanese companies are already commercializing, or planning to commercialize, their own LLM technologies. Supercomputer maker NEC began using its generative AI based on the Japanese language in May, and says that it has cut the time needed to create internal reports by 50% and internal software source code by 80%. In July, the company began offering customizable generative AI services to customers.
Masafumi Oyamada, senior principal researcher at NEC Data Science Laboratories, says the technology can be used “in a wide range of industries, such as finance, transportation and logistics, distribution and manufacturing”. He adds that researchers are also putting it to work writing code, helping to write and edit papers, and surveying existing published papers.
Japanese telecommunications firm SoftBank, meanwhile, is investing about ¥20 billion in generative AI trained on Japanese text and plans to launch its own LLM next year. SoftBank, which has 40 million customers and a partnership with Microsoft, an investor in OpenAI, says it aims to help companies digitize their businesses and increase productivity. SoftBank expects its LLM to be used by universities, research institutions and other organizations.
Meanwhile, Japanese researchers hope that accurate, effective and made-in-Japan AI chatbots can help accelerate science and bridge the gap between Japan and the rest of the world.
“If a Japanese version of ChatGPT can be made accurate, it will bring better results for people who want to learn Japanese or do research on Japan,” says Shotaro Kinoshita, a researcher in medical technology at the Keio University School of Medicine in Tokyo. “As a result, there may be a positive impact on international joint research.”