
Why Meta’s Big Language Model Doesn’t Work for Researchers



When Alan Turing devised the Turing test in 1950, it was a test of a machine’s ability to exhibit intelligent behavior indistinguishable from that of a human. Turing proposed that a computer could be said to possess artificial intelligence (AI) if its answers to questions could pass for a human’s.

Thanks to large language models, we’ve now reached the point where computers can write text on just about any subject we give them, and for the most part it’s very convincing and human-like.

Ask one to write a short piece on “Why does Elon Musk like to knit?” and what it yields is arguably as good as what a human could write:

Some possible reasons why Elon Musk might enjoy knitting could include the fact that it is a relaxing and meditative activity that can help to clear one's mind, and it also allows for a great deal of creativity and self-expression.
Additionally, knitting can be a very social activity, and Elon Musk may enjoy the opportunity to chat and connect with other knitters.

[Source: OpenAI Playground using text-davinci-002 model]

Summarizing complex text

Examples like these are fun, but the greater value proposition of using large language models is less about writing whimsical prose and more about summarizing complex text. These use cases are compelling in every sector. For example, AI can distill information about potential prospects for sales intelligence, or it can summarize investment documents in finance.


What I’m particularly interested in, however, is summarizing scientific papers for researchers.

The problem is the sheer abundance of research being published around the world: more than 4.2 million scientific research papers were published in 2020. Even within specialized fields, hundreds of papers appear every year. How can researchers keep up with it all while conducting their own research? A paper’s abstract hints at only a fraction of the research detail inside.

When Meta recently open-sourced its language model, OPT-175B, it sounded promising for academic researchers. It is said to offer better performance than OpenAI’s GPT-3 while requiring only 15% of GPT-3’s compute resources to train.

Putting models to the test

So how does Meta’s model perform on scientific research?

We tested it first on a technical press release, and this was the summary we received:

​​Forward Looking Statements, which speak only as of the date of this press release. Artelo undertakes no obligation to publicly update any forward-looking statement, whether as a result of new information, future events or otherwise.
Investor Relations Contact:
Westwicke Partners on behalf of Artelo Biosciences
Mike Piccinino, CFA

It’s not a good start. The model focuses on the investor-relations boilerplate at the end of the press release and decides that this is the most important information for the reader. You could argue it’s important, but it’s not what we’re looking for. The investor contact isn’t even correct: the model fabricated it!

Then we tried it on a paper from the Journal of Electronic Materials:

Journal of ELECTRONIC MATERIALS, Vol. 38, No. 7, 2009 DOI: 10.1007/s11664-009-0701-y (cid:1) 2009 TMS Special Issue Paper thermally. The samples were then pulverized and analyzed by XRD, TGA, and XPS. The XRD patterns showed that CaCo2O4+d crystallizes as a calcium-ferrite-type structure, which consists of a corner- and edge- shared CoO6 octahedron network including one-dimensional double chains. The CaCo2O4+d phase has a finite density of states at the Fermi level. The XPS results showed that CaCo2O4+d has a composition of CaCo2O4+d = (Ca2CoO3)0.62CoO2. The XPS results also showed that CaCo2O4+d has

[Source: Meta OPT-175B]

At first glance, it seems to contain important information, but there is clearly redundant material, such as the paper’s bibliographic details, which add nothing to a summary. I would more accurately describe the result as a paraphrase of part of the text than as a summary of all its aspects.

Small-scale models perform better

So how does a smaller language model compare? Drawing on our experience in the field, we designed our Iris.ai IPG model with only 25 million parameters (as opposed to 175 billion for Meta’s and OpenAI’s models) but trained it on millions of domain-specific scientific papers. Our research has shown that this model performs very differently on the same paper:

Metallic temperature dependence of the seebeck coefficient s with a large thermoelectric power (s=151lv/kat387k) has a finite density of states at the fermi level. In this paper, we report the crystal structure and physical properties of caco2o4+d. We find a new compound caco2o4+d, which exhibits a large thermoelectric power, even though it has a finite density of states at the fermi level. Motivated by the simple guiding principle mentioned previously, we searched for new phases thermoelectric properties related as a thermoelectric material applicable to high-temperature use.

[Source: Iris.ai IPG]

You can see that the sentence structure is a little simpler than the large language model’s, but the information is much more relevant. What’s more, the computational cost of generating that summary is less than $0.23. Doing the same on OPT-175B would cost about $180.
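As a rough sanity check on that cost gap, here is a back-of-the-envelope sketch (not a benchmark). It uses the standard approximation that a dense transformer spends about 2 × parameter-count FLOPs per generated token, plus the per-summary prices quoted above; the variable names are my own:

```python
# Back-of-the-envelope comparison of inference compute for two model sizes.
# A dense transformer needs roughly 2 * parameter_count FLOPs per generated token.
def flops_per_token(params: int) -> int:
    return 2 * params

opt_175b = 175_000_000_000   # Meta OPT-175B
iris_ipg = 25_000_000        # Iris.ai IPG (per the figures above)

compute_ratio = flops_per_token(opt_175b) / flops_per_token(iris_ipg)
print(f"OPT-175B needs ~{compute_ratio:,.0f}x the compute per token")  # ~7,000x

# The observed price gap is smaller than the raw FLOP gap, because billed
# cost also reflects hardware utilization, batching and provider pricing:
price_ratio = 180 / 0.23
print(f"Observed cost ratio: ~{price_ratio:,.0f}x")  # ~783x
```

The two ratios differ by an order of magnitude, which is a reminder that parameter count drives, but does not solely determine, the cost of running a model.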

The container ships of AI models

You would expect that large language models with enormous computing power, such as OPT-175B, would be able to process the same information faster and with higher quality. But where the model falls down is in specific domain knowledge. It doesn’t understand the structure of a research paper, it doesn’t know what information is important, and it doesn’t understand chemical formulas. It’s not the model’s fault – it’s just not trained on this information.

The solution, then, is simply to train the GPT model on materials science papers, right?

To a certain extent, yes. If we can train a GPT model on materials science papers, it will do a good job of summarizing them. But large language models are, by nature, large. They are the proverbial container ships of AI models: it is very difficult to change their direction. Evolving the model through reinforcement learning would require hundreds of thousands of materials science papers, and that is the problem: this volume of papers simply does not exist to train the model on. Yes, data can be fabricated (as is often done in AI), but that reduces the quality of the outputs, since GPT’s power comes from the variety of data it is trained on.

A revolution in the ‘how’

This is why smaller language models work better. Natural language processing (NLP) has been around for years, and while GPT models have made headlines, the sophistication of smaller NLP models is constantly improving.
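To illustrate how far even very lightweight NLP techniques can go, here is a minimal extractive summarizer based on classic word-frequency sentence scoring. To be clear, this is a generic baseline of my own, not the Iris.ai IPG model or any GPT variant; a trained abstractive model adds paraphrase and domain understanding on top of this kind of salience ranking:

```python
import re
from collections import Counter

def extractive_summary(text: str, n_sentences: int = 2) -> str:
    """Score each sentence by the corpus frequency of its words and
    return the top-scoring sentences in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    # Rank sentence indices by their total word-frequency score.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(ranked[:n_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)
```

Frequency scoring surfaces the sentences that carry the document’s dominant vocabulary at negligible compute cost, which is one reason compact, specialized approaches remain competitive for summarization.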

After all, a model trained on 175 billion parameters will always be difficult to handle, but a model that uses 30 to 40 million parameters is much more agile for domain-specific text. The added benefit is that it will use less computing power, so it also costs a lot less to run.

From the point of view of scientific research, which is what interests me most, AI will expand what researchers can do, in both academia and industry. The current pace of publishing produces an unmanageable volume of research, consuming academics’ time and companies’ resources.

The way we designed Iris.ai’s IPG model reflects my belief that certain models not only have the potential to revolutionize what we study or how quickly we study it, but also how we approach different disciplines of scientific research as a whole. They give talented minds significantly more time and resources to collaborate and generate value.

This potential for any researcher to harness the world’s research is what drives me forward.

Victor Botev is the CTO at Iris AI.
