IBM Research helps extend PyTorch to enable open-source cloud-native machine learning


Foundation models have the potential to change the way organizations build and train artificial intelligence (AI) with machine learning (ML).

A major challenge in building foundation models is that, until now, they have generally required specific types of networking and infrastructure hardware to run efficiently. There has also been limited support for developers who want to build a foundation model with a fully open-source stack. It’s a challenge that IBM Research is trying to help solve in several ways.

“Our question was: can we train foundation models, but in such a way that we do it on basic hardware? And make it more accessible instead of just being in the hands of a select few researchers,” Raghu Ganti, principal research associate at IBM, told VentureBeat.


To that end, IBM announced today that it has developed and contributed code to the open-source PyTorch machine learning project to make the technology work more efficiently with standard Ethernet-based networks. IBM has also built an open source operator that helps optimize PyTorch deployment on the Red Hat OpenShift platform, which is based on the open source Kubernetes cloud container orchestration project.

To InfiniBand and beyond: how IBM helped extend PyTorch

To date, many foundation models have been trained on hardware that supports the InfiniBand networking stack, which is typically found only on high-performance computing (HPC) hardware.

While GPUs are the foundation of AI, there is a need for powerful networking technology to connect multiple GPUs together. Ganti explained that it is possible to train large models without InfiniBand networks, but it is inefficient in a number of ways.

For example, he said that with stock PyTorch, training a model with 11 billion parameters over an Ethernet-based network achieves only 20% GPU efficiency. Improving that efficiency is what IBM worked on alongside the PyTorch community.

“This is a very complex problem and there are a lot of knobs to tune,” said Ganti.

The knobs that need tweaking are all about ensuring optimized GPU and network usage. Ganti said the goal is to keep both the network and GPU busy at the same time to speed up the overall training process.
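The goal of keeping the network and the GPU busy at the same time can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not IBM's or PyTorch's implementation: the function names and the per-step timings are hypothetical, and real training steps overlap only partially.

```python
def step_time_sequential(compute_s: float, comm_s: float) -> float:
    """GPU idles while the network transfers: costs add up."""
    return compute_s + comm_s


def step_time_overlapped(compute_s: float, comm_s: float) -> float:
    """Ideal overlap: the shorter phase hides behind the longer one."""
    return max(compute_s, comm_s)


def gpu_utilization(compute_s: float, step_s: float) -> float:
    """Fraction of a training step the GPU spends on useful compute."""
    return compute_s / step_s


# Hypothetical timings per step: 1 s of GPU compute, 4 s of communication.
compute_s, comm_s = 1.0, 4.0

sequential = step_time_sequential(compute_s, comm_s)
overlapped = step_time_overlapped(compute_s, comm_s)

print(f"sequential: {gpu_utilization(compute_s, sequential):.0%} GPU utilization")
print(f"overlapped: {gpu_utilization(compute_s, overlapped):.0%} GPU utilization")
```

In this toy model, overlap helps only as far as communication time allows. When the network phase dominates, shrinking or better scheduling the communication matters as much as overlapping it, which is consistent with there being many knobs to tune.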

The code to optimize PyTorch to work better over Ethernet was merged into the PyTorch 1.13 update that became generally available on October 28.

“We were able to go from 20% GPU usage all the way to 90%, and that’s a 4.5x improvement in terms of training speeds,” said Ganti.
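The 4.5x figure follows directly from the two utilization numbers, assuming training throughput scales in proportion to GPU utilization on the same hardware. The helper name below is hypothetical and the proportionality is an assumption of this sketch:

```python
def speedup_from_utilization(before_pct: float, after_pct: float) -> float:
    """Speedup if throughput is proportional to GPU utilization (an assumption)."""
    return after_pct / before_pct


# Figures quoted in the article: 20% before, 90% after.
speedup = speedup_from_utilization(20, 90)
print(f"{speedup:.1f}x")  # 4.5x
```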

Shifting PyTorch into high gear for faster training

In addition to the code improvements in PyTorch, IBM has also been working with the Red Hat OpenShift Kubernetes platform to support foundation model development.

Ganti said part of what they’ve been doing is making sure that the maximum bandwidth the Ethernet network can provide is reflected at the pod level in OpenShift.

Using Kubernetes to train foundation models is not a new idea. OpenAI, the organization behind some of the most widely used models, including GPT-3 and DALL-E, has publicly discussed how it uses Kubernetes. What is new, according to IBM, is that the technology for this is available as open source. IBM has open sourced a Kubernetes operator that provides the necessary configuration to help organizations scale a cluster to support large model training.

With the PyTorch Foundation, more open-source innovation is now possible

Until September, PyTorch was operated as an open-source project managed by Meta. That changed on September 12, when the PyTorch Foundation was announced as a new organizing body led by the Linux Foundation.

Ganti said IBM’s effort to contribute code to PyTorch actually started before the announcement of the new PyTorch Foundation. He explained that under Meta’s administration, IBM could not actually directly commit code to the project. Instead, the code had to be committed by Meta staffers who had commit access.

Ganti expects PyTorch to become more collaborative and open under the leadership of the Linux Foundation. “I think [the PyTorch Foundation] will improve open-source collaboration,” said Ganti.

Shreya Christina
Shreya has been with for 3 years, writing copy for client websites, blog posts, EDMs and other mediums to engage readers and encourage action. By collaborating with clients, our SEO manager and the wider team, Shreya seeks to understand an audience before creating memorable, persuasive copy.
