29 May 2023 to 2 June 2023
Pollica Physics Center
Europe/Rome timezone

The math of training large neural networks, with some analogy to physics

Not scheduled
Pollica Physics Center

Castello dei Principi Capano, Pollica, Salerno, Italy

Speaker

Greg Yang (Microsoft Research)

Description

Recently, the theory of infinite-width neural networks led to the first technology, muTransfer, for tuning enormous neural networks that are too expensive to train more than once. For example, this allowed us to tune the 6.7 billion parameter version of GPT-3 using only 7% of its pretraining compute budget, and, with some asterisks, it achieved performance comparable to the original GPT-3 model with twice the parameter count. In this talk, I will explain the core insight behind this theory. In fact, this is an instance of what I call the Optimal Scaling Thesis, which connects infinite-size limits for general notions of "size" to the optimal design of large models in practice. I'll end with several concrete key mathematical research questions whose resolutions will have incredible impact on the future of AI.
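The core mechanism behind muTransfer is that, in the right parametrization, a network's optimal hyperparameters become approximately independent of its width, so they can be tuned on a small proxy model and reused directly at full scale. Below is a minimal, hypothetical sketch of that scaling step; the function name, the specific 1/width factors, and the Adam-style per-layer rules are assumptions for illustration only, not the speaker's implementation or the actual mup library API.

```python
# Hypothetical sketch of muTransfer-style hyperparameter scaling.
# Idea: tune a narrow "proxy" model, then map its hyperparameters to a
# wider target model. The exact per-layer rules depend on the optimizer;
# the 1/width factors below follow commonly cited Adam-style rules and
# are meant only as an illustration of the scaling pattern.

def mu_transfer(base_width, target_width, base_lr, base_init_std):
    """Scale proxy-model hyperparameters to a wider target model."""
    m = target_width / base_width  # width multiplier
    return {
        # input-layer weights: learning rate and init kept fixed
        "input_lr": base_lr,
        "input_init_std": base_init_std,
        # hidden-layer weights: init std shrinks like 1/sqrt(m),
        # learning rate shrinks like 1/m (assumed Adam-style rule)
        "hidden_lr": base_lr / m,
        "hidden_init_std": base_init_std / m ** 0.5,
        # output-layer weights: learning rate and forward multiplier
        # shrink like 1/m (assumed Adam-style rule)
        "output_lr": base_lr / m,
        "output_multiplier": 1.0 / m,
    }


if __name__ == "__main__":
    # Tune at width 256, then read off settings for a width-8192 model.
    proxy = mu_transfer(256, 256, base_lr=1e-3, base_init_std=0.02)
    target = mu_transfer(256, 8192, base_lr=1e-3, base_init_std=0.02)
    print("proxy :", proxy)
    print("target:", target)
```

The point of the sketch is only the shape of the procedure: hyperparameters are searched once at small width, and the parametrization (not a new search) determines how they carry over as the width multiplier grows.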

Presentation materials

No materials available yet