LW - AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0 by James Fox

9:55
 
分享
 

Fetch error

Hmmm there seems to be a problem fetching this series right now. Last successful fetch was on September 22, 2024 16:12 (13d ago)

What now? This series will be checked again in the next day. If you believe it should be working, please verify the publisher's feed link below is valid and includes actual episode links. You can contact support to request the feed be immediately fetched.

Manage episode 427671714 series 3337129
内容由The Nonlinear Fund提供。所有播客内容(包括剧集、图形和播客描述)均由 The Nonlinear Fund 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0, published by James Fox on July 7, 2024 on LessWrong.
TL;DR
We are excited to announce the fourth iteration of ARENA (Alignment Research Engineer Accelerator), a 4-5 week ML bootcamp with a focus on AI safety! ARENA's mission is to provide talented individuals with the skills, tools, and environment necessary for upskilling in ML engineering, for the purpose of contributing directly to AI alignment in technical roles.
ARENA will be running in-person from LISA from 2nd September - 4th October (the first week is an optional review of the fundamentals of neural networks).
Apply here before 23:59 July 20th, anywhere on Earth!
Summary
ARENA has been successfully run three times, with alumni going on to become MATS scholars and LASR participants; AI safety engineers at Apollo Research, Anthropic, METR, and OpenAI; and even starting their own AI safety organisations!
This iteration will run from 2nd September - 4th October (the first week is an optional review of the fundamentals of neural networks) at the London Initiative for Safe AI (LISA) in Old Street, London. LISA houses small organisations (e.g., Apollo Research, BlueDot Impact), several other AI safety researcher development programmes (e.g., LASR Labs, MATS extension, PIBBS, Pivotal), and many individual researchers (independent and externally affiliated).
Being situated at LISA, therefore, brings several benefits, e.g. facilitating productive discussions about AI safety & different agendas, allowing participants to form a better picture of what working on AI safety can look like in practice, and offering chances for research collaborations post-ARENA.
The main goals of ARENA are to:
Help participants skill up in ML relevant for AI alignment.
Produce researchers and engineers who want to work in alignment and help them make concrete next career steps.
Help participants develop inside views about AI safety and the paths to impact of different agendas.
The programme's structure will remain broadly the same as ARENA 3.0 (see below); however, we are also adding an additional week on evaluations.
For more information, see our website.
Also, note that we have a Slack group designed to support the independent study of the material (join link here).
Outline of Content
The 4-5 week program will be structured as follows:
Chapter 0 - Fundamentals
Before getting into more advanced topics, we first cover the basics of deep learning, including basic machine learning terminology, what neural networks are, and how to train them. We will also cover some subjects we expect to be useful going forward, e.g. using GPT-3 and 4 to streamline your learning, good coding practices, and version control.
Note: Participants can optionally skip this first week and join us at the start of Chapter 1, provided they'd prefer to and we're confident they are already comfortable with the material in this chapter.
Topics include:
PyTorch basics
CNNs, Residual Neural Networks
Optimisation (SGD, Adam, etc.)
Backpropagation
Hyperparameter search with Weights & Biases
GANs & VAEs
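To give a flavour of the Chapter 0 material, here is a minimal sketch (our own illustrative example, not taken from the course) of the forward / backward / optimiser-step loop that sits underneath everything else in the programme:

```python
import torch
import torch.nn as nn

# Toy example: fit a small MLP to y = 2x + 1 with Adam, illustrating
# the basic training loop (forward pass, backpropagation, update).
torch.manual_seed(0)
x = torch.linspace(-1, 1, 64).unsqueeze(1)   # shape (64, 1)
y = 2 * x + 1

model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(500):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(x), y)    # forward pass
    loss.backward()                # backpropagation computes gradients
    optimizer.step()               # Adam parameter update

print(f"final loss: {loss.item():.4f}")
```

Swapping `torch.optim.Adam` for `torch.optim.SGD` is a one-line change, which is roughly the level at which the chapter compares optimisers before digging into how they actually work.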
Chapter 1 - Transformers & Interpretability
In this chapter, you will learn all about transformers, and build and train your own. You'll also study LLM interpretability, a field advanced by Anthropic's Transformer Circuits sequence and by Neel Nanda's open-source work. This chapter will also branch into areas more accurately classed as "model internals" than interpretability, e.g. recent work on steering vectors.
Topics include:
GPT models (building your own GPT-2)
Training and sampling from transformers
TransformerLens
In-context Learning and Induction Heads
Indirect Object Identification
Superposition
Steering Vectors
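As a taste of what "building your own GPT" involves, the sketch below (our own illustration, with made-up dimensions) implements the core operation of Chapter 1: a single causal self-attention head, whose attention pattern is exactly the object that tools like TransformerLens let you inspect:

```python
import torch
import torch.nn.functional as F

# One causal self-attention head over a toy sequence.
torch.manual_seed(0)
seq_len, d_model, d_head = 5, 8, 4
x = torch.randn(seq_len, d_model)        # residual-stream vectors per token

W_Q = torch.randn(d_model, d_head)
W_K = torch.randn(d_model, d_head)
W_V = torch.randn(d_model, d_head)

q, k, v = x @ W_Q, x @ W_K, x @ W_V
scores = q @ k.T / d_head ** 0.5         # scaled dot-product attention
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))  # causal mask: no peeking ahead
pattern = F.softmax(scores, dim=-1)      # attention pattern; each row sums to 1
out = pattern @ v                        # weighted mix of value vectors

print(pattern.shape, out.shape)
```

Induction heads, Indirect Object Identification, and steering vectors are all studied by looking at (or intervening on) quantities like `pattern` and `out` inside a trained model.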
Chapter 2 - Reinforcement Learning
In this chapter, you w...