Readings
Italicized resources are required; all other resources are suggested.
Introduction
Deep Learning Review
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Dropout: A Simple Way to Prevent Neural Networks from Overfitting
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
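
The core idea of the Dropout paper fits in a few lines. Below is a minimal numpy sketch of the "inverted" dropout variant common in modern code; the paper itself instead rescales weights at test time, which is equivalent in expectation. The drop probability and toy activations are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, p_drop=0.5, train=True):
    """Inverted dropout: zero each unit with probability p_drop at train
    time and rescale survivors by 1/(1 - p_drop), so the test-time
    forward pass needs no change."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p_drop      # keep with probability 1 - p_drop
    return x * mask / (1.0 - p_drop)

activations = rng.standard_normal((4, 8))     # toy hidden-layer activations
train_out = dropout_forward(activations, p_drop=0.5, train=True)
test_out = dropout_forward(activations, train=False)
print(train_out.mean(), test_out.mean())      # similar scale in expectation
```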
Risk Analysis
Robustness
Adversarial Robustness
Universal and Transferable Adversarial Attacks on Aligned Language Models
Towards Deep Learning Models Resistant to Adversarial Attacks
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (website)
Adversarial Examples Are a Natural Consequence of Test Error in Noise
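
For concreteness, here is a hedged PyTorch sketch of the projected gradient descent (PGD) attack from "Towards Deep Learning Models Resistant to Adversarial Attacks". The epsilon, step size, and step count are common CIFAR-10-style defaults, and the toy linear model is a stand-in for a real classifier.

```python
import torch
import torch.nn as nn

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iterated signed-gradient ascent on the loss,
    projected back into the eps-ball around the clean input."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
            x_adv = x_adv.clamp(0, 1)                 # stay a valid image
    return x_adv.detach()

# Toy usage on random 32x32 images.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = pgd_attack(model, x, y)
print((x_adv - x).abs().max())  # <= eps
```

Adversarial training, the paper's defense, then minimizes the training loss on x_adv in place of x.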
Long Tails and Distribution Shift
The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models
Adversarial NLI: A New Benchmark for Natural Language Understanding
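
The distribution-shift benchmarks above share a simple evaluation loop: corrupt or shift the test set, then average accuracy over shift types and severities. The sketch below mirrors that loop with two stand-in corruptions; ImageNet-C itself uses fifteen calibrated corruption types at five severities and reports a normalized mean corruption error (mCE) rather than raw accuracy.

```python
import torch

# Stand-in corruptions; the 0.08 and 0.15 severity scalings are arbitrary.
CORRUPTIONS = {
    "gaussian_noise": lambda x, s: (x + 0.08 * s * torch.randn_like(x)).clamp(0, 1),
    "contrast": lambda x, s: ((x - 0.5) * (1 - 0.15 * s) + 0.5).clamp(0, 1),
}

def corruption_accuracy(model, x, y, severities=range(1, 6)):
    """Mean accuracy over corruption types and severities: a simplified
    counterpart of ImageNet-C's mean corruption error."""
    model.eval()
    accs = []
    with torch.no_grad():
        for corrupt in CORRUPTIONS.values():
            for s in severities:
                pred = model(corrupt(x, s)).argmax(dim=1)
                accs.append((pred == y).float().mean().item())
    return sum(accs) / len(accs)

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(corruption_accuracy(model, x, y))
```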
Monitoring
OOD and Malicious Behavior Detection
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
VOS: Learning What You Don’t Know by Virtual Outlier Synthesis
Scaling Out-of-Distribution Detection for Real-World Settings
A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks
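
Several of these detectors reduce to scoring each input and thresholding. Below is a minimal PyTorch sketch of the maximum softmax probability (MSP) baseline from the first reading, with the max-logit variant from "Scaling Out-of-Distribution Detection for Real-World Settings" as a one-flag change; the threshold itself would be tuned on held-out in-distribution data.

```python
import torch

def ood_scores(model, x, use_logits=False):
    """MSP baseline: in-distribution inputs tend to receive higher
    maximum softmax probabilities than OOD inputs. The max-logit
    variant skips the softmax, which scales better to many classes."""
    with torch.no_grad():
        outputs = model(x)
    scores = outputs if use_logits else torch.softmax(outputs, dim=1)
    return scores.max(dim=1).values  # higher => more likely in-distribution

# Inputs scoring below a threshold chosen on in-distribution validation
# data are flagged as OOD; detectors are typically compared via AUROC.
```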
Interpretable Uncertainty
Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
Posterior calibration and exploratory analysis for natural language processing models
Accurate Uncertainties for Deep Learning Using Calibrated Regression
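
Calibration, the common thread of these readings, is usually measured with expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence to its accuracy. A numpy sketch, assuming the common but arbitrary choice of 15 equal-width bins:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: mean |accuracy - confidence| over equal-width
    confidence bins, weighted by the fraction of samples per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

conf = np.random.default_rng(0).uniform(0.5, 1.0, 1000)  # toy confidences
correct = np.random.default_rng(1).random(1000) < conf   # well-calibrated toy model
print(expected_calibration_error(conf, correct))         # near zero
```

Deep ensembles, per the third reading, improve this metric by averaging the softmax outputs of several independently trained networks.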
Transparency
Representation Engineering: A Top-Down Approach to AI Transparency
Convergent Learning: Do different neural networks learn the same representations?
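
Representation engineering's basic primitive is a "reading vector": a direction in activation space that tracks a concept. The paper extracts directions with PCA over paired activations; the sketch below uses a simpler difference-of-means variant, with random tensors standing in for a real model's hidden states.

```python
import torch

def reading_vector(acts_pos, acts_neg):
    """Difference-of-means direction: points from activations of inputs
    without the concept toward activations of inputs with it."""
    v = acts_pos.mean(dim=0) - acts_neg.mean(dim=0)
    return v / v.norm()

def concept_score(acts, v):
    """Project activations onto the direction; larger scores mean the
    concept is expressed more strongly."""
    return acts @ v

# Toy stand-ins for hidden states from prompts with/without a concept.
acts_pos = torch.randn(100, 64) + 0.5  # concept present (assumed offset)
acts_neg = torch.randn(100, 64)
v = reading_vector(acts_pos, acts_neg)
print(concept_score(acts_pos, v).mean() > concept_score(acts_neg, v).mean())
```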
Trojans
Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs
Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks
STRIP: A Defence Against Trojan Attacks on Deep Neural Networks
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain
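
The attack papers in this section share one mechanism, easiest to see in BadNets: poison a fraction of the training data with a small trigger and an attacker-chosen label. A minimal PyTorch sketch; the patch size, location, and poison fraction here are illustrative.

```python
import torch

def poison_batch(images, labels, target_class=0, poison_frac=0.1):
    """BadNets-style poisoning: stamp a white 3x3 patch in one corner of
    a random fraction of images and relabel them to the target class.
    A model trained on this data learns to map the trigger to
    target_class while behaving normally on clean inputs."""
    images, labels = images.clone(), labels.clone()
    n_poison = int(poison_frac * len(images))
    idx = torch.randperm(len(images))[:n_poison]
    images[idx, :, -3:, -3:] = 1.0    # trigger: white patch, bottom-right
    labels[idx] = target_class
    return images, labels
```

Defenses such as Neural Cleanse and STRIP aim to reconstruct or detect exactly this kind of trigger.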
Detecting and Forecasting Emergent Behavior
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
Control
Power-Seeking
Honest AI
Machine Ethics
What Would Jiminy Cricket Do? Towards Agents That Behave Morally
Ethics Background (Introduction through “Absolute Rights or Prima Facie Duties”)