Readings

Italicized resources are required; all other resources are suggested.

Table of contents

  1. Introduction
    1. Deep Learning Review
  2. Risk Analysis
  3. Robustness
    1. Adversarial Robustness
    2. Long Tails and Distribution Shift
  4. Monitoring
    1. OOD and Malicious Behavior Detection
    2. Interpretable Uncertainty
    3. Transparency
    4. Trojans
    5. Detecting and Forecasting Emergent Behavior
  5. Control
    1. Power-Seeking
    2. Honest AI
    3. Machine Ethics
  6. Systemic Safety
    1. Forecasting
    2. ML for Cyberdefense
    3. Cooperative AI
  7. X-Risk

Introduction

Deep Learning Review

Risk Analysis

Robustness

Adversarial Robustness

Long Tails and Distribution Shift

Monitoring

OOD and Malicious Behavior Detection

Interpretable Uncertainty

Transparency

Trojans

Detecting and Forecasting Emergent Behavior

Control

Power-Seeking

Honest AI

Machine Ethics

Systemic Safety

Forecasting

ML for Cyberdefense

Cooperative AI

X-Risk