Quest | CBMM Seminar Series: Incomplete Objectives and AI Safety: The Theory and Practice of AI Alignment

Dec 4, 2023 - 4:00 pm
Venue: Singleton Auditorium (46-3002)
Speaker: Dylan Hadfield-Menell (CSAIL)

Abstract: For AI systems to be safe and effective, they need to be aligned with the goals and values of users, designers, and society. In this talk, I will discuss the challenges of AI alignment and survey research directions for developing safe AI systems. I will begin with theoretical results that motivate the alignment problem broadly. In particular, I will show how optimizing incomplete goal specifications reliably causes systems to select unhelpful or harmful actions. Next, I will discuss mitigation measures that counteract this failure mode. I will focus on approaches for incorporating human feedback into objectives, interpreting and understanding learned policies, and maintaining uncertainty about intended goals.
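The failure mode described in the abstract can be illustrated with a toy sketch (this example is an assumption for illustration, not material from the talk): true utility depends on two attributes, but the proxy objective measures only one, so an optimizer under a shared budget starves the unmeasured attribute and ends up with lower true utility than a balanced allocation.

```python
# Toy illustration of optimizing an incomplete objective (hypothetical
# example, not from the talk). True utility depends on attributes a and b,
# but the proxy reward omits b. Under a shared resource budget, maximizing
# the proxy drives all resources into the measured attribute.

def true_utility(a, b):
    # Both attributes matter; a simple concave utility.
    return a ** 0.5 + b ** 0.5

def proxy_reward(a, b):
    # Incomplete specification: attribute b is omitted entirely.
    return a ** 0.5

def optimize(reward, budget=10.0, steps=101):
    # Grid search over allocations of the budget between a and b.
    best = None
    for i in range(steps):
        a = budget * i / (steps - 1)
        b = budget - a
        r = reward(a, b)
        if best is None or r > best[0]:
            best = (r, a, b)
    return best[1], best[2]

# A complete objective chooses a balanced allocation:
a_full, b_full = optimize(true_utility)
# The proxy optimizer puts the entire budget into the measured attribute:
a_proxy, b_proxy = optimize(proxy_reward)

print(true_utility(a_full, b_full))    # balanced allocation
print(true_utility(a_proxy, b_proxy))  # strictly worse under the proxy
```

Even in this two-variable example, the proxy-optimal allocation (a=10, b=0) yields lower true utility than the balanced one, mirroring the claim that optimizing an incomplete specification reliably selects unhelpful actions.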

Organizer: Hector Penagos
Organizer Email: cbmm-contact@mit.edu
