Discovering Latent Knowledge in Language Models Without Supervision

Contrast-Consistent Search

Jul 11, 2023

How can I infer the beliefs of a neural network in an unsupervised way?

That question motivated Burns et al. to propose Contrast-Consistent Search (CCS) in their paper "Discovering Latent Knowledge in Language Models Without Supervision", published at ICLR 2023.

An overview of the work 👇

Link to the paper: https://arxiv.org/abs/2212.03827

Samuel Albanie
