Automated Alignment Researchers: Using large language models to scale scalable oversight
As large language models (LLMs) continue to improve at an unprecedented rate, alignment research faces two critical questions. The first concerns how alignment can keep pace with the rapid advancements in AI capabilities….
