Import AI 455: Automating AI Research
Welcome to Import AI, a newsletter dedicated to exploring advancements in AI research. In this edition, we delve into the potential for AI systems to autonomously conduct research and development (R&D), a concept that could redefine the landscape of artificial intelligence.
The Future of AI R&D
Recent observations lead to a compelling conclusion: there is a significant likelihood (over 60%) that AI systems capable of independently developing their successors will emerge by the end of 2028. This prospect raises profound questions about the implications for society and the future of technology.
The automation of AI research could signify a pivotal moment, akin to crossing a Rubicon, where the trajectory of AI development becomes unpredictable. This article aims to outline the reasons supporting this belief, examine the evidence, and discuss the potential consequences of such advancements.
Evidence for Automated AI R&D
To understand the shift towards automated AI R&D, we must analyze various benchmarks and trends in AI capabilities. While each benchmark has its limitations, the overall trends provide valuable insights into the progress being made.
The Coding Singularity
AI systems have transformed software development through two trends: stronger coding ability and the capacity to work on longer tasks without supervision. Two measures illustrate these trends: the SWE-Bench benchmark and the task-length estimates published by METR.
SWE-Bench: Solving Real-World Software Engineering Problems
SWE-Bench is a coding test that evaluates AI systems on their ability to resolve real-world issues found on platforms like GitHub. When SWE-Bench was launched in late 2023, the leading AI model, Claude 2, achieved a mere 2% success rate. In contrast, the Claude Mythos Preview scored an impressive 93.9%, indicating a significant leap in coding competency.
This benchmark serves as a useful indicator of AI's impact on software engineering. Many professionals in leading tech labs now rely on AI systems not only to write code but also to test and verify it. Because AI R&D is itself largely software work, this suggests that AI is increasingly capable of automating substantial portions of it.
METR: Measuring AI Task Completion
METR estimates the complexity of tasks that AI systems can complete, measured by the time a skilled human would need to accomplish them. The progress observed is remarkable:
- In 2022, GPT-3.5 could handle tasks that take a human approximately 30 seconds.
- By 2023, GPT-4 raised this to about 4 minutes.
- In 2024, the figure rose to about 40 minutes.
- By 2025, GPT-5.2 could manage tasks that take around 6 hours.
- As of 2026, the figure has increased to approximately 12 hours.
Experts predict that by the end of 2026, AI systems may be able to tackle tasks requiring up to 100 hours of human effort. This escalation in task complexity correlates with the rise of agentic coding tools, which allow AI systems to operate independently for extended periods.
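To make the pace of that trend concrete, here is a rough back-of-the-envelope sketch in Python that works out the doubling time implied by the first and last figures in the list above. It assumes each year label marks a point exactly 12 months after the previous one, which the figures themselves do not state, and it uses a simple endpoint fit rather than METR's own methodology. The variable names are purely illustrative.

```python
import math

# Task horizons from the list above, in seconds of skilled-human time.
# Assumption: each year label is treated as a point exactly 12 months
# after the previous one; the source does not give exact dates.
horizons = {
    2022: 30,          # 30 seconds
    2023: 4 * 60,      # 4 minutes
    2024: 40 * 60,     # 40 minutes
    2025: 6 * 3600,    # 6 hours
    2026: 12 * 3600,   # 12 hours
}

first_year, last_year = min(horizons), max(horizons)
months_elapsed = (last_year - first_year) * 12

# Number of times the horizon doubled between the first and last points.
doublings = math.log2(horizons[last_year] / horizons[first_year])

# Average months per doubling over the whole period (endpoint fit only).
doubling_months = months_elapsed / doublings

# Months needed to go from the latest horizon to 100 hours,
# if the same average doubling time continued to hold.
target_seconds = 100 * 3600
extra_doublings = math.log2(target_seconds / horizons[last_year])
months_to_100h = extra_doublings * doubling_months

print(f"Doublings {first_year}-{last_year}: {doublings:.1f}")
print(f"Implied doubling time: {doubling_months:.1f} months")
print(f"Months from 12 hours to 100 hours at that pace: {months_to_100h:.1f}")
```

Under these assumptions the horizon doubles roughly every 4.6 months, and reaching 100 hours from the 12-hour mark takes about 14 more months at the same average pace. The result is sensitive to where in each year the measurements actually fall, and the endpoint fit ignores the uneven pace between individual years, so treat it as an order-of-magnitude check rather than a forecast.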
AI’s Role in Scientific Research
The essence of modern scientific inquiry involves setting research directions, conducting experiments, and validating results. Recent advancements in AI capabilities are beginning to facilitate these processes, thereby accelerating the pace of scientific discovery.
Key Scientific Skills for AI R&D
AI systems are increasingly proficient in several critical areas essential for AI research:
- Replicating Research Results: AI can now reproduce findings from scientific papers with greater accuracy.
- Chaining Machine Learning Techniques: AI is becoming adept at integrating various methodologies to solve complex problems.
- Optimizing AI Systems: AI can refine its own algorithms, enhancing efficiency and performance.
Implementing Scientific Papers
A crucial aspect of AI research is the ability to read and replicate results from scientific literature. Progress in this area has been significant, with benchmarks like CORE-Bench demonstrating AI’s capability to reproduce experimental results across various studies.
As AI systems become more skilled at these tasks, they can take on larger roles in the research process, potentially leading to breakthroughs that were previously reliant on human researchers.
Conclusion
The trajectory of AI development suggests that we are on the brink of a new era where AI systems may autonomously conduct research and development. The evidence indicates a rapid evolution in AI capabilities, particularly in coding and scientific inquiry.
As we approach the possibility of fully automated AI R&D, it is essential to consider the implications for society, ethics, and the future of technology. The potential for AI to innovate and drive research autonomously could lead to unprecedented advancements, but it also raises questions about accountability, oversight, and the role of human researchers in the future.
Note: The views expressed in this article reflect the author’s perspective based on current trends and research in AI. The future of AI R&D remains uncertain, and ongoing discussions are crucial as we navigate this evolving landscape.

