Use AI to analyze application runtime data (e.g., rendering time, communication latency), obtain optimization suggestions (e.g., reducing component re-rendering, reusing hardware connections), ...
I'm publishing these notes before the summit so I can't revise my expectations after the fact. What follows is unpolished: observations, questions, ...
We evaluate DeepCode on the PaperBench benchmark (released by OpenAI), a rigorous testbed requiring AI agents to independently reproduce 20 ICML 2024 papers from scratch. The benchmark comprises 8,316 ...