Today we're excited to introduce Devin, the first AI software engineer. Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork. Devin is an autonomous agent that solves engineering tasks through the use of its own shell, code editor, and web browser. When evaluated on the SWE-Bench benchmark, which asks an AI to resolve GitHub issues found in real-world open-source projects, Devin correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted and 4.80% assisted. Check out what Devin can do in the thread below.
@cognition_labs It's not over till it's over
@cognition_labs "has even completed real jobs on Upwork" That's not the benchmark you think it is.
@cognition_labs Is it over for us folks?
@cognition_labs there are engineering students in college now whose degrees were rock-solid six-figure job tickets when they enrolled, and now they might not have a job at all when they graduate. People need to recalibrate how fast this stuff is moving
@cognition_labs rip me, replaced by a machine before even getting the skills and degree.
@cognition_labs as a software engineer, i'm finished
@cognition_labs This is what exponential singularity looks like
@cognition_labs Maybe the first junior developer, but solving only 13% of issues is not something a "software engineer" does.