RE: LeoThread 2024-10-22 09:10


taskmaster4450le (81) in LeoFinance • last year

Ofir Press, a postdoctoral researcher at Princeton University who helped develop SWE-bench, says that agentic AI tends to lack the ability to plan far ahead and often struggles to recover from errors. “In order to show them to be useful we must obtain strong performance on tough and realistic benchmarks,” he says, such as reliably planning a wide range of trips for a user and booking all the necessary tickets.

