Discussion about this post

User's avatar
Owen Skriloff's avatar

ScaleAI’s analysis shows that with a 20% error rate per action—a generous estimate for current LLMs—an agent attempting a five-step task has only a 32% chance of getting everything right.

Chris, may I have the link to ScaleAI? I could not find it at https://scale.com/leaderboard/tool_use_enterprise

Expand full comment
1 more comment...

No posts