Thanks for the clear repro. I can confirm this also happens in the async path. The issue is inside _aiter_next_step — when the timeout fires, the coroutine is cancelled before tool_output is appended to intermediate_steps.
I'm working on a fix that catches asyncio.CancelledError and appends a ToolAgentAction with a sentinel timeout message so the agent can see what happened. PR coming soon.
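To make the mechanism concrete, here is a minimal sketch in plain asyncio. It is not the actual AgentExecutor code and not the pending PR. The names slow_tool, step, run_with_timeout, and TIMEOUT_SENTINEL are hypothetical, and plain tuples stand in for the real action objects. It shows the result being lost when the step is cancelled, and a sentinel being recorded when CancelledError is caught first.

```python
import asyncio

# Hypothetical sentinel text; the real fix may use a different message.
TIMEOUT_SENTINEL = "Tool execution timed out; no result was returned."


async def slow_tool() -> str:
    # Stand-in for a tool that outlives the time budget.
    await asyncio.sleep(0.2)
    return "tool result"


async def step(intermediate_steps: list, record_timeout: bool) -> None:
    try:
        output = await slow_tool()
    except asyncio.CancelledError:
        # Cancellation arrives here when the timeout fires mid-call.
        if record_timeout:
            intermediate_steps.append(("slow_tool", TIMEOUT_SENTINEL))
        raise  # always re-raise so the cancellation still propagates
    # Only reached when the tool finishes inside the budget.
    intermediate_steps.append(("slow_tool", output))


async def run_with_timeout(timeout: float, record_timeout: bool) -> list:
    steps: list = []
    try:
        await asyncio.wait_for(step(steps, record_timeout), timeout=timeout)
    except asyncio.TimeoutError:
        pass  # the step was cancelled by wait_for
    return steps
```

With record_timeout=False and a short timeout the list stays empty, which matches the dropped result described above. With record_timeout=True the list holds the sentinel entry, so a caller has something to show the agent.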
Hit this exact bug with a custom retriever that calls an external vector DB. Typical queries are fine but large index scans blow past the 5-second default.
Temporary workaround: wrap the tool call in a try/except asyncio.TimeoutError inside the tool itself and return a placeholder string. Not ideal, but it stops the infinite loop until the fix lands.
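A rough sketch of that workaround, assuming an async tool. query_vector_db, guarded_tool, the budget value, and the placeholder text are all made up for illustration; swap in your own tool body.

```python
import asyncio

# Hypothetical placeholder text returned to the agent on timeout.
TIMEOUT_PLACEHOLDER = "Tool timed out before returning a result."


async def query_vector_db(query: str, delay: float) -> str:
    # Stand-in for the real external call; delay simulates a slow index scan.
    await asyncio.sleep(delay)
    return f"results for {query!r}"


async def guarded_tool(query: str, delay: float, budget: float = 0.1) -> str:
    try:
        # Enforce the tool's own time budget around the slow call.
        return await asyncio.wait_for(query_vector_db(query, delay), timeout=budget)
    except asyncio.TimeoutError:
        # Return a string instead of raising, so the agent always gets an observation.
        return TIMEOUT_PLACEHOLDER
```

This presumably only helps if the budget inside the tool is shorter than the executor's max_execution_time, so the tool returns its placeholder before the outer timeout cancels it.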
@hwchase17 any ETA on the PR? We're blocking a release on this. Setting max_execution_time=None stops the loop for us, but we'd prefer a real fix in case we need the timeout for other reasons.
Setting max_execution_time=None did stop the loop on our end too. @hwchase17 — I'll test the fix PR when it's up. Thanks everyone for the quick responses.
Describe the bug
When using AgentExecutor with a tool that takes longer than max_execution_time, the last tool result is silently dropped from context. The agent then re-issues the same tool call in an infinite loop, never receiving the result.
To reproduce
    from langchain.agents import AgentExecutor
    executor = AgentExecutor(agent=agent, tools=tools, max_execution_time=5)
    result = executor.invoke({"input": "run slow_tool"})  # slow_tool takes 7 s, so the agent never sees the result
Expected behavior
The tool result should be preserved in context when execution time is exceeded, or the agent should receive a graceful timeout error so it can recover.
Environment: langchain 0.2.14, Python 3.11.8, gpt-4o-2024-08-06