
A new technical guide demonstrates how to instrument a LangGraph agent with Datadog Agent Monitoring and the LLM Observability SDK so teams can trace full agent runs, visualize workflows, and surface failures, latency, token usage and response quality. The walkthrough targets engineers who need to observe agent steps, tool calls and LLM interactions rather than treating agents as black boxes; that end‑to‑end telemetry helps pinpoint where runs fail or slow down and measure cost and quality.
The guide adapts a sample LangGraph agent from the handbook Build & Run AI Agents (originally published in Japanese by open-source engineer Minorun). The example, whose source code is available on GitHub, implements an agent that researches user questions, summarizes findings and routes results through integrations, including Tavily for web search and Amazon SNS for output delivery.
Architecturally, the sample agent is organized as three interacting nodes in a ReAct‑style loop. An agent node calls Claude Sonnet 4.6 via Amazon Bedrock to decide whether to invoke a tool or return a final answer. A route_node inspects the last message and either forwards tool_calls to the tools node or ends the loop. The tools node exposes two tools: TavilySearch, which returns the top two search results, and send_aws_sns, which publishes text to an SNS topic.
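To make the control flow concrete, here is a minimal, dependency-free sketch of the loop described above. It is not the guide's code and does not use the LangGraph API: the `Message` class, `fake_agent` (a stand-in for the Bedrock model call), the `search` tool stub and the `max_steps` limit are all hypothetical names introduced for illustration. Only the routing rule (tool calls present, go to tools; otherwise end) comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str                      # "user", "assistant" or "tool"
    content: str = ""
    tool_calls: list = field(default_factory=list)  # [{"name": ..., "args": {...}}]


def route_node(messages):
    """Return "tools" if the last message requests tool calls, else "end"."""
    return "tools" if messages[-1].tool_calls else "end"


def fake_agent(messages):
    """Hypothetical stand-in for the LLM: search once, then answer."""
    if not any(m.role == "tool" for m in messages):
        call = {"name": "search", "args": {"query": messages[0].content}}
        return Message("assistant", tool_calls=[call])
    return Message("assistant", content="final answer based on: " + messages[-1].content)


# Hypothetical tool registry; the real agent exposes TavilySearch and send_aws_sns.
TOOLS = {"search": lambda query: f"results for {query!r}"}


def tools_node(messages):
    """Run every tool call requested by the last message."""
    return [
        Message("tool", content=TOOLS[call["name"]](**call["args"]))
        for call in messages[-1].tool_calls
    ]


def run(question, max_steps=5):
    """Alternate agent and tools until the router ends the loop."""
    messages = [Message("user", question)]
    for _ in range(max_steps):
        messages.append(fake_agent(messages))
        if route_node(messages) == "end":
            break
        messages.extend(tools_node(messages))
    return messages
```

Calling `run("what is LangGraph?")` walks through one tool round trip before the router ends the loop. Each pass through `fake_agent` and `tools_node` corresponds to a step that a trace would record as its own span.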
The guide shows that instrumenting full runs with LLM Observability captures inputs and responses around LLM calls, which tools were invoked and their outputs, context injections and data transformations, plus request latency and token consumption. It highlights flame graph visualizations of each run and automated LLM‑as‑a‑judge evaluations that characterize response quality across runs.
To enable monitoring, the walkthrough describes small code and environment changes. It demonstrates importing ddtrace.llmobs and calling LLMObs.enable, and setting environment variables such as ML_APP_NAME and DD_API_KEY in a .env file; these settings connect the LangGraph application to the LLM Observability pipeline so telemetry flows to the backend for analysis.
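The enablement step might look roughly like the sketch below. The variable names ML_APP_NAME and DD_API_KEY and the call to `LLMObs.enable` come from the walkthrough; everything else is an assumption made for illustration: the helper names `llmobs_settings` and `enable_llmobs`, the use of DD_SITE with a `datadoghq.com` default, and agentless mode (sending data without a local Datadog Agent). The sketch also assumes the .env file has already been loaded into the process environment, for example with a dotenv loader.

```python
import os


def llmobs_settings(env=os.environ):
    """Collect keyword arguments for LLMObs.enable from environment variables."""
    return {
        "ml_app": env["ML_APP_NAME"],                  # named in the walkthrough
        "api_key": env["DD_API_KEY"],                  # named in the walkthrough
        "site": env.get("DD_SITE", "datadoghq.com"),   # assumption: default site
        "agentless_enabled": True,                     # assumption: no local Agent
    }


def enable_llmobs():
    """Turn on LLM Observability before the agent graph is built and run."""
    # Imported lazily so the settings helper can be used without ddtrace installed.
    from ddtrace.llmobs import LLMObs

    LLMObs.enable(**llmobs_settings())
```

Keeping the settings in a separate function makes a missing variable fail fast with a `KeyError` at startup, before any agent run is attempted.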
For builders the implications are concrete: end‑to‑end traces let teams identify which agent step or external tool causes failures or latency, measure token usage for cost analysis, and feed telemetry into dashboards, alerts and APM. The guide frames these observability primitives as ways to reduce debugging time and to quantify response quality across agent runs.
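As a rough illustration of the kind of question these traces answer, the sketch below aggregates per-step latency, token counts and errors from a list of span records. The record shape and every value in `spans` are made up for the example and are not the Datadog export format or real measurements; only the step names echo the sample agent.

```python
from collections import defaultdict

# Hypothetical span records with invented values, for illustration only.
spans = [
    {"step": "agent", "duration_ms": 1200, "input_tokens": 850, "output_tokens": 120, "error": False},
    {"step": "tools:TavilySearch", "duration_ms": 2400, "input_tokens": 0, "output_tokens": 0, "error": False},
    {"step": "agent", "duration_ms": 1500, "input_tokens": 1900, "output_tokens": 310, "error": False},
    {"step": "tools:send_aws_sns", "duration_ms": 300, "input_tokens": 0, "output_tokens": 0, "error": True},
]


def summarize(spans):
    """Sum duration, tokens and error count per step name."""
    totals = defaultdict(lambda: {"duration_ms": 0, "tokens": 0, "errors": 0})
    for span in spans:
        entry = totals[span["step"]]
        entry["duration_ms"] += span["duration_ms"]
        entry["tokens"] += span["input_tokens"] + span["output_tokens"]
        entry["errors"] += int(span["error"])
    return dict(totals)


summary = summarize(spans)
# Step with the largest total latency, and the steps that recorded errors.
slowest = max(summary, key=lambda step: summary[step]["duration_ms"])
failing = [step for step, entry in summary.items() if entry["errors"]]
```

In practice the observability backend performs this aggregation; the point of the sketch is only that step-level spans make "which step is slow, costly or failing" a simple grouping question.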