Day 4 – Observability and Safety Guardrails
Today's Focus
Add observability to your agent, evaluate output quality, and apply safety guardrails.
Tasks
- Add structured logging throughout your agent pipeline using Python's `structlog` or Node's `pino`: every LLM call should log `model`, `input_tokens`, `output_tokens`, `latency_ms`, `stop_reason`, and the number of tool calls. Visualise one run's token usage in a summary at the end.
- Implement cost tracking: using Anthropic's published token prices for Claude Sonnet, calculate and log the estimated USD cost of each API call. Add a `max_cost_usd` parameter to your agent and raise a `BudgetExceededError` if the cumulative cost exceeds the limit.
- Write an evaluation harness: create 10 test cases (`{"input": str, "expected_output": str, "eval_type": "exact"|"contains"|"llm_judge"}`). For `llm_judge` cases, use a second LLM call to assess whether the agent output correctly answers the question. Report a pass rate at the end.
- Add a safety guardrail: before executing any tool call, check that the requested action is on an allowlist. If Claude tries to call a tool that is not defined, or attempts to invoke `eval()` or `os.system()` via code generation, log the attempt, refuse the tool execution, and return an error `tool_result`.
- Implement human-in-the-loop for high-impact actions: add a `send_email` tool that prints a confirmation prompt and waits for the user to type `yes` before sending. Test that the agent correctly waits for approval.
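The structured-logging task might be sketched like this, using stdlib `logging` with JSON payloads as a stand-in for `structlog`. The `response` dict mimics the shape of an Anthropic Messages API response but is treated as a plain dict here, and `log_llm_call`/`summarise` are helper names invented for this sketch:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("agent")

def log_llm_call(response, started_at, tool_call_count):
    """Emit one structured (JSON) log line per LLM call."""
    record = {
        "event": "llm_call",
        "model": response["model"],
        "input_tokens": response["usage"]["input_tokens"],
        "output_tokens": response["usage"]["output_tokens"],
        "latency_ms": round((time.monotonic() - started_at) * 1000, 1),
        "stop_reason": response["stop_reason"],
        "tool_calls": tool_call_count,
    }
    log.info(json.dumps(record))
    return record

def summarise(records):
    """End-of-run token summary across all logged calls."""
    return {
        "calls": len(records),
        "total_input_tokens": sum(r["input_tokens"] for r in records),
        "total_output_tokens": sum(r["output_tokens"] for r in records),
    }
```

With `structlog` the same fields would be passed as keyword arguments to a bound logger instead of being serialised by hand.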
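A minimal cost tracker for the budget task could look like the following. The per-million-token prices are illustrative placeholders (confirm current figures against Anthropic's pricing page), and `CostTracker`/`PRICE_PER_MTOK` are names invented for this sketch:

```python
class BudgetExceededError(RuntimeError):
    """Raised when cumulative spend passes max_cost_usd."""

# Illustrative USD prices per million tokens; check Anthropic's pricing page.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}

class CostTracker:
    def __init__(self, max_cost_usd):
        self.max_cost_usd = max_cost_usd
        self.total_usd = 0.0

    def record(self, input_tokens, output_tokens):
        """Add one call's cost; raise if the cumulative budget is exceeded."""
        cost = (input_tokens * PRICE_PER_MTOK["input"]
                + output_tokens * PRICE_PER_MTOK["output"]) / 1_000_000
        self.total_usd += cost
        if self.total_usd > self.max_cost_usd:
            raise BudgetExceededError(
                f"spent ${self.total_usd:.4f}, budget ${self.max_cost_usd:.2f}")
        return cost
```

Calling `record()` after each API response keeps the check in one place, so the agent loop only needs a try/except around it.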
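One possible shape for the evaluation harness, using the test-case schema from the task above. `check_case` and `run_evals` are hypothetical names, and the `llm_judge` parameter is any callable wrapping the second LLM call and returning a bool:

```python
def check_case(case, actual, llm_judge=None):
    """Return True if `actual` passes the case's eval_type."""
    if case["eval_type"] == "exact":
        return actual.strip() == case["expected_output"].strip()
    if case["eval_type"] == "contains":
        return case["expected_output"].lower() in actual.lower()
    if case["eval_type"] == "llm_judge":
        # Delegates to a second LLM call; stubbed as a plain callable here.
        return llm_judge(case["input"], case["expected_output"], actual)
    raise ValueError(f"unknown eval_type: {case['eval_type']!r}")

def run_evals(cases, agent, llm_judge=None):
    """Run the agent on every case and report a pass rate."""
    passed = sum(bool(check_case(c, agent(c["input"]), llm_judge)) for c in cases)
    return {"passed": passed, "total": len(cases), "pass_rate": passed / len(cases)}
```

Keeping the judge behind a callable makes the harness testable offline: in tests it can be a lambda, in production a function that prompts a second model for a yes/no verdict.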
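The allowlist guardrail could be sketched as below. The tool names are placeholders, the substring check is one simple heuristic for catching `eval()`/`os.system()` in generated code, and the returned dict mirrors the `tool_result` content-block shape (with `is_error`) from Anthropic's Messages API:

```python
import json

ALLOWED_TOOLS = {"get_weather", "search_notes"}  # placeholder tool names
BLOCKED_SUBSTRINGS = ("eval(", "os.system(")

def guard_tool_call(name, tool_input, tool_use_id):
    """Return an error tool_result block if the call is refused, else None."""
    reason = None
    if name not in ALLOWED_TOOLS:
        reason = f"tool {name!r} is not on the allowlist"
    elif any(s in json.dumps(tool_input) for s in BLOCKED_SUBSTRINGS):
        reason = "tool input contains a blocked code pattern"
    if reason:
        print(f"[guardrail] refused: {reason}")  # log the attempt
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "is_error": True,
            "content": f"Refused: {reason}",
        }
    return None  # call is allowed; proceed with normal execution
```

Returning the refusal as an error `tool_result` (rather than crashing) lets the conversation continue, so the model can recover or explain itself.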
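The human-in-the-loop tool might look like this minimal sketch; the `ask` parameter defaults to the built-in `input` but can be swapped for a fake in tests, and actual email delivery is stubbed out:

```python
def send_email(to, subject, body, ask=input):
    """High-impact tool: requires the user to type 'yes' before sending."""
    print(f"About to send email to {to!r} with subject {subject!r}.")
    answer = ask("Type 'yes' to confirm: ")
    if answer.strip().lower() != "yes":
        return "Cancelled: user did not approve."
    # Real delivery (SMTP, provider API, ...) is out of scope for this exercise.
    return f"Sent email to {to}."
```

Injecting the prompt function is what makes the approval path testable without a human at the keyboard.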
Reading / Reference
- Anthropic: Reducing hallucinations.
- OWASP LLM Top 10 — especially prompt injection and insecure tool execution.
- Anthropic: Model pricing — for cost calculations.