Troubleshooting

A short, opinionated checklist for the failure modes that come up most often when picking up the framework. If your symptom isn't here, open a discussion so the next revision of this page can cover it.

"API key not configured" / credentials missing

LLM clients in `tnsai-llm` read their credentials from environment variables by default. Each provider expects its own variable:

| Provider | Variable |
| --- | --- |
| Anthropic | `ANTHROPIC_API_KEY` |
| OpenAI | `OPENAI_API_KEY` |
| Gemini | `GEMINI_API_KEY` |
| Ollama | none (local daemon) |
| NVIDIA NIM | `NVIDIA_API_KEY` |
| Others | see the provider list in the `tnsai-llm` README |

Things to check, in order:

1. The variable is set in the environment of the process that runs the JVM, not just in your shell. IDE run configurations and systemd units need their own env block.
2. The variable name matches exactly: Anthropic is `ANTHROPIC_API_KEY`, not `CLAUDE_API_KEY`, and OpenAI is `OPENAI_API_KEY`, not `OPEN_AI_API_KEY`.
3. The key itself works in a `curl` test outside the framework. Accounts in trial mode sometimes have models the key can't reach.

If you want to bypass env-var lookup, every client builder accepts an explicit `.apiKey(...)` argument.
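
The first two checks can be run from inside the JVM itself, which avoids guessing what the IDE or service manager actually passed through. The sketch below uses only the JDK; the class name `EnvKeyCheck` is illustrative and not part of the framework. It reports whether each variable from the table is visible and flags the two misspellings mentioned above, without printing any secret.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper, not part of TnsAI: run it with the same run-config as your app.
final class EnvKeyCheck {
    // Variable names taken from the provider table above.
    static final List<String> EXPECTED = List.of(
            "ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY", "NVIDIA_API_KEY");

    // Common misspellings mapped to the name the client actually reads.
    static final Map<String, String> NEAR_MISSES = Map.of(
            "CLAUDE_API_KEY", "ANTHROPIC_API_KEY",
            "OPEN_AI_API_KEY", "OPENAI_API_KEY");

    /** Classifies each expected variable as set or missing; never echoes the value. */
    static Map<String, String> report(Map<String, String> env) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String name : EXPECTED) {
            String value = env.get(name);
            boolean missing = value == null || value.isBlank();
            out.put(name, missing ? "missing" : "set (" + value.length() + " chars)");
        }
        // Flag a misspelled variable only when the correctly named one is absent.
        NEAR_MISSES.forEach((wrong, right) -> {
            if (env.containsKey(wrong) && "missing".equals(out.get(right))) {
                out.put(right, "missing (found " + wrong + " instead)");
            }
        });
        return out;
    }

    public static void main(String[] args) {
        report(System.getenv()).forEach((name, status) -> System.out.println(name + ": " + status));
    }
}
```

Running `main` from the same IDE run configuration or systemd unit as the application shows what that process actually sees, which may differ from your interactive shell.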

"Tools registered but the LLM never calls them"

Two common causes:

- The LLM doesn't support function calling. Local Ollama models and older OpenAI snapshots (`gpt-3.5-turbo-0301`) frequently lack tool use. `AgentBuilder.build()` checks this at build time via `LLMCapabilityValidator` (validator code `AGENT-V003`); if the validator reports that your LLM can't call tools, swap the model.
- The tool isn't actually registered. Call `agent.getToolsForLLM()` after build to confirm the catalog the agent will send. If your `@Tool`-annotated method is missing, check that the POJO holding it is passed via `AgentBuilder.toolPojos(...)`, not the legacy `.tool()` path (see Migration for 0.7.0).

"LLM Capability Exception: streaming not supported"

`BridgeLLMClient` (the bridge used in some test harnesses and lightweight setups) does not stream. Since 0.5.0, calling `streamChat()` against it throws `LLMCapabilityException`. Two paths forward:

- Install `tnsai-llm` and use the real provider client.
- Call `chat()` (buffered single-shot) instead of `streamChat()`.
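
If the same code has to run against both a streaming and a non-streaming client, the second path can be wrapped in a fallback. The sketch below shows only the shape of that pattern. `ChatClient` and the local `LLMCapabilityException` are hypothetical stand-ins declared here so the example compiles on its own; the real TnsAI types and the signatures of `chat()` and `streamChat()` may differ.

```java
import java.util.function.Consumer;

// Stand-in for the framework exception; declared locally for illustration only.
class LLMCapabilityException extends RuntimeException {
    LLMCapabilityException(String message) {
        super(message);
    }
}

// Hypothetical client shape; the real interface and method signatures are assumptions.
interface ChatClient {
    String chat(String prompt);

    void streamChat(String prompt, Consumer<String> onChunk);
}

final class StreamingFallback {
    /**
     * Tries streaming first; if the client reports that it cannot stream,
     * falls back to one buffered call and delivers the reply as a single chunk.
     * Assumes the capability exception is thrown before any chunk is emitted.
     */
    static void chatWithFallback(ChatClient client, String prompt, Consumer<String> onChunk) {
        try {
            client.streamChat(prompt, onChunk);
        } catch (LLMCapabilityException e) {
            onChunk.accept(client.chat(prompt));
        }
    }
}
```

Callers keep a single chunk-consumer code path either way; against a non-streaming client they simply receive one large chunk.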

"Context too long" / `CONTEXT_TOO_LONG` from a provider

Every provider has its own context window. TnsAI surfaces the provider's typed error code (`CONTEXT_TOO_LONG`) via `LLMException.getProviderDetails().getCode()` so you can branch on it. Mitigations, in order of preference:

1. Trim history before the call, using `ConversationCompactor` (in Sona) or your own compaction step.
2. Use a smaller-input model in the same family (Haiku/Mini variants typically have the same context as their larger sibling, but the response budget changes).
3. If the prompt is dominated by RAG context, switch the retrieval strategy from top-K embedding to BM25 with a tighter K; this usually wins back 30-60% of input tokens.
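
For the "your own compaction step" option, a minimal sketch might keep only the newest messages that fit a token budget. This is not how `ConversationCompactor` works (its behavior is not described here). The class name `HistoryTrimmer`, the plain-string message model, and the four-characters-per-token estimate are all assumptions made for illustration; a real tokenizer will give different counts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative compaction step, not part of TnsAI.
final class HistoryTrimmer {
    /** Rough estimate of about 4 characters per token; an assumption, not a real tokenizer. */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /**
     * Walks the history from newest to oldest and keeps messages until the
     * estimated budget would be exceeded. The returned list preserves the
     * original chronological order.
     */
    static List<String> trimToBudget(List<String> history, int maxTokens) {
        Deque<String> kept = new ArrayDeque<>();
        int used = 0;
        for (int i = history.size() - 1; i >= 0; i--) {
            int cost = estimateTokens(history.get(i));
            if (used + cost > maxTokens) {
                break; // everything older than this message is dropped as well
            }
            kept.addFirst(history.get(i));
            used += cost;
        }
        return new ArrayList<>(kept);
    }
}
```

Dropping the oldest turns is the bluntest possible strategy. If the conversation starts with a system prompt that must survive, handle it separately and subtract its cost from the budget before trimming.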

Tests hang or fail flakily under load

Two known patterns:

- Async cancellation tests: there's a long-standing 200-iteration concurrent register/cancel race in `CancellationTokenTest$Concurrency` that has been hardened but still occasionally flakes on noisy CI runners. If your test re-runs green in isolation, treat the first failure as a flake.
- HTTP server tests on a fixed port: `HttpMcpServerTest` and tests using `com.sun.net.httpserver.HttpServer` can race on port allocation when run in parallel. Surefire's `forkCount` setting and JUnit's `@Execution(SAME_THREAD)` are the usual fixes.
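
For your own tests built on `com.sun.net.httpserver.HttpServer`, a complementary approach (not one of the fixes listed above) is to avoid fixed ports entirely: bind to port 0 so the OS assigns a free one, then read the real port back from the server. The sketch below uses only the JDK; the class name `EphemeralPortServer` and the `/ping` handler are illustrative.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Illustrative test helper, not part of TnsAI.
final class EphemeralPortServer {
    /** Starts a loopback server on an OS-assigned port (port 0) with a trivial handler. */
    static HttpServer start() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/ping", exchange -> {
                byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** The port the OS actually assigned; use this to build the URL under test. */
    static int port(HttpServer server) {
        return server.getAddress().getPort();
    }
}
```

Each test then builds its base URL from `EphemeralPortServer.port(server)` and calls `server.stop(0)` in teardown, so parallel tests never compete for the same port number.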

If your own test hangs, check whether it's waiting on a virtual thread that never produces output. `ToolCallFilter` chains that require user confirmation will block indefinitely unless an approval channel is wired (see agents → reliability).
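
While you track down the missing approval channel, it can help to turn an indefinite hang into a fast, descriptive failure. The sketch below is a generic JDK-only timeout wrapper; `HangGuard` is an illustrative name, and the supplier you pass in would wrap whatever agent call your test makes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Illustrative test helper, not part of TnsAI.
final class HangGuard {
    /** Runs the task on a daemon thread and fails fast if no result arrives in time. */
    static <T> T callWithTimeout(Supplier<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "hang-guard");
            thread.setDaemon(true); // a stuck task must not keep the JVM alive
            return thread;
        });
        Callable<T> callable = task::get;
        Future<T> future = executor.submit(callable);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker if it is blocked
            throw new IllegalStateException(
                    "no result within " + timeoutMillis + " ms; is an approval channel wired?", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the task", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A test would call something like `HangGuard.callWithTimeout(() -> runScenario(), 5_000)`, where `runScenario` is your own method. The timeout only surfaces the hang; wiring the approval channel is still the actual fix.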

"Agent.chat() throws IllegalStateException"

The agent is already in the `STOPPING`, `STOPPED`, or `FAILED` state. Since 0.5.0, calls to `chat()` after `stop()` throw rather than fail silently. Check `agent.getState()` before retrying, and only call `stop()` from cleanup paths.

Related

- FAQ: short answers to the most common starter questions.
- Migration: which versions break the API.
- Reliability guides: resilience patterns, error handling, retries.
