Tutorial 6: Using LLM personas
This tutorial shows how to use mts1b-llm personas from any service — for narrative summaries, structured extraction, or as a CRO veto layer.
Time: ~20 minutes. Prerequisites: paper-trading profile installed; one provider key (Anthropic / OpenAI / Google) in Vault.
Step 1 — Configure a provider
vault kv put secret/mts1b/llm/anthropic \
    api_key=<your_anthropic_api_key>
# Or OpenAI:
vault kv put secret/mts1b/llm/openai \
    api_key=<your_openai_api_key>
mts1b-deploy restart mts1b-llm
mts1b-deploy status mts1b-llm
# ✓ mts1b-llm /healthz green, providers=[anthropic]
Step 2 — Smoke test
mts mts1b-llm complete \
    --persona equities_analyst \
    --prompt "Summarize what AAPL does in one paragraph."
Expected:
Apple Inc. designs, manufactures, and sells consumer electronics
(iPhone, iPad, Mac, Apple Watch, AirPods), wearables, and accessories,
along with services (Apple Music, iCloud, the App Store, Apple TV+).
Founded in 1976 and headquartered in Cupertino, California, Apple is
one of the largest companies in the world by market capitalization.
Its operating systems, iOS and macOS, form a tightly integrated ecosystem
with the company's hardware.
Model: claude-sonnet-4-5
Cached: no (first call)
Cost: $0.0024 USD
Latency: 1.43s
Step 3 — List available personas
mts mts1b-llm personas list
Sample output (35 personas):
CRO                    Risk officer - veto orders
equities_analyst       Equities research
options_strategist     Options pricing + flow
crypto_specialist      Crypto market structure
macro_economist        Macro themes + rate decisions
quant_screener         Find ideas from data
news_summarizer        Daily news digest
sec_filing_reader      10-K / 10-Q parsing
sentiment_classifier   News tone scoring
... (26 more)
Step 4 — Programmatic use
import asyncio

from mts1b_llm import LLM
from pydantic import BaseModel


# Schema for the structured-output call below
class Theme(BaseModel):
    name: str
    confidence: float
    affected_tickers: list[str]
    rationale: str


async def main() -> None:
    llm = LLM()

    # Plain text completion
    response = await llm.complete(
        persona="news_summarizer",
        prompt="Summarize today's market moves for: $AAPL, $MSFT, $NVDA",
        context={
            "AAPL": {"return": -0.012, "news_count": 8, "sentiment_avg": -0.2},
            "MSFT": {"return": 0.008, "news_count": 3, "sentiment_avg": 0.1},
            "NVDA": {"return": 0.045, "news_count": 22, "sentiment_avg": 0.6},
        },
        max_tokens=300,
    )
    print(response.text)

    # Structured output via pydantic schema
    themes = await llm.complete(
        persona="macro_economist",
        prompt="Identify the top 3 macro themes driving markets today.",
        context={"date": "2026-05-23", "vix": 14.5, "10y_yield": 4.3, "spx": 5680},
        output_schema=Theme,
        n=3,
    )
    for t in themes:
        print(f"  {t.name}: confidence={t.confidence:.2f}, tickers={t.affected_tickers}")


# llm.complete is awaited, so it has to run inside an event loop
asyncio.run(main())
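Because `complete` is awaited, independent requests can in principle be issued concurrently rather than one after another. The sketch below shows the pattern with `asyncio.gather`. `FakeLLM` and `FakeResponse` are hypothetical stand-ins so the example runs offline; they copy only the `persona` / `prompt` / `context` call shape and the `.text` attribute shown above. Whether the real client is safe to call concurrently is an assumption to verify.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    text: str


class FakeLLM:
    """Hypothetical offline stand-in for mts1b_llm.LLM (not the real client)."""

    async def complete(self, persona: str, prompt: str,
                       context: Optional[dict] = None) -> FakeResponse:
        await asyncio.sleep(0.01)  # simulate network latency
        return FakeResponse(text=f"[{persona}] {prompt}")


async def summarize_all(llm, tickers: list) -> dict:
    """Issue one completion per ticker concurrently and collect the texts."""
    tasks = [
        llm.complete(persona="equities_analyst",
                     prompt=f"Summarize {t} in one sentence.")
        for t in tickers
    ]
    # gather returns results in the same order as the awaitables passed in
    responses = await asyncio.gather(*tasks)
    return {t: r.text for t, r in zip(tickers, responses)}


summaries = asyncio.run(summarize_all(FakeLLM(), ["AAPL", "MSFT", "NVDA"]))
for ticker, text in summaries.items():
    print(ticker, "->", text)
```

Swapping `FakeLLM()` for a real `LLM()` instance should be the only change needed if the client behaves as assumed. Keep per-persona budgets in mind when fanning out many calls.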
Step 5 — Custom persona
Create /etc/mts1b/personas/options_volatility_specialist.yaml:
name: options_volatility_specialist
description: Identifies vol mispricing and skew anomalies
default_model: claude-sonnet-4-5
fallback_models: [gpt-4-turbo]
temperature: 0.15
max_tokens: 1000
system_prompt: |
  You are an options volatility specialist at a quant fund.
  Focus on:
  - Identifying skew anomalies
  - Vol surface mispricings
  - Calendar spread opportunities
  Use precise quant terminology. Cite specific Greeks when relevant.
budget_usd_per_day: 2.0
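Before reloading, it can help to sanity-check a persona definition. The sketch below validates an already-parsed persona (a plain dict with the same fields as the YAML above). The set of required keys and the accepted ranges are assumptions for illustration, not the service's actual validation rules, and `check_persona` is a hypothetical helper.

```python
# Assumed minimum set of keys; the real service may require more or fewer.
REQUIRED_KEYS = {"name", "description", "default_model", "system_prompt"}


def check_persona(cfg: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    missing = REQUIRED_KEYS - set(cfg)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    temperature = cfg.get("temperature", 0.0)
    if not 0.0 <= temperature <= 2.0:  # assumed valid range
        problems.append(f"temperature out of range: {temperature}")
    if cfg.get("max_tokens", 1) <= 0:
        problems.append("max_tokens must be positive")
    if cfg.get("budget_usd_per_day", 0.0) < 0:
        problems.append("budget_usd_per_day must not be negative")
    if cfg.get("default_model") in cfg.get("fallback_models", []):
        problems.append("default_model is repeated in fallback_models")
    return problems


persona = {
    "name": "options_volatility_specialist",
    "description": "Identifies vol mispricing and skew anomalies",
    "default_model": "claude-sonnet-4-5",
    "fallback_models": ["gpt-4-turbo"],
    "temperature": 0.15,
    "max_tokens": 1000,
    "system_prompt": "You are an options volatility specialist at a quant fund.",
    "budget_usd_per_day": 2.0,
}

problems = check_persona(persona)
broken = check_persona({"name": "x", "temperature": 3.0})
print(problems)
print(broken)
```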
Reload personas (no restart needed):
mts mts1b-llm personas reload
Use:
# Run inside an async function, with llm = LLM() as in Step 4
response = await llm.complete(
    persona="options_volatility_specialist",
    prompt="What does today's TSLA vol surface tell me?",
    context={
        "atm_iv": 0.65,
        "25d_call_iv": 0.62,
        "25d_put_iv": 0.74,
        "30d_realized": 0.55,
    },
)
print(response.text)
Step 6 — Use as CRO veto
mts mts1b-riskengine envelope set \
    --fund-id paper-momentum \
    --cro-veto-enabled true \
    --cro-veto-confidence-threshold 0.85 \
    --cro-veto-timeout-s 5.0
Now every order passes through the CRO veto gate (gate #7), which uses the built-in CRO persona. If the veto fires with confidence > 0.85, the order is rejected with reason CRO_VETO_FIRED plus the LLM's rationale.
⚠️ Fail-OPEN: if the LLM doesn't respond within 5 s, the order passes. Treat the veto as an extra check alongside the other gates, not as required infrastructure.
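The gate's behavior can be sketched as a timeout-wrapped call with a confidence threshold. This is a minimal illustration of the fail-open logic described above, not the risk engine's implementation: `VetoDecision`, `cro_gate`, and the reason strings `CRO_TIMEOUT_FAIL_OPEN` and `OK` are hypothetical, and the three `*_cro` coroutines are stubs standing in for the LLM call.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class VetoDecision:
    veto: bool
    confidence: float
    rationale: str


async def cro_gate(ask_cro, order: dict, threshold: float = 0.85,
                   timeout_s: float = 5.0) -> tuple:
    """Return (allowed, reason). Fails open: a timeout lets the order pass."""
    try:
        decision = await asyncio.wait_for(ask_cro(order), timeout=timeout_s)
    except asyncio.TimeoutError:
        return True, "CRO_TIMEOUT_FAIL_OPEN"  # hypothetical reason code
    if decision.veto and decision.confidence > threshold:
        return False, f"CRO_VETO_FIRED: {decision.rationale}"
    return True, "OK"


async def strict_cro(order):   # confident veto
    return VetoDecision(True, 0.92, "position exceeds concentration limit")


async def unsure_cro(order):   # veto below the confidence threshold
    return VetoDecision(True, 0.60, "weak concern")


async def slow_cro(order):     # responds after the timeout has expired
    await asyncio.sleep(1.0)
    return VetoDecision(True, 0.99, "arrived too late")


async def demo():
    order = {"symbol": "TSLA", "qty": 100}
    return [
        await cro_gate(strict_cro, order),
        await cro_gate(unsure_cro, order),
        await cro_gate(slow_cro, order, timeout_s=0.05),
    ]


results = asyncio.run(demo())
for allowed, reason in results:
    print(allowed, reason)
```

Note the third case: a veto with 0.99 confidence never takes effect because it arrives after the timeout, which is exactly why the gate should not be the only control on an order.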
Step 7 — Monitor cost
mts mts1b-llm cost --window 7d --group-by persona
persona             n_calls  in_tok  out_tok  cost_usd  avg_latency
equities_analyst        412    842k     203k     $4.21        1.34s
CRO                    1238     45k      12k     $0.89        0.42s
news_summarizer          89    310k      28k     $1.45        2.10s
quant_screener           23    180k      52k     $0.98        3.40s
... (10 more)
TOTAL                  2541    2.4M     478k    $14.32        1.21s
If a persona is unexpectedly expensive, investigate (caching missing? long prompts? wrong model selected?).
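Totals can hide a persona that is cheap overall but expensive per call. As a rough sketch, the rows shown above can be normalized to cost and tokens per call. The figures are copied by hand from the sample table (with `k` expanded to thousands), and `per_call_stats` is a hypothetical helper, not part of the CLI.

```python
# (persona, n_calls, in_tok, out_tok, cost_usd), copied from the sample output
rows = [
    ("equities_analyst", 412, 842_000, 203_000, 4.21),
    ("CRO", 1238, 45_000, 12_000, 0.89),
    ("news_summarizer", 89, 310_000, 28_000, 1.45),
    ("quant_screener", 23, 180_000, 52_000, 0.98),
]


def per_call_stats(rows):
    """Compute per-call cost and token volume, most expensive per call first."""
    stats = []
    for persona, n_calls, in_tok, out_tok, cost in rows:
        stats.append({
            "persona": persona,
            "usd_per_call": cost / n_calls,
            "tokens_per_call": (in_tok + out_tok) / n_calls,
        })
    return sorted(stats, key=lambda s: s["usd_per_call"], reverse=True)


ranked = per_call_stats(rows)
for s in ranked:
    print(f"{s['persona']:<18} ${s['usd_per_call']:.4f}/call  "
          f"{s['tokens_per_call']:,.0f} tok/call")
```

On these sample numbers, quant_screener ranks first per call despite its small total, which points to long prompts rather than call volume.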
Step 8 — Semantic cache check
mts mts1b-llm cache stats --window 7d
Hit rate: 73% (cached: 1856, miss: 685)
Total cost saved: $8.13 USD
Avg cosine sim of hits: 0.973
If the hit rate is below 30% on a routine workload, your prompts are likely too variable. Add structure (use output_schema) so more requests hit the cache.
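To make the statistics concrete, here is a toy sketch of how a semantic cache lookup can work: a request counts as a hit when its embedding is close enough, by cosine similarity, to a stored one. The `SemanticCache` class, the 0.95 threshold, and the 3-dimensional vectors are illustrative assumptions; how mts1b-llm actually embeds prompts and picks its threshold is not specified here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.95):  # assumed threshold
        self.threshold = threshold
        self.entries = []  # list of (embedding, response)
        self.hits = 0
        self.misses = 0

    def get(self, embedding):
        """Return the closest stored response if it clears the threshold."""
        best_score, best_response = 0.0, None
        for stored, response in self.entries:
            score = cosine(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            self.hits += 1
            return best_response
        self.misses += 1
        return None

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0, 0.0], "summary A")
near = cache.get([0.99, 0.10, 0.0])  # almost the same direction as the stored vector
far = cache.get([0.0, 1.0, 0.0])     # orthogonal to the stored vector

# The reported hit rate is just hits / (hits + misses)
reported = 1856 / (1856 + 685)
print(near, far, cache.hit_rate, round(reported, 2))
```

Highly variable prompts spread embeddings apart, so fewer lookups clear the threshold and the hit rate drops.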
What's next
- mts1b-llm repo spec - full API + governor + evals
- Concept: Risk envelopes - CRO veto in gate hierarchy
- Tutorial: Custom strategy - uses LLM for narrative summarization