Tutorial 6: Using LLM personas

This tutorial shows how to use mts1b-llm personas from any service — for narrative summaries, structured extraction, or as a CRO veto layer.

Time: ~20 minutes. Prerequisites: paper-trading profile installed; one provider key (Anthropic / OpenAI / Google) in Vault.

Step 1 — Configure a provider

vault kv put secret/mts1b/llm/anthropic \
  api_key=<your_anthropic_api_key>

# Or OpenAI:
vault kv put secret/mts1b/llm/openai \
  api_key=<your_openai_api_key>

mts1b-deploy restart mts1b-llm
mts1b-deploy status mts1b-llm
# ✓ mts1b-llm /healthz green, providers=[anthropic]

Step 2 — Smoke test

mts mts1b-llm complete \
  --persona equities_analyst \
  --prompt "Summarize what AAPL does in one paragraph."

Expected:

Apple Inc. designs, manufactures, and sells consumer electronics
(iPhone, iPad, Mac, Apple Watch, AirPods), wearables, and accessories,
along with services (Apple Music, iCloud, the App Store, Apple TV+).
Founded in 1976 and headquartered in Cupertino, California, Apple is
one of the largest companies in the world by market capitalization.
Its operating systems, iOS and macOS, form a tightly integrated ecosystem
with the company's hardware.

Model:    claude-sonnet-4-5
Cached:   no (first call)
Cost:     $0.0024 USD
Latency:  1.43s

Step 3 — List available personas

mts mts1b-llm personas list

Sample output (35 personas):

CRO                       Risk officer — veto orders
equities_analyst          Equities research
options_strategist        Options pricing + flow
crypto_specialist         Crypto market structure
macro_economist           Macro themes + rate decisions
quant_screener            Find ideas from data
news_summarizer           Daily news digest
sec_filing_reader         10-K / 10-Q parsing
sentiment_classifier      News tone scoring
... (26 more)

Step 4 — Programmatic use

import asyncio

from mts1b_llm import LLM
from pydantic import BaseModel


# Schema for the structured-output call below
class Theme(BaseModel):
    name: str
    confidence: float
    affected_tickers: list[str]
    rationale: str


async def main() -> None:
    llm = LLM()

    # Plain text completion
    response = await llm.complete(
        persona="news_summarizer",
        prompt="Summarize today's market moves for: $AAPL, $MSFT, $NVDA",
        context={
            "AAPL": {"return": -0.012, "news_count": 8, "sentiment_avg": -0.2},
            "MSFT": {"return": 0.008,  "news_count": 3, "sentiment_avg": 0.1},
            "NVDA": {"return": 0.045,  "news_count": 22, "sentiment_avg": 0.6},
        },
        max_tokens=300,
    )
    print(response.text)

    # Structured output via pydantic schema
    themes = await llm.complete(
        persona="macro_economist",
        prompt="Identify the top 3 macro themes driving markets today.",
        context={"date": "2026-05-23", "vix": 14.5, "10y_yield": 4.3, "spx": 5680},
        output_schema=Theme,
        n=3,
    )
    for t in themes:
        print(f"  {t.name}: confidence={t.confidence:.2f}, tickers={t.affected_tickers}")


# llm.complete is a coroutine, so it must be awaited inside an event loop
asyncio.run(main())
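What output_schema buys the caller is output that has already been parsed and type-checked before it reaches your code. The sketch below illustrates that idea with the standard library only; it is not the mts1b-llm implementation. The parse_theme helper, the sample JSON string, and the assumption that confidence lies in [0, 1] are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Theme:
    name: str
    confidence: float
    affected_tickers: list[str]
    rationale: str


def parse_theme(raw: str) -> Theme:
    """Parse one JSON object into a Theme, rejecting malformed output."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    theme = Theme(
        name=str(data["name"]),  # KeyError if a field is missing
        confidence=float(data["confidence"]),
        affected_tickers=[str(t) for t in data["affected_tickers"]],
        rationale=str(data["rationale"]),
    )
    # Assumption: confidence is a probability-like score in [0, 1].
    if not 0.0 <= theme.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {theme.confidence}")
    return theme


# Hypothetical raw model output, for illustration only.
raw = (
    '{"name": "Example theme", "confidence": 0.8, '
    '"affected_tickers": ["AAPL"], "rationale": "Illustrative only."}'
)
theme = parse_theme(raw)
print(f"{theme.name}: confidence={theme.confidence:.2f}, tickers={theme.affected_tickers}")
```

Rejecting bad output at the parsing boundary keeps downstream code free of defensive checks on every field.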

Step 5 — Custom persona

Create /etc/mts1b/personas/options_volatility_specialist.yaml:

name: options_volatility_specialist
description: Identifies vol mispricing and skew anomalies
default_model: claude-sonnet-4-5
fallback_models: [gpt-4-turbo]
temperature: 0.15
max_tokens: 1000
system_prompt: |
  You are an options volatility specialist at a quant fund.
  Focus on:
    - Identifying skew anomalies
    - Vol surface mispricings
    - Calendar spread opportunities
  Use precise quant terminology. Cite specific Greeks when relevant.
budget_usd_per_day: 2.0

Reload personas (no restart needed):

mts mts1b-llm personas reload

Use the new persona:

# Run inside an async function, reusing the llm client created in Step 4
response = await llm.complete(
    persona="options_volatility_specialist",
    prompt="What does today's TSLA vol surface tell me?",
    context={
        "atm_iv": 0.65,
        "25d_call_iv": 0.62,
        "25d_put_iv": 0.74,
        "30d_realized": 0.55,
    },
)
print(response.text)
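The persona file above names a default_model and a list of fallback_models. The tutorial does not say how the service walks that list, so the following is only a sketch under one plausible assumption: fallbacks are tried in order when a provider call fails. ProviderError, complete_with_fallback, and fake_call are hypothetical names, not part of mts1b-llm.

```python
from typing import Callable, Optional


class ProviderError(Exception):
    """Hypothetical error raised when a model call fails."""


def complete_with_fallback(
    prompt: str,
    default_model: str,
    fallback_models: list[str],
    call_model: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try the default model, then each fallback in order.

    Returns (model_used, text). Re-raises the last ProviderError if all fail.
    """
    last_error: Optional[ProviderError] = None
    for model in [default_model, *fallback_models]:
        try:
            return model, call_model(model, prompt)
        except ProviderError as exc:
            last_error = exc  # remember the failure and try the next model
    assert last_error is not None
    raise last_error


def fake_call(model: str, prompt: str) -> str:
    # Stand-in for a real provider call: the primary model is "down".
    if model == "claude-sonnet-4-5":
        raise ProviderError("primary unavailable")
    return f"[{model}] answer to: {prompt}"


model_used, text = complete_with_fallback(
    "What does today's TSLA vol surface tell me?",
    default_model="claude-sonnet-4-5",
    fallback_models=["gpt-4-turbo"],
    call_model=fake_call,
)
print(model_used)
```

Returning the model that actually answered makes it easy to log how often the fallback path is taken.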

Step 6 — Use as CRO veto

mts mts1b-riskengine envelope set \
  --fund-id paper-momentum \
  --cro-veto-enabled true \
  --cro-veto-confidence-threshold 0.85 \
  --cro-veto-timeout-s 5.0

Every order now passes through the CRO veto gate (gate #7), which uses the built-in CRO persona. If the veto fires with confidence above 0.85, the order is rejected with reason CRO_VETO_FIRED plus the LLM's rationale.

⚠️ Fail-OPEN: if the LLM doesn't respond within 5 s, the order passes. Treat the veto as an additional check; don't make it required infrastructure.
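The fail-open behavior can be sketched with asyncio.wait_for. This is a minimal illustration of the logic described above, not the risk engine's code: cro_gate, VetoDecision, the stub CRO functions, and every reason string other than CRO_VETO_FIRED are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class VetoDecision:
    veto: bool
    confidence: float
    rationale: str


async def cro_gate(ask_cro, order: dict, threshold: float = 0.85,
                   timeout_s: float = 5.0) -> tuple[bool, str]:
    """Return (accepted, reason). Fail-open: a timeout lets the order through."""
    try:
        decision = await asyncio.wait_for(ask_cro(order), timeout=timeout_s)
    except asyncio.TimeoutError:
        return True, "CRO_VETO_TIMEOUT_FAIL_OPEN"  # hypothetical reason code
    if decision.veto and decision.confidence > threshold:
        return False, f"CRO_VETO_FIRED: {decision.rationale}"
    return True, "CRO_VETO_PASSED"  # hypothetical reason code


async def slow_cro(order):
    await asyncio.sleep(0.2)  # longer than the demo timeout used below
    return VetoDecision(True, 0.99, "never delivered in time")


async def confident_cro(order):
    return VetoDecision(True, 0.95, "position exceeds concentration limit")


async def unsure_cro(order):
    return VetoDecision(True, 0.60, "mild concern")


async def demo():
    order = {"symbol": "TSLA", "qty": 100}
    return [
        await cro_gate(slow_cro, order, timeout_s=0.05),  # times out, passes
        await cro_gate(confident_cro, order),             # vetoed
        await cro_gate(unsure_cro, order),                # below threshold
    ]


results = asyncio.run(demo())
for accepted, reason in results:
    print(accepted, reason)
```

Note that the slow call would have vetoed with high confidence, yet the order still passes. That is the practical cost of failing open.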

Step 7 — Monitor cost

mts mts1b-llm cost --window 7d --group-by persona

persona                       n_calls    in_tok    out_tok    cost_usd    avg_latency
equities_analyst              412        842k      203k       $4.21       1.34s
CRO                           1238       45k       12k        $0.89       0.42s
news_summarizer               89         310k      28k        $1.45       2.10s
quant_screener                23         180k      52k        $0.98       3.40s
... (10 more)

TOTAL                         2541       2.4M      478k       $14.32      1.21s

If a persona is unexpectedly expensive, check the usual causes: missing cache hits, overly long prompts, or the wrong model being selected.
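Total cost can hide an expensive persona that is called rarely, so it helps to normalize per call. The sketch below does that for the four rows visible in the sample output; the token counts are the rounded "k" figures from the table, so the results are approximate. The per_call helper is a hypothetical name.

```python
# Rows copied from the sample cost output: (n_calls, in_tok, out_tok, cost_usd).
rows = {
    "equities_analyst": (412, 842_000, 203_000, 4.21),
    "CRO": (1238, 45_000, 12_000, 0.89),
    "news_summarizer": (89, 310_000, 28_000, 1.45),
    "quant_screener": (23, 180_000, 52_000, 0.98),
}


def per_call(rows: dict) -> dict:
    """Compute cost and input tokens per call for each persona."""
    stats = {}
    for persona, (n_calls, in_tok, _out_tok, cost_usd) in rows.items():
        stats[persona] = {
            "usd_per_call": cost_usd / n_calls,
            "in_tok_per_call": in_tok / n_calls,
        }
    return stats


stats = per_call(rows)
# Most expensive per call first.
ranked = sorted(stats, key=lambda p: stats[p]["usd_per_call"], reverse=True)
for persona in ranked:
    s = stats[persona]
    print(f"{persona:<20} ${s['usd_per_call']:.4f}/call  "
          f"{s['in_tok_per_call']:.0f} in_tok/call")
```

In this sample, quant_screener has one of the lowest totals but the highest cost per call (about $0.043) and roughly 7,800 input tokens per call, which points at long prompts.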

Step 8 — Semantic cache check

mts mts1b-llm cache stats --window 7d

Hit rate: 73% (cached: 1856, miss: 685)
Total cost saved: $8.13 USD
Avg cosine sim of hits: 0.973

If the hit rate is below 30% on a routine workload, your prompts are probably too variable. Add structure (for example, use output_schema) so that similar requests can hit the cache.
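The "Avg cosine sim of hits" figure suggests the cache matches prompts by embedding similarity rather than exact text. The sketch below shows that lookup pattern. The SemanticCache class, the 0.95 threshold, and the toy three-dimensional vectors are all assumptions; the tutorial does not state the real embedding model or threshold.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.95):  # hypothetical threshold
        self.threshold = threshold
        self.entries: list[tuple[list[float], str]] = []
        self.hits = 0
        self.misses = 0

    def get(self, embedding: list[float]):
        """Return the cached response of the nearest entry, or None on a miss."""
        best = max(self.entries, key=lambda e: cosine(e[0], embedding), default=None)
        if best is not None and cosine(best[0], embedding) >= self.threshold:
            self.hits += 1
            return best[1]
        self.misses += 1
        return None

    def put(self, embedding: list[float], response: str) -> None:
        self.entries.append((embedding, response))

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = SemanticCache(threshold=0.95)
cache.put([1.0, 0.0, 0.0], "cached summary")
near = cache.get([0.98, 0.05, 0.0])  # near-duplicate prompt: hit
far = cache.get([0.0, 1.0, 0.0])     # unrelated prompt: miss
print(near, far, cache.hit_rate())

# The hit rate in the sample stats follows the same formula.
sample_hit_rate = 1856 / (1856 + 685)
print(f"{sample_hit_rate:.0%}")
```

Under this model, prompts that embed free-form or constantly changing text land far apart in embedding space and rarely clear the threshold, which is consistent with the advice to make prompts more regular.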

What's next

mts1b-llm repo spec — full API + governor + evals
Concept: Risk envelopes — CRO veto in gate hierarchy
Tutorial: Custom strategy — uses LLM for narrative summarization
