# Configuration Guide

This is the canonical config reference for Pali.

Source of truth, in order:

1. `internal/config/defaults.go`
2. `internal/config/config.go`
3. `internal/config/validation.go`
4. `pali.yaml.example`
5. `deploy/docker/pali.container.yaml` for the container runtime profile
## Config Files

- `pali.yaml.example`: committed canonical template
- `pali.yaml`: local default runtime file
- custom config path: supported everywhere via `-config`
The committed default config is intentionally zero-dependency:

- `vector_backend: sqlite`
- `entity_fact_backend: sqlite`
- `embedding.provider: lexical`
That makes first boot easy on any machine, but it is not the highest-quality retrieval setup. For better semantic recall and ranking, move to `ollama`, `onnx`, or `openrouter` once the basic deployment is working.
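Once the zero-dependency setup works, a common upgrade is Ollama-backed embeddings with a Qdrant vector store. A minimal sketch using the keys from the defaults in this guide (the model name and ports match the shipped defaults; adjust for your deployment):

```yaml
vector_backend: qdrant
embedding:
  provider: ollama
  fallback_provider: lexical       # keep lexical as the fallback path
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: mxbai-embed-large
qdrant:
  base_url: http://127.0.0.1:6333
  collection: pali_memories
```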
The container profile at `deploy/docker/pali.container.yaml` keeps the same zero-dependency defaults, but changes host and path assumptions to match containers:

- `server.host: 0.0.0.0`
- `database.sqlite_dsn: file:/var/lib/pali/pali.db?cache=shared`
- service-name URLs for `qdrant`, `neo4j`, and `ollama`
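Concretely, the container-oriented overrides look roughly like this sketch, assuming the compose services are named `qdrant`, `neo4j`, and `ollama` and listen on the default ports:

```yaml
server:
  host: 0.0.0.0
database:
  sqlite_dsn: file:/var/lib/pali/pali.db?cache=shared
qdrant:
  base_url: http://qdrant:6333
neo4j:
  uri: bolt://neo4j:7687
ollama:
  base_url: http://ollama:11434
```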
`pali init` will create the target config file from `pali.yaml.example` when it is missing.
## Resolution Order

From lowest to highest precedence:

1. code defaults
2. YAML values
3. legacy environment fallbacks
4. explicit `PALI_*` environment overrides
Legacy environment fallbacks:

- `OPENROUTER_API_KEY` → `openrouter.api_key`
- `NEO4J_PASSWORD` → `neo4j.password`
Common explicit overrides:

- `PALI_SERVER_HOST`
- `PALI_SERVER_PORT`
- `PALI_DATABASE_SQLITE_DSN`
- `PALI_VECTOR_BACKEND`
- `PALI_ENTITY_FACT_BACKEND`
- `PALI_QDRANT_BASE_URL`
- `PALI_NEO4J_URI`
- `PALI_NEO4J_PASSWORD`
- `PALI_EMBEDDING_PROVIDER`
- `PALI_EMBEDDING_OLLAMA_BASE_URL`
- `PALI_OPENROUTER_API_KEY`
- `PALI_AUTH_ENABLED`
- `PALI_AUTH_JWT_SECRET`
The config loader supports `PALI_*` overrides across the main runtime sections, so containerized deployments can inject settings without mutating the baked YAML file.
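In a container deployment these overrides are typically injected as environment variables. An illustrative compose-style snippet (the service name and values are examples, not part of the shipped profile):

```yaml
services:
  pali:
    environment:
      PALI_SERVER_HOST: 0.0.0.0
      PALI_VECTOR_BACKEND: qdrant
      PALI_QDRANT_BASE_URL: http://qdrant:6333
      PALI_NEO4J_PASSWORD: ${NEO4J_PASSWORD}
      PALI_AUTH_ENABLED: "true"
```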
## Current Defaults

```yaml
server:
  host: 127.0.0.1
  port: 8080
vector_backend: sqlite
entity_fact_backend: sqlite
default_tenant_id: ""
importance_scorer: heuristic
postprocess:
  enabled: true
  poll_interval_ms: 250
  batch_size: 32
  worker_count: 2
  lease_ms: 30000
  max_attempts: 5
  retry_base_ms: 500
  retry_max_ms: 60000
structured_memory:
  enabled: false
  dual_write_observations: false
  dual_write_events: false
  max_observations: 3
retrieval:
  answer_type_routing_enabled: true
  early_rank_rerank_enabled: true
  temporal_resolver_enabled: true
  open_domain_alternative_resolver_enabled: false
  scoring:
    algorithm: wal
    wal:
      recency: 0.1
      relevance: 0.8
      importance: 0.1
    match:
      recency: 0.05
      relevance: 0.70
      importance: 0.10
      query_overlap: 0.10
      routing: 0.05
  search:
    adaptive_query_expansion_enabled: false
    adaptive_query_max_extra_queries: 2
    adaptive_query_weak_lexical_threshold: 0.62
    adaptive_query_plan_confidence_threshold: 0
    candidate_window_multiplier: 5
    candidate_window_min: 50
    candidate_window_max: 200
    candidate_window_temporal_boost: 40
    candidate_window_multi_hop_boost: 80
    candidate_window_filter_boost: 30
    early_rerank_base_window: 25
    early_rerank_max_window: 25
  multi_hop:
    entity_fact_bridge_enabled: true
    llm_decomposition_enabled: false
    decomposition_provider: openrouter
    openrouter_model: openai/gpt-oss-120b:nitro
    ollama_base_url: http://127.0.0.1:11434
    ollama_model: deepseek-r1:7b
    ollama_timeout_ms: 2000
    max_decomposition_queries: 3
    enable_pairwise_rerank: true
    token_expansion_fallback: true
    graph_path_enabled: false
    graph_max_hops: 2
    graph_seed_limit: 12
    graph_path_limit: 128
    graph_min_score: 0.12
    graph_weight: 0.25
    graph_temporal_validity: false
    graph_singleton_invalidation: true
parser:
  enabled: false
  provider: heuristic
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: deepseek-r1:7b
  openrouter_model: openai/gpt-oss-120b:nitro
  ollama_timeout_ms: 20000
  store_raw_turn: true
  max_facts: 4
  dedupe_threshold: 0.88
  update_threshold: 0.94
  answer_span_retention_enabled: false
profile_layer:
  support_links_enabled: false
database:
  sqlite_dsn: file:pali.db?cache=shared
qdrant:
  base_url: http://127.0.0.1:6333
  api_key: ""
  collection: pali_memories
  timeout_ms: 2000
neo4j:
  uri: bolt://127.0.0.1:7687
  username: neo4j
  password: ""
  database: neo4j
  timeout_ms: 2000
  batch_size: 256
embedding:
  provider: lexical
  fallback_provider: lexical
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: mxbai-embed-large
  ollama_timeout_seconds: 10
  model_path: ./models/all-MiniLM-L6-v2/model.onnx
  tokenizer_path: ./models/all-MiniLM-L6-v2/tokenizer.json
openrouter:
  base_url: https://openrouter.ai/api/v1
  api_key: ""
  embedding_model: openai/text-embedding-3-small:nitro
  scoring_model: openai/gpt-oss-120b:nitro
  timeout_ms: 10000
ollama:
  base_url: http://127.0.0.1:11434
  model: deepseek-r1:7b
  timeout_ms: 2000
auth:
  enabled: false
  jwt_secret: ""
  issuer: pali
logging:
  dev_verbose: false
  progress: true
```
## Important Runtime Notes

- `vector_backend: sqlite` is implemented.
- `vector_backend: qdrant` is implemented.
- `entity_fact_backend: sqlite` and `entity_fact_backend: neo4j` are implemented.
- `embedding.provider: lexical` is the default because it requires no external services.
- `embedding.provider: lexical` is appropriate for CI, smoke tests, and local no-model runs.
- `embedding.provider: lexical` is not the best retrieval quality option; move to `ollama`, `onnx`, or `openrouter` when you want stronger semantic search.
- `embedding.provider: onnx` requires both model files and an ONNX Runtime shared library.
- `embedding.provider: openrouter` requires `openrouter.api_key`.
- `retrieval.multi_hop.llm_decomposition_enabled` is off by default.
- `retrieval.answer_type_routing_enabled` is on by default.
- `retrieval.early_rank_rerank_enabled` is on by default and is intended to lift relevant hits from ranks 11-25 into 1-10 before increasing retrieval depth.
- `retrieval.temporal_resolver_enabled` is on by default for stronger temporal answer normalization.
- `retrieval.search.*` controls adaptive query variants, candidate overfetch windows, and rerank window depth.
- `retrieval.open_domain_alternative_resolver_enabled` is the gated path for deterministic open-domain label/choice resolution.
- `parser.answer_span_retention_enabled` stores extra answer-bearing metadata on parsed memories without replacing existing canonical memory content.
- `profile_layer.support_links_enabled` stores source-support lines on summary/profile memories so retrieval can surface the summary and its backing evidence together.
- `retrieval.multi_hop.decomposition_provider: none` is only valid when LLM decomposition is disabled.
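For the `onnx` provider, both model artifacts must exist on disk and an ONNX Runtime shared library must be installed. A sketch using the default paths from this guide (you provide the MiniLM files yourself):

```yaml
embedding:
  provider: onnx
  fallback_provider: lexical
  model_path: ./models/all-MiniLM-L6-v2/model.onnx
  tokenizer_path: ./models/all-MiniLM-L6-v2/tokenizer.json
```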
## Category Improvement Rollout

These flags were originally added for the single-hop / temporal / open-domain improvement slice. They are now enabled by default in the runtime defaults and in `pali.yaml.example`.

For baseline-only comparison runs, disable these explicitly:

- `retrieval.answer_type_routing_enabled: false`
- `retrieval.early_rank_rerank_enabled: false`
- `retrieval.temporal_resolver_enabled: false`

Optional, still-experimental toggles remain off by default:

- `retrieval.open_domain_alternative_resolver_enabled: false`
- `parser.answer_span_retention_enabled: false`
- `profile_layer.support_links_enabled: false`
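As a YAML overlay, a baseline-comparison run would nest those flags under `retrieval`, the same way the shipped defaults do (a sketch; merge it into your runtime config rather than replacing the file):

```yaml
retrieval:
  answer_type_routing_enabled: false
  early_rank_rerank_enabled: false
  temporal_resolver_enabled: false
```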
## Benchmark and Test Profiles

Provider base profiles live under `test/config/providers/`. Benchmark entrypoints live under `test/benchmarks/profiles/`.

The benchmark scripts render a runtime config from the provider profile, then copy both into each result directory:

- `config.profile.yaml`
- `config.rendered.yaml`

These two files are the canonical record of a benchmark run's configuration.
## Rendering a Config for Tests or Benchmarks

Run the renderer with:

```sh
go run ./cmd/configrender \
  -profile test/config/providers/mock.yaml \
  -out /tmp/pali.eval.yaml \
  -host 127.0.0.1 \
  -port 18080 \
  -vector-backend sqlite \
  -sqlite-dsn "file:/tmp/pali.eval.sqlite?cache=shared"
```
## Setup Command

Safe local bootstrap:

Useful flags:

- `-skip-model-download`
- `-download-model`
- `-skip-runtime-check`
- `-skip-ollama-check`
- `-ollama-base-url`
- `-ollama-model`
- `-model-id`
## Validation Rules Worth Remembering

- `postprocess.*` timing and batch fields must be positive
- `retrieval.search.*` window and threshold fields must remain within their documented bounds
- `parser.max_facts` must be positive
- parser thresholds must stay in `[0, 1]`
- `structured_memory.max_observations` must be positive when dual-write modes are enabled
- OpenRouter settings are required when OpenRouter-backed embedding, parsing, or scoring is enabled
- Neo4j password is required when `entity_fact_backend: neo4j`
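Putting the last rule in context, a Neo4j-backed entity-fact setup must carry credentials. A sketch with the default connection values (the password shown is a placeholder; an empty password fails validation):

```yaml
entity_fact_backend: neo4j
neo4j:
  uri: bolt://127.0.0.1:7687
  username: neo4j
  password: changeme   # required when entity_fact_backend is neo4j
  database: neo4j
```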