# Configuration Guide

This is the canonical config reference for Pali.

Source of truth, in order:

1. `internal/config/defaults.go`
2. `internal/config/config.go`
3. `internal/config/validation.go`
4. `pali.yaml.example`
5. `deploy/docker/pali.container.yaml` for the container runtime profile
## Config Files

- `pali.yaml.example`: committed canonical template
- `pali.yaml`: local default runtime file
- custom config path: supported everywhere via `-config`
The committed default config is intentionally zero-dependency:

- `vector_backend: sqlite`
- `entity_fact_backend: sqlite`
- `embedding.provider: lexical`
That makes first boot easy on any machine, but it is not the highest-quality retrieval setup. For better semantic recall and ranking, move to `ollama`, `onnx`, or `openrouter` once the basic deployment is working.
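Once the zero-dependency setup works, a common upgrade is Ollama-backed embeddings with a Qdrant vector store. A minimal sketch using the keys from the defaults in this guide (the model name and ports match the shipped defaults; adjust for your deployment):

```yaml
vector_backend: qdrant
embedding:
  provider: ollama
  fallback_provider: lexical       # keep lexical as the fallback path
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: mxbai-embed-large
qdrant:
  base_url: http://127.0.0.1:6333
  collection: pali_memories
```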
The container profile at `deploy/docker/pali.container.yaml` keeps the same zero-dependency defaults, but changes host and path assumptions to match containers:

- `server.host: 0.0.0.0`
- `database.sqlite_dsn: file:/var/lib/pali/pali.db?cache=shared`
- service-name URLs for `qdrant`, `neo4j`, and `ollama`
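Concretely, the container-oriented overrides look roughly like this sketch, assuming the compose services are named `qdrant`, `neo4j`, and `ollama` and listen on the default ports:

```yaml
server:
  host: 0.0.0.0
database:
  sqlite_dsn: file:/var/lib/pali/pali.db?cache=shared
qdrant:
  base_url: http://qdrant:6333
neo4j:
  uri: bolt://neo4j:7687
ollama:
  base_url: http://ollama:11434
```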
`pali init` will create the target config file from `pali.yaml.example` when it is missing.
## Resolution Order

From lowest to highest precedence:

1. code defaults
2. YAML values
3. legacy environment fallbacks
4. explicit `PALI_*` environment overrides
Legacy environment fallbacks:

- `OPENROUTER_API_KEY` → `openrouter.api_key`
- `NEO4J_PASSWORD` → `neo4j.password`
Common explicit overrides:

- `PALI_SERVER_HOST`
- `PALI_SERVER_PORT`
- `PALI_DATABASE_SQLITE_DSN`
- `PALI_VECTOR_BACKEND`
- `PALI_ENTITY_FACT_BACKEND`
- `PALI_QDRANT_BASE_URL`
- `PALI_NEO4J_URI`
- `PALI_NEO4J_PASSWORD`
- `PALI_EMBEDDING_PROVIDER`
- `PALI_EMBEDDING_OLLAMA_BASE_URL`
- `PALI_OPENROUTER_API_KEY`
- `PALI_AUTH_ENABLED`
- `PALI_AUTH_JWT_SECRET`
The config loader supports `PALI_*` overrides across the main runtime sections, so containerized deployments can inject settings without mutating the baked YAML file.
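In a container deployment these overrides are typically injected as environment variables. An illustrative compose-style snippet (the service name and values are examples, not part of the shipped profile):

```yaml
services:
  pali:
    environment:
      PALI_SERVER_HOST: 0.0.0.0
      PALI_VECTOR_BACKEND: qdrant
      PALI_QDRANT_BASE_URL: http://qdrant:6333
      PALI_NEO4J_PASSWORD: ${NEO4J_PASSWORD}
      PALI_AUTH_ENABLED: "true"
```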
## Current Defaults

```yaml
server:
  host: 127.0.0.1
  port: 8080
vector_backend: sqlite
entity_fact_backend: sqlite
default_tenant_id: ""
importance_scorer: heuristic
postprocess:
  enabled: true
  poll_interval_ms: 250
  batch_size: 32
  worker_count: 2
  lease_ms: 30000
  max_attempts: 5
  retry_base_ms: 500
  retry_max_ms: 60000
structured_memory:
  enabled: false
  dual_write_observations: false
  dual_write_events: false
  max_observations: 3
retrieval:
  answer_type_routing_enabled: true
  early_rank_rerank_enabled: true
  temporal_resolver_enabled: true
  open_domain_alternative_resolver_enabled: false
  scoring:
    algorithm: wal
    wal:
      recency: 0.1
      relevance: 0.8
      importance: 0.1
    match:
      recency: 0.05
      relevance: 0.70
      importance: 0.10
      query_overlap: 0.10
      routing: 0.05
  search:
    adaptive_query_expansion_enabled: false
    adaptive_query_max_extra_queries: 2
    adaptive_query_weak_lexical_threshold: 0.62
    adaptive_query_plan_confidence_threshold: 0
    candidate_window_multiplier: 5
    candidate_window_min: 50
    candidate_window_max: 200
    candidate_window_temporal_boost: 40
    candidate_window_multi_hop_boost: 80
    candidate_window_filter_boost: 30
    early_rerank_base_window: 25
    early_rerank_max_window: 25
  multi_hop:
    entity_fact_bridge_enabled: true
    llm_decomposition_enabled: false
    decomposition_provider: openrouter
    openrouter_model: openai/gpt-oss-120b:nitro
    ollama_base_url: http://127.0.0.1:11434
    ollama_model: deepseek-r1:7b
    ollama_timeout_ms: 2000
    max_decomposition_queries: 3
    enable_pairwise_rerank: true
    token_expansion_fallback: true
    graph_path_enabled: false
    graph_max_hops: 2
    graph_seed_limit: 12
    graph_path_limit: 128
    graph_min_score: 0.12
    graph_weight: 0.25
    graph_temporal_validity: false
    graph_singleton_invalidation: true
parser:
  enabled: false
  provider: heuristic
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: deepseek-r1:7b
  openrouter_model: openai/gpt-oss-120b:nitro
  ollama_timeout_ms: 20000
  store_raw_turn: true
  max_facts: 4
  dedupe_threshold: 0.88
  update_threshold: 0.94
  answer_span_retention_enabled: false
profile_layer:
  support_links_enabled: false
database:
  sqlite_dsn: file:pali.db?cache=shared
qdrant:
  base_url: http://127.0.0.1:6333
  api_key: ""
  collection: pali_memories
  timeout_ms: 2000
neo4j:
  uri: bolt://127.0.0.1:7687
  username: neo4j
  password: ""
  database: neo4j
  timeout_ms: 2000
  batch_size: 256
embedding:
  provider: lexical
  fallback_provider: lexical
  ollama_base_url: http://127.0.0.1:11434
  ollama_model: mxbai-embed-large
  ollama_timeout_seconds: 10
  model_path: ./models/all-MiniLM-L6-v2/model.onnx
  tokenizer_path: ./models/all-MiniLM-L6-v2/tokenizer.json
openrouter:
  base_url: https://openrouter.ai/api/v1
  api_key: ""
  embedding_model: openai/text-embedding-3-small:nitro
  scoring_model: openai/gpt-oss-120b:nitro
  timeout_ms: 10000
ollama:
  base_url: http://127.0.0.1:11434
  model: deepseek-r1:7b
  timeout_ms: 2000
auth:
  enabled: false
  jwt_secret: ""
  issuer: pali
logging:
  dev_verbose: false
  progress: true
```
## Important Runtime Notes

- `vector_backend: sqlite` is implemented.
- `vector_backend: qdrant` is implemented.
- `entity_fact_backend: sqlite` and `entity_fact_backend: neo4j` are implemented.
- `embedding.provider: lexical` is the default because it requires no external services.
- `embedding.provider: lexical` is appropriate for CI, smoke tests, and local no-model runs.
- `embedding.provider: lexical` is not the best retrieval quality option; move to `ollama`, `onnx`, or `openrouter` when you want stronger semantic search.
- `embedding.provider: onnx` requires both model files and an ONNX Runtime shared library.
- `embedding.provider: openrouter` requires `openrouter.api_key`.
- `retrieval.multi_hop.llm_decomposition_enabled` is off by default.
- `retrieval.answer_type_routing_enabled` is on by default.
- `retrieval.early_rank_rerank_enabled` is on by default and is intended to lift relevant hits from ranks 11-25 into 1-10 before increasing retrieval depth.
- `retrieval.temporal_resolver_enabled` is on by default for stronger temporal answer normalization.
- `retrieval.search.*` controls adaptive query variants, candidate overfetch windows, and rerank window depth.
- `retrieval.open_domain_alternative_resolver_enabled` is the gated path for deterministic open-domain label/choice resolution.
- `parser.answer_span_retention_enabled` stores extra answer-bearing metadata on parsed memories without replacing existing canonical memory content.
- `profile_layer.support_links_enabled` stores source-support lines on summary/profile memories so retrieval can surface the summary and its backing evidence together.
- `retrieval.multi_hop.decomposition_provider: none` is only valid when LLM decomposition is disabled.
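For the `onnx` provider, both model artifacts must exist on disk and an ONNX Runtime shared library must be installed. A sketch using the default paths from this guide (you provide the MiniLM files yourself):

```yaml
embedding:
  provider: onnx
  fallback_provider: lexical
  model_path: ./models/all-MiniLM-L6-v2/model.onnx
  tokenizer_path: ./models/all-MiniLM-L6-v2/tokenizer.json
```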
## Category Improvement Rollout

These flags were originally added for the single-hop / temporal / open-domain improvement slice. They are now enabled by default in the runtime defaults and in `pali.yaml.example`.

For baseline-only comparison runs, disable these explicitly:

- `retrieval.answer_type_routing_enabled: false`
- `retrieval.early_rank_rerank_enabled: false`
- `retrieval.temporal_resolver_enabled: false`

Optional, still-experimental toggles remain off by default:

- `retrieval.open_domain_alternative_resolver_enabled: false`
- `parser.answer_span_retention_enabled: false`
- `profile_layer.support_links_enabled: false`
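As a YAML overlay, a baseline-comparison run would nest those flags under `retrieval`, the same way the shipped defaults do (a sketch; merge it into your runtime config rather than replacing the file):

```yaml
retrieval:
  answer_type_routing_enabled: false
  early_rank_rerank_enabled: false
  temporal_resolver_enabled: false
```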
## Benchmark and Test Profiles

Provider base profiles live under `test/config/providers/`. Benchmark entrypoints live under `test/benchmarks/profiles/`.

The benchmark scripts render a runtime config from the provider profile, then copy both into each result directory:

- `config.profile.yaml`
- `config.rendered.yaml`

These two files are the canonical record of a benchmark run's configuration.
## Rendering a Config for Tests or Benchmarks

Run the renderer with:

```sh
go run ./cmd/configrender \
  -profile test/config/providers/mock.yaml \
  -out /tmp/pali.eval.yaml \
  -host 127.0.0.1 \
  -port 18080 \
  -vector-backend sqlite \
  -sqlite-dsn "file:/tmp/pali.eval.sqlite?cache=shared"
```
## Setup Command

Safe local bootstrap:

Useful flags:

- `-skip-model-download`
- `-download-model`
- `-skip-runtime-check`
- `-skip-ollama-check`
- `-ollama-base-url`
- `-ollama-model`
- `-model-id`
## Validation Rules Worth Remembering

- `postprocess.*` timing and batch fields must be positive
- `retrieval.search.*` window and threshold fields must remain within their documented bounds
- `parser.max_facts` must be positive
- parser thresholds must stay in `[0, 1]`
- `structured_memory.max_observations` must be positive when dual-write modes are enabled
- OpenRouter settings are required when OpenRouter-backed embedding, parsing, or scoring is enabled
- Neo4j password is required when `entity_fact_backend: neo4j`
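Putting the last rule in context, a Neo4j-backed entity-fact setup must carry credentials. A sketch with the default connection values (the password shown is a placeholder; an empty password fails validation):

```yaml
entity_fact_backend: neo4j
neo4j:
  uri: bolt://127.0.0.1:7687
  username: neo4j
  password: changeme   # required when entity_fact_backend is neo4j
  database: neo4j
```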