Basic Configuration

Distributed Knowledge can be configured through command-line parameters and configuration files. This document covers the basic configuration options to get started with the system.

Command Line Parameters

The Distributed Knowledge client (dk) accepts several command-line parameters:

Parameter           Description                          Default                         Required
-userId             User identifier in the network       None                            Yes
-server             WebSocket server URL                 wss://distributedknowledge.org  Yes
-modelConfig        Path to LLM configuration file       ./model_config.json             Yes
-rag_sources        Path to RAG source file (JSONL)      None                            No
-vector_db          Path to vector database directory    /tmp/vector_db                  No
-private            Path to private key file             None                            No
-public             Path to public key file              None                            No
-project_path       Root path for project files          Current directory               No
-queriesFile        Path to queries storage file         ./queries.json                  No
-answersFile        Path to answers storage file         ./answers.json                  No
-automaticApproval  Path to approval rules file          ./automatic_approval.json       No

Example Usage

A basic command to start the Distributed Knowledge client:

./dk -userId="research_team" \
     -private="./keys/private_key.pem" \
     -public="./keys/public_key.pem" \
     -project_path="/path/to/project" \
     -server="wss://distributedknowledge.org" \
     -rag_sources="./data/rag_sources.jsonl"
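
Only -userId, -server, and -modelConfig are required; the remaining parameters fall back to their defaults, and temporary keys are generated when none are supplied. A minimal invocation, assuming a model_config.json in the working directory, might look like:

./dk -userId="research_team" \
     -server="wss://distributedknowledge.org" \
     -modelConfig="./model_config.json"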

LLM Configuration

The LLM configuration file specifies which model provider and settings to use. It should be in JSON format:

Example: Anthropic Configuration

{
  "provider": "anthropic",
  "api_key": "sk-ant-your-anthropic-api-key",
  "model": "claude-3-sonnet-20240229",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 1000
  }
}

Example: OpenAI Configuration

{
  "provider": "openai",
  "api_key": "sk-your-openai-api-key",
  "model": "gpt-4",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2000
  }
}
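
The Anthropic and OpenAI configurations embed API keys, so treat the configuration file like a credential. One way to restrict access (filename matches the default above):

chmod 600 model_config.json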

Example: Ollama Configuration

{
  "provider": "ollama",
  "model": "llama3",
  "base_url": "http://localhost:11434/api/generate",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2000
  }
}
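
Ollama runs locally and needs no API key. Before starting dk, you can confirm that the endpoint from base_url above is reachable (this assumes Ollama is running and the llama3 model has been pulled):

curl http://localhost:11434/api/generate \
     -d '{"model": "llama3", "prompt": "ping", "stream": false}'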

RAG Sources Configuration

The RAG (Retrieval-Augmented Generation) system uses a JSONL file to define knowledge sources. Each line of the file is a standalone JSON object describing one document:

{"text": "The capital of France is Paris.", "file": "geography.txt"}
{"text": "Water boils at 100 degrees Celsius at sea level.", "file": "science.txt"}

Each entry should include:

  • text: The content of the document
  • file: A name or identifier for the document
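
Because each line must be valid JSON on its own, a malformed line is easy to introduce by hand. A quick sanity check with jq (assuming jq is installed; the path is illustrative):

# Exit status is 0 only if every line parses and carries both fields
jq -n -e '[inputs | has("text") and has("file")] | all' ./data/rag_sources.jsonl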

Authentication Keys

For secure communication, Distributed Knowledge uses Ed25519 key pairs. If none are provided, temporary keys are generated; for production use, you should create and specify permanent keys.

Generating Keys

Generate a key pair using ssh-keygen:

# Generate an Ed25519 private key with no passphrase
ssh-keygen -t ed25519 -f private_key.pem -N ""

# Derive the matching public key
ssh-keygen -y -f private_key.pem > public_key.pem

Key Security

Important security considerations:

  • Keep your private key secure and never share it
  • The public key can be shared with others for verification
  • Use restrictive file permissions on your key files, as shown below
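
For example, from the directory containing the keys:

chmod 600 private_key.pem   # private key: owner read/write only
chmod 644 public_key.pem    # public key: safe to leave world-readable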

Automatic Approval Configuration

The automatic approval system uses a JSON file containing an array of condition strings:

[
  "Accept all questions about public information",
  "Allow queries from trusted peers",
  "Reject questions about personal data"
]

These conditions are used to determine which incoming queries should be automatically accepted or rejected.
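
A sketch of creating the rules file and pointing the client at it (the paths and user ID are illustrative):

cat > automatic_approval.json <<'EOF'
[
  "Accept all questions about public information",
  "Reject questions about personal data"
]
EOF

./dk -userId="research_team" -automaticApproval="./automatic_approval.json"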

Directory Structure

A recommended directory structure for your Distributed Knowledge setup:

dk/
├── config/
│   ├── model_config.json
│   └── automatic_approval.json
├── data/
│   ├── rag_sources.jsonl
│   └── vector_database/
├── keys/
│   ├── private_key.pem
│   └── public_key.pem
├── storage/
│   ├── queries.json
│   └── answers.json
└── dk  # executable
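
To create this layout in one step:

mkdir -p dk/{config,data/vector_database,keys,storage}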

Next Steps

After completing basic configuration:

  1. Learn about advanced configuration options
  2. Explore network configuration
  3. Set up automatic approval rules
  4. Configure LLM parameters