Basic Configuration

Distributed Knowledge can be configured through command-line parameters and configuration files. This document covers the basic configuration options to get started with the system.

Command Line Parameters

The Distributed Knowledge client (dk) accepts several command-line parameters:

Parameter           Description                          Default                         Required
-userId             User identifier in the network       None                            Yes
-server             WebSocket server URL                 wss://distributedknowledge.org  Yes
-modelConfig        Path to LLM configuration file       ./model_config.json             Yes
-rag_sources        Path to RAG source file (JSONL)      None                            No
-vector_db          Path to vector database directory    /tmp/vector_db                  No
-private            Path to private key file             None                            No
-public             Path to public key file              None                            No
-project_path       Root path for project files          Current directory               No
-queriesFile        Path to queries storage file         ./queries.json                  No
-answersFile        Path to answers storage file         ./answers.json                  No
-automaticApproval  Path to approval rules file          ./automatic_approval.json       No

Example Usage

A basic command to start the Distributed Knowledge client:

./dk -userId="research_team" \
     -private="./keys/private_key.pem" \
     -public="./keys/public_key.pem" \
     -project_path="/path/to/project" \
     -server="wss://distributedknowledge.org" \
     -rag_sources="./data/rag_sources.jsonl"
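
Only -userId, -server, and -modelConfig are required; the remaining parameters fall back to their defaults, and temporary keys are generated when none are supplied. A minimal invocation, assuming a model_config.json in the working directory, might look like:

./dk -userId="research_team" \
     -server="wss://distributedknowledge.org" \
     -modelConfig="./model_config.json"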

LLM Configuration

The LLM configuration file specifies which model provider and settings to use. It should be in JSON format:

Example: Anthropic Configuration

{
  "provider": "anthropic",
  "api_key": "sk-ant-your-anthropic-api-key",
  "model": "claude-3-sonnet-20240229",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 1000
  }
}

Example: OpenAI Configuration

{
  "provider": "openai",
  "api_key": "sk-your-openai-api-key",
  "model": "gpt-4",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2000
  }
}
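
The Anthropic and OpenAI configurations embed API keys, so treat the configuration file like a credential. One way to restrict access (filename matches the default above):

chmod 600 model_config.json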

Example: Ollama Configuration

{
  "provider": "ollama",
  "model": "llama3",
  "base_url": "http://localhost:11434/api/generate",
  "parameters": {
    "temperature": 0.7,
    "max_tokens": 2000
  }
}
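
Ollama runs locally and needs no API key. Before starting dk, you can confirm that the endpoint from base_url above is reachable (this assumes Ollama is running and the llama3 model has been pulled):

curl http://localhost:11434/api/generate \
     -d '{"model": "llama3", "prompt": "ping", "stream": false}'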

RAG Sources Configuration

The RAG (Retrieval-Augmented Generation) system uses a JSONL file to define knowledge sources. Each line of the file is a standalone JSON object describing one document:

{"text": "The capital of France is Paris.", "file": "geography.txt"}
{"text": "Water boils at 100 degrees Celsius at sea level.", "file": "science.txt"}

Each entry should include:

  • text: The content of the document
  • file: A name or identifier for the document
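
Because each line must be valid JSON on its own, a malformed line is easy to introduce by hand. A quick sanity check with jq (assuming jq is installed; the path is illustrative):

# Exit status is 0 only if every line parses and carries both fields
jq -n -e '[inputs | has("text") and has("file")] | all' ./data/rag_sources.jsonl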

Authentication Keys

For secure communication, Distributed Knowledge uses Ed25519 key pairs. If none are provided, temporary keys are generated; for production use, you should create and specify permanent keys.

Generating Keys

Generate a key pair using ssh-keygen:

# Generate an Ed25519 private key with no passphrase
ssh-keygen -t ed25519 -f private_key.pem -N ""

# Derive the matching public key
ssh-keygen -y -f private_key.pem > public_key.pem

Key Security

Important security considerations:

  • Keep your private key secure and never share it
  • The public key can be shared with others for verification
  • Use restrictive file permissions on your key files, as shown below
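
For example, from the directory containing the keys:

chmod 600 private_key.pem   # private key: owner read/write only
chmod 644 public_key.pem    # public key: safe to leave world-readable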

Automatic Approval Configuration

The automatic approval system uses a JSON file containing an array of condition strings:

[
  "Accept all questions about public information",
  "Allow queries from trusted peers",
  "Reject questions about personal data"
]

These conditions are used to determine which incoming queries should be automatically accepted or rejected.
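
A sketch of creating the rules file and pointing the client at it (the paths and user ID are illustrative):

cat > automatic_approval.json <<'EOF'
[
  "Accept all questions about public information",
  "Reject questions about personal data"
]
EOF

./dk -userId="research_team" -automaticApproval="./automatic_approval.json"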

Directory Structure

A recommended directory structure for your Distributed Knowledge setup:

dk/
├── config/
│   ├── model_config.json
│   └── automatic_approval.json
├── data/
│   ├── rag_sources.jsonl
│   └── vector_database/
├── keys/
│   ├── private_key.pem
│   └── public_key.pem
├── storage/
│   ├── queries.json
│   └── answers.json
└── dk  # executable
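
To create this layout in one step:

mkdir -p dk/{config,data/vector_database,keys,storage}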

Next Steps

After completing basic configuration:

  1. Learn about advanced configuration options
  2. Explore network configuration
  3. Set up automatic approval rules
  4. Configure LLM parameters