The Kiwix Archive

From Sea of Fate
Revision as of 09:45, 3 March 2026

📖 Introduction

Kiwix is an offline content reader that allows you to browse massive websites (such as Wikipedia, StackExchange, or Project Gutenberg) without an internet connection.

  • The Format: It uses highly compressed ZIM (.zim) files. A single file can contain the entirety of Wikipedia (with images) or a complete medical encyclopedia.
  • The Goal: To provide a permanent, offline knowledge base that remains accessible even if the internet is down, serving everyone on your local network.
  • Synergy: Works alongside OpenAlex (scholarly search) and ArchiveBox (personal web snapshots) to create a three-tier local research library.

💾 The Infrastructure

Blackberry has been slimmed down to be more efficient now that indexing is handled elsewhere.

  • Host: Blackberry
  • VM Config: Debian | 4 Cores | 6GB RAM.
  • Storage: 4TB XFS disk mounted at /mnt/docker_data, plus an additional 5TB XFS disk for ArchiveBox mounted at /mnt/archive_data.

Linux Commands

A set of Linux commands that help show the progress of indexing and file copying.
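
The commands themselves live on the linked page. Purely as an illustration of the idea, the sketch below reports how many ZIM files a directory holds, how much space it uses, and how much space is left on its disk. The function name zim_progress is hypothetical and not part of this wiki's scripts; it relies only on standard find, du, df, and awk.

```shell
#!/bin/sh
# Hypothetical helper (not one of the wiki's scripts): summarize copy
# progress for a directory of ZIM files.
# Usage: zim_progress /mnt/docker_data/stacks/kiwix-archive/zim
zim_progress() {
    dir="${1:-.}"
    # Count .zim files in the top level of the directory
    count=$(find "$dir" -maxdepth 1 -type f -name '*.zim' | wc -l)
    # Total size of the directory, human readable
    size=$(du -sh "$dir" | cut -f1)
    # Free space on the filesystem that holds the directory
    free=$(df -Ph "$dir" | awk 'NR==2 {print $4}')
    # $((count)) strips any padding that wc may add
    echo "zim_files=$((count)) used=$size free=$free"
}
```

To follow a long copy, the function can be called in a loop, for example: while true; do zim_progress /mnt/docker_data/stacks/kiwix-archive/zim; sleep 10; done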

🐋 The Software Stack (Docker)

Installing Docker & Compose

Before installing Dockge, install the Docker Engine and the Compose plugin from Docker's official repository for Debian.

# Update and install dependencies
sudo apt update && sudo apt install -y ca-certificates curl gnupg
# Add Docker's official GPG key
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install Docker Engine and Compose Plugin
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Optional: allow your user to run docker without sudo (takes effect after logging out and back in)
sudo usermod -aG docker "$USER"
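
After the install it is worth confirming that both the engine and the Compose plugin respond. The following is a minimal sketch; the function name check_docker is hypothetical, while docker --version and docker compose version are the standard version commands.

```shell
#!/bin/sh
# Hypothetical helper: report whether Docker and the Compose plugin are usable.
check_docker() {
    # Bail out early if the docker CLI is not on PATH
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker: not installed"
        return 1
    fi
    # Print the engine client version, then the Compose plugin version
    docker --version
    docker compose version
}
```

A fuller smoke test is sudo docker run --rm hello-world, which pulls a tiny image and runs it once.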

🛠️ Installing Dockge

Dockge allows us to manage our "Stacks" (Docker Compose files) through a clean web interface.

# Preparation: Create directories
mkdir -p /opt/stacks /opt/dockge
cd /opt/dockge
# Download and Start Dockge
curl https://raw.githubusercontent.com/louislam/dockge/master/compose.yaml --output compose.yaml
docker compose up -d

🛠️ Preparation: Storage Folders

Organize the ZIM files on the 4TB disk (mounted at /mnt/docker_data) so the container can find them easily.

mkdir -p /mnt/docker_data/stacks/kiwix-archive/zim/
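
Because the host has two XFS disks, it can help to confirm which one actually backs the new folder. This is a small sketch with a hypothetical function name (backing_fs); it assumes GNU df, which supports the -T option for printing the filesystem type.

```shell
#!/bin/sh
# Hypothetical helper: print the device, filesystem type, and mount point
# that back a given path.
backing_fs() {
    # -P gives one line per filesystem, -T adds the type column;
    # with both, the mount point is the seventh field
    df -PT "$1" | awk 'NR==2 {print $1, $2, $7}'
}
```

If the layout described under The Infrastructure is in place, backing_fs /mnt/docker_data/stacks/kiwix-archive/zim should report an xfs filesystem mounted at /mnt/docker_data.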

📄 Kiwix YAML (The Stack)

Deploy this in your Dockge instance on Blackberry (port 5001) and name the stack kiwix.

services:
  kiwix:
    image: ghcr.io/kiwix/kiwix-serve:latest
    container_name: kiwix_wikipedia
    volumes:
      - /mnt/docker_data/stacks/kiwix-archive/zim:/data
    ports:
      - 8081:8080
    command:
      - --library
      - library.xml
    restart: unless-stopped
networks: {}

🌐 Accessing and Using the Library

Tool | URL | Purpose
Kiwix Web UI | http://blackberry:8081 | Browse your downloaded offline libraries (8081 is the host port mapped in the stack above).
ZIM Library | https://kiwix.org/en/download | Where to download new content (Wikipedia, StackOverflow, etc.).
German ZIM Library | https://ftp.fau.de/kiwix/zim/ | Directory listing of ZIM files.
Kiwix.org downloads | https://download.kiwix.org/zim/ | More listings of ZIM files.

Indexing Files after download

Three helper scripts in the /mnt/docker_data/stacks/kiwix-archive/zim/ directory help add new ZIM files to the XML index.

Filename | Purpose
./audit_incomplete.sh | Scans the zim directory for incomplete files and moves them to the incomplete directory.
./sweep_parts.sh | Checks the incomplete directory for files whose corresponding ZIM has since finished downloading.
./index_vault.sh | Adds new ZIM files to the XML index file.
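
The contents of these scripts are not shown here. As a rough idea of what an indexing pass can look like, the sketch below registers every ZIM that is not yet mentioned in the library file. It is not the actual index_vault.sh: the function name index_zims, the KIWIX_MANAGE variable, and the filename-based "already indexed" check are all assumptions. It also assumes kiwix-manage (from kiwix-tools) is on the PATH and that the index is the library.xml file referenced by the stack.

```shell
#!/bin/sh
# Hypothetical sketch of an indexing pass; NOT the wiki's index_vault.sh.
# Assumes kiwix-manage (kiwix-tools) is installed; override for testing.
KIWIX_MANAGE="${KIWIX_MANAGE:-kiwix-manage}"

index_zims() {
    zim_dir="$1"
    library="$zim_dir/library.xml"
    for zim in "$zim_dir"/*.zim; do
        # Skip the unexpanded pattern when the directory has no .zim files
        [ -e "$zim" ] || continue
        # Assumption: a ZIM whose filename already appears in the library
        # has been indexed before, so leave it alone
        if [ -f "$library" ] && grep -qF "$(basename "$zim")" "$library"; then
            continue
        fi
        # kiwix-manage syntax: <library file> add <zim file>
        "$KIWIX_MANAGE" "$library" add "$zim" || echo "failed: $zim" >&2
    done
}
```

A typical call would be index_zims /mnt/docker_data/stacks/kiwix-archive/zim. Since the container sees this directory as /data, check that the paths recorded in library.xml resolve inside the container before relying on the result.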