Production Deployment
This guide covers what you need beyond the basic installation to run a reliable, secure production node. It addresses systemd hardening, backups, disaster recovery, log management, resource monitoring, and kernel tuning -- everything that separates a node that works from a node that stays up.
Complete the Installation and Security Guide before applying these production hardening steps. This guide assumes you are running monod as a dedicated non-root user with systemd.
systemd Hardening
The Installation guide provides a basic systemd unit file. For production, extend it with additional security and resource directives.
Extended Unit File
sudo tee /etc/systemd/system/monod.service > /dev/null <<EOF
[Unit]
Description=Monolythium Node
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=monod
Group=monod
WorkingDirectory=/home/monod
ExecStart=/usr/local/bin/monod start --home /home/monod/.mono
# Restart configuration
Restart=on-failure
RestartSec=3
StartLimitInterval=0
# Resource limits
LimitNOFILE=65535
LimitNPROC=4096
# Environment
Environment="HOME=/home/monod"
# --- Production hardening ---
MemoryMax=12G
CPUWeight=90
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/home/monod/.mono
ProtectKernelModules=true
ProtectKernelTunables=true
RestrictNamespaces=true
RestrictSUIDSGID=true
[Install]
WantedBy=multi-user.target
EOF
After editing, reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart monod
Directive Reference
| Directive | What It Does | Why It Matters |
|---|---|---|
| `MemoryMax=12G` | Kills the process if it exceeds 12 GB of RAM | Prevents a memory leak from consuming all host memory and crashing other services |
| `CPUWeight=90` | Sets the unit's CPU scheduling weight (default is 100; higher weights receive proportionally more CPU under contention) | Makes monod's CPU share explicit; raise the weight if consensus work is being starved by other units |
| `NoNewPrivileges=true` | Prevents the process (and its children) from gaining additional privileges via setuid/setgid binaries | Blocks privilege escalation if the process is compromised |
| `PrivateTmp=true` | Gives monod its own isolated /tmp directory | Prevents other processes from reading or tampering with temporary files |
| `ProtectSystem=strict` | Mounts the entire filesystem read-only except explicitly allowed paths | Stops a compromised process from modifying system binaries or configuration |
| `ProtectHome=true` | Makes all home directories inaccessible except the allowed paths | Protects other users' data from being read or modified |
| `ReadWritePaths=/home/monod/.mono` | Allows writes only to the chain data directory | The sole exception to `ProtectSystem=strict` -- monod can write chain state here |
| `ProtectKernelModules=true` | Blocks loading or unloading kernel modules | Prevents a compromised process from inserting rootkits or rogue drivers |
| `ProtectKernelTunables=true` | Makes /proc/sys, /sys, and similar paths read-only | Prevents runtime modification of kernel parameters |
| `RestrictNamespaces=true` | Denies creation of new Linux namespaces | Prevents container escape techniques and namespace-based privilege escalation |
| `RestrictSUIDSGID=true` | Prevents setting SUID/SGID bits on files | Blocks a common vector for privilege escalation |
If monod fails to start after adding these directives, check `journalctl -u monod -n 50` for "Permission denied" errors. The most common cause is a path missing from `ReadWritePaths=`.
Backup Strategy
What to Back Up
| Item | Path | Priority | Notes |
|---|---|---|---|
| Validator key | ~/.mono/config/priv_validator_key.json | Critical | Loss means losing your validator. Compromise means potential double-signing. |
| Node key | ~/.mono/config/node_key.json | High | Defines your P2P identity. Replaceable but causes peer disruption. |
| Account keys (keyring) | ~/.mono/keyring-* | Critical | Contains account private keys for signing transactions. |
| Configuration | ~/.mono/config/config.toml, app.toml, client.toml | Medium | Recreatable, but saves time during recovery. |
| Chain data | ~/.mono/data/ | Do not back up | Re-sync from a snapshot or genesis instead. Chain data is large and changes every block. |
Automated Backup Script
Create a cron job that runs monarch backup keys daily:
# Create backup script
sudo -u monod tee /home/monod/backup-keys.sh > /dev/null <<'SCRIPT'
#!/bin/bash
set -euo pipefail
BACKUP_DIR="/home/monod/backups"
DATE=$(date +%Y-%m-%d)
BACKUP_FILE="${BACKUP_DIR}/keys-${DATE}.tar.gz.enc"
mkdir -p "$BACKUP_DIR"
# Use monarch to create an encrypted backup
monarch backup keys --output "$BACKUP_FILE" --home /home/monod/.mono
# Keep only the last 30 days of backups locally
find "$BACKUP_DIR" -name "keys-*.tar.gz.enc" -mtime +30 -delete
echo "[$(date)] Backup complete: $BACKUP_FILE"
SCRIPT
sudo chmod +x /home/monod/backup-keys.sh
Schedule it with cron:
# Open the monod user's crontab
sudo -u monod crontab -e
# Add this line to run daily at 03:00 server time:
0 3 * * * /home/monod/backup-keys.sh >> /home/monod/backups/backup.log 2>&1
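Backups can also rot silently on disk. One option is to pair each archive with a SHA-256 sidecar so corruption is caught at verification time rather than mid-recovery. A self-contained sketch (the scratch directory stands in for /home/monod/backups, and the file name is a placeholder):

```shell
#!/bin/sh
# Write a sha256 sidecar next to each backup, then verify the archive against it.
# The scratch directory stands in for /home/monod/backups.
set -eu
dir=$(mktemp -d)
backup="$dir/keys-example.tar.gz.enc"
printf 'encrypted-bytes' > "$backup"   # placeholder for a real encrypted archive

# At backup time: record the digest next to the archive
( cd "$dir" && sha256sum "${backup##*/}" > "$backup.sha256" )

# At verification time: re-check the archive against its sidecar
check=$(cd "$dir" && sha256sum -c "$backup.sha256")
echo "$check"
rm -rf "$dir"
```

Adding the two `sha256sum` lines to the backup script above costs nothing and turns "the file exists" into "the file is intact".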
Using monod directly instead of monarch:
# Manual encrypted backup of key files
tar czf - -C /home/monod/.mono/config priv_validator_key.json node_key.json | \
openssl enc -aes-256-cbc -pbkdf2 -out /home/monod/backups/keys-$(date +%Y-%m-%d).tar.gz.enc
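Before trusting that pipeline, it is worth proving it round-trips losslessly. A self-contained sketch on a scratch key file; `-pass pass:example` is a stand-in so the test runs non-interactively (in production, supply the pass-phrase from a root-only file instead of the command line):

```shell
#!/bin/sh
# Round-trip test of the tar | openssl backup pipeline on a scratch key file.
# "-pass pass:example" is illustrative only; real backups should read the
# pass-phrase from a protected file rather than passing it on the command line.
set -eu
workdir=$(mktemp -d)
mkdir -p "$workdir/restore"
printf '{"address":"placeholder"}\n' > "$workdir/priv_validator_key.json"

# Encrypt, exactly as the backup command does
tar czf - -C "$workdir" priv_validator_key.json |
  openssl enc -aes-256-cbc -pbkdf2 -pass pass:example -out "$workdir/keys.tar.gz.enc"

# Decrypt into a scratch directory and compare byte-for-byte
openssl enc -d -aes-256-cbc -pbkdf2 -pass pass:example -in "$workdir/keys.tar.gz.enc" |
  tar xzf - -C "$workdir/restore"

cmp "$workdir/priv_validator_key.json" "$workdir/restore/priv_validator_key.json"
roundtrip=ok
echo "round-trip OK"
rm -rf "$workdir"
```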
Backup Verification
Test your backup restoration monthly. An untested backup is not a backup.
# Decrypt and list contents (does not overwrite anything)
openssl enc -d -aes-256-cbc -pbkdf2 -in /home/monod/backups/keys-2026-03-30.tar.gz.enc | tar tzf -
Off-Site Storage
- Store encrypted backups in at least two separate locations (USB drive, safety deposit box, a different cloud provider)
- Never store unencrypted keys on cloud storage or network-attached drives
- Use unique, strong encryption passwords -- store those passwords separately from the backups
- Rotate backup encryption passwords quarterly
Your priv_validator_key.json is the single most sensitive file. If it is compromised, an attacker can double-sign and permanently slash your validator. Encrypt it before any transfer and never leave unencrypted copies on network-accessible storage.
Disaster Recovery
When a server fails, your recovery speed determines how many blocks you miss and whether you get jailed.
Recovery Procedure
- Provision a new server meeting the hardware requirements
- Install monod and monarch following the Installation guide
- Restore keys from backup:
# Using monarch
monarch backup restore /path/to/backup/
# Or manually
openssl enc -d -aes-256-cbc -pbkdf2 -in keys-backup.tar.gz.enc | \
  tar xzf - -C /home/monod/.mono/config/
- Download the canonical configuration for your network:
monarch join --network Testnet --home /home/monod/.mono
- Sync the chain -- choose one:
  - From snapshot (30-60 minutes): `monarch snapshot apply --network Testnet`
  - Via state-sync (15-30 minutes): `monarch state-sync --network Testnet`
  - From genesis (several hours): let the node sync naturally
- Verify the node is signing:
monarch status
# Confirm: catching_up = false, signing blocks
Recovery Time Estimates
| Method | Estimated Time | Disk Required |
|---|---|---|
| State-sync | 15-30 minutes | Minimal (recent state only) |
| Snapshot restore | 30-60 minutes | Depends on snapshot size |
| Genesis sync | Several hours to days | Full chain history |
Disaster Recovery Runbook
Use this checklist when recovering from a server failure:
- Old server confirmed stopped (or unreachable)
- New server provisioned and SSH access verified
- `monod` and `monarch` installed and versions confirmed
- Keys restored from encrypted backup
- Key file permissions set (`chmod 600` on key files)
- Node joined to correct network
- Chain sync started (snapshot, state-sync, or genesis)
- Sync completed -- `catching_up` is `false`
- Validator signing blocks -- check `monarch status`
- Monitoring re-enabled -- `monarch monitor register` or Prometheus scrape target updated
- Firewall rules applied -- `monarch firewall`
- systemd service enabled for auto-start on boot
Before starting your restored validator, confirm with certainty that the old server is stopped or destroyed. Running two instances of the same validator simultaneously causes double-signing and permanent slashing. See the Security Guide for the safe migration procedure.
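One way to make that confirmation mechanical is a pre-start guard that refuses to launch until an operator has explicitly acknowledged the old server is down. The sentinel-file convention below is invented for illustration (it is not a monod feature); adapt the paths to your setup:

```shell
#!/bin/sh
# Sketch of a pre-start guard: refuse to launch monod unless an operator has
# created a sentinel file confirming the old server is stopped.
# The sentinel convention is invented for illustration, not a monod feature.
set -eu

guard_start() {
  sentinel=$1
  if [ ! -f "$sentinel" ]; then
    echo "refusing to start: create $sentinel after verifying the old server is stopped" >&2
    return 1
  fi
  echo "guard passed"
  # exec /usr/local/bin/monod start --home /home/monod/.mono
}

# Demonstration with a scratch sentinel
sentinel=$(mktemp)
guard_start "$sentinel"
rm -f "$sentinel"
```

Wrapping `ExecStart=` in such a guard means a hasty restore cannot double-sign by accident; the operator must delete the old instance and then create the sentinel deliberately.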
Log Management
Unmanaged logs will eventually fill your disk. Configure journald to cap log storage.
journald Configuration
Create a drop-in configuration for monod log limits:
sudo mkdir -p /etc/systemd/journald.conf.d
sudo tee /etc/systemd/journald.conf.d/monod.conf > /dev/null <<EOF
[Journal]
SystemMaxUse=2G
SystemKeepFree=1G
SystemMaxFileSize=200M
MaxRetentionSec=30day
EOF
sudo systemctl restart systemd-journald
| Directive | Effect |
|---|---|
| `SystemMaxUse=2G` | Total journal storage capped at 2 GB |
| `SystemKeepFree=1G` | Always keep at least 1 GB of disk free |
| `SystemMaxFileSize=200M` | Individual journal files rotate at 200 MB |
| `MaxRetentionSec=30day` | Logs older than 30 days are automatically deleted |
Custom Log Files
If you redirect monod output to custom log files (not recommended -- prefer journald), configure logrotate:
sudo tee /etc/logrotate.d/monod > /dev/null <<EOF
/var/log/monod/*.log {
daily
rotate 14
compress
delaycompress
missingok
notifempty
create 0640 monod monod
}
EOF
Viewing Logs
# Using monarch
monarch logs --lines 100
monarch logs --follow
# Using journalctl directly
sudo journalctl -u monod -n 100 --no-pager
sudo journalctl -u monod -f
sudo journalctl -u monod --since "1 hour ago"
sudo journalctl -u monod --priority=err
Resource Monitoring
Quick Health Audit
monarch doctor
monarch doctor checks 13 categories including disk space, peer count, sync status, key file permissions, and version compatibility. Run it after any configuration change.
Continuous Monitoring
# Enable built-in Prometheus metrics
monarch metrics enable
# Verify metrics endpoint is live
curl -s localhost:26660/metrics | head -5
See the Monitoring guide for full Prometheus and Grafana setup.
Warning Thresholds
| Metric | Warning | Critical | How to Check |
|---|---|---|---|
| Disk usage | > 70% | > 85% | `df -h /home/monod/.mono` |
| Memory usage | > 80% | > 95% | `free -h` |
| CPU usage (sustained) | > 70% | > 90% | `top -bn1 \| grep monod` |
| Peer count | < 5 | 0 | `monarch status` or `curl -s localhost:26657/net_info \| jq '.result.n_peers'` |
| Block height lag | > 10 blocks | > 100 blocks | Compare `monarch status` height to explorer |
| Disk I/O wait | > 20% | > 40% | `iostat -x 1 5` |
| Open file descriptors | > 50,000 | > 60,000 | `ls /proc/$(pgrep monod)/fd \| wc -l` |
| Missed blocks (validators) | > 0 in 1 hour | > 5 in 1 hour | `monarch status` or Prometheus `tendermint_consensus_missing_validators` |
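These thresholds are easy to wire into a cron-driven check. A minimal sketch for the disk row (`classify_disk` is an invented helper name; the 70/85 cut-offs mirror the table, and the commented `df` invocation is one GNU-coreutils way to obtain the live number):

```shell
#!/bin/sh
# Classify disk usage against the table's thresholds (70% warning, 85% critical).
set -eu

classify_disk() {
  pct=$1
  if [ "$pct" -gt 85 ]; then echo CRITICAL
  elif [ "$pct" -gt 70 ]; then echo WARNING
  else echo OK
  fi
}

# In practice, feed it the live percentage (GNU df):
#   classify_disk "$(df --output=pcent /home/monod/.mono | tail -1 | tr -dc '0-9')"
classify_disk 72   # prints WARNING
```

The same shape extends to the other rows; emit the result to your alerting channel of choice.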
Disk Growth Monitoring
Chain data grows continuously. Monitor the growth rate to predict when you will need more storage:
# Check current data size
du -sh /home/monod/.mono/data
# Check disk usage over time (add to cron for tracking)
echo "$(date +%Y-%m-%d) $(du -sb /home/monod/.mono/data | cut -f1)" >> /home/monod/disk-growth.log
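That log can be turned into a rough capacity forecast with awk. A sketch under stated assumptions: the sample log here uses weekly data points rather than daily ones, and the 500 GB capacity is an example value, not a requirement:

```shell
#!/bin/sh
# Estimate average daily growth and days until a hypothetical volume is full,
# from a "YYYY-MM-DD bytes" log like the one produced by the cron line above.
set -eu
log=$(mktemp)
cat > "$log" <<'EOF'
2026-03-01 100000000000
2026-03-08 114000000000
2026-03-15 128000000000
EOF

capacity_bytes=500000000000   # example: a 500 GB volume

forecast=$(awk -v cap="$capacity_bytes" '
  NR == 1 { first = $2 }
  { last = $2; n = NR }
  END {
    days = (n - 1) * 7          # the sample entries above are weekly
    rate = (last - first) / days
    printf "growth: %.1f GB/day\n", rate / 1e9
    printf "days until full: %.0f\n", (cap - last) / rate
  }' "$log")
echo "$forecast"
rm -f "$log"
```

With daily cron entries, change the `* 7` factor to `* 1`; the point is that a linear fit over recent samples is usually enough to schedule a disk upgrade weeks in advance.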
Kernel Tuning
Production nodes benefit from kernel parameter adjustments. Apply these via sysctl:
sudo tee /etc/sysctl.d/99-monolythium.conf > /dev/null <<EOF
# Maximum number of file handles for the entire system
fs.file-max = 100000
# Maximum queue length for incoming connections
net.core.somaxconn = 4096
# Reduce swap usage (prefer RAM over swap)
vm.swappiness = 10
# Allow reuse of TIME_WAIT sockets for new connections
net.ipv4.tcp_tw_reuse = 1
EOF
# Apply immediately
sudo sysctl --system
Parameter Reference
| Parameter | Value | Default | Purpose |
|---|---|---|---|
| `fs.file-max` | 100000 | Varies with RAM (often much higher on modern kernels) | Sets the system-wide maximum number of open file descriptors. A node maintaining hundreds of P2P connections and database files can exhaust a low limit. |
| `net.core.somaxconn` | 4096 | 4096 on modern kernels; 128 on older ones | Maximum length of the listen queue for incoming connections. Prevents connection drops during P2P peer surges. |
| `vm.swappiness` | 10 | 60 | Controls how aggressively the kernel swaps memory pages to disk. A value of 10 keeps more data in RAM, reducing latency for consensus operations. |
| `net.ipv4.tcp_tw_reuse` | 1 | 0 (2, loopback-only, since kernel 4.12) | Allows reusing sockets in TIME_WAIT state for new outbound connections. Useful when the node cycles through many short-lived P2P connections. |
Verify the settings are applied:
sysctl fs.file-max net.core.somaxconn vm.swappiness net.ipv4.tcp_tw_reuse
Settings in /etc/sysctl.d/ persist across reboots. Running sysctl --system applies them immediately without a reboot.
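Values can drift after a kernel upgrade or when a later file in the sysctl.d order overrides yours, so it is worth diffing the desired values against /proc/sys periodically. A sketch that re-creates the conf in a scratch file for the sake of being self-contained; point it at /etc/sysctl.d/99-monolythium.conf in production. Keys the kernel does not expose (common inside containers) are skipped:

```shell
#!/bin/sh
# Diff desired sysctl values (sysctl.d "key = value" format) against /proc/sys.
# The scratch file stands in for /etc/sysctl.d/99-monolythium.conf.
set -eu
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
fs.file-max = 100000
net.core.somaxconn = 4096
vm.swappiness = 10
net.ipv4.tcp_tw_reuse = 1
EOF

mismatches=0
while IFS='=' read -r key want; do
  key=$(echo "$key" | tr -d ' '); want=$(echo "$want" | tr -d ' ')
  case "$key" in ''|'#'*) continue ;; esac        # skip blanks and comments
  path="/proc/sys/$(echo "$key" | tr . /)"        # fs.file-max -> fs/file-max
  [ -r "$path" ] || continue                      # key not exposed by this kernel
  have=$(cat "$path")
  [ "$have" = "$want" ] || { echo "MISMATCH $key: want=$want have=$have"; mismatches=$((mismatches+1)); }
done < "$CONF"
echo "mismatches: $mismatches"
rm -f "$CONF"
```

A non-zero mismatch count after a reboot usually means another sysctl.d file is overriding yours or the kernel rejected a value; re-run `sudo sysctl --system` and inspect its output.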
Production Checklist
Use this checklist before considering your node production-ready:
systemd
- Extended unit file with all hardening directives applied
- `MemoryMax` set to an appropriate value for your server
- Service enabled for auto-start on boot (`systemctl enable monod`)
- Verified the service restarts correctly after `kill -9`
Backups
- Automated daily key backup configured (cron + `monarch backup keys`)
- Backup encryption password stored separately from backups
- Off-site backup in at least two locations
- Backup restoration tested within the last 30 days
Disaster Recovery
- Written runbook with step-by-step recovery procedure
- Recovery tested on a fresh server at least once
- Snapshot or state-sync method verified and working
Logs
- journald configured with size limits (`SystemMaxUse=2G`)
- Log retention policy set (30 days recommended)
- Log aggregation configured if running multiple nodes
Monitoring
- `monarch doctor` passes all checks
- `monarch metrics enable` activated
- Prometheus scraping the metrics endpoint
- Grafana dashboards configured
- Alerting rules for node down, missed blocks, low peers, disk usage
- External uptime monitoring configured
Kernel
- `fs.file-max` increased
- `vm.swappiness` reduced
- `net.core.somaxconn` verified
- `net.ipv4.tcp_tw_reuse` enabled
- Settings persisted in `/etc/sysctl.d/`
Security
- SSH key-only authentication (password auth disabled)
- Firewall configured (`monarch firewall`)
- Node running as dedicated non-root user
- Key file permissions verified (`chmod 600`)
- Sentry architecture configured for validators (see Sentry Architecture)
Related
- Installation -- Basic setup and systemd configuration
- Security Guide -- Key management, server hardening, double-signing prevention
- Monitoring -- Prometheus, Grafana, and alerting setup
- Troubleshooting -- Common issues and solutions
- Sentry Architecture -- Protecting validators with sentry nodes