Validator Best Practices
This guide covers operational best practices for running a reliable validator.
Infrastructure
Hardware
| Practice | Recommendation |
|---|---|
| Use quality hardware | Don't cut corners on critical infrastructure |
| Provision headroom | 2x minimum specs for safety margin |
| Use NVMe storage | SSD at minimum, NVMe preferred |
| Monitor hardware health | Track disk, CPU, memory, network |
Network
| Practice | Recommendation |
|---|---|
| Use dedicated IP | Static IP required for P2P |
| Configure firewall | Only open necessary ports |
| Use redundant connectivity | Multiple network paths if possible |
| Monitor latency | High latency affects consensus participation |
Hosting
| Practice | Recommendation |
|---|---|
| Choose reliable providers | Track record matters |
| Geographic diversity | Consider disaster scenarios |
| Avoid shared hosting | Dedicated servers preferred |
| Plan for scaling | Storage will grow |
Security
Key Management
| Practice | Recommendation |
|---|---|
| Secure `priv_validator_key.json` | Most critical file |
| Use hardware security | HSM for production validators |
| Backup keys securely | Offline, encrypted, multiple locations |
| Never copy keys between active nodes | Causes double-signing |
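A minimal sketch of an encrypted offline backup, assuming the node home is `~/.monod` and `gpg`/`shred` are available (adjust paths and tooling to your environment):

```bash
# Archive the key files (paths assume the default ~/.monod home directory)
tar czf validator-keys.tar.gz \
  ~/.monod/config/priv_validator_key.json \
  ~/.monod/config/node_key.json

# Encrypt the archive with a strong passphrase before it leaves the host
gpg --symmetric --cipher-algo AES256 validator-keys.tar.gz

# Remove the plaintext archive, then copy validator-keys.tar.gz.gpg to offline media
shred -u validator-keys.tar.gz
```

Store copies in at least two physically separate locations and test the restore path before you ever need it.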
Access Control
| Practice | Recommendation |
|---|---|
| Use SSH keys only | Disable password auth |
| Restrict SSH access | Whitelist IPs if possible |
| Use non-root user | Run node as unprivileged user |
| Enable fail2ban | Block brute force attempts |
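A rough sketch on a Debian/Ubuntu host (the `monod` user name and package commands are illustrative; adapt to your distribution):

```bash
# Create an unprivileged user to run the node
sudo adduser --disabled-password --gecos "" monod

# Ensure these are set in /etc/ssh/sshd_config (or an sshd_config.d drop-in),
# then restart the SSH service:
#   PasswordAuthentication no
#   PermitRootLogin no
sudo systemctl restart ssh   # the service may be named sshd on other distributions

# Basic brute-force protection
sudo apt install -y fail2ban
sudo systemctl enable --now fail2ban
```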
Network Security
| Practice | Recommendation |
|---|---|
| Minimal open ports | 26656 required, others as needed |
| Use firewall | iptables, ufw, or cloud firewall |
| Consider sentry architecture | Protects validator from DDoS |
| Monitor for intrusion | Log analysis, IDS |
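A hedged `ufw` example (substitute your own admin IP; a cloud firewall achieves the same result):

```bash
# Default-deny inbound, allow outbound
sudo ufw default deny incoming
sudo ufw default allow outgoing

# SSH only from a trusted admin IP (203.0.113.10 is a placeholder)
sudo ufw allow from 203.0.113.10 to any port 22 proto tcp

# Tendermint P2P
sudo ufw allow 26656/tcp

# Open RPC (26657) or Prometheus metrics (26660) only if they genuinely need
# to be reachable from outside, then activate the ruleset
sudo ufw enable
```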
Sentry Architecture
For production validators:
```
              Internet
                  |
      +-----------+-----------+
      |           |           |
   Sentry1     Sentry2     Sentry3
      |           |           |
      +-----------+-----------+
                  |
          Private Network
                  |
              Validator
```
Sentry Configuration
```toml
# On sentries
pex = true
persistent_peers = "validator_id@private-ip:26656"
private_peer_ids = "validator_id"
```
Validator Configuration
```toml
# On validator
pex = false
persistent_peers = "sentry1_id@ip:26656,sentry2_id@ip:26656"
```
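Disabling `pex` on the validator keeps it from participating in peer-exchange gossip, and listing the validator in `private_peer_ids` on the sentries keeps its address from being advertised, so only the sentries are publicly reachable while the validator talks to them over the private network.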
Monitoring
Essential Metrics
| Metric | Alert Threshold |
|---|---|
| Node health | Any failure |
| Block height | Not increasing |
| Peers | Below 5 |
| Disk space | Below 20% free |
| Memory | Above 90% |
| CPU | Sustained above 80% |
| Missed blocks | Any |
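For a quick manual check of height and sync state, assuming the node's RPC is reachable locally on the default port and `jq` is installed:

```bash
# Latest height and whether the node is still catching up (should be false)
curl -s localhost:26657/status | jq '.result.sync_info | {latest_block_height, catching_up}'
```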
Tools
| Tool | Use Case |
|---|---|
| Prometheus | Metrics collection |
| Grafana | Visualization |
| PagerDuty/OpsGenie | Alerting |
| Tendermint metrics | Native node metrics |
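Tendermint's Prometheus endpoint is disabled by default; a one-line sketch to enable it, assuming the default config layout under `~/.monod`:

```bash
# Expose node metrics on :26660 (the default), then restart the node and
# point a Prometheus scrape job at the endpoint
sed -i 's/^prometheus = false/prometheus = true/' ~/.monod/config/config.toml
```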
Alerting
Configure alerts for:
- Node offline
- Missing blocks
- Low peers
- High resource usage
- Disk space warning
- Upgrade announcements
Upgrades
Binary Upgrades
- Monitor for upgrade announcements
- Test new binary in advance
- Use Cosmovisor for automatic upgrades
- Have a recovery plan (recovery is forward-only: once the upgrade height passes you cannot simply roll back to the old binary)
Using Cosmovisor
```
# Directory structure
~/.monod/cosmovisor/
├── genesis/bin/monod
├── upgrades/
│   └── upgrade-name/bin/monod
└── current -> genesis (symlink)
```
Cosmovisor automatically switches binaries at upgrade height.
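A hedged setup sketch, assuming the binary is `monod`, the home is `~/.monod`, and the upgrade directory name matches the name announced in the governance proposal:

```bash
# Cosmovisor reads its configuration from the environment
# (set these in the systemd unit or shell profile)
export DAEMON_NAME=monod
export DAEMON_HOME=$HOME/.monod
export DAEMON_RESTART_AFTER_UPGRADE=true

# Stage the new binary under the announced upgrade name
mkdir -p "$DAEMON_HOME/cosmovisor/upgrades/<upgrade-name>/bin"
cp ./monod-new "$DAEMON_HOME/cosmovisor/upgrades/<upgrade-name>/bin/monod"

# Start the node through Cosmovisor instead of invoking monod directly
cosmovisor run start
```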
Upgrade Checklist
- Test binary on testnet first
- Verify binary checksums
- Prepare Cosmovisor if using
- Schedule maintenance window
- Have team on standby
- Monitor during upgrade
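Checksum verification is a one-liner; the file names here are illustrative:

```bash
# Compare against the checksum published with the release
sha256sum monod
# Or, if the release ships a checksum file:
sha256sum -c checksums.txt --ignore-missing
```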
Backup and Recovery
What to Backup
| Item | Frequency | Storage |
|---|---|---|
| `priv_validator_key.json` | Once (secure) | Encrypted, offline |
| `node_key.json` | Once | Secure storage |
| Operator keys | Once | Hardware wallet |
| Config files | After changes | Version control |
What NOT to Backup
- Chain data (re-sync instead)
- Temporary files
Recovery Procedure
- Provision new server
- Install binary
- Restore configuration
- Restore validator key (carefully!)
- Sync from genesis or snapshot
- Verify before starting validation
!!! danger "Avoid Double-Signing"
    Never run two validators with the same key simultaneously. Always stop the old node first.
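A short sketch of the final safety checks, assuming a systemd service named `monod` and a locally reachable RPC endpoint:

```bash
# On the old server (if it is still reachable): make sure it can never sign again
sudo systemctl stop monod && sudo systemctl disable monod

# On the new server: confirm the node has finished syncing before it validates
curl -s localhost:26657/status | jq '.result.sync_info.catching_up'   # must be false
```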
Communication
With Delegators
| Practice | Recommendation |
|---|---|
| Provide contact info | Let delegators reach you |
| Announce maintenance | Advance notice |
| Post incident reports | Transparency builds trust |
| Share performance data | Uptime, rewards |
With Community
| Practice | Recommendation |
|---|---|
| Join validator chats | Coordinate with peers |
| Participate in governance | Vote and discuss |
| Respond to upgrade calls | Be prepared |
| Share knowledge | Help other validators |
Incident Response
Preparation
- Document runbooks
- Define escalation paths
- Practice recovery procedures
- Have backup contact methods
During Incident
- Acknowledge the incident
- Assess severity
- Follow runbooks
- Communicate status
- Escalate if needed
Post-Incident
- Document timeline
- Identify root cause
- Implement fixes
- Update procedures
- Share learnings
Checklist
Regular validator health check:
- Node is synced and producing blocks
- All monitoring is functioning
- Alerting is working (test it)
- Disk space is adequate
- Keys are securely backed up
- Software is up to date
- Security patches applied
- Team knows procedures
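The routine checks can be scripted; a rough sketch assuming a systemd service named `monod`, local RPC on 26657, and `jq` installed:

```bash
#!/usr/bin/env bash
# Weekly validator health check (exits non-zero on the first failed check)
set -euo pipefail

systemctl is-active monod                                                          # process running
curl -s localhost:26657/status   | jq -e '.result.sync_info.catching_up == false'  # synced
curl -s localhost:26657/net_info | jq -e '.result.n_peers | tonumber >= 5'         # enough peers
df -h ~/.monod                                                                     # disk headroom
```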
FAQ
How often should I check my validator?
Automated monitoring 24/7. Manual check weekly.
What uptime should I target?
99.9% or higher. Every missed block is lost revenue and reputation.
Should I run my own sentries or use a service?
Running your own sentries gives you full control over the topology; a managed sentry service reduces operational burden but adds a third-party dependency.
How do I handle planned maintenance?
Brief downtime shorter than the chain's jailing threshold is fine. Announce it in advance and schedule it during low-activity periods.
Related
- Requirements - Infrastructure needs
- Slashing - Avoiding penalties
- Monitoring - Monitoring setup
- Upgrades - Upgrade procedures