Passioni - Armando Passaro


Guide: Hardware monitoring (temperatures, SMART, UPS)

04/03/2026 21:16

Keeping an eye on hardware health

Hardware monitoring helps catch problems before they turn into catastrophic failures: abnormal temperatures, degrading disks, a depleted UPS battery.

1. CPU/GPU temperatures

apt install lm-sensors -y
sensors-detect  # follow the wizard
sensors          # read the temperatures

# NVIDIA GPU
nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader
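
For scripting, the text printed by sensors has to be reduced to a single number. The sketch below is one possible way to do that, assuming Intel-style labels ("Core N", "Package id N") as reported by the coretemp driver; AMD systems typically use other labels, such as Tctl, and would need a different pattern. The name max_temp is a hypothetical helper, not part of lm-sensors.

```shell
#!/bin/bash
# max_temp: read `sensors`-style text on stdin and print the highest
# current temperature found. Returns 1 when no reading is present.
max_temp() {
    awk '
        # Match lines such as "Core 0:   +45.0°C  (high = +80.0°C, ...)"
        /^(Core|Package id) [0-9]+:/ {
            split($0, parts, ":")
            value = parts[2]
            sub(/^[ \t]+/, "", value)      # trim leading whitespace
            sub(/[^-+0-9.].*$/, "", value) # drop the unit and the limits in parentheses
            t = value + 0
            if (!seen || t > max) { max = t; seen = 1 }
        }
        END { if (seen) printf "%.1f\n", max; else exit 1 }
    '
}
```

On a real machine it would be called as: sensors | max_temp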

2. SMART (disk health)

apt install smartmontools -y

# Quick health status
smartctl -H /dev/sda

# Full report
smartctl -a /dev/sda

# Short self-test
smartctl -t short /dev/sda

# Automatic monitoring (on some Debian/Ubuntu releases the unit is named smartmontools)
systemctl enable --now smartd
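
When smartctl is called from a script, its exit status is more convenient than its text output: it is a bitmask, described in the EXIT STATUS section of the smartctl man page, in which each bit flags a different problem. The following sketch decodes it; describe_smart_status is a hypothetical helper name and the messages are paraphrased.

```shell
#!/bin/bash
# describe_smart_status: decode the bitmask exit status of smartctl.
# Prints "ok" and returns 0 when no bit is set; otherwise prints the
# flagged conditions separated by ";" and returns 1.
describe_smart_status() {
    local status=$1
    (( status == 0 )) && { echo "ok"; return 0; }
    local msgs=()
    (( status & 1 ))   && msgs+=("command line did not parse")
    (( status & 2 ))   && msgs+=("device open failed")
    (( status & 4 ))   && msgs+=("a SMART command failed")
    (( status & 8 ))   && msgs+=("disk failing")
    (( status & 16 ))  && msgs+=("prefail attribute at or below threshold")
    (( status & 32 ))  && msgs+=("attribute was at or below threshold in the past")
    (( status & 64 ))  && msgs+=("error log contains errors")
    (( status & 128 )) && msgs+=("self-test log contains errors")
    local IFS=';'
    echo "${msgs[*]}"
    return 1
}
```

A possible use: smartctl -H /dev/sda >/dev/null; describe_smart_status $?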

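3. Configuring smartd (/etc/smartd.conf)

# Monitor all disks and send an email on errors
# -s: short self-test every day at 02:00, long self-test on Saturdays at 03:00
# -W 4,45,55: log temperature changes of 4 °C or more, log at 45 °C, warn at 55 °C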
DEVICESCAN -a -o on -S on -n standby,q -s (S/../.././02|L/../../6/03) -W 4,45,55 -m admin@esempio.it
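
The -s schedule is an extended regular expression that smartd matches against strings of the form T/MM/DD/d/HH, where T is the test type (L long, S short, C conveyance, O offline) and d is the day of the week (1 = Monday, 7 = Sunday). As a rough way to check an expression before deploying it, the sketch below builds that string for a given moment and reports which test types would be selected. scheduled_tests is a hypothetical helper; it assumes GNU date (for the -d option) and does not cover the staggering suffixes supported by newer smartd versions.

```shell
#!/bin/bash
# scheduled_tests: print the self-test types (L, S, C, O) that a smartd -s
# regular expression selects at a given time.
# $1 = regular expression from the -s directive
# $2 = point in time, in any format understood by GNU `date -d`
scheduled_tests() {
    local regex=$1 when=$2 stamp type
    # Month, day of month, day of week (1-7, Monday first) and hour
    stamp=$(date -d "$when" +%m/%d/%u/%H) || return 2
    for type in L S C O; do
        # smartd requires the whole string to match, hence the anchors
        [[ "$type/$stamp" =~ ^($regex)$ ]] && echo "$type"
    done
    return 0
}
```

For example, scheduled_tests '(S/../.././02|L/../../6/03)' '2026-03-07 03:10' checks what would run on a Saturday shortly after 03:00.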

4. UPS with NUT

apt install nut -y

# /etc/nut/ups.conf
[myups]
driver = usbhid-ups
port = auto

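# /etc/nut/upsmon.conf
# The user (admin) and its password must also be defined in /etc/nut/upsd.users.
# NUT 2.8 and later prefer "primary" over "master"; the old keyword is still accepted.
MONITOR myups@localhost 1 admin password master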
SHUTDOWNCMD "/sbin/shutdown -h +0"
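
Once the driver and upsd are running, upsc myups@localhost prints the UPS variables as "name: value" lines. As an illustration of how those values could feed a check, the sketch below reads that output and flags a UPS that is on battery (status contains OB) or whose charge is below a threshold. ups_summary is a hypothetical helper, and the default threshold of 50 percent is an arbitrary assumption.

```shell
#!/bin/bash
# ups_summary: read `upsc`-style "name: value" lines on stdin and print
# "<status> <charge>". Returns 1 when the UPS is on battery or the charge
# is below the threshold passed as $1 (default 50), 0 otherwise.
ups_summary() {
    local min_charge=${1:-50} status="" charge="" key value
    while IFS=': ' read -r key value; do
        case $key in
            ups.status)     status=$value ;;
            battery.charge) charge=$value ;;
        esac
    done
    echo "$status $charge"
    # OB = on battery; the padding avoids matching OB inside another token
    [[ " $status " == *" OB "* ]] && return 1
    # Compare the integer part of the charge with the threshold
    [[ -n $charge && ${charge%.*} -lt $min_charge ]] && return 1
    return 0
}
```

A possible use: upsc myups@localhost | ups_summary 40 || echo "UPS needs attention"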

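5. Alerting script

#!/bin/bash
# Take the first "Core 0" reading and keep only digits and the decimal point
TEMP=$(sensors | awk '/^Core 0:/ {gsub(/[^0-9.]/, "", $3); print $3; exit}')
# Send an email only when a reading exists and it exceeds 80 °C
if [[ -n "$TEMP" ]] && (( $(echo "$TEMP > 80" | bc -l) )); then
    echo "ALERT: CPU temperature $TEMP degrees" | mail -s "Temp Alert" admin@esempio.it
fi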

Preventive monitoring is essential: a disk whose SMART data reports errors should be replaced right away, before any data is lost.