Difference between revisions of "Havarie-Plan"

Latest revision as of 09:47, 16 January 2026

Toolbox

1. System State & Resources (CPU, RAM, Load)

Basics

top / htop – running processes, CPU/RAM

atop – very detailed (CPU, RAM, I/O, network)

uptime – load average

free -h – memory overview

vmstat 1 – CPU, I/O, memory
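A load average only says something in relation to the number of cores. The following is a minimal sketch of that comparison, assuming a Linux system with `/proc/loadavg` and `nproc`; the function names `load_per_core` and `load_is_high` and the default limit of 1.00 are illustrative assumptions, not part of the original plan.

```shell
#!/bin/sh
# Sketch: relate the 1-minute load average to the number of CPU cores.
# load_per_core and load_is_high are hypothetical helper names.

# Usage: load_per_core [LOADAVG_FILE] [CORES]
# Prints the 1-minute load divided by the core count, with two decimals.
load_per_core() {
    loadavg_file=${1:-/proc/loadavg}
    cores=${2:-$(nproc)}
    # The first field of /proc/loadavg is the 1-minute load average.
    awk -v cores="$cores" '{ printf "%.2f\n", $1 / cores }' "$loadavg_file"
}

# Usage: load_is_high RATIO [LIMIT]
# Exits 0 when RATIO is greater than LIMIT (assumed default: 1.00).
load_is_high() {
    awk -v r="$1" -v limit="${2:-1.00}" 'BEGIN { exit !(r > limit) }'
}
```

A possible call is `load_is_high "$(load_per_core)" && echo "load exceeds core count"`. A ratio above 1 only hints at saturation; processes waiting on I/O also count toward the load on Linux.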

In Depth

pidstat – per-process statistics

mpstat – CPU utilization per core

numactl, numastat – NUMA analysis

2. Storage & I/O

Overview

df -hT – file systems & types

du -sh * – disk usage

lsblk -f – block devices

mount, findmnt
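During an incident it is often enough to know which file systems are nearly full. The sketch below filters the POSIX output of `df -P` for that; the function name `full_filesystems` and the default threshold of 90 percent are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: list mount points whose usage is at or above a threshold.
# full_filesystems is a hypothetical helper name.

# Usage: df -P | full_filesystems [PERCENT]
# Reads `df -P` output on stdin; prints "MOUNTPOINT USE%" per match.
# Assumes mount points without spaces (field 6 of the POSIX format).
full_filesystems() {
    awk -v limit="${1:-90}" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "95%" -> "95"
            if (use + 0 >= limit + 0) print $6, $5
        }'
}
```

For example, `df -P | full_filesystems 80` would print every mount point that is at least 80 percent full.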

I/O Analysis

iostat -xz 1 – latency, I/O wait

iotop – disk load per process

blktrace, blkparse – Low-Level

lsusb

File Systems

fsck

tune2fs, dumpe2fs

xfs_repair, xfs_growfs

3. Network

Basics

ip a, ip r, ip n

ss -tulpn (queries the kernel directly)

netstat -tulpn (reads /proc/..)

ping, tracepath, traceroute

arp, ip neigh
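A quick way to spot connection floods is to count TCP sockets per state. The following sketch assumes the usual `ss -tan` layout, where the first column holds the state and the first line is a header; `count_tcp_states` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: summarize TCP sockets per state, most frequent state first.
# count_tcp_states is a hypothetical helper name.

# Usage: ss -tan | count_tcp_states
# Reads `ss -tan` output on stdin; prints "COUNT STATE" per line.
count_tcp_states() {
    awk 'NR > 1 { n[$1]++ }                  # skip header, count by state
         END { for (s in n) print n[s], s }' |
        sort -rn
}
```

A large number of sockets in a single state such as SYN-RECV or TIME-WAIT is a hint worth following up with `tcpdump` or `conntrack`.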

Traffic & Debugging

tcpdump

termshark

iftop

nload

ethtool

Advanced Tools

conntrack

tc

mtr

4. Logs & Events

Standard

journalctl -xe

journalctl -u <service>

dmesg -T

/var/log/syslog, /var/log/messages

Analysis

grep, egrep, rg

awk, sed

less +F
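The tools above are usually combined in a pipeline. As a minimal sketch, the function below counts lines containing "error" per program; it assumes the traditional syslog line format (`Mon DD HH:MM:SS host program[pid]: message`), so logs with ISO timestamps or a different layout would need another field number. `errors_per_program` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: count log lines containing "error" per program name.
# errors_per_program is a hypothetical helper name.

# Usage: errors_per_program < LOGFILE
# Assumes traditional syslog lines, where field 5 is "program[pid]:".
errors_per_program() {
    grep -i 'error' |
        awk '{
                 prog = $5
                 sub(/(\[[0-9]+\])?:$/, "", prog)   # strip "[pid]:" or ":"
                 n[prog]++
             }
             END { for (p in n) print n[p], p }' |
        sort -rn
}
```

On a system that logs to /var/log/syslog, `errors_per_program < /var/log/syslog | head` would show the noisiest programs first.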

Audit & Log Rotation

logrotate -d

ausearch, auditctl

5. Processes & Services

systemd

systemctl status

systemctl list-units --failed

systemctl show

systemd-analyze blame

systemd-analyze critical-chain

Debugging

strace -p <PID>

lsof

pstree -ap

coredumpctl

6. Hardware & Kernel

lscpu, lsmem

lsusb, lspci

dmidecode

uname -a

lsmod, modprobe

sysctl -a

7. Security

last, lastlog, who

faillog

getenforce, sestatus

iptables -L -nv

nft list ruleset

8. Performance & Specialized Tools

perf

bpftrace

sysdig

dstat

sar

9. Containers & Virtualization

Docker

docker stats

docker inspect

docker logs

Kubernetes

kubectl describe

kubectl logs

kubectl top

Virtualization

virsh

virt-top

Typical Problems

Too Many Accesses

 
var/run/openvpn

check the .status files for UNDEF entries to see whether there are too many

cat *.status | grep UNDEF | wc -l   (counts the UNDEF entries)

if so, the number of logins allowed per time unit can be lowered in the configuration

var/lcportal/persistent/dc/openvpn/server-tun0.conf

connect-freq 10 2    (10 logins per 2 seconds)
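The check above can be wrapped in a small script so that it is repeatable under pressure. This is a sketch only: `count_undef` and `too_many_undef` are hypothetical helper names, and the default limit of 50 is an assumed placeholder, since the plan does not state how many UNDEF entries count as too many.

```shell
#!/bin/sh
# Sketch: count UNDEF entries in OpenVPN status files and compare
# the result with a limit. Helper names and the limit are assumptions.

# Usage: count_undef [STATUS_DIR]
# Prints the number of lines containing UNDEF in STATUS_DIR/*.status.
count_undef() {
    status_dir=${1:-.}
    # grep -c prints the match count; "|| true" hides its exit code 1
    # when the count is zero.
    cat "$status_dir"/*.status 2>/dev/null | grep -c 'UNDEF' || true
}

# Usage: too_many_undef STATUS_DIR [LIMIT]
# Exits 0 when the UNDEF count is greater than LIMIT (assumed default: 50).
too_many_undef() {
    [ "$(count_undef "$1")" -gt "${2:-50}" ]
}
```

Pass the directory that holds the .status files as the first argument. If `too_many_undef` succeeds, lowering `connect-freq` in the server configuration is the next step described above.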


System Load

top, htop, atop