<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Emmanuel Tekle — Blog</title><link>https://emmanueltekle.nl/posts/</link><description>Notes from the homelab, school, and what I'm currently learning.</description><generator>Hugo</generator><language>en</language><managingEditor>emmanueltekle@gmail.com (Emmanuel Tekle)</managingEditor><webMaster>emmanueltekle@gmail.com (Emmanuel Tekle)</webMaster><copyright>© 2026 Emmanuel Tekle</copyright><lastBuildDate>Fri, 15 May 2026 16:53:27 +0200</lastBuildDate><atom:link href="https://emmanueltekle.nl/posts/index.xml" rel="self" type="application/rss+xml"/><item><title>Building a monitoring stack from scratch</title><link>https://emmanueltekle.nl/posts/building-a-monitoring-stack-from-scratch/</link><pubDate>Thu, 02 Apr 2026 00:00:00 +0000</pubDate><author>emmanueltekle@gmail.com (Emmanuel Tekle)</author><guid>https://emmanueltekle.nl/posts/building-a-monitoring-stack-from-scratch/</guid><description>Notes on setting up Prometheus + Alertmanager + PagerDuty by hand on my home cluster.</description><content:encoded><![CDATA[<p>I wanted monitoring on my home cluster, so I set up Prometheus, Alertmanager and PagerDuty step by step. Writing the configs myself instead of using a pre-made stack helped me understand what each part actually does. These are my notes.</p><h2 id="the-setup">The setup</h2><p>Three Proxmox nodes. Each host runs <code>node_exporter</code>. One VM runs Prometheus and Alertmanager.
Another runs <code>blackbox_exporter</code> for HTTP probes.</p><p>Prometheus has three scrape jobs:</p><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">scrape_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">job_name</span><span class="p">:</span><span class="w"> </span><span class="s1">'nodes'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">static_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">targets</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'10.0.0.10:9100'</span><span class="p">,</span><span class="w"> </span><span class="s1">'10.0.0.11:9100'</span><span class="p">,</span><span class="w"> </span><span class="s1">'10.0.0.12:9100'</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">job_name</span><span class="p">:</span><span class="w"> </span><span class="s1">'blackbox-http'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">metrics_path</span><span class="p">:</span><span class="w"> </span><span class="l">/probe</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">params</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">module</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">http_2xx]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">static_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">targets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">https://cloud.emmanueltekle.nl</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">https://emmanueltekle.nl</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">relabel_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">source_labels</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">__address__]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">target_label</span><span class="p">:</span><span class="w"> </span><span class="l">__param_target</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">source_labels</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="l">__param_target]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">target_label</span><span class="p">:</span><span class="w"> </span><span class="l">instance</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">target_label</span><span class="p">:</span><span class="w"> </span><span class="l">__address__</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">replacement</span><span class="p">:</span><span class="w"> </span><span class="m">127.0.0.1</span><span class="p">:</span><span class="m">9115</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">job_name</span><span class="p">:</span><span class="w"> </span><span class="s1">'prometheus'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">static_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">targets</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'localhost:9090'</span><span class="p">]</span><span class="w">
</span></span></span></code></pre></div><p>The <code>blackbox-http</code> relabel block took me a while to get right. The <code>__param_target</code> rewrite is what tells <code>blackbox_exporter</code> which URL to probe on each scrape — without it, you get metrics but the target is empty.</p><h2 id="alertmanager">Alertmanager</h2><p>The mental model that helped me: labels are inputs, routes are filters, receivers are destinations.</p><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">route</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">receiver</span><span class="p">:</span><span class="w"> </span><span class="s1">'default-pagerduty'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">group_by</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s1">'alertname'</span><span class="p">,</span><span class="w"> </span><span class="s1">'instance'</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">group_wait</span><span class="p">:</span><span class="w"> </span><span class="l">30s</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">group_interval</span><span class="p">:</span><span class="w"> </span><span class="l">5m</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">repeat_interval</span><span class="p">:</span><span class="w"> </span><span class="l">4h</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">routes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">matchers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="l">severity = "info"</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">receiver</span><span class="p">:</span><span class="w"> </span><span class="s1">'null'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">receivers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s1">'default-pagerduty'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">pagerduty_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">service_key</span><span class="p">:</span><span class="w"> </span><span class="s1">'...'</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="s1">'null'</span><span class="w">
</span></span></span></code></pre></div><p>Info-level alerts go to a null receiver. Everything else pages.</p><h2 id="testing-the-chain">Testing the chain</h2><p>I needed to verify alerts actually reach PagerDuty without breaking a real service.
I added a <code>TestAlert</code> rule with a constant true expression:</p><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl">- <span class="nt">alert</span><span class="p">:</span><span class="w"> </span><span class="l">TestAlert</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">expr</span><span class="p">:</span><span class="w"> </span><span class="l">vector(1) &gt; 0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">for</span><span class="p">:</span><span class="w"> </span><span class="l">0m</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">severity</span><span class="p">:</span><span class="w"> </span><span class="l">critical</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">summary</span><span class="p">:</span><span class="w"> </span><span class="s2">"Manual test alert"</span><span class="w">
</span></span></span></code></pre></div><p>I reloaded Prometheus, waited for the alert to fire, and got the SMS. Five minutes later the voice escalation came in because I didn&rsquo;t acknowledge it.
Then I disabled the rule again.</p><h2 id="things-that-tripped-me-up">Things that tripped me up</h2><ul><li>A wrong <code>__param_target</code> rewrite meant every blackbox metric had an empty <code>instance</code> label, so Alertmanager couldn&rsquo;t deduplicate alerts.</li><li><code>group_interval: 30s</code> (which I had early on) flooded my phone with 12 pages from one outage. <code>5m</code> is much saner.</li><li>I had the wrong PagerDuty integration key for a few days. Alerts fired in Alertmanager but PagerDuty showed nothing. The Alertmanager logs surfaced it eventually.</li></ul><h2 id="repo">Repo</h2><p><a href="https://github.com/E-mma9/Monitoring-Homelab">github.com/E-mma9/Monitoring-Homelab</a> — full config and Grafana dashboard JSON.</p>
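<p>To make the relabel block from the setup section a bit more concrete: after relabelling, Prometheus no longer scrapes the website itself. It scrapes <code>blackbox_exporter</code> at <code>__address__</code> and passes the original URL along as the <code>target</code> query parameter. Below is a minimal sketch that builds that request URL. <code>probe_url</code> is a hypothetical helper I made up for illustration; the exporter address, module and target are the values from the config above.</p>

```shell
# Sketch: the request Prometheus ends up making for one blackbox target.
# probe_url is a hypothetical helper, not part of Prometheus or blackbox_exporter.
probe_url() {
  exporter="$1"   # value of __address__ after relabelling, e.g. 127.0.0.1:9115
  module="$2"     # from params.module in the scrape job
  target="$3"     # original target URL, copied into __param_target
  printf 'http://%s/probe?module=%s&target=%s\n' "$exporter" "$module" "$target"
}

probe_url 127.0.0.1:9115 http_2xx https://emmanueltekle.nl
```

<p>Prometheus percent-encodes the target value itself; the sketch leaves it as-is, which should be fine for simple URLs like these. Running the printed URL (quoted) through <code>curl</code> on the exporter host is a quick way to check the exporter side on its own.</p>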
]]></content:encoded><category>prometheus</category><category>alertmanager</category><category>pagerduty</category><category>homelab</category><category>monitoring</category></item><item><title>Hardening Ubuntu with Bash</title><link>https://emmanueltekle.nl/posts/hardening-ubuntu-with-bash/</link><pubDate>Wed, 25 Mar 2026 00:00:00 +0000</pubDate><author>emmanueltekle@gmail.com (Emmanuel Tekle)</author><guid>https://emmanueltekle.nl/posts/hardening-ubuntu-with-bash/</guid><description>Two scripts for taking a fresh Ubuntu/Debian box to a defensible baseline.</description><content:encoded><![CDATA[<p>I wrote two Bash scripts to harden a fresh Ubuntu/Debian server and audit it afterwards. Posting the structure and a few notes here.</p><h2 id="what-gets-configured">What gets configured</h2><p><code>harden.sh</code>:</p><ul><li>SSH — disable root login, disable password auth, change to a non-default port</li><li>UFW — default-deny, allow SSH/HTTP/HTTPS</li><li><code>auditd</code> for security logging</li><li><code>unattended-upgrades</code> for security patches</li><li><code>fail2ban</code> for brute-force protection</li></ul><p><code>security-audit.sh</code>:</p><ul><li>Verifies SSH config against the hardened baseline</li><li>Checks UFW status and rules</li><li>Reports failed logins from <code>journalctl</code></li><li>Lists pending updates</li><li>Exits non-zero on drift (useful for CI / cron)</li></ul><h2 id="idempotency">Idempotency</h2><p>Bash idempotency takes a bit of care.
Each change is wrapped in a check so the script can be re-run safely:</p><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">ssh_set<span class="o">()</span> <span class="o">{</span>
</span></span><span class="line"><span class="cl">  <span class="nb">local</span> <span class="nv">key</span><span class="o">=</span><span class="s2">"</span><span class="nv">$1</span><span class="s2">"</span> <span class="nv">val</span><span class="o">=</span><span class="s2">"</span><span class="nv">$2</span><span class="s2">"</span>
</span></span><span class="line"><span class="cl">  <span class="k">if</span> grep -qE <span class="s2">"^</span><span class="si">${</span><span class="nv">key</span><span class="si">}</span><span class="s2">[[:space:]]"</span> /etc/ssh/sshd_config<span class="p">;</span> <span class="k">then</span>
</span></span><span class="line"><span class="cl">    sed -i <span class="s2">"s|^</span><span class="si">${</span><span class="nv">key</span><span class="si">}</span><span class="s2">[[:space:]].*|</span><span class="si">${</span><span class="nv">key</span><span class="si">}</span><span class="s2"> </span><span class="si">${</span><span class="nv">val</span><span class="si">}</span><span class="s2">|"</span> /etc/ssh/sshd_config
</span></span><span class="line"><span class="cl">  <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="nb">echo</span> <span class="s2">"</span><span class="si">${</span><span class="nv">key</span><span class="si">}</span><span class="s2"> </span><span class="si">${</span><span class="nv">val</span><span class="si">}</span><span class="s2">"</span> &gt;&gt; /etc/ssh/sshd_config
</span></span><span class="line"><span class="cl">  <span class="k">fi</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">ssh_set <span class="s2">"PermitRootLogin"</span> <span class="s2">"no"</span>
</span></span><span class="line"><span class="cl">ssh_set <span class="s2">"PasswordAuthentication"</span> <span class="s2">"no"</span>
</span></span><span class="line"><span class="cl">ssh_set <span class="s2">"Port"</span> <span class="s2">"2222"</span>
</span></span></code></pre></div><p>Matching <code>^${key}[[:space:]]</code> avoids picking up commented-out lines. The function handles both update and insert in one place.</p><h2 id="the-audit-script">The audit script</h2><div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">check<span class="o">()</span> <span class="o">{</span>
</span></span><span class="line"><span class="cl">  <span class="nb">local</span> <span class="nv">desc</span><span class="o">=</span><span class="s2">"</span><span class="nv">$1</span><span class="s2">"</span> <span class="nv">cmd</span><span class="o">=</span><span class="s2">"</span><span class="nv">$2</span><span class="s2">"</span>
</span></span><span class="line"><span class="cl">  <span class="k">if</span> <span class="nb">eval</span> <span class="s2">"</span><span class="nv">$cmd</span><span class="s2">"</span> &gt;/dev/null 2&gt;<span class="p">&amp;</span>1<span class="p">;</span> <span class="k">then</span>
</span></span><span class="line"><span class="cl">    <span class="nb">echo</span> <span class="s2">" [OK]   </span><span class="nv">$desc</span><span class="s2">"</span>
</span></span><span class="line"><span class="cl">  <span class="k">else</span>
</span></span><span class="line"><span class="cl">    <span class="nb">echo</span> <span class="s2">" [FAIL] </span><span class="nv">$desc</span><span class="s2">"</span>
</span></span><span class="line"><span class="cl">    <span class="nv">EXIT</span><span class="o">=</span><span class="m">1</span>
</span></span><span class="line"><span class="cl">  <span class="k">fi</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">check <span class="s2">"SSH root login disabled"</span> <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="s2">"grep -qE '^PermitRootLogin no' /etc/ssh/sshd_config"</span>
</span></span><span class="line"><span class="cl">check <span class="s2">"SSH password auth disabled"</span> <span class="se">\
</span></span></span><span class="line"><span class="cl">  <span class="s2">"grep -qE '^PasswordAuthentication no' /etc/ssh/sshd_config"</span>
</span></span><span class="line"><span class="cl">check <span class="s2">"UFW active"</span> <span class="s2">"ufw status | grep -q 'Status: active'"</span>
</span></span><span class="line"><span class="cl">check <span class="s2">"auditd running"</span> <span class="s2">"systemctl is-active --quiet auditd"</span>
</span></span></code></pre></div><p><code>[OK]/[FAIL]</code> columnar output is easy to pipe into a Slack notification or store in a log.</p><h2 id="when-this-stops-being-the-right-tool">When this stops being the right tool</h2><p>For one server, Bash is fine. For ten, I&rsquo;d move to Ansible — the inventory and playbook structure pay for themselves. The scripts were a way to get familiar with the actual files and directives. After that, config management makes more sense.</p><h2 id="repo">Repo</h2><p><a href="https://github.com/E-mma9/Linux-Security-Hardening">github.com/E-mma9/Linux-Security-Hardening</a></p>
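<p>A way to convince yourself the update-or-append pattern from the idempotency section really is re-runnable, without touching the real <code>sshd_config</code>: run it twice against a scratch file and look at the result. This is only a sketch. <code>set_directive</code> and <code>demo</code> are hypothetical names, the extra file argument is an assumption added so the function can point at a temp file, and <code>sed -i</code> is used in its GNU form as found on Ubuntu/Debian.</p>

```shell
# Sketch: same update-or-append idea as ssh_set, with the target file passed in
# (an assumption for this example) so it can be exercised on a scratch copy.
set_directive() {
  local file="$1" key="$2" val="$3"
  if grep -qE "^${key}[[:space:]]" "$file"; then
    # Directive already active: rewrite that line in place.
    sed -i "s|^${key}[[:space:]].*|${key} ${val}|" "$file"
  else
    # Directive missing or only present as a comment: append it.
    printf '%s %s\n' "$key" "$val" >> "$file"
  fi
}

demo() {
  local tmp
  tmp="$(mktemp)"
  printf '%s\n' '#PermitRootLogin prohibit-password' 'Port 22' > "$tmp"
  # Each call is made twice on purpose; the second run should change nothing.
  set_directive "$tmp" Port 2222
  set_directive "$tmp" Port 2222
  set_directive "$tmp" PermitRootLogin no
  set_directive "$tmp" PermitRootLogin no
  cat "$tmp"
  rm -f "$tmp"
}

demo
```

<p>The commented-out line is left alone, the active <code>Port</code> line is rewritten once, and the second call of each pair does not add a duplicate.</p>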
]]></content:encoded><category>linux</category><category>bash</category><category>security</category><category>ssh</category><category>ufw</category><category>auditd</category></item><item><title>Notes on my 3-node Proxmox cluster</title><link>https://emmanueltekle.nl/posts/notes-on-my-3-node-proxmox-cluster/</link><pubDate>Sun, 18 Jan 2026 00:00:00 +0000</pubDate><author>emmanueltekle@gmail.com (Emmanuel Tekle)</author><guid>https://emmanueltekle.nl/posts/notes-on-my-3-node-proxmox-cluster/</guid><description>Hardware, networks, and what I've learned running three nodes at home.</description><content:encoded><![CDATA[<p>A short write-up on my home Proxmox cluster — why three nodes, how it&rsquo;s wired, and what I&rsquo;ve learned. I get asked about this a lot, so this is the canonical answer.</p><h2 id="hardware">Hardware</h2><p>Three identical second-hand Intel N100 mini-PCs, ~€180 each. Each one has:</p><ul><li>N100 CPU, 16 GB RAM, 512 GB NVMe</li><li>2.5 GbE — matters for cluster + storage traffic</li><li>Headless, Proxmox VE 8</li></ul><p>Total around €600. Power draw is roughly 8 W idle, 15 W loaded per node.</p><h2 id="why-three-instead-of-one">Why three instead of one</h2><p>A single bigger server would be cheaper to run and quieter. Three nodes give me things a single host can&rsquo;t:</p><ul><li><strong>HA pairs</strong> — Pi-hole and a few other services run as an HA pair across two nodes. I can take a node down for kernel updates without breaking DNS for the house.</li><li><strong>Live migration</strong> — <code>qm migrate &lt;vmid&gt; &lt;target&gt;</code> moves a VM between hosts with a few seconds of network blip. Useful operationally, and useful for testing.</li><li><strong>Real failure modes</strong> — pulling a power cable on one node behaves differently from rebooting a single host.
The behaviour during recovery is the actual reason I have three.</li></ul><h2 id="networks">Networks</h2><p>Four VLANs, segmented on an OPNsense box:</p><ul><li><strong>MGMT (VLAN 10)</strong> — Proxmox UI, SSH, cluster heartbeat</li><li><strong>STORAGE (VLAN 20)</strong> — Ceph backend, jumbo frames (MTU 9000)</li><li><strong>PROD (VLAN 30)</strong> — VM traffic</li><li><strong>TRUST (VLAN 100)</strong> — daily-driver hosts</li></ul><p>Splitting storage from production traffic means a heavy workload doesn&rsquo;t starve the management plane.</p><h2 id="storage">Storage</h2><p>I tried Ceph for a while because I wanted to learn it. On three N100s it&rsquo;s not fast, but it does heal correctly when I pull a disk. After about three months I moved most things back to ZFS replication — simpler to reason about, and at this scale the resilience of Ceph wasn&rsquo;t worth its operational weight for me.</p><h2 id="what-its-not">What it&rsquo;s not</h2><ul><li>Not cheaper than a single server, on hardware or electricity</li><li>Not quieter — three fans humming</li><li>Not simpler</li></ul><p>It&rsquo;s just closer to a production setup, which is the only reason I run it.</p><h2 id="what-id-do-differently">What I&rsquo;d do differently</h2><p>Start with three nodes again, but skip Ceph and go straight to ZFS + scheduled replication. The time I spent on Ceph was educational but not directly useful for the workloads I actually run at home.</p>
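<p>For a rough sense of the electricity side, here is a quick sketch based on the 8 W idle and 15 W loaded per-node figures from the hardware section. It assumes all three nodes run around the clock and ignores the switch and the OPNsense box; <code>yearly_kwh</code> is a made-up helper name.</p>

```shell
# Sketch: yearly energy use of the cluster from the per-node wattage.
# 8760 is the number of hours in a year; integer division rounds down to whole kWh.
yearly_kwh() {
  nodes="$1"
  watts_per_node="$2"
  echo $(( nodes * watts_per_node * 8760 / 1000 ))
}

echo "idle:   $(yearly_kwh 3 8) kWh/year"
echo "loaded: $(yearly_kwh 3 15) kWh/year"
```

<p>Multiplying the result by a local price per kWh gives the running cost to compare against a single bigger server.</p>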
]]></content:encoded><category>proxmox</category><category>homelab</category><category>linux</category><category>virtualization</category></item></channel></rss>