McGarrah Technical Blog

Posts tagged with "ceph"

Adding Ceph Dashboard to Your Proxmox Cluster

The Ceph Dashboard is incredibly useful for monitoring your cluster’s health, but setting it up on Proxmox isn’t as straightforward as the documentation suggests. After wrestling with SSL certificates and password policies, here’s how to get it working properly.

Why You Want the Ceph Dashboard

The dashboard gives you a web interface to monitor your Ceph cluster without SSH’ing into nodes and running CLI commands. You can see OSD status, pool usage, performance metrics, and cluster health at a glance. It’s essential for any serious homelab running Ceph, especially if you are doing something unusual like using USB drives for your storage media and want additional metrics for performance evaluation. In my case, I pair fast SSDs for the DB/WAL with USB drives for the data.
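For reference, the rough shape of the setup on a Proxmox node looks something like the sketch below; the full post walks through the SSL certificate and password-policy wrinkles, and the username and password here are placeholders.

```bash
# Install the dashboard module package on the Proxmox node, then restart the
# manager so it picks up the new module.
apt install ceph-mgr-dashboard
systemctl restart ceph-mgr.target

# Enable the dashboard in the active Ceph manager.
ceph mgr module enable dashboard

# Generate a self-signed certificate so the dashboard can serve HTTPS.
ceph dashboard create-self-signed-cert

# Create an administrator account; the password is read from a file to satisfy
# the password policy (and to keep it off the command line). Placeholder values.
echo 'ChangeMe-Str0ng!' > /root/dash_pass.txt
ceph dashboard ac-user-create admin -i /root/dash_pass.txt administrator
rm /root/dash_pass.txt

# Confirm where the dashboard is listening (defaults to port 8443).
ceph mgr services
```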

Upcoming Articles Roadmap: September - December 2025

I’ve got a pile of articles I want to get out before the end of 2025, and I’m trying to stick to at least one post per week. That’s roughly 16 more articles between now and December, which sounds doable if I don’t get distracted by shiny new projects.

Ceph Cluster Complete Removal on Proxmox for the Homelabs

Along the way, I broke the Ceph portion of my test Proxmox Cluster badly while doing a lot of physical media replacements. The test cluster is the right place to try out risky stuff, rather than my main cluster that is loaded up with my data. Fixing a broken cluster often teaches you something, but in this case I already know the lessons and just want to fast-track getting a clean Ceph cluster back online.

I need it back in place to test the Proxmox 8.2 to 8.3 upgrade before I run it on my main cluster. So this is a quick guide on how to completely remove a Ceph installation from your Proxmox 8.2 or 8.3 environment, as if it never existed.
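The short version looks roughly like the sketch below. It is destructive, has to be run on every node, and is only for a cluster you truly want gone; double-check the paths against the current Proxmox docs before running it anywhere you care about.

```bash
# WARNING: destructive. This wipes the Ceph configuration from a Proxmox node.
# Run on every node, ideally after removing OSDs, monitors, and managers
# through the GUI/CLI where that still works.

# Stop whatever Ceph services are still limping along.
systemctl stop ceph.target
systemctl stop ceph-mon.target ceph-mgr.target ceph-mds.target ceph-osd.target

# Let Proxmox remove its own Ceph configuration.
pveceph purge

# Clear out leftover state so a fresh 'pveceph install' starts clean.
rm -rf /etc/ceph /var/lib/ceph
rm -f /etc/pve/ceph.conf
rm -rf /etc/pve/priv/ceph
```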

proxmox ceph install dialog

ProxMox 8.2.4 Upgrade on Dell Wyse 3040s

My earlier post on the ProxMox 8.2.2 Cluster on Dell Wyse 3040s mentioned the tight constraints of the cluster in both RAM and disk space. There are some extra steps involved in keeping a very lean Proxmox 8 cluster running on these extremely resource-limited boxes. I am running Proxmox 8.2 and Ceph Reef on them, which leaves them slightly under-resourced at the defaults. So when Ceph would not start its monitors after my upgrade from Proxmox 8.2.2 to 8.2.4, I had to dig a bit to find the problem.

Proxmox SFF Cluster

The Ceph Monitor will not start if there is not at least 5% free disk space on the root partition. My root volumes were sitting right at 95% used. So our story begins…
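A quick way to see the problem, and the knob behind it, is sketched below. The 5% cut-off is the `mon_data_avail_crit` option; with the monitors down there is no quorum, so you have to free space locally rather than use `ceph config`.

```bash
# How full is the filesystem the monitor data lives on?
df -h /var/lib/ceph/mon

# The monitor warns at mon_data_avail_warn (default 30%) free space and refuses
# to run below mon_data_avail_crit (default 5%). Reclaim space first:
apt clean
journalctl --vacuum-size=64M

# Last resort on a tiny boot disk: lower the critical threshold in the [mon]
# section of /etc/pve/ceph.conf, then restart the monitor on this node.
#   mon_data_avail_crit = 3
systemctl restart ceph-mon@$(hostname).service
```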

Proxmox Ceph settings for the Homelab

What is Ceph? Ceph is an open-source, software-defined storage system designed and built to address block, file, and object storage needs, which makes it a good fit for a modern homelab. Proxmox Virtual Environment (PVE) makes the initial configuration and setup of a Hyper-Converged Ceph Cluster relatively easy.

Why would you want a hyper-converged storage system like Ceph? So that the PVE cluster running your Virtual Machines and Linux Containers has highly available shared storage, making those guests portable between nodes and therefore highly available themselves.
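To give a sense of how little there is to the initial setup, the CLI side boils down to a handful of `pveceph` commands. This is only a sketch, not the full procedure; the network range and device name are placeholders for your own environment.

```bash
# Install the Ceph packages on each node (Reef in my case).
pveceph install --repository no-subscription

# Initialize the cluster with the network Ceph should use (placeholder subnet).
pveceph init --network 192.168.1.0/24

# Create a monitor and manager on this node (repeat monitors on three nodes).
pveceph mon create
pveceph mgr create

# Turn a blank disk into an OSD (placeholder device name).
pveceph osd create /dev/sdb

# Create a replicated pool backed by those OSDs.
pveceph pool create vm-storage
```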

There is a significant learning curve in understanding how the pieces of Ceph fit together, and the Proxmox documentation does a decent job of helping you along. Proxmox VE sets decent defaults for the Ceph Cluster that are good for an enterprise environment. What it does not do is help you choose defaults that reduce wear and load on a homelab system. This is where I am going to try out a few things to reduce load and wear on my homelab equipment while maintaining a relatively high-availability environment.

My earlier post on a Ceph Cluster rebalance issue came from figuring out problems in an unbalanced cluster caused by how data had been loaded into it. This post is focused on a normally running cluster that needs some optimization for the homelab.
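The kind of optimization I have in mind is mostly about background work: scrub scheduling and deep-scrub frequency are the first knobs I plan to look at. A minimal sketch, assuming Reef-era `ceph config` syntax, with example values rather than recommendations:

```bash
# Keep scrubbing off peak hours and spread it out to reduce drive wear and
# background load. Values are illustrative, not tested recommendations.

# Only scrub during an overnight window.
ceph config set osd osd_scrub_begin_hour 1
ceph config set osd osd_scrub_end_hour 6

# Stretch the deep-scrub interval from the default 7 days to 14 days (seconds).
ceph config set osd osd_deep_scrub_interval 1209600

# Never run more than one scrub per OSD at a time.
ceph config set osd osd_max_scrubs 1

# Skip scrubbing when the system load is already high.
ceph config set osd osd_scrub_load_threshold 0.3
```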

ProxMox 8.2.2 Cluster on Dell Wyse 3040s

I want a place to test and try out new features and capabilities in Proxmox 8.2.2 SDN (Software Defined Networking). I would also like to be able to test some risky Ceph Cluster configuration changes. I do not want to do that on the semi-production, Ceph-enabled Proxmox 8.2.2 cluster I have mentioned in earlier posts. With 55TiB of raw storage and 29TiB of it loaded up with content, that cluster would be painful to rebuild or reload if I made a mistake while testing SDN or Ceph capabilities.

Test in Prod, what could go wrong?

ProxMox 8.2 for the Homelabs

I am in the process of building a Proxmox 8 Cluster with Ceph in an HA (high availability) configuration using very low-end hardware and questionable options for the various hardware buses. I’m going for HA, frugality, and reuse of hardware that I’ve gathered up over the years.

Over the COVID lockdown, I was running a Plex Media Server (PMS) on an older Dell Optiplex 390 SFF Desktop onto which I had cobbled several Seagate USB3 portable drives, slapping on more as I needed space. It hosted my extensive VHS, DVD and BluRay library as I ripped it into digital formats. To improve the experience I threw an Nvidia Quadro P400 into the mix, along with a PCIe USB3 card for faster access to the drives. Eventually, I had some drive issues and wanted additional reliability, so I tried out Microsoft Windows Storage Spaces (MWSS). Windows and the associated fun I had with MWSS left me incredibly frustrated, as I was trying to make an enterprise product work on a low-end workstation with a bunch of USB drives. The thing that made me fully abandon MWSS was the recovery options when you had a bad drive. MWSS probably works well with solid enterprise equipment but was misery on the stuff I cobbled together. So exit Windows OS.

For about ten (10) years, I ran a VMware ESXi server that let me play with new technology and host some content and services. I let it go a while back while I was in graduate school and working full-time, but I have missed having that option ever since. Adding a homelab server or cluster will let me get some of that back.

Ceph Cluster rebalance issue

This is a rough draft that I’m pushing out because it might be useful to someone, rather than letting it sit in my drafts folder forever… Good enough beats perfect-that-never-ships, every time.

I think I have mentioned my ProxMox/Ceph combo cluster in an earlier post. A quick summary: it is a five (5) node cluster for ProxMox HA, and three of those nodes run Ceph with three (3) OSDs each, for a total of nine (9) 5TB OSDs. They are in a 3/2 Ceph configuration, keeping three copies of each piece of data and continuing to serve I/O as long as two of the three copies are available. Those OSD hard drives were added in batches of three (3), one per node, as I could get drives cleaned and available. So I added them piecemeal: a set of three OSDs, then three more, and finally the last batch of three. I’m also committing the sin of using 1Gbps instead of 10Gbps networking for the Ceph storage network, so performance is impacted.
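If you want to confirm what a 3/2 configuration looks like on your own pools, the size and min_size values are easy to inspect; the pool name below is a placeholder for whatever your pools are called.

```bash
# Three copies of every object (size), and I/O continues as long as at least
# two copies are available (min_size). Pool name is a placeholder.
ceph osd pool get cephfs_data size
ceph osd pool get cephfs_data min_size

# Or list the settings for every pool at once.
ceph osd pool ls detail
```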

Adding them in pieces while also loading up CephFS with media content is what is hurting me now. My first three OSDs, spread across the three nodes, are pretty full at 75-85%, and as I added the next batches the cluster never fully caught up and rebalanced the initial contents. This skews my ‘ceph osd df tree’ results, showing less usable space than I actually have available.

Something I’m navigating is that Ceph stops accepting writes, effectively going read-only, when an OSD approaches the full limit, which is typically 95% of available space. It starts alerting like crazy at 85% full with warnings of dire things to come. Notice in my OSD status below the massive imbalance between the initial OSDs 0, 1, 2 and the later batches 3, 4, 5 and 6, 7, 8.

Ceph OSD Status
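For the curious, here is roughly how I watch the imbalance and what I am trying in order to nudge the cluster back into shape. This is a sketch; the upmap balancer assumes all clients are Luminous or newer.

```bash
# Per-OSD utilization; this is where the 75-85% OSDs stand out.
ceph osd df tree

# The ratios behind the warnings: nearfull (default 0.85) and full (default 0.95).
ceph osd dump | grep ratio

# Let the balancer module even things out using pg-upmap.
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# Older alternative: a one-shot reweight of the most over-full OSDs.
ceph osd reweight-by-utilization
```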