McGarrah Technical Blog

Posts tagged with "ceph"

Adding Ceph Dashboard to Your Proxmox Cluster

The Ceph Dashboard is incredibly useful for monitoring your cluster’s health, but setting it up on Proxmox isn’t as straightforward as the documentation suggests. After wrestling with SSL certificates and password policies, here’s how to get it working properly.

Why You Want the Ceph Dashboard

The dashboard gives you a web interface to monitor your Ceph cluster without SSH’ing into nodes and running CLI commands. You can see OSD status, pool usage, performance metrics, and cluster health at a glance. It’s essential for any serious homelab running Ceph, especially if you are doing something unusual like using USB drives for your storage media and want additional metrics for performance evaluation. In my case, I pair fast SSDs for the DB/WAL with USB drives for the data.
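For reference, the rough shape of the setup on a Proxmox node looks something like the sketch below; the full post walks through the SSL certificate and password-policy wrinkles, and the username and password here are placeholders.

```bash
# Install the dashboard module package on the Proxmox node, then restart the
# manager so it picks up the new module.
apt install ceph-mgr-dashboard
systemctl restart ceph-mgr.target

# Enable the dashboard in the active Ceph manager.
ceph mgr module enable dashboard

# Generate a self-signed certificate so the dashboard can serve HTTPS.
ceph dashboard create-self-signed-cert

# Create an administrator account; the password is read from a file to satisfy
# the password policy (and to keep it off the command line). Placeholder values.
echo 'ChangeMe-Str0ng!' > /root/dash_pass.txt
ceph dashboard ac-user-create admin -i /root/dash_pass.txt administrator
rm /root/dash_pass.txt

# Confirm where the dashboard is listening (defaults to port 8443).
ceph mgr services
```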

Upcoming Articles Roadmap: September - December 2025

I’ve got a pile of articles I want to get out before the end of 2025, and I’m trying to stick to at least one post per week. That’s roughly 16 more articles between now and December, which sounds doable if I don’t get distracted by shiny new projects.

Ceph Cluster Complete Removal on Proxmox for the Homelabs

Along the way, I broke the Ceph portion of my test Proxmox Cluster badly while doing a lot of physical media replacements. The test cluster is the right place to try out risky stuff, rather than my main cluster that is loaded up with my data. Fixing a broken cluster often teaches you something, but in this case I already know the lessons and just want to fast-track getting a clean Ceph cluster back online.

I need it back in place to test the Proxmox 8.2 to 8.3 upgrade before I run it on my main cluster. So this is a quick guide on how to completely remove a Ceph installation from your Proxmox 8.2 or 8.3 environment, as if it never existed.
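The short version looks roughly like the sketch below. It is destructive, has to be run on every node, and is only for a cluster you truly want gone; double-check the paths against the current Proxmox docs before running it anywhere you care about.

```bash
# WARNING: destructive. This wipes the Ceph configuration from a Proxmox node.
# Run on every node, ideally after removing OSDs, monitors, and managers
# through the GUI/CLI where that still works.

# Stop whatever Ceph services are still limping along.
systemctl stop ceph.target
systemctl stop ceph-mon.target ceph-mgr.target ceph-mds.target ceph-osd.target

# Let Proxmox remove its own Ceph configuration.
pveceph purge

# Clear out leftover state so a fresh 'pveceph install' starts clean.
rm -rf /etc/ceph /var/lib/ceph
rm -f /etc/pve/ceph.conf
rm -rf /etc/pve/priv/ceph
```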

proxmox ceph install dialog

ProxMox 8.2.4 Upgrade on Dell Wyse 3040s

My earlier post on the ProxMox 8.2.2 Cluster on Dell Wyse 3040s mentioned the tight constraints of the cluster in both RAM and disk space. There are some extra steps involved in keeping a very lean Proxmox 8 cluster running on these extremely resource-limited boxes. I am running Proxmox 8.2 and Ceph Reef on them, which leaves them slightly under-resourced at the defaults. So when Ceph would not start its monitors after my upgrade from Proxmox 8.2.2 to 8.2.4, I had to dig a bit to find the problem.

Proxmox SFF Cluster

The Ceph Monitor will not start if there is not at least 5% free disk space on the root partition. My root volumes were sitting right at 95% used. So our story begins…
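A quick way to see the problem, and the knob behind it, is sketched below. The 5% cut-off is the `mon_data_avail_crit` option; with the monitors down there is no quorum, so you have to free space locally rather than use `ceph config`.

```bash
# How full is the filesystem the monitor data lives on?
df -h /var/lib/ceph/mon

# The monitor warns at mon_data_avail_warn (default 30%) free space and refuses
# to run below mon_data_avail_crit (default 5%). Reclaim space first:
apt clean
journalctl --vacuum-size=64M

# Last resort on a tiny boot disk: lower the critical threshold in the [mon]
# section of /etc/pve/ceph.conf, then restart the monitor on this node.
#   mon_data_avail_crit = 3
systemctl restart ceph-mon@$(hostname).service
```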

Proxmox Ceph settings for the Homelab

What is Ceph? Ceph is an open-source, software-defined storage system designed and built to address block, file, and object storage needs, which makes it a good fit for a modern homelab. Proxmox Virtual Environment (PVE) makes the initial configuration and setup of a Hyper-Converged Ceph Cluster relatively easy.

Why would you want a hyper-converged storage system like Ceph? So that the PVE cluster running your Virtual Machines and Linux Containers has highly available shared storage, making those guests portable between nodes and therefore highly available themselves.
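To give a sense of how little there is to the initial setup, the CLI side boils down to a handful of `pveceph` commands. This is only a sketch, not the full procedure; the network range and device name are placeholders for your own environment.

```bash
# Install the Ceph packages on each node (Reef in my case).
pveceph install --repository no-subscription

# Initialize the cluster with the network Ceph should use (placeholder subnet).
pveceph init --network 192.168.1.0/24

# Create a monitor and manager on this node (repeat monitors on three nodes).
pveceph mon create
pveceph mgr create

# Turn a blank disk into an OSD (placeholder device name).
pveceph osd create /dev/sdb

# Create a replicated pool backed by those OSDs.
pveceph pool create vm-storage
```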

There is a significant learning curve in understanding how the pieces of Ceph fit together, and the Proxmox documentation does a decent job of helping you along. Proxmox VE sets decent defaults for the Ceph Cluster that are good for an enterprise environment. What it does not do is help you choose defaults that reduce wear and load on a homelab system. This is where I am going to try out a few things to reduce load and wear on my homelab equipment while maintaining a relatively high-availability environment.

My earlier post on a Ceph Cluster rebalance issue came from figuring out problems in an unbalanced cluster caused by how data had been loaded into it. This post is focused on a normally running cluster that needs some optimization for the homelab.
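The kind of optimization I have in mind is mostly about background work: scrub scheduling and deep-scrub frequency are the first knobs I plan to look at. A minimal sketch, assuming Reef-era `ceph config` syntax, with example values rather than recommendations:

```bash
# Keep scrubbing off peak hours and spread it out to reduce drive wear and
# background load. Values are illustrative, not tested recommendations.

# Only scrub during an overnight window.
ceph config set osd osd_scrub_begin_hour 1
ceph config set osd osd_scrub_end_hour 6

# Stretch the deep-scrub interval from the default 7 days to 14 days (seconds).
ceph config set osd osd_deep_scrub_interval 1209600

# Never run more than one scrub per OSD at a time.
ceph config set osd osd_max_scrubs 1

# Skip scrubbing when the system load is already high.
ceph config set osd osd_scrub_load_threshold 0.3
```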

ProxMox 8.2.2 Cluster on Dell Wyse 3040s

I want a place to test and try out new features and capabilities in Proxmox 8.2.2 SDN (Software Defined Networking). I would also like to be able to test some risky Ceph Cluster configuration changes. I do not want to do that on the semi-production, Ceph-enabled Proxmox 8.2.2 cluster I have mentioned in earlier posts. With 55TiB of raw storage and 29TiB of it loaded up with content, that cluster would be painful to rebuild or reload if I made a mistake while testing SDN or Ceph capabilities.

Test in Prod, what could go wrong?

ProxMox 8.2 for the Homelabs

I am in the process of building a Proxmox 8 Cluster with Ceph in an HA (high availability) configuration using very low-end hardware and questionable options for the various hardware buses. I’m going for HA, frugality, and reuse of hardware that I’ve gathered up over the years.

Over the COVID lockdown, I was running a Plex Media Server (PMS) on an older Dell Optiplex 390 SFF Desktop onto which I had cobbled several Seagate USB3 portable drives, slapping on more as I needed space. It hosted my extensive VHS, DVD and BluRay library as I ripped it into digital formats. To improve the experience I threw an Nvidia Quadro P400 into the mix, along with a PCIe USB3 card for faster access to the drives. Eventually, I had some drive issues and wanted additional reliability, so I tried out Microsoft Windows Storage Spaces (MWSS). Windows and the associated fun I had with MWSS left me incredibly frustrated, as I was trying to make an enterprise product work on a low-end workstation with a bunch of USB drives. The thing that made me fully abandon MWSS was the recovery options when you had a bad drive. MWSS probably works well with solid enterprise equipment but was misery on the stuff I cobbled together. So exit Windows OS.

For about ten (10) years, I ran a VMware ESXi server that let me play with new technology and host some content and services. I let it go a while back while I was in graduate school and working full-time, but I have missed having that option ever since. Adding a homelab server or cluster will let me get some of that back.

Ceph Cluster rebalance issue

This is a rough draft that I’m pushing out because it might be useful to someone, rather than letting it sit in my drafts folder forever… Good enough beats perfect-that-never-ships, every time.

I think I have mentioned my ProxMox/Ceph combo cluster in an earlier post. A quick summary: it is a five (5) node cluster for ProxMox HA, and three of those nodes run Ceph with three (3) OSDs each, for a total of nine (9) 5TB OSDs. They are in a 3/2 Ceph configuration, keeping three copies of each piece of data and continuing to serve I/O as long as two of the three copies are available. Those OSD hard drives were added in batches of three (3), one per node, as I could get drives cleaned and available. So I added them piecemeal: a set of three OSDs, then three more, and finally the last batch of three. I’m also committing the sin of using 1Gbps instead of 10Gbps networking for the Ceph storage network, so performance is impacted.
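If you want to confirm what a 3/2 configuration looks like on your own pools, the size and min_size values are easy to inspect; the pool name below is a placeholder for whatever your pools are called.

```bash
# Three copies of every object (size), and I/O continues as long as at least
# two copies are available (min_size). Pool name is a placeholder.
ceph osd pool get cephfs_data size
ceph osd pool get cephfs_data min_size

# Or list the settings for every pool at once.
ceph osd pool ls detail
```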

Adding them in pieces while also loading up CephFS with media content is what is hurting me now. My first three OSDs, spread across the three nodes, are pretty full at 75-85%, and as I added the next batches the cluster never fully caught up and rebalanced the initial contents. This skews my ‘ceph osd df tree’ results, showing less usable space than I actually have available.

Something I’m navigating is that Ceph stops accepting writes, effectively going read-only, when an OSD approaches the full limit, which is typically 95% of available space. It starts alerting like crazy at 85% full with warnings of dire things to come. Notice in my OSD status below the massive imbalance between the initial OSDs 0, 1, 2 and the later batches 3, 4, 5 and 6, 7, 8.

Ceph OSD Status
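For the curious, here is roughly how I watch the imbalance and what I am trying in order to nudge the cluster back into shape. This is a sketch; the upmap balancer assumes all clients are Luminous or newer.

```bash
# Per-OSD utilization; this is where the 75-85% OSDs stand out.
ceph osd df tree

# The ratios behind the warnings: nearfull (default 0.85) and full (default 0.95).
ceph osd dump | grep ratio

# Let the balancer module even things out using pg-upmap.
ceph balancer mode upmap
ceph balancer on
ceph balancer status

# Older alternative: a one-shot reweight of the most over-full OSDs.
ceph osd reweight-by-utilization
```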