OSD Recovery Complete

A project log for Raspberry Pi Ceph Cluster

A Raspberry Pi Ceph Cluster using 2TB USB drives.

Robert Rouquette — 05/19/2021 at 03:24

Once the failed drive was replaced, the cluster was able to rebalance and repair the inconsistent PGs.

      cluster:
        id:     105370dd-a69b-4836-b18c-53bcb8865174
        health: HEALTH_OK

      services:
        mon: 3 daemons, quorum ceph-mon00,ceph-mon01,ceph-mon02 (age 33m)
        mgr: ceph-mon02(active, since 13d), standbys: ceph-mon00, ceph-mon01
        mds: cephfs:2 {0=ceph-mon00=up:active,1=ceph-mon02=up:active} 1 up:standby
        osd: 30 osds: 30 up (since 9d), 30 in (since 9d)

      data:
        pools:   5 pools, 385 pgs
        objects: 8.42M objects, 29 TiB
        usage:   41 TiB used, 14 TiB / 55 TiB avail
        pgs:     385 active+clean

      io:
        client:   7.7 KiB/s wr, 0 op/s rd, 0 op/s wr
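The log doesn't show the exact commands used for the swap, but a typical sequence for replacing a failed OSD and repairing inconsistent PGs looks roughly like the following sketch. The OSD id and device path are placeholders, not values from this cluster:

```shell
#!/bin/sh
# Hypothetical ids -- substitute the failed OSD and its replacement drive.
OSD_ID=12
NEW_DEV=/dev/sda

# Take the dead OSD out of service and remove it from the cluster.
ceph osd out "osd.$OSD_ID"
ceph osd purge "$OSD_ID" --yes-i-really-mean-it

# Provision the replacement drive as a new OSD.
ceph-volume lvm create --data "$NEW_DEV"

# Ask Ceph to repair any PGs still flagged inconsistent, then watch
# the cluster rebalance back to HEALTH_OK.
ceph health detail | awk '/inconsistent/ {print $2}' | xargs -r -n1 ceph pg repair
watch ceph -s
```

With `mon_osd_down_out_interval` at its default, Ceph would also begin backfilling automatically once the new OSD comes up; the explicit `pg repair` only matters for PGs that were marked inconsistent by scrubbing.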

After poking around on the failed drive, it looks like the 2.5" drive itself is fine. The USB-to-SATA bridge controller appears to be the culprit: it randomly garbles data over the USB interface, and I was also able to observe it fail to enumerate on the USB bus. A failure rate of 1 in 30 isn't bad considering the cost of the drives.