Petasan 2 node cluster issue post 1 node failure

We have a 3-node PetaSAN cluster running version 3.2.1. One of the nodes crashed, and we have since rebalanced the data.

After rebalancing, we are seeing the following issues:

  1. 9 PGs are down.
  2. We are unable to bring up RBD disks from the PetaSAN front end. The error shown is "http code 504; Gateway timeout".
  3. We are trying to add a 4th node, but it is unable to join. The error in the front end is "Error joining cluster; not all ceph monitors are up".
  4. Only 1 manager is up.

The details are as follows:

root@NODE-1:~# ceph -s
  cluster:
    id:     01d2f513-5979-4fb8-9d7d-d46d4027275c
    health: HEALTH_ERR
            1/3 mons down, quorum NODE-1,NODE-2
            noout flag(s) set
            full ratio(s) out of order
            Reduced data availability: 9 pgs inactive, 9 pgs down
            Degraded data redundancy: 17 pgs undersized

  services:
    mon: 3 daemons, quorum NODE-1,NODE-2 (age 15h), out of quorum: NODE-3
    mgr: NODE-1(active, since 15h)
    osd: 30 osds: 25 up (since 15h), 25 in (since 25h); 16 remapped pgs
         flags noout

  data:
    pools:   3 pools, 1057 pgs
    objects: 1.43M objects, 5.4 TiB
    usage:   11 TiB used, 2.8 TiB / 14 TiB avail
    pgs:     0.851% pgs not active
             2/2862888 objects misplaced (0.000%)
             1015 active+clean
             17 active+undersized
             16 active+clean+remapped
             9 down
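
As a sanity check on the numbers above, the PG state counts add up to the pool total, and the 9 down PGs account for the reported "0.851% pgs not active". A minimal sketch of that arithmetic, with the counts copied by hand from the output (the names `pg_states`, `total_pgs` and `inactive_percent` are only illustrative and not part of any Ceph tool):

```python
# PG state counts copied by hand from the `ceph -s` output above.
pg_states = {
    "active+clean": 1015,
    "active+undersized": 17,
    "active+clean+remapped": 16,
    "down": 9,
}
total_pgs = 1057  # from the "pools: 3 pools, 1057 pgs" line


def inactive_percent(states, total):
    """Percentage of PGs whose state string does not include 'active'."""
    inactive = sum(
        count for state, count in states.items()
        if "active" not in state.split("+")
    )
    return 100.0 * inactive / total


# The per-state counts should account for every PG in the cluster.
assert sum(pg_states.values()) == total_pgs

print(f"{inactive_percent(pg_states, total_pgs):.3f}% pgs not active")
```

So the only inactive PGs in this output are the 9 that are down; the undersized and remapped PGs are still active.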