proper way to remove a server from the cluster

petasanrd911
19 Posts
August 4, 2021, 10:33 pm
Hello,
What is the proper way to remove a server from a PetaSAN cluster? Say I have 5 servers and want to go down to 4 servers.
Thanks

admin
3,054 Posts
August 5, 2021, 11:10 pm
For nodes after the first 3: reduce the OSD crush weight to 0 on the node, wait for the data to be moved (see the PG Status chart), and once done, shut down the node and delete it from the node list.
Nodes 1-3 cannot be deleted; you can replace them with other hardware using "Replace Management Node". You can still remove their OSDs using the first method.
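The drain step above can be sketched as a small script. The OSD IDs 10-12 are hypothetical stand-ins for whatever `ceph osd tree` lists under the node being removed; `DRY_RUN=1` only prints the commands, so clear it to run them for real.

```shell
# Hypothetical sketch: drain all OSDs on the node being removed.
# Replace "10 11 12" with the OSD IDs shown under that host in 'ceph osd tree'.
DRY_RUN=1
for id in 10 11 12; do
  cmd="ceph osd crush reweight osd.$id 0"
  if [ -n "$DRY_RUN" ]; then
    echo "$cmd"    # dry run: just show what would be executed
  else
    $cmd           # live run: tell Ceph to migrate data off this OSD
  fi
done
# After reweighting, watch the PG Status chart (or 'ceph -s') until all
# PGs are active+clean, then shut the node down and delete it from the
# node list in the UI.
```

Reweighting to 0 keeps the OSDs up while Ceph backfills their data elsewhere, which is why the shutdown should wait for the rebalance to finish.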

neiltorda
108 Posts
June 18, 2025, 9:19 pm
How do you delete the node from the 'node list'?
Also, where is the "Replace Management Node" process completed?
Last edited on June 18, 2025, 9:29 pm by neiltorda · #3

admin
3,054 Posts
June 26, 2025, 11:47 am
If you want to replace a node: install the new node, giving it the same hostname, and when you deploy it select "Replace Node" instead of "Join Existing Cluster". You can also move the OSDs from the old node to the new node, either before or after installing the new node.
If you want to delete a node for good without replacing it, you can delete the node from the UI when it is down.

neiltorda
108 Posts
June 26, 2025, 8:18 pm
The issue I have is that one of the management nodes crashed with a failed boot drive. The cluster is out of quorum, but is up and running on the 2 other management nodes.
  cluster:
    id:     1da111ec-[redacted]8079925b
    health: HEALTH_WARN
            1/3 mons down, quorum node3,node2
            1 pgs not deep-scrubbed in time
  services:
    mon: 3 daemons, quorum node3,node2 (age 4h), out of quorum: node1
    mgr: node3(active, since 5M), standbys: node2
    mds: 1/1 daemons up, 1 standby
    osd: 155 osds: 129 up (since 5M), 102 in (since 5M); 3 remapped pgs
  data:
    volumes: 1/1 healthy
    pools:   6 pools, 3489 pgs
    objects: 46.18M objects, 175 TiB
    usage:   354 TiB used, 380 TiB / 735 TiB avail
I have put a new drive in the original server and run the installer. I gave it the same hostname as the old management node and the same IP info. It is now at the deployment wizard screen on port 5001 on that reinstalled node.
The only options are:
- Create New Cluster
- Join Existing Cluster
- Replace Node
I don't see an option to replace a management node.
When I look at the Node List from one of the other management nodes, it shows that the other node is down.
Should I do "Replace Node" here? Or is there somewhere else to replace a management node?
--
Also, if I try to SSH from node2 (a current management node) to the newly installed node that has the IP and hostname of the old management node, I get an error about SSH keys:
root@node2:~# ssh node1
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:{******redacted**************].
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /root/.ssh/known_hosts:7
  remove with:
  ssh-keygen -f "/root/.ssh/known_hosts" -R "node1"
ECDSA host key for node1 has changed and you have requested strict checking.
Host key verification failed.
Thanks,
Neil
Last edited on June 26, 2025, 9:40 pm by neiltorda · #5
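The host key warning above is expected after a reinstall: node1 generated a fresh host key, so the old entry cached on node2 has to be removed, exactly as the message suggests. A small demo against a scratch known_hosts file (on node2 the real file is /root/.ssh/known_hosts, and the throwaway key below stands in for node1's old host key):

```shell
# Demo: clearing a stale host key with ssh-keygen -R.
TMP=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$TMP/id"                       # throwaway key pair
printf 'node1 %s\n' "$(cut -d' ' -f1-2 "$TMP/id.pub")" > "$TMP/known_hosts"
ssh-keygen -f "$TMP/known_hosts" -R node1                         # drop the stale node1 line
grep -q node1 "$TMP/known_hosts" || echo "stale node1 key removed"
```

`ssh-keygen -R` keeps a backup as known_hosts.old; the next `ssh node1` will then prompt to accept the new key.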