
proper way to remove a server from the cluster

Hello

What is the proper way to remove a server from a PetaSAN cluster? Say I have 5 servers and want to go down to 4 servers.

thanks

For nodes after the first 3: reduce the OSD CRUSH weight to 0 on the node and wait for the data to be moved off it (see the PG Status chart). Once done, shut down the node, then delete it from the node list.

Nodes 1-3 cannot be deleted; you can replace them with other hardware using "Replace Management Node". You can still remove their OSDs using the first method.
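If it helps, the drain step can also be done from the command line. A sketch using the standard Ceph CLI (the OSD ids below are placeholders; list the node's actual OSDs with `ceph osd tree` first):

```shell
# List OSDs and the host bucket each one belongs to:
ceph osd tree

# Set the CRUSH weight of each OSD on the node being removed to 0,
# which tells Ceph to migrate its data elsewhere:
for id in 10 11 12; do
    ceph osd crush reweight "osd.$id" 0
done

# Watch recovery; the node is safe to shut down once all PGs are active+clean:
ceph -s
```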

How do you delete the node from the 'node list'?

Also, where is the "Replace Management Node" process performed?

If you want to replace a node: install the new node, giving it the same hostname, and when you deploy it select "Replace Node" instead of "Join Existing Cluster". You can also move the OSDs from the old node to the new node, either before or after installing the new node.

If you want to delete a node for good without replacing, you can delete the node from UI when it is down.

The issue I have is that one of the management nodes crashed with a failed boot drive. One monitor is out of quorum, but the cluster is up and running on the other 2 management nodes.

  cluster:
    id:     1da111ec-[redacted]8079925b
    health: HEALTH_WARN
            1/3 mons down, quorum node3,node2
            1 pgs not deep-scrubbed in time

  services:
    mon: 3 daemons, quorum node3,node2 (age 4h), out of quorum: node1
    mgr: node3(active, since 5M), standbys: node2
    mds: 1/1 daemons up, 1 standby
    osd: 155 osds: 129 up (since 5M), 102 in (since 5M); 3 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   6 pools, 3489 pgs
    objects: 46.18M objects, 175 TiB
    usage:   354 TiB used, 380 TiB / 735 TiB avail
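For reference, the mon membership can be re-checked at any time from a surviving node with the standard Ceph CLI:

```shell
# Summary of which monitors are in quorum and which are missing:
ceph mon stat

# More detail, including the monmap epoch and quorum members:
ceph quorum_status -f json-pretty
```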


I have put a new drive in the original server and run the installer. I gave it the same hostname and IP info as the old management node. It is now at the deployment wizard screen on port 5001 of the reinstalled node.
The only options are:
- Create New Cluster
- Join Existing Cluster
- Replace Node

I don't see an option to replace the management node.
When I look at the Node List from one of the other management nodes, it shows that node is down.

Should I do "Replace Node" here? Or is there somewhere else to go to replace a management node?

 

--

Also, if I try to SSH from node2 (a current management node) to the newly installed node that has the IP and hostname of the old management node, I get an error about SSH keys:

root@node2:~# ssh node1
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
SHA256:[redacted].
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /root/.ssh/known_hosts:7
  remove with:
  ssh-keygen -f "/root/.ssh/known_hosts" -R "node1"
ECDSA host key for node1 has changed and you have requested strict checking.
Host key verification failed.
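I assume the host-key change is expected here, since the node really was reinstalled, so the fix should be exactly what the message suggests: remove the stale entry, reconnect, and accept the new key. A self-contained demo of that step (it works on a scratch file, not the real /root/.ssh/known_hosts, so it is safe to try anywhere):

```shell
# Fabricate a scratch known_hosts with an entry for "node1",
# using a throwaway key pair:
KH=$(mktemp)
ssh-keygen -q -t ed25519 -N '' -f "$KH.key"
printf 'node1 %s\n' "$(cut -d' ' -f1,2 "$KH.key.pub")" > "$KH"

# The fix from the warning: drop the stale entry for node1
# (ssh-keygen keeps a backup in "$KH.old"):
ssh-keygen -f "$KH" -R node1

# The entry is gone; the next ssh connection would prompt
# to accept the node's new key.
grep -c '^node1 ' "$KH" || echo "stale entry removed"
```

On node2 itself that is just `ssh-keygen -f "/root/.ssh/known_hosts" -R "node1"` followed by `ssh node1`.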

 

Thanks,
Neil