Clarification on PetaSAN_Online_Upgrade_Guide

neiltorda
99 Posts
July 27, 2023, 6:32 pm
Client I/O appears to be working. I logged into a system that has an iSCSI multipath disk from this system mounted. I could go into the mounted folder and read and write data to it. It is reporting the correct sizes, etc.
I ran the commands on node psan4 (the one that has not yet been updated)
root@psan4:~# ceph versions
{
"mon": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
},
"mgr": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
},
"osd": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 36,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 114
},
"mds": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
},
"overall": {
"ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 39,
"ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 120
}
}
root@psan4:~# ceph osd dump | grep release
require_osd_release octopus
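For reference, a minimal pre-flight sketch from any node that holds the admin keyring, using only the two commands already shown above; the octopus count should drop to zero once the last node is upgraded:
ceph versions | grep -c octopus            # should report 0 after all four nodes are on quincy
ceph osd dump | grep require_osd_release   # stays "octopus" until the flag is raised manually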

admin
2,974 Posts
July 27, 2023, 6:35 pm
Looks good, I would go ahead with the psan4 upgrade.

neiltorda
99 Posts
July 27, 2023, 6:44 pm
I am running the update script now on psan4.
Once complete, do I run this:
ceph osd require-osd-release quincy
from just one node, or on all four nodes?
Thanks,
Neil
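For what it is worth, require-osd-release is a cluster-wide osdmap flag, so it only needs to be issued once from any node that can reach the monitors; a short sketch:
ceph osd require-osd-release quincy        # sets the flag for the whole cluster, no need to repeat per node
ceph osd dump | grep require_osd_release   # should now print quincy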

neiltorda
99 Posts
July 27, 2023, 7:13 pm
After running the updates and running ceph osd require-osd-release quincy on one of the nodes, everything comes up except for the graphs at the bottom of the web UI.
There is a red triangle in the top right corner that, when moused over, says "Internal server error".

admin
2,974 Posts
July 28, 2023, 9:01 am
If you reboot after the upgrade, the graphs may work by themselves.
You can also try the following. Get the stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py
On the node which is the stats server, try:
/opt/petasan/scripts/stats-stop.sh
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh
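One hedged way to confirm the stats service is writing metrics again after the restart; the whisper path below is the shared graphite directory that appears later in this thread, so treat the exact layout as an assumption:
find /opt/petasan/config/shared/graphite/whisper -name '*.wsp' -mmin -5 | head   # recently touched metric files
# if nothing prints, metrics are not being collected, which would leave the dashboard graphs empty or erroring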

neiltorda
99 Posts
July 28, 2023, 2:54 pm
I had rebooted all the nodes yesterday evening, so I first tried the commands you provided here. (The cluster leader is being shown as psan1, 172.16.32.X, where X is the correct IP for psan1.)
I had the same result. I then tried rebooting all the nodes again, to see if that brought it back. Still same red triangle with internal server error.
So I ran the above commands a 2nd time on the appropriate node (again, still psan1), with the same results.
The top part of the web UI is reporting properly, and other pages in the UI (the iSCSI disk list and path assignment, for example) are also working. It is just the graph at the bottom of the dashboard that is still showing the error.
Any other ideas?
Thanks so much!
Neil

admin
2,974 Posts
July 30, 2023, 9:14 am
Try to switch the current stats server.
Get the current stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py
On the node which is the stats server, stop the stats:
systemctl stop petasan-cluster-leader
systemctl stop petasan-notification
/opt/petasan/scripts/stats-stop.sh
Refresh the dashboard; it should show "bad gateway".
consul kv delete PetaSAN/Services/ClusterLeader
Wait approximately 1 minute, then refresh the dashboard.
Check that the stats server is now a new node:
/opt/petasan/scripts/util/get_cluster_leader.py
If you still have an error even with the new stats server, I would suspect the stats data. On the new server:
/opt/petasan/scripts/stats-stop.sh
mv /opt/petasan/config/shared/graphite /opt/petasan/config/shared/graphite_backup
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh
If this works and you really need the old stats data, you can move the old stats files back in groups (there is one file for each metric) and perhaps find the corrupt metric file causing this.
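A hedged check that the leader really moved after the consul key is deleted (the key name comes from the steps above; whether its value is directly readable with consul kv get is an assumption):
consul kv get PetaSAN/Services/ClusterLeader      # assumption: shows the new holder once re-elected
/opt/petasan/scripts/util/get_cluster_leader.py   # should now print a different node than before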

neiltorda
99 Posts
July 31, 2023, 2:16 pm
After moving to the new node (psan2), I am still showing the same error.
So I started going through the next set of steps, but the stats-setup.sh script is throwing the error shown below:
root@psan2:~# /opt/petasan/scripts/stats-stop.sh
root@psan2:~# mv /opt/petasan/config/shared/graphite/ /opt/petasan/config/shared/graphite_backup
root@psan2:~# /opt/petasan/scripts/stats-setup.sh
mv: cannot stat '/opt/petasan/config/shared/graphite/whisper/PetaSAN/ClusterStats/ceph-ceph/cluster': No such file or directory

admin
2,974 Posts
July 31, 2023, 6:28 pm
You can ignore it; proceed with the next step:
/opt/petasan/scripts/stats-start.sh
and see if the graphs start to show.
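The mv message apparently comes from a step inside stats-setup.sh that migrates an old metrics subtree; after the graphite directory was renamed, that subtree no longer exists, so the complaint is harmless. A quick way to see this (the path is taken verbatim from the error above):
ls /opt/petasan/config/shared/graphite/whisper/PetaSAN/ClusterStats/ceph-ceph/cluster 2>/dev/null \
  || echo "path absent, which is why the mv inside stats-setup.sh complains and can be ignored"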

neiltorda
99 Posts
August 1, 2023, 5:35 pm
Ran the steps again to be safe, same issue.
Here is the terminal output:
root@psan2:~# /opt/petasan/scripts/util/get_cluster_leader.py
{'psan2': '172.16.32.10'}
root@psan2:~# /opt/petasan/scripts/stats-stop.sh
root@psan2:~# mv /opt/petasan/config/shared/graphite /opt/petasan/config/shared/graphite_backup2
root@psan2:~# /opt/petasan/scripts/stats-setup.sh
mv: cannot stat '/opt/petasan/config/shared/graphite/whisper/PetaSAN/ClusterStats/ceph-ceph/cluster': No such file or directory
root@psan2:~# /opt/petasan/scripts/stats-start.sh
volume set: success
root@psan2:~#
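If the error persists with a freshly rebuilt graphite tree, the next place to look is the logs on the stats node around a dashboard refresh; a hedged sketch (journalctl is generic, while the PetaSAN application log path is an assumption):
journalctl -e --since "10 minutes ago" | grep -iE 'error|traceback'   # recent errors on the stats node
tail -n 100 /opt/petasan/log/PetaSAN.log                               # assumed PetaSAN log location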