
Clarification on PetaSAN_Online_Upgrade_Guide


Client I/O appears to be working. I logged into a system that has an iSCSI multipath disk from this cluster mounted. I could go into the mounted folder and read and write data to it, and it is reporting the correct sizes, etc.
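For a slightly stronger check than browsing the folder, a direct-I/O write/read round trip exercises the path end to end (the mount point here is hypothetical; substitute your own):

# hypothetical mount point /mnt/iscsi_test; bypasses the page cache with direct I/O
dd if=/dev/urandom of=/mnt/iscsi_test/upgrade_check bs=1M count=64 oflag=direct
dd if=/mnt/iscsi_test/upgrade_check of=/dev/null bs=1M iflag=direct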

 

I ran the commands on node psan4 (the one that has not yet been updated):

root@psan4:~# ceph versions
{
    "mon": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
    },
    "mgr": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
    },
    "osd": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 36,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 114
    },
    "mds": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 1,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 2
    },
    "overall": {
        "ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)": 39,
        "ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)": 120
    }
}


root@psan4:~# ceph osd dump | grep release
require_osd_release octopus

 

Looks good, I would go ahead with the psan4 upgrade.

I am running the update script now on psan4.

Once complete, do I run this:
ceph osd require-osd-release quincy

from just one node, or on all four nodes?

 

Thanks,
Neil
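For what it is worth, ceph osd require-osd-release sets a flag in the cluster-wide OSD map through the monitors, so to my understanding it only needs to be run once, from any one node. The same check used earlier in the thread will confirm it took effect:

ceph osd dump | grep require_osd_release

It should report quincy once the command has been applied.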

After running the updates and running ceph osd require-osd-release quincy on one of the nodes, everything comes up except for the graphs at the bottom of the web UI.

There is a red triangle in the top right corner that, when moused over, says "Internal server error".

If you reboot after the upgrade, the graphs may work by themselves.

You can also try the following.

Get the stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py

On the node which is the stats server, try:

/opt/petasan/scripts/stats-stop.sh
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh
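If the graphs still fail after this, the underlying exception may show up in the PetaSAN log on that node (this path is my assumption of the default install location; adjust if yours differs):

tail -n 50 /opt/petasan/log/PetaSAN.log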

I had rebooted all the nodes yesterday evening, so I first tried the commands you provided here. (The cluster leader is shown as psan1, 172.16.32.X, where X is the correct IP for psan1.)

I had the same result. I then tried rebooting all the nodes again, to see if that brought it back. Still same red triangle with internal server error.

So I ran the above commands a second time on the appropriate node (again, still psan1), with the same results.

The top part of the web UI is reporting properly, and other pages in the UI (the iSCSI disk list and path assignment, for example) are also working. It is just the graph at the bottom of the dashboard that is still showing the error.

Any other ideas?

Thanks so much!

Neil

Try switching the current stats server.

Get the current stats server IP from:
/opt/petasan/scripts/util/get_cluster_leader.py

On the node which is the stats server, stop the stats:
systemctl stop petasan-cluster-leader
systemctl stop petasan-notification
/opt/petasan/scripts/stats-stop.sh

Refresh the dashboard; it should show "bad gateway". Then:
consul kv delete PetaSAN/Services/ClusterLeader

Wait approximately 1 minute, then refresh the dashboard and check that the stats server is now a new node:
/opt/petasan/scripts/util/get_cluster_leader.py
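If you want to watch the handover, a small loop around the same script (nothing new beyond the script already mentioned above) will show when a different node takes over:

# poll the cluster leader every 10 seconds; stop with Ctrl-C once a new node appears
while true; do /opt/petasan/scripts/util/get_cluster_leader.py; sleep 10; done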

If you still have an error even with the new stats server, I would suspect the stats data. On the new server:
/opt/petasan/scripts/stats-stop.sh
mv /opt/petasan/config/shared/graphite /opt/petasan/config/shared/graphite_backup
/opt/petasan/scripts/stats-setup.sh
/opt/petasan/scripts/stats-start.sh

If this works and you really need the old stats data, you can move the old stats files back in groups (there is one file for each metric); that way you may be able to find a corrupt metric file causing this.
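If you do go hunting for a corrupt metric file, here is one sketch of a scan, assuming the whisper-info.py tool from Graphite's whisper package is available on the node. It walks the backed-up whisper tree and flags any file the tool cannot parse:

# flag whisper files that whisper-info.py cannot read (possible corruption)
find /opt/petasan/config/shared/graphite_backup/whisper -name '*.wsp' | while read -r f; do
    whisper-info.py "$f" > /dev/null 2>&1 || echo "possibly corrupt: $f"
done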

After moving to the new node (psan2), I am still seeing the same error.

So I started going through the next set of steps, but the stats-setup.sh script is throwing the error shown below:

root@psan2:~# /opt/petasan/scripts/stats-stop.sh
root@psan2:~# mv /opt/petasan/config/shared/graphite/ /opt/petasan/config/shared/graphite_backup
root@psan2:~# /opt/petasan/scripts/stats-setup.sh
mv: cannot stat '/opt/petasan/config/shared/graphite/whisper/PetaSAN/ClusterStats/ceph-ceph/cluster': No such file or directory

You can ignore it; proceed with the following step:

/opt/petasan/scripts/stats-start.sh

and see if the graphs start to show
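Independent of the browser, you can also check whether fresh stats are being written at all by looking for whisper files modified in the last few minutes under the shared graphite directory (path taken from the error output above):

find /opt/petasan/config/shared/graphite/whisper -name '*.wsp' -mmin -5 | head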

Ran the steps again to be safe, same issue:

Here is the terminal output:

root@psan2:~# /opt/petasan/scripts/util/get_cluster_leader.py 

{'psan2': '172.16.32.10'}

root@psan2:~# /opt/petasan/scripts/stats-stop.sh 

root@psan2:~# mv /opt/petasan/config/shared/graphite /opt/petasan/config/shared/graphite_backup2

root@psan2:~# /opt/petasan/scripts/stats-setup.sh 

mv: cannot stat '/opt/petasan/config/shared/graphite/whisper/PetaSAN/ClusterStats/ceph-ceph/cluster': No such file or directory

root@psan2:~# /opt/petasan/scripts/stats-start.sh 

volume set: success

root@psan2:~# 
