
Podman with Ganesha: log file runs out of disk space. FULL_DEBUG enabled by default.

Hey,

After the last update, the space on the root / disk began to disappear very quickly.

The culprit was a log file written by a Podman container. The container

localhost/petasan-nfs-ganesha:3.2.0

generated a file of more than 250 GB at /var/log/ganesha/ganesha.log

Inside the container, in the file

/etc/ganesha/ganesha.conf

I see that logging is set to FULL_DEBUG.

LOG
{
    components
    {
        ALL = FULL_DEBUG; # this will likely kill performance;
    }
}

Can this be disabled from within PetaSAN, or do you have to rebuild the entire container image?

And a request to other readers: do you see the same setting on your systems? You can check with:

podman exec -it `podman ps -aq` cat /etc/ganesha/ganesha.conf | grep ALL
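
The one-liner above assumes exactly one container on the node. For nodes with several containers, a rough sketch of the same check could look like the following. The function names are made up for illustration and are not part of PetaSAN; only `podman ps -q` and `podman exec` are real commands here.

```shell
#!/bin/sh
# Print (with line numbers) every non-comment line that sets a *DEBUG log level.
# Argument: path to a ganesha.conf, or "-" to read from stdin.
# find_debug_lines is a hypothetical helper name, not a PetaSAN tool.
find_debug_lines() {
    grep -nE '^[[:space:]]*[A-Za-z_]+[[:space:]]*=[[:space:]]*[A-Z_]*DEBUG' "$1"
}

# Run the check against every running container on this node.
# Assumes each container keeps its config at /etc/ganesha/ganesha.conf.
check_all_containers() {
    for c in $(podman ps -q); do
        echo "== container $c"
        podman exec "$c" cat /etc/ganesha/ganesha.conf | find_debug_lines -
    done
}
```

Calling `check_all_containers` on a node prints a header per container, followed by any matching lines. No output under a header means no debug level was found in that container.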

We are looking into this and will get back to you.

download patch:
https://www.petasan.org/fixes/321/nfs-remove-debug-logs.patch

apply patch on all nodes:
patch -p1 -d / < nfs-remove-debug-logs.patch

on the admin nodes:
systemctl restart petasan-admin

on nfs server nodes:
systemctl restart petasan-nfs-server

It works perfectly!

Thanks for the fast response!

UPDATE:

After a recent online update, the problem returned.

After patching the system again, the problem has been fixed.

The patch is already included in 3.2.1, so it is strange that you saw the issue after the upgrade. Is it possible that you re-ran the patch after the upgrade by mistake, which may have reversed it?
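
To rule out an accidental reversal, it may help to test whether a patch is already in place before applying it. Below is a minimal sketch using GNU patch. The wrapper name `apply_patch_once` is an assumption for illustration; the options used (`-R`, `-N`, `-f`, `-s`, `--dry-run`) are standard GNU patch flags. The check can give a wrong answer for hunks with very little context, so treat it as a sanity check rather than a guarantee.

```shell
#!/bin/sh
# Apply a patch only if it is not already applied.
# Arguments: target root directory, patch file.
# apply_patch_once is a hypothetical helper name, not a PetaSAN tool.
apply_patch_once() {
    root=$1
    patchfile=$2
    # A reverse dry run succeeds only when the patch is already in place.
    if patch -R -p1 -d "$root" --dry-run -s -f < "$patchfile" >/dev/null 2>&1; then
        echo "already applied, skipping"
        return 0
    fi
    # Otherwise apply forward; -N ignores patches that look reversed
    # or already applied instead of offering to reverse them.
    patch -N -p1 -d "$root" < "$patchfile"
}
```

With the patch from this thread, the call would be `apply_patch_once / nfs-remove-debug-logs.patch`, followed by the same service restarts as before.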

I'm sure nothing has changed.

This is a test cluster for learning.

The Grafana stats were not working, and I read that an online update fixes the problem.

I ran one command:
/opt/petasan/scripts/online-updates/update.sh
on each of the 3 nodes.

The stats fixed themselves.

After an hour, monitoring started alerting about huge data growth. On checking, it was the logs that were growing.

Patching and restarting worked again.

Once again the disk is running out of space, this time on a different node than last time.

I checked the file:
/usr/lib/python3/dist-packages/PetaSAN/core/nfs/config_builder.py

It does not contain the phrase "DEBUG".

Debug is enabled in the container:

podman exec -it NFS-172-30-0-141 grep -i debug /etc/ganesha/ganesha.conf
ALL = FULL_DEBUG; # this will likely kill performance;

Adding a test entry to NFS Exports changes the ganesha.conf file, but does not remove FULL_DEBUG.

--

I thought that maybe the container was built once and the problem comes from there:

I ran a fresh container from the image to see whether any DEBUG setting was left over in it, but there is none. Its ganesha.conf contains only a simple test export:

EXPORT
{
# Export Id (mandatory, each EXPORT must have a unique Export_Id)
Export_Id = 77;

# Exported path (mandatory)
Path = /nonexistant;

--

I handled this by removing DEBUG from the ganesha.conf file in the container and restarting. It works.
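
For anyone wanting to script the same workaround, here is a rough sketch of the edit. It lowers the level instead of deleting the line. Using EVENT as the replacement is my assumption (it is a valid Ganesha log level), and `lower_log_level` is a made-up helper name. It relies on GNU sed's `-i` option.

```shell
#!/bin/sh
# Replace a FULL_DEBUG level with EVENT in the given config file, in place.
# Any trailing comment on the matched line is dropped.
# lower_log_level is a hypothetical helper; EVENT is an assumed target level.
lower_log_level() {
    sed -i 's/=[[:space:]]*FULL_DEBUG[[:space:]]*;.*$/= EVENT;/' "$1"
}
```

Inside a container, the same sed expression can be run directly, for example `podman exec NFS-172-30-0-141 sed -i 's/=[[:space:]]*FULL_DEBUG[[:space:]]*;.*$/= EVENT;/' /etc/ganesha/ganesha.conf`. Since ganesha.conf is generated by PetaSAN, a manual edit like this may be overwritten the next time the file is rebuilt.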

--

I went through each node and checked whether DEBUG logging appears in /var/lib/containers/storage/*/merged/*/ganesha.conf
All clean.

Looks like something is adding that line on the fly.
If you have an idea where else the problem might be - I'd be happy to check.

Only config_builder.py configures this file; it is read by the service when it starts. Maybe something went wrong during the upgrade that prevented the service from restarting. If you add a test export and all nodes/containers show the correct config file, then all is working well.