#2484 Issue closed: Oracle related hugepage settings in sysctl config might mess up memory in recovery image

Labels: documentation, discuss / RFC, no-issue-activity

abbbi opened issue at 2020-08-20 09:48:

hi,

today i have come accross a situation which i like to share..

Recovery ISO booted, but no memory was available within the boot image!
OOM reaper was randomly killing processes resulting in applications to fail

It appeared that the system in question had an oracle database installed, and
oracle or related packages seem to enable the hugepages setting in
/etc/sysctl.conf:

vm.nr_hugepages = [some big value]

This settings (at least on SLES12) totaly messes up the memory in the
recovery image (all memory but some left over MB is in use)
As soon as option is disabled within recovery image:

sysctl -w vm.nr_hugepages=0

the memory becomes free and everything is working as excpected.
Maybe that should be documented or during recovery, it might not be the best idea
to use sysctl settings as applied to the original system?

gdha commented at 2020-08-20 09:56:

@abbbi That is indeed valuable information to take with us (in documentation) or during the recovery OS image building. Need to think about it.

abbbi commented at 2020-08-20 10:21:

@abbbi That is indeed valuable information to take with us (in documentation) or during the recovery OS image building. Need to think about it.

especially in enterprise environments. I think other big software players like SAP do also setup/create sysctl settings that might change memory behavior in a way that it causes trouble. Yet i fail to understand why the hugepages setting kicks in in such a way if the kernel boots via initrd .. Havent done much more testing here yet.

these were the settings that can cause/reproduce the issue:

/etc/sysctl.conf:
fs.aio-max-nr = 1048576
fs.file-max = 6815744
kernel.shmall = 53687091
kernel.shmmax = 219902325555
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048586
vm.nr_hugepages = 98304
vm.hugetlb_shm_group = 488

github-actions commented at 2020-10-20 01:53:

Stale issue message


[Export of Github issue for rear/rear.]