#2654 Issue closed: 'export TMPDIR' in etc/rear/...conf does no longer work since #2633

Labels: discuss / RFC, fixed / solved / done

hpannenb opened issue at 2021-07-12 11:36:

  • ReaR version ("/usr/sbin/rear -V"): 2.6 (current git version)
    2.6 / Git
  • OS version ("cat /etc/os-release" or "lsb_release -a" or "cat /etc/rear/os.conf"):
    CentOS Linux release 7.9.2009 (Core) and/or RHEL 7.9
  • ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):
# ReaR - site.conf

export TMPDIR="/var"

BACKUP=NETFS
BACKUP_URL=iso://backup

OUTPUT=ISO
OUTPUT_URL=null

USE_STATIC_NETWORKING=yes
USE_RESOLV_CONF=no

SSH_ROOT_PASSWORD='root'
  • Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
    KVM and/or hardware
  • System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
    x86
  • Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
    BIOS
  • Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
    local disks
  • Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT" or "lsblk" as makeshift):
NAME                        KNAME     PKNAME    TRAN TYPE FSTYPE       SIZE MOUNTPOINT
/dev/sr0                    /dev/sr0            ata  rom              1024M 
/dev/vda                    /dev/vda                 disk               16G 
|-/dev/vda1                 /dev/vda1 /dev/vda       part xfs          500M /boot
`-/dev/vda2                 /dev/vda2 /dev/vda       part LVM2_member 15,5G 
  |-/dev/mapper/centos-root /dev/dm-0 /dev/vda2      lvm  xfs           15G /
  `-/dev/mapper/centos-swap /dev/dm-1 /dev/vda2      lvm  swap         512M [SWAP]
  • Description of the issue (ideally so that others can reproduce it):
    Using export TMPDIR="/var" does not work any more in the current Git version of ReaR. It was working in 2.6 / 2020-06-17 (on RHEL7.9). The mktemp always uses /tmp instead. A later check of ReaR in the BUILD_DIR fails due a hardened (noexec) /tmp directory or ReaR errors out due to "no space left on device".

  • Workaround, if any:
    If Your login shell is NOT bash than change to bash and type `export TMP="/var" before manually executing ReaR in that shell.

  • Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):
    Pre-req is: Same /etc/rear/site.conffor both tests:

gdha commented at 2021-07-12 15:49:

@hpannenb cp: error writing './usr/share/microcode_ctl/ucode_with_caveats/intel/intel-ucode/06-6a-05': No space left on device?
So, what is your base shell then?

hpannenb commented at 2021-07-12 18:51:

The base shell after login is bash.

Info:
What I could see from the the source of the main /usr/sbin/rear in version 2.6 the initialisation of the build area took place after the "# Combine configuration files" section. So mktemp could take into account the TMPDIR value from site.conf.

In the current version the order of sections has changed: The build area is executed first and later the "# Combine configuration files"* section is executed.

So IMHO no matter what TMPDIR has been configured in site.conf; it will never be used by mktemp.

This major change was introduced with commit 0022063 .

jsmeix commented at 2021-07-13 14:20:

@hpannenb
you are right.
export TMPDIR="..." in etc/rear/*.conf
does no longer work since
https://github.com/rear/rear/pull/2633
because I had to move in usr/sbin/rear the part with
mktemp -d -t rear.XXXXXXXXXXXXXXX
from after read the configurations
to before read the configurations
so export TMPDIR="..." in etc/rear/*.conf
happens now after mktemp -d -t rear.XXXXXXXXXXXXXXX
so that export TMPDIR="..." in etc/rear/*.conf
has become basically useless.

What works is export TMPDIR="..."
on the command line before calling rear.

I will think about if export TMPDIR="..." in etc/rear/*.conf
could be made working again but I fear this is not possible
with reasonable effort and with reasonably clean code.

jsmeix commented at 2021-07-13 14:41:

With
https://github.com/rear/rear/commit/b422845f14ea59c8b8dd16449df217334569faa7
I adapted the documentation in default.conf to match the current state.
I did this is only to have the current state documented appropriately.
This does not mean the current state cannot be changed or improved.

jsmeix commented at 2021-07-13 14:47:

I am not a TMPDIR expert but I think calling export TMPDIR="..."
before calling whatever program that will use TMPDIR
is perfectly in compliance how TMPDIR is meant to work.

pcahyna commented at 2021-07-16 17:37:

I am not a TMPDIR expert but I think calling export TMPDIR="..."
before calling whatever program that will use TMPDIR
is perfectly in compliance how TMPDIR is meant to work.

Hi @jsmeix , it is my understanding as well, but the problem with this approach that I am currently encountering is that if the distributor wants to set distribution-specific default in /etc/rear/os.conf or in some file under /usr/share/rear/conf, it does not work for TMPDIR. And wanting to set this default is natural if the distribution in question has specific requirements for temporary files. Right now I am facing the issue that /tmp is by default a tmpfs and therefore often too small.

Why does have the creation of the build area happen before reading the configuration? (Note that default.conf is still being read before the creation of the build area, only the OS-specific, site-specific and user-specific part are read later.)

jsmeix commented at 2021-07-19 07:59:

The creation of the build area must happen
before Start logging where the following is normally done (unless in debug modes):

STDOUT_STDERR_FILE="$TMP_DIR/$PROGRAM.$WORKFLOW.stdout_stderr"

because TMP_DIR=$BUILD_DIR/tmp (i.e. in the build area)
and STDOUT_STDERR_FILE is intentionally in BUILD_DIR
to get it removed in the same way as BUILD_DIR gets removed.

If we moved reading the configuration before Start logging
we would have nothing about reading the configuration in the log
which makes it hard to understand what goes on by plain looking at the log.

An interesting idea is that default.conf is sourced very early
and we won't need to have that in the log because we know our defaults.

So setting TMPDIR can happen in a config file that is sourced very early
which is meant to specify defaults where we do not need logging.

Because "man rear" reads

CONFIGURATION
To configure Relax-and-Recover you have to edit the configuration files in /etc/rear/.
All *.conf files there are part of the configuration,
but only site.conf and local.conf are intended for the user configuration.
All other configuration files hold defaults for various distributions
and should not be changed.

in particular /etc/rear/os.conf can be sourced early
e.g. directly after default.conf was sourced
so that setting TMPDIR could be done in /etc/rear/os.conf
by Linux distributors and if needed also by users because in general
users are allowed to change their Linux distributor's defaults as needed.

@gdha @pcahyna @hpannenb
would it be OK for your use cases when I change usr/sbin/rear to
source /etc/rear/os.conf directly after default.conf was sourced?

pcahyna commented at 2021-07-19 11:05:

Hi @jsmeix , yes, and I think that it applies not only to os.conf but also to the distribution configuration files: https://github.com/rear/rear/blob/cbd352d06facbd7d74c609382d7aa4652e9e07b1/usr/sbin/rear#L516 and perhaps to $WORKFLOW.conf as well (since all of those are internal to ReaR).

jsmeix commented at 2021-07-19 12:50:

Things aren't as easy because
in the Combine configuration filessection
SetOSVendorAndVersion() is called
but that function may call e.g. Error but calling Error() requires
source $SHARE_DIR/lib/_input-output-functions.sh
which happens after logging is started
so it is all some kind of big chicken and egg dilemma:
The more I move up the more other stuff I have to move even upper.

pcahyna commented at 2021-07-20 15:01:

@jsmeix Thinking more about my particular issue with /tmp overflowing ... I believe now that the issue does not need to be set in a distribution-specific configuration file, not even in os.conf, because it is not just RHEL that suffers from it. See the Ubuntu manapage that points out the same problem: http://manpages.ubuntu.com/manpages/bionic/en/man7/file-hierarchy.7.html

/tmp
The place for small temporary files. This directory is usually mounted as a "tmpfs"
instance, and should hence not be used for larger files. (Use /var/tmp for larger
files.)

So, any distribution (at least any distribution using systemd) is prone to the problem of /tmp being too small for ReaR use, and thus it is reasonable to set TMPDIR to /var/tmp in default.conf if unset before. Would you accept this change to default.conf? If we can do that, I don't need to source os.conf that early (although it still makes sense to do it, IMO).

jsmeix commented at 2021-07-20 16:03:

I was also meditating over the generic issue
with /tmp on tmpfs too small in general for ReaR
and how to cleanly get out of that /tmp on tmpfs mess.

I think your proposal to set TMPDIR if unset could be a possible way out.
I think the crucial point is to only touch TMPDIR if it is unset.

On the other hand unset/empty variables do not indicate the user does not care
because the user may want to have the default behaviour he only is not aware
that he depends on a default and if that default changes he may be upset.

So I suggest to do a bit more to not make users with plenty of space in /tmp upset
and only set TMPDIR if /tmp seems to have not enought space for ReaR
e.g. some automatism based on what df -B M /tmp shows as "Available"
perhaps even with comparison what df -B M /var/tmp shows as "Available"
to avoid making things worse if we set TMPDIR to /var/tmp regardless
how likely this may go wrong - I would like to have really fail-safe behaviour
when we mess around with generic Unix environment variables.

Perhaps even simpler:
Don't mess around with generic Unix environment variables
but test in usr/sbin/rear or in a new usr/share/rear/init/ script
how much availabe space there is in BUILD_DIR
and show a warning message or be bold and error out according to
https://schlomo.schapiro.org/2015/04/warning-is-waste-of-my-time.html

I think testing in a new usr/share/rear/init/ script is better
because there we know the final config variables values
so we know e.g. things like MODULES and FIRMWARE_FILES
that make a difference how big the ReaR recovery system will get, cf.
https://github.com/rear/rear/discussions/2640#discussioncomment-908335

By the way: In output/default/940_grub_rescue.sh I found

available_space=$(df -Pkl /boot | awk 'END { print $4 * 1024 }')

pcahyna commented at 2021-07-20 16:23:

@jsmeix

On the other hand unset/empty variables do not indicate the user does not care
because the user may want to have the default behaviour he only is not aware
that he depends on a default and if that default changes he may be upset.

I believe your proposals are needlessly complicated and thus fragile. As written in the referenced manpage file-hierarchy(7) (which is not distribution-specific, Fedora has it, too), "/tmp ... This directory (...) should hence not be used for larger files." "/var/tmp (...) can thus accept larger files. (...) If applications find the environment variable $TMPDIR set, they should prefer using the directory specified in it over directly referencing /var/tmp/".
My reading of this is that since ReaR can generate pretty large files, according to this standard, it should use /var/tmp as the default temporary space, not /tmp. If an user expects that with TMPDIR unset the default for temporary files is /tmp, then their expectation is clearly wrong, the default depends on whether the application expects the files to be large (and other considerations, like persistence across reboots, that do not apply for us, but are relevant for other applications that have to make this decision).
This is also confirmed by this recommendation in the RHEL 7 release notes (quite old already):
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/migration_planning_guide/chap-red_hat_enterprise_linux-migration_planning_guide-major_changes_and_migration_considerations#sect-Red_Hat_Enterprise_Linux-Migration_Planning_Guide-File_System_Layout-Temporary_storage_space

Processes that store a large amount of data, or require temporary data to persist across reboots, should use /var/tmp.

ReaR is certainly a process that can store large amounts of data (build directory is rather on the large side and I have seen failures due to this in practice), so it should use /var/tmp.

The last quoted phrase from file-hierarchy(7) means that user-specified TMPDIR overrides both /tmp and /var/tmp, depending on which one the application has decided to use as the default. It is not limited just to /tmp.

If you don't want to mess around with generic Unix environment variables, let's use the -p argument to the mktemp invocation that creates the build area (when TMPDIR is unset). (The downside could be the risk that temporary files will be then scattered both in /tmp and /var/tmp instead of being found only in one place.)

jsmeix commented at 2021-07-20 16:38:

Ah!
Seems FHS or some successor was adapted to the /tmp on tmpfs mess.
My knowledge was based on the traditional FHS
that had no size limits for /tmp as far as I can remember.
So let's simply follow latest greatest file-hierarchy(7) recommendations ;-)

pcahyna commented at 2021-07-21 17:00:

@jsmeix

but test in usr/sbin/rear or in a new usr/share/rear/init/ script
how much availabe space there is in BUILD_DIR
and show a warning message or be bold and error out

checking available space in BUILD_DIR could be an useful feature for the users, but only if it is done reliably. My concern is that it is difficult to estimate reliably how much space will be needed - the size of the rescue image can vary wildly according to which software you include in it, which in turn depends on the chosen BACKUP setting, because the client for the given backup tool has to be included. As an example, I have found that the new Rubrik/CDM backup method leads to very large images due to (quite likely unneeded) inclusion of many libraries.
Another concern is the possible fragility of the code that will implement it.

pcahyna commented at 2021-07-30 15:08:

@jsmeix I have another question. If export TMPDIR in local.conf does not work anymore, this is no longer true: https://github.com/rear/rear/blob/3e56fc8d03203ac2a2741ba8365f7d9789e04636/usr/share/rear/rescue/GNU/Linux/600_unset_TMPDIR_in_rescue_conf.sh#L2
and unset TMPDIR in rescue.conf probably does not work, either.

jsmeix commented at 2021-08-06 05:41:

With https://github.com/rear/rear/pull/2664 merged
this issue and the whole TMPDIR case should now be finally solved - hopefully ;-)


[Export of Github issue for rear/rear.]