#586 Issue closed: cannot make pipe for command substitution: Too many open files

Labels: won't fix / can't fix / obsolete

gdha opened issue at 2015-05-13 13:45:

On a RHEL 6.3 system with approx 939 SCSI devices we run into the following (it is still rear-1.14):

2015-05-12 10:18:23 Excluding multipath slave /dev/sdahz.
2015-05-12 10:18:23 Excluding multipath slave /dev/sdhc.
2015-05-12 10:18:23 Excluding multipath slave /dev/sdqb.
/usr/share/rear/lib/_input-output-functions.sh: cannot make pipe for command substitution: Too many open files
Excluding multipath slave /dev/sdza.
2015-05-12 10:18:23 Including layout/save/default/33_remove_exclusions.sh
/usr/share/rear/layout/save/default/33_remove_exclusions.sh: cannot make pipe for command substitution: Too many open files
/usr/share/rear/layout/save/default/33_remove_exclusions.sh: line 19: read: read error: 0: Bad file descriptor
/usr/share/rear/layout/save/default/33_remove_exclusions.sh: redirection error: cannot duplicate fd: Too many open files
/usr/share/rear/lib/_input-output-functions.sh: cannot make pipe for command substitution: Too many open files
Including layout/save/default/34_generate_mountpoint_device.sh
/usr/share/rear/layout/save/default/34_generate_mountpoint_device.sh: redirection error: cannot duplicate fd: Too many open files
/usr/share/rear/layout/save/default/34_generate_mountpoint_device.sh: cannot make pipe for command substitution: Too many open files
/usr/share/rear/layout/save/default/34_generate_mountpoint_device.sh: line 9: read: read error: 0: Bad file descriptor
/usr/share/rear/layout/save/default/34_generate_mountpoint_device.sh: redirection error: cannot duplicate fd: Too many open files
/usr/share/rear/layout/save/default/34_generate_mountpoint_device.sh: redirection error: cannot duplicate fd: Too many open files
/usr/share/rear/lib/_input-output-functions.sh: cannot make pipe for command substitution: Too many open files
Including layout/save/GNU/Linux/35_copy_drbdtab.sh
/usr/share/rear/lib/_input-output-functions.sh: cannot make pipe for command substitution: Too many open files
Including layout/save/GNU/Linux/50_extract_vgcfg.sh
/usr/share/rear/lib/_input-output-functions.sh: redirection error: cannot duplicate fd: Too many open files

and rear just dies...
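
A rough way to check how close the running rear process is to its limit (a diagnostic sketch, not output from this system; the pgrep pattern is an assumption):

# PID of the running rear process (the pattern is an assumption, adjust as needed)
REAR_PID=$(pgrep -of 'rear mkbackup')
# number of file descriptors that process currently has open
ls /proc/$REAR_PID/fd | wc -l
# the per-process limit it is running with
grep 'open files' /proc/$REAR_PID/limits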

gdha commented at 2015-05-22 06:18:

Check the URL http://www.linuxquestions.org/questions/linux-server-73/where-does-default-ulimit-values-come-from-866550/ to verify what the limits are on this particular system.

gdha commented at 2015-05-26 09:51:

#-> ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 6057437
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
As you can see, user root still has a soft/hard limit of 1024 open files, and that is the reason why rear sometimes breaks with a huge number of open file descriptors (disk devices) when making the layout of the devices and file systems.
What we need to do is follow the recommendations described in this article:
http://unix.stackexchange.com/questions/8945/how-can-i-increase-open-files-limit-for-all-processes
and edit /etc/security/limits.conf with entries like:
@root soft nofile 65535
@root hard nofile 65535
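
The limits.conf entries are applied via pam_limits at the next login, so after logging in again the effective limits can be checked with, for example:

# soft and hard limits for open files in the current shell
ulimit -Sn
ulimit -Hn
# kernel-wide maximum number of open file handles
sysctl fs.file-max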

Will add this to the FAQ section of rear.

gdha commented at 2015-05-31 15:32:

Updated the FAQ page on our rear website.

gdha commented at 2016-11-24 08:27:

Why do we not execute ulimit -n 65536 at each run, as long as the number returned by sysctl fs.file-max is not exceeded?
Another interesting output is:

# sysctl fs.file-nr
fs.file-nr = 7680   0   2417926
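
For illustration, the proposed conditional raise could look roughly like this sketch (not actual ReaR code; 65536 is the value from the comment above):

# raise the per-process soft limit for open files,
# but only while it stays below the kernel-wide maximum
wanted=65536
file_max=$(sysctl -n fs.file-max)
if [ "$wanted" -lt "$file_max" ] ; then
    ulimit -n "$wanted"
fi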

jsmeix commented at 2016-11-24 09:07:

I think rear must never ever change environment
restriction settings like ulimit and things like that
on its own.

Imagine rear changes restrictions on its own and then
(because of whatever error or bad luck) rear consumes
way too much resources (e.g. it could fill up the disk,
use all available RAM or CPU, whatever else)
so that other processes fail (e.g. get aborted by
the kernel out of memory killer), then users would
rightfully complain that running rear may render
their production systems inoperable ("rear mkbackup"
can be run "online").

I think when there are particular restrictions set and
rear fails because of them, then the admin should
know what to do so that rear can work.

The only environment where I could agree that rear
disables any restriction settings is the ReaR recovery
system - i.e. the environment where "rear recover"
runs because rear is the only running program there.

I even think strictly speaking there is no need to
document that especially for ReaR because this
is well known Unix/Linux standard behaviour
(i.e. any admin should know about that),
even though when I experience such issues
I am always initially a bit surprised ("why the heck
does it fail now?"); recently I was hit by some
unexpected "out of disk space" issues on my virtual
machines with a relatively small virtual harddisk ;-)

jsmeix commented at 2016-11-24 11:26:

A general addendum regarding
environment restriction settings:

Another kind of environment restriction setting
is a network environment restriction
like local firewall rules that prevent rear from storing
its files on a remote server (e.g. BACKUP=NETFS).

Because rear runs as root it could "simply change"
any local network packet filtering and NAT rules
but I hope nobody really wants rear to do this.

jsmeix commented at 2016-11-25 08:46:

To avoid misunderstandings about my above comments:

I am against code in ReaR (as it comes from upstream)
that changes existing environment restriction settings.

On the other hand because all of ReaR is bash scripts
(even /etc/rear/local.conf is sourced and executed)
the user can add any additional commands he likes
to /etc/rear/local.conf to make rear work in his
particular environment, cf. some comments in
https://github.com/rear/rear/pull/1052
and also see
https://github.com/rear/rear/issues/968
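
For example, a user who wants the higher limit only for rear runs could put something like this into /etc/rear/local.conf (a sketch of the workaround described above, not an upstream recommendation):

# /etc/rear/local.conf is sourced by rear, so plain bash commands are allowed;
# raise the soft limit for open files for this rear run only
# (65536 is the value discussed above, adjust to your environment)
ulimit -n 65536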

gdha commented at 2016-12-02 13:56:

Won't be fixed in ReaR as a standard solution - it must be done as an extra step by the end-user (or, as @jsmeix mentioned, by adding some code in local.conf, for example).

Dmole commented at 2021-08-10 22:14:

Bash scripts can be given trivial changes to avoid leaking FDs:
https://stackoverflow.com/questions/24192459/bash-running-out-of-file-descriptors
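
One such trivial pattern (an illustration of the general idea, not a quote from the linked answer) is to explicitly close every extra file descriptor a script opens:

# open an extra descriptor, use it, and close it again so it is not
# inherited by every later command substitution and child process
exec 3< /etc/fstab      # example input file, purely illustrative
read -r first_line <&3
printf '%s\n' "$first_line"
exec 3<&-               # close fd 3 again when done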


[Export of Github issue for rear/rear.]