	Linux kernel patch from the Openwall Project.

	Overview.

This patch is a collection of security-related features for the Linux
kernel, all configurable via the new 'Security options' configuration
section.  In addition to the new features, some versions of the patch
contain various security fixes.  The number of such fixes changes from
version to version: some become obsolete (for example, when the same
problem gets fixed in a new kernel release), while new security issues
are discovered.


	Non-executable user stack area.

Most buffer overflow exploits are based on overwriting a function's return
address on the stack to point to some arbitrary code, which is also put
onto the stack.  If the stack area is non-executable, buffer overflow
vulnerabilities become harder to exploit.

Another way to exploit a buffer overflow is to point the return address to
a function in libc, usually system().  This patch also changes the default
address that shared libraries are mmap()'ed at to make it always contain a
zero byte.  In many exploits that have to do with ASCIIZ strings, this
makes it impossible to specify any more data after the address (parameters
to the function, or more copies of the return address when filling with a
pattern).
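
To see why the zero byte matters, here is a minimal Python sketch of an
ASCIIZ copy.  It assumes a little-endian 32-bit target; the address
0x00123456, the 16-byte padding and the helper name are made up for the
illustration and are not taken from the patch.

```python
import struct

def copied_as_asciiz(payload: bytes) -> bytes:
    """Model a strcpy()-style copy: stop after the first NUL byte."""
    end = payload.find(b"\x00")
    return payload if end < 0 else payload[:end + 1]

# Hypothetical address of a libc function; its top byte is zero.
LIBC_FUNC = 0x00123456

padding = b"A" * 16                        # filler up to the return address
ret_addr = struct.pack("<I", LIBC_FUNC)    # bytes 56 34 12 00
fake_args = struct.pack("<I", 0xBFFFF000)  # data meant to follow the address

payload = padding + ret_addr + fake_args
copied = copied_as_asciiz(payload)

# Only the padding and the address itself arrive; fake_args is cut off.
print(len(payload), len(copied))
```

The address itself can still be written, because the terminating NUL of
the string supplies its zero byte, but nothing can follow it.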

However, note that this patch is by no means a complete solution; it just
adds an extra layer of security.  Many buffer overflow vulnerabilities
will remain exploitable in a more complicated way, and some will even
remain unaffected by the patch.  The reason for using such a patch is to
protect against some of the buffer overflow vulnerabilities that are yet
unknown.

Also, note that some buffer overflows can be used for denial of service
attacks (usually in non-respawning daemons and network clients).  A patch
like this cannot do anything against that.

It is important that you fix vulnerabilities as soon as they become known,
even if you're using the patch.  The same applies to other features of the
patch (discussed below) and their corresponding vulnerabilities.


	Restricted links in /tmp.

I've also added a link-in-+t restriction, originally for Linux 2.0 only,
by Andrew Tridgell.  I've updated it to prevent an attacker from using a
hard link instead, by not allowing regular users to create hard links to
files they don't own, unless they could read and write the file (due to
group permissions).  This is usually the desired behavior anyway, since
otherwise users couldn't remove such links they've just created in a +t
directory (unfortunately, this is still possible for group-writable files)
and because of disk quotas.
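
The hard link rule can be approximated from user space as follows.  This
is only a sketch of the rule as worded above, not the kernel code: the
function name is hypothetical, root is assumed to be exempt (the text
speaks of regular users), and os.access() tests with the real rather than
the effective IDs.

```python
import os

def hard_link_would_be_allowed(path: str) -> bool:
    """Approximate the rule above for the calling user and an existing file."""
    uid = os.geteuid()
    if uid == 0:
        return True                  # assumption: the rule targets regular users
    if os.stat(path).st_uid == uid:
        return True                  # users may link to files they own
    # Otherwise the user must be able to both read and write the file.
    # Note: os.access() checks with the real UID/GID, not the effective ones.
    return os.access(path, os.R_OK | os.W_OK)
```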

Unfortunately, this may break existing applications.


	Restricted FIFOs in /tmp.

In addition to restricting links, you might also want to restrict writes
into untrusted FIFOs (named pipes), to make data spoofing attacks harder.
Enabling this option disallows writing into FIFOs not owned by the user in
+t directories, unless the owner is the same as that of the directory or
the FIFO is opened without the O_CREAT flag.
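
The conditions above can be written down as a small predicate.  This is
a model of the rule as described here, not the kernel code; the function
and parameter names are hypothetical.

```python
import os

def fifo_open_denied(opener_uid, fifo_uid, dir_uid, dir_is_sticky, flags):
    """Return True if the rule above would refuse this open of a FIFO."""
    writing = (flags & os.O_ACCMODE) in (os.O_WRONLY, os.O_RDWR)
    if not writing or not dir_is_sticky:
        return False                 # the rule covers writes in +t directories
    if fifo_uid == opener_uid:
        return False                 # the user's own FIFO
    if fifo_uid == dir_uid:
        return False                 # FIFO owned by the directory's owner
    return bool(flags & os.O_CREAT)  # refused only when O_CREAT is given
```

For example, with a +t directory owned by UID 0, user 1000 opening a FIFO
that belongs to user 1001 with O_WRONLY | O_CREAT is refused, while the
same open without O_CREAT goes through (the UIDs are made up).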


	Restricted /proc.

This was originally a patch by route that only changed the permissions on
some directories in /proc, so you had to be root to access them.  Then
there were similar patches by others.  I found them all quite unusable for
my purposes, on a system where I wanted several admins to be able to see
all the processes, etc, without having to su root (or use sudo) each time.
So I had to create my own patch that I include here.

This option restricts the permissions on /proc so that non-root users can
see their own processes only, and nothing about active network connections,
unless they're in a special group.  This group's id is specified via the
gid= mount option, and is 0 by default.  (Note: if you're using identd, you
will need to edit the inetd.conf line to run identd as this special group.)
Also, this disables dmesg(8) for the users.  You might want to use this
on an ISP shell server where privacy is an issue.  Note that these extra
restrictions can be trivially bypassed with physical access (without having
to reboot).

When using this part of the patch, most programs (ps, top, who) work as
desired -- they only show the processes of this user (unless root or in
the special group, or running with the relevant capabilities on 2.2+), and
don't complain they can't access others.  However, there's a known problem
with w(1) in recent versions of procps, so you should apply the included
patch to procps if this applies to you.
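
One way to check what a given user can see is to walk /proc and note
which process entries are readable.  A minimal sketch, assuming that a
restricted entry shows up as either missing or permission-denied (the
function name is hypothetical):

```python
import os

def accessible_processes(proc="/proc"):
    """Map PID -> owner UID for every process whose /proc entry we can read."""
    result = {}
    for name in os.listdir(proc):
        if not name.isdigit():
            continue                 # not a per-process directory
        path = os.path.join(proc, name)
        try:
            owner = os.stat(path).st_uid
            with open(os.path.join(path, "stat"), "rb") as f:
                f.read(1)            # make sure the contents are readable
        except OSError:
            continue                 # hidden, denied, or the process just exited
        result[int(name)] = owner
    return result
```

Run as a regular user outside the special group, this should list only
that user's own processes when the option is enabled.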


	Special handling of fd 0, 1, and 2 (Linux 2.0 and 2.2 only).

File descriptors 0, 1, and 2 have a special meaning for the C library and
lots of programs.  Thus, they're often referenced by number.  Still, it is
normally possible to execute a program with one or more of these fd's
closed, and any open(2) calls it might do will happily provide these fd
numbers.  The program (or the libraries it is linked with) will continue
using the fd's for their usual purposes, in reality accessing files the
program has just opened.  If such a program is installed SUID and/or SGID,
then we might have a security problem.
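
The fd reuse is easy to reproduce.  The sketch below (the function name
is hypothetical) closes fd 0 in a forked child and then opens a file; the
kernel hands out the lowest unused descriptor number, so the new file
becomes fd 0.  The child reports the number through a pipe so that the
caller's own descriptors stay untouched.

```python
import os

def fd_after_closing_stdin() -> int:
    """Fork a child, close fd 0 in it, and report which fd open(2) returns."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:                     # child
        try:
            os.close(r)
            try:
                os.close(0)
            except OSError:
                pass                 # fd 0 was already closed
            fd = os.open(os.devnull, os.O_RDONLY)
            os.write(w, str(fd).encode())
        finally:
            os._exit(0)              # never return into the caller's code
    os.close(w)                      # parent
    data = os.read(r, 16)
    os.close(r)
    os.waitpid(pid, 0)
    return int(data)
```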

Enable this option to ensure that fd's 0, 1, and 2 are always open on
startup of a SUID/SGID binary.  If any of the fd's is closed, "/dev/null"
will be opened for it (the device itself; you don't need to have /dev in
the filesystem for that to work, such as in a chroot).  This part of the
patch is by Pavel Kankovsky; I've only ported it to Linux 2.2 (any errors
are mine, of course).


	Enforce RLIMIT_NPROC on execve(2).

Linux lets you set a limit on how many processes a user can have, via a
setrlimit(2) call with RLIMIT_NPROC.  Unfortunately, this limit is only
looked at when a new process is created on fork(2).  If a process changes
its UID, it might exceed the limit for its new UID.

This is not a security issue by itself, as changing the UID is a privileged
operation.  However, there are privileged programs that want to switch to a
user's context, including setting up some resource limits.  The only fork(2)
required (if at all) is done before switching the UID, and thus doesn't
result in a check against RLIMIT_NPROC.

Enable this option to enforce RLIMIT_NPROC on execve(2) calls.  (The Linux
2.0 version of this patch only checks the limit for processes that have
their "dumpable" flag reset, such as due to a UID change, to reduce the
performance impact.)

Note that there's at least one good reason I am not enforcing the limit
right after setuid(2) calls: some programs don't expect setuid(2) to fail
when running as root.
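
The sequence in question looks roughly like this in a login-like program.
It's a sketch only: the function name and parameters are hypothetical,
supplementary groups and error reporting are left out, and switching to
another user's IDs only works when run as root.

```python
import os
import resource

def start_as_user(uid, gid, nproc_limit, argv):
    """Run argv as uid/gid with RLIMIT_NPROC set; return its exit status."""
    # Per the text above, fork(2) is where the limit is looked at, and
    # at this point the process still runs under the caller's UID.
    pid = os.fork()
    if pid == 0:                     # child
        try:
            resource.setrlimit(resource.RLIMIT_NPROC,
                               (nproc_limit, nproc_limit))
            os.setgid(gid)
            os.setuid(uid)           # from here on we count against uid
            os.execv(argv[0], argv)  # checked only with this option enabled
        finally:
            os._exit(127)            # reached only if something above failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```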


	Destroy shared memory segments not in use.

Linux lets you set resource limits, including on how much memory a process
can consume, via setrlimit(2).  Unfortunately, shared memory segments are
allowed to exist without association with any process, and thus might not
be counted against any resource limits.

This option automatically destroys shared memory segments when their attach
count becomes zero after a detach or a process termination.  It will also
destroy segments that were created, but never attached to, on exit from the
process.  (In case you're curious, the only use left for IPC_RMID is to
immediately destroy an unattached segment.)
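
Whether a segment outlives its last detach can be checked with a short
test.  The sketch below calls the C library through ctypes; the numeric
IPC_* constants are the Linux values, and the 512-byte buffer standing in
for struct shmid_ds is an assumption that should be generous on common
architectures.  The function name is hypothetical.

```python
import ctypes

# Linux values from <sys/ipc.h>, hard-coded here as an assumption.
IPC_PRIVATE, IPC_CREAT, IPC_RMID, IPC_STAT = 0, 0o1000, 0, 2

libc = ctypes.CDLL(None, use_errno=True)
libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
libc.shmget.restype = ctypes.c_int
libc.shmat.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_int]
libc.shmat.restype = ctypes.c_void_p
libc.shmdt.argtypes = [ctypes.c_void_p]
libc.shmdt.restype = ctypes.c_int
libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
libc.shmctl.restype = ctypes.c_int

def segment_survives_detach() -> bool:
    """Create, attach and detach a segment; report whether it still exists."""
    shmid = libc.shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0o600)
    if shmid < 0:
        raise OSError(ctypes.get_errno(), "shmget failed")
    addr = libc.shmat(shmid, None, 0)
    if addr is None or addr == ctypes.c_void_p(-1).value:
        err = ctypes.get_errno()
        libc.shmctl(shmid, IPC_RMID, None)
        raise OSError(err, "shmat failed")
    libc.shmdt(addr)                        # attach count drops to zero
    buf = ctypes.create_string_buffer(512)  # stands in for struct shmid_ds
    survived = libc.shmctl(shmid, IPC_STAT, buf) == 0
    if survived:
        libc.shmctl(shmid, IPC_RMID, None)  # clean up after ourselves
    return survived
```

On a stock kernel this is expected to return True; with the option
enabled, the segment should be gone after the detach and the result
should be False.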

Of course, this breaks the way things are defined, so some applications
might stop working.  In particular, expect most commercial databases to
break.  Apache and PostgreSQL are known to work, though. :-)

Note that this feature will do you no good unless you also configure your
resource limits (in particular, RLIMIT_AS and RLIMIT_NPROC).  Most systems
don't need this.


	Privileged IP aliases (Linux 2.0 only).

It is sometimes desirable not to let regular users put their services on
some of the IP addresses configured on the system.  For example, this is
the case when providing web hosting services with shell and/or CGI access,
so that one user can't abuse the other domains hosted on the same system.

When this option is enabled, only root can bind sockets to addresses of
privileged aliased interfaces: those with slot numbers of the first half
of the allowed range.  The default limit is also expanded to 2048 aliases,
so that the familiar slot numbers of 0 to 1023 become privileged.


	How to install.

Make sure you have the original kernel sources (as can be obtained from
ftp.kernel.org) installed in /usr/src/linux.  Apply the patch:

	cd /usr/src/linux
	patch -p1 < PATCH-FILE

where PATCH-FILE is the full path and name of the linux-*-ow*.diff file.

In kernel configuration, go to the new 'Security options' section.  Read
the help for the suboptions, and configure them.

If desired, edit /etc/fstab to specify the group id for accessing /proc.
Also, make sure you have no extra procfs mount commands in the startup
scripts, as these might override your fstab settings; this is the case for
some distributions, including Red Hat.  (Note that you won't be able to
specify the GID by remounting /proc on a running system.  This is because
filesystem-specific options are not supported at that stage.)

Build the kernel and reboot.

You may also want to add the following line to your /etc/syslog.conf to
log [security] alerts separately:

	kern.alert				/var/log/alert

Additionally, you may do something like this (assuming the log file will
be empty most of the time):

	> /var/log/alert
	chown root:staff /var/log/alert
	chmod 640 /var/log/alert
	echo "less -XEU /var/log/alert" >> ~non-root/.bash_profile

Ensure that the non-executable stack part of the patch is working, using
stacktest.c for that purpose -- running './stacktest -e' should segfault,
and a message should get logged to /var/log/alert (if you've followed the
syslogd configuration described above).  If you've enabled the support for
GCC trampolines, try running './stacktest -t'; it should succeed.  If you
have trampoline call emulation enabled on Linux 2.0, you should also try
'./stacktest -b'; the simulated exploit attempt should fail even after a
trampoline call in the same process has succeeded.

If you enabled the link-in-+t restriction, you can also try to create a
symlink in /tmp (as a non-root user) pointing to a file that user has no
read access to, then switch to some other user that has the read access
(for example, root) and try to read the file via the link (such as with
'cat /tmp/link').  This should fail, and a message should get logged.

Now, you can try to create a hard link as a non-root user to a file that
user doesn't own.  This should also fail.

Be sure to check the FAQ.

-- 
Solar Designer <solar at openwall.com>
