Hardening A Linux Kernel
Table of Contents
- 1. Overview
- 2. Hardening Ideas
- 3. Security Enhancements in the Linux Kernel
- 3.1. POSIX capabilities
- 3.2. Linux Namespaces
- 3.3. Hardening the System Calls
- 3.4. Seccomp (Secure Computing Mode)
- 3.5. Control Groups (cgroups)
- 3.6. Linux Security Modules (LSM)
- 3.7. Chroot Restrictions
- 3.8. Grsecurity
- 3.9. PaX Address Space Protection
- 3.10. Buffer Overflow Attack Prevention
- 3.11. Linux Security Modules (LSM)
- 3.12. Secure Linux containers
- 3.13. Linux Memory Forensics
- 3.14. Intrusion Detection/ Prevention
- 4. Kernel Security Modules
- 5. References
- 6. End
1 Overview
- This article is about hardening a Linux kernel. Includes applying patches to fix kernel bugs, and design + implementation improvements.
- Hardening a system == Harden the kernel + Harden Sys Programs
- Related: Build a Kernel (as-is or after hardening)
- We distinguish hardening a system from proper configuration and fortification. In this lecture, we describe areas of tightening security during the design and construction of systems rather than after their deployment. This distinction is not common yet. There exist several "hardening" scripts, e.g., Bastille, that just help do proper configuration.
- There are "secure operating systems" and "trusted operating systems." Alas, no one can offer technically rigorous definitions for these terms. Windows NT, and several of Unix derivatives claimed, over the years, to be secure and trusted. For example, we found the following quote, early 2000s, "Windows NT is a secure operating system but only if it's configured correctly." Nevertheless, operating systems that lay claim to either being secure or trusted are better designed and engineered from their inception with a concern for security.
1.1 Add/ Delete/ Freeze System Calls
- It is not difficult to add system calls to the kernel.
- Using the Linux Kernel Modules mechanisms, Add/ Delete/ Freeze System Calls can be done while the kernel is running.
- TBD Link Lecture The Lab Lx TBD link gives you practical experience of doing this.
1.2 Awareness of Linux Kernel Exploits
- Lecture on ../../KernelExploits/
- BackDoors
- SymLinkAttack
- RaceConditions
- Viruses
- privilege-escalation.org
- RootKits
1.3 Hardening of System Programs
- [List is not exhaustive; traditional, not minimalistic, view.]
- all of
/sbin
,/usr/sbin
(can be pruned heavily) - all suid root programs. Recall the find-based script.
1.4 Scope of Kernel Hardening in this Course
- We should be able to apply the tools of Development of Software without Security Holes.
- We should be able to gather the patches delivered by experts since the last such harvest.
- We should be able to patch the publicly released kernel, and build a new hardened kernel. We select the patches from the above harvest. We may omit some. The reasons for both select/ omit are documented.
- We are not expecting to discover new bugs that are worthy of being flashed across the world tech news sites.
- Limited by the time and lectures we can devote to this topic.
2 Hardening Ideas
The space of hardening a system is vast. It includes new ideas yet to be implemented in widely used OS (e.g., Windows, and Linux). It includes re-designing and re-implementing existing ideas such as "change-roots". It includes analyzing the source code of an OS extremely carefully by experts as well as via software tools based on mathematical proof techniques. The next few sections introduce these ideas further.
2.1 Careful Recompilation
Stack smashing (buffer overflow) attacks are among the most common. By and large, these are programming errors that can be caught by analytical techniques. Newer compilers are mechanizing these techniques. See the Buffer Overflow Attacks and Development of Software without Secure Holes articles.
2.2 Least Privilege
On Linux systems, the user called root or super user, with user-id 0, can bypass (nearly) all security restrictions. Windows systems have the "System" and "Administrator" accounts. The super user privilege should be split into many smaller privileges. E.g., a backup process should be able to read any file, but it should not be able to shut down the system, modify files, or send network packets. Processes, not users, should be given privileges. The backup process should be permitted to read all files, even though the root user who invokes the program should not be allowed such access. The backup process should not be able to pass its privileges down to processes that it starts. The use of such finely divided abilities instead of sweeping powerful mechanisms is called the least privilege principle.
The traditional Unix model allows for access control of only files. So, a number of resources become "files": processes, hard disks, network interfaces, etc. In order to apply the principle of least privilege, we also need to divide the resources into finer units (often called objects, but unrelated to OOP). The users and processes are called subjects.
2.3 Capabilities
Capabilities is a word used with different meanings in the context of OS design. In OS research literature, processes hold tokens, called capabilities, denoting resources to be accessed and what can be done with them. Capabilities can be explicitly passed among processes. Linux, and Windows are not capability based in this sense. This usage of the word is unrelated to "POSIX capabilities" which are implemented in Linux and described later.
2.4 Mandatory Access Control
An OS can control access by attaching sensitivity labels (SLs) to objects (files, processes, network interfaces, packets, and so on). E.g., incoming network packets can be assigned SLs, based on the source IP address or the network interface. Outgoing packets can have the label of the process that created them. A filtering rule can then be formulated so that packets can be dropped if the SL does not satisfy some conditions. When inheritance of privileges is not assumed, this is known as mandatory access control.
2.5 Role-Based Access Control
An OS can divide the privileges based on the function ("role") they have, such as backup, file system integrity check, filtration of incoming packets. Each user is permitted a collection of roles. RBAC can implement MAC. There is a considerable amount of discrete mathematics developed for RBAC and MAC.
2.6 Source Code Review
Source code review, both by human experts and automated software tools based on mathematical proof techniques, can reveal vulnerabilities. See the Secure Coding article.
2.7 Check Thyself
We must always assume that there can be (many unknown) ways of corrupting a kernel, running processes and loaded libraries. So kernel should include an integrity checking system which would check the integrity of kernel, while running, using crypto algorithms.
2.8 Mutual authentication: init v kernel
In spite of file system audits, suppose we have a rogue kernel that
was loaded through the OS loader (such as Grub). How can we detect?
Similarly, suppose the /sbin/init
was replaced. The conceptual
answer is to mutually authenticate using MD5 and SHA1 sums.
2.9 Scope of ptrace(2)
Through the ptrace(2)
system call one process (the "tracer") may
observe and control the execution of another process (the "tracee"),
and examine and change the tracee's memory and registers. It is
primarily used to implement breakpoint debugging and system call
tracing. A single user is able to examine the memory and running
state of any of their own processes. By compromising one
application process, an attacker can attach to other running processes
(a web browser e.g.) to extract credentials and continue to expand the
scope of their attack without resorting to user-assisted phishing.
Since ptrace is not used by non-developers and non-admins, system
builders should disable this debugging system on "normally" deployed
systems.
Some applications (e.g., ssh-agent) use prctl(PR-SET-DUMPABLE, ...)
to
specifically disallow such ptrace attachment ), but many
do not. A more general solution is to only allow ptrace directly from a
parent to a child process (i.e. direct "gdb EXE" and "strace EXE" still
work), or with CAP-SYS-PTRACE (i.e. "gdb –pid=PID", and "strace -p PID"
still work as root).
2.10 Trusted OS Loader
TrustedGRUB http://projects.sirrix.com/trustedgrub extends the GRUB bootloader with TCG support. This makes it possible to provide a secure bootstrap architecture. The whole boot process is measured and - by support of a Trusted Platform Module (TPM) - the system integrity is verifyable.
3 Security Enhancements in the Linux Kernel
Whereas the previous section described hardening ideas in general, this section is a summary of security enhancements of the Linux kernel that have occurred over the years. Most of these are now (2013) part of the officially released Linux kernel source code tree.
Many groups offer open source patches to Linux kernel prevent various attacks. Each patch has its own limitations and side effects. Patches released in binary form should in general be not trusted. Linux patches are source code. These replace section(s) of code in the kernel source code tree. Often a patch is in response to a newly discovered security hole. There are proactive modifications also. Open Wall Linux (http://www.openwall.com/Owl/), e.g., is a collection of patches for non-executable stack, temporary file race condition prevention, restricted proc file system, special handling of file descriptors 0, 1, 2, destroy shared memory segments not in use, enforce RLIMITNPROC on execve, and privileged IP aliases.
3.1 POSIX capabilities
POSIX capabilities (Pcaps) can turn a setuid-root file into a file with minimum privileges, run a daemon with uid=0 but with amost no superuser privileges, etc. Privileges are granted to processes instead of users. The table below presents Pcaps for a few typical suid-root binaries.
ping | CAP-NET-RAW (13) |
traceroute | CAP-NET-RAW (13) |
chsh | CAP-CHOWN (0), CAP-DAC-READ-SEARCH (2), CAP-FSETID (4), CAP-SETUID (7) |
chfn | CAP-CHOWN (0), CAP-DAC-READ-SEARCH (2), CAP-FSETID (4), CAP-SETUID (7) |
chage | CAP-DAC-READ-SEARCH (2) |
passwd | CAP-CHOWN (0), CAP-DAC-OVERRIDE (1), CAP-FOWNER (3) |
mount | CAP-DAC-OVERRIDE (1), CAP-SYS-ADMIN (21) |
umount | CAP-DAC-OVERRIDE (1), CAP-SYS-ADMIN (21) |
Pcaps are implemented in Linux kernels since 2.6.x; capsh, getpcaps,
getcap, setcap
are some of the tools. E.g., to reduce the privileges
of the nomrallu suid-ed chsh, run chmod u-s /usr/bin/chsh; setcap
0,2,4,7=ep /usr/bin/chsh
3.2 Linux Namespaces
A namespace of a process is a collection of "names" associated with
processes and pids, files, volumes, mount table, the network stack
(ports, sockets, interfaces), etc. A single call unshare(2)
creates a
new namespace for the current process (see also man unshare
).
The following namespaces are available.
mount namespace
mounting and unmounting filesystems will not affect rest of the systemUTS namespace
setting hostname, domainname will not affect rest of the systemIPC namespace
process will have independent namespace for System message queues, semaphore sets and shared memory segmentsnetwork namespace
process will have independent IPv4 and IPv6 stacks, IP routing tables, firewall rules, the /proc/net and /sys/class/net directory trees, sock‐ ets etc.
As an example use of namespaces, consider routing. The set of network interfaces and routing tables are shared across the entire OS and all processes. With network namespaces, different and separate instances of "the network" can be made.
The unshare(1)
command
starts a child process with the mount, UTS, IPC or network namespaces
"unshared" from its parent. The systemd
uses mount namespaces for the
ReadWriteDirectories, ReadOnlyDirectories or InaccessibleDirectories
unit configuration options, and for systemd-nspawn.
3.3 Hardening the System Calls
REMUS (REference Monitor for UNIX Systems [Bernaschi et al. 2002]) is a kernel patch which intercepts system calls without requiring changes to syntax and semantics of existing system calls. REMUS presents a complete classification of the system calls according to the level of threat.
3.4 Seccomp (Secure Computing Mode)
- Sandboxing mechanism in the kernel
- After a process starts, a one-way transition into a state
where no system calls except
exit, sigreturn, read, write
andclose
are permitted. - Attempts to other system calls, will
SIGKILL
the process. - Process enters
seccom
viaprctl()
system call - Programs using this: OpenSSH, vsftpd, Chrome, …
3.5 Control Groups (cgroups)
A cgroup is a collection of processes that are bound by the same
criteria to limit, police and account the resource usage Compared to
the nice
prefix command or /etc/security/limits.conf
, cgroups are
more flexible. The kernel source tree has
Documentation/cgroups/cgroups.txt
3.6 Linux Security Modules (LSM)
See the section below.
The Linux Kernel Security Module (LSM) is a kernel framework that enables many different access control models as loadable kernel modules. Currently (2013), the Linux kernel source tree has AppArmor, SELinux, SMACK, TOMOYO, Yama, and Unix DAC (Discretionary Access Controls). LSM may become stackable in future.
AppArmor associates assigns a security profile to each program that
restricts the capabilities of that program. It supplements the
traditional discretionary access control (DAC) model with mandatory
access control (MAC). Ubuntu uses apparmor by default, and the
profiles are located in /etc/apparmor*
SELinux (Security-Enhanced Linux) is a contribution by the National Security Agency. It restricts the actions that programs can take. AppArmor identifies file system objects by path name instead of inode. This means that, for example, a file that is inaccessible may become accessible under AppArmor when a hard link is created to it, while SELinux would deny access through the newly created hard link. On the other hand, data that is inaccessible in SELinux may become accessible when applications update the file by replacing it with a new version, a frequently used technique, while AppArmor would continue to deny access to the data. In both cases, a default policy of "no access" avoids the problem.
Smack consists of three components: a MAC LSM, a startup script that ensures that device files have the correct Smack attributes and loads the Smack configuration, and a set of patches to the GNU Core Utilities package to make it aware of Smack extended file attributes. Smack was/is used in the mobile OSs named MeeGo and Tizen.
TOMOYO is a name-based MAC LSM, as opposed to inode based security. "Every process is created to achieve a purpose, and like an immigration officer, TOMOYO Linux allows each process to declare behaviours and resources needed to achieve their purpose. When protection is enabled, TOMOYO Linux acts like an operation watchdog, restricting each process to only the behaviours and resources allowed by the administrator." [from TOMOYO's web site]
Yama extends DAC support with additional system-wide security settings. Currently available is ptrace scope restriction. Further information can be found in Documentation/security/Yama.txt.
3.7 Chroot Restrictions
Every process has a current working directory that it begins with and a root directory, which is used to resolve the absolute path names of files. By default, the root directory of a process is /. Chroot system call changes the directory that is considered the root of a process. All subsequent absolute path names of a file are resolved with respect to the new root. The process cannot access files that are outside of the tree rooted at the new root directory, even in the presence of hard or symbolic links. Such a process is said to be in a chroot jail. Server daemons, such as anonymous FTP server, and web server, where the processes need only access to a limited sub tree, are run inside a chroot jail for security reasons. Unfortunately, weaknesses exist, and a jailed super user process can break out of it. Linux chroot restricts only the real (e.g., persistent storage media) file system access of the processes in the jail. Using interprocess communication mechanisms such as domain sockets, shared memory segments, and signals, a jailed process can damage the rest of the system.
By exploiting chroot, chdir, fchdir
system calls, an attacker with
root privileges can break chroot jail. None of the three system calls
check to make sure that current working directory (cwd) is within the
root directory of the process. When a process calls chroot, the root
directory of the process is changed but cwd is left unchanged. If
process has a directory open, which is outside the root directory, it
can call fchdir to that directory and the cwd of the process changes
to that directory. Once the cwd goes out of the root directory of the
process, the process is successful in breaking the chroot jail.
3.8 Grsecurity
Grsecurity [Spender 2003] uses the least privilege principle. Some of the features of Grsecurity are Trusted Path Execution, Process-based Mandatory Access Control, Access control lists, chroot restrictions, randomizing PIDs, IP IDs, TCP initial sequence numbers, and FIFO restrictions.
Traditionally, a "trusted path" is one where the parent directory is owned by root and is neither group nor others writable. A file is said to be in the trusted path only if the directory of the file is owned by root and it has neither group nor others writable permissions. TPE works based on an internal list of trusted user ids. If a given user tries to execute a file not in the Trusted Path, and their user id is not in the kernels trusted list, they are denied execution privileges. This is known as Trusted Path Execution.
The RBAC Mandatory Access Control system of grsecurity was the inspiration for SELinux and AppArmor. Grsecurity is "coupled" with PaX in how its source code is distributed.
3.9 PaX Address Space Protection
PaX invented ASLR. PaX patches provide: Segmentation-based
implementation of non-executable pages; Mprotect restrictions prevent
new code from entering a task; Randomization of stack and mmap base;
Randomization of heap base; Randomization of executable base;
Randomization of kernel stack; Automatically emulate sigreturn
trampolines; No ELF .text relocations; No kernel modification via
/dev/mem, /dev/kmem, or /dev/port; Option to disable use of raw I/O;
Removal of addresses from /proc/*/maps
and /proc/*/stat
.
3.10 Buffer Overflow Attack Prevention
There have been patches including Open Wall Linux patch, Segmented- PAX [Team 2003], KNOX [Purczynski 2003a], RSX module [Starzetz 2003], Paging-PAX, and Exec shield. All these source code patches aim to prevent stack and heap execution at kernel level by using either segmentation logic or paging logic or both. See the Buffer Overflow article also.
3.11 Linux Security Modules (LSM)
- AppArmor confines individual programs to a set of listed files and posix 1003.1e draft capabilities.
- AppArmor: Name-based Access Controls
- http://sourceforge.net/projects/realtime-lsm/ The Realtime Linux Security Module (LSM) selectively grants realtime permissions to specific user groups or applications.
- Enforcer Linux Security Module (LSM) The Enforcer is a Linux Security Module designed to improve integrity of a computer running Linux by ensuring no tampering of the filesystem. It can interact with TCPA hardware to provide higher levels of assurance for software and sensitive data. http://enforcer.sourceforge.net/
3.12 Secure Linux containers
3.12.1 What is a Container?
- Lightweight virtual OSs running inside Linux
- Not a virtual machine like VirtualBox or VMware
- A container is a group of processes in a "box"
- Inside the box, it looks like a VM.
- Outside the box, it looks like normal processes.
- "chroot on steroids"
- Process isolation
- Name space isolation
- What is a Hypervisor?
- Example container software: LXC, Docker, OpenVZ.org
3.12.2 LXC on Ubuntu
- https://help.ubuntu.com/lts/serverguide/lxc.html
# apt-get install lxc
- KVM is a virtual machine running on Linux kernel. Relies on assistance from the CPU . Uses paravirtualization to reduce overhead.
- LXC v Xen: Both are light weight virtual OS, not VM
3.13 Linux Memory Forensics
% ls -l /proc/sys/vm
- Keep kernel details confidential?
% ls -l /boot
-rw-r--r-- 1 root root 1007311 Oct 2 19:19 abi-3.11.0-11-lowlatency -rw-r--r-- 1 root root 163504 Oct 2 19:19 config-3.11.0-11-lowlatency -rw-r--r-- 1 root root 26228945 Oct 17 23:33 initrd.img-3.11.0-11-lowlatency -rw------- 1 root root 3310511 Oct 2 19:19 System.map-3.11.0-11-lowlatency -rw------- 1 root root 5674032 Oct 2 19:19 vmlinuz-3.11.0-11-lowlatency
3.14 Intrusion Detection/ Prevention
No matter what design enhancements have been made, we should be prepared for intrusion, and hence must have OS functionality that can detect things.
Linux Intrusion Detection System (LIDS), is a a series of kernel patches that enable loadable module and mount point locking. Its focus is on Access Control Lists. LIDS features include enhancements to Linux capabilities, protecting important files, protecting Raw I/O devices, protecting important processes, and port scan detector at the kernel level.
Snort is an open source network intrusion prevention and detection system (IDS/IPS) combining signature, protocol, and anomaly-based inspection. "Snort can perform protocol analysis and content searching/matching. It can be used to detect a variety of attacks and probes, such as buffer overflows, stealth port scans, CGI attacks, SMB probes, OS fingerprinting attempts, and much more. It uses a flexible rules language to describe traffic that it should collect or pass, as well as a detection engine that utilizes a modular plug-in architecture. Snort has a real-time alerting capability as well, incorporating alerting mechanisms for syslog, a user specified file, a UNIX socket, or WinPopup messages to Windows clients. Snort has three primary uses: a straight packet sniffer like tcpdump, a packet logger (useful for network traffic debugging, etc), or a full-blown network intrusion prevention system." [from http://www.snort.org/]
The snort can easily belong in Fortification.
4 Kernel Security Modules
- The Linux Kernel Security Module (LKSM, aka LSM) mechanism permits new kernel extensions.
- These extensions are not actually loadable kernel modules. Instead, they are selectable at build-time via CONFIGDEFAULTSECURITY and can be overridden at boot-time via the "security=…" kernel command line argument
- The primary users of the LSM interface are Mandatory Access Control (MAC) extensions.
cat /sys/kernel/security/lsm
A list of the active security modules in the kernel now in use.The following grep results are from an old build of Ubuntu standard kernel:
# grep CONFIG_DEFAULT_SECURITY /boot/config-4.18.0-10-generic # CONFIG_DEFAULT_SECURITY_SELINUX is not set # CONFIG_DEFAULT_SECURITY_SMACK is not set # CONFIG_DEFAULT_SECURITY_TOMOYO is not set CONFIG_DEFAULT_SECURITY_APPARMOR=y # CONFIG_DEFAULT_SECURITY_DAC is not set CONFIG_DEFAULT_SECURITY="apparmor"
- The following grep results ./grep-security.txt are from a recent
build
/boot/config-5.4.0-050400rc2-generic
of Ubuntu standard kernel
5 References
- https://www.kernel.org/doc/html/v4.19/admin-guide/LSM/index.html Linux Security Module Usage 2018
- https://www.kernel.org/doc/html/v4.19/admin-guide/security-bugs.html Kernel Bug reporting, and Bug hunting. 2018