How Morello prevents buffer over-reads: the example of libjpeg-turbo CVE-2018-19664

Arm

Memory Protection and Heap-based Exploits

Memory protection is one of the central considerations in the design of modern CPU architectures. General-purpose systems need methods for isolating a program so that it cannot access the memory of another program, cannot access the operating system in order to directly control the computer’s resources, and cannot gain uncontrolled access to other programs via the OS.

Modern CPU architectures could be viewed as somewhat similar in their approaches to memory safety: an architecture may offer virtual memory by specifying an MMU with permission checks during address translation and include broader support for privilege levels in the hardware, alongside dedicated instructions for issuing system calls. These techniques have developed over time in parallel with improvements to performance, and modern systems share various features in their approaches to memory protection.

However, there are some limitations to the MMU and privilege level-based memory protection approaches. Memory safety issues continue to be one of the most common areas (if not the most common) that require attention (e.g. according to the MITRE CWE Team, out-of-bounds read and write vulnerabilities are among the 4 most dangerous security weaknesses in 2020 1), and memory safety remains a significant topic in computer security. Software patches may fix certain problems in libraries, native programs or operating systems, which could otherwise be triggered in specific ways with potentially harmful effects.

A type of memory safety violation that involves the heap would be a buffer over-read. These are frequently used to see past the boundaries of one allocation region into an adjacent one (or its metadata) and read from the target locations. For instance, this would be problematic in cases where data is returned from one computer program to another, whereby an attacker issues an altered request to the server/process in order to somehow break or bypass the internal checks on the requested data size, and to reveal information contained in the private data structures. This would be possible if the over-read enters an adjacent allocation that contains information internal to the program.
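The request-size scenario described above can be sketched in portable C. This is a minimal model, not code from any real server: the `arena` layout, the function names and the "private-key" string are all hypothetical, and a single flat array stands in for two adjacent heap allocations so that the sketch itself stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: a flat arena stands in for the heap and holds two
 * logically separate allocations that happen to be adjacent. */
enum { PAYLOAD_SIZE = 16, SECRET_SIZE = 16, ARENA_SIZE = PAYLOAD_SIZE + SECRET_SIZE };

struct arena {
    unsigned char bytes[ARENA_SIZE]; /* [0,16): request payload, [16,32): private data */
};

static void arena_init(struct arena *a)
{
    assert(a != NULL);
    memset(a->bytes, 0, ARENA_SIZE);
    memcpy(a->bytes, "hello", 5);
    memcpy(a->bytes + PAYLOAD_SIZE, "private-key", 11);
}

/* Buggy reply routine: trusts the length claimed by the requester. */
static size_t echo_unchecked(const struct arena *a, size_t claimed_len,
                             unsigned char *out, size_t out_size)
{
    size_t n = claimed_len;
    assert(a != NULL && out != NULL);
    if (n > out_size) n = out_size;
    if (n > ARENA_SIZE) n = ARENA_SIZE; /* keeps this model itself well defined */
    memcpy(out, a->bytes, n);           /* may run past the payload into the secret */
    return n;
}

/* Fixed reply routine: never returns more than the payload allocation holds. */
static size_t echo_checked(const struct arena *a, size_t claimed_len,
                           unsigned char *out, size_t out_size)
{
    size_t n = claimed_len;
    assert(a != NULL && out != NULL);
    if (n > PAYLOAD_SIZE) n = PAYLOAD_SIZE;
    if (n > out_size) n = out_size;
    memcpy(out, a->bytes, n);
    return n;
}
```

Calling `echo_unchecked` with a claimed length of 32 copies the private bytes into the reply, whereas `echo_checked` stops at the end of the payload. Both regions are readable from the MMU's point of view, so nothing in a conventional system would flag the first call.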

As part of MMU-based protection it is also common for executable files to be split into multiple segments (text, data, stack, heap, etc.), each with an associated set of permissions that are propagated by the loader subroutine in the OS when it creates the virtual-to-physical mapping for the process and initializes its page tables. This has worked well to protect against a wide variety of exploits, including exploits that are based on heap-buffer overflows, where a program attempts to execute from the heap after being manipulated to write a payload to a certain location, as the hardware would have marked the region as non-executable and an exception would occur upon the attempt to execute the maliciously placed code.

However, more generally, heap-based read and write vulnerabilities are often undetected – for example, an instruction may be able to violate memory safety if it reads past one region into another if they are the same from the point of view of the MMU (the process has read permissions for both pages), so a pointer to either of the regions could be used to access data in the other where they are supposed to be accessed independently. As the heap will usually also be modifiable, another form of heap-based attack could also utilize the write permission and corrupt the target instead - e.g. by overwriting a vtable pointer on the heap to reference the location of a payload or section of code in the program known to be advantageous to the attacker.

Heap-based vulnerabilities remain and continue to provide opportunities for attacks that often cannot be efficiently detected (nor prevented) by existing systems. It is a similar story for other forms of attacks related to memory safety, e.g. those targeting the stack and other methods. Some programming languages may provide features that support extra checks at compilation or runtime to support memory safety, while languages that are focused on performance and not considered inherently memory-safe (such as C/C++) may use features included in the architectures that provide additional memory safety. There are also software solutions in the form of memory error detection tools available for integration with codebases at build-time (this may be enabled by default for system components). However, these methods are often probabilistic or limited in other ways.

Memory Safety and Morello

Morello is a research program investigating alternative approaches to memory safety. It extends an Armv8.2-A system with additional security features that apply throughout the system. The Morello prototype architecture takes principles from the CHERI ISA, which offers extra security advantages by providing certain features that can be added to existing ecosystems 2 3. Arm has developed the Morello prototype architecture based on the CHERI ISA in collaboration with the University of Cambridge. As part of this, Arm is contributing a software stack for Morello, providing ports that utilize these new features, with Android/Morello as its OS component.

CHERI-based architectures combine conventional memory-protection mechanisms with a concept known as a capability, which is a revised pointer integrated with memory protection-related information. Conceptually, capabilities are considered tokens of authority that cannot be forged by software 4. A capability in physical memory is an extension to a pointer that also includes context about how it can be used, with information about the ensuing memory address dereference such as the range of addresses it can access, permissions that determine which CPU instructions can use it as operands, among other scope-related details that greatly help with memory safety. These are all hardware-enforced runtime checks on a capability’s properties against the conditions at each dereference, and since they can only be propagated from software locations of greater privilege, the capability model increases the enforcement of scope across the full stack.

The Morello capability format is illustrated below 5.

[Figure: Morello capability format]

Among other properties, the bounds in a capability restrict the range of addresses that can be used as targets of dereferences. In this way, the protection against out-of-bounds accesses is increased via the capability as the out-of-bounds access attempt would raise an exception at runtime.
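The effect of the bounds check can be approximated in software. The sketch below is a simplified model only: the `cap_model` struct and `cap_access_ok` function are hypothetical names, the fields are stored uncompressed rather than in the Morello encoding, and the limit is assumed to be exclusive (the first address past the accessible range).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified software model of a capability; not the Morello encoding. */
struct cap_model {
    uint64_t base;  /* lowest address that may be accessed */
    uint64_t limit; /* first address past the accessible range (assumed exclusive) */
    uint64_t addr;  /* current address, i.e. the pointer value */
    bool tag;       /* validity tag; an untagged capability cannot be dereferenced */
};

/* Returns true when an access of `size` bytes at c->addr stays within bounds.
 * On real hardware a failing check raises an exception instead. */
static bool cap_access_ok(const struct cap_model *c, uint64_t size)
{
    assert(c != NULL);
    if (!c->tag) return false;
    if (c->addr < c->base || c->addr > c->limit) return false;
    return size <= c->limit - c->addr; /* written this way to avoid overflow */
}
```

In this model an access is rejected as soon as any of its bytes falls at or beyond the limit, even when the starting address is still inside the bounds.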

The bounds are encoded in a Morello capability using a scheme that incorporates the address with metadata that enables the base (lower bound) and limit (upper bound) to be determined 6.

[Figure: Morello bounds encoding]

A capability’s bounds can be retrieved directly, and setting new bounds or a new address is a valid operation provided that the new combination of bounds and address is representable. A capability cannot be used to dereference an address that falls outside of its bounds - an out-of-bounds access attempt would result in an exception upon the dereference.

Out-of-bounds accesses are detected at runtime. However, a capability can also be rendered invalid beforehand if its address is set to a value that is too far outside the bounds: this clears its tag bit, so any later dereference through the capability results in an exception, even when the access targets data within the bounds of the region. The representable region describes the range of addresses located outside the bounds that can be set without clearing the capability’s tag bit (25% above the upper bound or 12.5% below the lower bound 7). Deferring the detection of an out-of-bounds address to the dereference supports cases in commonly used code and programming language implementations where pointers (capabilities) are updated to target locations slightly outside the bounds but no dereference occurs (e.g. a C for-loop may iterate a pointer over an array, and the pointer falls outside the bounds when it is checked after the final iteration 8).

[Figure: representable region around a capability’s bounds]
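The C for-loop case mentioned above looks like the following in standard C. The function name `sum_ints` is only illustrative. The point is that the pointer ends up one element past the array, which is outside the bounds of the allocation, but it is only compared at that position and never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints. The loop pointer finishes equal to `end`, one past the last
 * element, but is never dereferenced at that position. */
static long sum_ints(const int *values, size_t n)
{
    const int *end;
    long total = 0;
    assert(values != NULL);
    end = values + n; /* one-past-the-end: legal to form and compare in C */
    for (const int *p = values; p != end; ++p) {
        total += *p;  /* dereference happens only while p is inside the array */
    }
    return total;
}
```

Because the out-of-bounds check is deferred to the dereference, a loop of this shape would be expected to run unchanged with capabilities, as the final pointer value stays well inside the representable region.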

In the Morello Pure-cap ABI, all pointers are replaced with capabilities, memory-referencing instructions are repurposed to use capability registers instead of general-purpose registers as base operands, and there are also dedicated instructions for operating on capabilities.

On Morello, porting a component may require modifications to ensure it supports the Pure-cap ABI. For instance, hardcoded values that are used to compute memory addresses would need to be adjusted in order to account for changes in data structure layout that are due to the increased pointer size. It would also be important to disambiguate between data types used for memory references (capabilities) and addresses or offset values, so that either capability or general-purpose registers are used accordingly. This is also in accordance with the CHERI principle of provenance validity, as valid capabilities can only be derived from other valid capabilities 9 - the Morello compiler ensures there are no statements attempting to create new valid capabilities from the address values alone, which is not possible in Morello user-space. More information about the porting steps can be found in the Android/Morello documentation 10.
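The two porting concerns above can be illustrated with a small portable sketch. The `node` struct and the helper names are hypothetical and are not taken from any ported component. The sketch computes a field offset with `offsetof` instead of a hardcoded constant, and carries a pointer through `uintptr_t` rather than a plain integer type, following the CHERI C guidance that `uintptr_t` is the integer type able to hold a capability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct node {
    struct node *next; /* 8 bytes on a typical 64-bit ABI; wider when pointers are capabilities */
    uint32_t value;
};

/* Portable: the compiler computes the offset for the current ABI, so the
 * code does not bake in an assumption such as "value lives at byte 8". */
static size_t value_offset(void)
{
    return offsetof(struct node, value);
}

/* Reads `value` through a byte offset from the start of the node. */
static uint32_t read_value(const struct node *n)
{
    uint32_t v;
    assert(n != NULL);
    memcpy(&v, (const unsigned char *)n + value_offset(), sizeof v);
    return v;
}

/* Pointer-carrying integers use uintptr_t; plain address or offset
 * arithmetic would use size_t or ptrdiff_t instead. */
static struct node *roundtrip_through_uintptr(struct node *n)
{
    uintptr_t bits = (uintptr_t)n;
    return (struct node *)bits;
}
```

On a conventional 64-bit target `value_offset()` evaluates to 8; under a pure-capability ABI it would grow with the pointer size, which is exactly the layout change that breaks hardcoded offsets.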

As part of Android/Morello, multiple workloads have been ported to the Pure-cap ABI. Additionally, there are fundamental changes implemented at the OS level to enable support for Morello’s features. These include modifications to key system components, such as the dynamic allocator, which must return a capability with restricted bounds on each allocation.

The Android/Morello general purpose memory allocator

In a Morello payload, the operating system’s allocator is an important consideration. If a program is running in pure-cap mode on Morello, the dynamic memory allocator returns a capability referencing a location on the heap instead of a numeric address value. In addition to its traditional role of returning pointers to dynamic allocations as requested by callers, the allocator must now restrict the bounds on the returned capabilities. These bounds must only expose the allocated area, omitting regions that include adjacent allocations as well as any metadata located in an allocation’s vicinity.

Android/Morello uses jemalloc as the native allocator, which provides platform components with references to their heaps on allocation requests. Jemalloc fulfils dynamic allocation while attempting to maximize performance and minimize fragmentation using built-in optimization schemes, and has been tweaked in Android/Morello to ensure that the bounds of obtained capabilities are now also restricted before the result is passed to the caller once the allocation functions return (malloc, realloc etc.). The ranges permitted to be used for dereferencing are verified as precise, in that they are known to encapsulate the allocated region exclusively. Jemalloc derives the returned capabilities from parent capabilities stored in the internal data structures, whose bounds should already cover a broader area (it is not possible to expand bounds in a capability) - typically, this would be a (retained) copy of the capability obtained when mapping a corresponding enclosing heap virtual memory region, with these initial bounds already set by the mmap() syscall.
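The derivation of a narrow capability from a broader parent can be modelled as follows. This is a sketch under stated assumptions and does not reflect jemalloc's internals: `cap_model`, `cap_narrow`, `bump_heap` and `model_malloc` are hypothetical names, the allocator is a trivial bump allocator, and a request that would exceed the parent's bounds is modelled by clearing the tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, uncompressed capability model (limit is exclusive). */
struct cap_model {
    uint64_t base;
    uint64_t limit;
    uint64_t addr;
    bool tag;
};

/* Derives a child capability covering [new_base, new_base + length).
 * Bounds can only shrink: a request outside the parent yields an untagged result. */
static struct cap_model cap_narrow(struct cap_model parent,
                                   uint64_t new_base, uint64_t length)
{
    struct cap_model child = parent;
    bool fits = parent.tag &&
                new_base >= parent.base &&
                new_base <= parent.limit &&
                length <= parent.limit - new_base;
    child.base = new_base;
    child.addr = new_base;
    child.limit = fits ? new_base + length : new_base;
    child.tag = fits;
    return child;
}

/* Hypothetical bump allocator holding the parent capability for its region,
 * standing in for the copy retained from the mmap() result. */
struct bump_heap {
    struct cap_model region;
    uint64_t next;
};

/* Returns a capability whose bounds expose only the new allocation. */
static struct cap_model model_malloc(struct bump_heap *h, uint64_t size)
{
    struct cap_model c;
    assert(h != NULL);
    c = cap_narrow(h->region, h->next, size);
    if (c.tag) h->next += size;
    return c;
}
```

Two consecutive allocations from the same heap receive disjoint bounds, and an attempt to widen an already narrowed capability back towards the parent region fails in this model.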

Vulnerability in libjpeg-turbo

An instance of a heap-based buffer over-read being used as a form of attack can be reproduced on base Android by reverting a past mitigation in the libjpeg-turbo library. Libjpeg-turbo is a JPEG codec that utilizes SIMD instructions to improve the performance of JPEG handling in AOSP. Libjpeg-turbo implements two APIs: the legacy libjpeg API (a standard API that is also implemented by non-turbo libjpeg libraries), and a custom TurboJPEG API 11. The attack can be reproduced by reverting the relevant software patch 12, then invoking a program (djpeg) that interfaces with the libjpeg API, providing an input image altered in specific ways, and requesting a conversion from JPEG to the BMP format; this results in a read past the end of an internal buffer. The invalid memory access occurs in the function put_pixel_rows() in wrbmp.c during the final decompression steps, which output the data as a BMP file after the input JPEG has been processed and stored in internal arrays. The over-read is caused by a missing parameter check in a previous call, which results in more data being copied than expected.

This vulnerability was selected from a list of entries in libjpeg-turbo’s CVE records (CVE-2018-19664), and is marked as Medium in severity (the CVSS score is obtained from various metrics about the vulnerability, which are combined to produce the final score 14). While this specific example might not be as exploitable as others in practice, the conditions would be similar to other, more serious vulnerabilities of this type. The patch containing the demo and its documentation can be found in the Android/Morello libjpeg-turbo repository 13.

If the attack leading to this over-read is invoked on the base architecture, the vulnerability would likely remain undetected. However, the fine-grained spatial memory safety provided on Morello enables this to be both detected and prevented automatically: at the moment of the invalid pointer dereference, the hardware determines that the target address is outside the capability’s bounds and raises an exception (the default handler subsequently terminates the process). As libjpeg-turbo is one of the components ported to the Morello Pure-cap ABI, it is linked with the version of jemalloc that restricts the bounds on the returned capabilities, so the over-read is caught at the first out-of-bounds access.

The example output below shows the resulting segmentation fault (result of a runtime exception initiated by Morello upon out-of-bounds access) when executed in Pure-cap mode:


  djpeg.c: main(): entry
  Corrupt JPEG data: 117 extraneous bytes before marker 0xdb
  wrbmp.c: jinit_write_bmp(): entry
  wrbmp.c: jinit_write_bmp(): registered function put_pixel_rows in the dest structure
  wrbmp.c: jinit_write_bmp(): exit
  djpeg.c: main(): starting decompressor..
  djpeg.c: main(): decompressor started, continuing...
  djpeg.c: main(): starting normal full-image decompress..
  djpeg.c: main(): writing output file header...
  djpeg.c: main(): finished writing output file header.
  djpeg.c: main(): processing data...
  djpeg.c: main(): running dest_mgr->put_pixel_rows(cinfo: 0x7fefb060b0, dest_mgr: 0x79ced8df60, num_scanlines: 1)...
  [INFO] dest_mgr->buffer: 0x0x79ced8e100, dest_mgr->buffer_height: 0x1
  wrbmp.c: put_pixel_rows(): entry.
  wrbmp.c: put_pixel_rows(): initiating data, transfer, fetching inptr...
  wrbmp.c: put_pixel_rows(): inptr fetched.
  wrbmp.c: put_pixel_rows(): Beginning transfer...
  wrbmp.c: put_pixel_rows(): entering loop...
  [INFO] inptr addr: 0x79ced8a020, inptr length: 0xbf, inptr base: 0x79ced8a000, inptr limit: 0x79ced8a0bf
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a080, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a084, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a088, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a08c, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a090, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a094, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a098, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a09c, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0a0, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0a4, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0a8, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0ac, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0b0, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0b4, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0b8, inptr limit: 0x79ced8a0bf -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x79ced8a0bc, inptr limit: 0x79ced8a0bf -> Segmentation fault

The bounds restrictions make it possible to prevent various types of vulnerabilities that could otherwise slip through current systems. The result is a reduction in the security risks arising from program bugs, with greater confidence in the overall memory protection and system security.

It can often require a high degree of precision to prevent exploits in situations where the number of bytes offset from the end of the buffer in the over-read is small. Morello provides the fine-grained memory safety in a way that can easily be integrated with other software as it is ported over, since the allocator ensures a returned capability doesn’t provide access to data outside the allocated region.
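The Pure-cap log can be replayed arithmetically using the addresses it reports. The helper below is a hypothetical sketch that assumes the reported limit is exclusive, which matches the log's base 0x79ced8a000 plus length 0xbf. It walks 4-byte reads forward from the first logged address and returns the address of the first read that would cross the limit.

```c
#include <assert.h>
#include <stdint.h>

/* Walks fixed-size accesses forward from `start` and returns the address of
 * the first one that does not fit below the exclusive `limit`.
 * Returns 0 if every one of `max_steps` accesses fits. */
static uint64_t first_faulting_access(uint64_t start, uint64_t limit,
                                      uint64_t access_size, unsigned max_steps)
{
    uint64_t addr = start;
    assert(access_size > 0);
    for (unsigned i = 0; i < max_steps; ++i) {
        if (addr > limit || access_size > limit - addr) {
            return addr; /* some byte of this access lies at or past the limit */
        }
        addr += access_size;
    }
    return 0;
}
```

With a start of 0x79ced8a080, a limit of 0x79ced8a0bf and 4-byte accesses, the function returns 0x79ced8a0bc: only 3 bytes remain below the limit at that address, so the read overshoots by a single byte, which is the access where the log reports the segmentation fault.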

As the over-read is undetected without capabilities, the segmentation fault signal doesn’t occur in the Hybrid-cap or the base ABI - the loop continues iterating, and the program reaches the end of the execution flow without the problem being detected:


  djpeg.c: main(): entry
  Corrupt JPEG data: 117 extraneous bytes before marker 0xdb
  wrbmp.c: jinit_write_bmp(): entry
  wrbmp.c: jinit_write_bmp(): registered function put_pixel_rows in the dest structure
  wrbmp.c: jinit_write_bmp(): exit
  djpeg.c: main(): starting decompressor..
  djpeg.c: main(): decompressor started, continuing...
  djpeg.c: main(): starting normal full-image decompress..
  djpeg.c: main(): writing output file header...
  djpeg.c: main(): finished writing output file header.
  djpeg.c: main(): processing data...
  djpeg.c: main(): running dest_mgr->put_pixel_rows(cinfo: 0x7fea582e28, dest_mgr: 0x77aa77f2a0, num_scanlines: 1)...
  [INFO] dest_mgr->buffer: 0x0x77aa77f3c0, dest_mgr->buffer_height: 0x1
  wrbmp.c: put_pixel_rows(): entry.
  wrbmp.c: put_pixel_rows(): initiating data, transfer, fetching inptr...
  wrbmp.c: put_pixel_rows(): inptr fetched.
  wrbmp.c: put_pixel_rows(): Beginning transfer...
  wrbmp.c: put_pixel_rows(): entering loop...
  [INFO] inptr: 0x77aa7a70e0
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7140 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7144 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7154 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7158 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a715c -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7160 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7164 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x77aa7a7168 -> access complete
  ..
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583250 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583254 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583258 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f458325c -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583260 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583264 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f4583268 -> access complete
  [INFO] attempting to access 4 byte(s) at inptr 0x71f458326c -> access complete
  wrbmp.c: put_pixel_rows(): exited loop
  djpeg.c: main(): - finished processing data.

Conclusion

This is an instance of enhanced spatial memory safety on Morello: the features are backed by the hardware, and the software is updated to utilize them, with the allocator being one of the software components that contributes significantly to the overall increase in security. The capability concept allows the extra security to be integrated into the architecture, and this makes Morello an example of increased ‘security-by-design’.

Copyright © 2021, Arm Ltd.

References

  1. Common Weakness Enumeration, 2020, CWE Top 25 Most Dangerous Software Weaknesses, viewed June 2021 https://cwe.mitre.org/top25/archive/2020/2020_cwe_top25.html

  2. University of Cambridge, Computer Laboratory, 2020, Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 8), Chapter 2.4.5 “Source-Code and Binary Compatibility”, viewed June 2021 https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-951.pdf

  3. University of Cambridge, Computer Laboratory, 2019, An Introduction to CHERI, Chapter 1 “Introduction”, viewed June 2021 https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf

  4. University of Cambridge, Computer Laboratory, 2014, Capability Hardware Enhanced RISC Instructions: CHERI Instruction-set architecture, Abstract, viewed June 2021 https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-850.pdf 

  5. Arm, 2021, Arm Architecture Reference Manual Supplement – Morello for A-profile Architecture, Chapter 2.5 “Capability Encoding”, viewed June 2021 https://developer.arm.com/documentation/ddi0606/latest 

  6. Arm, 2021, Arm Architecture Reference Manual Supplement – Morello for A-profile Architecture, Chapter 2.5.1 “Morello Bounds format”, viewed June 2021 https://developer.arm.com/documentation/ddi0606/latest 

  7. Arm, 2021, Arm Architecture Reference Manual Supplement – Morello for A-profile Architecture, Chapter 2.5.2 “Representability checks”, viewed June 2021 https://developer.arm.com/documentation/ddi0606/latest 

  8. University of Cambridge, Computer Laboratory, 2020, CHERI C/C++ Programming Guide, Chapter 4.3.5 “Out-of-bounds pointers”, viewed June 2021 https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf 

  9. University of Cambridge, Computer Laboratory, 2020, CHERI C/C++ Programming Guide, Chapter 4.2 “Pointer provenance validity”, viewed June 2021 https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf 

  10. Arm, 2021, The Android/Morello 2021.Q1 release, viewed June 2021 https://git.morello-project.org/morello/docs/-/blob/morello/release-1.1/android-readme.rst 

  11. libjpeg-turbo.org, 2021, “libjpeg-turbo” != “TurboJPEG”, viewed June 2021 https://libjpeg-turbo.org/About/TurboJPEG 

  12. libjpeg-turbo GitHub Repository, 2018, heap-buffer-overflow in function put_pixel_rows in wrbmp.c:145, viewed June 2021 https://github.com/libjpeg-turbo/libjpeg-turbo/issues/305 

  13. libjpeg-turbo Android/Morello GitLab Repository, 2021, Android/Morello Libjpeg-turbo - Prevention of Heap-based Buffer Over-read, viewed June 2021 https://git.morello-project.org/morello/android/platform/external/libjpeg-turbo/-/blob/morello/android10-release/cve-2018-19664-demo.rst 

  14. National Vulnerability Database, 2021, Common Vulnerability Scoring System Calculator: CVE-2018-19664, viewed June 2021 https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?name=CVE-2018-19664&vector=AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H&version=3.0&source=NIST