# Calculating Offsets for Buffer Overflows
When dealing with buffer overflows, one common objective is to work out which bytes of your input end up overwriting critical values (such as a saved return address, saved frame pointer, or function pointer). To do this, you determine the offset within the payload at which those bytes fall. Here’s the general process:

1. **Identify the Vulnerable Input Point:**
   First, you need to confirm that the program accepts input (e.g., command line arguments, environment variables, network data, etc.) that can potentially be used to overflow a buffer.

2. **Pattern Generation:**
   Instead of sending a long string of identical characters (e.g., all "A"s), you send a pattern in which every short substring is unique, so any fragment of it identifies its own position. Such a string can be generated with tools like `pattern_create` (from the Metasploit Framework) or with a custom script, and looks like `Aa0Aa1Aa2Aa3...` and so forth.

3. **Cause the Program to Crash:**
   Run the program (often under a debugger) with this patterned input to cause the overflow. When it crashes, examine the registers (especially the instruction pointer or return address register).

4. **Identify the Offset in the Crash Data:**
   The debugger output or crash dump will show a value taken from the pattern in the place where control data used to be, for example in the instruction pointer once the overwritten return address has been loaded into it. Searching the original pattern for those bytes gives the exact position (offset) at which your input overwrote that control data.

   For instance, suppose `0x37694136` shows up in the instruction pointer. On a little-endian machine such as x86, those bytes sit in memory as `36 41 69 37`, which is the ASCII string `6Ai7`. Searching the original pattern for `6Ai7` finds it 260 bytes from the start of your input, so the saved return address lies at offset 260.

5. **Refine the Payload Based on the Offset:**
   Once the offset is known, you can craft payloads that place specific shellcode or addresses precisely at the point required to control the program’s execution flow.
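
Steps 2 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not Metasploit's implementation: the function names `pattern_create` and `pattern_offset` are borrowed only for familiarity, the layout (uppercase, lowercase, digit) follows the `Aa0Aa1Aa2...` form, and the register value `0x37694136` is a hypothetical crash value chosen because it does occur in that pattern. A 32-bit little-endian target is assumed.

```python
import string

# Longest pattern this scheme can produce: 26 * 26 * 10 triples of 3 bytes.
MAX_PATTERN_LEN = 26 * 26 * 10 * 3


def pattern_create(length: int) -> bytes:
    """Build an Aa0Aa1Aa2... style pattern of the requested length."""
    if not 0 <= length <= MAX_PATTERN_LEN:
        raise ValueError(f"length must be between 0 and {MAX_PATTERN_LEN}")
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                if len(out) >= length:
                    return bytes(out[:length])
                out += f"{upper}{lower}{digit}".encode("ascii")
    return bytes(out[:length])


def pattern_offset(value: int, pattern: bytes, width: int = 4) -> int:
    """Return the offset of a register value in the pattern, or -1.

    The value is converted to little-endian bytes first, because that is
    the order in which it was laid out in memory (assumption: x86-style
    little-endian target).
    """
    needle = value.to_bytes(width, "little")
    return pattern.find(needle)


pattern = pattern_create(300)
print(pattern[:12].decode())                 # Aa0Aa1Aa2Aa3
print(pattern_offset(0x37694136, pattern))   # 260
```

A value that is not part of the pattern, such as `0x41414141`, returns `-1`, which usually means the register was not loaded from your patterned input.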

## Why You Might Need to Send More Data than the Buffer Size
Simply matching the buffer’s declared size isn’t usually enough when trying to exploit a buffer overflow. Here’s why you might need to send more data:

1. **Overwriting Adjacent Memory:**
   The goal of a buffer overflow is typically to overwrite memory locations beyond the buffer’s intended boundaries. For example, if a buffer is 128 bytes long and right after it in memory are critical control structures (like a saved return address, function pointer, or a security cookie), you need to write beyond 128 bytes—perhaps 132, 256, or more—until you hit and overwrite those values.

2. **Reaching Control Data (Saved EIP/RIP):**
   On many architectures, the memory immediately after a local buffer on the stack includes saved registers, such as the saved instruction pointer (EIP on x86, RIP on x64). If this saved return address is at an offset of, say, 260 bytes after the start of the buffer, you need to send at least 260 bytes plus the new address or payload to overwrite it—far exceeding the original buffer length.

3. **Struct Padding and Alignment:**
   Due to structure padding, alignment, or compiler-inserted protective mechanisms, the target data might not be positioned immediately after the buffer. You may need to send extra data to "fill the gap" until you reach the memory segment you want to control.

4. **NOP Sleds and Shellcode Placement:**
   When delivering an exploit, you might include a NOP sled (a series of no-operation instructions) before your shellcode. This can make the attack more reliable by giving you a larger "landing area" in case the address you jump to is not exact. This extra padding pushes the total length well beyond the original buffer size.
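
To make the length arithmetic concrete, here is a rough sketch of one common payload layout: filler up to the offset, then the new return address, then a NOP sled and the shellcode. Everything in it is an assumption for illustration: `build_payload` is a hypothetical helper, the target is taken to be 32-bit little-endian x86 (where `0x90` is the NOP opcode), the offset of 260 is the example figure used in this document, `0xDEADBEEF` is a placeholder address, and the "shellcode" is just four `0xCC` breakpoint bytes.

```python
import struct


def build_payload(offset: int, return_address: int,
                  shellcode: bytes, nop_count: int = 16) -> bytes:
    """Lay out filler, a 32-bit return address, a NOP sled, and shellcode."""
    filler = b"A" * offset                          # reaches the saved return address
    new_ret = struct.pack("<I", return_address)     # 4 bytes, little-endian
    nop_sled = b"\x90" * nop_count                  # x86 NOP instructions
    return filler + new_ret + nop_sled + shellcode


# Placeholder values only; none of these come from a real target.
payload = build_payload(offset=260, return_address=0xDEADBEEF,
                        shellcode=b"\xcc" * 4)
print(len(payload))  # 260 + 4 + 16 + 4 = 284
```

Even with a 128-byte buffer, this layout needs 284 bytes of input, which is the point of the list above: the total is driven by the offset to the control data and by whatever follows it, not by the buffer's declared size.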

## Short Explanation:
- You calculate offsets by sending a distinct, recognizable pattern and analyzing the crashed state of the program to find which part of your input overwrote critical values, such as the saved return address that ends up in the instruction pointer.
- You send more data than the buffer’s declared size in order to surpass its boundaries, ultimately overwriting memory areas you’re not supposed to reach (such as return addresses) and injecting payloads for exploitation.