This is the first version of my bytecode virus for the JVM. This code is functional on JVM version 8 and higher. Along
with being capable of file infection, this virus was written to accommodate the user. Namely, this virus allows
the user to write some code in Java and instantly use it as a viral payload. Users can easily copy any function
or code to the target. We don't want to add additional libraries to our code so it's important to keep whatever payload
you add to what is available as standard Java libraries. Fortunately, the JVM's standard library is enormous and very flexible.
## Goals
Why would I write a virus for Java? There are a few reasons:
- Cross platform, no need to select binaries
- Rarity - I have not found a complete JVM virus on the web.
- Flexibility. JVM bytecode is much easier to manipulate than CPU opcodes and binary file formats.
## Prior Work
It appears there has not been a full Java virus in years. The only existing Java virus I could locate was
[StrangeBrew](http://virus.wikidot.com/strangebrew), which was coded in 1998. Unfortunately even in this case the full
source was not disclosed. This virus would also not function in today's world, as Java has required bytecode verification
since that time.
There could be many causes for this. I was not able to find any other documented cases of a Java virus actually functioning.
While I was not able to find the source code for StrangeBrew, according to Symantec, the implementation was a bit buggy.
Upon starting the work I've done here, this might have sounded like an error on the part of the virus author, but we
will see that creating a fully functioning self-contained virus for the JVM is not a simple task.
## Design Overview
## File Infection Strategy
Cheshire infects any class file that contains a main function because this method is standard and reliable. All virus methods are static so they can easily be injected into and run from any
class. I chose to implement my own class file parser and infector because adding an entire library to a target is too
easy to spot, limits us if we want to develop more advanced features such as polymorphism or metamorphism, and just requires copying
too much data in general. In its current state, this virus is about 30 KB. While large, it's much better than requiring entire
jar files simply to operate.
### The Java Class File Format
To create a virus that infects other executable files, we must first understand the executable format we are dealing with.
_I have absorbed [this page](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html) into my very being and no
longer understand anything about myself or the world around me._ Instead of traditional machine code, Java executables make use of bytecode. This allows portability without the software
authors needing to think about the platform they are writing code for. We have to consider the following aspects of the
.class file format:
- Which items are in the constant pool
- Which methods are available in the class
- Do the offsets used in our instruction operands match the offsets of our newly modified code and our newly placed constants
- How to adjust stack frames based on our modification of our target
#### The Constant Pool
Just like the data section of an ELF file, .class files have something called a Constant Pool to store information needed
by code. This is a listing of constant resources for the code to refer to as it runs. This can be anything from Strings
and Numbers to Objects, Methods and many other things. The formatting of the constant pool is very simple: each item is
given an index that every other constant pool item, method and instruction may refer to. For our purposes, any
constant pool items we need to add can simply be appended to the target's constant pool. This will not cause any issues
with code verification or loading.
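To make the layout concrete, here is a minimal read-only sketch that walks a class file's header and tallies constant pool entries by tag. It is an illustration based on the class file specification, not code from this project; the class name `ConstantPoolSummary` and the method `summarize` are made up for the example. Note that `Long` and `Double` entries occupy two index slots, which is easy to miss.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only: it reads a class file, it never modifies one.
final class ConstantPoolSummary {

    /** Returns a map from constant pool tag to the number of entries with that tag. */
    static Map<Integer, Integer> summarize(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();             // minor_version
        in.readUnsignedShort();             // major_version
        int count = in.readUnsignedShort(); // constant_pool_count (entries are 1..count-1)

        Map<Integer, Integer> tags = new TreeMap<>();
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            tags.merge(tag, 1, Integer::sum);
            switch (tag) {
                case 1:  skip(in, in.readUnsignedShort()); break; // Utf8: length + bytes
                case 3: case 4: skip(in, 4); break;               // Integer, Float
                case 5: case 6: skip(in, 8); i++; break;          // Long, Double take two slots
                case 7: case 8: case 16: case 19: case 20:
                    skip(in, 2); break;                           // single u2 index
                case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(in, 4); break;                           // two u2 values
                case 15: skip(in, 3); break;                      // MethodHandle: u1 + u2
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        return tags;
    }

    // readFully guarantees that exactly n bytes are consumed.
    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]);
    }

    public static void main(String[] args) throws IOException {
        // Summarize this class's own bytes, if they can be found on the classpath.
        try (InputStream self =
                 ConstantPoolSummary.class.getResourceAsStream("ConstantPoolSummary.class")) {
            if (self == null) {
                System.out.println("class bytes not found");
                return;
            }
            System.out.println(summarize(self));
        }
    }
}
```

The output is the same kind of information `javap -v` prints for a class, just reduced to a per-tag count.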
#### Methods
In a fashion similar to the constant pool, every method has an index. Our code has a few tasks when it comes to manipulating methods:
1) Read our own methods into memory so that we may copy them
2) Find methods in target code that we can infect. In our case, any main method will do
3) Inject our methods into the target class
4) Modify the code of the main method to invoke our virus code before continuing as normal
#### Code
Code is perhaps the simplest part of the entire class file format. Each instruction is a one-byte opcode followed by
zero or more operands. Because the opcode is a single byte, there can be at most 256 JVM bytecode instructions, so the
set we need to understand is pretty small compared to x86.
#### The Stack Map Table
Since Java 7, you can no longer simply throw instructions into a method and expect a functioning program. To make type guarantees
about code at runtime, Java maps out which variables are in the JVM's stack frame at any given time and for how long these
conditions apply. Every stack map frame applies for some number of instructions, indicated by an offset delta from the
previous frame.
This is by far the hardest part to get right. Before Java runs code, it verifies that the code being loaded refers to
variables that are consistent with the types defined by the code. This would be fine normally, except for _a few complications._
#### The Challenge
So why is all of this hard? We run into a problem: several Java instructions, one of which we use regularly, actually
have 2 different forms. Some instructions that refer to constant pool values take their
argument either as a single unsigned byte (addressing up to 255 items) or as two bytes (up to 65535 items). If we are appending our
needed constants to a target constant pool and the pool has more than 255 items, we need to decide whether to use the
original instruction that our compiler chose or a _new_ instruction capable of addressing the correct number of
possible constants.
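The best-known example of such a pair is `ldc` (opcode `0x12`, one-byte index) and `ldc_w` (opcode `0x13`, two-byte index), as defined in the JVM specification. The sketch below only decodes the index from existing bytecode to show the difference in width; the class `LdcOperand` and its method names are invented for illustration and are not taken from this project.

```java
// Hypothetical decoder for illustration: reads the constant pool index of an ldc-family instruction.
final class LdcOperand {
    static final int LDC = 0x12;    // index is one unsigned byte
    static final int LDC_W = 0x13;  // index is two bytes, big-endian
    static final int LDC2_W = 0x14; // wide form used for long and double constants

    /** Returns the constant pool index referenced by the instruction at offset pc. */
    static int poolIndex(byte[] code, int pc) {
        int opcode = code[pc] & 0xFF;
        switch (opcode) {
            case LDC:
                return code[pc + 1] & 0xFF;
            case LDC_W:
            case LDC2_W:
                return ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
            default:
                throw new IllegalArgumentException("not an ldc-family opcode: " + opcode);
        }
    }

    /** Returns the total instruction length in bytes, opcode included. */
    static int length(byte[] code, int pc) {
        return (code[pc] & 0xFF) == LDC ? 2 : 3;
    }

    public static void main(String[] args) {
        byte[] narrow = {0x12, (byte) 0xFF};
        byte[] wide = {0x13, 0x01, 0x00};
        System.out.println(poolIndex(narrow, 0) + " in " + length(narrow, 0) + " bytes");
        System.out.println(poolIndex(wide, 0) + " in " + length(wide, 0) + " bytes");
    }
}
```

The two forms differ in length by one byte, which is why switching between them shifts every later offset in the method.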
We could simply choose to hardcode our solution to only ever use 2 byte addressing and start only with lower numbers,
but ideally our code should be able to copy whatever methods we give it to copy and not simply some very specific code.
The virus should be flexible and allow for advanced payloads specified by the user. Otherwise we are very limited in what we can do and
create more overhead if we want to implement advanced features like polymorphism or even metamorphism.
## Implementation
### Copying resources to the target
This is probably the easiest part of the whole process. Our code for doing this is:
Since our main virus method is never called by any of the other functions we've written, we have to copy the MethodRef
for that function to the target ourselves. We need to do this to use the invokestatic opcode, which is what we're sticking with
for execution. As you can see, I horribly bastardized my own code here by adding the newly generated instruction to an item in the destination's HashMap. This is horrible and I'm sorry.
It does however appear to have worked so there's that.
## Transmission Mechanism
One thing I've bundled with this virus is a very simple but effective way to help this virus spread. We know that we're
interested in infecting .class files inside of JARs, but simply allowing it to happen and spread over time would take a while.
After some digging into how we might abuse build systems to spread our code, I stumbled on to the somewhat surprising fact
that it is trivially easy to trigger code execution when somebody clones a Gradle project in IntelliJ IDEA. This trick
probably also works in Android Studio. I haven't tried it myself - maybe you should :)
The trick is very simple:
In settings.gradle in your project, place some innocent looking comments and code: