Introduction
If you've been paying attention to the tech news lately, you might have heard about a little something called Android from Google. Android is a new mobile phone platform based on Linux and Java, but unlike other Java platforms, Android uses a non-standard JVM called Dalvik. While Google has promised to release much (all?) of Android under an open source license, they haven't done so yet, and they also haven't released any documentation on this new VM. Being somewhat impatient, I've taken it upon myself to do some reverse engineering and put together some documentation of my own. This page documents the Dex file format that compiled programs get translated into for use on the Dalvik VM. I hope to write some documentation on the VM itself in the near future.
File Header
Dex files start with a simple header containing some checksums and offsets to other structures.
Notes: All non-string fields are stored in little-endian format. It would appear that the checksum and signature fields are assumed to be zero when calculating the checksum and signature.
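As a rough illustration of the two notes above, here is a minimal Python sketch. The helper names `read_u32_le` and `with_ranges_zeroed` are hypothetical, and the byte ranges passed to the second helper are placeholders, since the actual positions of the checksum and signature fields must come from the header layout. The checksum and signature algorithms themselves are not covered here.

```python
import struct


def read_u32_le(data: bytes, offset: int) -> int:
    """Read an unsigned 32-bit little-endian integer at a byte offset."""
    return struct.unpack_from("<I", data, offset)[0]


def with_ranges_zeroed(data: bytes, ranges) -> bytes:
    """Return a copy of data with each (start, length) byte range zeroed.

    Intended for blanking the checksum and signature fields before
    recomputing them; the ranges are supplied by the caller (assumption).
    """
    buf = bytearray(data)
    for start, length in ranges:
        buf[start:start + length] = bytes(length)
    return bytes(buf)
```

Working on a copy keeps the original file contents intact, so the stored checksum can still be compared against whatever is recomputed.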
String Table
This table stores the length and offset of every string in the Dex file, including string constants, class names, variable names and more. Each entry has the following format:
Notes: Although the length of each string is stored in this table, all strings also have C-style null terminators.
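Because each string carries both a stored length and a null terminator, a reader can use one to sanity-check the other. The sketch below assumes the offset and length have already been read from a string table entry, that the length counts bytes, and that the text decodes as UTF-8. None of these assumptions is confirmed by the notes above, and `read_string` is a hypothetical helper name.

```python
def read_string(data: bytes, offset: int, length: int) -> str:
    """Read a string of `length` bytes at `offset` and check its terminator."""
    end = offset + length
    # The byte right after the string should be the C-style null terminator.
    if end >= len(data) or data[end] != 0:
        raise ValueError("expected a null terminator after the string")
    # UTF-8 is an assumption; the actual encoding is not documented here.
    return data[offset:end].decode("utf-8")
```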
Class List
A list of all classes referenced or contained in this dex file. Each entry has the following format:
Field Table
A table of the fields of all classes defined in this dex file. Each entry has the following format:
Method Table
A table of the methods of all classes defined in this dex file. Each entry has the following format:
Class Definition Table
A table of class definitions for all classes that are either defined in this dex file or have a method or field accessed by code in this dex file. Each entry has the following format:
Notes: Any of the list offset fields can be NULL in which case the class doesn't have any elements of that type. Not every class in the class list will necessarily have an entry in the class definition table.
Field List
Stores data for pre-initialized fields in a class. The list consists of a 32-bit integer containing the number of entries, followed by the entries themselves. Each field has an entry with the following format:
Notes: If the field does not have a pre-initialized value it will be filled with 0 for primitive types and -1 for object types.
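The field list uses a count-prefixed layout: a 32-bit count followed by that many entries. A generic reader for this layout might look like the sketch below. The function name `read_counted_list` and the `entry_size` parameter are hypothetical, and the entry size has to be taken from the entry format for the list in question. Treating a NULL list offset as the value 0 is also an assumption.

```python
import struct


def read_counted_list(data: bytes, offset: int, entry_size: int) -> list:
    """Split a count-prefixed list into raw fixed-size entries."""
    if offset == 0:
        # Assumed meaning of a NULL offset: the class has no such elements.
        return []
    (count,) = struct.unpack_from("<I", data, offset)
    start = offset + 4  # entries begin right after the 32-bit count
    if start + count * entry_size > len(data):
        raise ValueError("list extends past the end of the file")
    return [
        data[start + i * entry_size:start + (i + 1) * entry_size]
        for i in range(count)
    ]
```

Returning raw byte slices keeps the sketch independent of any particular entry layout; each slice can then be decoded according to the relevant entry format.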
Method List
A list of methods for a particular class. It begins with a 32-bit integer that contains the number of items in the list, followed by entries in the following format:
Code Header
This header contains information about the code that implements a method.
Notes: The code offset field actually points to a 32-bit integer that contains the number of 16-bit words in the instruction stream. The actual VM instructions follow this integer.
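The note above translates fairly directly into code. This sketch assumes the 16-bit instruction words are little-endian like the other non-string fields, which seems likely but is not stated explicitly. `read_instructions` is a hypothetical helper name.

```python
import struct


def read_instructions(data: bytes, code_offset: int) -> list:
    """Return the method's instruction stream as a list of 16-bit words."""
    # The code offset points at a 32-bit count of 16-bit words.
    (word_count,) = struct.unpack_from("<I", data, code_offset)
    # The VM instructions follow immediately after the count.
    words = struct.unpack_from(f"<{word_count}H", data, code_offset + 4)
    return list(words)
```

The words are returned undecoded, since the instruction encoding belongs to the VM rather than to the file format.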
Local Variable List
A list of local variables for a particular method. It begins with a 32-bit integer that contains the number of items in the list. Each entry has the following format:
Notes: This list will include local variables that are arguments to the method as well as the "this" variable for non-static methods.
ToDo
Add documentation on the position list and on constant objects for pre-initialized fields.
Questions
If you have any questions about this document, feel free to send me an e-mail at pavone (AT) retrodev (DOT) com or mask_of_destiny@hotmail.com.