Introduction to the ASM 2.0 Bytecode Framework
Here are a few things to notice:
- All descriptors, string literals, and any other constants used in class structures are stored in a Constant Pool at the beginning of the class file and are then referenced from all other structures by index.
- Each class must contain headers (including class name, super class, interfaces, etc.) and the Constant Pool. Other elements, such as the list of fields, list of methods, and all attributes, are optional and may or may not be present.
- Each Field section includes field info such as name, access flags (public, private, etc.), descriptor and field Attributes.
- Each Method section contains similar header info and information about max stack and max local variable numbers, which are used to verify bytecode. For non-abstract and non-native methods, there is a table of method instructions (method body), an exceptions table, and code attributes. Besides these, there can be other method attributes.
- Each Attribute for Class, Field, Method, and Method Code has its own name, which is also documented in the Class File format section of the JVM specification. These attributes represent various pieces of information about bytecode, such as source file name, inner classes, signature (used to store generics info), line number and local variable tables, and annotations. The JVM specification also allows the definition of custom attributes that will be ignored by the standard VM, but may contain additional information. Note that Java 5 annotations practically made these custom attributes obsolete, because annotation semantics allow you to express pretty much anything.
- The Method Code table contains a list of instructions for the Java Virtual Machine. Some of these instructions (as well as the exception, line number, and local variable tables) use offsets within the code table and the values of all of these offsets may need to be adjusted when instructions are inserted or removed from the method code table.
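The layout described in the list above can be checked with the JDK alone. The following is a minimal sketch, not part of ASM: the class name `ClassFileHeaderSketch` and its `parse` method are made up for this illustration. It reads only the fixed-size start of a class file (magic number, version, and the entry count that precedes the constant pool) and does not decode the pool itself or anything that follows it.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of ASM): reads the fixed-size start of a class file. */
final class ClassFileHeaderSketch {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount; // per the JVM spec: number of pool entries plus one

    private ClassFileHeaderSketch(int minorVersion, int majorVersion, int constantPoolCount) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    /** Reads magic (u4), minor version (u2), major version (u2), and pool count (u2). */
    static ClassFileHeaderSketch parse(InputStream classBytes) {
        try {
            DataInputStream in = new DataInputStream(classBytes);
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic number");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int poolCount = in.readUnsignedShort();
            return new ClassFileHeaderSketch(minor, major, poolCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Try to read this class's own compiled bytes from the classpath.
        String resource = "ClassFileHeaderSketch.class";
        try (InputStream in = ClassFileHeaderSketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("compiled class bytes are not available as a resource");
                return;
            }
            ClassFileHeaderSketch header = parse(in);
            System.out.println("major version: " + header.majorVersion);
            System.out.println("constant pool count: " + header.constantPoolCount);
        }
    }
}
```

Everything after these ten bytes (the pool entries, then the access flags and the class and superclass indexes) refers back into the pool by index, which is why a reader has to process the pool before it can make sense of the rest of the file.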
As you can see, bytecode tweaking isn't easy. However, the ASM framework reduces the complexity of the underlying structures and provides a simplified API that still allows for access to all bytecode information and enables complex transformations.
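To make the offset bookkeeping mentioned in the list above concrete, here is a small sketch of the arithmetic involved when bytes are inserted into a method's code. It is an illustration only: `OffsetShiftSketch` and its methods are hypothetical names, the sketch assumes that anything located at or after the insertion point moves forward (so a jump that targeted the insertion point still lands on the original instruction), and it ignores the case where a grown branch offset no longer fits in the instruction's operand.

```java
/** Hypothetical illustration of offset fix-ups; not how ASM is implemented. */
final class OffsetShiftSketch {

    /**
     * New position of an absolute code offset (such as a line number table
     * entry) after insertedLength bytes are inserted at insertAt.
     */
    static int shift(int offset, int insertAt, int insertedLength) {
        return offset >= insertAt ? offset + insertedLength : offset;
    }

    /**
     * New relative branch offset for a jump instruction located at jumpAt.
     * Branch offsets in bytecode are relative to the jump instruction, so
     * both the jump's own position and its target may have to move.
     */
    static int adjustBranch(int jumpAt, int relativeOffset, int insertAt, int insertedLength) {
        int target = jumpAt + relativeOffset;
        int newJumpAt = shift(jumpAt, insertAt, insertedLength);
        int newTarget = shift(target, insertAt, insertedLength);
        return newTarget - newJumpAt;
    }

    public static void main(String[] args) {
        // Insert 3 bytes at offset 4 and see what happens to a few offsets.
        System.out.println("absolute 10 becomes " + shift(10, 4, 3));
        System.out.println("forward jump at 2 by +8 becomes " + adjustBranch(2, 8, 4, 3));
        System.out.println("backward jump at 6 by -4 becomes " + adjustBranch(6, -4, 4, 3));
    }
}
```

Every jump, exception table entry, line number entry, and local variable range would need this treatment after each edit. ASM avoids exposing this arithmetic by letting code refer to positions through Label objects.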
Event-Based Bytecode Processing
The Core package uses a push approach (similar to the "Visitor" design
pattern, which is also used in the SAX API for XML processing) to walk
through complex bytecode structures. ASM defines several interfaces,
such as ClassVisitor (section [1] in the class file format diagram above),
FieldVisitor (section [2]), MethodVisitor
(section [3]), and AnnotationVisitor.
AnnotationVisitor is a special interface that allows
you to express hierarchical annotation structures. The next few
paragraphs will show how these interfaces interact with each other
and how they can be used together to implement bytecode
transformations and/or capture information from the bytecode.
The Core package can be logically divided into two major parts:
- Bytecode producers, such as a ClassReader or a custom class that can fire the proper sequence of calls to the methods of the above visitor classes.
- Bytecode consumers, such as writers (ClassWriter, FieldWriter, MethodWriter, and AnnotationWriter), adapters (ClassAdapter and MethodAdapter), or any other classes implementing the above visitor interfaces.
Figure 2 shows the sequence diagram for the common producer-consumer interaction.

Figure 2. Sequence diagram for producer-consumer interaction
In this interaction, a client application creates a
ClassReader and calls the accept()
method, passing a concrete ClassVisitor instance as a
parameter. Then ClassReader parses the class and fires
"visit" events to ClassVisitor for each bytecode
fragment. For repeated contexts, such as fields, methods, or
annotations, a ClassVisitor may create child visitors
derived from the corresponding interface
(FieldVisitor, MethodVisitor, or
AnnotationVisitor) and return them to the producer. When
a producer receives a null value for FieldVisitor or
MethodVisitor, it skips that fragment of the class
(e.g., a ClassReader wouldn't even parse the
corresponding bytecode section in such a case, which leads to a
sort of "lazy loading" feature driven by the visitors). Otherwise,
the corresponding subcontext events are delegated to the child
visitor instance. At the end of each subcontext, the producer calls
the visitEnd() method and then moves on to the next
section (e.g., the next field, method, etc.).
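
The interaction described above can be sketched without ASM on the classpath. The interfaces below (`MiniClassVisitor`, `MiniFieldVisitor`, `MiniMethodVisitor`) are hypothetical, heavily simplified stand-ins and are not ASM's real interfaces, which carry many more methods and parameters. `MiniProducer` plays the role of ClassReader for one hard-coded class, and `EventLog` is a consumer that returns null for everything except one method, so the other bodies are never walked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-ins for ASM's visitor interfaces.
interface MiniFieldVisitor {
    void visitEnd();
}

interface MiniMethodVisitor {
    void visitInsn(String opcode);
    void visitEnd();
}

interface MiniClassVisitor {
    void visit(String className);
    MiniFieldVisitor visitField(String name);   // null means "skip this field"
    MiniMethodVisitor visitMethod(String name); // null means "skip this method"
    void visitEnd();
}

/** Producer: plays the role of ClassReader for one hard-coded class. */
final class MiniProducer {
    void accept(MiniClassVisitor cv) {
        cv.visit("Example");
        MiniFieldVisitor fv = cv.visitField("count");
        if (fv != null) {
            fv.visitEnd();
        }
        fireMethod(cv, "getCount", "ALOAD_0", "GETFIELD", "IRETURN");
        fireMethod(cv, "reset", "ALOAD_0", "ICONST_0", "PUTFIELD", "RETURN");
        cv.visitEnd();
    }

    private void fireMethod(MiniClassVisitor cv, String name, String... opcodes) {
        MiniMethodVisitor mv = cv.visitMethod(name);
        if (mv == null) {
            return; // the consumer is not interested, so the body is never walked
        }
        for (String opcode : opcodes) {
            mv.visitInsn(opcode);
        }
        mv.visitEnd(); // end of this subcontext
    }
}

/** Consumer: logs every event and asks for the body of getCount only. */
final class EventLog implements MiniClassVisitor {
    final List<String> events = new ArrayList<>();

    public void visit(String className) {
        events.add("visit " + className);
    }

    public MiniFieldVisitor visitField(String name) {
        events.add("visitField " + name);
        return null; // fields are skipped
    }

    public MiniMethodVisitor visitMethod(final String name) {
        events.add("visitMethod " + name);
        if (!name.equals("getCount")) {
            return null;
        }
        return new MiniMethodVisitor() {
            public void visitInsn(String opcode) { events.add("insn " + opcode); }
            public void visitEnd() { events.add("end " + name); }
        };
    }

    public void visitEnd() {
        events.add("visitEnd");
    }
}

final class VisitorSketchDemo {
    public static void main(String[] args) {
        EventLog log = new EventLog();
        new MiniProducer().accept(log);
        for (String event : log.events) {
            System.out.println(event);
        }
    }
}
```

Running the demo logs the three instructions of getCount followed by its end event, while reset contributes only its visitMethod event. An adapter in this model would be a `MiniClassVisitor` that forwards each call to another `MiniClassVisitor`, optionally changing the arguments on the way.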