Today, let's introduce the JVM class loading mechanism. The main content is as follows:
- Overview
- Timing of class loading
- Process of class loading
- Class loaders
- Classification of class loaders
- Parent delegation model
Overview#
The JVM loads bytecode (.class) files into memory, then verifies, resolves, and initializes the data, finally producing Java types that the JVM can use directly. This is the class loading mechanism of the JVM.
In Java, types are loaded, linked, and initialized while the program is running. This adds some overhead at class loading time, but it buys a great deal of flexibility: the dynamic extension features of Java rely on dynamic loading and dynamic linking at runtime. Plugin technology, for example, uses custom class loaders to load and replace resources, which depends on this runtime class loading capability.
Timing of class loading#
From the moment a class is loaded into the JVM memory until it is unloaded from the JVM memory, the lifecycle of class loading is as shown in the following diagram:
The five stages of loading, verification, preparation, initialization, and unloading always begin in this fixed order. Resolution is the exception: it may begin after initialization, which is how the Java language supports runtime binding. Throughout the class loading process, each stage is normally triggered during the stage before it.
The JVM specification strictly defines when a class must be initialized, but it places no constraint on when loading begins; that is left to the JVM implementation. Loading, verification, and preparation must nevertheless take place before initialization.
So when does a class start initialization? The JVM specification strictly lists the following situations in which a class must be initialized:
- When a new, getstatic, putstatic, or invokestatic instruction is encountered and the class has not been initialized, the class must be initialized. These instructions correspond to instantiating an object with the new keyword, reading or setting a static field (except a static final field whose value is a compile-time constant), and calling a static method. You can verify this by inspecting the bytecode file with the javap command.
- When a class is used reflectively through java.lang.reflect and has not been initialized, the class must be initialized.
- When initializing a class, if its parent class has not been initialized, the parent class needs to be initialized first.
- When the JVM starts and the user specifies the main class to be started, such as a class with a main method, the JVM will initialize this class first.
- When using the dynamic language support introduced in JDK 1.7, if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle and the class corresponding to that handle has not been initialized, the class must be initialized first. MethodHandle can be understood as another form of reflection.
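The first two triggers in the list above can be observed with a small experiment. The following is only a sketch: the class names (InitTriggerDemo, ViaNew, and so on) are hypothetical and exist purely for illustration. Each nested class records its name from a static block, so the log shows which classes were actually initialized. The last case reads a static final compile-time constant, which javac inlines into the caller, so it does not trigger initialization.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;

public class InitTriggerDemo {
    // Filled only by static initializers, so it records initialization order.
    static final List<String> LOG = new ArrayList<>();

    static class ViaNew {
        static { LOG.add("ViaNew"); }
    }

    static class ViaStaticField {
        static int counter = 1;
        static { LOG.add("ViaStaticField"); }
    }

    static class ViaStaticMethod {
        static { LOG.add("ViaStaticMethod"); }
        static void ping() { }
    }

    static class ViaReflection {
        static { LOG.add("ViaReflection"); }
    }

    static class ConstantOnly {
        static final int MAX = 10; // compile-time constant, inlined by javac
        static { LOG.add("ConstantOnly"); }
    }

    public static List<String> run() {
        new ViaNew();                          // new instruction
        int counter = ViaStaticField.counter;  // getstatic instruction
        ViaStaticMethod.ping();                // invokestatic instruction
        try {
            // Obtaining the Constructor does not initialize the class;
            // creating an instance through java.lang.reflect does.
            Constructor<ViaReflection> ctor = ViaReflection.class.getDeclaredConstructor();
            ctor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        int max = ConstantOnly.MAX;            // constant is inlined: no initialization
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        // prints [ViaNew, ViaStaticField, ViaStaticMethod, ViaReflection]
        System.out.println(run());
    }
}
```

Running javap -c on the compiled InitTriggerDemo class should show the new, getstatic, and invokestatic instructions at the corresponding lines, and no reference to ConstantOnly at all.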
Process of class loading#
The following provides specific explanations for several stages of class loading.
Loading#
During loading, the class loader reads the bytecode of the class file into memory, converts this static data into runtime data structures in the method area, and creates the java.lang.Class object that corresponds to the class file in heap memory. This Class object is the entry point for accessing the class data in the method area.
The JVM specification does not specify the source of class files. Examples are as follows:
- Obtained from a zip package, eventually becoming the basis for jar and war formats.
- Obtained from the network, typical application is Applet.
- Generated at runtime, typical application is dynamic proxy technology. In java.lang.reflect.Proxy, ProxyGenerator.generateProxyClass is used to generate the binary bytecode streams of proxy classes, named in the form *$Proxy, for specific interfaces.
- Generated from other files, obtained from a database, etc.
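The "generated at runtime" case can be illustrated with the public java.lang.reflect.Proxy API. This is a minimal sketch rather than a description of how Proxy works internally; the Greeter interface and the class name RuntimeGeneratedClassDemo are hypothetical names chosen for the example. No Greeter implementation exists in source form: the implementing class is produced while the program runs.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class RuntimeGeneratedClassDemo {
    /** Hypothetical interface used only for this illustration. */
    public interface Greeter {
        String greet(String name);
    }

    public static Greeter newGreeter() {
        // The handler receives every call made on the generated proxy class.
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("greet")) {
                return "Hello, " + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));                   // prints Hello, JVM
        System.out.println(Proxy.isProxyClass(greeter.getClass())); // prints true
        System.out.println(greeter.getClass().getName());           // name of the generated class
    }
}
```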
The loading stage of a class and the linking stage that follows it are performed in an interleaved manner without clear boundaries. The loading stage may not be completed, but the linking stage may have already started. However, the start time of the two stages still maintains a fixed order.
Linking#
Linking includes three stages: verification, preparation, and resolution.
- Verification: Ensures that the information contained in the bytecode stream of the class file meets the requirements of the current virtual machine and does not harm the security of the virtual machine itself. Overall, the verification stage mainly includes file format verification, metadata verification, bytecode verification, and symbolic reference verification. The specific verification content can be checked in the Java Virtual Machine Specification.
- Preparation: Allocates memory in the method area for class variables (variables declared with the static keyword) and sets their initial values. The initial value is normally the zero value of the data type, not the value assigned in the code. For example, a static int starts at 0 and receives its assigned value only during initialization.
- Resolution: The JVM replaces the symbolic references in the constant pool with direct references. These are the same symbolic references that are checked by symbolic reference verification in the verification stage.
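The zero value set during preparation is usually invisible, but it can be observed when a static initializer reads a class variable before that variable's own assignment has run. The sketch below assumes nothing beyond standard Java; PreparationDemo and Holder are hypothetical names.

```java
public class PreparationDemo {
    static class Holder {
        // Initializers run in source order, so peek() executes before
        // the assignment to 'late' and sees the value set in preparation.
        static int early = peek();
        static int late = 42;

        static int peek() {
            return late;
        }
    }

    /** Returns {early, late} as observed after Holder is initialized. */
    public static int[] observe() {
        return new int[] { Holder.early, Holder.late };
    }

    public static void main(String[] args) {
        int[] values = observe();
        System.out.println(values[0]); // prints 0
        System.out.println(values[1]); // prints 42
    }
}
```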
Initialization#
The initialization stage is the last step of class loading. In the earlier stages, apart from the loading stage, where user-defined class loaders can take part, everything is driven by the JVM itself. Only in the initialization stage does the JVM actually start executing the Java code, that is, the bytecode, written in the class. Some points about class initialization are as follows:
- The initialization stage is the process of executing the class constructor <clinit>() method.
- The <clinit>() method is generated automatically by the compiler, which collects all assignments to class variables and the statements in static blocks (static {}). The collection order matches the order of the statements in the source file. As a result, a static block can assign to a variable declared after it but cannot read it.
- When initializing a class, if the parent class has not been initialized, the parent class is initialized first.
- The JVM ensures that the <clinit>() method of a class is correctly locked and synchronized in a multi-threaded environment.
- When accessing a static field of a Java class, only the class that declares it will be initialized.
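Two of the points above, "parent first" and "only the declaring class is initialized", can be checked with a short experiment. This is a sketch with hypothetical class names; each class logs its name from a static block, which the compiler folds into that class's <clinit>() method.

```java
import java.util.ArrayList;
import java.util.List;

public class ClinitOrderDemo {
    // Filled only by static initializers, so it records initialization order.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int shared = 7;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class OtherParent {
        static { LOG.add("OtherParent"); }
    }

    static class OtherChild extends OtherParent {
        static int own = 1;
        static { LOG.add("OtherChild"); }
    }

    public static List<String> run() {
        // 'shared' is declared in Parent, so reading it through Child
        // initializes Parent only.
        int viaChild = Child.shared;
        // 'own' is declared in OtherChild: its parent is initialized first.
        int own = OtherChild.own;
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        // prints [Parent, OtherParent, OtherChild]
        System.out.println(run());
    }
}
```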
Class loaders#
As the name suggests, a class loader is used to load Java classes into the JVM. Apart from the Bootstrap ClassLoader, all class loaders are instances of the java.lang.ClassLoader class. As mentioned earlier, class files are loaded into memory by class loaders during the loading stage. In other words, obtaining the binary bytecode stream that defines a class from its fully qualified name is the job of the class loader.
For any class, its uniqueness in the JVM is established by the class together with the class loader that loaded it. Each class loader has its own independent class namespace, which means that the same class file loaded by two different class loaders produces two classes that are not equal.
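The namespace rule can be demonstrated by defining the same bytes in two loaders. To keep the sketch self-contained, it builds the bytes of an empty class by hand instead of reading a .class file; the class name demo.Empty, the ByteLoader helper, and the choice of class file version 52 (Java 8) are all assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class LoaderNamespaceDemo {
    /** Builds the class file bytes of an empty public class extending Object. */
    static byte[] emptyClassBytes(String binaryName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);   // magic number
            out.writeShort(0);          // minor version
            out.writeShort(52);         // major version (Java 8)
            out.writeShort(5);          // constant_pool_count = entries + 1
            out.writeByte(7); out.writeShort(2);                          // #1 Class -> #2
            out.writeByte(1); out.writeUTF(binaryName.replace('.', '/')); // #2 Utf8
            out.writeByte(7); out.writeShort(4);                          // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");           // #4 Utf8
            out.writeShort(0x0021);     // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);          // this_class
            out.writeShort(3);          // super_class
            out.writeShort(0);          // interfaces_count
            out.writeShort(0);          // fields_count
            out.writeShort(0);          // methods_count
            out.writeShort(0);          // attributes_count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical helper that exposes the protected defineClass method. */
    static class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    public static boolean sameNameDifferentClass() {
        String name = "demo.Empty";
        byte[] bytes = emptyClassBytes(name);
        Class<?> first = new ByteLoader().define(name, bytes);
        Class<?> second = new ByteLoader().define(name, bytes);
        // Same fully qualified name, same bytes, but two distinct classes.
        return first.getName().equals(second.getName())
                && first != second
                && first.getClassLoader() != second.getClassLoader();
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentClass()); // prints true
    }
}
```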
Classification of class loaders#
From the perspective of the JVM, there are only two different class loaders:
- Bootstrap ClassLoader: Generally implemented in C++, specifically implemented by the JVM.
- Other class loaders: Implemented in Java, independent of the JVM, and all are instances of java.lang.ClassLoader, such as DexClassLoader in Android.
From the perspective of Java developers, class loaders can be divided into three categories:
- Bootstrap ClassLoader: Responsible for loading the class libraries that are located in JAVA_HOME\lib, or on the paths specified by the -Xbootclasspath parameter, and that are recognized by the JVM. Recognition is based on the file name only, so a class library with an unrecognized name is not loaded even if it is placed in the lib directory. The Bootstrap ClassLoader cannot be used directly by Java programs.
- Extension ClassLoader: Implemented by sun.misc.Launcher$ExtClassLoader and responsible for loading the classes under JAVA_HOME\lib\ext or on the paths specified by the java.ext.dirs system property. Developers can use the Extension ClassLoader directly.
- Application ClassLoader: Implemented by sun.misc.Launcher$AppClassLoader. Because it is the return value of the getSystemClassLoader() method in ClassLoader, it is generally referred to as the system class loader. It is responsible for loading the classes on the user class path (ClassPath). Developers can use this class loader directly. If the application does not define its own class loader, this is the default class loader in the program.
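The three loaders can be listed at runtime by starting from the system class loader and following getParent(). This is a small sketch with a hypothetical class name. The printed implementation class names depend on the JDK version: the sun.misc.Launcher names above apply to JDK 8 and earlier, while newer JDKs use different internal classes. In every case the Bootstrap ClassLoader is represented by null.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChainDemo {
    /** Returns the loader class names from the system loader up to bootstrap. */
    public static List<String> chain() {
        List<String> names = new ArrayList<>();
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The Bootstrap ClassLoader has no Java object; it shows up as null.
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chain()) {
            System.out.println(name);
        }
        // Core classes are loaded by the Bootstrap ClassLoader.
        System.out.println(String.class.getClassLoader()); // prints null
    }
}
```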
Parent delegation model#
Let's take a look at the relationship between the class loaders shown above:
The hierarchical relationship between class loaders shown in the above diagram is called the parent delegation model. The parent delegation model requires that except for the top-level Bootstrap ClassLoader, all other class loaders should have their own parent class loaders. The parent-child relationship between class loaders is generally not implemented through inheritance, but through composition to reuse the code of the parent loader. This approach is not a mandatory constraint model, but a recommended class loader implementation approach given by the Java designers.
So what is the workflow of the parent delegation model?
When a class loader receives a class loading request, it does not try to load the class itself right away. Instead, it delegates the request to its parent class loader. The same happens at every level, so every request is eventually passed up to the top-level Bootstrap ClassLoader. Only when the parent class loader reports that it cannot complete the request does the child class loader attempt to load the class itself. The loading logic in java.lang.ClassLoader is as follows:
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Check if the class has already been loaded
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                long t0 = System.nanoTime();
                try {
                    if (parent != null) {
                        // 2. If it has not been loaded, call the parent class loader to load it
                        c = parent.loadClass(name, false);
                    } else {
                        // 3. If the parent class loader does not exist, use the Bootstrap ClassLoader to load it directly
                        c = findBootstrapClassOrNull(name);
                    }
                } catch (ClassNotFoundException e) {
                    // ClassNotFoundException thrown if class not found
                    // from the non-null parent class loader
                }
                if (c == null) {
                    // If still not found, then invoke findClass in order
                    // to find the class.
                    long t1 = System.nanoTime();
                    // 4. If neither the parent class loader nor the Bootstrap ClassLoader has loaded the class,
                    // call this loader's own findClass method, that is, the child class loader's, to load the class
                    c = findClass(name);
                    sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                    sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                    sun.misc.PerfCounter.getFindClasses().increment();
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
Since JDK 1.2, java.lang.ClassLoader has provided the protected method findClass(). To write a custom class loader, override findClass() rather than loadClass(): loadClass() falls back to findClass() only after the parent loaders have failed to load the class, so the custom class loader still complies with the parent delegation rules.
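The effect of overriding only findClass() can be seen with a loader that records when it is asked. This is an illustrative sketch: RecordingLoader and the class name demo.plugin.Missing are hypothetical, and a real loader would read bytes and call defineClass() inside findClass() instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationDemo {
    /** Custom loader that overrides findClass only and records each call. */
    static class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            // A real implementation would locate the bytes and call defineClass.
            throw new ClassNotFoundException(name);
        }
    }

    public static List<String> run() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            // Resolved by the parent chain, so findClass is never reached.
            loader.loadClass("java.lang.String");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        try {
            // No parent can load this name, so loadClass falls back to findClass.
            loader.loadClass("demo.plugin.Missing");
        } catch (ClassNotFoundException expected) {
            // expected: this sketch never defines the class
        }
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [demo.plugin.Missing]
    }
}
```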
This concludes the introduction to the JVM class loading mechanism and class loaders. Class loaders are what make the dynamic extension feature of Java possible. They are also used on Android: PathClassLoader and DexClassLoader, both indirect subclasses of ClassLoader, take advantage of the fact that the source of class files is not restricted, and they are used in plugin technology to achieve app pluginization.