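Analysis of the Java Integer Cache

Basic Concepts of the Java Integer Cache

In Java, the Integer class is the wrapper class for the primitive type int. It gives int values an object representation and provides a number of utility methods. The Integer cache is an optimization that avoids allocating a new object each time a commonly used small value is boxed, which saves both time and memory.

Cache Range

By default, the Integer class caches the integers from -128 to 127 inclusive. When an Integer object is obtained through Integer.valueOf(int), a value inside this range is served from the cache instead of being wrapped in a newly created Integer object.

Code Example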

public class IntegerCacheExample {
    public static void main(String[] args) {
        Integer num1 = Integer.valueOf(100);
        Integer num2 = Integer.valueOf(100);

        System.out.println(num1 == num2); // prints true

        Integer num3 = Integer.valueOf(128);
        Integer num4 = Integer.valueOf(128);

        System.out.println(num3 == num4); // prints false with the default cache range
    }
}

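In the code above, num1 and num2 are both obtained through Integer.valueOf(100). Because 100 lies between -128 and 127, both variables refer to the same cached object, and num1 == num2 evaluates to true. For num3 and num4, 128 is outside the default cache range, so each call returns a different object and num3 == num4 evaluates to false.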
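The same rule applies to autoboxing, since javac compiles an assignment such as Integer x = 100 into a call to Integer.valueOf. The sketch below illustrates this; the class and method names are made up for the example, and only the results inside the -128 to 127 range are guaranteed.

```java
class AutoboxingCacheDemo {
    /** Boxes the value twice via autoboxing and reports whether both boxes are the same object. */
    static boolean boxesToSameObject(int value) {
        Integer first = value;   // javac emits Integer.valueOf(value) here
        Integer second = value;
        return first == second;  // reference comparison, on purpose
    }

    public static void main(String[] args) {
        System.out.println(boxesToSameObject(127));  // true: inside the guaranteed range
        System.out.println(boxesToSameObject(-128)); // true: inside the guaranteed range
        // Outside [-128, 127] the result depends on the JVM and its settings,
        // so no expected output is stated for this line.
        System.out.println(boxesToSameObject(128));
    }
}
```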

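How the Integer Cache Is Implemented

Storage Structure of the Cache

The Integer cache is implemented with a static array. The Integer class contains a static nested class, IntegerCache, which holds an array of Integer objects named cache. The listing below shows the class as it appears in the JDK 8 source; later releases differ in detail.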

private static class IntegerCache {
    static final int low = -128;
    static final int high;
    static final Integer cache[];

    static {
        // high value may be configured by property
        int h = 127;
        String integerCacheHighPropValue =
            sun.misc.VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
        if (integerCacheHighPropValue != null) {
            try {
                int i = parseInt(integerCacheHighPropValue);
                i = Math.max(i, 127);
                // Maximum array size is Integer.MAX_VALUE
                h = Math.min(i, Integer.MAX_VALUE - (-low) -1);
            } catch( NumberFormatException nfe) {
                // If the property cannot be parsed into an int, ignore it.
            }
        }
        high = h;

        cache = new Integer[(high - low) + 1];
        int j = low;
        for(int k = 0; k < cache.length; k++)
            cache[k] = new Integer(j++);

        // range [-128, 127] must be interned (JLS7 5.1.7)
        assert IntegerCache.high >= 127;
    }

    private IntegerCache() {}
}

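As the code shows, the size of the cache array is computed from high and low, which default to 127 and -128, giving 256 entries. The elements are created once, in the static initializer that runs when the IntegerCache class is initialized, and each element is the Integer object for one value in the range.

The valueOf Method

Integer.valueOf(int) is the usual way to obtain an Integer object, and its implementation is tied directly to the cache.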

public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}

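When valueOf(int) is called, it first checks whether the argument i lies between IntegerCache.low and IntegerCache.high. If it does, the method returns the element at index i - low of IntegerCache.cache; otherwise it creates and returns a new Integer object.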
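The index arithmetic can be made concrete with a small stand-alone model of the cache. This is a simplified sketch, not the JDK code: SmallIntCache, indexOf, and box are hypothetical names, the range is fixed instead of configurable, and values outside the range are simply delegated to Integer.valueOf.

```java
class SmallIntCache {
    static final int LOW = -128;
    static final int HIGH = 127;

    // One slot per value in [LOW, HIGH]; slot k holds the box for LOW + k.
    private static final Integer[] CACHE = new Integer[HIGH - LOW + 1];

    static {
        for (int k = 0; k < CACHE.length; k++) {
            CACHE[k] = Integer.valueOf(LOW + k);
        }
    }

    /** Maps a value to its array slot: -128 -> 0, 0 -> 128, 127 -> 255. */
    static int indexOf(int value) {
        return value - LOW;
    }

    /** Returns the shared box for in-range values and delegates to the JDK otherwise. */
    static Integer box(int value) {
        if (value >= LOW && value <= HIGH) {
            return CACHE[indexOf(value)];
        }
        return Integer.valueOf(value);
    }

    static int size() {
        return CACHE.length;
    }

    public static void main(String[] args) {
        System.out.println(size());           // 256
        System.out.println(indexOf(-128));    // 0
        System.out.println(indexOf(127));     // 255
        System.out.println(box(5) == box(5)); // true: both calls return the same slot
    }
}
```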

Impact of the Cache on Performance and Memory

Performance Improvement

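The cache can improve performance. When Integer objects for small values are requested frequently, reusing cached instances avoids repeated allocation and reduces pressure on the garbage collector. For example, code that boxes a small counter value over and over receives the same cached object each time instead of a new one. How large the gain is in practice depends on the JVM, because the JIT compiler can often remove short-lived allocations altogether.

public class PerformanceTest {
    // Accumulates every boxed value so the JIT cannot discard the loops as dead code.
    static long sink;

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        for (int i = 0; i < 1000000; i++) {
            Integer num = Integer.valueOf(100); // inside the cache range
            sink += num;
        }
        long endTime = System.nanoTime();
        System.out.println("Time taken for cached values: " + (endTime - startTime) / 1000 + " us");

        startTime = System.nanoTime();
        for (int i = 0; i < 1000000; i++) {
            Integer num = Integer.valueOf(128); // outside the default cache range
            sink += num;
        }
        endTime = System.nanoTime();
        System.out.println("Time taken for non-cached values: " + (endTime - startTime) / 1000 + " us");
        System.out.println("Checksum: " + sink);
    }
}

The code above compares the time spent obtaining Integer objects inside and outside the cache range. Treat the numbers as a rough indication only: the first loop also pays for JIT warm-up, and escape analysis may eliminate the allocations in the second loop, so a single run can show a small difference or even the opposite ordering. A benchmark harness such as JMH gives more reliable measurements.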

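Memory Savings

The cache also helps save memory, because each Integer object in the cache range is created once and then reused instead of being created again on every request. In a large application that uses many small integers, the corresponding Integer objects would otherwise occupy a considerable amount of memory. With the cache, each small value needs only one Integer instance no matter how many references point to it.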
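One way to observe the reuse is to count distinct object identities rather than distinct values. The sketch below does this with an identity-based set; DistinctBoxCounter and its methods are hypothetical names for this illustration, and the count for values outside the cache range is left unstated because it depends on the JVM configuration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class DistinctBoxCounter {
    /** Counts how many different objects (by identity, not by value) the list refers to. */
    static int countDistinctInstances(List<Integer> boxes) {
        Set<Integer> identities =
                Collections.newSetFromMap(new IdentityHashMap<Integer, Boolean>());
        identities.addAll(boxes);
        return identities.size();
    }

    /** Boxes count values that cycle through base, base + 1, ..., base + 9. */
    static List<Integer> boxCycling(int base, int count) {
        List<Integer> boxes = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            boxes.add(Integer.valueOf(base + i % 10));
        }
        return boxes;
    }

    public static void main(String[] args) {
        // Values 0..9 are always cached, so 10,000 references share 10 objects.
        System.out.println(countDistinctInstances(boxCycling(0, 10_000))); // 10
        // Values 1000..1009 are outside the default range; the count depends on the JVM settings.
        System.out.println(countDistinctInstances(boxCycling(1000, 10_000)));
    }
}
```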

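Adjusting the Cache Range

Adjusting via a System Property

In some cases the cache range needs to be adjusted. Java allows the upper bound to be raised through the system property java.lang.Integer.IntegerCache.high; values below 127 are ignored, and the lower bound of -128 is fixed. For example, to raise the upper bound to 255, pass the property with the -D option when starting the program: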

java -Djava.lang.Integer.IntegerCache.high=255 YourMainClass

Code Example

public class AdjustCacheRange {
    public static void main(String[] args) {
        Integer num1 = Integer.valueOf(200);
        Integer num2 = Integer.valueOf(200);

        // Prints false with the default range, and true when the JVM was
        // started with -Djava.lang.Integer.IntegerCache.high=255
        System.out.println(num1 == num2);
    }
}

The cache is sized once, when IntegerCache is initialized, so the range cannot change while the program is running; the two outcomes come from two separate runs of the same program. Without the property, 200 is outside the default cache range, num1 and num2 are different objects, and num1 == num2 is false. When the program is started with the upper bound raised to 255, 200 falls inside the cache range, both variables refer to the same cached object, and num1 == num2 is true.
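To check which upper bound is actually in effect for a given run, the range can be probed from the outside by testing reference equality for increasing values. This is a rough sketch under the assumption that values outside the cache are boxed into fresh objects, as in the valueOf implementation shown earlier; CacheRangeProbe and probeCacheHigh are illustrative names.

```java
class CacheRangeProbe {
    /**
     * Returns the largest v in [127, limit] such that every value from 127 up to v
     * boxes to a shared instance, found by checking reference equality.
     */
    static int probeCacheHigh(int limit) {
        int v = 127; // caching up to 127 is guaranteed
        while (v < limit && Integer.valueOf(v + 1) == Integer.valueOf(v + 1)) {
            v++;
        }
        return v;
    }

    public static void main(String[] args) {
        // Run once without options and once with
        // -Djava.lang.Integer.IntegerCache.high=255 to compare the results.
        System.out.println("Observed cache upper bound: " + probeCacheHigh(100_000));
    }
}
```

The limit argument only keeps the loop bounded if a very large cache has been configured.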

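The Cache in Different Scenarios

Collections

Collections such as ArrayList and HashSet store references to Integer objects, so the cache matters here as well: when the stored values fall inside the cache range, every occurrence of a value refers to the same cached instance, and no additional Integer objects are allocated for the elements. The cache does not change how collections behave, however. HashSet and HashMap compare elements with hashCode and equals, so the number of elements and the result of a lookup are the same whether or not the boxes are shared.

import java.util.HashSet;
import java.util.Set;

public class SetWithIntegerCache {
    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        for (int i = -128; i <= 127; i++) {
            set.add(Integer.valueOf(i));
        }
        System.out.println(set.size()); // prints 256: one element per distinct value; all are cached instances

        Set<Integer> set2 = new HashSet<>();
        for (int i = 128; i <= 256; i++) {
            set2.add(Integer.valueOf(i));
        }
        System.out.println(set2.size()); // prints 129: one element per distinct value; by default these are newly created instances
    }
}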
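Because hash-based collections rely on equals and hashCode, a lookup with a separately boxed key succeeds even for values far outside the cache range. The following sketch demonstrates this; the class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CollectionsIgnoreCache {
    /** Adds one box and looks it up with a second, separately boxed key of the same value. */
    static boolean containsByValue(int value) {
        Set<Integer> set = new HashSet<>();
        set.add(Integer.valueOf(value));
        return set.contains(Integer.valueOf(value));
    }

    /** Stores an entry under one box and reads it back through another box. */
    static String lookup(int key) {
        Map<Integer, String> map = new HashMap<>();
        map.put(Integer.valueOf(key), "found");
        return map.get(Integer.valueOf(key));
    }

    public static void main(String[] args) {
        System.out.println(containsByValue(100));     // true
        System.out.println(containsByValue(100_000)); // true: equals compares values, not references
        System.out.println(lookup(100_000));          // found
    }
}
```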

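Serialization and Deserialization

Serialization interacts with the Integer cache in a way that is easy to get wrong. Deserialization does not go through Integer.valueOf: ObjectInputStream creates a new Integer instance and restores its value. As a result, the deserialized object is a different instance from the cached one even when its value is inside the cache range, and only equals reports the two as equal.

import java.io.*;

public class IntegerSerialization {
    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Integer num1 = Integer.valueOf(100);
        FileOutputStream fileOut = new FileOutputStream("integer.ser");
        ObjectOutputStream out = new ObjectOutputStream(fileOut);
        out.writeObject(num1);
        out.close();
        fileOut.close();

        FileInputStream fileIn = new FileInputStream("integer.ser");
        ObjectInputStream in = new ObjectInputStream(fileIn);
        Integer num2 = (Integer) in.readObject();
        in.close();
        fileIn.close();

        System.out.println(num1 == num2);      // prints false: readObject creates a new instance
        System.out.println(num1.equals(num2)); // prints true: the values are equal
    }
}

In the code above, the Integer object num1 (value 100) is serialized to a file and then deserialized as num2. Although 100 is inside the cache range, num2 is a new object rather than the cached instance, so num1 == num2 is false while num1.equals(num2) is true. Calling Integer.valueOf(num2.intValue()) returns the cached instance again.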
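The round trip can also be tried in memory, without a file. The sketch below serializes an Integer to a byte array, reads it back, and then re-boxes the result through Integer.valueOf to obtain the cached instance; IntegerRoundTrip and roundTrip are hypothetical names, and the identity result reflects the behavior of the standard ObjectInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class IntegerRoundTrip {
    /** Serializes the value to a byte array and deserializes it again. */
    static Integer roundTrip(Integer value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Integer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("in-memory round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Integer original = Integer.valueOf(100);
        Integer copy = roundTrip(original);

        System.out.println(copy.equals(original)); // true: same value
        System.out.println(copy == original);      // false: the copy bypassed the cache

        // Re-boxing the primitive value goes through the cache again.
        Integer canonical = Integer.valueOf(copy.intValue());
        System.out.println(canonical == original); // true
    }
}
```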

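Comparison with the Caches of Other Wrapper Classes

Byte Cache

The Byte class also has a cache, and it covers every possible byte value, -128 to 127. Because the range of byte is fixed and small, caching all values is reasonable, and valueOf(byte) needs no range check. The implementation resembles that of Integer: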

public static Byte valueOf(byte b) {
    final int offset = 128;
    return ByteCache.cache[(int)b + offset];
}
private static class ByteCache {
    private ByteCache(){}

    static final Byte cache[] = new Byte[-(-128) + 127 + 1];

    static {
        for(int i = 0; i < cache.length; i++)
            cache[i] = new Byte((byte)(i - 128));
    }
}

Short Cache

The Short class also caches the values from -128 to 127. Its valueOf(short) method and cache follow the same pattern as Integer, except that the range is fixed:

public static Short valueOf(short s) {
    final int offset = 128;
    int sAsInt = s;
    if (sAsInt >= -128 && sAsInt <= 127) { // must cache
        return ShortCache.cache[sAsInt + offset];
    }
    return new Short(s);
}
private static class ShortCache {
    private ShortCache(){}

    static final Short cache[] = new Short[-(-128) + 127 + 1];

    static {
        for(int i = 0; i < cache.length; i++)
            cache[i] = new Short((short)(i - 128));
    }
}

Long Cache

The Long class caches the values from -128 to 127 as well, and its cache works on the same principle as those of Integer and Short:

public static Long valueOf(long l) {
    final int offset = 128;
    if (l >= -128 && l <= 127) { // will cache
        return LongCache.cache[(int)l + offset];
    }
    return new Long(l);
}
private static class LongCache {
    private LongCache(){}

    static final Long cache[] = new Long[-(-128) + 127 + 1];

    static {
        for(int i = 0; i < cache.length; i++)
            cache[i] = new Long(i - 128);
    }
}

Character Cache

The Character class caches the characters with code values 0 to 127. Its valueOf(char) method is implemented as follows:

public static Character valueOf(char c) {
    if (c <= 127) { // must cache
        return CharacterCache.cache[(int)c];
    }
    return new Character(c);
}
private static class CharacterCache {
    private CharacterCache(){}

    static final Character cache[] = new Character[127 + 1];

    static {
        for (int i = 0; i < cache.length; i++)
            cache[i] = new Character((char)i);
    }
}

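The comparison shows that each wrapper class chooses a cache range suited to the characteristics and value range of its data type, in every case to improve performance and save memory.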
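The ranges listed above can be checked at their boundaries with reference comparisons. This is a small sketch with made-up helper names; results for values outside the cached ranges are printed without a stated expectation, since they depend on the JVM.

```java
class WrapperCacheCheck {
    static boolean sameByte(byte b)   { return Byte.valueOf(b) == Byte.valueOf(b); }
    static boolean sameShort(short s) { return Short.valueOf(s) == Short.valueOf(s); }
    static boolean sameLong(long l)   { return Long.valueOf(l) == Long.valueOf(l); }
    static boolean sameChar(char c)   { return Character.valueOf(c) == Character.valueOf(c); }

    public static void main(String[] args) {
        System.out.println(sameByte((byte) -128));  // true: every byte value is cached
        System.out.println(sameShort((short) 127)); // true: upper end of the cached range
        System.out.println(sameLong(-128L));        // true: lower end of the cached range
        System.out.println(sameChar('A'));          // true: 'A' is 65, inside 0..127

        // Outside the cached ranges; results depend on the JVM, so none are stated.
        System.out.println(sameShort((short) 128));
        System.out.println(sameLong(128L));
        System.out.println(sameChar('\u00e9'));
    }
}
```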

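Caveats and Common Mistakes

Comparing Primitive and Wrapper Types

When working with the Integer cache, pay attention to comparisons between the primitive type int and the wrapper type Integer. For example: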

public class ComparisonExample {
    public static void main(String[] args) {
        Integer num1 = Integer.valueOf(100);
        int num2 = 100;

        System.out.println(num1 == num2); // prints true: num1 is unboxed and the values are compared

        Integer num3 = Integer.valueOf(128);
        int num4 = 128;

        System.out.println(num3 == num4); // prints true: num3 is unboxed, so the cache plays no role
    }
}

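In the code above, both num1 == num2 and num3 == num4 are true, because Java unboxes the Integer operand to an int and compares the values. When both operands are Integer objects, == compares references, and the cache determines the result, as the earlier examples show.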
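One consequence of this implicit unboxing is that a mixed comparison throws a NullPointerException when the Integer operand is null. The sketch below shows the failure and a guarded alternative; UnboxingPitfall and its methods are hypothetical helpers written for this illustration.

```java
class UnboxingPitfall {
    /** Compares a possibly-null Integer with an int without risking a NullPointerException. */
    static boolean equalsInt(Integer boxed, int primitive) {
        // The right-hand side unboxes boxed, but only after the null check has passed.
        return boxed != null && boxed == primitive;
    }

    /** Reports whether the plain comparison boxed == primitive throws for this input. */
    static boolean plainComparisonThrows(Integer boxed, int primitive) {
        try {
            boolean ignored = boxed == primitive; // unboxes boxed via intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(equalsInt(128, 128));            // true: values are compared
        System.out.println(equalsInt(null, 0));             // false: null never equals an int
        System.out.println(plainComparisonThrows(null, 0)); // true: unboxing null fails
        System.out.println(plainComparisonThrows(5, 5));    // false
    }
}
```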

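The Cache and Multithreading

The Integer cache is safe to use from multiple threads. The cache array is filled in the static initializer of IntegerCache, which the JVM runs exactly once with the necessary synchronization, and the cached Integer objects are immutable. Several threads can therefore obtain cached Integer objects at the same time without any thread-safety problem. Compound operations are a different matter: an Integer object cannot be modified, but a shared variable or field that holds one can be reassigned, and such updates still require synchronization.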

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadSafetyWithIntegerCache {
    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(10);
        for (int i = 0; i < 10; i++) {
            executorService.submit(() -> {
                Integer num = Integer.valueOf(100);
                // Only reads an object from the cache, which is safe from multiple threads
                System.out.println(Thread.currentThread().getName() + " : " + num);
            });
        }
        executorService.shutdown();
    }
}

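The code above obtains a cached Integer object from several threads; because the cache is fully initialized before it is used and its entries are immutable, no thread-safety problem arises.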
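To confirm that all threads really receive the same object, the boxes they obtain can be collected in an identity-based set. This is a minimal sketch: SharedInstanceAcrossThreads and distinctInstancesSeen are illustrative names, and the pool size and timeout are arbitrary choices.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedInstanceAcrossThreads {
    /** Boxes the value in several tasks and counts how many distinct objects were returned. */
    static int distinctInstancesSeen(int value, int tasks) {
        // Identity-based set, wrapped so that concurrent add calls are safe.
        Set<Integer> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<Integer, Boolean>()));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> seen.add(Integer.valueOf(value)));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstancesSeen(100, 50)); // 1: every task got the cached instance
    }
}
```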

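Avoid Relying on the Cache

Developers should not depend on the cache for correctness. The cache often improves performance and saves memory, but code whose logic relies on the identity of cached objects can break when the cache range differs between environments, for example because the upper bound was changed through the system property. To compare two Integer objects, use the equals method rather than the == operator, so that the code is correct under any cache setting.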

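public class AvoidCacheDependency {
    public static void main(String[] args) {
        Integer num1 = Integer.valueOf(200);
        Integer num2 = Integer.valueOf(200);

        System.out.println(num1.equals(num2)); // recommended: always compares values, prints true

        // Depends on the cache range: false by default, true if the upper bound is raised to 200 or more
        System.out.println(num1 == num2);
    }
}

In the code above, num1.equals(num2) always compares the values of the two Integer objects correctly, whereas the result of num1 == num2 changes with the cache range. The value 200 is used because values from -128 to 127 are always cached, so == would never reveal the problem for them.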
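When either operand may be null, java.util.Objects.equals offers a value comparison that neither depends on the cache nor throws. The helper below is a hypothetical wrapper shown only to illustrate the idea.

```java
import java.util.Objects;

class ValueComparison {
    /** Null-safe value comparison that does not depend on whether the boxes are cached. */
    static boolean sameValue(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(sameValue(100, 100));         // true
        System.out.println(sameValue(100_000, 100_000)); // true, although far outside the cache range
        System.out.println(sameValue(null, 100));        // false
        System.out.println(sameValue(null, null));       // true

        // For ordering rather than equality, Integer.compare works on primitive values.
        System.out.println(Integer.compare(100_000, 99_999) > 0); // true
    }
}
```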

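A solid understanding of how the Integer cache works, where it applies, how it compares with the caches of the other wrapper classes, and which pitfalls it brings makes it easier to benefit from the cache in Java development and to write faster, more robust programs. In practice, let the cache do its work without building program logic on it, and follow sound conventions in multithreaded code and in object comparisons so that the code stays correct and reliable.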