Android 中的 SpareArray 和 ArrayMap 实现分析

JI,XIAOYONG...大约 7 分钟

日常开发中，常用的存储键值对的数据结构是HashMap，根据Java 笔记之 HashMap 保存数据open in new window和Java 笔记之计算 Java 对象的大小及其应用open in new window可以知道，HashMap存储键值对会占用比较多的内存控件，而对于内存限制较大的 Android 平台来说，为了避免这种浪费，官方推荐我们使用SpareArray和ArrayMap，本文对这两个类的实现进行分析比较。

SpareArray以及他的衍生类都是以基本类型为key，因为避免了自动装箱，并且用数组直接保存 key、value（而非像HashMap那样将其封装为Node对象后再保存），因而节省了内存。

ArrayMap则支持所有类型的 key，他是将key和value全部保存在一个数组中（n位为key，n+1位为value），避免了将其封装为Node对象带来的内存消耗。

当要保存的数据量比较小（小于几千个）的时候，如果 KEY 是基本类型，推荐使用SparseArray及其衍生类以节省内存，如果 KEY 是其他类型则使用ArrayMap;否则使用HashMap更加高效。

SpareArray

SpareArray以及他的衍生类主要用于以基本类型为key保存非大量数据的场景。

相比HashMap而言，他的优点主要在于没有对保存的数据二次封装，没有对基本类型的数据自动装箱，存储单个数据的成本小，也没有hash计算。

但他在添加数据时需要扩展数组 (涉及到新建、复制数组，gc()等)，~~在删除数据时需要缩减数组~~ (查看gc()等源码发现他的数组只会增加，不会缩减)，以及通过二分法查找索引都会消耗性能。

为了避免每次删除时都需要缩减数组，SpareArray在删除数组时只会将其赋值为DELETED，在下次调用其private void gc()方法时丢弃掉这些数据

先看一下SpareArray的结构：

public class SparseArray<E> implements Cloneable {
    private boolean mGarbage = false; //是否调用 gc() 方法

    private int[] mKeys;//所有的 key
    private Object[] mValues;//所有的 value
    private int mSize;//所保存的数据个数
}

void put(int key, E value)

添加方法先用二分法查找key对应的位置：

如果有，则直接覆盖
如果没有，则取反得到应该插入的位置，并分别插入key和value

public void put(int key, E value) {
  // 先用二分法查找 key 对应的索引，找到的话返回对应索引，
  // 否则返回 key 应该插入的位置的取反值
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

    if (i >= 0) {
      // 如果已存在值则直接覆盖
        mValues[i] = value;
    } else {
      // 对二分法查找到的值再取反，得到 key 应该插入的位置
        i = ~i;

        if (i < mSize && mValues[i] == DELETED) {
            mKeys[i] = key;
            mValues[i] = value;
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();

            // Search again because indices may have changed.
            i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
        }

      // public static <T> T[] insert(T[] array, int currentSize, int index, T element)
      // Inserts an element into the array at the specified index,
      // growing the array if there is no more room.
        mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
        mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
        mSize++;
    }
}

E get(int key)

获取数据，先用二分法查找，如果找到就返回对应的值，否则返回null。

public E get(int key) {
        return get(key, null);
}

public E get(int key, E valueIfKeyNotFound) {
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

    if (i < 0 || mValues[i] == DELETED) {
        return valueIfKeyNotFound;
    } else {
        return (E) mValues[i];
    }
}

void remove(int key)

删除key以及对应的数据。

同样先用二分法查找对应位置，有的话则标记为DELETED，等待下次gc()时丢弃。

public void remove(int key) {
        delete(key);
}

public void delete(int key) {
    int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

    if (i >= 0) {
        if (mValues[i] != DELETED) {
            mValues[i] = DELETED;
            mGarbage = true;
        }
    }
}

gc()

在上文中我们看到，删除数据时，mGarbage被标记为true，这样当下一次进行put/valueAt/append/size等涉及到数组大小查询、改动等时，就出触发gc()以便整理数组结构。

private void gc() {
    // Log.e("SparseArray", "gc start with " + mSize);

    int n = mSize;
    int o = 0;
    int[] keys = mKeys;
    Object[] values = mValues;

    for (int i = 0; i < n; i++) {
        Object val = values[i];

      // 这里的操作只是将没有被删除的数据移动到了数组的前面
      // 而保证了数组后面都是 DELETED 或 null，方便后续操作
        if (val != DELETED) {
            if (i != o) {
                keys[o] = keys[i];
                values[o] = val;
                values[i] = null;
            }

            o++;
        }
    }

    mGarbage = false;
    mSize = o;
}

HashMap 与 SpareArray 及其衍生类对应关系

参考下图open in new window

ArrayMap

ArrayMap实现了Map<K, V>接口，他的 API 和HashMap相差无几，但是由于没有对数据再包装，动态调整数组的大小，一定范围内他比HashMap内存效率高。

但是如果保存大量数据（超过千位）时，由于他需要二分法查找的影响会比HashMap慢很多。

ArrayMap特殊之处在于将key，value保存到了同一个数组 mArray 中（n 位保存 key，n+1 位保存 value）。

先看一下ArrayMap的结构：

static Object[] mBaseCache;
static int mBaseCacheSize;
static Object[] mTwiceBaseCache;
static int mTwiceBaseCacheSize;

final boolean mIdentityHashCode;//是否强制使用 System.identityHashCode(key) 获取 key 的 HashCode
//System.identityHashCode(key) 方法无论类是否重写了 hashCode() 方法，
//都会调用 Object.identityHashCode(key) 来获取对象的 hashCode
int[] mHashes;//存储所有 key 的 hash 值
Object[] mArray;//存储 key 和 value，大小是 mHashes 的两倍
//n 位保存 key，n+1 位保存 value
int mSize;
MapCollections<K, V> mCollections;

在使用时：

计算key的hash值，

hash = mIdentityHashCode ? System.identityHashCode(key) : key.hashCode();

然后使用indexOf()在mHashes中进行二分法查找对应的index

index = indexOf(key, hash);

indexOf()方法会先用二分法查找hash对应的index,如果index<0则返回index；否则在对比mArray中对应位置mArray[index<<1]的key与要查询的key：

两者一致：返回index
两者不一致：从index开始，先向后，再向前查询是否有相同的key,如果有返回对应index
以上都没有找到：对mHashes中最后一个与key的hash一致的后一位index取反，并返回

int indexOf(Object key, int hash) {
    final int N = mSize;

    // Important fast case: if nothing is in here, nothing to look for.
    if (N == 0) {
        return ~0;
    }

    int index = binarySearchHashes(mHashes, N, hash);

    // If the hash code wasn't found, then we have no entry for this key.
    if (index < 0) {
        return index;
    }

    // If the key at the returned index matches, that's what we want.
    if (key.equals(mArray[index<<1])) {
        return index;
    }

    // Search for a matching key after the index.
    int end;
    for (end = index + 1; end < N && mHashes[end] == hash; end++) {
        if (key.equals(mArray[end << 1])) return end;
    }

    // Search for a matching key before the index.
    for (int i = index - 1; i >= 0 && mHashes[i] == hash; i--) {
        if (key.equals(mArray[i << 1])) return i;
    }

    // Key not found -- return negative value indicating where a
    // new entry for this key should go.  We use the end of the
    // hash chain to reduce the number of array entries that will
    // need to be copied when inserting.
    return ~end;
}

V put(K key, V value)

当添加item时，按照前述规则，先在mArray中查找key对应的索引index：

index >= 0 ：已经有键为key的数据，直接覆盖旧值并返回

index < 0 ：没有键为key的数据，对数组进行扩容，并保存对应数据

index = ~index;//上文 indexOf() 中计算得出的 key 应该添加的位置
mHashes[index] = hash;
mArray[index<<1] = key;
mArray[(index<<1)+1] = value;

V get(Object key)

get()方法就比较简单了，先查找key的索引，然后取出对应的数据value并返回即可：

    public V get(Object key) {
        final int index = indexOfKey(key);
        return index >= 0 ? (V)mArray[(index<<1)+1] : null;
    }

V remove(Object key)

remove()方法也会先使用indexOfKey()计算key的index，然后删除对应位置的数据。

此外，如果mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3的话，还会缩减数组的大小为osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2)：

public V removeAt(int index) {
  //...其他代码
  if(mHashes.length > (BASE_SIZE*2) && mSize < mHashes.length/3){
    //...其他代码
    final int n = osize > (BASE_SIZE*2) ? (osize + (osize>>1)) : (BASE_SIZE*2);
    allocArrays(n);
  }
  //...其他代码
}

 private void allocArrays(final int size) {
   //...其他代码
   mHashes = new int[size];
   mArray = new Object[size<<1];
 }