[ Is there an alternative to AtomicReferenceArray for large amounts of data? ]
I have a large amount of data that I'm currently storing in an AtomicReferenceArray<X> and processing from a large number of threads concurrently.
Each element is quite small, and I've just reached the point where I'm going to have more than Integer.MAX_VALUE entries. Unfortunately, Lists and arrays in Java are limited to Integer.MAX_VALUE (or just less) elements. I do have enough memory to hold a larger structure: the machine has about 250 GB of memory and runs a 64-bit JVM.
Is there a replacement for AtomicReferenceArray<X> that is indexed by longs? (Otherwise I'm going to have to create my own wrapper that stores several smaller AtomicReferenceArrays and maps long accesses to int accesses in the smaller ones.)
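For reference, the fallback wrapper mentioned in the parentheses could look roughly like the sketch below. The class name `LongAtomicReferenceArray` and the `chunkShift` parameter are hypothetical, chosen only for illustration; each chunk holds 2^chunkShift elements, and the long index is split into a chunk number and an int offset within that chunk.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical wrapper: a long-indexed view over several AtomicReferenceArrays.
final class LongAtomicReferenceArray<E> {
    private final int chunkShift;   // log2 of the number of elements per chunk
    private final int chunkMask;    // low bits of the index select the offset in a chunk
    private final long length;
    private final AtomicReferenceArray<E>[] chunks;

    @SuppressWarnings("unchecked")
    LongAtomicReferenceArray(long length, int chunkShift) {
        if (length < 0 || chunkShift < 1 || chunkShift > 30) {
            throw new IllegalArgumentException("bad length or chunkShift");
        }
        this.length = length;
        this.chunkShift = chunkShift;
        this.chunkMask = (1 << chunkShift) - 1;
        long chunkSize = 1L << chunkShift;
        int chunkCount = Math.toIntExact((length + chunkSize - 1) >>> chunkShift);
        this.chunks = new AtomicReferenceArray[chunkCount];
        for (int i = 0; i < chunkCount; i++) {
            // The last chunk may be shorter than the others.
            long remaining = length - ((long) i << chunkShift);
            chunks[i] = new AtomicReferenceArray<>((int) Math.min(chunkSize, remaining));
        }
    }

    long length() {
        return length;
    }

    E get(long index) {
        return chunk(index).get(offset(index));
    }

    void set(long index, E value) {
        chunk(index).set(offset(index), value);
    }

    boolean compareAndSet(long index, E expected, E value) {
        return chunk(index).compareAndSet(offset(index), expected, value);
    }

    // High bits of the index select the chunk.
    private AtomicReferenceArray<E> chunk(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks[(int) (index >>> chunkShift)];
    }

    // Low bits of the index select the slot inside the chunk.
    private int offset(long index) {
        return (int) (index & chunkMask);
    }
}
```

With `chunkShift = 30` each chunk holds about a billion references, so a handful of chunks covers more than Integer.MAX_VALUE entries. Every element is still an ordinary heap object, so this changes the indexing only, not the GC cost.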
Sounds like it is time to use native memory. Having 4+ billion objects on the heap is going to cause some dramatic GC pause times. If you use native (off-heap) memory instead, you can do this with almost no impact on the heap. You can also use memory-mapped files to support faster restarts and to share the data between JVMs.
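As a rough illustration of the off-heap idea, here is a minimal sketch that assumes each element can be encoded as a fixed-size value (a single long here) rather than kept as an object reference. The class name `OffHeapAtomicLongArray` and the `chunkShift` parameter are hypothetical. It uses direct ByteBuffers plus a ByteBuffer view VarHandle (Java 9+) for volatile and compare-and-set access; it is not how the libraries named in this answer are implemented.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: a long-indexed array of longs stored outside the Java heap.
final class OffHeapAtomicLongArray {
    // Views a ByteBuffer as longs; the int coordinate is a byte offset.
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private final int chunkShift;   // log2 of the number of longs per chunk
    private final int chunkMask;
    private final long length;
    private final ByteBuffer[] chunks;

    OffHeapAtomicLongArray(long length, int chunkShift) {
        // 2^27 longs * 8 bytes = 1 GiB, which stays below the int limit of one buffer.
        if (length < 0 || chunkShift < 1 || chunkShift > 27) {
            throw new IllegalArgumentException("bad length or chunkShift");
        }
        this.length = length;
        this.chunkShift = chunkShift;
        this.chunkMask = (1 << chunkShift) - 1;
        long chunkSize = 1L << chunkShift;
        int chunkCount = Math.toIntExact((length + chunkSize - 1) >>> chunkShift);
        this.chunks = new ByteBuffer[chunkCount];
        for (int i = 0; i < chunkCount; i++) {
            long remaining = length - ((long) i << chunkShift);
            long elements = Math.min(chunkSize, remaining);
            // Direct buffers live in native memory and start out zero-filled.
            chunks[i] = ByteBuffer.allocateDirect((int) (elements * Long.BYTES));
        }
    }

    long length() {
        return length;
    }

    long get(long index) {
        return (long) LONG_VIEW.getVolatile(chunk(index), byteOffset(index));
    }

    void set(long index, long value) {
        LONG_VIEW.setVolatile(chunk(index), byteOffset(index), value);
    }

    boolean compareAndSet(long index, long expected, long value) {
        return LONG_VIEW.compareAndSet(chunk(index), byteOffset(index), expected, value);
    }

    private ByteBuffer chunk(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks[(int) (index >>> chunkShift)];
    }

    // Byte offset of the element inside its chunk; always a multiple of 8.
    private int byteOffset(long index) {
        return (int) (index & chunkMask) << 3;
    }
}
```

Only the small array of buffer objects lives on the heap, so the garbage collector has almost nothing to trace. For a structure of this size the JVM's direct memory limit (`-XX:MaxDirectMemorySize`) would likely need raising, and swapping `allocateDirect` for `FileChannel.map` regions would be one way to get the memory-mapped variant.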
I'm not sure what your specific needs are, but there are a number of open source data structures that do this, such as HugeArray, Chronicle Queue and Chronicle Map. With these you can create an array which is 1 TB but uses almost no heap and has no GC impact.
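BTW, for each object you create there is an 8-byte reference and a 16-byte header (the sizes you get on a 64-bit JVM with a heap too large for compressed references, as here). By using native memory you can save those 24 bytes per object, e.g. 4 bn * 24 bytes is 96 GB of memory.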