Class BigArrays
- java.lang.Object
-
- it.unimi.dsi.fastutil.BigArrays
-
public class BigArrays extends java.lang.Object
A class providing static methods and objects that do useful things with big arrays.Introducing big arrays
A big array is an array-of-arrays representation of an array. The length of a big array is bounded by
SEGMENT_SIZE
*Integer.MAX_VALUE
= 134217728 * (231 − 1) rather thanInteger.MAX_VALUE
. The type of a big array is that of an array-of-arrays, so a big array of integers is of typeint[][]
. Note thatSEGMENT_SIZE
has been chosen so that a single segment is smaller than 231 bytes independently of the data type. It might be enlarged in the future.If
a
is a big array,a[0]
,a[1]
, … are called the segments of the big array. All segments, except possibly for the last one, are of lengthSEGMENT_SIZE
. Given an indexi
into a big array, there is an associated segment and an associated displacement into that segment. Access to single members happens by means of accessors defined in the type-specific versions (see, e.g.,IntBigArrays.get(int[][], long)
andIntBigArrays.set(int[][], long, int)
), but you can also use the methodssegment(long)
/displacement(long)
to access entries manually.Scanning big arrays
You can scan a big array using the following idiomatic form:
for(int s = 0; s < a.length; s++) { final int[] t = a[s]; final int l = t.length; for(int d = 0; d < l; d++) { do something with t[d] } }
or using the (simpler and usually faster) reversed version:for(int s = a.length; s-- != 0;) { final int[] t = a[s]; for(int d = t.length; d-- != 0;) { do something with t[d] } }
Inside the inner loop, the original index in
a
can be retrieved usingindex(segment, displacement)
. You can also use an additional long to keep track of the index.Note that caching is essential in making these loops essentially as fast as those scanning standard arrays (as iterations of the outer loop happen very rarely). Using loops of this kind is extremely faster than using a standard loop and accessors.
In some situations, you might want to iterate over a part of a big array having an offset and a length. In this case, the idiomatic loops are as follows:
for(int s = segment(offset); s < segment(offset + length + SEGMENT_MASK); s++) { final int[] t = a[s]; final int l = (int)Math.min(t.length, offset + length - start(s)); for(int d = (int)Math.max(0, offset - start(s)); d < l; d++) { do something with t[d] } }
or, in a reversed form,for(int s = segment(offset + length + SEGMENT_MASK); s-- != segment(offset);) { final int[] t = a[s]; final int b = (int)Math.max(0, offset - start(s)); for(int d = (int)Math.min(t.length, offset + length - start(s)); d-- != b ;) { do something with t[d] } }
Literal big arrays
A literal big array can be easily created by using the suitable type-specific
wrap()
method (e.g.,IntBigArrays.wrap(int[])
) around a literal standard array. Alternatively, for very small arrays you can just declare a literal array-of-array (e.g.,new int[][] { { 1, 2 } }
). Be warned, however, that this can lead to creating illegal big arrays if for some reason (e.g., stress testing)SEGMENT_SIZE
is set to a value smaller than the inner array length.Big alternatives
If you find the kind of “bare hands” approach to big arrays not enough object-oriented, please use big lists based on big arrays (.e.g,
IntBigArrayBigList
). Big arrays follow the Java tradition of considering arrays as a “legal alien”—something in-between an object and a primitive type. This approach lacks the consistency of a full object-oriented approach, but provides some significant performance gains.Additional methods
In addition to commodity methods, this class contains
BigSwapper
-based implementations of quicksort and of a stable, in-place mergesort. These generic sorting methods can be used to sort any kind of list, but they find their natural usage, for instance, in sorting big arrays in parallel.- See Also:
Arrays
-
-
Field Summary
Fields Modifier and Type Field Description static int
SEGMENT_MASK
The mask used to compute the displacement associated to an index.static int
SEGMENT_SHIFT
The shift used to compute the segment associated with an index (equivalently, the logarithm of the segment size).static int
SEGMENT_SIZE
The current size of a segment (227) is the largest size that makes the physical memory allocation for a single segment strictly smaller than 231 bytes.
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static int
displacement(long index)
Computes the displacement associated with a given index.static void
ensureFromTo(long bigArrayLength, long from, long to)
Ensures that a range given by its first (inclusive) and last (exclusive) elements fits a big array of given length.static void
ensureLength(long bigArrayLength)
Ensures that a big-array length is legal.static void
ensureOffsetLength(long bigArrayLength, long offset, long length)
Ensures that a range given by an offset and a length fits a big array of given length.static long
index(int segment, int displacement)
Computes the index associated with given segment and displacement.static void
main(java.lang.String[] arg)
static void
mergeSort(long from, long to, LongComparator comp, BigSwapper swapper)
Sorts the specified range of elements using the specified big swapper and according to the order induced by the specified comparator using mergesort.static void
quickSort(long from, long to, LongComparator comp, BigSwapper swapper)
Sorts the specified range of elements using the specified big swapper and according to the order induced by the specified comparator using quicksort.static int
segment(long index)
Computes the segment associated with a given index.static long
start(int segment)
Computes the starting index of a given segment.
-
-
-
Field Detail
-
SEGMENT_SHIFT
public static final int SEGMENT_SHIFT
The shift used to compute the segment associated with an index (equivalently, the logarithm of the segment size).- See Also:
- Constant Field Values
-
SEGMENT_SIZE
public static final int SEGMENT_SIZE
The current size of a segment (227) is the largest size that makes the physical memory allocation for a single segment strictly smaller than 231 bytes.- See Also:
- Constant Field Values
-
SEGMENT_MASK
public static final int SEGMENT_MASK
The mask used to compute the displacement associated to an index.- See Also:
- Constant Field Values
-
-
Method Detail
-
segment
public static int segment(long index)
Computes the segment associated with a given index.- Parameters:
index
- an index into a big array.- Returns:
- the associated segment.
-
displacement
public static int displacement(long index)
Computes the displacement associated with a given index.- Parameters:
index
- an index into a big array.- Returns:
- the associated displacement (in the associated segment).
-
start
public static long start(int segment)
Computes the starting index of a given segment.- Parameters:
segment
- the segment of a big array.- Returns:
- the starting index of the segment.
-
index
public static long index(int segment, int displacement)
Computes the index associated with given segment and displacement.- Parameters:
segment
- the segment of a big array.displacement
- the displacement into the segment.- Returns:
- the associated index: that is,
segment(index(segment, displacement)) == segment
anddisplacement(index(segment, displacement)) == displacement
.
-
ensureFromTo
public static void ensureFromTo(long bigArrayLength, long from, long to)
Ensures that a range given by its first (inclusive) and last (exclusive) elements fits a big array of given length.This method may be used whenever a big array range check is needed.
- Parameters:
bigArrayLength
- a big-array length.from
- a start index (inclusive).to
- an end index (inclusive).- Throws:
java.lang.IllegalArgumentException
- iffrom
is greater thanto
.java.lang.ArrayIndexOutOfBoundsException
- iffrom
orto
are greater thanbigArrayLength
or negative.
-
ensureOffsetLength
public static void ensureOffsetLength(long bigArrayLength, long offset, long length)
Ensures that a range given by an offset and a length fits a big array of given length.This method may be used whenever a big array range check is needed.
- Parameters:
bigArrayLength
- a big-array length.offset
- a start index for the fragmentlength
- a length (the number of elements in the fragment).- Throws:
java.lang.IllegalArgumentException
- iflength
is negative.java.lang.ArrayIndexOutOfBoundsException
- ifoffset
is negative oroffset
+length
is greater thanbigArrayLength
.
-
ensureLength
public static void ensureLength(long bigArrayLength)
Ensures that a big-array length is legal.- Parameters:
bigArrayLength
- a big-array length.- Throws:
java.lang.IllegalArgumentException
- iflength
is negative, or larger than or equal toSEGMENT_SIZE
*Integer.MAX_VALUE
.
-
mergeSort
public static void mergeSort(long from, long to, LongComparator comp, BigSwapper swapper)
Sorts the specified range of elements using the specified big swapper and according to the order induced by the specified comparator using mergesort.This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort. The sorting algorithm is an in-place mergesort that is significantly slower than a standard mergesort, as its running time is O(n (log n)2), but it does not allocate additional memory; as a result, it can be used as a generic sorting algorithm.
- Parameters:
from
- the index of the first element (inclusive) to be sorted.to
- the index of the last element (exclusive) to be sorted.comp
- the comparator to determine the order of the generic data (arguments are positions).swapper
- an object that knows how to swap the elements at any two positions.
-
quickSort
public static void quickSort(long from, long to, LongComparator comp, BigSwapper swapper)
Sorts the specified range of elements using the specified big swapper and according to the order induced by the specified comparator using quicksort.The sorting algorithm is a tuned quicksort adapted from Jon L. Bentley and M. Douglas McIlroy, “Engineering a Sort Function”, Software: Practice and Experience, 23(11), pages 1249−1265, 1993.
- Parameters:
from
- the index of the first element (inclusive) to be sorted.to
- the index of the last element (exclusive) to be sorted.comp
- the comparator to determine the order of the generic data.swapper
- an object that knows how to swap the elements at any two positions.
-
main
public static void main(java.lang.String[] arg)
-
-