java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)
My results are derived from reading various posts around the web and running tests using Jamm and Java's Runtime object.
Preliminaries
- All memory allocations in Java are aligned on 8 byte boundaries
- An object's properties are stored together, aligned on 8 byte boundaries and not mixed with other objects, not even a superclass
- The JVM is smart enough to reorder and pack an object's properties so it will use less space because of padding for 8 byte alignment
- My calculations assume that UseCompressedOops is false (-XX:-UseCompressedOops), so references are 8 bytes, not 4. Essentially, on heaps smaller than 32 GB the JVM can save memory by storing references as 32-bit offsets and using extra registers on a 64-bit processor to expand them into full addresses. Explained here: https://wikis.oracle.com/display/HotSpotInternals/CompressedOops
- Since Java 1.6.0_23 UseCompressedOops has defaulted to true. Use "jinfo -flag UseCompressedOops <pid>" to determine your setting
I think it is interesting that, since UseCompressedOops defaults to *on*, most results on a default 64-bit JVM would be in line with 32-bit results. I am working with large heaps and need to assume that all references are 8 bytes.
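As an alternative to jinfo, the flag can also be read from inside a running program. The following is a minimal sketch, assuming a HotSpot JVM on Java 7 or later (the `HotSpotDiagnosticMXBean` class is HotSpot-specific); the class name `CompressedOopsCheck` is just an illustrative name.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class CompressedOopsCheck {

    /** Returns "true" or "false" on HotSpot, or "unknown" if the flag cannot be read. */
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            // For example a 32-bit JVM, where the option does not exist.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + useCompressedOops());
    }
}
```

If this prints "true", references are 4 bytes and the numbers in this post will not match what you measure.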
8 Byte Memory Alignment
The HotSpot JVM aligns all heap allocations on 8-byte boundaries (this is a property of the JVM implementation, not something the Java specification mandates). This means that a memory allocation must start on an address that is divisible by 8, and its size must be a multiple of 8 bytes. So if an object requires 16 bytes of heap, it will start at some address X (divisible by 8) and extend 16 bytes. Every object (not primitives, but including arrays) requires a minimum of 16 bytes.
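The rounding rule is easy to express in code. This is a small sketch of the arithmetic only (the name `align8` is mine, not something from the JVM):

```java
final class Align8 {

    /** Rounds a size in bytes up to the next multiple of 8. */
    static long align8(long size) {
        return (size + 7) & ~7L;
    }

    public static void main(String[] args) {
        System.out.println(align8(16)); // 16, already aligned
        System.out.println(align8(17)); // 24, 7 bytes of padding
        System.out.println(align8(10)); // 16, 6 bytes of padding
    }
}
```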
Basic Objects
If you declare an object (non-inner object) that has no properties and measure its usage, it will take 16 bytes of heap space. After reading some web posts it seems that every object instance has a "header" and a "reference" to its class definition. Since references are 8 bytes (because of -XX:-UseCompressedOops), an 8 byte header plus an 8 byte class reference means the actual space used by an empty object is 16 bytes.
I proved this by adding a single "byte" primitive to an object and watching the heap usage change from 16 to 24. This is my technique for proving heap usage.
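A rough version of this measurement can be done with nothing but the Runtime object. The sketch below is not the exact test I ran; it allocates many instances, compares used heap before and after, and divides. The class and method names are made up for illustration, and because System.gc() is only a hint the result is an estimate (Jamm gives exact numbers).

```java
import java.util.function.Supplier;

final class HeapDelta {
    static final class Empty { }
    static final class OneByte { byte b; }

    /** Used heap after asking the JVM to collect garbage (a hint, not a guarantee). */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 3; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Estimates the heap bytes used per instance by allocating count instances. */
    static double bytesPerInstance(Supplier<Object> factory, int count) {
        Object[] keep = new Object[count]; // allocated before the baseline measurement
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();
        }
        long after = usedHeap();
        if (keep[count - 1] == null) {     // also keeps the array reachable until here
            throw new IllegalStateException("factory returned null");
        }
        return (after - before) / (double) count;
    }

    public static void main(String[] args) {
        int n = 1000000;
        System.out.println("Empty:   " + bytesPerInstance(Empty::new, n));
        System.out.println("OneByte: " + bytesPerInstance(OneByte::new, n));
    }
}
```

Run it with -XX:-UseCompressedOops to compare against the numbers in this post; with the default settings the values will be smaller.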
Inner Classes
The exception to the "Basic Objects" rule is an instantiated inner class, which has a hidden reference to its enclosing (parent) object. This type of object takes 24 bytes on the heap (16 bytes of header plus the 8 byte reference), with no padding needed for 8 byte alignment.
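The hidden reference can be made visible with reflection. The sketch below uses made-up class names and assumes javac's usual behavior of emitting a synthetic field for the enclosing instance; the inner class deliberately reads a field of the outer object, since newer compilers may drop the hidden field when it is never used.

```java
import java.lang.reflect.Field;

final class HiddenOuterRef {
    int id = 42;

    /** Inner (non-static) class: every instance is tied to an enclosing HiddenOuterRef. */
    final class Inner {
        int outerId() {
            return id; // reads the enclosing instance through the hidden reference
        }
    }

    /** Static nested class: no enclosing instance, so no hidden reference. */
    static final class Nested { }

    /** Counts the compiler-generated (synthetic) fields declared by a class. */
    static int syntheticFieldCount(Class<?> type) {
        int count = 0;
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Inner:  " + syntheticFieldCount(Inner.class));
        System.out.println("Nested: " + syntheticFieldCount(Nested.class));
    }
}
```

Declaring a nested class static, when it does not need the outer instance, avoids the extra 8 bytes per object.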
Primitives
We know the size of each primitive from the Java specification. What isn't stated in the specification is how much heap space they use. It seems to be JVM implementation-dependent. Below is a table that shows each primitive's size and how much heap it may use on my JVM (see the version information at the beginning). Be aware that, because of "8 byte alignment" and "padding", primitives such as byte or boolean can be packed together to take up less memory, and this table only shows the maximum space each can use.
Primitive | Size (bytes) | Heap Usage (bytes) |
---|---|---|
boolean | 1 | 8 |
byte | 1 | 8 |
short | 2 | 8 |
int | 4 | 8 |
long | 8 | 8 |
float | 4 | 8 |
double | 8 | 8 |
char | 2 | 8 |
The interesting primitive is boolean, which can only have two values, true and false, and represents 1 bit of information. By doing the same type of test as with "byte", we can prove that a boolean primitive minimally uses 1 byte of heap.
Object References
Because UseCompressedOops is disabled, object references are 8 bytes. We can test this by adding a reference to an empty class and measuring its memory usage. This shows 24 bytes used on the heap. Add one more "byte" primitive, measure again and you'll see 32 bytes of heap usage.
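These measurements fit a simple rule of thumb: 16 bytes of header, plus the field sizes, rounded up to a multiple of 8. The sketch below encodes that rule. It is a simplification that assumes the JVM packs the fields of a single class with no gaps and ignores superclass boundaries; the names are illustrative only.

```java
final class ShallowSize {
    static final int HEADER = 16;   // object header with -XX:-UseCompressedOops, as measured in this post
    static final int REFERENCE = 8; // uncompressed reference

    static long align8(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated shallow size of an object whose fields have the given sizes in bytes. */
    static long shallowSize(int... fieldSizes) {
        long total = HEADER;
        for (int size : fieldSizes) {
            total += size;
        }
        return align8(total);
    }

    public static void main(String[] args) {
        System.out.println(shallowSize());             // 16, empty object
        System.out.println(shallowSize(1));            // 24, one byte field
        System.out.println(shallowSize(REFERENCE));    // 24, one reference
        System.out.println(shallowSize(REFERENCE, 1)); // 32, one reference and one byte
    }
}
```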
Arrays
All arrays have an extra integer "length" field stored in their header, which means that an array's header uses 24 bytes even if it has no data - 16 bytes for header, 4 bytes for integer length, and 4 bytes for padding. The JVM is smart about allocating memory for arrays. It will allocate a block of memory, 8 byte aligned, to contain all the elements and only padding after the end of the last element. The layout looks like this:
Field | Type | Size (bytes) |
---|---|---|
HEADER | | 16 |
length | int | 4 |
PADDING | | 4 |
MEMORY BLOCK | | size |
PADDING | | pad |
Total | | 24 + size + pad |
So for example, if you allocate byte arrays having 0, 1, 8, and 9 elements, the elements are packed together efficiently:
- 0 byte array size = 24 (size = 0, pad = 0)
- 1 byte array size = 32 (size = 1, pad = 7)
- 8 byte array size = 32 (size = 8, pad = 0)
- 9 byte array size = 40 (size = 9, pad = 7)
So you can see a byte array containing 1 element uses the same amount of heap space as an 8 element byte array.
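The array formula, 24 + size + pad, can be written as a small helper. This is a sketch of the arithmetic under the assumptions of this post (24 byte array header, 8 byte alignment), not a measurement; the names are mine.

```java
final class ArraySize {
    static final int ARRAY_HEADER = 24; // 16 header + 4 length + 4 padding

    static long align8(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated heap size of an array with the given length and element size in bytes. */
    static long arraySize(int length, int elementSize) {
        return ARRAY_HEADER + align8((long) length * elementSize);
    }

    public static void main(String[] args) {
        for (int length : new int[] {0, 1, 8, 9}) {
            System.out.println("byte[" + length + "] = " + arraySize(length, 1));
        }
        // prints 24, 32, 32 and 40 for the four lengths
    }
}
```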
Reordering Object Properties, Memory Layout of Super/Subclasses
These two topics are covered quite nicely in Domingos' blog with examples so I won't restate the details.
Domingos Neto's Blog - Remember that his blog covers 32-bit JVM heap allocation, but the principles still apply.
Suffice it to say, it makes sense for the JVM to reorder object properties, for instance so that bytes can be packed together with other bytes or with other primitives smaller than 8 bytes. (Ex: 4 bytes + 1 int fills 8 bytes.)
The most pertinent rule of super/subclass memory allocations is that a superclass keeps all of its properties together, 8 byte aligned. Each subclass will do the same, keeping its properties together, 8 byte aligned, and not mixing them with the superclass's properties to use space more efficiently.
String Object
To understand how much heap a String object uses, we must look at String's source code. It contains the following properties in this order: char[], int offset, int count, int hash. The following table shows the properties reordered and their sizes:
Field | Type | Size (bytes) |
---|---|---|
HEADER | | 16 |
value | char[] | 8 |
offset | int | 4 |
count | int | 4 |
hash | int | 4 |
PADDING | | 4 |
TOTAL | | 40 |
Notice the "value" field is 8 bytes. How can we know its size since a string can be any length (up to the limit of a 32-bit int length, of course)? The heap space taken by the "value" field is only the reference to the char[], not including the data.
Notice the 4 bytes at the end of the object for padding.
The actual data is allocated in another block of heap space (even if the string length is zero, it appears a char[0] array is allocated.) If the string's length is L, then the heap usage follows the usage for an array with char (2 byte) elements:
For an empty string (L = 0): 24 + 0 * 2 = 24
So an empty String object requires 40 + 24 = 64 bytes of heap space!
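The same arithmetic works for any length L. Here is a sketch under the assumptions of this post: the String layout shown above (40 bytes shallow), a separate char[] with a 24 byte header, 2 bytes per char, and 8 byte alignment. The helper names are mine.

```java
final class StringSize {
    static final int STRING_SHALLOW = 40; // header + value ref + offset + count + hash + padding
    static final int ARRAY_HEADER = 24;   // 16 header + 4 length + 4 padding
    static final int CHAR_SIZE = 2;

    static long align8(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated total heap used by a String of the given length, including its char[]. */
    static long stringSize(int length) {
        return STRING_SHALLOW + ARRAY_HEADER + align8((long) length * CHAR_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(stringSize(0));                // 64
        System.out.println(stringSize("Smart".length())); // 80
    }
}
```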
Another example: How much space does the string, "Smart" require on the heap?
64 + 5*2 = 74 (nope! don't forget about the 8 byte alignment)
char primitives are 2 bytes long, so 5 of them require 10 bytes, but 10 isn't 8 byte aligned, so it must round up to 16. So a string 5 characters long requires:
64 + 5*2 + 6 = 80 -- 6 bytes of padding
HeapByteBuffer Object
I think this object encompasses most of the rules for determining space. Here is the layout of HeapByteBuffer as seen from the Eclipse debugger:
Field | Type | Size (bytes) |
---|---|---|
address | long | 8 |
bigEndian | boolean | 1 |
capacity | int | 4 |
hb | byte[] | 8 |
isReadOnly | boolean | 1 |
limit | int | 4 |
mark | int | 4 |
nativeByteOrder | boolean | 1 |
offset | int | 4 |
position | int | 4 |
Looks like alphabetical order. If you look at decompiled source, the arrangement is different again. Neither of these matters because the JVM will rearrange the fields to make efficient use of heap space, and how it does so is implementation and platform dependent. So the memory allocation could look like this:
Field | Type | Size (bytes) |
---|---|---|
HEADER | | 16 |
bigEndian | boolean | 1 |
isReadOnly | boolean | 1 |
nativeByteOrder | boolean | 1 |
PADDING | | 1 |
capacity | int | 4 |
limit | int | 4 |
mark | int | 4 |
offset | int | 4 |
position | int | 4 |
hb | byte[] | 8 |
address | long | 8 |
Total | | 56 |
If the memory was not rearranged, the alphabetical version would have 13 bytes of padding compared to 1 byte of padding in the efficient version. 56 is the actual "shallow" heap usage on my hardware and OS.
Notice that the field, hb, is a byte[] which means that 56 does not tell the whole story. "hb" is the storage used by ByteBuffer. Using the layout presented above for arrays, we have:
16 + (4+4) + size + pad = 24 + size + pad
So best to use an example. If a ByteBuffer is instantiated with 5 bytes of storage, this means:
56 + 24 + (5 + 3) = 88 bytes for a 5 byte ByteBuffer!
Summary
So before you run off and write your congressman regarding the flagrant misappropriation of heap space, remember that we aren't in the land of small memory footprints anymore. All of ByteBuffer's fields are necessary and provide great value in ease of programming, at the cost of more RAM. If I had given an example of a ByteBuffer that had 1000 bytes of storage, the 80 bytes of overhead wouldn't seem so significant.
Normally (99.9% of the time) you shouldn't really care about this information. In my case I am trying to optimize the memory usage of a cache, so being as efficient as possible is a concern for me. What did I learn:
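To put numbers on the overhead, here is a sketch that applies the figures from this post (56 bytes shallow for HeapByteBuffer, 24 bytes of array header, 8 byte alignment) to different capacities. These are estimates from the layout above, not measurements, and the names are illustrative.

```java
final class ByteBufferSize {
    static final int BUFFER_SHALLOW = 56; // HeapByteBuffer fields plus header, from the table above
    static final int ARRAY_HEADER = 24;   // 16 header + 4 length + 4 padding

    static long align8(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated total heap used by a heap ByteBuffer with the given capacity in bytes. */
    static long heapByteBufferSize(int capacity) {
        return BUFFER_SHALLOW + ARRAY_HEADER + align8(capacity);
    }

    public static void main(String[] args) {
        for (int capacity : new int[] {5, 1000}) {
            long total = heapByteBufferSize(capacity);
            System.out.println(capacity + " bytes of storage -> " + total
                + " bytes of heap (" + (total - capacity) + " bytes not holding data)");
        }
    }
}
```

For 5 bytes of storage the estimate is 88 bytes; for 1000 bytes it is 1080, so the fixed 80 bytes is a small fraction of the total.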
- Use primitives when possible
- Use "-XX:+UseCompressedOops" if your heap is less than 32 GB
- Understand the classes you use.
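To illustrate the first point, compare a long[] with a Long[] using the rules from this post. This is a sketch of the arithmetic only. It assumes 8 byte references, a 24 byte array header, that each boxed Long takes 24 bytes (16 header + 8 value), and that every element is a distinct Long instance (no sharing of cached values).

```java
final class PrimitiveVsBoxed {
    static final int ARRAY_HEADER = 24; // 16 header + 4 length + 4 padding
    static final int REFERENCE = 8;     // uncompressed reference
    static final int BOXED_LONG = 24;   // 16 header + 8 byte long value

    /** Estimated heap used by a long[] of the given length. */
    static long primitiveLongArray(int length) {
        return ARRAY_HEADER + 8L * length;
    }

    /** Estimated heap used by a Long[] of the given length plus its Long objects. */
    static long boxedLongArray(int length) {
        return ARRAY_HEADER + (long) REFERENCE * length + (long) BOXED_LONG * length;
    }

    public static void main(String[] args) {
        System.out.println(primitiveLongArray(1000)); // 8024
        System.out.println(boxedLongArray(1000));     // 32024
    }
}
```

Under these assumptions the boxed version uses about four times the heap for the same data.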
Technical References
Object Structure - http://www.codeinstructions.com/2008/12/java-objects-memory-structure.html
UseCompressedOops - http://wikis.sun.com/display/HotSpotInternals/CompressedOops
UseCompressedOops - http://deusch.org/blog/?p=20
Jamm - https://github.com/jbellis/jamm
You say that: "Java specification states that all memory allocations are aligned to addresses that are divisible by 8, which means that a memory allocation must start on an address that is divisible by 8, and its size must be a multiple of 8 bytes."
What part of the Java specification says this? I can't find it in the JVM Specification for SE7. http://docs.oracle.com/javase/specs/jvms/se7/html/
you are correct, i'm not sure why i said that. the specification as i remember it only specifies the size of types and specifically says that it is up to the JVM implementation to determine the layout of memory usage.
https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop
Managed pointers in the Java heap point to objects which are aligned on 8-byte address boundaries. Compressed oops represent managed pointers (in many but not all places in the JVM software) as 32-bit object offsets from the 64-bit Java heap base address.
This is for the HotSpot Java VM. Perhaps, this was the basis of the above statement.
Great read, thank you!
Hi - A small typo.... In your primitives table - char is listed as taking 16 bytes where it should be 2 (=16 bits).
Excellent read. thanks.
As Amir said, there is a typo in the first table
good Article
There's also an error in your last table. The "Padding" of 1 extra byte would come immediately after the 3 booleans and BEFORE "Capacity" :-)
Nice sir, thanks for solved
Appreciate your work. https://blog.gceasy.io/2019/05/31/does-32-bit-or-64-bit-jvm-matter-anymore/. World's first memory dump analysis tool to find out amount of memory wasted by the application due to inefficient programming. Tool’s inbuilt AI intelligence recommends solution to fix the identified inefficiency. Heap Hero