Wednesday, January 25, 2012

How Much Heap Space Does My Object Use?

This post collects my observations from trying to determine the heap space used by Java objects in a 64-bit environment. There are some good posts covering this topic, but none that addressed 64-bit, so I decided to put together my results. I only tested on Sun HotSpot 64-bit on OS X 10.7:
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11M3527)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)
My results are derived from reading various posts around the web and running tests using Jamm and Java's Runtime object.

Preliminaries
  1. All memory allocations in Java are aligned on 8 byte boundaries
  2. An object's properties are stored together, aligned on 8 byte boundaries and not mixed with other objects, not even a superclass
  3. The JVM is smart enough to reorder and pack an object's properties so that less space is lost to padding for 8 byte alignment
  4. My calculations will assume that UseCompressedOops is false (-XX:-UseCompressedOops), therefore references are 8 bytes (not 4). With compressed oops, heaps smaller than 32 GB can save memory by storing each reference as a 32-bit offset instead of a full 64-bit address. Explained here: https://wikis.oracle.com/display/HotSpotInternals/CompressedOops
  5. Since Java 1.6.0_23 UseCompressedOops has defaulted to true. Use "jinfo -flag UseCompressedOops <pid>" to determine your setting
I think it is interesting that since UseCompressedOops defaults to *on* most of my results would be in line with 32 bit results. I am working with large heaps and need to assume that all references are 8 bytes.
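
As an alternative to jinfo, the setting can also be read from inside the running JVM. The following is a minimal sketch, assuming a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name CompressedOopsCheck is made up for illustration, and the method falls back to "unknown" when the option cannot be read.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class CompressedOopsCheck {

    /** Returns "true", "false", or "unknown" if this JVM does not expose the option. */
    static String useCompressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            // For example a 32-bit or non-HotSpot JVM that has no such option.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + useCompressedOops());
    }
}
```

Running it once with -XX:-UseCompressedOops and once without should show whether the numbers in this post apply to your process.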

8 Byte Memory Alignment

The HotSpot JVM aligns every heap allocation on an 8 byte boundary. (The Java specification does not require this; it leaves the memory layout to the JVM implementation.) This means that a memory allocation must start on an address that is divisible by 8, and its size must be a multiple of 8 bytes. So if an object requires 16 bytes of heap, it will start at some address X (divisible by 8) and extend 16 bytes. Combined with the header described in the next section, every object (not primitives, but including arrays) will require a minimum of 16 bytes.
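
The rounding rule is easy to express in code. This is only an illustrative helper (the class and method names are mine, not part of any JVM API) that rounds a raw size up to the next multiple of 8:

```java
final class Align8 {

    /** Rounds size up to the next multiple of 8 (sizes already divisible by 8 are unchanged). */
    static long alignUp(long size) {
        return (size + 7) & ~7L;
    }

    public static void main(String[] args) {
        // A 16 byte header plus a single 1 byte field needs 17 bytes before rounding.
        System.out.println(alignUp(17));
    }
}
```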

Basic Objects

If you declare an object (non-inner object) that has no properties and measure its usage, it will take 16 bytes of heap space. After reading some web posts it seems that every object instance has a "header" and a "reference" to its class definition. Each takes 8 bytes (the class reference because of -XX:-UseCompressedOops), so the actual space used by an empty object is 16 bytes.

I confirmed this by adding a single "byte" primitive to an object and watching the heap usage change from 16 to 24. This is my technique for measuring heap usage throughout this post.
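
A rough version of this measurement using only Java's Runtime object might look like the sketch below. It is not the exact harness I used: the class HeapProbe and its methods are hypothetical names, it relies on System.gc() actually collecting (which the JVM does not guarantee), and it uses java.util.function.Supplier, which needs Java 8 or later. It allocates many instances and divides the growth in used heap by the instance count.

```java
import java.util.function.Supplier;

final class HeapProbe {

    /** Used heap after asking for a few collections; only an approximation. */
    private static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        for (int i = 0; i < 4; i++) {
            System.gc();
        }
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Average heap growth per instance created by the factory. */
    static long bytesPerInstance(Supplier<?> factory, int count) {
        Object[] keep = new Object[count]; // allocated before the first reading
        long before = usedHeap();
        for (int i = 0; i < count; i++) {
            keep[i] = factory.get();       // stored so the objects stay reachable
        }
        long after = usedHeap();
        if (keep[count - 1] == null) {
            throw new IllegalStateException("factory returned null");
        }
        return Math.round((after - before) / (double) count);
    }

    static final class Empty { }

    static final class OneByte { byte b; }

    public static void main(String[] args) {
        System.out.println("Empty:   " + bytesPerInstance(Empty::new, 1000000));
        System.out.println("OneByte: " + bytesPerInstance(OneByte::new, 1000000));
    }
}
```

Run it with -XX:-UseCompressedOops to compare against the 16 and 24 byte figures above; with compressed oops enabled the numbers can differ.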

Inner Classes

The exception to the "Basic Objects" rule is an instance of an inner (non-static nested) class, which has a hidden reference to its enclosing object. This type of object takes 24 bytes on the heap (16 bytes of header plus the 8 byte reference), with no padding needed for 8 byte alignment.
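
The hidden reference can be made visible with reflection. In this sketch (all class names are invented for the example) the inner class deliberately reads a field of its enclosing instance, so the compiler has to keep the synthetic field that points back to it; a static nested class has no such field.

```java
import java.lang.reflect.Field;

final class OuterDemo {
    int value = 42;

    /** Inner (non-static) class: every instance is tied to an OuterDemo instance. */
    final class Inner {
        int read() {
            return value; // reads through the hidden reference to the enclosing instance
        }
    }

    /** Static nested class: no enclosing instance, so no hidden reference. */
    static final class Nested { }

    /** Counts the compiler-generated (synthetic) fields declared by a class. */
    static int syntheticFieldCount(Class<?> type) {
        int n = 0;
        for (Field f : type.getDeclaredFields()) {
            if (f.isSynthetic()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("Inner:  " + syntheticFieldCount(Inner.class));
        System.out.println("Nested: " + syntheticFieldCount(Nested.class));
    }
}
```

Declaring a nested class static when it does not need its enclosing instance avoids paying for that extra reference.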

Primitives

We know the size of each primitive from the Java specification. What isn't stated in the specification is how much heap space they use; that appears to be JVM implementation dependent. Below is a table that shows each primitive's size and how much heap it may use on my JVM (see the version information at the beginning). Be aware that because of 8 byte alignment and padding, small primitives such as byte or boolean can be packed together to take up less memory, so this table only shows the maximum space each can use.

Primitive   Size (bytes)   Heap Usage (bytes)
boolean     1              8
byte        1              8
short       2              8
int         4              8
long        8              8
float       4              8
double      8              8
char        2              8

The interesting primitive is boolean, which can only have two values, true and false, and so represents 1 bit of information. By doing the same type of test as with "byte", we can show that a boolean primitive uses at least 1 byte of heap.

Object References

Because UseCompressedOops is disabled, object references are 8 bytes. We can test this by adding a reference field to an empty class and measuring its memory usage. This shows 24 bytes used on the heap. Add one more "byte" primitive, measure again and you'll see 32 bytes of heap usage.

Arrays

All arrays have an extra integer "length" field stored in their header, which means that an array uses 24 bytes even if it has no data: 16 bytes for the header, 4 bytes for the integer length, and 4 bytes for padding. The JVM is smart about allocating memory for arrays. It will allocate a single block of memory, 8 byte aligned, to contain all the elements, and it pads only after the end of the last element. The layout looks like this:

Field          Type   Size (bytes)
HEADER                16
length         int    4
PADDING               4
MEMORY BLOCK          size
PADDING               pad
Total                 24 + size + pad

So for example, if you allocate byte arrays with 0, 1, 8, and 9 elements, the elements are packed together efficiently:
0 byte array size = 24 (size = 0, pad = 0)
1 byte array size = 32 (size = 1, pad = 7)
8 byte array size = 32 (size = 8, pad = 0)
9 byte array size = 40 (size = 9, pad = 7)
So you can see that a byte array containing 1 element uses the same amount of heap space as an 8 element byte array.
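
The same arithmetic as a small helper, assuming the 24 byte array header measured above (with -XX:-UseCompressedOops). The class and method names are only for illustration.

```java
final class ArrayHeapSize {
    /** Header (16) + length (4) + padding (4), as in the layout table above. */
    static final int ARRAY_HEADER = 24;

    static long alignUp(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated heap usage of an array with the given length and element size in bytes. */
    static long of(int length, int elementSize) {
        return ARRAY_HEADER + alignUp((long) length * elementSize);
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 8, 9}) {
            System.out.println(n + " byte array size = " + of(n, 1));
        }
    }
}
```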

Reordering Object Properties, Memory Layout of Super/Subclasses

These two topics are covered quite nicely in Domingos' blog with examples so I won't restate the details.

Domingos Neto's Blog - Remember that his blog covers 32-bit JVM heap allocation, but the principles still apply.

Suffice it to say, it makes sense for the JVM to reorder object properties, for instance so that bytes can be packed together with other bytes or with other primitives smaller than 8 bytes. (Ex: 4 bytes + 1 int fills 8 bytes.)

The most pertinent rule of super/subclass memory allocations is that a superclass keeps all of its properties together, 8 byte aligned. Each subclass will do the same, keeping its properties together, 8 byte aligned, and not mixing them with the superclass's properties to use space more efficiently.

String Object

To understand how much heap a String object uses, we must look at String's source code. It contains the following properties in this order: char[] value, int offset, int count, int hash. The following table shows the properties and their sizes as laid out on the heap:

Field     Type     Size (bytes)
HEADER             16
value     char[]   8
offset    int      4
count     int      4
hash      int      4
PADDING            4
TOTAL              40

Notice the "value" field is 8 bytes. How can we know its size, since a string can be any length (up to what a 32-bit int can index, of course)? The heap space taken by the "value" field is only the reference to the char[], not including the data.

Notice the 4 bytes at the end of the object for padding.

The actual data is allocated in another block of heap space (even if the string length is zero, it appears a char[0] array is allocated). If the string's length is L, then the heap usage follows the usage for an array with char (2 byte) elements: 24 + L * 2, rounded up to a multiple of 8. For an empty string:
24 + 0 * 2 = 24
Adding the 40 bytes of the String object itself gives 40 + 24 = 64. So an empty String object requires 64 bytes of heap space!

Another example: How much space does the string, "Smart" require on the heap?
64 + 5*2 = 74 (nope! don't forget about the 8 byte alignment)
char primitives are 2 bytes long, so 5 of them require 10 bytes, but 10 isn't 8 byte aligned, so it must be rounded up to 16. So a string 5 characters long requires:
64 + 5*2 + 6 = 80 (including 6 bytes of padding)
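
The String calculation can be written down as a helper too. This sketch hard-codes the figures from this section (40 bytes for the String object, 24 bytes for the char[] header, 2 bytes per char), so it only applies to the Java 6 String layout with -XX:-UseCompressedOops; later JDK releases changed String's fields. The names are illustrative.

```java
final class StringHeapSize {
    static final int STRING_OBJECT = 40; // header + value reference + 3 ints + padding
    static final int ARRAY_HEADER = 24;  // char[] header including length and padding
    static final int CHAR_SIZE = 2;

    static long alignUp(long size) {
        return (size + 7) & ~7L;
    }

    /** Estimated heap usage of a String with the given number of characters. */
    static long of(int length) {
        return STRING_OBJECT + ARRAY_HEADER + alignUp((long) length * CHAR_SIZE);
    }

    public static void main(String[] args) {
        System.out.println("empty:   " + of(0));
        System.out.println("\"Smart\": " + of("Smart".length()));
    }
}
```
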

HeapByteBuffer Object

I think this object encompasses most of the rules for determining space. Here is the layout of HeapByteBuffer as seen from the Eclipse debugger:

Field             Type      Size (bytes)
address           long      8
bigEndian         boolean   1
capacity          int       4
hb                byte[]    8
isReadOnly        boolean   1
limit             int       4
mark              int       4
nativeByteOrder   boolean   1
offset            int       4
position          int       4

This looks like alphabetical order. If you look at the decompiled source, the arrangement is different again. Neither of these matters because the JVM will rearrange the fields to make efficient use of heap space, and the result is implementation and platform dependent. So the memory allocation could look like this:

Field             Type      Size (bytes)
HEADER                      16
bigEndian         boolean   1
isReadOnly        boolean   1
nativeByteOrder   boolean   1
PADDING                     1
capacity          int       4
limit             int       4
mark              int       4
offset            int       4
position          int       4
hb                byte[]    8
address           long      8
Total                       56

If the memory was not rearranged, the alphabetical version (with each field aligned to its own size and hb counted as an 8 byte reference) would need 9 bytes of padding and 64 bytes in total, compared to 1 byte of padding in the efficient version. 56 is the actual "shallow" heap usage on my hardware and OS.
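
To compare field orders without counting by hand, here is a small layout calculator. It is a sketch under stated assumptions, not how HotSpot actually lays out fields: a 16 byte header, every field aligned to its own size, hb counted as an 8 byte reference, and the whole object rounded up to a multiple of 8. The class FieldLayout is hypothetical.

```java
final class FieldLayout {
    static final int HEADER = 16;

    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) / alignment * alignment;
    }

    /** Shallow size of an object whose fields are placed in exactly the given order. */
    static long objectSize(int... fieldSizes) {
        long offset = HEADER;
        for (int size : fieldSizes) {
            offset = alignUp(offset, size) + size; // pad up to the field's alignment, then place it
        }
        return alignUp(offset, 8);                 // the object as a whole is 8 byte aligned
    }

    public static void main(String[] args) {
        // address, bigEndian, capacity, hb, isReadOnly, limit, mark, nativeByteOrder, offset, position
        long alphabetical = objectSize(8, 1, 4, 8, 1, 4, 4, 1, 4, 4);
        // three booleans, five ints, then the two 8 byte fields
        long packed = objectSize(1, 1, 1, 4, 4, 4, 4, 4, 8, 8);
        System.out.println("alphabetical = " + alphabetical + ", packed = " + packed);
    }
}
```

The fields themselves add up to 39 bytes, so subtracting 16 + 39 = 55 from each result gives the padding for that order.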

Notice that the field, hb, is a byte[] which means that 56 does not tell the whole story. "hb" is the storage used by ByteBuffer. Using the layout presented above for arrays, we have:
16 + (4+4) + size + pad = 24 + size + pad
An example makes this concrete. If a ByteBuffer is instantiated with 5 bytes of storage, this means:
56 + 24 + (5 + 3) = 88 bytes for a 5 byte ByteBuffer!

Summary

So before you run off and write your congressman regarding the flagrant misappropriation of heap space, remember that we aren't in the land of small memory footprints anymore. All of ByteBuffer's fields are necessary and provide great value in ease of programming, at the cost of more RAM. If I had given an example of a ByteBuffer with 1000 bytes of storage, the 80 bytes of overhead wouldn't seem so significant.

Most of the time (99.9%) you shouldn't really need to care about this information. In my case I am trying to optimize the memory used by a cache, so being as efficient as possible is a concern for me. What did I learn?

  • Use primitives when possible
  • Use "-XX:+UseCompressedOops" if your heap is smaller than 32 GB
  • Understand the classes you use.


Technical References

Object Structure - http://www.codeinstructions.com/2008/12/java-objects-memory-structure.html
UseCompressedOops - http://wikis.sun.com/display/HotSpotInternals/CompressedOops
UseCompressedOops - http://deusch.org/blog/?p=20
Jamm - https://github.com/jbellis/jamm

10 comments:

  1. You say that: "Java specification states that all memory allocations are aligned to addresses that are divisible by 8, which means that a memory allocation must start on an address that is divisible by 8, and its size must be a multiple of 8 bytes."

    What part of the Java specification says this? I can't find it in the JVM Specification for SE7. http://docs.oracle.com/javase/specs/jvms/se7/html/

  2. you are correct, i'm not sure why i said that. the specification as i remember it only specifies the size of types and specifically says that it is up to the JVM implementation to determine the layout of memory usage.

    1. https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop

      Managed pointers in the Java heap point to objects which are aligned on 8-byte address boundaries. Compressed oops represent managed pointers (in many but not all places in the JVM software) as 32-bit object offsets from the 64-bit Java heap base address.

      This is for the HotSpot Java VM. Perhaps, this was the basis of the above statement.

  3. Hi - A small typo.... In your primitives table - char is listed as taking 16 bytes where it should be 2 (=16 bits).
    Excellent read. thanks.

  5. as a Amir said there is a typo at first table
    good Article

  6. There's also an error in your last table. The "Padding" of 1 extra byte would come immediately after the 3 booleans and BEFORE "Capacity" :-)
