Skip to main content

Java HashMap 101

· 8 min read
Linh Nguyen
T-90MS Main Battle Tank

Ever wondered what's really happening inside a HashMap when you're casually throwing key-value pairs at it like confetti at a wedding? Let's dig in!

note

This article primarily concerns with Java's implementation of HashMap. Other languages may use different approaches, but I believe the principles should be the same.

HashMap: The Ultimate Organizer (With Trust Issues)

HashMap keeps an internal array of "buckets" (or "bins" if you're feeling fancy). Think of these buckets like those labeled storage containers your organized friend has in their garage. Each bucket can hold your stuff in two different ways:

  • LinkedList style: When there's not much stuff (fewer than 8 items), it just chains everything together like a conga line.

  • Red-Black tree style: When things get crowded (8+ items and the HashMap has at least 64 total buckets), it organizes everything into a perfectly balanced tree structure. This implementation started appeared in JDK 8 onwards.

The Great Hash Code Adventure

When you call map.put(key, value), HashMap doesn't just randomly yeet your data into the buckets and hope for some random dumb home runs. It follows a very specific three-step process:

Step 1: Identify Yourself (int hashCode())

Click to show

First, HashMap looks at your key and says, "Hello there, what's your hash code?" Your key object dutifully calls its hashCode() method and returns a number (a signed integer). This is basically your key's ID card, but instead of a photo, it's just a really not-so-random-looking number.

Step 2: The Secret Sauce (Internal Hashing)

Click to show

HashMap takes that hash code and runs it through its own special blender of bitwise operations. Why? Because HashMap doesn't trust your hash code. It's like, "Yeah, that's nice, but let me just... adjust this number a bit to make it more... acceptable."

The implementation (in JDK 21) looks like this:

// Magic
int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Step 3: "Welcome to Bucket #7" (Index Calculation)

Click to show

Finally, HashMap uses some mathematical wizardry (usually involving modulo operations) to figure out which bucket gets the honor of storing your precious data. It's like a really boring lottery where everyone's number is predetermined.

When Things Get Funnier: Hash Collisions

Java's hashCode() method returns a 32-bit integer, limiting the number of possible values to about 4.3 billion. Since the number of distinct objects can definitely exceed this, collisions are an unavoidable.

Therefore, two different keys may end up wanting to crash in the same bucket.

This is where the equals() method comes to the rescue, acting like a diplomatic party host:

HashMap checks each item in the bucket and asks, "Hey, are you two actually the same person, or do you just have terrible taste in hash codes?"

  • If they're the same: "Cool, let me replace that old value with your new one."
  • If they're different: "Okay, you can both stay, but you're sharing the bucket."

This is called separate chaining. Fancy, no?

The Sacred Contract: boolean equals(Object) and int hashCode() Holy Combination

Here's where things get serious for a hot minute. HashMap has one non-negotiable rule that's more sacred than your grandma's secret recipe:

If two objects are equal (according to boolean equals(Object) method), they MUST have the same hash code.

NO EXCEPTIONS.

Break this rule, and HashMap will lose its mind and start throwing tantrums. You'll get:

  • Performance that's slower than dial-up internet (insert 56 kbps modem sound here)
  • Your data scattered around like a routed army, with no way to rally them back.
  • Bugs, obviously, are inevitable

The Golden Rule: Don't Touch That Key!

Remember that friend who rearranges their bookshelf by color and then freaks out when you move one book? HashMap is exactly like that, but with your keys.

NEVER, EVER, CHANGE A KEY STATE AFTER PUTTING IT IN A HASH MAP.

Seriously, this is like changing your address and not telling the post office. Your mail (data) will end up in the wrong place, and HashMap will be sitting there like, "I put your stuff somewhere safe, but now I have no idea where that was."

Blame yourself for making this decision, not HashMap.

This is why immutable classes are HashMap's favorite keys. They can't change even if they wanted to, which makes HashMap sleep better at night.

The Gang's All Here: Other HashMap Methods

The best part? All the other HashMap methods are basically just variations on the same theme. They're like that one friend who tells the same joke in different ways:

boolean containsKey(Object)

"Let me just hash this key, find the right bucket, and see if anyone's home."

V remove(Object key)

"Hash, find, delete. It's that simple."

V get(Object key)

"Hash, find, retrieve. Simple as ordering pizza, but with more math."

Performance: The Need for Speed

When HashMap is having a good day (well-distributed keys, minimal collisions), it's lightning fast - O(1) operations that make other data structures run for their money.

When it's having a bad day (everything hashing to the same bucket), it becomes a grumpy O(n) monster. Though thanks to the tree structure in modern Java, it's more like a slightly less grumpy O(log n) monster for insertion and deletion. Still grumpy nonetheless.

info

Here's a fun detail: your keys don't need to implement Comparable for the tree to work! HashMap is clever enough to happily arrange the elements for you.

The moral of the story? Keep your int hashCode() method happy, and HashMap will keep you happy.

The Bottom Line

HashMap is basically that incredibly efficient friend who has a system for everything, but only if you follow their very specific rules. Mess with their system (looking at you, mutable keys and broken hash codes), and they'll make your life miserable in ways you never imagined possible.

But treat HashMap right - give it immutable keys, proper equals() and hashCode() implementations, and it'll be your loyal companion through thick and thin, serving up lightning-fast lookups with a smile.

So the next time you casually type map.put(key, value), just remember: you're not just storing data, you're participating in an intricate dance of hashing, bucketing, and collision resolution.

If you want more detail? Read the source code of java.util.HashMap yourself, but I warn you, prepare yourself mentally, or you will end up with very severe headache!

Trust me, I tried, and could only murmur "magic" before passing out.

The Extended Family: LinkedHashMap and Friends

What about LinkedHashMap and its Friends?

The Extended Family: LinkedHashMap and Friends

Before we wrap up, let's talk about HashMap's slightly extra cousin: LinkedHashMap. Think of it as HashMap but with commitment issues - everything inside is according to their insertion order.

LinkedHashMap is basically HashMap that decided to keep a diary. It maintains all the blazing fast performance of HashMap while also remembering the exact order you inserted things. How? By keeping an additional doubly-linked list running through all the entries.

This makes LinkedHashMap the darling of JSON serialization frameworks everywhere. When you're converting objects to JSON and back, you probably want your fields to stay in the same order. LinkedHashMap delivers that consistency with the enthusiasm of a golden retriever.

But here's the real MVP move: LinkedHashMap can be your lazy developer's dream for implementing an LRU (Least Recently Used) cache.

Just override the removeEldestEntry() method, and boom - you've got yourself a cache that automatically kicks out old entries when it gets full, like this:

import java.util.LinkedHashMap;
import java.util.Map;

// Look ma, LRU cache!
public class LRUCache<K, V> extends LinkedHashMap<K, V> {

private final int capacity;

public LRUCache(int initialCapacity, int maxCacheSize) {
super(initialCapacity, 0.75F, true);
this.maxCacheSize = maxCacheSize;
}

@Override
protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
return size() > capacity;
}

// other methods and fields omitted for brevity
}

This may not impress your interviewers at all, but Leetcode interviewing nowadays is overrated, so who cares?

Is this cheating? Maybe.

Do we even feel remorseful? Absolutely not.

Even the big players use this trick - just check out org.graalvm.compiler.replacements.SnippetTemplate.LRUCache in GraalVM's JDK source code. If it's good enough for the JVM wizards, it's good enough for us mere mortals.

Of course, LinkedHashMap comes with the classic LinkedList shenanigans: extra memory overhead and slightly more complex operations. But sometimes you need that insertion order, and LinkedHashMap delivers it with style.

As for concurrent implementations like ConcurrentHashMap, well...

That's dark magic reserved for the pros, maybe a presentation by Jose Paumard. Us normies talking about lock-free algorithms and memory barriers would sound about as convincing as someone explaining quantum physics with sock puppets. Let's just say it exists, it's complicated, and it makes your concurrent code not explode.

Now go forth and HashMap responsibly (and maybe LinkedHashMap when you're feeling fancy)!