Java HashMap 101
Ever wondered what's really happening inside a HashMap
when you're casually throwing key-value pairs at it like
confetti at a wedding? Let's dig in!
This article primarily concerns with Java's implementation of HashMap
. Other languages may use different
approaches, but I believe the principles should be the same.
HashMap
: The Ultimate Organizer (With Trust Issues)
HashMap
keeps an internal array of "buckets" (or "bins" if you're feeling fancy). Think of these buckets like those
labeled storage containers your organized friend has in their garage. Each bucket can hold your stuff in two different
ways:
-
LinkedList style: When there's not much stuff (fewer than 8 items), it just chains everything together like a conga line.
-
Red-Black tree style: When things get crowded (8+ items and the
HashMap
has at least 64 total buckets), it organizes everything into a perfectly balanced tree structure. This implementation started appeared in JDK 8 onwards.
The Great Hash Code Adventure
When you call map.put(key, value)
, HashMap
doesn't just randomly yeet your data into the buckets and hope for some
random dumb home runs. It follows a very specific three-step process:
Step 1: Identify Yourself (int hashCode()
)
Click to show
First, HashMap
looks at your key and says, "Hello there, what's your hash code?" Your key object dutifully calls its
hashCode()
method and returns a number (a signed integer). This is basically your key's ID card, but instead of a
photo, it's just a really not-so-random-looking number.
Step 2: The Secret Sauce (Internal Hashing)
Click to show
HashMap
takes that hash code and runs it through its own special blender of bitwise operations. Why? Because HashMap
doesn't trust your hash code. It's like, "Yeah, that's nice, but let me just... adjust this number a bit to make it
more... acceptable."
The implementation (in JDK 21) looks like this:
// Magic
int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
Step 3: "Welcome to Bucket #7" (Index Calculation)
Click to show
Finally, HashMap
uses some mathematical wizardry (usually involving modulo operations) to figure out which bucket gets
the honor of storing your precious data. It's like a really boring lottery where everyone's number is predetermined.
When Things Get Funnier: Hash Collisions
Java's hashCode() method returns a 32-bit integer, limiting the number of possible values to about 4.3 billion. Since the number of distinct objects can definitely exceed this, collisions are an unavoidable.
Therefore, two different keys may end up wanting to crash in the same bucket.
This is where the equals()
method comes to the rescue, acting like a diplomatic party host:
HashMap
checks each item in the bucket and asks, "Hey, are you two actually the same person, or do you just have
terrible taste in hash codes?"
- If they're the same: "Cool, let me replace that old value with your new one."
- If they're different: "Okay, you can both stay, but you're sharing the bucket."
This is called separate chaining. Fancy, no?
The Sacred Contract: boolean equals(Object)
and int hashCode()
Holy Combination
Here's where things get serious for a hot minute. HashMap
has one non-negotiable rule that's more sacred than your
grandma's secret recipe:
If two objects are equal (according to boolean equals(Object)
method), they MUST have the same hash code.
NO EXCEPTIONS.
Break this rule, and HashMap
will lose its mind and start throwing tantrums. You'll get:
- Performance that's slower than dial-up internet (insert 56 kbps modem sound here)
- Your data scattered around like a routed army, with no way to rally them back.
- Bugs, obviously, are inevitable
The Golden Rule: Don't Touch That Key!
Remember that friend who rearranges their bookshelf by color and then freaks out when you move one book? HashMap
is
exactly like that, but with your keys.
NEVER, EVER, CHANGE A KEY STATE AFTER PUTTING IT IN A HASH MAP.
Seriously, this is like changing your address and not telling the post office. Your mail (data) will end up in the wrong
place, and HashMap
will be sitting there like, "I put your stuff somewhere safe, but now I have no idea where that
was."
Blame yourself for making this decision, not HashMap
.
This is why immutable classes are HashMap
's favorite keys. They can't change even if they wanted to, which makes
HashMap
sleep better at night.
The Gang's All Here: Other HashMap
Methods
The best part? All the other HashMap
methods are basically just variations on the same theme. They're like that one
friend who tells the same joke in different ways:
boolean containsKey(Object)
"Let me just hash this key, find the right bucket, and see if anyone's home."
V remove(Object key)
"Hash, find, delete. It's that simple."
V get(Object key)
"Hash, find, retrieve. Simple as ordering pizza, but with more math."
Performance: The Need for Speed
When HashMap
is having a good day (well-distributed keys, minimal collisions), it's lightning fast - O(1) operations
that make other data structures run for their money.
When it's having a bad day (everything hashing to the same bucket), it becomes a grumpy O(n) monster. Though thanks to the tree structure in modern Java, it's more like a slightly less grumpy O(log n) monster for insertion and deletion. Still grumpy nonetheless.
Here's a fun detail: your keys don't need to implement Comparable
for the tree to work! HashMap
is clever enough to happily arrange the elements for you.
The moral of the story? Keep your int hashCode()
method happy, and HashMap
will keep you happy.
The Bottom Line
HashMap
is basically that incredibly efficient friend who has a system for everything, but only if you follow their
very specific rules. Mess with their system (looking at you, mutable keys and broken hash codes), and they'll make your
life miserable in ways you never imagined possible.
But treat HashMap
right - give it immutable keys, proper equals()
and hashCode()
implementations, and it'll be
your loyal companion through thick and thin, serving up lightning-fast lookups with a smile.
So the next time you casually type map.put(key, value)
, just remember: you're not just storing data, you're
participating in an intricate dance of hashing, bucketing, and collision resolution.
If you want more detail? Read the source code of java.util.HashMap
yourself, but I warn you, prepare yourself
mentally, or you will end up with very severe headache!
Trust me, I tried, and could only murmur "magic" before passing out.
The Extended Family: LinkedHashMap
and Friends
What about LinkedHashMap
and its Friends?
The Extended Family: LinkedHashMap
and Friends
Before we wrap up, let's talk about HashMap
's slightly extra cousin: LinkedHashMap
. Think of it as HashMap
but
with commitment issues - everything inside is according to their insertion order.
LinkedHashMap
is basically HashMap
that decided to keep a diary. It maintains all the blazing fast performance of
HashMap
while also remembering the exact order you inserted things. How? By keeping an additional doubly-linked list
running through all the entries.
This makes LinkedHashMap
the darling of JSON serialization frameworks everywhere. When you're converting objects to
JSON and back, you probably want your fields to stay in the same order. LinkedHashMap
delivers that consistency with
the enthusiasm of a golden retriever.
But here's the real MVP move: LinkedHashMap
can be your lazy developer's dream for implementing an LRU (Least Recently
Used) cache.
Just override the removeEldestEntry()
method, and boom - you've got yourself a cache that automatically kicks out old
entries when it gets full, like this:
import java.util.LinkedHashMap;
import java.util.Map;
// Look ma, LRU cache!
public class LRUCache<K, V> extends LinkedHashMap<K, V> {
private final int capacity;
public LRUCache(int initialCapacity, int maxCacheSize) {
super(initialCapacity, 0.75F, true);
this.maxCacheSize = maxCacheSize;
}
@Override
protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
return size() > capacity;
}
// other methods and fields omitted for brevity
}
This may not impress your interviewers at all, but Leetcode interviewing nowadays is overrated, so who cares?
Is this cheating? Maybe.
Do we even feel remorseful? Absolutely not.
Even the big players use this trick - just check out org.graalvm.compiler.replacements.SnippetTemplate.LRUCache
in
GraalVM's JDK source code. If it's good enough for the JVM wizards, it's good enough for us mere mortals.
Of course, LinkedHashMap
comes with the classic LinkedList shenanigans: extra memory overhead and slightly more
complex operations. But sometimes you need that insertion order, and LinkedHashMap
delivers it with style.
As for concurrent implementations like ConcurrentHashMap
, well...
That's dark magic reserved for the pros, maybe a presentation by Jose Paumard. Us normies talking about lock-free algorithms and memory barriers would sound about as convincing as someone explaining quantum physics with sock puppets. Let's just say it exists, it's complicated, and it makes your concurrent code not explode.
Now go forth and HashMap
responsibly (and maybe LinkedHashMap
when you're feeling fancy)!