Skip to content

Latest commit

 

History

History
280 lines (226 loc) · 10.1 KB

File metadata and controls

280 lines (226 loc) · 10.1 KB

Key Insights

  • To count the frequency of each value, we can use a hash table.
  • Stack uses FILO, it requires us to do extra work (pop other items to elsewhere, pop it and push other items back) to pop the most frequent item if it's in the middle of the stack. Instead, we can use multiple stack of each freqency for this.
Frequency hash map:
+--------+-----------+
| Number | Frequency |
+--------+-----------+
|   8    |     3     |
|   6    |     2     |
|   2    |     1     |
+--------+-----------+

Stacks by frequency level:
|  8 |
|  6 |  6 |
|  2 |  8 |  8 |
+----+----+----+----> frequency
   1    2    3
             * max frequency
         <---- pop()

// After `pop()`
|  8 |
|  6 |  6 |
|  2 |  8 |
+----+----+----> frequency
   1    2
        * max frequency
    <---- pop()

Stack + Hash Table (Optimal)

  1. Need to keep track: We use hash table to keep track of the frequency for each value: frequencyMap.
  2. Need to resolve tie of freqency by recency: If multiple values have the same frequency, we return the most recently added one. This can be done by using a stack, so we maintain a stack of values for each frequency. (like bucket sort, grouped by frequency)
  3. We need to pop the most recent item at the maximum frequency, so we need to keep track of the maximum frequency, so that we know which stack to pop from. Since the maximum frequency is monotonically increasing or decreasing, we can just use a variable to keep track of it.

This approach simulates a priority behavior using multiple stacks by frequency instead of maintaining one global heap or ordered set. (See below reflections)

  • 一个数字 x如果出现了 5 次,那么在频率 1、频率 2、……、频率 5 这五个栈里都有x
  • It's guaranteed that the key of groupMap is always increasing, from 1 to maxFrequency without any gaps.
  • 是否会出现中间的某个栈为空的情况? 因为出栈一定是在元素频率最高的栈上发生的,maxFrequency 必然是以 +/- 1 的频率变化,當 pop() 之後出現栈为空的情况,我們就會 maxFrequency--,所以不会出现中间的某个栈为空的情况。
class FreqStack() {

    private val frequencyMap = HashMap<Int, Int>()
    private val groupMap = HashMap<Int, Stack<Int>>()
    private var maxFrequency = 0

    fun push(`val`: Int) {
        val frequency = (frequencyMap[`val`] ?: 0) + 1
        frequencyMap[`val`] = frequency
        if (frequency !in groupMap) {
            groupMap[frequency] = Stack<Int>()
        }
        groupMap[frequency]!!.push(`val`)
        maxFrequency = maxOf(maxFrequency, frequency)
    }

    fun pop(): Int {
        val stack = groupMap[maxFrequency] ?: return -1
        val item = stack.pop()
        val frequency = (frequencyMap[item] ?: 0) - 1
        frequencyMap[item] = frequency
        if (stack.isEmpty()) {
            groupMap.remove(maxFrequency)
            maxFrequency--
        }
        return item
    }
}
  • Time Complexity: O(1) for push() and pop().
  • Space Complexity: O(n) for the hash table and the stack.

Heap + Hash Table (AC)

We can use a heap to keep track of the most frequent item, and a hash table to keep track of the frequency of each item. And the priority of the heap is: (priority) = (frequency, position), higher frequency comes first, and if frequency ties, the more recent item comes first.

We need two data structures to keep track of the frequency of each item and the position of each item.

data class Entry(
    val value: Int,
    val frequency: Int,
    val position: Int
)

class FreqStack() {

    private val frequencyMap = HashMap<Int, Int>()
    private var index: Int = 0
    private val maxHeap = PriorityQueue<Entry>() { e1, e2 ->
        if (e1.frequency == e2.frequency) {
            e2.position - e1.position
        } else {
            e2.frequency - e1.frequency
        }
    }
    // Or equivalent:
    // private val maxHeap = PriorityQueue(
    //     compareByDescending<Entry> {
    //         it.frequency
    //     }.thenByDescending {
    //         it.position
    //     }
    // )

    fun push(`val`: Int) {
        val frequency = (frequencyMap[`val`] ?: 0) + 1
        frequencyMap[`val`] = frequency
        maxHeap.add(Entry(`val`, frequency, index++))
    }

    fun pop(): Int {
        val entry = maxHeap.poll()
        val frequency = entry.frequency - 1
        if (frequency == 0) {
            frequencyMap.remove(entry.value)
        } else {
            frequencyMap[entry.value] = frequency
        }
        return entry.value
    }
}
  • Time Complexity: O(lg n) for push() and pop(), where n is the number of items in the stack.
  • Space Complexity: O(n) for the hash table and the heap.

Dry Run

// After `push(A)`, `push(B)`, `push(A)`, `push(C)`, `push(B)`, `push(A)`
freqMap = {A: 3, B: 2, C: 1}
heap = [
    (A, 1, 0)
    (B, 1, 1)
    (A, 2, 2)s
    (C, 1, 3)
    (B, 2, 4)
    (A, 3, 5)
]
index = 6

// After `pop()` = A
freqMap = {A: 2, B: 2, C: 1}
heap = [
    (A, 1, 0)
    (B, 1, 1)
    (A, 2, 2)
    (C, 1, 3)
    (B, 2, 4)
]

// After `pop()` = B, because `(A, 2, 2)` vs `(B, 2, 4)`, B is more recent
freqMap = {A: 2, B: 1, C: 1}
heap = [
    (A, 1, 0)
    (B, 1, 1)
    (A, 2, 2)
    (C, 1, 3)
]

My Original Implementation (WA)

We need to keep track of the frequency of each item, and also the most recently added if frequency tie-break, so I choose to implement this with 3 data structures for this:

  1. Use hash map to keep track of frequency of each value
  2. Use stack to keep track of last position of each value, use hash map of (value to stack) for this.
  3. Maintain a ordered set to find the most frequent item.
class FreqStack() {

    private val positionMap = HashMap<Int, Stack<Int>>()
    private val frequencyMap = HashMap<Int, Int>()
    private val orderedSet = TreeSet<Int> { key1, key2 ->
        val freq1 = frequencyMap.getOrDefault(key1, 0)
        val freq2 = frequencyMap.getOrDefault(key2, 0)
        if (freq1 != freq2) {
            freq2 - freq1
        } else {
            val pos1 = positionMap[key1]
            val pos2 = positionMap[key2]

            val position1 = if (pos1 != null && pos1.isNotEmpty()) pos1.peek() else -1
            val position2 = if (pos2 != null && pos2.isNotEmpty()) pos2.peek() else -1
            position2.compareTo(position1)
        }
    }
    var position = 0

    fun push(`val`: Int) {
        if (`val` !in positionMap) {
            positionMap[`val`] = Stack<Int>()
        }
        positionMap[`val`]!!.push(position++)
        frequencyMap[`val`] = (frequencyMap[`val`] ?: 0) + 1
        orderedSet.remove(`val`)
        orderedSet.add(`val`)
    }

    fun pop(): Int {
        val topItem = orderedSet.first()
        positionMap[topItem]?.pop()
        val count = frequencyMap[topItem]!! - 1
        if (count == 0) {
            positionMap.remove(topItem)
            frequencyMap.remove(topItem)
        } else {
            frequencyMap[topItem] = count
        }
        // Update the ordered set after updating all related data structures.
        orderedSet.remove(topItem)
        if (count > 0) orderedSet.add(topItem)
        return topItem
    }
}

Reflections

  1. Using ordered set is the main pitfall of this solution, there is a problem with ordered set with comparator depending on mutable states:
  • TreeSet is binary search tree and expected to be immutable, but in this case, we are updating the frequency of each item, which is mutable.
  • Depending on mutable states leads to stale ordering after update, unless you manually remove() and add() the item to the ordered set to force refreshing the ordering.
  1. Overkill by using positionMap, it's more complicated than necessary:
orderedSet.first(); // 1
positionMap[1] = Stack([1, 2]); // 2 is the most recent item

We already get the most recently item with a stack. If we grouped by frquency, the most recent item is the top of the stack.

push(5); // freq=1 → groupMap[1] = [5]
push(7); // freq=1 → groupMap[1] = [5, 7]
push(5); // freq=2 → groupMap[2] = [5]

Here the groupMap already keeps track of:

  • Which values are at each frequency
  • The most recent item at each frequency

We just need to know what the most frequency is, then we can pop the most recent item at that frequency.

Pivoting Thinking

How to pivot the thinking?

From:

"How do I track and find the most frequent item?"

To:

"Can I group values by frequency so that we can just pop from the right group?"

Buckets of frequency: Frequency becomes a discete bucket, recency with the same frequency can be modeled as a stack. This is inspired by the same core intuition behind counting sort and bucket sort. Instead of maintaining a dynamic order (like ordered set, heap, etc.), we group items based on **discrete property (*frequency* here), then pick the right bucket. Here is a reusable pattern: If a problem involves “frequency + priority”, ask:

Can I bucket/group the values instead of sorting or using a heap?

Question Use Heap/TreeSet Pivot to Buckets
Do you need global sorted order at all times? ✅ Yes ❌ No
Do priorities change frequently (mutable key)? ❌ Risky (violates heap/invariants) ✅ Grouping is safer
Do you need the top k / most frequent / max efficiently? Sometimes ✅ Often via frequency buckets
Is ordering based on discrete integer buckets (e.g. count, score, time bucket)? ❌ Heap overkill ✅ Grouping (bucket sort–like)