Nil

Implementing a Stack That Supports Nested Transactions

Kyle mentioned that in one of his recent technical interviews he was asked to implement a stack that supports nested transactions. Since I've been trying to brush up on my data structures and algorithms skills and prepare myself for such interviews I thought this would be a fun exercise to tackle. Fortunately I went on vacation shortly after he mentioned this so I had some time to think about it. Not sure how well I would fare in an interview environment but I suppose that's for a different post.

Not So Basic Operations

Of course this data structure should support the basic stack operations of Push(), Pop() and even Peek(), an operation that returns the next value that would be popped without actually popping it off the stack. Implementing these operations are relatively straight forward. The interesting part however is when you start thinking about handling nested transactions. What happens if you begin a new transaction, push a few values on the stack but decide to rollback? One option would be to track these values in a temporary location and only apply them to the stack once a commit or rollback takes place. But how do you actually handle these temporary values?

The Structure

I decided to use Go to write this implementation so all methods will be written on the txstack.Stack type.

Here's what the struct looks like:

type Stack struct {
    Storage    [][]int
    InTx       bool
    Pointer    int
    TxCommitted int
}

The most interesting field in this struct is the Storage slice. More precisely, it's a slice of slices of type int. This is where the so-called magic happens. The other three fields will become apparent after I describe the nested transaction functionality.

Transactions and Nesting

Transactions occur after a Begin() call. This is when the InTx field is set to true. Basic operations can still take place before a transaction begins but all operations afterwards are in a temporary state. What this means is that the operations won't be applied to the stack until a Commit() takes place, or, if a Rollback() happens, all operations after the Begin() will be discarded. A transaction ends when either a rollback takes place or when the number of commits match the number of begins.

That last sentence could use a bit of unpacking. Essentially what it means is that when a Begin() takes place both the Pointer and Committed fields are incremented. The pointer keeps track of the current transaction state (operations are done on the storage at index Pointer) and when a Commit() takes place the Committed int is decremented. When this field reaches 0 all nested commits have taken place so all temporary stacks can be flushed and the latest stack gets copied into Storage at index 0.

Example:

s := txstack.New()
s.Push(3)
s.Push(2)
s.Push(1)

State:

+ - +
| 1 |
+ - +
| 2 |
+ - +
| 3 |
+ - +

No transactions have taken place. That is the stack as it stands.

s.Begin()
s.Pop()

Temporary State:

+ - +
| 2 |
+ - +
| 3 |
+ - +

A new transaction has begun creating a temporary state of the stack.

s.Rollback()

State:

+ - +
| 1 |
+ - +
| 2 |
+ - +
| 3 |
+ - +

The stack looks the same before the Begin() took place because of the call to Rollback(). If a call to Commit() happened instead then obviously the temporary state would have become the permanent state.

Handling State

Each time a new transaction begins a new temporary stack is created. This is where the Storage and Pointer fields comes in. A new slice is created, the contents of the previous slice are copied into it and the pointer is incremented. Now, all methods are done on the stack that the pointer points to.

Given the same example from above the internal state of the data structure would look like the following:

s := txstack.New()
s.Push(3)
s.Push(2)
s.Push(1)

Internal State:

Stack {
    Storage: [[3,2,1]],
    InTx: false,
    Pointer: 0,
    Committed: 0,
}
s.Begin()
s.Pop()

Internal State:

Stack {
    Storage: [[3,2,1],[3,2]],
    InTx: true,
    Pointer: 1,
    Committed: 1,
}
s.Rollback()

Internal State:

Stack {
    Storage: [[3,2,1]],
    InTx: false,
    Pointer: 0,
    Committed: 0,
}

If Commit() was called instead of Rollback() then the Committed field would be set to 0. When this happens the last slice gets copied into index 0 and the other slices are removed.

Putting the whole txstack package together:

package txstack

type Stack struct {
    Storage     [][]int
    InTx        bool
    Pointer     int
    TxCommitted int
}

// flushTxStacks will flush all temp storage stacks while only
// retaining the stack at storage location p by copying it into
// storage location 0.
func (s *Stack) flushTxStacks(p int) {
    keepStack := s.Storage[p]

    storage := make([][]int, 1)
    for _, val := range keepStack {
        storage[0] = append(storage[0], val)
    }
    s.Storage = storage
}

func (s *Stack) inTx() bool {
    return s.InTx
}

func (s *Stack) Empty() bool {
    p := s.Pointer
    if len(s.Storage[p]) != 0 {
        return false
    }

    return true
}

func (s *Stack) Size() int {
    p := s.Pointer
    return len(s.Storage[p])
}

func (s *Stack) Push(val int) {
    p := s.Pointer
    s.Storage[p] = append(s.Storage[p], val)
}

func (s *Stack) Pop() int {
    p := s.Pointer
    val := s.Storage[p][s.Size()-1]
    s.Storage[p] = s.Storage[p][:s.Size()-1]

    return val
}

func (s *Stack) Peek() int {
    p := s.Pointer
    return s.Storage[p][s.Size()-1]
}

// Begin creates a new transaction by making a new temporary slice and
// copying the last storage into it. Only when a full commit or
// rollback takes place will the temporary stacks be flushed.
func (s *Stack) Begin() {
    s.InTx = true
    newStack := make([]int, s.Size())

    p := s.Pointer
    copy(newStack, s.Storage[p])

    s.Storage = append(s.Storage, newStack)

    s.Pointer++
    s.TxCommitted++
}

// Commit decrements the internal TxCommitted counter and when a full
// commit takes place (when the counter reaches 0) it flushes all
// stacks while only retaining the latest.
func (s *Stack) Commit() {
    s.TxCommitted--
    if s.TxCommitted == 0 {
        s.flushTxStacks(s.Pointer)
        s.Pointer = 0
    }
}

func (s *Stack) Rollback() {
    s.flushTxStacks(0)
    s.Pointer = 0
    s.TxCommitted = 0
    s.InTx = false
}

func New() Stack {
    storage := make([][]int, 1)
    return Stack{
        Storage:     storage,
        InTx:        false,
        Pointer:     0,
        TxCommitted: 0,
    }
}

It could certainly use some more work like implementing error handling and watching for underflows and overflows but it works and I'm pretty happy with the outcome. Interestingly there isn't much on the web about writing a stack like this. Actually, I couldn't find anything but information about SQL transactions. This made it even more satisfying when all the tests passed.