GapBuffer: Header-Only C++17 Text Editor Data Structure

Overview

GapBuffer is a header-only, fully-templated C++17 implementation of the gap buffer data structure — the foundational text-editor buffer used historically by Emacs. The core innovation is maintaining a contiguous internal gap at the cursor position, making insertions and deletions at the cursor O(1) amortized while preserving O(1) random access via transparent index translation. The implementation features a random-access iterator (GapBufferIterator<T>) that satisfies std::sort requirements, complete Rule-of-Five compliance with timing-validated move semantics (<10 microseconds for 10,000-element moves), the Scott Meyers const-cast idiom for DRY const/non-const overloads, STL-compatible type aliases and comparison operators, and a 2x doubling reallocation strategy with two-phase element transfer. The 41-test Qt Test harness covers edge cases including pathological self-assignment chains ((buf = buf = buf) = buf), nested GapBuffer<GapBuffer<int>> containers, and std::sort compatibility verification.

Gap Buffer: Physical Memory Layout

[ element_0 ... element_{cursor-1} | GAP_SLOTS | element_{cursor} ... element_{n-1} ]
  ^─── pre-cursor content ─────^    ^── gap ──^   ^─── post-cursor content ─────^

Five member variables track the buffer state:

Field Type Invariant
_logical_size size_t Number of actual elements (excludes gap)
_buffer_size size_t Total allocated capacity (includes gap)
_cursor_index size_t Array index where the gap starts
_gap_size size_t Number of unused slots in the gap
_elems T* Raw pointer to the underlying array

Fundamental invariant: _buffer_size == _logical_size + _gap_size must always hold.

Core Operations

O(1) Amortized Insertion at Cursor

void insert_at_cursor(const_reference element) {
    if (_gap_size == 0) reserve(2 * _buffer_size);  // 2x doubling
    _elems[_cursor_index++] = element;
    ++_logical_size;
    --_gap_size;
}

The element is placed directly into the gap slot at _cursor_index. When the gap is exhausted, the buffer doubles in size. An rvalue overload uses std::move(element) to avoid copying expensive types (e.g., GapBuffer<std::vector<int>>).

O(1) Deletion at Cursor (Backspace Semantics)

void delete_at_cursor() {
    if (_cursor_index > 0) {
        --_cursor_index;
        --_logical_size;
        ++_gap_size;
    }
}

No data is physically moved — the cursor retreats one position, growing the gap to “absorb” the element. Deletion at position 0 is a safe no-op.

O(|delta|) Cursor Movement

void move_cursor(int delta) {
    if (delta > 0) {
        // Moving right: shift elements from after gap to before gap
        std::move(begin_after_gap, begin_after_gap + delta, cursor_position);
    } else {
        // Moving left: shift elements from before gap to after gap
        std::move(cursor_position + delta, cursor_position, destination_after_gap);
    }
    _cursor_index += delta;
}
Moving the cursor by delta positions requires physically relocating |delta| elements across the gap boundary via std::move. This is **O( delta )** — proportional to the distance moved, not the total buffer size. This matches text editor access patterns where edits are localized.

Transparent Index Translation

The gap is invisible to external users. Two private helpers translate between logical (external) and physical (array) indices:

size_type to_array_index(size_type external_index) const {
    if (external_index < _cursor_index) return external_index;        // Before gap: identity
    else return external_index + _gap_size;                           // After gap: skip gap
}

size_type to_external_index(size_type array_index) const {
    if (array_index < _cursor_index) return array_index;              // Before gap: identity
    else if (array_index >= _cursor_index + _gap_size)
        return array_index - _gap_size;                               // After gap: subtract gap
    else throw std::string("Inside gap: invalid");
}

Memory Management: 2x Doubling Reallocation

void reserve(size_type new_size) {
    if (_logical_size >= new_size) return;              // No shrinking
    auto new_elems = new T[new_size];
    std::move(_elems, _elems + _cursor_index, new_elems);           // Phase 1: pre-gap
    size_t new_gap_size = new_size - _logical_size;
    std::move(_elems + _cursor_index + _gap_size,
              _elems + _buffer_size,
              new_elems + _cursor_index + new_gap_size);             // Phase 2: post-gap
    delete[] _elems;
    _elems = new_elems;
    _buffer_size = new_size;
    _gap_size = new_gap_size;
}

Two-phase element transfer: Pre-cursor elements are moved first, then post-cursor elements are repositioned after the enlarged gap. The old array is delete[]‘d only after the new one is fully populated — if new throws, _elems still points to valid data. Uses std::move for element transfer, avoiding unnecessary copies of complex types.

Rule-of-Five: Complete Value Semantics

Constructor/Operator Behavior
GapBuffer() Default: 10-slot buffer (kDefaultSize), entire buffer is gap
GapBuffer(count, val) Fill: allocates 2*count slots, fills count elements
GapBuffer(initializer_list) Delegates to fill, then copies from initializer
GapBuffer(const GapBuffer&) Deep copy: new array, copies all elements
GapBuffer(GapBuffer&&) Pointer steal, nullifies source (other._elems = nullptr)
operator=(const GapBuffer&) Deep copy assignment with self-assignment check
operator=(GapBuffer&&) Move assignment with self-assignment check
~GapBuffer() delete[] _elems

Scott Meyers Const-Cast Idiom

Non-const access methods delegate to their const counterparts to avoid code duplication:

reference get_at_cursor() {
    return const_cast<reference>(
        static_cast<const GapBuffer*>(this)->get_at_cursor()
    );
}

This pattern appears in get_at_cursor(), at(), and operator[].

Random-Access Iterator

GapBufferIterator<T> stores a pointer to the owning GapBuffer and a logical index (not array index). Dereferencing calls _pointee->at(_index), which handles gap translation transparently.

Category Operations
Dereference operator*()_pointee->at(_index)
Increment/Decrement ++, -- (pre and post)
Random access +=, -=, +, - (iterator arithmetic)
Difference iter - iterdifference_type
Comparison ==, !=, <, >, <=, >=
Subscript operator[](n)*(*this + n)

The constructor is private — only GapBuffer<T> (declared friend) can create iterators. Users obtain them via begin(), end(), or cursor() (a gap-buffer-specific method returning an iterator at the cursor position).

STL Compatibility

The iterator satisfies random-access iterator requirements, verified by std::sort compatibility in the test suite (TEST5F).

STL-Compatible Operator Overloads

Operator Implementation
operator<< {elem1, elem2, ^elem3, ...} with ^ marking cursor position
operator== / != Element-wise via std::equal
operator< / > / <= / >= Lexicographic via std::lexicographical_compare

Performance Characteristics

Operation Complexity Notes
insert_at_cursor O(1) amortized O(n) worst-case during reallocation
delete_at_cursor O(1) Index adjustment only
move_cursor(delta) O(|delta|) Physical element transfer across gap
at(pos) / operator[] O(1) Index translation + array access
reserve(new_size) O(n) Two-phase element move
Move constructor O(1) Pointer steal (verified <10 us for 10K elements)
Copy constructor O(n) Deep copy

Why Gap Buffers Excel for Text Editing

Aspect Gap Buffer Rope Piece Table
Insert at cursor O(1) amortized O(log n) O(1) append
Delete at cursor O(1) O(log n) O(k) split
Random access O(1) O(log n) O(k) pieces
Cache locality Contiguous Fragmented Fragmented
Move cursor by k O(k) O(log n) O(1)
Implementation ~500 lines Thousands Hundreds

The gap buffer is optimal when edits are strongly localized (typing, backspacing, small cursor movements) — exactly the dominant pattern in text editors.

Test Suite (41 Qt Test Cases)

Part Tests Coverage
Part 1: Basic Ops 11 Insert, delete, move_cursor, reserve, 10K-element stress test
Part 2: Const 1 Mutable vs. const reference semantics
Part 3: Operators 10 Fill constructor, operator[], operator<<, ==/!=, lexicographic </>
Part 5: Iterators 6 Range-based for, bidirectional, random-access arithmetic, std::sort compatibility
Part 6: Copy 6 Initializer list, deep copy, pathological self-assignment chains
Part 7: Move 8 Move constructor/assignment, timing tests (<10 us), rvalue insert, GapBuffer<GapBuffer<int>> nesting

Edge cases tested: empty buffer ops, single-element buffers, deletion at position 0 (no-op), reserve smaller than logical size (no-op), self-assignment (buf = buf), chained self-assignment ((buf = buf = buf) = buf), self-move-assignment, nested containers (GapBuffer<std::vector<int>>, GapBuffer<GapBuffer<int>>).

Tech Stack

C++17, Qt (qmake build system), Qt Test (QTest framework), STL (std::move, std::equal, std::lexicographical_compare, std::fill, std::initializer_list)