go-ruby-strscan

Ruby's StringScanner in pure Go — MRI-compatible, no cgo.

License: BSD-3-Clause · Go 1.26.4+ · Coverage: 100% · Documentation: MkDocs Material + mike

go-ruby-strscan is a pure-Go (no cgo) reimplementation of Ruby's StringScanner (the strscan library): a cursor over a string that advances by matching patterns at the current position. It implements the full method surface (scan / scan_until / skip, match? / check / check_until, peek / getch, the pos / charpos cursors, capture access through [] / captures, and unscan) and matches reference Ruby byte-for-byte. Pattern matching is backed by go-ruby-regexp, the sibling pure-Go Onigmo implementation.

The library was extracted from rbgo's prelude/internals into a reusable standalone module. It does not depend on the Ruby runtime; the runtime depends on it. It is the StringScanner backend for go-embedded-ruby, bound by rbgo as a native module just like go-ruby-regexp and go-ruby-erb. It is differential-tested against MRI, has 100% coverage, and CI is green across 6 architectures and 3 OSes.

Cursor model (ready)

Covers the byte cursor pos and the codepoint cursor charpos, the scanned, pre-match and post-match regions, and reset / terminate: the StringScanner state that MRI exposes.
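The cursor state can be illustrated against reference Ruby, which is the behaviour this library aims to match. The snippet below is a minimal sketch using MRI's own strscan; it does not show this library's Go API, and the helper name `cursor_state` is made up for the example.

```ruby
require 'strscan'

# Hypothetical helper: scans one pattern and snapshots the cursor state.
def cursor_state(source, pattern)
  s = StringScanner.new(source)
  s.scan(pattern)
  state = {
    pos: s.pos,               # byte offset of the cursor
    charpos: s.charpos,       # codepoint offset of the cursor
    pre_match: s.pre_match,   # text before the last match
    matched: s.matched,       # text of the last match
    post_match: s.post_match  # text after the last match
  }
  s.reset                     # cursor back to the start
  state[:pos_after_reset] = s.pos
  s.terminate                 # cursor to the end of the string
  state[:pos_after_terminate] = s.pos
  state
end

# "é" and "ö" take two bytes each in UTF-8, so pos and charpos diverge.
p cursor_state("h\u00e9llo w\u00f6rld", /[^ ]+/)
```

After scanning the first word, pos is 6 while charpos is 5, because the two-byte "é" counts once as a character but twice as bytes.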

Anchored scanning (ready)

scan, skip, match? and check match a pattern anchored at the cursor; scan and skip advance, match? and check only report — exactly MRI’s advance/no-advance split.
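The advance/no-advance split is easiest to see in reference Ruby. This is a sketch of the MRI behaviour only, not of this library's Go API; `anchored_demo` is a hypothetical helper that records each call's return value together with the byte position afterwards.

```ruby
require 'strscan'

# Hypothetical helper: returns [operation, result, pos-after-call] triples.
def anchored_demo
  s = StringScanner.new("hello world")
  steps = []
  steps << [:match?, s.match?(/hello/), s.pos]     # reports length, no advance
  steps << [:check, s.check(/hello/), s.pos]       # reports text, no advance
  steps << [:scan_miss, s.scan(/world/), s.pos]    # anchored: "world" is not at the cursor
  steps << [:scan, s.scan(/hello/), s.pos]         # returns text and advances
  steps << [:skip, s.skip(/\s+/), s.pos]           # returns length and advances
  steps
end

anchored_demo.each { |step| p step }
```

Note that `scan(/world/)` returns nil at position 0 even though "world" occurs later in the string: the match must start exactly at the cursor.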

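Searching forward (ready)

scan_until, skip_until, check_until and exist? search forward for the pattern instead of requiring it at the cursor. scan_until and check_until return everything up to and including the match, while skip_until and exist? return the length of that span. scan_until and skip_until consume it; check_until and exist? leave the cursor in place, as MRI does.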
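A short sketch of the forward-searching calls in reference Ruby follows. It shows MRI's behaviour, not this library's Go API, and `search_demo` is a hypothetical helper that pairs each return value with the byte position after the call.

```ruby
require 'strscan'

# Hypothetical helper: maps each operation to [result, pos-after-call].
def search_demo
  s = StringScanner.new("key=value; rest")
  out = {}
  out[:check_until] = [s.check_until(/;/), s.pos]   # text up to the match, no advance
  out[:exist]       = [s.exist?(/;/), s.pos]        # offset of the match end, no advance
  out[:scan_until]  = [s.scan_until(/;/), s.pos]    # same text, cursor moves past ";"
  out[:skip_until]  = [s.skip_until(/r/), s.pos]    # bytes skipped, cursor moves past "r"
  out[:miss]        = [s.scan_until(/zzz/), s.pos]  # no match: nil, cursor unchanged
  out
end

search_demo.each { |name, pair| puts "#{name}: #{pair.inspect}" }
```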

Peeking & getch (ready)

peek(n) returns the next n bytes without advancing, getch consumes one (multibyte-aware) character, and eos? reports end-of-string.
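The difference between byte-oriented peek and character-oriented getch shows up on multibyte input. The sketch below uses reference Ruby rather than this library's Go API; `peek_getch_demo` is a hypothetical helper.

```ruby
require 'strscan'

# Hypothetical helper: walks a two-character string whose first
# character ("é") occupies two bytes in UTF-8.
def peek_getch_demo
  s = StringScanner.new("\u00e9!")
  out = []
  out << s.peek(1).bytes   # peek counts bytes: only the first byte of "é"
  out << s.pos             # peek did not advance
  out << s.getch           # getch consumes the whole character
  out << s.pos             # cursor moved by two bytes
  out << s.peek(5)         # peek is clamped to what remains
  out << s.getch
  out << s.eos?            # nothing left
  out << s.getch           # getch at end-of-string
  out
end

p peek_getch_demo
```

Because peek works in bytes, `peek(1)` on a multibyte character yields a partial, invalid UTF-8 sequence, whereas getch never splits a character.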

Captures & unscan (ready)

After a match, [] and captures expose the regexp groups ([0] the whole match, named groups included), and unscan rolls the cursor back to before the last scan — matching MRI’s semantics.
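Capture access and unscan can be sketched in reference Ruby as follows. This illustrates MRI's semantics, not this library's Go API; `captures_demo` is a hypothetical helper, and the `captures` method assumes a Ruby recent enough to ship it (2.5 or later).

```ruby
require 'strscan'

# Hypothetical helper: scans one key=value pair and inspects its groups.
def captures_demo
  s = StringScanner.new("port=8080;")
  whole = s.scan(/(?<key>[a-z]+)=(?<value>\d+)/)
  out = {
    whole: whole,           # return value of scan
    zero: s[0],             # [0] is the whole match
    first: s[1],            # groups are numbered from 1
    named: s["value"],      # named groups are reachable by name
    captures: s.captures,   # all groups, excluding the whole match
    pos: s.pos
  }
  s.unscan                  # roll back to before the last scan
  out[:pos_after_unscan] = s.pos
  out
end

p captures_demo
```

In MRI only one previous position is remembered, so unscan undoes the most recent match and nothing further back.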

Differential oracle & coverage (ready)

A wide corpus of scan sequences is run both through this library and through the system ruby, and the results are compared byte-for-byte. The code has 100% coverage, is gofmt and go vet clean, and is green across all six 64-bit Go architectures and three OSes. Pattern matching is backed by go-ruby-regexp.
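To make the idea of a byte-for-byte comparison concrete, here is a sketch of what the Ruby side of such an oracle could look like. The trace format (one tab-separated line per step) and the helper name `trace` are assumptions made for this example; the project's actual corpus and harness are not described in this document.

```ruby
require 'strscan'

# Hypothetical trace format: "operation<TAB>result.inspect<TAB>pos".
# A second implementation emitting the same lines could be diffed
# against this output byte-for-byte.
def trace(source, steps)
  s = StringScanner.new(source)
  steps.map do |op, pattern|
    result = s.public_send(op, pattern)
    "#{op}\t#{result.inspect}\t#{s.pos}"
  end
end

puts trace("ab12", [[:scan, /[a-z]+/], [:check, /\d/], [:scan_until, /2/]])
```

Recording the cursor position after every step, not just the return value, is what lets a comparison like this catch a call that returns the right text but advances by the wrong amount.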

In summary, go-ruby-strscan is a faithful port of MRI's strscan.c in pure Go with cgo disabled, so it cross-compiles and embeds anywhere. It implements the full scanner surface (scan, scan_until, skip, match?, check, peek, getch, pos / charpos, captures and unscan), with pattern matching backed by go-ruby-regexp, and is validated differentially against the system ruby binary. It is a standalone, reusable module extracted from rbgo's internals, and the StringScanner backend for the sibling org github.com/go-embedded-ruby.