xsx

package module
v0.9.0
Published: Jul 27, 2025 License: MIT Imports: 11 Imported by: 5

README

XSX – eXtended S-eXpressions


import "git.fractalqb.de/fractalqb/xsx"


Package XSX provides tools for parsing something I call eXtended S-eXpressions. Extended means the following things compared to SEXP S-expressions:

  1. Nested groups are delimited by a configurable set of balanced brackets, e.g. '()', '[]' or '{}' – not only by '()'.

  2. XSX provides a notation for "Meta Values", i.e. XSXs that provide some sort of out-of-band information.

On the other hand some properties from SEXP were dropped, e.g. typing of the so called "octet strings". Things like that are completely left to the application.

Somewhat more formal description

First of all, XSX is not about datatypes; in this it is comparable to e.g. XML (No! Don't leave… it's much simpler). Instead, its building block is the atom, i.e. nothing else than a sequence of characters, aka a 'string'. Atoms come as quoted atoms and as unquoted atoms. An atom has to be quoted when the atom's string contains characters that have a special meaning in XSX: whitespace, grouping characters, the quote and the meta character.

Regexp style definition of Atom
ws           := UNICODE whitespace
o-group      := “set of opening chars”               – e.g. '(' | '[' | '{'
c-group      := “set of corresponding closing chars” – e.g. ')' | ']' | '}'
quote        := “the quote char”                     – e.g. "
meta         := “the meta char”                      – e.g. \
syntax-chars := o-group | c-group | quote | meta

atom     := nq-atom | q-atom
nq-atom  := (^(syntax-chars|ws))+
q-atom   := quote(quote{2}|^quote)*quote

I.e. x is an atom and foo, bar and baz are atoms. An atom that contains space, a " or \ would be quoted. Also "(" is an atom but ( is not an atom, i.e. any atom containing o-group or c-group characters needs quotes.
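
The quoting rule above can be sketched in a few lines of Go. This is only an illustration of the grammar, not the package's own code: the character set in syntaxChars (three bracket pairs, " as quote, \ as meta) and the helper names needsQuote and quoteAtom are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// syntaxChars is an assumed character set for illustration: three bracket
// pairs, '"' as the quote char and '\' as the meta char.
const syntaxChars = `()[]{}"\`

// needsQuote reports whether s cannot be written as an unquoted atom:
// it is empty, or it contains whitespace or a syntax character.
func needsQuote(s string) bool {
	if s == "" {
		return true // nq-atom requires at least one character
	}
	for _, r := range s {
		if unicode.IsSpace(r) || strings.ContainsRune(syntaxChars, r) {
			return true
		}
	}
	return false
}

// quoteAtom returns s as an atom, quoting only when necessary. Inside a
// quoted atom every quote char is doubled, as in the q-atom rule.
func quoteAtom(s string) string {
	if !needsQuote(s) {
		return s
	}
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

func main() {
	for _, s := range []string{"foo", "foo bar", "(", `a "foo"`} {
		fmt.Println(quoteAtom(s))
	}
}
```

For real use, Syntax.QuoteIf documented further down is the function to call; the sketch only makes the rule concrete.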

Groups – now BNF Style

Each atom is an XSX and from XSX'es one can build groups:

XSX   := atom | group
group := o-group {ws} c-group | o-group {ws} xsxs {ws} c-group
xsxs  := XSX | XSX {ws} xsxs
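
To make the group rule concrete, here is a minimal sketch of a balance check that uses a stack of expected closing brackets. It assumes the bracket pairs '()', '[]' and '{}' and the quote char "; the function name balanced is made up for this example and is not part of the package.

```go
package main

import "fmt"

// closers maps each assumed opening char to its closing char.
var closers = map[rune]rune{'(': ')', '[': ']', '{': '}'}

// balanced reports whether every group in src is closed by the bracket that
// matches its opener. Brackets inside quoted atoms do not count. Toggling
// on every quote char also handles doubled quotes: they toggle twice.
func balanced(src string) bool {
	var stack []rune // expected closing chars, innermost last
	inQuote := false
	for _, r := range src {
		switch {
		case r == '"':
			inQuote = !inQuote
		case inQuote:
			// plain text inside a quoted atom
		default:
			if c, ok := closers[r]; ok {
				stack = append(stack, c)
			} else if r == ')' || r == ']' || r == '}' {
				if len(stack) == 0 || stack[len(stack)-1] != r {
					return false
				}
				stack = stack[:len(stack)-1]
			}
		}
	}
	return len(stack) == 0 && !inQuote
}

func main() {
	for _, s := range []string{"(a [b] {c})", "(a ]", `("(" x)`, "(a"} {
		fmt.Printf("%-12s %v\n", s, balanced(s))
	}
}
```

The check only looks at bracket structure; it does not validate atoms or meta prefixes.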

Out-Of-Band Information with Meta XSXs

You can prefix each XSX with the meta char to make that expression a meta-expression. A meta-expression is not considered to be an XSX, i.e. you cannot create meta-meta-expressions or meta-meta-meta-expressions… hmm… and not even meta-meta-meta-meta-expressions! I think it became clear?

The special case when the meta char does not prefix an XSX e.g., “\ ” or (\), makes the meta char the special meta token void. One may use it to denote the absence of something when position in a group is relevant.

E.g. \4711 is a meta-atom and \{foo 1 bar false baz 3.1415} is a meta-group. What meta means is completely up to the application. Imagine (div hiho) and (div \{class green} hiho) to be a translation from <div>hiho</div> and <div class="green">hiho</div>.
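
As an illustration of how an application might hold such data, the following sketch builds the (div \{class green} hiho) example from a small tree and renders it back to text. The node type and its helpers are hypothetical and not part of the xsx package; the sketch also assumes that no atom needs quoting and that \ is the meta char.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical in-memory form of an XSX: an atom when Open is 0,
// otherwise a group delimited by Open and Close.
type node struct {
	Atom        string
	Open, Close rune
	Meta        bool
	Children    []node
}

func atom(s string) node { return node{Atom: s} }

func group(open, close rune, kids ...node) node {
	return node{Open: open, Close: close, Children: kids}
}

// meta marks n as a meta-expression; n is passed by value, so the
// argument itself is left unchanged.
func meta(n node) node { n.Meta = true; return n }

// write renders n, prefixing meta-expressions with the meta char.
func (n node) write(sb *strings.Builder) {
	if n.Meta {
		sb.WriteByte('\\')
	}
	if n.Open == 0 {
		sb.WriteString(n.Atom)
		return
	}
	sb.WriteRune(n.Open)
	for i, c := range n.Children {
		if i > 0 {
			sb.WriteByte(' ')
		}
		c.write(sb)
	}
	sb.WriteRune(n.Close)
}

func render(n node) string {
	var sb strings.Builder
	n.write(&sb)
	return sb.String()
}

func main() {
	attrs := meta(group('{', '}', atom("class"), atom("green")))
	fmt.Println(render(group('(', ')', atom("div"), attrs, atom("hiho"))))
}
```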

Rationale

None! … despite the fact that I found it to be fun – and useful in some situations. Because XSX syntax is so simple it is easy to use for proprietary data files.

The first implementation was inspired by the expat streaming parser, which lets one push data into the parsing machinery and fires the appropriate callbacks when tokens are detected. Reimplementing it as a simple pulling scanner simplified the code dramatically. With goroutines one can easily turn that into the streaming, event-driven model again.
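
The pull-to-push idea can be sketched without the package at all. In the example below, pull is a stand-in for any pulling scanner and fieldsPuller is a toy source that splits on whitespace; both names, as well as stream and joined, are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// pull is a stand-in for a pulling scanner: each call returns the next
// token, or ok == false when the input is exhausted.
type pull func() (tok string, ok bool)

// fieldsPuller returns a pull function over the whitespace-separated
// words of s.
func fieldsPuller(s string) pull {
	words := strings.Fields(s)
	return func() (string, bool) {
		if len(words) == 0 {
			return "", false
		}
		w := words[0]
		words = words[1:]
		return w, true
	}
}

// stream runs the pull loop in its own goroutine and pushes every token
// to the returned channel, which is closed at the end of input.
func stream(next pull) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for tok, ok := next(); ok; tok, ok = next() {
			ch <- tok
		}
	}()
	return ch
}

// joined consumes the stream and joins all received tokens with commas.
func joined(next pull) string {
	var toks []string
	for tok := range stream(next) {
		toks = append(toks, tok)
	}
	return strings.Join(toks, ",")
}

func main() {
	for tok := range stream(fieldsPuller("foo bar baz")) {
		fmt.Println("event:", tok)
	}
}
```

A consumer that stops reading early would leave the goroutine blocked on the send, so a real adapter would also want a way to cancel it.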

So, if you are looking for something that's even simpler than JSON or YAML you might give it a try… Happy coding!

Documentation

Overview

Package xsx provides tools for parsing so called eXtended S-eXpressions. Extended means the following things compared to https://github.com/jpmalkiewicz/rivest-sexp:

Nested structures are delimited by balanced braces, e.g. '()', '[]' or '{}' – not only by '()'.

XSX provides a notation for "Meta Values", i.e. XSX that provide some sort of meta information that is not part of the "normal" data.

On the other hand some properties from SEXP were dropped, e.g. typing of the so called "octet strings". Things like that are completely left to the application.

Functions

func Check added in v0.9.0

func Check(summary *[]error, tok *Token, exs ...Expectation) bool

Check collects failed checks in summary. It returns true if all expectations are met and false otherwise.

Example
var tok Token
scn := NewStringDefaultScanner("\\", &tok)
must.Ret(scn.HasNext(false))
var fails []error
switch {
case Check(&fails, &tok, AtomEq("foo")):
	fmt.Println("found an atom")
case Check(&fails, &tok, Begin):
	fmt.Println("some meta value")
case Check(&fails, &tok, Void):
	fmt.Println("The VOID")
}
fmt.Println(errors.Join(fails...))
Output:

The VOID
token Void is not Atom
token Void is not Begin

Types

type Any added in v0.9.0

type Any []Expectation

Any is an expectation that is met if any of its elements is met.

func (Any) Check added in v0.9.0

func (any Any) Check(tok *Token) error

Check returns the joined errors when all expectations fail.

type AsInt added in v0.9.0

type AsInt struct{ Ptr *int }

func (AsInt) Check added in v0.9.0

func (ex AsInt) Check(tok *Token) (err error)

type AsTime added in v0.9.0

type AsTime struct {
	Ptr    *time.Time
	Layout string
}
Example
scn := NewStringDefaultScanner("2025-07-20T15:30:12Z", nil)
var t time.Time
must.Do(scn.Expect(false, Atom, Meta(false), AsTime{
	Ptr:    &t,
	Layout: time.RFC3339,
}))
fmt.Println(t)
Output:

2025-07-20 15:30:12 +0000 UTC

func (AsTime) Check added in v0.9.0

func (ex AsTime) Check(tok *Token) (err error)

type AsUint added in v0.9.0

type AsUint struct{ Ptr *uint }

func (AsUint) Check added in v0.9.0

func (ex AsUint) Check(tok *Token) error

type AtomEq added in v0.9.0

type AtomEq string

AtomEq expects a specific atom.

func (AtomEq) Check added in v0.9.0

func (ex AtomEq) Check(tok *Token) error

type BeginEq added in v0.9.0

type BeginEq rune

BeginEq expects a specific begin rune.

func (BeginEq) Check added in v0.9.0

func (ex BeginEq) Check(tok *Token) error

type CompactWriter added in v0.9.0

type CompactWriter struct{ *SpacedWriter }

CompactWriter is a variant of SpacedWriter that ignores any calls to Space. It is used for compact XSX output. See also SpacedWriter.Compact.

func (CompactWriter) Space added in v0.9.0

func (CompactWriter) Space(spc string) (n int, err error)

type Expectation added in v0.8.1

type Expectation interface {
	Check(*Token) error
}

Expectation checks a Token and reports an error when the expectation is not met. See also Token.Check, Check and Scanner.Expect.
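
Because Expectation is a one-method interface, applications can presumably define their own checks. The sketch below shows the pattern with local stand-ins: token and expectation only mirror the shape of the documented types, and hasPrefix, notMeta and checkAll are hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// token and expectation are local stand-ins that mirror the shape of the
// documented Token and Expectation; they are not the xsx types.
type token struct {
	Text string
	Meta bool
}

type expectation interface {
	Check(*token) error
}

// hasPrefix is a hypothetical custom expectation on the token text.
type hasPrefix string

func (p hasPrefix) Check(t *token) error {
	if !strings.HasPrefix(t.Text, string(p)) {
		return fmt.Errorf("token %q lacks prefix %q", t.Text, string(p))
	}
	return nil
}

// notMeta is a hypothetical expectation that rejects meta tokens.
type notMeta struct{}

func (notMeta) Check(t *token) error {
	if t.Meta {
		return errors.New("token is meta")
	}
	return nil
}

// checkAll runs the expectations in order and returns the first error,
// or nil if all of them are met.
func checkAll(t *token, exs ...expectation) error {
	for _, ex := range exs {
		if err := ex.Check(t); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	tok := &token{Text: "user:alice"}
	fmt.Println(checkAll(tok, notMeta{}, hasPrefix("user:")))
	fmt.Println(checkAll(tok, hasPrefix("group:")))
}
```

With the real package, a type with a Check(*Token) error method should be usable wherever an Expectation is accepted, for example in Scanner.Expect.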

type Indenter added in v0.9.0

type Indenter struct {
	// contains filtered or unexported fields
}

Indenter provides functionality to generate whitespace prefixes for SpacedWriter.Space, allowing the creation of properly indented XSX output.

func NewIndenter added in v0.9.0

func NewIndenter(step string) *Indenter

func (*Indenter) Indent added in v0.9.0

func (i *Indenter) Indent(delta int) *Indenter

func (*Indenter) Nl added in v0.9.0

func (i *Indenter) Nl(n int) string

func (*Indenter) Prefix added in v0.9.0

func (i *Indenter) Prefix() string

type Match added in v0.9.0

type Match struct{ R *regexp.Regexp }

Match expects a token to match the regular expression R.

func (Match) Check added in v0.9.0

func (ex Match) Check(tok *Token) error

type Meta

type Meta bool

Meta expects a token to be meta (true) or not meta (false).

func (Meta) Check added in v0.9.0

func (ex Meta) Check(tok *Token) error

type Quoted

type Quoted bool

Quoted expects tok to be an atom and that the atom is quoted (true) or unquoted (false).

func (Quoted) Check added in v0.9.0

func (ex Quoted) Check(tok *Token) error

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

func NewDefaultScanner added in v0.8.0

func NewDefaultScanner(r io.Reader, tok *Token) *Scanner

NewDefaultScanner creates a new Scanner using the provided io.Reader as input. The Scanner is initialized with the DefaultSyntax settings. The tok parameter specifies a Token to be reused for scanning. If nil, a new Token will be allocated.

func NewScanner

func NewScanner(r io.Reader, s Syntax, tok *Token) (scn *Scanner, err error)

NewScanner creates and returns a new Scanner instance using the provided io.Reader and Syntax. It validates the given Syntax before creating the Scanner. The tok parameter specifies a Token to be reused for scanning. If nil, a new Token will be allocated.

func NewStringDefaultScanner added in v0.8.0

func NewStringDefaultScanner(str string, tok *Token) *Scanner

NewStringDefaultScanner creates a new Scanner that reads from the provided string using DefaultSyntax. The tok parameter specifies a Token to be reused for scanning. If nil, a new Token will be allocated.

func NewStringScanner added in v0.8.0

func NewStringScanner(str string, s Syntax, tok *Token) (*Scanner, error)

NewStringScanner creates a new Scanner that reads from the provided string using the specified Syntax. The tok parameter specifies a Token to be reused for scanning. If nil, a new Token will be allocated.

func (*Scanner) CanExpect added in v0.9.0

func (scn *Scanner) CanExpect(space bool, es ...Expectation) (bool, error)

CanExpect attempts to get the next token and checks if it satisfies the provided expectations. It returns (true, nil) if successful, (false, nil) if there are no more tokens, and (false, error) if scanning fails or the token does not meet expectations.

func (*Scanner) Expect added in v0.9.0

func (scn *Scanner) Expect(space bool, es ...Expectation) error

Expect advances the scanner to the next token (see Scanner.Token) and checks if the token satisfies the provided expectations es. It returns an error if advancing to the next token fails or if the token does not meet the expectations.

func (*Scanner) Expected added in v0.9.0

func (scn *Scanner) Expected(space bool, es ...Expectation) iter.Seq2[*Token, error]

Expected returns an iterator that yields tokens from the scanner that satisfy the provided expectations. If 'space' is true, whitespace tokens are included; otherwise, they are skipped. The iterator yields the scanner's internal token along with any error encountered during scanning.

func (*Scanner) GroupExpect added in v0.9.0

func (scn *Scanner) GroupExpect(space bool, es ...Expectation) (bool, error)

GroupExpect attempts to get the next token in the current group and checks if it satisfies the provided expectations. It returns (true, nil) if successful, (false, nil) if there are no more tokens in the group, and (false, error) if the scanner is not currently within a group, if scanning fails, or if the token does not meet the expectations.

func (*Scanner) GroupExpected added in v0.9.0

func (scn *Scanner) GroupExpected(space bool, es ...Expectation) iter.Seq2[*Token, error]

GroupExpected returns an iterator that yields tokens from within the current group that satisfy the provided expectations. If 'space' is true, whitespace tokens are included; otherwise, they are skipped. It is an error to call GroupExpected while not inside a group. The iterator yields the scanner's internal token along with any error encountered.

func (*Scanner) GroupNext added in v0.9.0

func (scn *Scanner) GroupNext(space bool) (bool, error)

GroupNext tries to read the next token in the current group and returns true on success. It is an error to call GroupNext while scn is not inside an XSX group. At the end of the group (false, nil) is returned.

func (*Scanner) GroupTokens added in v0.9.0

func (scn *Scanner) GroupTokens(space bool) iter.Seq2[*Token, error]

GroupTokens returns an iterator that yields tokens from within the current group. If 'space' is true, whitespace tokens are included; otherwise, they are skipped. It is an error to call GroupTokens while not inside a group. The iterator yields the scanner's internal token along with any error encountered.

func (*Scanner) HasNext added in v0.9.0

func (scn *Scanner) HasNext(space bool) (bool, error)

HasNext tries to get the next token using Scanner.Next and returns true if successful. If there is no next token, it returns false and a nil error. If an error occurs, HasNext returns false and the error.

func (*Scanner) Level added in v0.9.0

func (scn *Scanner) Level() int

Level returns the current nesting depth of the scanner. Zero means the scanner is at the top level, while positive values indicate the depth of nested groups.

func (*Scanner) Nesting added in v0.9.0

func (scn *Scanner) Nesting(i int) int

Nesting returns the nesting group at the specified level i. If i is negative, it is interpreted as an offset from the end of the slice (like Python indexing). If there is no such nesting level, Nesting returns -1.
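
The indexing convention described here can be sketched as follows. The example assumes the open groups are kept in a slice of group indices, outermost first; that layout and the helper name at are assumptions for illustration, not details taken from the package.

```go
package main

import "fmt"

// at returns stack[i], where a negative i counts from the end of the slice
// (-1 is the last element). It returns -1 if no such element exists.
func at(stack []int, i int) int {
	if i < 0 {
		i += len(stack)
	}
	if i < 0 || i >= len(stack) {
		return -1
	}
	return stack[i]
}

func main() {
	open := []int{0, 2, 1} // hypothetical group indices, outermost first
	fmt.Println(at(open, 0), at(open, -1), at(open, 3), at(open, -4))
}
```

Under these assumptions, index -1 always addresses the innermost open group regardless of the current depth.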

func (*Scanner) Next added in v0.8.0

func (scn *Scanner) Next(space bool) error

Next reads the next token from the scanner into the scanner's token buffer (see Scanner.Token). If space is true, space from the input is collected as a token as well. Otherwise space will be ignored.

func (*Scanner) SetToken added in v0.9.0

func (scn *Scanner) SetToken(tok *Token) (old *Token)

SetToken replaces the scanner's internal token buffer with the provided tok and returns the previous token buffer. If tok is nil, a new Token is allocated. The scanner will use the new token buffer for all subsequent scanning operations.

func (*Scanner) Syntax added in v0.8.0

func (scn *Scanner) Syntax() Syntax

func (*Scanner) Token added in v0.9.0

func (scn *Scanner) Token() *Token

Token returns the scanner's internal token buffer. This token is reused for all scanning operations. See Scanner.SetToken to use a different token.

func (*Scanner) Tokens added in v0.9.0

func (scn *Scanner) Tokens(space bool) iter.Seq2[*Token, error]

Tokens returns an iterator that yields tokens from the scanner. If 'space' is true, whitespace tokens are included; otherwise, they are skipped. The iterator yields the scanner's internal token along with any error encountered during scanning. Iteration stops when there are no more tokens or if the yield function returns false.

Example
p := NewStringDefaultScanner(` "xyz"foo `, nil)
var (
	tok *Token
	err error
)
for tok, err = range p.Tokens(true) {
	fmt.Println(tok)
}
fmt.Println(err)
Output:

<Space[ ]m>
<Atom[xyz]mQ>
<Atom[foo]mq>
<Space[ ]m>
<nil>

type SpacedWriter added in v0.9.0

type SpacedWriter struct {
	// contains filtered or unexported fields
}

SpacedWriter implements the Writer interface, providing XSX output with whitespace between elements.

func NewDefaultWriter added in v0.8.0

func NewDefaultWriter(w io.Writer) *SpacedWriter

func NewWriter added in v0.8.0

func NewWriter(w io.Writer, s Syntax) (wr *SpacedWriter, err error)

func (*SpacedWriter) Atom added in v0.9.0

func (w *SpacedWriter) Atom(a string, meta bool) (quoted bool, n int, err error)

func (*SpacedWriter) Begin added in v0.9.0

func (w *SpacedWriter) Begin(b int, meta bool) (n int, err error)

func (*SpacedWriter) Compact added in v0.9.0

func (w *SpacedWriter) Compact() CompactWriter

func (*SpacedWriter) End added in v0.9.0

func (w *SpacedWriter) End() (int, error)

func (*SpacedWriter) Flush added in v0.9.0

func (w *SpacedWriter) Flush() error

func (*SpacedWriter) Space added in v0.9.0

func (w *SpacedWriter) Space(spc string) (n int, err error)

func (*SpacedWriter) Syntax added in v0.9.0

func (w *SpacedWriter) Syntax() Syntax

func (*SpacedWriter) Void added in v0.9.0

func (w *SpacedWriter) Void() (n int, err error)

type Syntax added in v0.8.0

type Syntax struct {
	// contains filtered or unexported fields
}

func DefaultSyntax added in v0.8.0

func DefaultSyntax() Syntax

func NewSyntax added in v0.8.0

func NewSyntax(begin, end string, quote, meta rune) (Syntax, error)

func (Syntax) Begin added in v0.8.0

func (s Syntax) Begin() []rune

func (Syntax) End added in v0.8.0

func (s Syntax) End() []rune

func (Syntax) Groups added in v0.9.0

func (s Syntax) Groups() int

func (Syntax) IsBegin added in v0.8.0

func (s Syntax) IsBegin(r rune) int

func (Syntax) IsEnd added in v0.8.0

func (s Syntax) IsEnd(r rune) int

func (Syntax) IsToken added in v0.8.0

func (s Syntax) IsToken(r rune) bool

func (Syntax) Meta added in v0.8.0

func (s Syntax) Meta() rune

func (Syntax) Quote added in v0.8.0

func (s Syntax) Quote() rune

func (Syntax) QuoteIf added in v0.8.0

func (syn Syntax) QuoteIf(s string) (string, bool)
Example
syn := DefaultSyntax()
fmt.Println(syn.QuoteIf(""))
fmt.Println(syn.QuoteIf("foo"))
fmt.Println(syn.QuoteIf("foo bar"))
fmt.Println(syn.QuoteIf("(foo)"))
fmt.Println(syn.QuoteIf("\\foo"))
fmt.Println(syn.QuoteIf(`a "foo"`))
Output:

"" true
foo false
"foo bar" true
"(foo)" true
"\foo" true
"a ""foo""" true

type Token

type Token struct {
	Type         TokenType
	Group        int
	Meta, Quoted bool
	// contains filtered or unexported fields
}

func (*Token) Check added in v0.9.0

func (t *Token) Check(exs ...Expectation) error

Check runs all expectations exs against t and returns the respective error on the first failure. If all expectations are met, Check returns nil.

func (*Token) Rune added in v0.8.0

func (t *Token) Rune() rune

Rune returns unicode.MaxRune+1 if no rune is available.

func (*Token) String

func (t *Token) String() string

func (*Token) Text added in v0.9.0

func (t *Token) Text() string

type TokenType added in v0.8.0

type TokenType uint
const (
	Space TokenType = (1 << iota)
	Atom
	Begin
	End
	Void
)

func (TokenType) Check added in v0.9.0

func (tt TokenType) Check(tok *Token) error

Check expects tok to be of any of the token types in tt.
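
Since the constants are distinct bits (1 << iota), several token types can be combined into one value with the | operator. The following sketch shows that membership test with a local copy of the constants; tokenType and matches are stand-ins written for this example, not the package's code.

```go
package main

import "fmt"

// tokenType mirrors the documented bit-flag constants with a local type.
type tokenType uint

const (
	space tokenType = 1 << iota
	atom
	begin
	end
	void
)

// matches reports whether t is one of the types combined in mask.
func matches(mask, t tokenType) bool { return mask&t != 0 }

func main() {
	mask := atom | begin
	fmt.Println(matches(mask, atom), matches(mask, void))
}
```

With the real package this suggests that an expression like Atom|Begin can be passed as a single Expectation that accepts either token type.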

func (TokenType) String added in v0.8.0

func (i TokenType) String() string

type Writer added in v0.8.0

type Writer interface {
	// Flush writes any buffered data to the underlying io.Writer.
	Flush() error

	// Begin starts a new XSX element with the given group index `b`.
	// If `meta` is true, a meta marker is written before the element.
	Begin(b int, meta bool) (n int, err error)

	// End closes the current XSX element and returns the closed group's index.
	End() (int, error)

	// Atom writes an XSX atom with the given string `a`.
	// If `meta` is true, a meta marker is written before the atom.
	Atom(a string, meta bool) (quoted bool, n int, err error)

	// Void writes an XSX void element.
	Void() (n int, err error)

	// Space writes a string of whitespace characters.
	Space(spc string) (n int, err error)

	// Syntax returns the syntax used by this writer.
	Syntax() Syntax
}
Example
syn, _ := NewSyntax("<", ">", '\'', '^')
w, _ := NewWriter(os.Stdout, syn)
w.Begin(0, false)
w.Space("\n\t")
w.Atom("foo 'n' bar", true)
w.Space("\n\t")
w.Atom("4711", false)
w.Space("\n")
w.End()
w.Flush()
Output:

<
	^'foo ''n'' bar'
	4711
>
Example (Spaces)
w := NewDefaultWriter(os.Stdout)
w.Begin(0, false)
w.Void()
w.Atom("foo", false)
w.End()
w.Begin(0, false)
w.Atom("bar", true)
w.Void()
w.End()
w.Flush()
Output:

(\ foo) (\bar \)

Directories

Path Synopsis
Package gem provides a GEneric Model for XSX data
