Documentation ¶
Overview ¶
Package jsonl reads data in the JSON Lines format, also called newline-delimited JSON. JSON Lines is a convenient format for processing structured data one record at a time. Because each record occupies its own line, an unbounded number of records can be processed in constant memory.
Example (Sum) ¶
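	n := 10000
	pr, pw := io.Pipe()
	// Producer: write the integers 0..n-1 to the pipe, one JSON value per line.
	go func() {
		enc := json.NewEncoder(pw)
		for i := range n {
			_ = enc.Encode(i)
		}
		pw.Close()
	}()
	// Consumer: decode each line as an int and add it to the running total.
	sum := 0
	for v := range jsonl.Records[int](pr) {
		sum += v
	}
	fmt.Println(sum)
Output: 49995000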
Constants ¶
const MaxLineSize = bufio.MaxScanTokenSize
MaxLineSize is the default maximum size of a record and also the amount of memory allocated for decoding. It equals bufio.MaxScanTokenSize (64 KiB). Lines longer than this are not unmarshaled; they yield ErrTooLong instead.
Variables ¶
var ErrTooLong = bufio.ErrTooLong
ErrTooLong is signaled when a record exceeds the maximum allowed token size.
Functions ¶
func Records ¶
Records iterates over decoded values in JSON Lines data. Internally it uses bufio.Scanner; see its documentation for edge cases. Three kinds of error may occur during iteration:
Invalid JSON data yields json.SyntaxError (or any other error from json.Unmarshal). This is non-fatal; iteration continues with the next line.
A line exceeding the maximum size (WithMaxLineSize) yields ErrTooLong. This is non-fatal; the remainder of the line is discarded and iteration continues with the next line.
Errors from the io.Reader are fatal: the error is yielded and iteration then ends. The exception is io.EOF, which stops iteration without yielding an error.
Types ¶
type Option ¶
type Option func(*options)
func WithBuffer ¶
WithBuffer provides a preallocated buffer to use for reading. The full capacity of the buffer may be used, and that capacity limits the maximum line length. The actual maximum record size may be smaller, because the buffer may also need to hold, for instance, a newline.
func WithMaxLineSize ¶
WithMaxLineSize calls WithBuffer with a new buffer of the provided size. Unlike bufio.Scanner, which grows its buffer on demand, the full buffer is preallocated. The default size is MaxLineSize.