Documentation
Overview ¶
Package stringutil provides string manipulation utilities.
The package is organized into several categories:
Search and Indexing ¶
Functions for finding substrings and patterns:
indices := stringutil.AllIndexes("banana", "an") // [1, 3]
ok := stringutil.HasAnyPrefix(s, "http://", "https://")
ok = stringutil.ContainsAll(s, "foo", "bar")
Transformation ¶
Functions for transforming strings:
reversed := stringutil.Reverse("hello") // "olleh"
truncated := stringutil.Truncate(s, 100, "...")
padded := stringutil.PadLeft("42", 5, '0') // "00042"
Validation ¶
Functions for checking string properties:
if stringutil.IsNumeric(s) { ... }
if stringutil.IsAlpha(s) { ... }
if stringutil.IsPalindrome(s, false) { ... }
Similarity (see similarity.go) ¶
Algorithms for measuring string similarity:
distance := stringutil.LevenshteinDistance("kitten", "sitting") // 3
score := stringutil.JaroWinklerSimilarity("martha", "marhta", 0.1) // ~0.96
coefficient := stringutil.DiceCoefficient("night", "nacht")
All functions are designed to handle edge cases, such as empty input, gracefully.
Index ¶
- func AllIndexes(s, substr string) []int
- func Between(s, start, end string) (string, bool)
- func BetweenAll(s, start, end string) []string
- func CamelCase(s string) string
- func Capitalize(s string) string
- func CommonPrefix(strs ...string) string
- func CommonSuffix(strs ...string) string
- func ContainsAll(s string, substrs ...string) bool
- func ContainsAny(s string, substrs ...string) bool
- func CosineSimilarity(s1, s2 string, n int) float64
- func CountLines(s string) int
- func DamerauLevenshteinDistance(s1, s2 string) int
- func Dedent(s string) string
- func DiceCoefficient(s1, s2 string) float64
- func HammingDistance(s1, s2 string) int
- func HasAnyPrefix(s string, prefixes ...string) bool
- func HasAnySuffix(s string, suffixes ...string) bool
- func Indent(s, prefix string) string
- func IsASCII(s string) bool
- func IsAlpha(s string) bool
- func IsAlphanumeric(s string) bool
- func IsBlank(s string) bool
- func IsEmpty(s string) bool
- func IsLower(s string) bool
- func IsNumeric(s string) bool
- func IsPalindrome(s string, normalize bool) bool
- func IsPrintable(s string) bool
- func IsUpper(s string) bool
- func JaroSimilarity(s1, s2 string) float64
- func JaroWinklerSimilarity(s1, s2 string, prefixScale float64) float64
- func Join(elems []string, sep string) string
- func KebabCase(s string) string
- func LevenshteinDistance(s1, s2 string) int
- func LevenshteinSimilarity(s1, s2 string) float64
- func Lines(s string) []string
- func LongestCommonSubsequence(s1, s2 string) int
- func LongestCommonSubstring(s1, s2 string) string
- func NthRune(s string, n int) (rune, bool)
- func PadCenter(s string, length int, padChar rune) string
- func PadLeft(s string, length int, padChar rune) string
- func PadRight(s string, length int, padChar rune) string
- func PascalCase(s string) string
- func RemoveAll(s string, substrs ...string) string
- func Repeat(s string, n int) string
- func Reverse(s string) string
- func RuneCount(s string) int
- func SafeSlice(s string, start, end int) string
- func SnakeCase(s string) string
- func SplitAfter(s, sep string) []string
- func SplitN(s, sep string, n int) []string
- func StripTags(s string) string
- func SwapCase(s string) string
- func Title(s string) string
- func Truncate(s string, maxLen int, suffix string) string
- func TruncateWords(s string, maxLen int, suffix string) string
- func Words(s string) []string
- func Wrap(s string, width int) string
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AllIndexes ¶
AllIndexes returns all starting positions of substr in s. Returns nil if substr is empty or s doesn't contain substr.
Example:
indices := AllIndexes("banana", "an")
// indices = [1, 3]
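The search can be sketched as repeated calls to strings.Index. The lowercase name allIndexes is a hypothetical stand-in, not the package's implementation, and reporting overlapping matches is an assumption, since the documentation does not say how they are treated.

```go
package main

import (
	"fmt"
	"strings"
)

// allIndexes is a hypothetical stand-in for AllIndexes. It reports every
// byte offset at which substr occurs in s. Overlapping matches are
// included (an assumption; the documentation does not specify this).
func allIndexes(s, substr string) []int {
	if substr == "" {
		return nil
	}
	var out []int
	for from := 0; from <= len(s)-len(substr); {
		i := strings.Index(s[from:], substr)
		if i < 0 {
			break
		}
		out = append(out, from+i)
		from += i + 1 // advance one byte so overlapping matches are found
	}
	return out
}

func main() {
	fmt.Println(allIndexes("banana", "an")) // [1 3]
	fmt.Println(allIndexes("aaa", "aa"))    // [0 1]
}
```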
func Between ¶
Between extracts the substring between the start and end markers. Returns an empty string and false if the markers are not found in the proper order.
Example:
Between("[hello]", "[", "]") // "hello", true
func BetweenAll ¶
BetweenAll extracts all substrings between start and end markers.
Example:
BetweenAll("a[1]b[2]c[3]", "[", "]") // ["1", "2", "3"]
func CamelCase ¶
CamelCase converts s to camelCase.
Example:
CamelCase("hello_world") // "helloWorld"
CamelCase("hello-world") // "helloWorld"
func Capitalize ¶
Capitalize returns s with the first character uppercased and the rest lowercased.
Example:
Capitalize("hELLO") // "Hello"
func CommonPrefix ¶
CommonPrefix returns the longest common prefix of the given strings. Returns an empty string if there is no common prefix or if fewer than 2 strings are given.
Example:
CommonPrefix("interstellar", "internet", "internal") // "inter"
func CommonSuffix ¶
CommonSuffix returns the longest common suffix of the given strings.
func ContainsAll ¶
ContainsAll reports whether s contains all of the given substrings.
func ContainsAny ¶
ContainsAny reports whether s contains any of the given substrings.
Example:
if ContainsAny(text, "error", "fail", "warning") { ... }
func CosineSimilarity ¶
CosineSimilarity computes the cosine similarity of two strings based on their character n-gram vectors, where n is the n-gram size. Returns a value between 0 and 1.
This is useful for comparing longer texts.
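A minimal sketch of n-gram cosine similarity, assuming n-grams are counted over runes and that an input with no n-grams scores 0. The names ngramCounts and cosineSimilarity are hypothetical and are not taken from the package source.

```go
package main

import (
	"fmt"
	"math"
)

// ngramCounts maps each character n-gram of s to its number of occurrences.
func ngramCounts(s string, n int) map[string]int {
	r := []rune(s)
	counts := make(map[string]int)
	for i := 0; i+n <= len(r); i++ {
		counts[string(r[i:i+n])]++
	}
	return counts
}

// cosineSimilarity is a hypothetical stand-in for CosineSimilarity.
// It treats the n-gram counts as vectors and returns dot / (|a| * |b|).
func cosineSimilarity(s1, s2 string, n int) float64 {
	if n <= 0 {
		return 0 // assumption: a non-positive n-gram size scores 0
	}
	a, b := ngramCounts(s1, n), ngramCounts(s2, n)
	var dot, normA, normB float64
	for g, ca := range a {
		dot += float64(ca * b[g])
		normA += float64(ca * ca)
	}
	for _, cb := range b {
		normB += float64(cb * cb)
	}
	if normA == 0 || normB == 0 {
		return 0 // assumption: inputs shorter than n score 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity("night", "nacht", 2)) // 0.25
}
```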
func CountLines ¶
CountLines returns the number of lines in s. An empty string returns 0; a string without newlines returns 1.
func DamerauLevenshteinDistance ¶
DamerauLevenshteinDistance extends Levenshtein to include transpositions (swapping two adjacent characters) as a single edit operation.
Example:
DamerauLevenshteinDistance("ca", "ac") // 1 (transposition)
LevenshteinDistance("ca", "ac") // 2 (delete + insert)
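The transposition rule can be sketched with the optimal string alignment (restricted Damerau-Levenshtein) recurrence. Whether the package uses this restricted variant or the unrestricted one is not stated, so osaDistance and min3 below are hypothetical names for illustration only.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// osaDistance computes the optimal string alignment distance: Levenshtein
// plus a single-cost swap of two adjacent characters. It is a sketch, not
// the package's implementation.
func osaDistance(s1, s2 string) int {
	a, b := []rune(s1), []rune(s2)
	// d[i][j] is the distance between the first i runes of a
	// and the first j runes of b.
	d := make([][]int, len(a)+1)
	for i := range d {
		d[i] = make([]int, len(b)+1)
		d[i][0] = i
	}
	for j := range d[0] {
		d[0][j] = j
	}
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			d[i][j] = min3(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
			// Adjacent transposition: "xy" against "yx" costs one edit.
			if i > 1 && j > 1 && a[i-1] == b[j-2] && a[i-2] == b[j-1] {
				if t := d[i-2][j-2] + 1; t < d[i][j] {
					d[i][j] = t
				}
			}
		}
	}
	return d[len(a)][len(b)]
}

func main() {
	fmt.Println(osaDistance("ca", "ac")) // 1
}
```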
func Dedent ¶
Dedent removes common leading whitespace from all lines.
Example:
Dedent(" a\n b\n c") // "a\nb\nc"
func DiceCoefficient ¶
DiceCoefficient returns the Sørensen–Dice coefficient comparing bigrams. Returns a value between 0 and 1, where 1 means identical sets of bigrams.
This metric is useful for comparing short strings or when order matters less.
Example:
DiceCoefficient("night", "nacht") // ~0.25
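The 0.25 above follows from 2 * shared / (|A| + |B|): "night" and "nacht" each have four distinct bigrams and share only "ht". A sketch over bigram sets, with hypothetical names and the assumption that inputs too short to form a bigram score 0:

```go
package main

import "fmt"

// bigramSet returns the set of distinct adjacent rune pairs in s.
func bigramSet(s string) map[string]struct{} {
	r := []rune(s)
	set := make(map[string]struct{})
	for i := 0; i+2 <= len(r); i++ {
		set[string(r[i:i+2])] = struct{}{}
	}
	return set
}

// diceCoefficient is a hypothetical stand-in for DiceCoefficient.
func diceCoefficient(s1, s2 string) float64 {
	a, b := bigramSet(s1), bigramSet(s2)
	if len(a)+len(b) == 0 {
		return 0 // assumption: no bigrams on either side scores 0
	}
	shared := 0
	for g := range a {
		if _, ok := b[g]; ok {
			shared++
		}
	}
	return 2 * float64(shared) / float64(len(a)+len(b))
}

func main() {
	fmt.Println(diceCoefficient("night", "nacht")) // 0.25
}
```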
func HammingDistance ¶
HammingDistance returns the number of positions where corresponding characters differ. Only defined for strings of equal length. Returns -1 if strings have different lengths.
Example:
HammingDistance("karolin", "kathrin") // 3
func HasAnyPrefix ¶
HasAnyPrefix reports whether s starts with any of the given prefixes.
Example:
if HasAnyPrefix(url, "http://", "https://") { ... }
func HasAnySuffix ¶
HasAnySuffix reports whether s ends with any of the given suffixes.
func Indent ¶
Indent adds prefix to the beginning of each line in s.
Example:
Indent("a\nb\nc", " ") // " a\n b\n c"
func IsAlphanumeric ¶
IsAlphanumeric reports whether s contains only letters and digits.
func IsLower ¶
IsLower reports whether all letters in s are lowercase. Returns true for strings with no letters.
func IsPalindrome ¶
IsPalindrome reports whether s reads the same forwards and backwards. The comparison is case-sensitive and exact unless normalize is true, in which case letter case, whitespace, and punctuation are ignored.
Example:
IsPalindrome("racecar", false) // true
IsPalindrome("A man a plan a canal Panama", true) // true (normalized)
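A sketch of the two modes, assuming that normalization means keeping only letters and digits and folding them to lowercase. The lowercase isPalindrome is hypothetical; the package may normalize differently.

```go
package main

import (
	"fmt"
	"unicode"
)

// isPalindrome is a hypothetical stand-in for IsPalindrome.
func isPalindrome(s string, normalize bool) bool {
	var r []rune
	for _, c := range s {
		if normalize {
			// Assumption: drop everything except letters and digits,
			// then compare case-insensitively.
			if !unicode.IsLetter(c) && !unicode.IsDigit(c) {
				continue
			}
			c = unicode.ToLower(c)
		}
		r = append(r, c)
	}
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		if r[i] != r[j] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isPalindrome("racecar", false))                     // true
	fmt.Println(isPalindrome("A man a plan a canal Panama", true))  // true
	fmt.Println(isPalindrome("A man a plan a canal Panama", false)) // false
}
```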
func IsPrintable ¶
IsPrintable reports whether s contains only printable characters.
func IsUpper ¶
IsUpper reports whether all letters in s are uppercase. Returns true for strings with no letters.
func JaroSimilarity ¶
JaroSimilarity returns the Jaro similarity between two strings. Returns a value between 0 (completely different) and 1 (identical).
The algorithm considers:
- Number of matching characters
- Number of transpositions
Example:
JaroSimilarity("martha", "marhta") // ~0.944
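For "martha" and "marhta" all six characters match and one pair ("th" / "ht") is transposed, giving (6/6 + 6/6 + 5/6) / 3 ≈ 0.944. The textbook algorithm can be sketched as follows; jaro is a hypothetical name, and treating two empty strings as identical is an assumption.

```go
package main

import "fmt"

// jaro is a sketch of the textbook Jaro similarity, not the package source.
func jaro(s1, s2 string) float64 {
	a, b := []rune(s1), []rune(s2)
	if len(a) == 0 && len(b) == 0 {
		return 1 // assumption: two empty strings count as identical
	}
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	// Characters match only if they are equal and at most window apart.
	longer := len(a)
	if len(b) > longer {
		longer = len(b)
	}
	window := longer/2 - 1
	if window < 0 {
		window = 0
	}
	matchedA := make([]bool, len(a))
	matchedB := make([]bool, len(b))
	matches := 0
	for i := range a {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(b) {
			hi = len(b)
		}
		for j := lo; j < hi; j++ {
			if !matchedB[j] && a[i] == b[j] {
				matchedA[i], matchedB[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Walk the matched characters of both strings in order; each position
	// where they disagree is half a transposition.
	halfTranspositions, j := 0, 0
	for i := range a {
		if !matchedA[i] {
			continue
		}
		for !matchedB[j] {
			j++
		}
		if a[i] != b[j] {
			halfTranspositions++
		}
		j++
	}
	m := float64(matches)
	t := float64(halfTranspositions) / 2
	return (m/float64(len(a)) + m/float64(len(b)) + (m-t)/m) / 3
}

func main() {
	fmt.Printf("%.3f\n", jaro("martha", "marhta")) // 0.944
}
```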
func JaroWinklerSimilarity ¶
JaroWinklerSimilarity returns the Jaro-Winkler similarity between two strings. This is an extension of Jaro that gives more weight to strings with a common prefix.
The prefixScale parameter (0 to 0.25) determines how much weight to give to the common prefix. Standard value is 0.1.
Example:
JaroWinklerSimilarity("martha", "marhta", 0.1) // ~0.961
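The prefix adjustment is jaro + l * prefixScale * (1 - jaro), where l is the length of the common prefix, capped at 4 in Winkler's formulation. That cap is why prefixScale must stay at or below 0.25 for the result to remain within 1. The helper below is hypothetical and takes an already computed Jaro score so that it stays self-contained.

```go
package main

import "fmt"

// winklerBoost applies the Winkler prefix adjustment to a Jaro score.
// It is an illustrative helper, not part of the package's API.
func winklerBoost(jaro float64, s1, s2 string, prefixScale float64) float64 {
	a, b := []rune(s1), []rune(s2)
	prefix := 0
	for prefix < len(a) && prefix < len(b) && prefix < 4 && a[prefix] == b[prefix] {
		prefix++
	}
	return jaro + float64(prefix)*prefixScale*(1-jaro)
}

func main() {
	// Jaro("martha", "marhta") is 17/18; the common prefix "mar" has length 3.
	fmt.Printf("%.3f\n", winklerBoost(17.0/18.0, "martha", "marhta", 0.1)) // 0.961
}
```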
func KebabCase ¶
KebabCase converts s to kebab-case.
Example:
KebabCase("HelloWorld") // "hello-world"
func LevenshteinDistance ¶
LevenshteinDistance returns the minimum number of single-character edits (insertions, deletions, substitutions) required to change s1 into s2.
Time complexity: O(len(s1) * len(s2))
Space complexity: O(min(len(s1), len(s2)))
Example:
LevenshteinDistance("kitten", "sitting") // 3
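The stated space bound is typically reached by keeping only two rows of the dynamic-programming table, sized by the shorter input. The sketch below illustrates that technique under the assumption that edits are counted per rune; levenshtein and min3 are hypothetical names, and the package source may differ.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein is a two-row sketch of the classic edit distance.
func levenshtein(s1, s2 string) int {
	a, b := []rune(s1), []rune(s2)
	if len(a) < len(b) {
		a, b = b, a // size the rows by the shorter input
	}
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(a); i++ {
		curr[0] = i
		for j := 1; j <= len(b); j++ {
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting")) // 3
}
```

Note that converting the inputs to []rune in this sketch itself allocates space proportional to both strings; only the working rows are bounded by the shorter one.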
func LevenshteinSimilarity ¶
LevenshteinSimilarity returns a similarity score between 0 and 1 based on Levenshtein distance. 1 means identical strings.
Example:
LevenshteinSimilarity("hello", "hallo") // ~0.8
func Lines ¶
Lines splits s into lines. Unlike strings.Split, handles \r\n properly.
Example:
lines := Lines("a\nb\nc") // ["a", "b", "c"]
func LongestCommonSubsequence ¶
LongestCommonSubsequence returns the length of the longest common subsequence. A subsequence is a sequence that can be derived by deleting some elements without changing the order of remaining elements.
Example:
LongestCommonSubsequence("ABCDGH", "AEDFHR") // 3 ("ADH")
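The length can be computed with the standard dynamic-programming recurrence: extend the diagonal on a match, otherwise carry over the better neighbor. A rune-based sketch with the hypothetical name lcsLength:

```go
package main

import "fmt"

// lcsLength is a sketch of the classic LCS recurrence using two rows.
func lcsLength(s1, s2 string) int {
	a, b := []rune(s1), []rune(s2)
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			switch {
			case a[i-1] == b[j-1]:
				curr[j] = prev[j-1] + 1 // extend the common subsequence
			case prev[j] >= curr[j-1]:
				curr[j] = prev[j] // skip a[i-1]
			default:
				curr[j] = curr[j-1] // skip b[j-1]
			}
		}
		prev, curr = curr, prev
	}
	return prev[len(b)]
}

func main() {
	fmt.Println(lcsLength("ABCDGH", "AEDFHR")) // 3
}
```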
func LongestCommonSubstring ¶
LongestCommonSubstring returns the longest common contiguous substring.
Example:
LongestCommonSubstring("ABABC", "BABCA") // "BABC"
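Unlike the subsequence case, a mismatch resets the running length to zero, because the match must be contiguous. A sketch with the hypothetical name longestCommonSubstring; returning the first longest match on ties is an assumption.

```go
package main

import "fmt"

// longestCommonSubstring is a dynamic-programming sketch, not the
// package's implementation.
func longestCommonSubstring(s1, s2 string) string {
	a, b := []rune(s1), []rune(s2)
	prev := make([]int, len(b)+1)
	curr := make([]int, len(b)+1)
	best, end := 0, 0 // length of the best run and its end index in a
	for i := 1; i <= len(a); i++ {
		for j := 1; j <= len(b); j++ {
			if a[i-1] == b[j-1] {
				curr[j] = prev[j-1] + 1
				if curr[j] > best {
					best, end = curr[j], i
				}
			} else {
				curr[j] = 0 // contiguous run is broken
			}
		}
		prev, curr = curr, prev
	}
	return string(a[end-best : end])
}

func main() {
	fmt.Println(longestCommonSubstring("ABABC", "BABCA")) // BABC
}
```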
func NthRune ¶
NthRune returns the rune at position n (0-indexed). Returns (0, false) if n is out of bounds.
func PadCenter ¶
PadCenter centers s by adding padChar on both sides. If an odd amount of padding is needed, the extra character goes on the right.
Example:
PadCenter("hello", 11, '*') // "***hello***"
func PadLeft ¶
PadLeft pads s on the left with padChar to reach the target length. If s is already at least length characters long, it is returned unchanged.
Example:
PadLeft("42", 5, '0') // "00042"
func PadRight ¶
PadRight pads s on the right with padChar to reach the target length.
Example:
PadRight("42", 5, '0') // "42000"
func PascalCase ¶
PascalCase converts s to PascalCase.
Example:
PascalCase("hello_world") // "HelloWorld"
func RemoveAll ¶
RemoveAll removes all occurrences of the given substrings from s.
Example:
clean := RemoveAll("hello world", "l", "o") // "he wrd"
func Repeat ¶
Repeat returns s repeated n times. If n <= 0, returns empty string.
Example:
Repeat("ab", 3) // "ababab"
func Reverse ¶
Reverse returns s with its characters in reverse order. Correctly handles multi-byte UTF-8 characters.
Example:
rev := Reverse("hello") // "olleh"
rev = Reverse("日本語") // "語本日"
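Multi-byte safety usually comes from reversing runes rather than bytes. A sketch with the hypothetical name reverse; note that reversing code points does not keep combining marks attached to their base characters.

```go
package main

import "fmt"

// reverse swaps runes in place, so each UTF-8 encoded character stays intact.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	fmt.Println(reverse("hello"))  // olleh
	fmt.Println(reverse("日本語")) // 語本日
}
```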
func RuneCount ¶
RuneCount returns the number of runes (Unicode code points) in s. This differs from len(s), which returns bytes.
Example:
RuneCount("日本語") // 3
len("日本語") // 9 (bytes)
func SafeSlice ¶
SafeSlice safely slices s by rune indices, returning an empty string for invalid ranges. Useful when working with user input where indices might be out of bounds.
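The documentation does not define which ranges count as invalid. The sketch below assumes that a negative start, an end beyond the rune count, or start > end all yield an empty string, and uses the hypothetical name safeSlice.

```go
package main

import "fmt"

// safeSlice is a hypothetical stand-in for SafeSlice. Indices refer to
// runes, not bytes; the bounds policy here is an assumption.
func safeSlice(s string, start, end int) string {
	r := []rune(s)
	if start < 0 || end > len(r) || start > end {
		return ""
	}
	return string(r[start:end])
}

func main() {
	fmt.Println(safeSlice("日本語", 1, 3))      // 本語
	fmt.Printf("%q\n", safeSlice("abc", 2, 10)) // ""
}
```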
func SnakeCase ¶
SnakeCase converts s to snake_case.
Example:
SnakeCase("HelloWorld") // "hello_world"
SnakeCase("helloWorld") // "hello_world"
func SplitAfter ¶
SplitAfter splits s after each instance of sep. Wrapper around strings.SplitAfter for consistency.
func SplitN ¶
SplitN splits s by sep into at most n parts. If n <= 0, returns all parts (same as strings.Split); this differs from strings.SplitN, which returns nil when n == 0. Otherwise behaves as a wrapper around strings.SplitN for consistency.
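One way a wrapper could provide that n <= 0 behavior is to remap n before delegating, because strings.SplitN returns nil for n == 0 and treats only negative n as unlimited. The lowercase splitN below is a hypothetical sketch, not the package source.

```go
package main

import (
	"fmt"
	"strings"
)

// splitN remaps any non-positive n to -1 so that strings.SplitN
// returns every part.
func splitN(s, sep string, n int) []string {
	if n <= 0 {
		n = -1
	}
	return strings.SplitN(s, sep, n)
}

func main() {
	fmt.Printf("%q\n", strings.SplitN("a,b,c", ",", 0)) // []
	fmt.Printf("%q\n", splitN("a,b,c", ",", 0))         // ["a" "b" "c"]
	fmt.Printf("%q\n", splitN("a,b,c", ",", 2))         // ["a" "b,c"]
}
```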
func StripTags ¶
StripTags removes HTML/XML tags from s. This is a simple implementation that may not handle all edge cases.
Example:
StripTags("<p>Hello <b>World</b></p>") // "Hello World"
func SwapCase ¶
SwapCase swaps the case of each letter in s.
Example:
SwapCase("Hello World") // "hELLO wORLD"
func Title ¶
Title returns s with the first character of each word uppercased.
Example:
Title("hello world") // "Hello World"
func Truncate ¶
Truncate shortens s to maxLen characters, appending suffix if truncated. The total length including suffix will not exceed maxLen.
Example:
Truncate("Hello World", 8, "...") // "Hello..."
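In the example, 8 - len("...") leaves room for five characters of text, hence "Hello...". A rune-based sketch with the hypothetical name truncate; cutting s hard when maxLen leaves no room for text before the suffix is an assumption, since the documentation does not cover that case.

```go
package main

import "fmt"

// truncate is a hypothetical stand-in for Truncate. Lengths are counted
// in runes.
func truncate(s string, maxLen int, suffix string) string {
	r := []rune(s)
	if len(r) <= maxLen {
		return s // already short enough, no suffix added
	}
	suf := []rune(suffix)
	if maxLen <= len(suf) {
		// Assumption: with no room for text plus suffix, cut s hard
		// and omit the suffix.
		if maxLen < 0 {
			maxLen = 0
		}
		return string(r[:maxLen])
	}
	return string(r[:maxLen-len(suf)]) + suffix
}

func main() {
	fmt.Println(truncate("Hello World", 8, "...")) // Hello...
	fmt.Println(truncate("Hi", 8, "..."))          // Hi
}
```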
func TruncateWords ¶
TruncateWords truncates s, appending suffix if truncated. Where possible, it breaks at a word boundary rather than mid-word.
Types ¶
This section is empty.