RegExp vs String Manipulation in Go
I wrote a tool to abstract all package managers such as apt, snap, and flatpak, so I can use it as the interface instead of dealing with the specific package manager of the Linux distribution currently in use.
What is wrong?
In i, the tool, I wrote this function to replace the placeholder letter x in a command template (if it has one) with the package the user specified.
func executeCommand(template string, pkgName string) {
	if template == "" {
		fmt.Println("Command not defined for this package manager.")
		return
	}
	// Replace every standalone "x" (word boundaries on both sides) with the package name.
	re := regexp.MustCompile(`\bx\b`)
	cmdStr := re.ReplaceAllStringFunc(template, func(s string) string {
		return pkgName
	})
	if verbose {
		fmt.Printf("Executing: %s\n", cmdStr)
	}
	// Split the command on whitespace: the first field is the program, the rest are its arguments.
	parts := strings.Fields(cmdStr)
	if len(parts) == 0 {
		return
	}
	head := parts[0]
	args := parts[1:]
	cmd := exec.Command(head, args...)
	// Connect the child process to the current terminal.
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	if err != nil {
		if verbose {
			fmt.Printf("Error executing command: %v\n", err)
		}
		os.Exit(1)
	}
}
But using regular expressions for a simple task like this is a horrible choice.
Getting rid of the regular expression
The task is to remove the letter x from the command and put the package name in its place.
The commands always had that letter x at the end, which made this a good case for strings.TrimSuffix():
sudo apt install x
sudo apt remove x
sudo apt install --only-upgrade x
apt search x
apt show x
brew install x
brew uninstall x
brew upgrade x
brew search x
brew info x
sudo port install x
sudo port uninstall x
sudo port upgrade x
port search x
port info x
sudo flatpak install x
sudo flatpak uninstall x
sudo flatpak update x
flatpak search x
flatpak info x
sudo snap install --classic x
sudo snap remove x
sudo snap refresh x
snap find x
snap info x
sudo dnf install -y x
sudo dnf remove -y x
sudo dnf upgrade -y x
dnf search x
dnf info x
sudo pacman -S --noconfirm x
sudo pacman -Rs --noconfirm x
sudo pacman -Syu --noconfirm x
pacman -Ss x
pacman -Qi x
But there is a case where the command ends with x as part of the package manager's name (guix), and I don't want to replace that x.
guix
So I can match the letter preceded by a space, like this: " x". But the nix package manager has a dot before the package name, as in nix-env -iA nixpkgs.x, so I need to handle that case too.
nix-env -iA nixpkgs.x
Taking all of these cases into consideration: if the command ends with ".x" or " x", I remove the trailing x and put the package name in its place. Otherwise, the command is left unchanged.
Here is the function after using string manipulation instead of RegExp:
func executeCommand(template string, pkgName string) {
	if template == "" {
		fmt.Println("Command not defined for this package manager.")
		return
	}
	cmdStr := template
	// if template ends with ".x" or " x" remove "x" and add pkgName
	if strings.HasSuffix(template, ".x") || strings.HasSuffix(template, " x") {
		cmdStr = strings.TrimSuffix(template, "x") + pkgName
	}
	if verbose {
		fmt.Printf("Executing: %s\n", cmdStr)
	}
	parts := strings.Fields(cmdStr)
	if len(parts) == 0 {
		return
	}
	head := parts[0]
	args := parts[1:]
	cmd := exec.Command(head, args...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	err := cmd.Run()
	if err != nil {
		if verbose {
			fmt.Printf("Error executing command: %v\n", err)
		}
		os.Exit(1)
	}
}
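To convince myself that the suffix rule behaves like the regular expression on these templates, a small standalone check can help. This is only a minimal sketch: the helper names fillWithRegexp and fillWithSuffix and the package name vim are made up for this example and are not part of i. It pulls the two substitution strategies out of executeCommand and compares them on three of the templates mentioned above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholder matches a standalone "x", the same pattern as the original version.
var placeholder = regexp.MustCompile(`\bx\b`)

// fillWithRegexp is a hypothetical helper that isolates the regexp substitution.
func fillWithRegexp(template, pkgName string) string {
	return placeholder.ReplaceAllStringFunc(template, func(string) string {
		return pkgName
	})
}

// fillWithSuffix is a hypothetical helper that isolates the suffix rule:
// only a trailing " x" or ".x" is treated as the placeholder.
func fillWithSuffix(template, pkgName string) string {
	if strings.HasSuffix(template, ".x") || strings.HasSuffix(template, " x") {
		return strings.TrimSuffix(template, "x") + pkgName
	}
	return template
}

func main() {
	templates := []string{
		"sudo apt install x",
		"nix-env -iA nixpkgs.x",
		"guix", // ends with x, but it is part of the name and must stay as-is
	}
	for _, t := range templates {
		viaRegexp := fillWithRegexp(t, "vim")
		viaSuffix := fillWithSuffix(t, "vim")
		fmt.Printf("%q -> %q (same as regexp: %t)\n", t, viaSuffix, viaRegexp == viaSuffix)
	}
}
```

One difference worth keeping in mind: the regular expression replaces every standalone x in a template, while the suffix rule only touches the last one. For the templates listed above this does not matter, because the placeholder only ever appears at the end.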
I was wondering how much faster this would be!
Benchmarking RegExp vs string replacement
I wrote a benchmark to test the performance and it turns out it is super faaaaaaaaaaaaast!
$ go test -bench=. -run=NONE -benchmem
goos: linux
goarch: amd64
pkg: github.com/abanoubha/i
cpu: Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz
BenchmarkRegexpReplacement-8 541910 2207 ns/op 1609 B/op 21 allocs/op
BenchmarkRegexpPrecompiledReplacement-8 1986163 609.2 ns/op 88 B/op 4 allocs/op
BenchmarkStringReplacement-8 23621672 50.48 ns/op 24 B/op 1 allocs/op
PASS
ok github.com/abanoubha/i 3.608s
That means:
- String Manipulation: ~50 ns/op
- Regexp (Pre-compiled): ~609 ns/op (~12x slower)
- Regexp (Compiled on-the-fly): ~2207 ns/op (~44x slower)
If you are curious to see the performance benchmark code, I left it on GitHub inside the i project codebase.
Takeaways
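For readers who just want the general shape of such a comparison without opening the repository, here is a rough sketch. It is not the benchmark file from i: the function names, the chosen template, and the package name vim are assumptions made for this example. It uses testing.Benchmark from the standard library so it can run as a plain program instead of through go test.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

// Inputs for the sketch: one template from the list above and a made-up package name.
var (
	template = "sudo apt install x"
	pkgName  = "vim"
)

// precompiled is built once, outside the measured loop.
var precompiled = regexp.MustCompile(`\bx\b`)

// sink stores every result so the compiler cannot discard the calls.
var sink string

// viaRegexp compiles the pattern on every call.
func viaRegexp() string {
	re := regexp.MustCompile(`\bx\b`)
	return re.ReplaceAllStringFunc(template, func(string) string { return pkgName })
}

// viaPrecompiledRegexp reuses the package-level compiled pattern.
func viaPrecompiledRegexp() string {
	return precompiled.ReplaceAllStringFunc(template, func(string) string { return pkgName })
}

// viaStrings applies the suffix rule with plain string functions.
func viaStrings() string {
	if strings.HasSuffix(template, ".x") || strings.HasSuffix(template, " x") {
		return strings.TrimSuffix(template, "x") + pkgName
	}
	return template
}

func main() {
	cases := []struct {
		name string
		fn   func() string
	}{
		{"regexp", viaRegexp},
		{"regexp (precompiled)", viaPrecompiledRegexp},
		{"strings", viaStrings},
	}
	for _, c := range cases {
		fn := c.fn
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs()
			for i := 0; i < b.N; i++ {
				sink = fn()
			}
		})
		// Prints iterations, ns/op, B/op and allocs/op for each variant.
		fmt.Printf("%-22s %s %s\n", c.name, res.String(), res.MemString())
	}
}
```

The absolute numbers depend on the machine and the Go version, so they will not match the output above exactly. The ordering of the three variants is the part to look at.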
Use string functions whenever and wherever you can, and keep your code free of regular expressions as much as you can. In Go, the strings functions are simpler, easier to understand and reason about, and they perform better.
I hope you enjoyed reading this post as much as I enjoyed writing it. If you know a person who can benefit from this information, send them a link to this post. If you want to get notified about new posts, follow me on YouTube, Twitter (x), LinkedIn, and GitHub.