This repository contains the Go stemmers generated by the Snowball project. They are maintained outside of the core bleve package so that they can be more easily reused in other contexts.
All these stemmers export a single Stem() method which operates on a snowball Env structure. The Env structure maintains all state for the stemmer. A new Env is created to point at an initial string. After stemming, the results of the Stem() operation can be retrieved using the Current() method. The Env structure can be reused for subsequent calls by using the SetCurrent() method.
package main

import (
	"fmt"

	"github.com/blevesearch/snowballstem"
	"github.com/blevesearch/snowballstem/english"
)

func main() {
	// words to stem
	words := []string{
		"running",
		"jumping",
	}

	// build new environment
	env := snowballstem.NewEnv("")

	for _, word := range words {
		// set up environment for word
		env.SetCurrent(word)
		// invoke stemmer
		english.Stem(env)
		// print results
		fmt.Printf("%s stemmed to %s\n", word, env.Current())
	}
}
This produces the following output:
$ ./snowtest
running stemmed to run
jumping stemmed to jump
The test harness for these stemmers is hosted in the main Snowball repository. There are functional tests built around the separate snowballstem-data repository, and there is support for fuzz-testing the stemmers there as well.
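To regenerate the stemmers, point SNOWBALL at a built checkout of the Snowball repository and run go generate:
$ export SNOWBALL=/path/to/github.com/snowballstem/snowball/after/snowball/built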
$ go generate
A simple tool, gengen.go, is provided to automate updating the go generate commands from the Snowball algorithms directory:
$ go run gengen.go /path/to/github.com/snowballstem/snowball/algorithms