Add basic blevex documentation #7

3 changes: 3 additions & 0 deletions content/docs/Analyzers.md
@@ -43,9 +43,12 @@ The Detect Language Analyzer is used to examine input text, use heuristics to de

## Language Specific Analyzers

A number of language-specific analyzers are available. Some of them are experimental and live in the [blevex repository](https://github.com/blevesearch/blevex/). To learn more about the experimental code in blevex and how to use it, see the [Blevex documentation]({{< relref "docs/Blevex.md" >}}).

* Danish
* Dutch
* English
* Japanese
* Finnish
* French
* Hungarian
72 changes: 72 additions & 0 deletions content/docs/Blevex.md
@@ -0,0 +1,72 @@
+++
author = ["Silvan Jegen"]
date = "2016-07-11T19:25:38-04:00"
title = "Blevex"
[menu.docs]
weight = 1
parent = 'analysis'
+++

## The Blevex experimental repository

The blevex repository contains experimental code of the following kinds:

* language specific analyzer components
* kvstore bindings
* language detection analyzer
* stemmers


### Language-specific Analyzers

The following language-specific analyzers are available in blevex.

* Danish
* German
* English
* Spanish
* Finnish
* French
* Hungarian
* Italian
* Japanese
* Dutch
* Norwegian
* Portuguese
* Romanian
* Russian
* Thai
* Turkish

To use one of them, import the language analyzer component(s) you are interested in and reference them in your CustomAnalyzer. For example, to use the German normalize filter, do something like this at indexing time.

Go
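```
// The three core bleve import paths below are inferred from the identifiers
// used in this snippet; adjust them to the bleve version you build against.
import (
	"github.com/blevesearch/bleve/analysis/analyzers/custom_analyzer"
	"github.com/blevesearch/bleve/analysis/token_filters/lower_case_filter"
	"github.com/blevesearch/bleve/analysis/tokenizers/unicode"
	"github.com/blevesearch/blevex/lang/de"
)

// indexmapping is an existing index mapping, e.g. from bleve.NewIndexMapping().
err = indexmapping.AddCustomAnalyzer("myde",
	map[string]interface{}{
		"type":      custom_analyzer.Name,
		"tokenizer": unicode.Name,
		"token_filters": []string{
			lower_case_filter.Name, // token filters are applied in the order listed
			de.NormalizeName,
		},
	})

// Error handling has been left out for brevity.
```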

Bleve returns an error if you open an index without having loaded all the analyzer components it needs. So to open an index that uses one of the blevex analyzer components, you have to import the corresponding package, purely for the side effect of registering its analyzer components. To open an index built with the CustomAnalyzer above, import the blevex 'de' package like this.


Go
```
import _ "github.com/blevesearch/blevex/lang/de"

```

Afterwards you should be able to just open the index using bleve.Open().
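
The side-effect import described above relies on Go running a package's `init` functions when the package is imported. The following standard-library-only sketch illustrates that general pattern under simplifying assumptions. It is not bleve's registry code: `registry`, `register` and `lookup` are hypothetical names, and the component name `"normalize_de"` is used purely for illustration.

```go
package main

import "fmt"

// registry maps component names to constructors (hypothetical, simplified).
var registry = map[string]func() string{}

// register is what a component package would call from its init function.
func register(name string, ctor func() string) {
	if _, dup := registry[name]; dup {
		panic("component registered twice: " + name)
	}
	registry[name] = ctor
}

// lookup fails when nothing registered the name, which is the situation you
// are in when the package providing the component was never imported.
func lookup(name string) (string, error) {
	ctor, ok := registry[name]
	if !ok {
		return "", fmt.Errorf("no component named %q registered", name)
	}
	return ctor(), nil
}

// In a real program this init would live in the imported package and run
// because of the blank import; it is inlined here to keep the sketch in one file.
func init() {
	register("normalize_de", func() string { return "german normalize filter" })
}

func main() {
	if _, err := lookup("normalize_de"); err == nil {
		fmt.Println("normalize_de is available")
	}
	if _, err := lookup("normalize_xx"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Without the `init` call, the first lookup would fail in the same way as the second one.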