Server version of the HeLI language identifier. Language identifier based on HeLI, a Word-Based Backoff Method for Language Identification.
If you use the identifier in scientific work, please cite the following articles:
For the method:
For a use case:
The HeLI identifier uses the Google Guava library. You have to download it from: "" and add it to your classpath. The identifier has been tested only in a Linux/Unix environment.
Here are detailed instructions that you can try to follow and adapt to your own computing environment.
Download the zip file from GitHub:
Unzip it:
Go to the folder containing the Java file:
cd TunnistinPalveluFast-master/
Unzip the example language models for Finnish and Swedish:
Download the Guava library from
Compile the Java file with the Guava jar on the classpath:
javac -cp './guava-23.0.jar' TunnistinPalveluFast.java
Run the Java program with the Guava jar on the classpath:
java -cp '.:./guava-23.0.jar' TunnistinPalveluFast
If everything went well, the server prints "Ready to accept queries." The port is hard-coded to 8082. If you do not have access to that port, or you want to change it for some other reason, you have to edit the Java file and recompile it.
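If the server does not start, one possible cause is that port 8082 is already taken or not available to your user. The following is a minimal sketch for checking this before starting the identifier. The class name PortCheck and the method canBind are hypothetical helpers for illustration and are not part of the HeLI code.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCheck {

    // Returns true if this process can bind the given TCP port right now.
    // Hypothetical helper, not part of TunnistinPalveluFast.
    static boolean canBind(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically the port is in use or binding is not permitted.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to 8082, the port mentioned in these instructions.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 8082;
        System.out.println("Port " + port + (canBind(port) ? " is free" : " is unavailable"));
    }
}
```

Run it while the identifier is stopped; if the port is reported as unavailable, pick another port, edit the Java file accordingly and recompile.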
Once the server is running, you can test the service with, for example, telnet (from the same server):
telnet localhost 8082
At this point telnet is waiting for you to enter a line of text followed by a newline, so we type:
Tämä on suomea
And the server responds with the language code of the identified language:
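The same exchange as in the telnet session can also be done from a program. The following is a minimal sketch of a client, assuming the server runs on localhost port 8082, that text is sent as UTF-8, and that the reply is a single line; the encoding and the reply format are assumptions and should be checked against the server code. The class name HeliClient and the method identify are hypothetical and not part of the HeLI code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HeliClient {

    // Sends one newline-terminated line of text and returns the first line of the reply.
    // Assumption: UTF-8 on the wire and a line-based reply (not confirmed by these instructions).
    static String identify(String host, int port, String text) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            // Give up instead of blocking forever if the server never answers.
            socket.setSoTimeout(5000);
            Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
            out.write(text + "\n");
            out.flush();
            // readLine returns null if the server closes the connection without replying.
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = args.length > 0 ? String.join(" ", args) : "Tämä on suomea";
        System.out.println(identify("localhost", 8082, text));
    }
}
```

Compile it with javac HeliClient.java and run it with java HeliClient while the identifier is running. The client does not need Guava on its classpath.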
Unfortunately, the documentation is minimal. If the previous steps do not work, please contact the author for more information on how to use the software.