Package opennlp.tools.tokenize
Class ThreadSafeTokenizerME
java.lang.Object
opennlp.tools.tokenize.ThreadSafeTokenizerME
- All Implemented Interfaces:
AutoCloseable, Tokenizer
A thread-safe version of TokenizerME. Using it is completely transparent.
You can also use it in a single-threaded context, where it incurs only minimal overhead.
-
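This page does not say how the thread safety is achieved. A common way to make a stateful, non-thread-safe component transparently usable from several threads is to give each thread its own instance through a ThreadLocal. The following is a minimal sketch of that pattern using only the JDK. StatefulTokenizer, PerThreadTokenizer and PerThreadTokenizerDemo are hypothetical names invented for illustration, and the whitespace splitting stands in for the real model-based tokenization; none of this is taken from the OpenNLP sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a tokenizer that keeps mutable state between calls
// and is therefore not safe to share across threads.
class StatefulTokenizer {
    private final List<String> buffer = new ArrayList<>();

    String[] tokenize(String s) {
        buffer.clear();
        for (String part : s.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                buffer.add(part);
            }
        }
        return buffer.toArray(new String[0]);
    }
}

// Hypothetical wrapper: every calling thread lazily receives its own StatefulTokenizer.
class PerThreadTokenizer implements AutoCloseable {
    private final ThreadLocal<StatefulTokenizer> local =
            ThreadLocal.withInitial(StatefulTokenizer::new);

    String[] tokenize(String s) {
        return local.get().tokenize(s);
    }

    @Override
    public void close() {
        // Drops only the instance that belongs to the calling thread.
        local.remove();
    }
}

class PerThreadTokenizerDemo {
    // Tokenizes each input on a worker pool and returns the token counts in input order.
    static int[] countTokensConcurrently(List<String> inputs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (PerThreadTokenizer tokenizer = new PerThreadTokenizer()) {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String input : inputs) {
                futures.add(pool.submit(() -> tokenizer.tokenize(input).length));
            }
            int[] counts = new int[inputs.size()];
            for (int i = 0; i < counts.length; i++) {
                counts[i] = futures.get(i).get();
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] counts = countTokensConcurrently(List.of("a b c", "hello world", "one"), 2);
        for (int c : counts) {
            System.out.println(c);
        }
    }
}
```

In this sketch the caller shares one wrapper object across all threads and never synchronizes explicitly, which is the sense in which such a wrapper is "transparent". The cost in a single-threaded program is one ThreadLocal lookup per call.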
Constructor Summary
Constructors
Constructor - Description
ThreadSafeTokenizerME(String language) - Initializes a ThreadSafeTokenizerME by downloading a default model for a given language.
ThreadSafeTokenizerME(TokenizerModel model) - Initializes a ThreadSafeTokenizerME with the specified model.
ThreadSafeTokenizerME(TokenizerModel model, Dictionary abbDict) - Instantiates a ThreadSafeTokenizerME with an existing TokenizerModel.
Method Summary
-
Constructor Details
-
ThreadSafeTokenizerME
Initializes a ThreadSafeTokenizerME by downloading a default model for a given language.
- Parameters:
language - An ISO-conformant language code.
- Throws:
IOException - Thrown if the model could not be downloaded or saved.
-
ThreadSafeTokenizerME
Initializes a ThreadSafeTokenizerME with the specified model.
- Parameters:
model - A valid TokenizerModel.
-
ThreadSafeTokenizerME
Instantiates a ThreadSafeTokenizerME with an existing TokenizerModel.
- Parameters:
model - The TokenizerModel to be used.
abbDict - The Dictionary to be used. It must fit the language of the model.
-
-
Method Details
-
tokenize
Description copied from interface: Tokenizer
Splits a string into its atomic parts.
-
tokenizePos
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.
- Specified by:
tokenizePos in interface Tokenizer
- Parameters:
s - The string to be tokenized.
- Returns:
The spans (offsets into s) for each token as the individual array elements.
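To make the relationship between offset spans and token strings concrete, here is a small sketch using only the JDK. It is not OpenNLP code: SpanSketch and Offsets are hypothetical names, the boundaries are found by plain whitespace splitting instead of a trained model, and the start-inclusive, end-exclusive convention is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class SpanSketch {
    // Hypothetical stand-in for a span: offsets into the input string.
    static final class Offsets {
        final int start; // inclusive
        final int end;   // exclusive (convention assumed for this sketch)

        Offsets(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Finds the boundaries of whitespace-delimited tokens in s.
    static List<Offsets> tokenizePos(String s) {
        List<Offsets> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < s.length(); i++) {
            boolean whitespace = Character.isWhitespace(s.charAt(i));
            if (!whitespace && start < 0) {
                start = i;
            } else if (whitespace && start >= 0) {
                spans.add(new Offsets(start, i));
                start = -1;
            }
        }
        if (start >= 0) {
            spans.add(new Offsets(start, s.length()));
        }
        return spans;
    }

    // Recovers the token strings by slicing s with each pair of offsets.
    static String[] tokens(String s, List<Offsets> spans) {
        String[] out = new String[spans.size()];
        for (int i = 0; i < out.length; i++) {
            Offsets o = spans.get(i);
            out[i] = s.substring(o.start, o.end);
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Hello  world !";
        List<Offsets> spans = tokenizePos(s);
        String[] toks = tokens(s, spans);
        for (int i = 0; i < toks.length; i++) {
            System.out.println(spans.get(i).start + ".." + spans.get(i).end + " " + toks[i]);
        }
    }
}
```

Returning offsets instead of strings lets a caller map each token back to its exact position in the original text, for example to highlight it or to align it with other annotations.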
-
getProbabilities
public double[] getProbabilities()
-
close
public void close()
- Specified by:
close in interface AutoCloseable
-