Added support for InputStream constructor in MaxentTagger #191

Closed
wants to merge 1 commit into from
52 changes: 50 additions & 2 deletions src/edu/stanford/nlp/tagger/maxent/MaxentTagger.java
@@ -266,6 +266,20 @@ public MaxentTagger(String modelFile) {
this(modelFile, StringUtils.argsToProperties("-model", modelFile), true);
}

/**
* Constructor for a tagger, loading a model from an InputStream.
* The tagger data is loaded when the constructor is called (this can be
* slow). No tagger options are supplied by the caller: an empty
* Properties object is passed as the configuration.
*
* @param modelStream The InputStream from which to read the model
* @throws RuntimeIOException if I/O errors or serialization errors
*/
public MaxentTagger(InputStream modelStream) {
this(modelStream, new Properties(), true);
}

/**
* Constructor for a tagger using a model stored in a particular file,
* with options taken from the supplied TaggerConfig.
@@ -301,6 +315,17 @@ public MaxentTagger(String modelFile, Properties config, boolean printLoading) {
readModelAndInit(config, modelFile, printLoading);
}

/**
* Initializer that loads the tagger.
*
* @param modelStream An InputStream for reading the model file
* @param config TaggerConfig based on command-line arguments
* @param printLoading Whether to print a message saying what model file is being loaded and how long it took when finished.
* @throws RuntimeIOException if I/O errors or serialization errors
*/
public MaxentTagger(InputStream modelStream, Properties config, boolean printLoading) {
readModelAndInit(config, modelStream, printLoading);
}

final Dictionary dict = new Dictionary();
TTags tags;
@@ -763,9 +788,33 @@ protected void saveModel(DataOutputStream file) throws IOException {
* @throws RuntimeIOException if I/O errors or serialization errors
*/
protected void readModelAndInit(Properties config, String modelFileOrUrl, boolean printLoading) {
try {
readModelAndInit(config, IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(modelFileOrUrl), printLoading);
} catch (IOException e) {
throw new RuntimeIOException("Error while loading a tagger model (probably missing model file)", e);
}

}

/** This reads the complete tagger from a single model provided as an InputStream,
* and initializes the tagger using a
* combination of the properties passed in and parameters from the file.
* <p>
* <i>Note for the future:</i> This assumes that the TaggerConfig in the file
* has already been read and used. This work is done inside the
* constructor of TaggerConfig. It might be better to refactor
* things so that is all done inside this method, but for the moment
* it seemed better to leave working code alone [cdm 2008].
*
* @param config The tagger config
* @param modelStream The model provided as an InputStream
* @param printLoading Whether to print a message saying what model file is being loaded and how long it took when finished.
* @throws RuntimeIOException if I/O errors or serialization errors
*/
protected void readModelAndInit(Properties config, InputStream modelStream, boolean printLoading) {
try {
// first check can open file ... or else leave with exception
- DataInputStream rf = new DataInputStream(IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(modelFileOrUrl));
+ DataInputStream rf = new DataInputStream(modelStream);

readModelAndInit(config, rf, printLoading);
rf.close();
@@ -775,7 +824,6 @@ protected void readModelAndInit(Properties config, String modelFileOrUrl, boolea
}



/** This reads the complete tagger from a single model file, and inits
* the tagger using a combination of the properties passed in and
* parameters from the file.
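The change above follows a common pattern: the path-based entry point only resolves a location to an InputStream, then delegates to a stream-based entry point that does the actual reading and wraps IOException in an unchecked exception. The following is a minimal, self-contained sketch of that pattern, not the CoreNLP code. The class StreamLoadedModel, its two fields, the helper encode, and the toy binary layout (a UTF string followed by an int) are all hypothetical and exist only for illustration. UncheckedIOException from the JDK stands in for RuntimeIOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical stand-in for a model class; not part of CoreNLP. */
class StreamLoadedModel {
  final String name;
  final int featureCount;

  /** Path-based entry point: opens the file, then delegates to the stream constructor. */
  StreamLoadedModel(String path) {
    this(open(path));
  }

  /** Stream-based entry point: the caller decides where the bytes come from. */
  StreamLoadedModel(InputStream modelStream) {
    String n;
    int c;
    // try-with-resources closes the stream even if reading fails part-way.
    try (DataInputStream in = new DataInputStream(modelStream)) {
      n = in.readUTF();
      c = in.readInt();
    } catch (IOException e) {
      throw new UncheckedIOException("Error while loading model", e);
    }
    this.name = n;
    this.featureCount = c;
  }

  /** Resolves a file-system path to a stream, converting the checked exception. */
  private static InputStream open(String path) {
    try {
      return Files.newInputStream(Paths.get(path));
    } catch (IOException e) {
      throw new UncheckedIOException("Error while opening " + path, e);
    }
  }

  /** Hypothetical helper: serializes a toy model into the layout read above. */
  static byte[] encode(String name, int featureCount) {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buf)) {
      out.writeUTF(name);
      out.writeInt(featureCount);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.toByteArray();
  }

  public static void main(String[] args) {
    // Load from memory; no file, URL, or classpath lookup is involved.
    InputStream in = new ByteArrayInputStream(encode("demo", 3));
    StreamLoadedModel model = new StreamLoadedModel(in);
    System.out.println(model.name + " " + model.featureCount);
  }
}
```

As in the diff, where readModelAndInit calls rf.close(), the stream-based constructor in this sketch takes ownership of the stream and closes it, so a caller should not expect to reuse the stream afterwards. With the constructor proposed in this pull request, the equivalent call would be new MaxentTagger(modelStream), where the caller obtains modelStream from whatever source holds the model.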