Interface | Description |
---|---|
Annotator | This is an interface for adding annotations to a partially annotated Annotation. |

Class | Description |
---|---|
Annotation | An annotation representing a span of text in a document. |
ChunkAnnotationUtils | Utility functions for annotating chunks. |
CoreMapAggregator | Function that aggregates several core maps into one. |
CoreMapAttributeAggregator | Functions for aggregating token attributes. |
CoreMapAttributeAggregator.ConcatAggregator | |
CoreMapAttributeAggregator.ConcatCoreMapListAggregator<T extends CoreMap> | |
CoreMapAttributeAggregator.ConcatListAggregator<T> | |
CoreMapAttributeAggregator.ConcatTextAggregator | |
CoreMapAttributeAggregator.MostFreqAggregator | |
LanguageInfo | This class contains mappings from strings to language properties files. |

Enum | Description |
---|---|
LanguageInfo.HumanLanguage | Supported languages. |

TextAnnotation.class). They should also specify what they add to the annotation, and where.

    public void testPipeline(String text) throws Exception {
      // create pipeline
      AnnotationPipeline pipeline = new AnnotationPipeline();
      pipeline.addAnnotator(new TokenizerAnnotator(false, "en"));
      pipeline.addAnnotator(new WordsToSentencesAnnotator(false));
      pipeline.addAnnotator(new POSTaggerAnnotator(false));
      pipeline.addAnnotator(new MorphaAnnotator(false));
      pipeline.addAnnotator(new NERCombinerAnnotator(false));
      pipeline.addAnnotator(new ParserAnnotator(false, -1));

      // create annotation with text
      Annotation document = new Annotation(text);

      // annotate text with pipeline
      pipeline.annotate(document);

      // demonstrate typical usage
      for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
        // get the tree for the sentence
        Tree tree = sentence.get(TreeAnnotation.class);

        // get the tokens for the sentence and iterate over them
        for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
          // get token attributes
          String tokenText = token.get(TextAnnotation.class);
          String tokenPOS = token.get(PartOfSpeechAnnotation.class);
          String tokenLemma = token.get(LemmaAnnotation.class);
          String tokenNE = token.get(NamedEntityTagAnnotation.class);
        }
      }
    }

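
The idea that each annotator states what it needs and what it adds can be pictured with a small self-contained model. This sketch does not use the CoreNLP classes: `SketchAnnotator`, `SketchPipeline`, `RequirementSketch`, and the string keys are hypothetical names chosen for illustration, and the real `Annotator` interface may express its requirements differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an annotator: it declares the keys it needs
// and the keys it adds. This is NOT the CoreNLP Annotator interface.
final class SketchAnnotator {
    final String name;
    final Set<String> requires;
    final Set<String> adds;

    SketchAnnotator(String name, Set<String> requires, Set<String> adds) {
        this.name = name;
        this.requires = requires;
        this.adds = adds;
    }
}

// Hypothetical pipeline that checks each annotator's requirements
// against what the input and earlier annotators already provide.
final class SketchPipeline {
    private final List<SketchAnnotator> annotators = new ArrayList<>();

    void addAnnotator(SketchAnnotator annotator) {
        annotators.add(annotator);
    }

    // Returns the keys available after every annotator has run in order,
    // or throws if some annotator's requirement is not yet satisfied.
    Set<String> check(Set<String> initial) {
        Set<String> available = new LinkedHashSet<>(initial);
        for (SketchAnnotator a : annotators) {
            if (!available.containsAll(a.requires)) {
                throw new IllegalStateException(
                    a.name + " is missing a required annotation");
            }
            available.addAll(a.adds);
        }
        return available;
    }
}

class RequirementSketch {
    public static void main(String[] args) {
        SketchPipeline pipeline = new SketchPipeline();
        pipeline.addAnnotator(
            new SketchAnnotator("tokenize", Set.of("text"), Set.of("tokens")));
        pipeline.addAnnotator(
            new SketchAnnotator("ssplit", Set.of("tokens"), Set.of("sentences")));
        pipeline.addAnnotator(
            new SketchAnnotator("pos", Set.of("tokens", "sentences"), Set.of("pos")));
        System.out.println(pipeline.check(Set.of("text")));
    }
}
```

In this model, adding the annotators in the wrong order (for example, `pos` before `tokenize`) fails fast with an exception rather than producing a partially annotated result.
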
    ./bin/stanfordcorenlp.sh

or

    java -cp stanford-corenlp-YYYY-MM-DD.jar:stanford-corenlp-YYYY-MM-DD-models.jar:xom.jar:joda-time.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP [ -props YOUR_CONFIGURATION_FILE ] -file YOUR_INPUT_FILE

where the following properties are defined (if -props or annotators is not defined, default properties will be loaded via the classpath):

"annotators" - comma-separated list of annotators. The following annotators are supported: tokenize, ssplit, pos, lemma, ner, truecase, parse, dcoref, nfl

More information is available here: Stanford CoreNLP
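
As a hedged illustration of the -props option, the sketch below writes a configuration file containing only the "annotators" property and assembles the matching command line as an argument list. The file name `corenlp-sketch.properties`, the input name `input.txt`, and the class `PropsSketch` are placeholders invented for this example; the jar names keep the YYYY-MM-DD placeholders from the command in this section, so the printed command is a template and not something to run verbatim.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class PropsSketch {
    // Writes a properties file whose only entry is the "annotators" list.
    static Path writeProps(Path file, String annotators) throws IOException {
        Properties props = new Properties();
        props.setProperty("annotators", annotators);
        try (Writer out = Files.newBufferedWriter(file)) {
            props.store(out, "sketch configuration");
        }
        return file;
    }

    // Builds the argument list for the java invocation shown in this section.
    // The jar names are the document's placeholders, not real file names.
    static List<String> command(Path propsFile, String inputFile) {
        String classpath = String.join(":",
            "stanford-corenlp-YYYY-MM-DD.jar",
            "stanford-corenlp-YYYY-MM-DD-models.jar",
            "xom.jar",
            "joda-time.jar");
        return Arrays.asList(
            "java", "-cp", classpath, "-Xmx3g",
            "edu.stanford.nlp.pipeline.StanfordCoreNLP",
            "-props", propsFile.toString(),
            "-file", inputFile);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file names; a subset of the supported annotators.
        Path props = writeProps(
            Paths.get("corenlp-sketch.properties"),
            "tokenize, ssplit, pos, lemma");
        System.out.println(String.join(" ", command(props, "input.txt")));
    }
}
```

Keeping the annotator list in a file rather than on the command line makes it easier to reuse the same configuration across several input files.
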