Introduction to Stanfordcorenlp

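  • Stanford CoreNLP provides a set of human language technology tools and supports many basic natural language processing functions. Stanfordcorenlp is a Python wrapper for it.
  • Official site: Stanford CoreNLP - Natural language software
  • GitHub: stanfordnlp/CoreNLP
  • Stanfordcorenlp's main features include tokenization (word segmentation), part-of-speech tagging, named entity recognition, constituency parsing, and dependency parsing.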

Stanfordcorenlp Demo

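Installation: pip install stanfordcorenlp

Download the models first. Download address: nlp.stanford.edu/softwa

Multiple languages are supported; the notes below cover Chinese and English usage.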

from stanfordcorenlp import StanfordCoreNLP
# First argument: path to the unpacked CoreNLP directory; lang selects the language
zh_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27', lang='zh')
en_model = StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27', lang='en')
zh_sentence = '我愛自然語言處理技術!'
en_sentence = 'I love natural language processing technology!'
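
When it is given a local CoreNLP directory, each StanfordCoreNLP instance starts a Java server in the background, and that server keeps its memory until the wrapper's close() method is called. A minimal sketch of one way to guarantee the shutdown, using contextlib.closing; the helper name annotate_once and its make_model argument are illustrative assumptions, not part of the library:

```python
from contextlib import closing


def annotate_once(make_model, sentence):
    """Create a model, tokenize one sentence, and always call close() afterwards.

    make_model is any zero-argument callable returning an object that has
    word_tokenize() and close() methods (hypothetical helper for illustration).
    """
    with closing(make_model()) as model:
        return model.word_tokenize(sentence)


# Usage with the real wrapper (needs Java and the downloaded CoreNLP directory):
# from stanfordcorenlp import StanfordCoreNLP
# tokens = annotate_once(
#     lambda: StanfordCoreNLP(r'stanford-corenlp-full-2018-02-27', lang='en'),
#     'I love natural language processing technology!')
```

For a longer session, as in the demo here, it is usually simpler to keep the two models open and call zh_model.close() and en_model.close() once at the end.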


1. Tokenization (Tokenize)

print('Tokenize:', zh_model.word_tokenize(zh_sentence))
print('Tokenize:', en_model.word_tokenize(en_sentence))
Output:
Tokenize: ['我愛', '自然', '語言', '處理', '技術', '!']
Tokenize: ['I', 'love', 'natural', 'language', 'processing', 'technology', '!']


2. Part-of-Speech Tagging (Part of Speech)

print('Part of Speech:', zh_model.pos_tag(zh_sentence))
print('Part of Speech:', en_model.pos_tag(en_sentence))
Output:
Part of Speech: [('我愛', 'NN'), ('自然', 'AD'), ('語言', 'NN'), ('處理', 'VV'), ('技術', 'NN'), ('!', 'PU')]
Part of Speech: [('I', 'PRP'), ('love', 'VBP'), ('natural', 'JJ'), ('language', 'NN'), ('processing', 'NN'), ('technology', 'NN'), ('!', '.')]
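
The tagger returns a list of (word, tag) pairs, so ordinary list operations are enough to work with it. A small sketch that picks out the nouns and counts the tags in the English result above; the helper name words_with_tag_prefix is an assumption made for illustration:

```python
from collections import Counter

# English result copied from the demo output above
en_tagged = [('I', 'PRP'), ('love', 'VBP'), ('natural', 'JJ'), ('language', 'NN'),
             ('processing', 'NN'), ('technology', 'NN'), ('!', '.')]


def words_with_tag_prefix(tagged, prefix):
    """Return the words whose POS tag starts with prefix (e.g. 'NN' for nouns)."""
    return [word for word, tag in tagged if tag.startswith(prefix)]


nouns = words_with_tag_prefix(en_tagged, 'NN')
tag_counts = Counter(tag for _, tag in en_tagged)

print(nouns)             # ['language', 'processing', 'technology']
print(tag_counts['NN'])  # 3
```

Matching on a prefix rather than an exact tag also catches related Penn Treebank tags such as NNS and NNP.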


3. Named Entity Recognition (Named Entity)

print('Named Entities:', zh_model.ner(zh_sentence))
print('Named Entities:', en_model.ner(en_sentence))
Output:
Named Entities: [('我愛', 'O'), ('自然', 'O'), ('語言', 'O'), ('處理', 'O'), ('技術', 'O'), ('!', 'O')]
Named Entities: [('I', 'O'), ('love', 'O'), ('natural', 'O'), ('language', 'O'), ('processing', 'O'), ('technology', 'O'), ('!', 'O')]
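
Every token in the two sample sentences is tagged O, meaning "not part of an entity". When a sentence does contain entities, consecutive tokens with the same tag usually belong together. The sketch below merges such runs with itertools.groupby; the function name group_entities is an assumption, and sample_tagged is made-up input for illustration, not real model output:

```python
from itertools import groupby


def group_entities(tagged_tokens, outside_tag='O'):
    """Merge consecutive tokens that share a non-O tag into (text, tag) pairs."""
    entities = []
    for tag, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag == outside_tag:
            continue
        # Joining with a space suits English; use ''.join(...) for Chinese tokens
        entities.append((' '.join(word for word, _ in run), tag))
    return entities


# Hypothetical tagger output, written by hand to show the grouping
sample_tagged = [('Barack', 'PERSON'), ('Obama', 'PERSON'), ('visited', 'O'),
                 ('Paris', 'LOCATION'), ('.', 'O')]
print(group_entities(sample_tagged))
# [('Barack Obama', 'PERSON'), ('Paris', 'LOCATION')]
```

One limitation of this simple approach: two different entities of the same type that sit directly next to each other are merged into one.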


4. Constituency Parsing (Constituency Parse)

print('Constituency Parsing:', zh_model.parse(zh_sentence) + '\n')
print('Constituency Parsing:', en_model.parse(en_sentence))
Output:
Constituency Parsing: (ROOT
  (IP
    (IP
      (NP (NN 我愛))
      (ADVP (AD 自然))
      (NP (NN 語言))
      (VP (VV 處理)
        (NP (NN 技術))))
    (PU !)))

Constituency Parsing: (ROOT
  (S
    (NP (PRP I))
    (VP (VBP love)
      (NP (JJ natural) (NN language) (NN processing) (NN technology)))
    (. !)))
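
The parse comes back as a single bracketed string, where each innermost pair such as (NN language) holds a POS tag and a word. As a rough sketch, a regular expression can recover those (tag, word) pairs without building the full tree; the name tree_leaves is an assumption, and en_tree is the English result above pasted in as a literal:

```python
import re

# English parse copied from the demo output above
en_tree = """(ROOT
  (S
    (NP (PRP I))
    (VP (VBP love)
      (NP (JJ natural) (NN language) (NN processing) (NN technology)))
    (. !)))"""

# An innermost bracket: "(" TAG " " WORD ")" with no nested parentheses inside
PRETERMINAL = re.compile(r'\(([^\s()]+) ([^\s()]+)\)')


def tree_leaves(tree_string):
    """Return the (tag, word) pairs found at the leaves of a bracketed parse."""
    return PRETERMINAL.findall(tree_string)


pairs = tree_leaves(en_tree)
print(pairs)
# [('PRP', 'I'), ('VBP', 'love'), ('JJ', 'natural'), ('NN', 'language'),
#  ('NN', 'processing'), ('NN', 'technology'), ('.', '!')]
```

This only reads the leaves. Working with the phrase structure itself (NP, VP and so on) needs a real tree parser.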


5. Dependency Parsing (Dependency Parse)

print('Dependency:', zh_model.dependency_parse(zh_sentence))
print('Dependency:', en_model.dependency_parse(en_sentence))
Output:
Dependency: [('ROOT', 0, 4), ('nsubj', 4, 1), ('advmod', 4, 2), ('nsubj', 4, 3), ('dobj', 4, 5), ('punct', 4, 6)]
Dependency: [('ROOT', 0, 2), ('nsubj', 2, 1), ('amod', 6, 3), ('compound', 6, 4), ('compound', 6, 5), ('dobj', 2, 6), ('punct', 2, 7)]
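
Each triple reads as (relation, head index, dependent index), where the indices are 1-based token positions and 0 stands for the artificial ROOT node. For example, ('nsubj', 2, 1) says that token 1 ("I") is the subject of token 2 ("love"). A short sketch that swaps the indices for the words themselves; the helper name readable_dependencies is an assumption, and the data is copied from the English results above:

```python
# English tokens and dependency triples copied from the demo output above
en_tokens = ['I', 'love', 'natural', 'language', 'processing', 'technology', '!']
en_deps = [('ROOT', 0, 2), ('nsubj', 2, 1), ('amod', 6, 3), ('compound', 6, 4),
           ('compound', 6, 5), ('dobj', 2, 6), ('punct', 2, 7)]


def readable_dependencies(tokens, deps):
    """Replace 1-based token indices with words; index 0 maps to 'ROOT'."""
    words = ['ROOT'] + list(tokens)
    return [(rel, words[head], words[dep]) for rel, head, dep in deps]


for rel, head, dep in readable_dependencies(en_tokens, en_deps):
    print(f'{rel}({head}, {dep})')
# First lines printed:
# ROOT(ROOT, love)
# nsubj(love, I)
# amod(technology, natural)
```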


In addition, I have uploaded the code to GitHub: github.com/yuquanle/Stu

WeChat official account: StudyForAI (introductory AI learning for beginners)

