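How to Get the Synonyms of a Word with Python?
§01 Thesaurus
Learning an English word involves more than its translation and pronunciation. Listing all of its synonyms helps build a network of mutually supporting words and improves accuracy in word usage.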
1.1 py-thesaurus
The py-thesaurus project page describes how to install and use the package.
1.1.1 Installation
pip install py_thesaurus
1.1.2 Testing
# Query synonyms, the definition, and antonyms of one word with py_thesaurus.
from py_thesaurus import Thesaurus

input_word = "apple"
new_instance = Thesaurus(input_word)
print(new_instance.get_synonym())            # synonyms, any part of speech
print(new_instance.get_synonym(pos='verb'))  # verb synonyms only
print(new_instance.get_synonym(pos='adj'))   # adjective synonyms only
print(new_instance.get_definition())
print(new_instance.get_antonym())
(1) Output
---------- [PYTHON INFOR] ----------
[]
[]
[]
Give a non-empty argument
[]
[]
As the output shows, this package did not work: the queries returned only empty lists.
1.2 nltk wordnet
1.2.1 Installation
pip install nltk
1.2.2 Testing
from nltk.corpus import wordnet

# Collect the lemma names of every synset that contains the word.
synonyms = []
for syn in wordnet.synsets("Soil"):
    for lm in syn.lemmas():
        synonyms.append(lm.name())
print(set(synonyms))
Traceback (most recent call last):
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 86, in __load
root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\data.py", line 699, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource wordnet not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('wordnet')
Attempted to load corpora/wordnet.zip/wordnet/
Searched in:
- 'C:\\Users\\zhuoqing/nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\share\\nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\lib\\nltk_data'
- 'C:\\Users\\zhuoqing\\AppData\\Roaming\\nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
**********************************************************************
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Temp\TEMP0002\test2.PY", line 16, in <module>
for syn in wordnet.synsets("Soil"):
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 123, in __getattr__
self.__load()
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 88, in __load
raise e
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 83, in __load
root = nltk.data.find('{}/{}'.format(self.subdir, self.__name))
File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\data.py", line 699, in find
raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource wordnet not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('wordnet')
Attempted to load corpora/wordnet
Searched in:
- 'C:\\Users\\zhuoqing/nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\share\\nltk_data'
- 'C:\\Users\\zhuoqing\\Anaconda3\\lib\\nltk_data'
- 'C:\\Users\\zhuoqing\\AppData\\Roaming\\nltk_data'
- 'C:\\nltk_data'
- 'D:\\nltk_data'
- 'E:\\nltk_data'
**********************************************************************
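The test failed: NLTK raised a LookupError because the WordNet corpus data was not found in any of the searched directories. The error message suggests obtaining it with nltk.download('wordnet').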
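The traceback points to missing corpus data rather than a problem in the loop itself. The following is a minimal sketch, not something that was run in the test above. It assumes the corpus can be fetched with nltk.download('wordnet'), as the error message suggests, and that network access is available. The names collect_synonyms and wordnet_synonyms are hypothetical helpers introduced here for illustration; depending on the NLTK version, further resources may be requested in the same way.

```python
def collect_synonyms(synsets):
    """Return the sorted, de-duplicated lemma names found in `synsets`.

    `synsets` is any iterable of objects exposing .lemmas(), whose items
    expose .name(); this is the interface used by the test loop above.
    """
    names = set()
    for syn in synsets:
        for lemma in syn.lemmas():
            names.add(lemma.name())
    return sorted(names)


def wordnet_synonyms(word):
    """Look `word` up in WordNet, trying a one-time download if data is missing."""
    # Imported lazily so collect_synonyms can be used without NLTK installed.
    import nltk
    from nltk.corpus import wordnet

    try:
        synsets = wordnet.synsets(word)
    except LookupError:
        # Assumption: network access is available for the download.
        nltk.download("wordnet")
        synsets = wordnet.synsets(word)
    return collect_synonyms(synsets)
```

With the corpus in place, calling wordnet_synonyms("Soil") should return the same lemma names the original loop was meant to print, as a sorted list instead of a set.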
1.3 PyDictionary
The PyDictionary project page describes how to install and use PyDictionary.
1.3.1 Installing the Package
pip install pydictionary
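This installs the package successfully.
1.3.2 Testing the Package
from PyDictionary import PyDictionary

dictionary = PyDictionary()
# meaning() returns a dict that maps a part of speech to a list of definitions.
meaning = dictionary.meaning('indentation')
print(meaning)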
{'Noun': ['a concave cut into a surface or edge (as in a coastline', 'the formation of small pits in a surface as a consequence of corrosion', 'the space left between the margin and the start of an indented line', 'the act of cutting into an edge with toothlike notches or angular incisions']}
print(dictionary.meaning('apple'))
{'Noun': ['fruit with red or yellow or green skin and sweet to tart crisp whitish flesh', 'native Eurasian tree widely cultivated in many varieties for its firm rounded edible fruits']}
Elapsed time: 8.822 seconds.
This package is simply too slow.
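When each lookup takes several seconds, one possible mitigation is to cache results so that a given word is fetched only once per session. The sketch below is an assumption and was not part of the original test. make_cached is a hypothetical wrapper, and slow_meaning is a stand-in that only simulates latency; it is not the real PyDictionary API.

```python
import time
from functools import lru_cache


def make_cached(lookup):
    """Wrap `lookup` so repeated queries for the same word hit an in-memory cache."""
    @lru_cache(maxsize=None)
    def cached(word):
        return lookup(word)
    return cached


def slow_meaning(word):
    """Stand-in for a slow dictionary call (hypothetical, for illustration only)."""
    time.sleep(0.05)  # simulate network latency
    return {"Noun": ["placeholder definition of " + word]}


if __name__ == "__main__":
    cached_meaning = make_cached(slow_meaning)
    for attempt in range(2):
        start = time.perf_counter()
        cached_meaning("apple")
        elapsed = time.perf_counter() - start
        print("attempt %d: %.3f s" % (attempt + 1, elapsed))
```

In practice the wrapped function would be something like dictionary.meaning. The cache hands back the same object on every hit, so callers should treat the returned dict as read-only. The first lookup of each word stays as slow as before.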
1.4 pyenchant
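This package finds words that are close to the input word, so it is suited to correcting misspellings rather than listing synonyms.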
pip install pyenchant
1.4.1 Testing
import enchant

# Load the US English dictionary and ask for spelling suggestions.
d = enchant.Dict('en_US')
print(d.suggest('aple'))
['pale', 'ape', 'ale', 'ample', 'maple', 'apple', 'plea', 'apse', 'able', 'Apple']
print(d.suggest('window'))
['window', 'widow', 'windows', 'windrow', 'winnow', 'win dow', 'win-dow', 'wind ow', 'wind-ow']
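pyenchant depends on the Enchant C library. Where that is not available, a rough substitute can be sketched with the standard library's difflib.get_close_matches, run against a word list you supply yourself. This is a different technique (plain sequence similarity, not enchant's spell-checking backend), so its suggestions will differ from the ones shown above. SAMPLE_WORDS is a made-up list and suggest is a hypothetical helper name.

```python
from difflib import get_close_matches

# Made-up sample vocabulary; a real word list would be much larger.
SAMPLE_WORDS = ["apple", "ample", "maple", "pale", "window", "widow", "winnow"]


def suggest(word, vocabulary=SAMPLE_WORDS, limit=5):
    """Return up to `limit` vocabulary words similar to `word`, best match first."""
    # cutoff=0.6 is difflib's default similarity threshold (range 0..1).
    return get_close_matches(word, vocabulary, n=limit, cutoff=0.6)


if __name__ == "__main__":
    print(suggest("aple"))
    print(suggest("window"))
```

An exact match scores highest, so a correctly spelled word that is in the vocabulary comes back first, much as 'window' does in the enchant output.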
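§02 Modifying the Programs
Finally, pyenchant was added to the translation function of the cal and cdtm programs: when a misspelled word is encountered, enchant is used to look for similar words.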
2.1 cal
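# targetsampa, originsampa, result, printf, and dict come from the surrounding cal code;
# dict is presumably an enchant.Dict instance, since suggest() is called on it.
if targetsampa.find('*') >= 0:
    # A '*' in the phonetic string appears to mark an unrecognized word: show suggestions.
    sss = result['translateResult'][0][0]['src']
    printf("%s%s --> %s" % (result['translateResult'][0][0]['src'], originsampa,
                            dict.suggest(sss)))
else:
    printf("%s%s --> %s%s" % (result['translateResult'][0][0]['src'], originsampa,
                              result['translateResult'][0][0]['tgt'], targetsampa))
# printf ("%s%s --> %s%s" %(result['translateResult'][0][0]['src'], originsampa,
#          result['translateResult'][0][0]['tgt'], targetsampa))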
2.1.1 Test
lod/lod*/ --> ['lewd', 'old', 'dol', 'lid', 'lo', 'od', 'lode', 'load', 'lord', 'clod', 'loud', 'plod', 'led', 'sod', 'lad']
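The fallback used in cal can be isolated into a small function that is easy to test without the translation service. This is a sketch under assumptions: format_entry and its parameter names are hypothetical, a '*' in the phonetic string is taken to mean the word was not recognized (as the test output suggests), and suggest is any callable that returns candidate words, for example the suggest method of an enchant.Dict.

```python
def format_entry(src, origin_sampa, tgt, target_sampa, suggest):
    """Build one output line, falling back to suggestions for unrecognized words.

    src / tgt            -- source word and its translation
    origin_sampa / target_sampa -- phonetic strings for the two words
    suggest              -- callable mapping a word to a list of similar words
    """
    if "*" in target_sampa:
        # Unrecognized word: show similar words instead of a translation.
        return "%s%s --> %s" % (src, origin_sampa, suggest(src))
    return "%s%s --> %s%s" % (src, origin_sampa, tgt, target_sampa)


if __name__ == "__main__":
    # Stub suggester standing in for enchant, so the sketch runs on its own.
    stub = lambda word: ["load", "lord"]
    print(format_entry("lod", "/lod*/", "lod", "/lod*/", stub))
```

Passing the suggester in as an argument keeps the formatting logic independent of enchant, so the same function could serve both cal and cdtm.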
2.2 cdtm
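cdtm was modified in the same way.
※ Summary ※
This preliminary search did not turn up a suitable package for looking up the synonyms of a word. However, enchant can find similar words for a misspelled input.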
Source: zhuoqing.blog.csdn.net. Author: 卓晴. Copyright belongs to the original author; contact the author before reposting.
Original link: zhuoqing.blog.csdn.net/article/details/123022016