How do you get a word's synonyms with Python?

Posted by tsinghuazhuoqing on 2022/02/20 23:54:14

[Abstract] ...

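§01 Thesaurus Packages

Learning an English word involves more than its translation and pronunciation. Listing all of its synonyms helps build a network of mutually supporting words and makes word usage more accurate.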

1.1 py-thesaurus

The py-thesaurus project page describes how to install and use the package.

1.1.1 Installation

pip install py_thesaurus

1.1.2 Testing the package

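# from head import *  (the author's local helper module; not needed for this snippet)
from py_thesaurus import Thesaurus

input_word = "apple"
new_instance = Thesaurus(input_word)

# Synonyms: all parts of speech, then verbs only, then adjectives only.
print(new_instance.get_synonym())
print(new_instance.get_synonym(pos='verb'))
print(new_instance.get_synonym(pos='adj'))

# Definition and antonyms of the same word.
print(new_instance.get_definition())
print(new_instance.get_antonym())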

  
 

(1) Output

---------- [PYTHON INFOR] ----------
[]
[]
[]
Give a non-empty argument
[]
[]

  
 

As the output shows, the package did not work: every call returned an empty result.

1.2 nltk wordnet

1.2.1 Installation

pip install nltk

1.2.2 Testing the package

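# from head import *  (the author's local helper module; not needed for this snippet)
from nltk.corpus import wordnet

# Collect the lemma names of every synset that contains "Soil".
synonyms = []
for syn in wordnet.synsets("Soil"):
    for lm in syn.lemmas():
        synonyms.append(lm.name())
print(set(synonyms))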

  
 

Traceback (most recent call last):
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 86, in __load
    root = nltk.data.find('{}/{}'.format(self.subdir, zip_name))
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\data.py", line 699, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:
  >>> import nltk
  >>> nltk.download('wordnet')
  Attempted to load corpora/wordnet.zip/wordnet/
  Searched in:
    - 'C:\\Users\\zhuoqing/nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\share\\nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\lib\\nltk_data'
    - 'C:\\Users\\zhuoqing\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
**********************************************************************
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "D:\Temp\TEMP0002\test2.PY", line 16, in <module>
    for syn in wordnet.synsets("Soil"):
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 123, in __getattr__
    self.__load()
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 88, in __load
    raise e
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\corpus\util.py", line 83, in __load
    root = nltk.data.find('{}/{}'.format(self.subdir, self.__name))
  File "C:\Users\zhuoqing\Anaconda3\lib\site-packages\nltk\data.py", line 699, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource wordnet not found.
  Please use the NLTK Downloader to obtain the resource:
  >>> import nltk
  >>> nltk.download('wordnet')
  Attempted to load corpora/wordnet
  Searched in:
    - 'C:\\Users\\zhuoqing/nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\share\\nltk_data'
    - 'C:\\Users\\zhuoqing\\Anaconda3\\lib\\nltk_data'
    - 'C:\\Users\\zhuoqing\\AppData\\Roaming\\nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
**********************************************************************

  
 

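The test failed with an error: the WordNet corpus had not been downloaded. The traceback itself suggests running nltk.download('wordnet').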
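The traceback's own advice points to a possible fix: download the WordNet corpus once, then repeat the lookup. The following is a minimal sketch under stated assumptions, not something the original post ran. It assumes network access for the one-time download. `lemma_names` and `wordnet_synonyms` are hypothetical helper names, and some NLTK releases may additionally ask for the `omw-1.4` resource.

```python
def lemma_names(synsets):
    """Return the sorted, de-duplicated lemma names from an iterable of synsets."""
    names = set()
    for syn in synsets:
        for lemma in syn.lemmas():
            # WordNet joins multi-word lemmas with underscores, e.g. "ice_cream".
            names.add(lemma.name().replace("_", " "))
    return sorted(names)


def wordnet_synonyms(word):
    """Look up `word` in WordNet, downloading the corpus if it is missing."""
    import nltk
    from nltk.corpus import wordnet

    try:
        synsets = wordnet.synsets(word)
    except LookupError:
        # Same remedy the traceback recommends.
        nltk.download("wordnet")
        synsets = wordnet.synsets(word)
    return lemma_names(synsets)
```

With nltk installed, `print(wordnet_synonyms("soil"))` would list the lemma names WordNet groups with "soil". Note that these names are drawn from every sense of the word, so the result mixes noun and verb meanings.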

1.3 PyDictionary

The PyDictionary project page explains how to install and use PyDictionary.

1.3.1 Installing the package

pip install pydictionary

This installs the package successfully.

1.3.2 Testing the package

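# from head import *  (the author's local helper module; not needed for this snippet)
from PyDictionary import PyDictionary

dictionary = PyDictionary()

# meaning() returns a dict that maps each part of speech to a list of definitions.
meaning = dictionary.meaning('indentation')
print(meaning)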

  
 

{'Noun': ['a concave cut into a surface or edge (as in a coastline', 'the formation of small pits in a surface as a consequence of corrosion', 'the space left between the margin and the start of an indented line', 'the act of cutting into an edge with toothlike notches or angular incisions']}

meaning = dictionary.meaning('apple')
print(meaning)

  
 

{'Noun': ['fruit with red or yellow or green skin and sweet to tart crisp whitish flesh', 'native Eurasian tree widely cultivated in many varieties for its firm rounded edible fruits']}

  
 

Elapsed time: 8.822 seconds.

This package is simply too slow.
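An elapsed time like the one quoted above can be measured with a small wrapper around `time.perf_counter`. The sketch below shows one way to do it and is not the author's actual measurement code. `timed` is a hypothetical helper name, and the workload is a stand-in so that the example runs without network access.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload so the sketch runs anywhere.
result, seconds = timed(sorted, ["pear", "apple", "fig"])
print(f"{result} took {seconds:.3f} s")

# With PyDictionary installed, the same helper could wrap the lookup:
#   meaning, seconds = timed(PyDictionary().meaning, "apple")
```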

1.4 pyenchant

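This package finds words spelled similarly to the input word, which makes it useful for correcting misspelled words.

pip install pyenchant

1.4.1 Testing the package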

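# from head import *  (the author's local helper module; not needed for this snippet)
import enchant

# Load the US English dictionary and ask for spellings close to "aple".
d = enchant.Dict('en_US')
print(d.suggest('aple'))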

  
 

['pale', 'ape', 'ale', 'ample', 'maple', 'apple', 'plea', 'apse', 'able', 'Apple']

  
 

print(d.suggest('window'))

  
 

['window', 'widow', 'windows', 'windrow', 'winnow', 'win dow', 'win-dow', 'wind ow', 'wind-ow']
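For readers who want to try similar-spelling lookup without installing Enchant, the standard library's `difflib.get_close_matches` can act as a rough stand-in. This is a different technique from pyenchant's, so its suggestions and ordering will not match the lists above. The vocabulary and the helper name `suggest_similar` are made up for illustration.

```python
import difflib

# Tiny illustrative vocabulary; a real word list would hold many thousands of words.
VOCABULARY = ["apple", "ample", "maple", "window", "widow", "winnow"]


def suggest_similar(word, vocabulary=VOCABULARY, limit=5):
    """Return up to `limit` vocabulary words whose spelling is close to `word`."""
    return difflib.get_close_matches(word, vocabulary, n=limit, cutoff=0.6)


print(suggest_similar("aple"))
print(suggest_similar("windw"))
```

The `cutoff` argument sets the minimum similarity ratio (between 0 and 1) a candidate must reach. Raising it gives fewer, closer matches.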

  
 

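§02 Program Modifications

In the end, the pyenchant package was added to the translation feature of the cal and cdtm programs: when a misspelled word is encountered, enchant is used to look up similarly spelled words.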

2.1 cal

    # dict is presumably the enchant dictionary created elsewhere in cal
    # (the name shadows the built-in dict type).
    src = result['translateResult'][0][0]['src']
    if targetsampa.find('*') >= 0:
        # A '*' in targetsampa is treated as a failed lookup: show spelling suggestions.
        printf("%s%s --> %s" % (src, originsampa, dict.suggest(src)))
    else:
        tgt = result['translateResult'][0][0]['tgt']
        printf("%s%s --> %s%s" % (src, originsampa, tgt, targetsampa))

# Version before the change:
#    printf ("%s%s --> %s%s" %(result['translateResult'][0][0]['src'], originsampa,
#                              result['translateResult'][0][0]['tgt'], targetsampa))

  
 

2.1.1 Test

lod/lod*/ --> ['lewd', 'old', 'dol', 'lid', 'lo', 'od', 'lode', 'load', 'lord', 'clod', 'loud', 'plod', 'led', 'sod', 'lad']
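The fallback used in cal can be isolated into a small function that is easy to test without the translation service. This is a sketch under assumptions: `format_entry` and `fake_suggest` are hypothetical names, the sample strings are invented, and the real program passes the `suggest` method of an enchant dictionary rather than a stub.

```python
def format_entry(src, tgt, origin_sampa, target_sampa, suggest):
    """Build the output line: the translation, or spelling suggestions on failure.

    `suggest` is any callable that maps a word to a list of similar words,
    for example the suggest() method of an enchant dictionary.
    """
    if "*" in target_sampa:
        # Failed lookup: show the suggestion list instead of the translation.
        return "%s%s --> %s" % (src, origin_sampa, suggest(src))
    return "%s%s --> %s%s" % (src, origin_sampa, tgt, target_sampa)


def fake_suggest(word):
    """Hypothetical stand-in for enchant's suggest(), so the sketch runs anywhere."""
    return ["old", "load", "lord"] if word == "lod" else []


# Invented sample values: the first call triggers the fallback, the second does not.
print(format_entry("lod", "lod", "/lod*/", "/lod*/", fake_suggest))
print(format_entry("a", "b", "/x/", "/y/", fake_suggest))
```

Passing the suggestion function in as an argument keeps the formatting logic independent of Enchant, so the same function can serve both cal and cdtm.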

2.2 cdtm

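cdtm was modified in the same way.

※ Summary ※

This initial search did not turn up a suitable package for finding a word's synonyms. However, enchant can find similarly spelled words for a misspelled input.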

■ Related links:

py-thesaurus
PyDictionary

Source: zhuoqing.blog.csdn.net. Author: 卓晴. Copyright belongs to the original author; please contact the author before reposting.

Original link: zhuoqing.blog.csdn.net/article/details/123022016
