ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Paper: arXiv:2106.16038
This project repackages ChineseBERT so that it can be loaded directly through the HuggingFace API, with no extra code configuration needed.

Original paper: ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information, by Zijun Sun, Xiaoya Li, Xiaofei Sun, Yuxian Meng, Xiang Ao, Qing He, Fei Wu and Jiwei Li

Original project: ChineseBERT github link

Original model: ShannonAI/ChineseBERT-base (that model cannot be loaded directly through the HuggingFace API)
Install the pinyin dependency first:

```shell
pip install pypinyin
```
```python
from transformers import AutoTokenizer, AutoModel

# trust_remote_code=True is required: the repo ships custom model code
# that builds the glyph and pinyin embeddings.
tokenizer = AutoTokenizer.from_pretrained("iioSnail/ChineseBERT-base", trust_remote_code=True)
model = AutoModel.from_pretrained("iioSnail/ChineseBERT-base", trust_remote_code=True)

inputs = tokenizer(["我 喜 [MASK] 猫"], return_tensors='pt')
logits = model(**inputs).logits

# Decode the prediction, dropping the [CLS] and [SEP] positions.
print(tokenizer.decode(logits.argmax(-1)[0, 1:-1]))
```
Output: the input sentence with the model's prediction filled in at the `[MASK]` position.
To get the hidden states:

```python
model.bert(**inputs).last_hidden_state
```
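If you need a single sentence vector rather than per-token states, a common recipe is to mean-pool `last_hidden_state` over the non-padding positions. The sketch below shows just the pooling step on toy lists so it runs without downloading the model; with the real model you would apply the same idea to `model.bert(**inputs).last_hidden_state`, masking with `inputs["attention_mask"]` (all names and values here are illustrative):

```python
def mean_pool(hidden, mask):
    """Average the rows of hidden ([seq_len][dim] floats) where
    mask ([seq_len] of 0/1) is 1, skipping padding positions."""
    dim = len(hidden[0])
    total = [0.0] * dim
    count = 0
    for vec, m in zip(hidden, mask):
        if m:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    return [t / count for t in total]

# Toy example: 3 tokens, hidden size 2, last position is padding.
hidden = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]
print(mean_pool(hidden, mask))  # [2.0, 3.0]
```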
Fix for `Connection Error`: download the model to local disk and load it from there. For downloading in bulk, see this blog post.
Fix for `ModuleNotFoundError: No module named 'transformers_modules.iioSnail/ChineseBERT-base'`: change `iioSnail/ChineseBERT-base` to `iioSnail\ChineseBERT-base` (a path-separator issue on Windows).
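One way to download the model for offline use is the standard `git` + Git LFS workflow for Hugging Face repositories (a sketch; the local directory name is up to you):

```shell
# Git LFS is needed to fetch the model weights.
git lfs install
git clone https://huggingface.co/iioSnail/ChineseBERT-base

# Then load from the local path instead of the hub name, e.g.:
# AutoTokenizer.from_pretrained("./ChineseBERT-base", trust_remote_code=True)
```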