Commit 61314e57 authored by lishen

[init]

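# QABasedOnMedicalKnowledgeGraph
A from-scratch implementation of a disease-centered medical knowledge graph that serves as the basis for question answering. The project builds a reasonably sized knowledge graph for the medical domain, centered on diseases, and uses it to provide automatic question answering and analysis.
# Project overview
Knowledge graphs are currently a popular direction in natural language processing. For a fairly comprehensive set of references, see my CCKS2018 conference summary (https://github.com/liuhuanyong/CCKS2018Summary ).
I have also done some exploratory work on a related form, the event graph (事理图谱); see https://github.com/liuhuanyong/ComplexEventExtraction
A conceptual introduction to knowledge graphs is not repeated here. Knowledge graphs are now being applied across many fields, such as education, healthcare, law, and finance. This project focuses on the medical domain: with a vertical medical website as the data source and diseases as the core, it builds a knowledge graph of about 44,000 entities in 7 types and about 300,000 relationships in 11 types.
The project covers two parts:
1) Building a medical knowledge graph from vertical-website data
2) Automatic question answering based on the medical knowledge graph
# Final result
Without further ado, here are two screenshots taken during an actual question answering session: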
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/chat1.png)
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/chat2.png)
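# How to run
1. Requirements: a Neo4j database and the required Python packages. Note your Neo4j username and password, and update them in the relevant files.
2. Import the knowledge graph data: python build_medicalgraph.py. There is a lot of data to import, so expect this to take a few hours.
3. Start question answering: python chatbot_graph.py
# Detailed approach
# 1. Building the medical knowledge graph
# 1.1 Business-driven knowledge graph construction framework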
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/kg_route.png)
# 1.2 Script directory
prepare_data/data_spider.py: web data collection script
prepare_data/max_cut.py: dictionary-based forward/backward maximum matching segmentation script
build_medicalgraph.py: script that loads the knowledge graph into the database
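
The dictionary-based forward/backward maximum matching mentioned for max_cut.py can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the toy vocabulary, and the single-character fallback are assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len=None):
    """Greedy left-to-right longest-match segmentation."""
    max_len = max_len or max(map(len, vocab), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest window first and shrink until the dictionary matches;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len=None):
    """Greedy right-to-left longest-match segmentation."""
    max_len = max_len or max(map(len, vocab), default=1)
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # pieces were collected from the end of the text
    return tokens


# Hypothetical toy vocabulary; a real run would load a domain dictionary.
vocab = {"研究", "研究生", "生命", "起源"}
print(forward_max_match("研究生命起源", vocab))
print(backward_max_match("研究生命起源", vocab))
```

The two directions can disagree on ambiguous text, which is why a bidirectional variant typically compares both results and keeps one by some heuristic, such as the one with fewer tokens.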
# 1.3 Scale of the medical knowledge graph
1.3.1 Neo4j graph database storage scale
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/graph_summary.png)
1.3.2 Knowledge graph entity types
| Entity type | Meaning | Count | Examples |
| :--- | :---: | :---: | :--- |
| Check | Diagnostic check item | 3,353 | 支气管造影;关节镜检查 |
| Department | Medical department | 54 | 整形美容科;烧伤科 |
| Disease | Disease | 8,807 | 血栓闭塞性脉管炎;胸降主动脉动脉瘤 |
| Drug | Drug | 3,828 | 京万红痔疮膏;布林佐胺滴眼液 |
| Food | Food | 4,870 | 番茄冲菜牛肉丸汤;竹笋炖羊肉 |
| Producer | Marketed drug product | 17,201 | 通药制药青霉素V钾片;青阳醋酸地塞米松片 |
| Symptom | Disease symptom | 5,998 | 乳腺组织肥厚;脑实质深部出血 |
| Total | Total | 44,111 | About 44,000 entities |
1.3.3 Knowledge graph relationship types
| Relationship type | Meaning | Count | Example |
| :--- | :---: | :---: | :--- |
| belongs_to | Belongs to | 8,844 | <妇科,属于,妇产科> |
| common_drug | Drugs commonly used for a disease | 14,649 | <阳强,常用,甲磺酸酚妥拉明分散片> |
| do_eat | Foods suitable for a disease | 22,238 | <胸椎骨折,宜吃,黑鱼> |
| drugs_of | Marketed products of a drug | 17,315 | <青霉素V钾片,在售,通药制药青霉素V钾片> |
| need_check | Checks required for a disease | 39,422 | <单侧肺气肿,所需检查,支气管造影> |
| no_eat | Foods to avoid for a disease | 22,247 | <唇病,忌吃,杏仁> |
| recommand_drug | Drugs recommended for a disease | 59,467 | <混合痔,推荐用药,京万红痔疮膏> |
| recommand_eat | Recipes recommended for a disease | 40,221 | <鞘膜积液,推荐食谱,番茄冲菜牛肉丸汤> |
| has_symptom | Symptoms of a disease | 5,998 | <早期乳腺癌,疾病症状,乳腺组织肥厚> |
| acompany_with | Complications of a disease | 12,029 | <下肢交通静脉瓣膜关闭不全,并发疾病,血栓闭塞性脉管炎> |
| Total | Total | 294,149 | About 300,000 relationships |
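
As a hedged illustration of how triples like those in the table could be turned into Neo4j statements, the sketch below de-duplicates edges and emits one parameterized Cypher statement per edge. The helper name `build_rel_statements` is hypothetical, and passing names as `$` parameters is a swapped-in variation for illustration; the labels, relationship types, and `name` property come from the tables above.

```python
def build_rel_statements(start_label, end_label, edges, rel_type, rel_name):
    """Return de-duplicated (cypher, params) pairs for creating relationships."""
    # Labels and relationship types cannot be Cypher parameters, so they are
    # interpolated; entity names travel as parameters to avoid quoting issues.
    cypher = (
        f"MATCH (p:{start_label}), (q:{end_label}) "
        "WHERE p.name = $p AND q.name = $q "
        f"CREATE (p)-[:{rel_type} {{name: $rel_name}}]->(q)"
    )
    # A set of tuples removes repeated edges; sorting keeps the order stable.
    unique_edges = sorted({(p, q) for p, q in edges})
    return [(cypher, {"p": p, "q": q, "rel_name": rel_name}) for p, q in unique_edges]


# The same edge listed twice yields a single statement.
edges = [["胸椎骨折", "黑鱼"], ["胸椎骨折", "黑鱼"]]
for cypher, params in build_rel_statements("Disease", "Food", edges, "do_eat", "宜吃"):
    print(cypher, params)
```

Each pair should be usable with a driver call that accepts a query plus a parameter dictionary, for example py2neo's `Graph.run(cypher, params)`.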
1.3.4 Knowledge graph property types
| Property | Meaning | Example |
| :--- | :---: | :---: |
| name | Disease name | 喘息样支气管炎 |
| desc | Disease description | 又称哮喘性支气管炎... |
| cause | Cause of the disease | 常见的有合胞病毒等... |
| prevent | Preventive measures | 注意家族与患儿自身过敏史... |
| cure_lasttime | Treatment duration | 6-12个月 |
| cure_way | Treatment method | "药物治疗","支持性治疗" |
| cured_prob | Cure probability | 95% |
| easy_get | Susceptible population | 无特定的人群 |
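
To make the property table concrete, here is a small sketch of turning one JSON line into the property dictionary for a Disease node, with missing fields defaulting to an empty string. The helper name `parse_disease_line` and the sample line are illustrative assumptions; the sample values are taken from the table above, not from the actual data file.

```python
import json

# Property names listed in the table above (besides "name").
TEXT_FIELDS = ("desc", "cause", "prevent", "cure_lasttime",
               "cure_way", "cured_prob", "easy_get")


def parse_disease_line(line):
    """Parse one JSON line into the properties of a Disease node."""
    record = json.loads(line)
    props = {"name": record["name"]}
    for field in TEXT_FIELDS:
        # Records do not necessarily carry every field; default to "".
        props[field] = record.get(field, "")
    return props


# Hypothetical sample line for illustration only.
sample = '{"name": "喘息样支气管炎", "cure_lasttime": "6-12个月", "cured_prob": "95%"}'
print(parse_disease_line(sample))
```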
# 2. Automatic question answering based on the medical knowledge graph
# 2.1 Technical architecture
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/qa_route.png)
# 2.2 Script structure
question_classifier.py: question type classification script
question_parser.py: question parsing script
chatbot_graph.py: question answering program
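
The three scripts form a classify, parse, search pipeline with a fallback reply when any stage comes back empty. The sketch below shows only that data flow; `make_chat` and the `toy_*` stages are hypothetical stand-ins, and the dictionary lookup replaces the real graph query.

```python
def make_chat(classify, parse, search, fallback="(no answer found)"):
    """Wire the three stages into one question -> answer function."""
    def chat(question):
        classified = classify(question)
        if not classified:  # no known entity or question type recognized
            return fallback
        answers = search(parse(classified))
        return "\n".join(answers) if answers else fallback
    return chat


# Toy stand-ins for the three scripts, only to show the data flow.
FACTS = {("disease_lasttime", "感冒"): "感冒治疗可能持续的周期为:7-14天"}


def toy_classify(question):
    if "感冒" in question and "多久" in question:
        return {"question_type": "disease_lasttime", "entity": "感冒"}
    return None


def toy_parse(classified):
    # The real parser would emit Cypher; here a lookup key is enough.
    return [(classified["question_type"], classified["entity"])]


def toy_search(keys):
    return [FACTS[key] for key in keys if key in FACTS]


chat = make_chat(toy_classify, toy_parse, toy_search)
print(chat("感冒要多久才能好?"))
print(chat("你好"))
```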
# 2.3 Supported question types
| Question type | Meaning | Example question |
| :--- | :---: | :---: |
| disease_symptom | Symptoms of a disease | 乳腺癌的症状有哪些? |
| symptom_disease | Possible diseases for a known symptom | 最近老流鼻涕怎么办? |
| disease_cause | Cause of a disease | 为什么有的人会失眠? |
| disease_acompany | Complications of a disease | 失眠有哪些并发症? |
| disease_not_food | Foods to avoid for a disease | 失眠的人不要吃啥? |
| disease_do_food | Foods recommended for a disease | 耳鸣了吃点啥? |
| food_not_disease | Diseases for which a given food should be avoided | 哪些人最好不好吃蜂蜜? |
| food_do_disease | Diseases a given food is good for | 鹅肉有什么好处? |
| disease_drug | Drugs to take for a disease | 肝病要吃啥药? |
| drug_disease | Diseases a drug can treat | 板蓝根颗粒能治啥病? |
| disease_check | Checks needed for a disease | 脑膜炎怎么才能查出来? |
| check_disease | Diseases a check can detect | 全血细胞计数能查出啥来? |
| disease_prevent | Preventive measures | 怎样才能预防肾虚? |
| disease_lasttime | Treatment duration | 感冒要多久才能好? |
| disease_cureway | Treatment method | 高血压要怎么治? |
| disease_cureprob | Cure probability | 白血病能治好吗? |
| disease_easyget | Susceptible population | 什么人容易得高血压? |
| disease_desc | Disease description | 糖尿病 |
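
One simple way to map a question onto the types in this table is keyword matching, sketched below. The keyword lists and the function name `classify_question` are assumptions for illustration, not the rules used in question_classifier.py; the fallback to `disease_desc` for a bare disease name mirrors the last row of the table.

```python
# Hypothetical keyword rules for a few of the question types above.
RULES = [
    ("disease_symptom", ("症状", "表现")),
    ("disease_cause", ("为什么", "原因", "成因")),
    ("disease_prevent", ("预防", "防止")),
    ("disease_lasttime", ("多久", "周期")),
]


def classify_question(question, known_diseases):
    """Return recognized entities and question types, or None if no entity is found."""
    entities = sorted(d for d in known_diseases if d in question)
    if not entities:
        return None
    types = [name for name, keywords in RULES
             if any(keyword in question for keyword in keywords)]
    # A bare disease name with no cue word is treated as a request for a description.
    return {"entities": entities, "question_types": types or ["disease_desc"]}


print(classify_question("乳腺癌的症状有哪些?", {"乳腺癌", "感冒"}))
```

Substring matching against a plain set is the simplest option; with tens of thousands of entity names, a multi-pattern matcher such as an Aho-Corasick automaton would likely be a better fit.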
# Sample Q&A output
用户:乳腺癌的症状有哪些?
小勇: 乳腺癌的症状包括:乳腺癌的远处转移;胸痛;乳头溢液;乳头破碎;肝肿大;泌乳障碍;乳头内陷;乳房肿块;剧痛
******************************************************************************************
用户:最近老流鼻涕怎么办?
小勇: 症状流鼻涕可能染上的疾病有:枯草热;副流行性感冒;急性上呼吸道感染;硫化氢中毒;小儿衣原体肺炎;风寒感冒;慢性额窦炎;鼻源性头痛;人禽流行性感冒;小儿流行性喘憋性肺炎;病毒性感冒;慢性鼻炎;风热犯肺;感冒;顿呛;小儿急性上呼吸道感染;嗜酸细胞增多性非变态反应性鼻炎;干酪性鼻窦炎;下呼吸道感染;麻疹
******************************************************************************************
用户:为什么有的人会失眠?
小勇: 失眠可能的成因有:躯体疾病和服用药物可以影响睡眠,如消化不良,头痛,背痛,关节炎,心脏病,糖尿病,哮喘,鼻窦炎,溃疡病,或服用某些影响中枢神经的药物。
由于生活方式引起睡眠问题也很常见,如饮用咖啡或茶叶,晚间饮酒,睡前进食或晚饭较晚造成满腹食物尚未消化,大量吸烟,睡前剧烈的体力活动,睡前过度的精神活动,夜班工作,白天小睡,上床时间不规律,起床时间不规律。
可能的原因有压力很大,过度忧虑,紧张或焦虑,悲伤或抑郁,生气,容易出现睡眠问题。
吵闹的睡眠环境,睡眠环境过于明亮,污染,过度拥挤。
******************************************************************************************
用户:失眠有哪些并发症?
小勇: 失眠的症状包括:心肾不交;神经性耳鸣;咽鼓管异常开放症;偏执狂;十二指肠胃反流及胆汁反流性胃炎;腋臭;黧黑斑;巨细胞动脉炎;Stargardt病;抑郁症;腔隙性脑梗死;甲状腺功能亢进伴发的精神障碍;紧张性头痛;胃下垂;心血虚;迷路震荡;口腔结核性溃疡;痰饮;游走性结节性脂膜炎;小儿脑震荡
******************************************************************************************
用户:失眠的人不要吃啥?
小勇: 失眠忌食的食物包括有:油条;河蚌;猪油(板油);淡菜(鲜)
******************************************************************************************
用户:耳鸣了吃点啥?
小勇: 耳鸣宜食的食物包括有:南瓜子仁;鸡翅;芝麻;腰果
推荐食谱包括有:紫菜芙蓉汤;羊肉汤面;油豆腐油菜;紫菜鸡蛋莲草汤;乌药羊肉汤;可乐鸡翅;栗子鸡翅;冬菇油菜心
******************************************************************************************
用户:哪些人最好不好吃蜂蜜?
小勇: 患有散发性脑炎伴发的精神障碍;情感性心境障碍;蝎螫伤;四肢淋巴水肿;农药中毒所致的精神障碍;肝错构瘤;细菌性肺炎;急性高原病;小儿颅后窝室管膜瘤;柯萨奇病毒疹;眼眶静脉性血管瘤;乙脑伴发的精神障碍;晚期产后出血;吸入性肺炎;腓总神经损伤;铍及其化合物引起的皮肤病;猝死型冠心病;彼得异常;过敏性急性小管间质性肾炎;小儿腹胀的人最好不要吃蜂蜜
******************************************************************************************
用户:鹅肉有什么好处?
小勇: 患有子宫内膜厚;呼吸疾病;肛肠病;闭经;丧偶后适应性障碍;宫颈外翻;巨球蛋白血症;急性颌下腺炎;锥体外系损害;腺样体炎;咳嗽;错构瘤;牙科病;子宫内膜炎;闭锁综合征;结膜炎;恶性淋巴瘤;足外翻;神经炎;病理性近视的人建议多试试鹅肉
******************************************************************************************
用户:肝病要吃啥药?
小勇: 肝病宜食的食物包括有:鹅肉;鸡肉;鸡肝;鸡腿
推荐食谱包括有:小米红糖粥;小米蛋奶粥;扁豆小米粥;黄豆小米粥;人参小米粥;小米粉粥;鲜菇小米粥;芝麻小米粥
肝病通常的使用的药品包括:恩替卡韦分散片;维生素C片;二十五味松石丸;拉米夫定胶囊;阿德福韦酯片
******************************************************************************************
用户:板蓝根颗粒能治啥病?
小勇: 板蓝根颗粒主治的疾病有流行性腮腺炎;喉痹;喉炎;咽部异感症;急性单纯性咽炎;腮腺隙感染;过敏性咽炎;咽囊炎;急性鼻咽炎;喉水肿;慢性化脓性腮腺炎;慢性咽炎;急性喉炎;咽异感症;鼻咽炎;锁喉痈;小儿咽喉炎;喉返神经损伤;化脓性腮腺炎;喉血管瘤,可以试试
******************************************************************************************
用户:脑膜炎怎么才能查出来?
小勇: 脑膜炎通常可以通过以下方式检查出来:脑脊液钠;尿常规;Fisher手指试验;颈项强直;脑脊液细菌培养;尿谷氨酰胺;脑脊液钾;脑脊液天门冬氨酸氨基转移酶;脑脊液病原体检查;硝酸盐还原试验
******************************************************************************************
用户:怎样才能预防肾虚?
小勇: 肾虚可能的成因有:1、多因房劳过度,或少年频繁手淫。2、思虑忧郁,损伤心脾,则病及阳明冲脉。3、恐惧伤肾,恐则伤肾。4、肝主筋,阴器为宗筋之汇,若情志不遂,忧思郁怒,肝失疏泄条达,则宗筋所聚无能。5、湿热下注,宗筋弛纵。
肾虚是肾脏精气阴阳不足所产生的诸如精神疲乏、头晕耳鸣、健忘脱发、腰脊酸痛、遗精阳痿、男子不育、女子不孕、更年期综合征等多种病证的一个综合概念。关于肾虚形成的原因,可归结为两个方面,一为先天禀赋不足,二为后天因素引起。
从引起肾虚的先天因素来看,首先是先天禀赋薄弱。《灵枢.寿天刚柔》篇说:“人之生也,有刚有柔,有弱有强。”由于父母体弱多病,精血亏虚时怀孕;或酒后房事怀孕;或年过五十精气力量大减之时怀孕;或男女双方年龄不够,身体发育不完全结婚,也就是早婚时怀孕,或生育过多,精血过度耗损;或妊娠期中失于调养,胎气不足等等都可导致肾的精气亏虚成为肾虚证形成的重要原因;其次,如果肾藏精功能失常就会导致性功能异常,生殖功能下降,影响生殖能力,便会引起下一代形体虚衰,或先天畸形、痴呆、缺陷、男子出现精少不育、早泄,女子出现闭经不孕、小产、习惯性流产等等。
肾虚的预防措施包括:肾虚日常预防
在预防方面,因起病与恣情纵欲有关的,应清心寡欲,戒除手淫;如与全身衰弱、营养不良或身心过劳有关的,应适当增加营养或注意劳逸结合,节制性欲。
1、性生活要适度,不勉强,不放纵。
2、饮食方面:无力疲乏时多吃含铁、蛋白质的食物,如木耳、大枣、乌鸡等;消化不良者多喝酸奶,吃山楂;平日护肾要多吃韭菜、海参、人参、乌鸡、家鸽等。
3、经常进行腰部活动,这些运动可以健运命门,补肾纳气。还可多做一些刺激脚心的按摩,中医认为,脚心的涌泉穴是浊气下降的地方,经常按摩涌泉穴,可益精补肾、强身健体、防止早衰,并能舒肝明目,清喉定心,促进睡眠,增进食欲。
4、充足的睡眠也是恢复精气神的重要保障,工作再紧张,家里的烦心事再多,到了该睡觉的时候也要按时休息。
健康教育
1、过度苦寒、冰凉的食物易伤肾,如芦荟、苦瓜、雪糕、鹅肉、啤酒进食过多都伤肾,应该多食黑色素含量高和温补性中药如黑米黑豆等。
2、男性接触过多的洗涤剂也伤肾,家庭应少用洗涤剂清洗餐具及蔬果,以免洗涤剂残留物被过多摄入。
3、适当运动可延缓衰老,但强度不宜太大,应选能力所及的运动项目,以促进血液循环,可改善血淤、气损等情况。散步、慢跑、快步走,或在鹅卵石上赤足适当行走,都会促进血液循环,对肾虚有辅助治疗作用。
4、保持良好的作息习惯,尽量避免熬夜。
5、积极参加户外运动,放松心情。
6、不要给自己太大的压力,学会合理减压。
******************************************************************************************
用户:感冒要多久才能好?
小勇: 感冒治疗可能持续的周期为:7-14天
******************************************************************************************
用户:高血压要怎么治?
小勇: 高血压可以尝试如下治疗:药物治疗;手术治疗;支持性治疗
******************************************************************************************
用户:白血病能治好吗?
小勇: 白血病治愈的概率为(仅供参考):50%-70%
******************************************************************************************
用户:什么人容易得高血压?
小勇: 高血压的易感人群包括:有高血压家族史,不良的生活习惯,缺乏运动的人群
******************************************************************************************
用户:糖尿病
小勇: 糖尿病,熟悉一下:糖尿病是一种比较常见的内分泌代谢性疾病。该病发病原因主要是由于胰岛素分泌不足,以及胰升高血糖素不适当地分泌过多所引起。多见于40岁以上喜食甜食而肥胖的病人,城市多于农村,常有家族史,故与遗传有关。少数病人与病毒感染和自身免疫反应有关。主要表现为烦渴、多饮、多尿、多食、乏力、消瘦等症状。生命的常见病,伴发高血压、冠心病、高脂血症等,严重时危及生命。
中医学认为,肝主疏泄,关系人体接收机的升降与调畅,肝气郁滞则气机升降输布紊乱,肝失疏泄则血糖等精微物质不能随清阳之气输布于周身而郁滞于血中,出现高血糖或精微物质的输布紊乱,反见血糖升高,进一步导致血脂、蛋白等其它精微物质紊乱,引起其他合并症,治疗以疏肝调气为主,顺肝条达之性以恢复其生理功能,肝气条达,气机调畅,精微得以输布,糖被利用而血糖自然下降。
另外,因糖尿病的发生和饮食有关,饮食控制的好坏直接影响着治疗的效果。再就是配合运动,注意调摄情志,再适当的配合中药治疗会取得良好的治疗效果。 
******************************************************************************************
用户:全血细胞计数能查出啥来
小勇: 通常可以通过全血细胞计数检查出来的疾病有成人类风湿性关节炎性巩膜炎;外阴-阴道-牙龈综合征;电击伤;老年收缩期高血压;小儿肝硬化;异常血红蛋白病;痴呆综合征;高血压病伴发的精神障碍;睾丸淋巴瘤;叶酸缺乏所致贫血;眼球内炎;不稳定血红蛋白病;类癌综合征;老年痴呆;急性淋巴管炎;宫颈妊娠;蚕食性角膜溃疡;低增生性急性白血病;交感性眼炎;原发性免疫缺陷病
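
Most replies above follow one pattern: a fixed sentence template filled with a subject and a capped, de-duplicated list of items joined by ";". A minimal sketch of that pattern follows. The function name `format_answer` is hypothetical, and `dict.fromkeys` is used here to keep the original item order, which is a variation on the set-based de-duplication in answer_search.py.

```python
def format_answer(template, subject, items, limit=20):
    """Fill a reply template with a subject and a de-duplicated, capped item list."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(items))[:limit]
    return template.format(subject, ";".join(unique))


print(format_answer("{0}的症状包括:{1}", "乳腺癌", ["胸痛", "乳头溢液", "胸痛"]))
```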
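# Summary
1. Starting from scratch and using a vertical website as the data source, this project built a disease-centered medical knowledge graph with about 44,000 entities and about 300,000 relationships. On top of it, a small question answering system that handles 18 types of questions was built. The whole effort took 3 days: 1 day for data collection and cleaning, 0.5 days for building and loading the knowledge graph, and 1.5 days for the question answering components. Overall, this was fairly quick.
2. The medical knowledge graph is business-driven: the knowledge schema is derived from the structured data that was collected (structured web page data parsed with XPath).
3. Neo4j is used for storage, and question answering is implemented with traditional rule-based methods. Cypher statements serve as the search queries behind the question answering service.
4. The project can be deployed quickly; the data is in data/medical.json. If the data in this project infringes the rights of any organization, please contact me and I will remove it. Please do not use the data commercially, to avoid unnecessary disputes. To deploy, follow the run steps described earlier to set up the database and provide the search service.
5. Known limitation: for disease causes, prevention, and similar questions, the system returns a long block of text. Event extraction could be introduced here to represent causes in a more structured form. This is something to try later.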
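
The rule-based step in point 3, going from a question type to a Cypher query, can be sketched as a template lookup. The templates and the helper `build_queries` are illustrative assumptions rather than the contents of question_parser.py; the labels, relationship type, and property names are the ones documented in the tables earlier.

```python
# Hypothetical Cypher templates keyed by question type; $name is the entity.
QUERY_TEMPLATES = {
    "disease_symptom": (
        "MATCH (m:Disease)-[r:has_symptom]->(n:Symptom) "
        "WHERE m.name = $name RETURN m.name, r.name, n.name"
    ),
    "symptom_disease": (
        "MATCH (m:Disease)-[r:has_symptom]->(n:Symptom) "
        "WHERE n.name = $name RETURN m.name, r.name, n.name"
    ),
    "disease_cause": "MATCH (m:Disease) WHERE m.name = $name RETURN m.name, m.cause",
}


def build_queries(question_type, entities):
    """Return one (cypher, params) pair per entity, or [] for unknown types."""
    template = QUERY_TEMPLATES.get(question_type)
    if template is None:
        return []
    return [(template, {"name": entity}) for entity in entities]


for cypher, params in build_queries("disease_symptom", ["乳腺癌"]):
    print(cypher, params)
```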
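If you have any questions about the project or me, see https://liuhuanyong.github.io/
For questions or collaboration on natural language processing, knowledge graphs, event graphs, social computing, or language resource construction, contact me:
1. My GitHub project overview: https://liuhuanyong.github.io
2. My CSDN blog: https://blog.csdn.net/lhy2014
3. About me: Liu Huanyong (刘焕勇), Institute of Software, Chinese Academy of Sciences, lhy_in_blcu@126.com.
4. My technical WeChat public account: 老刘说NLP. Scan the QR code to follow: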
![image](https://github.com/liuhuanyong/QABasedOnMedicalKnowledgeGraph/blob/master/img/wechat.jpg)
#!/usr/bin/env python3
# coding: utf-8
# File: answer_search.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-5
from sm4_utils import CryptSM4
from type_enum import word_type_enum, question_type_enum
class AnswerSearcher:
    def __init__(self, g):
        self.g = g
        self.num_limit = 20
        self.sm4 = CryptSM4('mkQXUpzFZpD3JLrw')

    '''Run the Cypher queries and return the formatted results'''
    def search_main(self, sqls):
        final_answers = []
        for sql_ in sqls:
            question_type = sql_['question_type']
            queries = sql_['sql']
            answers = []
            for query in queries:
                ress = self.g.run(query).data()
                answers += ress
            final_answer = self.answer_prettify(question_type, answers)
            if final_answer:
                final_answers.append(final_answer)
        return final_answers

    '''Pick the reply template that matches the question_type'''
    def answer_prettify(self, question_type, answers):
        final_answer = []
        if not answers:
            return ''
if question_type == question_type_enum.DISEASE_SYMPTOM:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的症状包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.SYMPTOM_DISEASE:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '症状{0}可能染上的疾病有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_CAUSE:
desc = [self.sm4.decryptSM4(i['m.cause']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}可能的成因有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_PREVENT:
desc = [self.sm4.decryptSM4(i['m.prevent']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的预防措施包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_LASTTIME:
desc = [self.sm4.decryptSM4(i['m.cure_lasttime']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}治疗可能持续的周期为:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_CUREWAY:
desc = [self.sm4.decryptSM4(i['m.cure_way']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}可以尝试如下治疗:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_CUREPROB:
desc = [self.sm4.decryptSM4(i['m.cured_prob']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}治愈的概率为(仅供参考):{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_EASYGET:
desc = [self.sm4.decryptSM4(i['m.easy_get']) for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的易感人群包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_DESC:
desc = [i['m.desc'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0},熟悉一下:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_ACOMPANY:
desc1 = [i['n.name'] for i in answers]
desc2 = [i['m.name'] for i in answers]
subject = answers[0]['m.name']
desc = [i for i in desc1 + desc2 if i != subject]
final_answer = '{0}的并发症包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_NOT_FOOD:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}忌食的食物包括有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_DO_FOOD:
do_desc = [i['n.name'] for i in answers if i['r.name'] == '宜吃']
recommand_desc = [i['n.name'] for i in answers if i['r.name'] == '推荐食谱']
subject = answers[0]['m.name']
final_answer = '{0}宜食的食物包括有:{1}\n推荐食谱包括有:{2}'.format(subject, ';'.join(list(set(do_desc))[:self.num_limit]), ';'.join(list(set(recommand_desc))[:self.num_limit]))
elif question_type == question_type_enum.FOOD_NOT_DISEASE:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '患有{0}的人最好不要吃{1}'.format(';'.join(list(set(desc))[:self.num_limit]), subject)
elif question_type == question_type_enum.FOOD_DO_DISEASE:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '患有{0}的人建议多试试{1}'.format(';'.join(list(set(desc))[:self.num_limit]), subject)
elif question_type == 'disease_drug':
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}通常的使用的药品包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DRUG_DISEASE:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '{0}主治的疾病有{1},可以试试'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.DISEASE_CHECK:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}通常可以通过以下方式检查出来:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.CHECK_DISEASE:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '通常可以通过{0}检查出来的疾病有{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODMATERIAL_NOT_CROWD:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '患有{1}等疾病的人群不适宜食用{0}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODMATERIAL_DO_CROWD:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '患有{1}等疾病的人群适宜食用{0}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_NOT_CROWD:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '患有{1}等疾病的人群不适宜食用{0}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_DO_CROWD:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '患有{1}等疾病的人群适宜食用{0}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_LEVEL_HIGH or question_type == question_type_enum.FOODMATERIAL_LEVEL_HIGH:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}中的{1}较高'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_LEVEL_MID or question_type == question_type_enum.FOODMATERIAL_LEVEL_MID:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}中的{1}适中'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_LEVEL_LOW or question_type == question_type_enum.FOODMATERIAL_LEVEL_LOW:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}中的{1}较低'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.INDEX_LEVEL_HIGH:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '{0}较高的食物有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.INDEX_LEVEL_MID:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '{0}适中的食物有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.INDEX_LEVEL_LOW:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '{0}较低的食物有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_LEVEL_CONTENT:
final_answer = ''
for i in answers:
level = '未知'
if i['type'] == 'HIGH_TO':
level = '较高'
elif i['type'] == 'MIDDLE_TO':
level = '适中'
elif i['type'] == 'LOW_TO':
level = '较低'
final_answer += f"{i['m.name']}的{i['n.name']}属于{level}的水平;"
elif question_type == question_type_enum.FOODPROCESS_COOKING_SPECIALTY or question_type == question_type_enum.FOODMATERIAL_COOKING_SPECIALTY:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的烹饪特点是:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_FOOD_SPECIALTY or question_type == question_type_enum.FOODMATERIAL_FOOD_SPECIALTY:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的膳食特点是:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_PROVINCE or question_type == question_type_enum.FOODMATERIAL_PROVINCE:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}所属地域包括:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.PROVINCE_FOOD:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '{0}的菜品有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_UNIT or question_type == question_type_enum.FOODMATERIAL_UNIT:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的扩展单位是:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_HIGH_CONT or question_type == question_type_enum.FOODMATERIAL_HIGH_CONT:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}富含的元素有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.NUTRIENT_HIGH_CONT:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '富含{0}的食物有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_LOW_CONT or question_type == question_type_enum.FOODMATERIAL_LOW_CONT:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}含量低的元素有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.NUTRIENT_LOW_CONT:
desc = [i['m.name'] for i in answers]
subject = answers[0]['n.name']
final_answer = '含低量{0}的食物有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
elif question_type == question_type_enum.FOODPROCESS_ALIAS:
desc = [i['n.name'] for i in answers]
subject = answers[0]['m.name']
final_answer = '{0}的别名还有:{1}'.format(subject, ';'.join(list(set(desc))[:self.num_limit]))
return final_answer
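if __name__ == '__main__':
    # AnswerSearcher needs a py2neo Graph handle; settings mirror chatbot_graph.py
    from py2neo import Graph
    searcher = AnswerSearcher(Graph('http://127.0.0.1:7474', user="neo4j", password="neo4j"))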
#!/usr/bin/env python3
# coding: utf-8
# File: MedicalGraph.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-3
import os
import json
from py2neo import Graph,Node
from type_enum import word_type_enum  # used by create_graphrels

class MedicalGraph:
    def __init__(self):
        cur_dir = '/'.join(os.path.abspath(__file__).split('/')[:-1])
        self.data_path = os.path.join(cur_dir, 'data/medical.json')
        self.g = Graph(
            'http://127.0.0.1:7474',
            user="neo4j",  # database user name; "neo4j" unless it has been changed
            password="neo4j")
'''读取文件'''
def read_nodes(self):
# 共7类节点
drugs = [] # 药品
foods = [] # 食物
checks = [] # 检查
departments = [] #科室
producers = [] #药品大类
diseases = [] #疾病
symptoms = []#症状
disease_infos = []#疾病信息
# 构建节点实体关系
rels_department = [] # 科室-科室关系
rels_noteat = [] # 疾病-忌吃食物关系
rels_doeat = [] # 疾病-宜吃食物关系
rels_recommandeat = [] # 疾病-推荐吃食物关系
rels_commonddrug = [] # 疾病-通用药品关系
rels_recommanddrug = [] # 疾病-热门药品关系
rels_check = [] # 疾病-检查关系
rels_drug_producer = [] # 厂商-药物关系
rels_symptom = [] #疾病症状关系
rels_acompany = [] # 疾病并发关系
rels_category = [] # 疾病与科室之间的关系
count = 0
for data in open(self.data_path):
disease_dict = {}
count += 1
print(count)
data_json = json.loads(data)
disease = data_json['name']
disease_dict['name'] = disease
diseases.append(disease)
disease_dict['desc'] = ''
disease_dict['prevent'] = ''
disease_dict['cause'] = ''
disease_dict['easy_get'] = ''
disease_dict['cure_department'] = ''
disease_dict['cure_way'] = ''
disease_dict['cure_lasttime'] = ''
disease_dict['symptom'] = ''
disease_dict['cured_prob'] = ''
if 'symptom' in data_json:
symptoms += data_json['symptom']
for symptom in data_json['symptom']:
rels_symptom.append([disease, symptom])
if 'acompany' in data_json:
for acompany in data_json['acompany']:
rels_acompany.append([disease, acompany])
if 'desc' in data_json:
disease_dict['desc'] = data_json['desc']
if 'prevent' in data_json:
disease_dict['prevent'] = data_json['prevent']
if 'cause' in data_json:
disease_dict['cause'] = data_json['cause']
if 'get_prob' in data_json:
disease_dict['get_prob'] = data_json['get_prob']
if 'easy_get' in data_json:
disease_dict['easy_get'] = data_json['easy_get']
if 'cure_department' in data_json:
cure_department = data_json['cure_department']
if len(cure_department) == 1:
rels_category.append([disease, cure_department[0]])
if len(cure_department) == 2:
big = cure_department[0]
small = cure_department[1]
rels_department.append([small, big])
rels_category.append([disease, small])
disease_dict['cure_department'] = cure_department
departments += cure_department
if 'cure_way' in data_json:
disease_dict['cure_way'] = data_json['cure_way']
if 'cure_lasttime' in data_json:
disease_dict['cure_lasttime'] = data_json['cure_lasttime']
if 'cured_prob' in data_json:
disease_dict['cured_prob'] = data_json['cured_prob']
if 'common_drug' in data_json:
common_drug = data_json['common_drug']
for drug in common_drug:
rels_commonddrug.append([disease, drug])
drugs += common_drug
if 'recommand_drug' in data_json:
recommand_drug = data_json['recommand_drug']
drugs += recommand_drug
for drug in recommand_drug:
rels_recommanddrug.append([disease, drug])
            if 'not_eat' in data_json:
                not_eat = data_json['not_eat']
                for _not in not_eat:
                    rels_noteat.append([disease, _not])
                foods += not_eat
                do_eat = data_json['do_eat']
                for _do in do_eat:
                    rels_doeat.append([disease, _do])
                foods += do_eat
                recommand_eat = data_json['recommand_eat']
                for _recommand in recommand_eat:
                    rels_recommandeat.append([disease, _recommand])
                foods += recommand_eat
if 'check' in data_json:
check = data_json['check']
for _check in check:
rels_check.append([disease, _check])
checks += check
if 'drug_detail' in data_json:
drug_detail = data_json['drug_detail']
producer = [i.split('(')[0] for i in drug_detail]
rels_drug_producer += [[i.split('(')[0], i.split('(')[-1].replace(')', '')] for i in drug_detail]
producers += producer
disease_infos.append(disease_dict)
return set(drugs), set(foods), set(checks), set(departments), set(producers), set(symptoms), set(diseases), disease_infos,\
rels_check, rels_recommandeat, rels_noteat, rels_doeat, rels_department, rels_commonddrug, rels_drug_producer, rels_recommanddrug,\
rels_symptom, rels_acompany, rels_category
'''建立节点'''
def create_node(self, label, nodes):
count = 0
for node_name in nodes:
node = Node(label, name=node_name)
self.g.create(node)
count += 1
print(count, len(nodes))
return
'''创建知识图谱中心疾病的节点'''
def create_diseases_nodes(self, disease_infos):
count = 0
for disease_dict in disease_infos:
node = Node("Disease", name=disease_dict['name'], desc=disease_dict['desc'],
prevent=disease_dict['prevent'] ,cause=disease_dict['cause'],
easy_get=disease_dict['easy_get'],cure_lasttime=disease_dict['cure_lasttime'],
cure_department=disease_dict['cure_department']
,cure_way=disease_dict['cure_way'] , cured_prob=disease_dict['cured_prob'])
self.g.create(node)
count += 1
print(count)
return
'''创建知识图谱实体节点类型schema'''
def create_graphnodes(self):
Drugs, Foods, Checks, Departments, Producers, Symptoms, Diseases, disease_infos,rels_check, rels_recommandeat, rels_noteat, rels_doeat, rels_department, rels_commonddrug, rels_drug_producer, rels_recommanddrug,rels_symptom, rels_acompany, rels_category = self.read_nodes()
self.create_diseases_nodes(disease_infos)
self.create_node('Drug', Drugs)
print(len(Drugs))
self.create_node('Food', Foods)
print(len(Foods))
self.create_node('Check', Checks)
print(len(Checks))
self.create_node('Department', Departments)
print(len(Departments))
self.create_node('Producer', Producers)
print(len(Producers))
self.create_node('Symptom', Symptoms)
return
'''创建实体关系边'''
def create_graphrels(self):
Drugs, Foods, Checks, Departments, Producers, Symptoms, Diseases, disease_infos, rels_check, rels_recommandeat, rels_noteat, rels_doeat, rels_department, rels_commonddrug, rels_drug_producer, rels_recommanddrug,rels_symptom, rels_acompany, rels_category = self.read_nodes()
self.create_relationship(word_type_enum.DISEASE, 'Food', rels_recommandeat, 'recommand_eat', '推荐食谱')
self.create_relationship(word_type_enum.DISEASE, 'Food', rels_noteat, 'no_eat', '忌吃')
self.create_relationship(word_type_enum.DISEASE, 'Food', rels_doeat, 'do_eat', '宜吃')
self.create_relationship('Department', 'Department', rels_department, 'belongs_to', '属于')
self.create_relationship(word_type_enum.DISEASE, 'Drug', rels_commonddrug, 'common_drug', '常用药品')
self.create_relationship('Producer', 'Drug', rels_drug_producer, 'drugs_of', '生产药品')
self.create_relationship(word_type_enum.DISEASE, 'Drug', rels_recommanddrug, 'recommand_drug', '好评药品')
self.create_relationship(word_type_enum.DISEASE, 'Check', rels_check, 'need_check', '诊断检查')
self.create_relationship(word_type_enum.DISEASE, 'Symptom', rels_symptom, 'has_symptom', '症状')
self.create_relationship(word_type_enum.DISEASE, word_type_enum.DISEASE, rels_acompany, 'acompany_with', '并发症')
self.create_relationship(word_type_enum.DISEASE, 'Department', rels_category, 'belongs_to', '所属科室')
    '''Create relationship edges between entities'''
    def create_relationship(self, start_node, end_node, edges, rel_type, rel_name):
        count = 0
        # De-duplicate edges by joining each pair into a single string key
        set_edges = []
        for edge in edges:
            set_edges.append('###'.join(edge))
        total = len(set(set_edges))  # renamed from `all` to avoid shadowing the builtin
        for edge in set(set_edges):
            edge = edge.split('###')
            p = edge[0]
            q = edge[1]
            query = "match(p:%s),(q:%s) where p.name='%s'and q.name='%s' create (p)-[rel:%s{name:'%s'}]->(q)" % (
                start_node, end_node, p, q, rel_type, rel_name)
            try:
                self.g.run(query)
                count += 1
                print(rel_type, count, total)
            except Exception as e:
                print(e)
        return
'''导出数据'''
def export_data(self):
Drugs, Foods, Checks, Departments, Producers, Symptoms, Diseases, disease_infos, rels_check, rels_recommandeat, rels_noteat, rels_doeat, rels_department, rels_commonddrug, rels_drug_producer, rels_recommanddrug, rels_symptom, rels_acompany, rels_category = self.read_nodes()
f_drug = open('drug.txt', 'w+')
f_food = open('food.txt', 'w+')
f_check = open('check.txt', 'w+')
f_department = open('department.txt', 'w+')
f_producer = open('producer.txt', 'w+')
f_symptom = open('symptoms.txt', 'w+')
f_disease = open('disease.txt', 'w+')
f_drug.write('\n'.join(list(Drugs)))
f_food.write('\n'.join(list(Foods)))
f_check.write('\n'.join(list(Checks)))
f_department.write('\n'.join(list(Departments)))
f_producer.write('\n'.join(list(Producers)))
f_symptom.write('\n'.join(list(Symptoms)))
f_disease.write('\n'.join(list(Diseases)))
f_drug.close()
f_food.close()
f_check.close()
f_department.close()
f_producer.close()
f_symptom.close()
f_disease.close()
return
if __name__ == '__main__':
    handler = MedicalGraph()
    print("step1: importing graph nodes")
    handler.create_graphnodes()
    print("step2: importing graph edges")
    handler.create_graphrels()
#!/usr/bin/env python3
# coding: utf-8
# File: chatbot_graph.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-4
from py2neo import Graph
from question_classifier import QuestionClassifier
from question_parser import QuestionPaser
from answer_search import AnswerSearcher

'''Question answering class'''
class ChatBotGraph:
    def __init__(self):
        self.g = Graph(
            'http://127.0.0.1:7474',
            user="neo4j",
            password="neo4j")
        self.classifier = QuestionClassifier(self.g)
        self.parser = QuestionPaser()
        self.searcher = AnswerSearcher(self.g)

    def chat_main(self, sent):
        # Default reply, returned when classification or search finds nothing
        answer = '您好,我是医药智能助理-小益,希望可以帮到您。如果没答上来,可联系人工客服。祝您身体棒棒!'
        res_classify = self.classifier.classify(sent)
        if not res_classify:
            return answer
        res_sql = self.parser.parser_main(res_classify)
        final_answers = self.searcher.search_main(res_sql)
        if not final_answers:
            return answer
        else:
            return '\n'.join(final_answers)

if __name__ == '__main__':
    handler = ChatBotGraph()
    while True:
        question = input('用户:')
        answer = handler.chat_main(question)
        print('小益:', answer)
#!/usr/bin/env python3
# coding: utf-8
# File: build_data.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-3
import pymongo
from lxml import etree
import os
from max_cut import *
class MedicalGraph:
def __init__(self):
self.conn = pymongo.MongoClient()
cur_dir = '/'.join(os.path.abspath(__file__).split('/')[:-1])
self.db = self.conn['medical']
self.col = self.db['data']
first_words = [i.strip() for i in open(os.path.join(cur_dir, 'first_name.txt'))]
alphabets = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y', 'z']
nums = ['1','2','3','4','5','6','7','8','9','0']
self.stop_words = first_words + alphabets + nums
self.key_dict = {
'医保疾病' : 'yibao_status',
"患病比例" : "get_prob",
"易感人群" : "easy_get",
"传染方式" : "get_way",
"就诊科室" : "cure_department",
"治疗方式" : "cure_way",
"治疗周期" : "cure_lasttime",
"治愈率" : "cured_prob",
'药品明细': 'drug_detail',
'药品推荐': 'recommand_drug',
'推荐': 'recommand_eat',
'忌食': 'not_eat',
'宜食': 'do_eat',
'症状': 'symptom',
'检查': 'check',
'成因': 'cause',
'预防措施': 'prevent',
'所属类别': 'category',
'简介': 'desc',
'名称': 'name',
'常用药品' : 'common_drug',
'治疗费用': 'cost_money',
'并发症': 'acompany'
}
self.cuter = CutWords()
    def collect_medical(self):
        count = 0
        for item in self.col.find():
            data = {}
            basic_info = item['basic_info']
            name = basic_info['name']
            if not name:
                continue
            # Basic information
            data['名称'] = name
            data['简介'] = '\n'.join(basic_info['desc']).replace('\r\n\t', '').replace('\r\n\n\n','').replace(' ','').replace('\r\n','\n')
            data['所属类别'] = basic_info['category']
            attributes = basic_info['attributes']
            # Causes and prevention
            data['预防措施'] = item['prevent_info']
            data['成因'] = item['cause_info']
            # Symptoms: drop entries whose first character is a stop word
            data['症状'] = list(set([i for i in item["symptom_info"][0] if i[0] not in self.stop_words]))
            # Infobox attributes arrive as "key:value" strings
            for attr in attributes:
                attr_pair = attr.split(':')
                if len(attr_pair) == 2:
                    key = attr_pair[0]
                    value = attr_pair[1]
                    data[key] = value
            # Checks: resolve each stored check URL to its name
            inspects = item['inspect_info']
            jcs = []
            for inspect in inspects:
                jc_name = self.get_inspect(inspect)
                if jc_name:
                    jcs.append(jc_name)
            data['检查'] = jcs
            # Food
            food_info = item['food_info']
            if food_info:
                data['宜食'] = food_info['good']
                data['忌食'] = food_info['bad']
                data['推荐'] = food_info['recommand']
            # Drugs
            drug_info = item['drug_info']
            data['药品推荐'] = list(set([i.split('(')[-1].replace(')','') for i in drug_info]))
            data['药品明细'] = drug_info
            # Rename keys to English and normalise the values per field
            data_modify = {}
            for attr, value in data.items():
                attr_en = self.key_dict.get(attr)
                if attr_en:
                    data_modify[attr_en] = value
                if attr_en in ['yibao_status', 'get_prob', 'easy_get', 'get_way', "cure_lasttime", "cured_prob"]:
                    data_modify[attr_en] = value.replace(' ','').replace('\t','')
                elif attr_en in ['cure_department', 'cure_way', 'common_drug']:
                    data_modify[attr_en] = [i for i in value.split(' ') if i]
                elif attr_en in ['acompany']:
                    # Segment the complication text with the disease dictionary
                    acompany = [i for i in self.cuter.max_biward_cut(data_modify[attr_en]) if len(i) > 1]
                    data_modify[attr_en] = acompany
            try:
                # insert_one replaces Collection.insert, which PyMongo 4 removed
                self.db['medical'].insert_one(data_modify)
                count += 1
                print(count)
            except Exception as e:
                print(e)
        return
    def get_inspect(self, url):
        res = self.db['jc'].find_one({'url': url})
        if not res:
            return ''
        else:
            return res['name']

    def modify_jc(self):
        for item in self.db['jc'].find():
            url = item['url']
            content = item['html']
            selector = etree.HTML(content)
            name = selector.xpath('//title/text()')[0].split('结果分析')[0]
            desc = selector.xpath('//meta[@name="description"]/@content')[0].replace('\r\n\t','')
            # update_one replaces Collection.update, which PyMongo 4 removed
            self.db['jc'].update_one({'url': url}, {'$set': {'name': name, 'desc': desc}})


if __name__ == '__main__':
    handler = MedicalGraph()
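The attribute clean-up in `collect_medical` can be tried without MongoDB. The following is a minimal sketch, assuming infobox attributes arrive as `key:value` strings. `normalise_attributes`, `LIST_FIELDS` and the trimmed `KEY_DICT` are hypothetical names used only for illustration, and accepting the full-width colon as a separator is an added assumption rather than something the code above does.

```python
# Hypothetical, trimmed-down copy of key_dict for illustration only
KEY_DICT = {
    '治疗周期': 'cure_lasttime',
    '就诊科室': 'cure_department',
    '治愈率': 'cured_prob',
}

# Fields stored as lists after splitting on spaces (mirrors collect_medical)
LIST_FIELDS = {'cure_department', 'cure_way', 'common_drug'}


def normalise_attributes(attributes):
    """Turn 'key:value' strings into a dict keyed by English field names."""
    record = {}
    for attr in attributes:
        # Assumption: treat the full-width colon like the ASCII one
        pair = attr.replace('\uff1a', ':').split(':')
        if len(pair) != 2:
            continue  # skip malformed attributes
        key, value = pair
        field = KEY_DICT.get(key)
        if field is None:
            continue  # unknown keys are dropped
        if field in LIST_FIELDS:
            record[field] = [v for v in value.split(' ') if v]
        else:
            record[field] = value.replace(' ', '').replace('\t', '')
    return record


if __name__ == '__main__':
    sample = ['治疗周期\uff1a7-14 天', '就诊科室:内科 呼吸内科', '无效条目']
    print(normalise_attributes(sample))
```

Keeping the mapping and the per-field rules in plain data structures makes it easy to add a field without touching the loop.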
#!/usr/bin/env python3
# coding: utf-8
# File: data_spider.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-3
import urllib.request
import urllib.parse
from lxml import etree
import pymongo
import re
class CrimeSpider:
    '''Collects disease data from the xywy.com medical site'''
    def __init__(self):
        self.conn = pymongo.MongoClient()
        self.db = self.conn['medical']
        self.col = self.db['data']
    def get_html(self, url):
        '''Request the HTML for a URL'''
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                                 'Chrome/51.0.2704.63 Safari/537.36'}
        req = urllib.request.Request(url=url, headers=headers)
        res = urllib.request.urlopen(req)
        html = res.read().decode('gbk')
        return html
    def url_parser(self, content):
        '''Parse URLs out of a page'''
        selector = etree.HTML(content)
        urls = ['http://www.anliguan.com' + i for i in selector.xpath('//h2[@class="item-title"]/a/@href')]
        return urls
    def spider_main(self):
        '''Test run: crawl every disease page and store the parsed sections'''
        for page in range(1, 11000):
            try:
                basic_url = 'http://jib.xywy.com/il_sii/gaishu/%s.htm'%page
                cause_url = 'http://jib.xywy.com/il_sii/cause/%s.htm'%page
                prevent_url = 'http://jib.xywy.com/il_sii/prevent/%s.htm'%page
                symptom_url = 'http://jib.xywy.com/il_sii/symptom/%s.htm'%page
                inspect_url = 'http://jib.xywy.com/il_sii/inspect/%s.htm'%page
                treat_url = 'http://jib.xywy.com/il_sii/treat/%s.htm'%page
                food_url = 'http://jib.xywy.com/il_sii/food/%s.htm'%page
                drug_url = 'http://jib.xywy.com/il_sii/drug/%s.htm'%page
                data = {}
                data['url'] = basic_url
                data['basic_info'] = self.basicinfo_spider(basic_url)
                data['cause_info'] = self.common_spider(cause_url)
                data['prevent_info'] = self.common_spider(prevent_url)
                data['symptom_info'] = self.symptom_spider(symptom_url)
                data['inspect_info'] = self.inspect_spider(inspect_url)
                data['treat_info'] = self.treat_spider(treat_url)
                data['food_info'] = self.food_spider(food_url)
                data['drug_info'] = self.drug_spider(drug_url)
                print(page, basic_url)
                # insert_one replaces Collection.insert, which PyMongo 4 removed
                self.col.insert_one(data)
            except Exception as e:
                print(e, page)
        return
    def basicinfo_spider(self, url):
        '''Parse the basic information page'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        title = selector.xpath('//title/text()')[0]
        category = selector.xpath('//div[@class="wrap mt10 nav-bar"]/a/text()')
        desc = selector.xpath('//div[@class="jib-articl-con jib-lh-articl"]/p/text()')
        ps = selector.xpath('//div[@class="mt20 articl-know"]/p')
        infobox = []
        for p in ps:
            info = p.xpath('string(.)').replace('\r','').replace('\n','').replace('\xa0', '').replace(' ', '').replace('\t','')
            infobox.append(info)
        basic_data = {}
        basic_data['category'] = category
        basic_data['name'] = title.split('的简介')[0]
        basic_data['desc'] = desc
        basic_data['attributes'] = infobox
        return basic_data
    def treat_spider(self, url):
        '''Parse the treatment infobox'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        ps = selector.xpath('//div[starts-with(@class,"mt20 articl-know")]/p')
        infobox = []
        for p in ps:
            info = p.xpath('string(.)').replace('\r','').replace('\n','').replace('\xa0', '').replace(' ', '').replace('\t','')
            infobox.append(info)
        return infobox
    def drug_spider(self, url):
        '''Parse the recommended drugs'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        drugs = [i.replace('\n','').replace('\t', '').replace(' ','') for i in selector.xpath('//div[@class="fl drug-pic-rec mr30"]/p/a/text()')]
        return drugs
    def food_spider(self, url):
        '''Parse the dietary advice (good, bad and recommended foods)'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        divs = selector.xpath('//div[@class="diet-img clearfix mt20"]')
        try:
            food_data = {}
            food_data['good'] = divs[0].xpath('./div/p/text()')
            food_data['bad'] = divs[1].xpath('./div/p/text()')
            food_data['recommand'] = divs[2].xpath('./div/p/text()')
        except IndexError:
            # Fewer than three diet blocks on the page: treat as no food data
            return {}
        return food_data
    def symptom_spider(self, url):
        '''Parse symptom names and the detail paragraphs'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        symptoms = selector.xpath('//a[@class="gre" ]/text()')
        ps = selector.xpath('//p')
        detail = []
        for p in ps:
            info = p.xpath('string(.)').replace('\r','').replace('\n','').replace('\xa0', '').replace(' ', '').replace('\t','')
            detail.append(info)
        # Returned as a pair; downstream code reads the symptom names at index 0
        return symptoms, detail
    def inspect_spider(self, url):
        '''Parse the check items, returning their URLs'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        inspects = selector.xpath('//li[@class="check-item"]/a/@href')
        return inspects
    def common_spider(self, url):
        '''Generic parser: join the text of all non-empty paragraphs'''
        html = self.get_html(url)
        selector = etree.HTML(html)
        ps = selector.xpath('//p')
        infobox = []
        for p in ps:
            info = p.xpath('string(.)').replace('\r', '').replace('\n', '').replace('\xa0', '').replace(' ','').replace('\t', '')
            if info:
                infobox.append(info)
        return '\n'.join(infobox)
    def inspect_crawl(self):
        '''Crawl the check-item pages and store their raw HTML'''
        for page in range(1, 3685):
            try:
                url = 'http://jck.xywy.com/jc_%s.html'%page
                html = self.get_html(url)
                data = {}
                data['url']= url
                data['html'] = html
                # insert_one replaces Collection.insert, which PyMongo 4 removed
                self.db['jc'].insert_one(data)
                print(url)
            except Exception as e:
                print(e)


if __name__ == '__main__':
    handler = CrimeSpider()
    handler.inspect_crawl()
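Each `*_spider` method repeats the same chain of `.replace` calls, and `spider_main` builds eight URLs from one page number. A minimal sketch of factoring both out, assuming the URL pattern shown in `spider_main`. `clean_text`, `build_urls` and `SECTIONS` are hypothetical names that do not exist in the original, and no network request is made.

```python
# Characters the spider strips from extracted text: \r, \n, \xa0, space, \t
_STRIP_TABLE = str.maketrans('', '', '\r\n\xa0 \t')

# Section slugs copied from the URLs in spider_main, keyed by stored field name
SECTIONS = {
    'basic_info': 'gaishu',
    'cause_info': 'cause',
    'prevent_info': 'prevent',
    'symptom_info': 'symptom',
    'inspect_info': 'inspect',
    'treat_info': 'treat',
    'food_info': 'food',
    'drug_info': 'drug',
}


def clean_text(text):
    """Remove all of the unwanted whitespace characters in one pass."""
    return text.translate(_STRIP_TABLE)


def build_urls(page):
    """Return the URL of every section page for one disease page number."""
    return {
        field: 'http://jib.xywy.com/il_sii/%s/%s.htm' % (slug, page)
        for field, slug in SECTIONS.items()
    }


if __name__ == '__main__':
    print(clean_text(' 感冒\r\n\t\xa0症状 '))
    print(build_urls(1)['basic_info'])
```

A translation table does the same work as five chained `replace` calls and keeps the list of stripped characters in one place.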
#!/usr/bin/env python3
# coding: utf-8
# File: maxmatch.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-3-26
class CutWords:
    def __init__(self):
        dict_path = './disease.txt'
        self.word_dict, self.max_wordlen = self.load_words(dict_path)

    # Load the dictionary
    def load_words(self, dict_path):
        words = list()
        max_len = 0
        with open(dict_path) as f:
            for line in f:
                wd = line.strip()
                if not wd:
                    continue
                if len(wd) > max_len:
                    max_len = len(wd)
                words.append(wd)
        return words, max_len

    # Forward maximum matching
    def max_forward_cut(self, sent):
        # 1. Scanning left to right, take up to m characters as the candidate, where m is the length of the longest dictionary entry.
        # 2. Look the candidate up in the dictionary; on a match, cut it out as one word.
        cutlist = []
        index = 0
        while index < len(sent):
            matched = False
            for i in range(self.max_wordlen, 0, -1):
                cand_word = sent[index: index + i]
                if cand_word in self.word_dict:
                    cutlist.append(cand_word)
                    matched = True
                    break
            # No match: cut a single character
            if not matched:
                i = 1
                cutlist.append(sent[index])
            index += i
        return cutlist

    # Backward maximum matching
    def max_backward_cut(self, sent):
        # 1. Scanning right to left, take up to m characters as the candidate, where m is the length of the longest dictionary entry.
        # 2. Look the candidate up in the dictionary; on a match, cut it out as one word.
        cutlist = []
        index = len(sent)
        while index > 0:
            matched = False
            # Never reach back past the start of the sentence: a negative
            # slice start would wrap around and skip characters
            for i in range(min(self.max_wordlen, index), 0, -1):
                cand_word = sent[index - i: index]
                # On a match, add the dictionary word to the result
                if cand_word in self.word_dict:
                    cutlist.append(cand_word)
                    matched = True
                    break
            # No match: cut a single character
            if not matched:
                i = 1
                cutlist.append(sent[index - 1])
            index -= i
        return cutlist[::-1]

    # Bidirectional maximum matching
    def max_biward_cut(self, sent):
        # Compare the forward and backward segmentations and pick one.
        # Heuristics:
        # 1. If the two results differ in word count, take the one with fewer words.
        # 2. If the word counts are equal: a. identical results are unambiguous, so return either;
        #    b. otherwise return the one with fewer single-character words.
        forward_cutlist = self.max_forward_cut(sent)
        backward_cutlist = self.max_backward_cut(sent)
        count_forward = len(forward_cutlist)
        count_backward = len(backward_cutlist)

        def compute_single(word_list):
            num = 0
            for word in word_list:
                if len(word) == 1:
                    num += 1
            return num

        if count_forward == count_backward:
            if compute_single(forward_cutlist) > compute_single(backward_cutlist):
                return backward_cutlist
            else:
                return forward_cutlist
        elif count_backward > count_forward:
            return forward_cutlist
        else:
            return backward_cutlist
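To make the heuristics concrete, here is a small self-contained sketch of forward, backward and bidirectional maximum matching over an in-memory dictionary. The toy word set and the function names are illustrative assumptions and not part of the project. The sketch looks words up in a `set` and clamps the window so that slices never run past either end of the sentence.

```python
def forward_cut(sent, words, max_len):
    """Greedy longest match, scanning left to right."""
    cut, index = [], 0
    while index < len(sent):
        step = 1  # fall back to a single character
        for size in range(min(max_len, len(sent) - index), 1, -1):
            if sent[index:index + size] in words:
                step = size
                break
        cut.append(sent[index:index + step])
        index += step
    return cut


def backward_cut(sent, words, max_len):
    """Greedy longest match, scanning right to left."""
    cut, index = [], len(sent)
    while index > 0:
        step = 1  # fall back to a single character
        for size in range(min(max_len, index), 1, -1):
            if sent[index - size:index] in words:
                step = size
                break
        cut.append(sent[index - step:index])
        index -= step
    return cut[::-1]


def count_singles(cut):
    return sum(1 for word in cut if len(word) == 1)


def biward_cut(sent, words, max_len):
    """Prefer fewer words, then fewer single-character words."""
    fwd = forward_cut(sent, words, max_len)
    bwd = backward_cut(sent, words, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd
    return bwd if count_singles(fwd) > count_singles(bwd) else fwd


if __name__ == '__main__':
    toy_words = {'研究', '研究生', '生命', '起源'}
    longest = max(len(w) for w in toy_words)
    print(forward_cut('研究生命起源', toy_words, longest))
    print(backward_cut('研究生命起源', toy_words, longest))
    print(biward_cut('研究生命起源', toy_words, longest))
```

On this sentence the two directions produce the same number of words, so the single-character count decides the result.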
#!/usr/bin/env python3
# coding: utf-8
# File: question_classifier.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-4
import os
import ahocorasick
from type_enum import word_type_enum, question_type_enum
class QuestionClassifier:
def __init__(self, g):
self.g = g
cur_dir = os.path.dirname(os.path.abspath(__file__))
# Path to the feature-word file
self.deny_path = os.path.join(cur_dir, 'dict/deny.txt')
# Load feature words
# 1. Diseases
self.disease_wds = self.init_name_by_node('Disease')
self.department_wds = self.init_name_by_node('Department')
self.check_wds = self.init_name_by_node('Check')
self.producer_wds = self.init_name_by_node('Producer')
self.symptom_wds = self.init_name_by_node('Symptom')
# 2. Food
self.food_wds = self.init_name_by_node('Food')
self.food_material_wds = self.init_name_by_node('FoodMaterial')
self.food_package_wds = self.init_name_by_node('FoodPackage')
self.food_process_wds = self.init_name_by_node('FoodProcess')
self.cooking_style_wds = self.init_name_by_node('CookingStyle')
self.index_level_wds = self.init_name_by_node('IndexLevel')
self.index_level_wds += [x[:2] for x in self.index_level_wds]
self.province_wds = self.init_name_by_node('Province')
self.nutrient_wds = self.init_name_by_node('Nutrient')
# 3. Exercise
self.exercise_wds = self.init_name_by_node('Exercise')
# 4. Drugs
self.drug_wds = self.init_name_by_node('Drug')
self.region_words = set(self.department_wds + self.disease_wds + self.check_wds + self.producer_wds + self.symptom_wds
+ self.food_wds + self.food_material_wds + self.food_package_wds + self.food_process_wds
+ self.cooking_style_wds + self.index_level_wds + self.province_wds + self.nutrient_wds
+ self.exercise_wds
+ self.drug_wds)
self.deny_words = [i.strip() for i in open(self.deny_path) if i.strip()]
# Build the Aho-Corasick automaton over all domain words
self.region_tree = self.build_actree(list(self.region_words))
# Build the word-to-type dictionary
self.wdtype_dict = self.build_wdtype_dict()
# Question keywords
# 1. Diseases
self.symptom_qwds = ['症状', '表征', '现象', '症候', '表现']
self.cause_qwds = ['原因', '成因', '为什么', '怎么会', '怎样才', '咋样才', '怎样会', '如何会', '为啥', '为何', '如何才会', '怎么才会', '会导致', '会造成']
self.acompany_qwds = ['并发症', '并发', '一起发生', '一并发生', '一起出现', '一并出现', '一同发生', '一同出现', '伴随发生', '伴随', '共现']
self.food_qwds = ['饮食', '饮用', '吃', '食', '伙食', '膳食', '喝', '菜', '忌口', '补品', '保健品', '食谱', '菜谱', '食用', '食物']
self.drug_qwds = ['药', '药品', '用药', '胶囊', '口服液', '炎片']
self.prevent_qwds = ['预防', '防范', '抵制', '抵御', '防止', '躲避', '逃避', '避开', '免得', '逃开', '避开', '避掉', '躲开', '躲掉', '绕开',
'怎样才能不', '怎么才能不', '咋样才能不', '咋才能不', '如何才能不',
'怎样才不', '怎么才不', '咋样才不', '咋才不', '如何才不',
'怎样才可以不', '怎么才可以不', '咋样才可以不', '咋才可以不', '如何可以不',
'怎样才可不', '怎么才可不', '咋样才可不', '咋才可不', '如何可不']
self.lasttime_qwds = ['周期', '多久', '多长时间', '多少时间', '几天', '几年', '多少天', '多少小时', '几个小时', '多少年']
self.cureway_qwds = ['怎么治疗', '如何医治', '怎么医治', '怎么治', '怎么医', '如何治', '医治方式', '疗法', '咋治', '怎么办', '咋办']
self.cureprob_qwds = ['多大概率能治好', '多大几率能治好', '治好希望大么', '几率', '几成', '比例', '可能性', '能治', '可治', '可以治', '可以医']
self.easyget_qwds = ['易感人群', '容易感染', '易发人群', '什么人', '哪些人', '感染', '染上', '得上']
self.check_qwds = ['检查', '检查项目', '查出', '测出', '试出']
self.belong_qwds = ['属于什么科', '属于', '什么科', '科室']
self.cure_qwds = ['治疗什么', '治啥', '治疗啥', '医治啥', '治愈啥', '主治啥', '主治什么', '有什么用', '有何用', '用处', '用途',
'有什么好处', '有什么益处', '有何益处', '用来', '用来做啥', '用来作甚', '需要', '要']
# 2. Food
self.crowd_qwds = ['什么人', '啥人', '啥样的人', '哪种人', '人群', '群体', '哪些人']
self.cooking_specialty_qwds = ['烹饪特点']
self.food_specialty_qwds = ['膳食特点']
self.province_qwds = ['地方', '地点', '地', '地域', '地区', '哪里']
self.unit_qwds = ['单位', '扩展单位', '量词']
self.content_qwds = ['含有', '有', '包含', '含', '包括', '营养', '元素', '成分']
self.alias_qwds = ['别名', '化名', '别称', '叫什么', '称为', '近似', '类似', '差不多', '称呼', '菜名', '叫法', '其他']
self.nature_qwds = ['性质']
self.cooking_style_qwds = ['哪些菜', '什么菜']
self.index_level_high_qwds = ['高', '多', '富']
self.index_level_mid_qwds = ['中']
self.index_level_low_qwds = ['低', '少']
print('model init finished ......')
return
def init_name_by_node(self, node_name):
query = f"MATCH (m:{node_name}) return m.name"
res = self.g.run(query).data()
return [x['m.name'] for x in res]
'''Main classification function'''
def classify(self, question):
data = {}
medical_dict = self.check_medical(question)
if not medical_dict:
return {}
data['args'] = medical_dict
# Collect the entity types mentioned in the question
types = []
for type_ in medical_dict.values():
types += type_
question_types = []
# Symptoms
if self.check_words(self.symptom_qwds, question) and (word_type_enum.DISEASE in types):
question_type = question_type_enum.DISEASE_SYMPTOM
question_types.append(question_type)
if self.check_words(self.symptom_qwds, question) and (word_type_enum.SYMPTOM in types):
question_type = question_type_enum.SYMPTOM_DISEASE
question_types.append(question_type)
# Causes
if self.check_words(self.cause_qwds, question) and (word_type_enum.DISEASE in types):
question_type = question_type_enum.DISEASE_CAUSE
question_types.append(question_type)
# Complications
if self.check_words(self.acompany_qwds, question) and (word_type_enum.DISEASE in types):
question_type = question_type_enum.DISEASE_ACOMPANY
question_types.append(question_type)
# Recommended food
if self.check_words(self.food_qwds, question) and word_type_enum.DISEASE in types:
deny_status = self.check_words(self.deny_words, question)
if deny_status:
question_type = question_type_enum.DISEASE_NOT_FOOD
else:
question_type = question_type_enum.DISEASE_DO_FOOD
question_types.append(question_type)
# Given a food, find diseases
if self.check_words(self.food_qwds + self.cure_qwds, question) and word_type_enum.FOOD in types and word_type_enum.DISEASE in types:
deny_status = self.check_words(self.deny_words, question)
if deny_status:
question_type = question_type_enum.FOOD_NOT_DISEASE
else:
question_type = question_type_enum.FOOD_DO_DISEASE
question_types.append(question_type)
# Recommended drugs
if self.check_words(self.drug_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_DRUG
question_types.append(question_type)
# Which diseases a drug treats
if self.check_words(self.cure_qwds, question) and word_type_enum.DRUG in types:
question_type = question_type_enum.DRUG_DISEASE
question_types.append(question_type)
# Checks required for a disease
if self.check_words(self.check_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_CHECK
question_types.append(question_type)
# Given a check, find the related diseases
if self.check_words(self.check_qwds + self.cure_qwds, question) and word_type_enum.CHECK in types:
question_type = question_type_enum.CHECK_DISEASE
question_types.append(question_type)
# Disease prevention
if self.check_words(self.prevent_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_PREVENT
question_types.append(question_type)
# Treatment duration of a disease
if self.check_words(self.lasttime_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_LASTTIME
question_types.append(question_type)
# Treatment methods for a disease
if self.check_words(self.cureway_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_CUREWAY
question_types.append(question_type)
# Likelihood of curing a disease
if self.check_words(self.cureprob_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_CUREPROB
question_types.append(question_type)
# Groups susceptible to a disease
if self.check_words(self.easyget_qwds, question) and word_type_enum.DISEASE in types:
question_type = question_type_enum.DISEASE_EASYGET
question_types.append(question_type)
# If no question type matched but a disease was mentioned, fall back to the disease description
if question_types == [] and word_type_enum.DISEASE in types:
question_types = [question_type_enum.DISEASE_DESC]
# If no question type matched but a symptom was mentioned, fall back to looking up diseases for that symptom
if question_types == [] and word_type_enum.SYMPTOM in types:
question_types = [question_type_enum.SYMPTOM_DISEASE]
# Given a FoodMaterial, find the groups of people
if self.check_words(self.crowd_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
deny_status = self.check_words(self.deny_words, question)
if deny_status:
question_type = question_type_enum.FOODMATERIAL_NOT_CROWD
else:
question_type = question_type_enum.FOODMATERIAL_DO_CROWD
question_types.append(question_type)
# Given a FoodProcess, find the groups of people
if self.check_words(self.crowd_qwds, question) and word_type_enum.FOOD_PROCESS in types:
deny_status = self.check_words(self.deny_words, question)
if deny_status:
question_type = question_type_enum.FOODPROCESS_NOT_CROWD
else:
question_type = question_type_enum.FOODPROCESS_DO_CROWD
question_types.append(question_type)
# Food, content relation, level: given 1 and 2, find 3
if self.check_words(self.index_level_high_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_LEVEL_HIGH
question_types.append(question_type)
if self.check_words(self.index_level_mid_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_LEVEL_MID
question_types.append(question_type)
if self.check_words(self.index_level_low_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_LEVEL_LOW
question_types.append(question_type)
if self.check_words(self.index_level_high_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_LEVEL_HIGH
question_types.append(question_type)
if self.check_words(self.index_level_mid_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_LEVEL_MID
question_types.append(question_type)
if self.check_words(self.index_level_low_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_LEVEL_LOW
question_types.append(question_type)
# Food, content relation, level: given 3 and 2, find 1
if self.check_words(self.index_level_high_qwds, question) and word_type_enum.INDEX_LEVEL in types:
question_type = question_type_enum.INDEX_LEVEL_HIGH
question_types.append(question_type)
if self.check_words(self.index_level_mid_qwds, question) and word_type_enum.INDEX_LEVEL in types:
question_type = question_type_enum.INDEX_LEVEL_MID
question_types.append(question_type)
if self.check_words(self.index_level_low_qwds, question) and word_type_enum.INDEX_LEVEL in types:
question_type = question_type_enum.INDEX_LEVEL_LOW
question_types.append(question_type)
# Food, content relation, level: given 1 and 3, find 2
if self.check_words(self.content_qwds + self.index_level_high_qwds + self.index_level_mid_qwds + self.index_level_low_qwds,
question) and word_type_enum.INDEX_LEVEL in types and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_LEVEL_CONTENT
question_types.append(question_type)
if self.check_words(self.content_qwds + self.index_level_high_qwds + self.index_level_mid_qwds + self.index_level_low_qwds,
question) and word_type_enum.INDEX_LEVEL in types and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_LEVEL_CONTENT
question_types.append(question_type)
# Given a food, find its cooking characteristics
if self.check_words(self.cooking_specialty_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_COOKING_SPECIALTY
question_types.append(question_type)
if self.check_words(self.cooking_specialty_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_COOKING_SPECIALTY
question_types.append(question_type)
# Given a food, find its dietary characteristics
if self.check_words(self.food_specialty_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_FOOD_SPECIALTY
question_types.append(question_type)
if self.check_words(self.food_specialty_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_FOOD_SPECIALTY
question_types.append(question_type)
# Given a food, find its region
if self.check_words(self.province_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_PROVINCE
question_types.append(question_type)
if self.check_words(self.province_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_PROVINCE
question_types.append(question_type)
# Given a region, find its foods
if self.check_words(self.food_qwds, question) and word_type_enum.PROVINCE in types:
question_type = question_type_enum.PROVINCE_FOOD
question_types.append(question_type)
# Given a food, find its units
if self.check_words(self.unit_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_UNIT
question_types.append(question_type)
if self.check_words(self.unit_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_UNIT
question_types.append(question_type)
# Given a food, find the nutrients it is rich in
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_high_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_HIGH_CONT
question_types.append(question_type)
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_high_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_HIGH_CONT
question_types.append(question_type)
# Given a nutrient, find foods rich in it
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_high_qwds, question) and word_type_enum.NUTRIENT in types:
question_type = question_type_enum.NUTRIENT_HIGH_CONT
question_types.append(question_type)
# Given a food, find the nutrients it is low in
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_low_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_LOW_CONT
question_types.append(question_type)
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_low_qwds, question) and word_type_enum.FOOD_MATERIAL in types:
question_type = question_type_enum.FOODMATERIAL_LOW_CONT
question_types.append(question_type)
# Given a nutrient, find foods low in it
if self.check_words(self.content_qwds, question) and self.check_words(self.index_level_low_qwds, question) and word_type_enum.NUTRIENT in types:
question_type = question_type_enum.NUTRIENT_LOW_CONT
question_types.append(question_type)
# Given a dish, find its aliases
if self.check_words(self.alias_qwds, question) and word_type_enum.FOOD_PROCESS in types:
question_type = question_type_enum.FOODPROCESS_ALIAS
question_types.append(question_type)
# Merge the classification results into one dictionary
data['question_types'] = question_types
return data
'''Build the mapping from each word to its entity types'''
def build_wdtype_dict(self):
wd_dict = dict()
for wd in self.region_words:
wd_dict[wd] = []
if wd in self.disease_wds:
wd_dict[wd].append(word_type_enum.DISEASE)
if wd in self.department_wds:
wd_dict[wd].append(word_type_enum.DEPARTMENT)
if wd in self.check_wds:
wd_dict[wd].append(word_type_enum.CHECK)
if wd in self.symptom_wds:
wd_dict[wd].append(word_type_enum.SYMPTOM)
if wd in self.producer_wds:
wd_dict[wd].append(word_type_enum.PRODUCER)
if wd in self.food_wds:
wd_dict[wd].append(word_type_enum.FOOD)
if wd in self.food_material_wds:
wd_dict[wd].append(word_type_enum.FOOD_MATERIAL)
if wd in self.food_package_wds:
wd_dict[wd].append(word_type_enum.FOOD_PACKAGE)
if wd in self.food_process_wds:
wd_dict[wd].append(word_type_enum.FOOD_PROCESS)
if wd in self.cooking_style_wds:
wd_dict[wd].append(word_type_enum.COOKING_STYLE)
if wd in self.index_level_wds:
wd_dict[wd].append(word_type_enum.INDEX_LEVEL)
if wd in self.province_wds:
wd_dict[wd].append(word_type_enum.PROVINCE)
if wd in self.nutrient_wds:
wd_dict[wd].append(word_type_enum.NUTRIENT)
if wd in self.exercise_wds:
wd_dict[wd].append(word_type_enum.EXERCISE)
if wd in self.drug_wds:
wd_dict[wd].append(word_type_enum.DRUG)
return wd_dict
'''Build the Aho-Corasick automaton to speed up filtering'''
def build_actree(self, wordlist):
actree = ahocorasick.Automaton()
for index, word in enumerate(wordlist):
actree.add_word(word, (index, word))
actree.make_automaton()
return actree
'''Filter the question for known entities'''
def check_medical(self, question):
region_wds = []
for i in self.region_tree.iter(question):
wd = i[1][1]
region_wds.append(wd)
stop_wds = []
for wd1 in region_wds:
for wd2 in region_wds:
if wd1 in wd2 and wd1 != wd2:
stop_wds.append(wd1)
final_wds = [i for i in region_wds if i not in stop_wds]
final_dict = {i: self.wdtype_dict.get(i) for i in final_wds}
return final_dict
'''Classify by feature words'''
def check_words(self, wds, sent):
for wd in wds:
if wd in sent:
return True
return False
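if __name__ == '__main__':
    # QuestionClassifier requires a graph handle exposing run(query).data().
    # Assumption: py2neo is used; the URI and credentials below are placeholders,
    # not values from this project, and must match your own neo4j instance.
    from py2neo import Graph
    handler = QuestionClassifier(Graph('bolt://localhost:7687', auth=('neo4j', 'password')))
    while 1:
        question = input('input a question:')
        data = handler.classify(question)
        print(data)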
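`check_medical` keeps only the longest matched entity names, and `classify` then pairs question keywords with the entity types that were found. A minimal sketch of both steps, using plain substring search in place of the Aho-Corasick automaton. The tiny vocabulary, the string type labels and the function names are illustrative assumptions and do not correspond to the project's `type_enum` values.

```python
# Hypothetical vocabulary: entity name -> entity types
WORD_TYPES = {
    '肺炎': ['disease'],
    '小儿肺炎': ['disease'],
    '咳嗽': ['symptom'],
}

# A shortened keyword list in the spirit of symptom_qwds
SYMPTOM_QWDS = ['症状', '表现']


def find_entities(question):
    """Return {entity: types}, dropping names contained in a longer match."""
    hits = [w for w in WORD_TYPES if w in question]
    kept = [
        w for w in hits
        if not any(w != other and w in other for other in hits)
    ]
    return {w: WORD_TYPES[w] for w in kept}


def classify(question):
    """Return the matched entities and the question types they imply."""
    entities = find_entities(question)
    if not entities:
        return {}
    types = {t for ts in entities.values() for t in ts}
    question_types = []
    if any(q in question for q in SYMPTOM_QWDS) and 'disease' in types:
        question_types.append('disease_symptom')
    # Fallback: a bare disease mention asks for its description
    if not question_types and 'disease' in types:
        question_types.append('disease_desc')
    return {'args': entities, 'question_types': question_types}


if __name__ == '__main__':
    print(classify('小儿肺炎有什么症状'))
    print(classify('肺炎'))
```

Substring search is fine for a handful of words; the automaton matters once the vocabulary reaches the tens of thousands of names stored in the graph.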
#!/usr/bin/env python3
# coding: utf-8
# File: question_parser.py
# Author: lhy<lhy_in_blcu@126.com,https://huangyong.github.io>
# Date: 18-10-4
from type_enum import word_type_enum, question_type_enum
class QuestionPaser:
    def build_entitydict(self, args):
        '''Build the entity dictionary: group entity names by entity type'''
        entity_dict = {}
        for arg, types in args.items():
            # Named entity_type so the builtin `type` is not shadowed
            for entity_type in types:
                entity_dict.setdefault(entity_type, []).append(arg)
        return entity_dict
'''Main parsing function'''
def parser_main(self, res_classify):
args = res_classify['args']
entity_dict = self.build_entitydict(args)
question_types = res_classify['question_types']
sqls = []
for question_type in question_types:
sql_ = {}
sql_['question_type'] = question_type
sql = []
# Which symptoms a disease has
if question_type == question_type_enum.DISEASE_SYMPTOM:
sql = ["MATCH (m:Disease)-[r:has_symptom]->(n:Symptom) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Which diseases a symptom may indicate
elif question_type == question_type_enum.SYMPTOM_DISEASE:
sql = ["MATCH (m:Disease)-[r:has_symptom]->(n:Symptom) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.SYMPTOM)]
# Causes of a disease
elif question_type == question_type_enum.DISEASE_CAUSE:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.cause".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Complications of a disease
elif question_type == question_type_enum.DISEASE_ACOMPANY:
sql1 = ["MATCH (m:Disease)-[r:acompany_with]->(n:Disease) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql2 = ["MATCH (m:Disease)-[r:acompany_with]->(n:Disease) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql = sql1 + sql2
# Foods to avoid for a disease
elif question_type == question_type_enum.DISEASE_NOT_FOOD:
sql = ["MATCH (m:Disease)-[r:no_eat]->(n:Food) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Foods recommended for a disease
elif question_type == question_type_enum.DISEASE_DO_FOOD:
sql1 = ["MATCH (m:Disease)-[r:do_eat]->(n:Food) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql2 = ["MATCH (m:Disease)-[r:recommend_eat]->(n:Food) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql = sql1 + sql2
# Given a food to avoid, find the diseases
elif question_type == question_type_enum.FOOD_NOT_DISEASE:
sql = ["MATCH (m:Disease)-[r:no_eat]->(n:Food) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD)]
# Given a recommended food, find the diseases
elif question_type == question_type_enum.FOOD_DO_DISEASE:
sql1 = ["MATCH (m:Disease)-[r:do_eat]->(n:Food) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD)]
sql2 = ["MATCH (m:Disease)-[r:recommend_eat]->(n:Food) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD)]
sql = sql1 + sql2
# Common drugs for a disease (TODO: expand drug aliases)
elif question_type == question_type_enum.DISEASE_DRUG:
sql1 = ["MATCH (m:Disease)-[r:common_drug]->(n:Drug) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql2 = ["MATCH (m:Disease)-[r:recommand_drug]->(n:Drug) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
sql = sql1 + sql2
# Given a drug, find the diseases it can treat
elif question_type == question_type_enum.DRUG_DISEASE:
sql1 = ["MATCH (m:Disease)-[r:common_drug]->(n:Drug) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DRUG)]
sql2 = ["MATCH (m:Disease)-[r:recommand_drug]->(n:Drug) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DRUG)]
sql = sql1 + sql2
# Checks a disease requires
elif question_type == question_type_enum.DISEASE_CHECK:
sql = ["MATCH (m:Disease)-[r:need_check]->(n:Check) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Given a check, find the diseases
elif question_type == question_type_enum.CHECK_DISEASE:
sql = ["MATCH (m:Disease)-[r:need_check]->(n:Check) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.CHECK)]
# Preventive measures for a disease
elif question_type == question_type_enum.DISEASE_PREVENT:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.prevent".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Duration of a disease
elif question_type == question_type_enum.DISEASE_LASTTIME:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.cure_lasttime".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Treatment methods for a disease
elif question_type == question_type_enum.DISEASE_CUREWAY:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.cure_way".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Cure probability of a disease
elif question_type == question_type_enum.DISEASE_CUREPROB:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.cured_prob".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Groups susceptible to a disease
elif question_type == question_type_enum.DISEASE_EASYGET:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.easy_get".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Description of a disease
elif question_type == question_type_enum.DISEASE_DESC:
sql = ["MATCH (m:Disease) where m.name = '{0}' return m.name, m.desc".format(i) for i in entity_dict.get(word_type_enum.DISEASE)]
# Given a food material, find the groups who should avoid it
elif question_type == question_type_enum.FOODMATERIAL_NOT_CROWD:
sql = ["MATCH (m:FoodMaterial)-[r:taboo_crowd]->(n:Crowd) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a food material, find the groups it is recommended for
elif question_type == question_type_enum.FOODMATERIAL_DO_CROWD:
sql = ["MATCH (m:FoodMaterial)-[r:recommend_crowd]->(n:Crowd) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a processed food, find the groups who should avoid it
elif question_type == question_type_enum.FOODPROCESS_NOT_CROWD:
sql = ["MATCH (m:FoodProcess)-[r:taboo_crowd]->(n:Crowd) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
# Given a processed food, find the groups it is recommended for
elif question_type == question_type_enum.FOODPROCESS_DO_CROWD:
sql = ["MATCH (m:FoodProcess)-[r:recommend_crowd]->(n:Crowd) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
# Food, content relation, level: given 1 and 3, find 2
elif question_type == question_type_enum.FOODPROCESS_LEVEL_CONTENT:
sql = ["MATCH (m:FoodProcess)-[r]->(n:IndexLevel) where m.name = '{0}' AND n.name contains '{1}' return m.name, type(r) AS type, n.name".format(i, j) for i in
entity_dict.get(word_type_enum.FOOD_PROCESS) for j in entity_dict.get(word_type_enum.INDEX_LEVEL)]
elif question_type == question_type_enum.FOODMATERIAL_LEVEL_CONTENT:
sql = ["MATCH (m:FoodMaterial)-[r]->(n:IndexLevel) where m.name = '{0}' AND n.name contains '{1}' return m.name, type(r) AS type, n.name".format(i, j) for i in
entity_dict.get(word_type_enum.FOOD_MATERIAL) for j in entity_dict.get(word_type_enum.INDEX_LEVEL)]
# Triple (food, content relation, level): given items 1 and 2, find item 3
elif question_type == question_type_enum.FOODPROCESS_LEVEL_HIGH:
sql = ["MATCH (m:FoodProcess)-[r:HIGH_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODPROCESS_LEVEL_MID:
sql = ["MATCH (m:FoodProcess)-[r:MIDDLE_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODPROCESS_LEVEL_LOW:
sql = ["MATCH (m:FoodProcess)-[r:LOW_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_LEVEL_HIGH:
sql = ["MATCH (m:FoodMaterial)-[r:HIGH_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
elif question_type == question_type_enum.FOODMATERIAL_LEVEL_MID:
sql = ["MATCH (m:FoodMaterial)-[r:MIDDLE_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
elif question_type == question_type_enum.FOODMATERIAL_LEVEL_LOW:
sql = ["MATCH (m:FoodMaterial)-[r:LOW_TO]->(n:IndexLevel) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Triple (food, content relation, level): given items 3 and 2, find item 1
elif question_type == question_type_enum.INDEX_LEVEL_HIGH:
sql = ["MATCH (m:FoodProcess)-[r:HIGH_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
sql += ["MATCH (m:FoodMaterial)-[r:HIGH_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
elif question_type == question_type_enum.INDEX_LEVEL_MID:
sql = ["MATCH (m:FoodProcess)-[r:MIDDLE_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
sql += ["MATCH (m:FoodMaterial)-[r:MIDDLE_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
elif question_type == question_type_enum.INDEX_LEVEL_LOW:
sql = ["MATCH (m:FoodProcess)-[r:LOW_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
sql += ["MATCH (m:FoodMaterial)-[r:LOW_TO]->(n:IndexLevel) where n.name contains '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.INDEX_LEVEL)]
# Given a food, find its cooking specialty
elif question_type == question_type_enum.FOODPROCESS_COOKING_SPECIALTY:
sql = ["MATCH (m:FoodProcess)-[r:cooking_specialty]->(n:CookingSpecialty) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_COOKING_SPECIALTY:
sql = ["MATCH (m:FoodMaterial)-[r:cooking_specialty]->(n:CookingSpecialty) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a food, find its dietary specialty
elif question_type == question_type_enum.FOODPROCESS_FOOD_SPECIALTY:
sql = ["MATCH (m:FoodProcess)-[r:food_specialty]->(n:FoodSpecialty) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_FOOD_SPECIALTY:
sql = ["MATCH (m:FoodMaterial)-[r:food_specialty]->(n:FoodSpecialty) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a food, find its region (province)
elif question_type == question_type_enum.FOODPROCESS_PROVINCE:
sql = ["MATCH (m:FoodProcess)-[r:province_of]->(n:Province) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_PROVINCE:
sql = ["MATCH (m:FoodMaterial)-[r:province_of]->(n:Province) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a region (province), find its foods
elif question_type == question_type_enum.PROVINCE_FOOD:
sql = ["MATCH (m:FoodProcess)-[r:province_of]->(n:Province) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.PROVINCE)]
sql += ["MATCH (m:FoodMaterial)-[r:province_of]->(n:Province) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.PROVINCE)]
# Given a food, find its unit of measure
elif question_type == question_type_enum.FOODPROCESS_UNIT:
sql = ["MATCH (m:FoodProcess)-[r:extension_unit]->(n:ExtensionUnits) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_UNIT:
sql = ["MATCH (m:FoodMaterial)-[r:extension_unit]->(n:ExtensionUnits) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a food, find the nutrients it is rich in
elif question_type == question_type_enum.FOODPROCESS_HIGH_CONT:
sql = ["MATCH (m:FoodProcess)-[r:high_content]->(n:Nutrient) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_HIGH_CONT:
sql = ["MATCH (m:FoodMaterial)-[r:high_content]->(n:Nutrient) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a nutrient, find the foods that are rich in it
elif question_type == question_type_enum.NUTRIENT_HIGH_CONT:
sql = ["MATCH (m:FoodProcess)-[r:high_content]->(n:Nutrient) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.NUTRIENT)]
sql += ["MATCH (m:FoodMaterial)-[r:high_content]->(n:Nutrient) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.NUTRIENT)]
# Given a food, find the nutrients it is low in
elif question_type == question_type_enum.FOODPROCESS_LOW_CONT:
sql = ["MATCH (m:FoodProcess)-[r:low_content]->(n:Nutrient) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
elif question_type == question_type_enum.FOODMATERIAL_LOW_CONT:
sql = ["MATCH (m:FoodMaterial)-[r:low_content]->(n:Nutrient) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_MATERIAL)]
# Given a nutrient, find the foods that are low in it
elif question_type == question_type_enum.NUTRIENT_LOW_CONT:
sql = ["MATCH (m:FoodProcess)-[r:low_content]->(n:Nutrient) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.NUTRIENT)]
sql += ["MATCH (m:FoodMaterial)-[r:low_content]->(n:Nutrient) where n.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.NUTRIENT)]
# Given a dish, find its aliases
elif question_type == question_type_enum.FOODPROCESS_ALIAS:
sql = ["MATCH (m:FoodProcess)-[r:alias_of]->(n:Alias) where m.name = '{0}' return m.name, r.name, n.name".format(i) for i in entity_dict.get(word_type_enum.FOOD_PROCESS)]
if sql:
sql_['sql'] = sql
sqls.append(sql_)
return sqls
if __name__ == '__main__':
    handler = QuestionPaser()
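The queries above are assembled with `str.format`, so an entity name that contains a single quote would yield an invalid Cypher statement. The following is a minimal sketch of one alternative, assuming the graph client in use accepts a parameter dictionary alongside the query text: it builds `(query, params)` pairs with a `$name` placeholder instead of interpolating the value. The helper name `build_param_queries` and its arguments are hypothetical and not part of this repository.

```python
def build_param_queries(label, rel_type, target_label, names):
    """Return (cypher, params) pairs, one per entity name.

    Hypothetical helper: labels and relationship types cannot be passed as
    Cypher parameters, so they are still interpolated here and must come
    from trusted constants; only the entity name travels as a parameter.
    """
    cypher = (
        "MATCH (m:{0})-[r:{1}]->(n:{2}) "
        "WHERE m.name = $name "
        "RETURN m.name, r.name, n.name"
    ).format(label, rel_type, target_label)
    # `names` may be None when a dictionary lookup finds no entities.
    return [(cypher, {"name": name}) for name in (names or [])]


queries = build_param_queries("FoodProcess", "alias_of", "Alias", ["O'Brien stew"])
for cypher, params in queries:
    print(cypher, params)
```

The `names or []` guard also covers the case where `entity_dict.get(...)` returns `None`, which a bare list comprehension would not tolerate.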
import base64

from gmssl import sm4


class CryptSM4:
    def __init__(self, key):
        self.crypt_sm4 = sm4.CryptSM4()  # create the SM4 cipher instance
        self.key = key

    def encryptSM4(self, value):
        crypt_sm4 = self.crypt_sm4
        crypt_sm4.set_key(self.key.encode(), sm4.SM4_ENCRYPT)  # set the key for encryption
        date_str = str(value)
        encrypt_value = crypt_sm4.crypt_ecb(date_str.encode())  # encrypt in ECB mode; the result is bytes
        return base64.b64encode(encrypt_value).decode()

    def decryptSM4(self, encrypt_value):
        crypt_sm4 = self.crypt_sm4
        crypt_sm4.set_key(self.key.encode(), sm4.SM4_DECRYPT)  # set the key for decryption
        decrypt_value = crypt_sm4.crypt_ecb(base64.b64decode(encrypt_value))  # decrypt in ECB mode; the result is bytes
        return decrypt_value.decode()
from enum import Enum


class word_type_enum(Enum):
    DISEASE = 'disease'
    DEPARTMENT = 'department'
    CHECK = 'check'
    SYMPTOM = 'symptom'
    PRODUCER = 'producer'
    FOOD = 'food'
    FOOD_MATERIAL = 'foodMaterial'
    FOOD_PACKAGE = 'foodPackage'
    FOOD_PROCESS = 'foodProcess'
    COOKING_STYLE = 'cookingStyle'
    INDEX_LEVEL = 'indexLevel'
    PROVINCE = 'province'
    NUTRIENT = 'nutrient'
    EXERCISE = 'exercise'
    DRUG = 'drug'


class question_type_enum(Enum):
    OTHERS = 'others'
    DISEASE_SYMPTOM = 'disease_symptom'
    SYMPTOM_DISEASE = 'symptom_disease'
    DISEASE_CAUSE = 'disease_cause'
    DISEASE_ACOMPANY = 'disease_acompany'
    DISEASE_NOT_FOOD = 'disease_not_food'
    DISEASE_DO_FOOD = 'disease_do_food'
    FOOD_NOT_DISEASE = 'food_not_disease'
    FOOD_DO_DISEASE = 'food_do_disease'
    DISEASE_DRUG = 'disease_drug'
    DRUG_DISEASE = 'drug_disease'
    DISEASE_CHECK = 'disease_check'
    CHECK_DISEASE = 'check_disease'
    DISEASE_PREVENT = 'disease_prevent'
    DISEASE_LASTTIME = 'disease_lasttime'
    DISEASE_CUREWAY = 'disease_cureway'
    DISEASE_CUREPROB = 'disease_cureprob'
    DISEASE_EASYGET = 'disease_easyget'
    DISEASE_DESC = 'disease_desc'
    FOODMATERIAL_NOT_CROWD = 'foodMaterial_not_crowd'
    FOODMATERIAL_DO_CROWD = 'foodMaterial_do_crowd'
    FOODPROCESS_NOT_CROWD = 'foodProcess_not_crowd'
    FOODPROCESS_DO_CROWD = 'foodProcess_do_crowd'
    FOODPROCESS_LEVEL_HIGH = 'foodProcess_level_high'
    FOODPROCESS_LEVEL_MID = 'foodProcess_level_mid'
    FOODPROCESS_LEVEL_LOW = 'foodProcess_level_low'
    FOODMATERIAL_LEVEL_HIGH = 'foodMaterial_level_high'
    FOODMATERIAL_LEVEL_MID = 'foodMaterial_level_mid'
    FOODMATERIAL_LEVEL_LOW = 'foodMaterial_level_low'
    INDEX_LEVEL_HIGH = 'index_level_high'
    INDEX_LEVEL_MID = 'index_level_mid'
    INDEX_LEVEL_LOW = 'index_level_low'
    FOODPROCESS_LEVEL_CONTENT = 'foodProcess_level_content'
    FOODMATERIAL_LEVEL_CONTENT = 'foodMaterial_level_content'
    FOODPROCESS_COOKING_SPECIALTY = 'foodProcess_cooking_specialty'
    FOODPROCESS_FOOD_SPECIALTY = 'foodProcess_food_specialty'
    FOODMATERIAL_COOKING_SPECIALTY = 'foodMaterial_cooking_specialty'
    FOODMATERIAL_FOOD_SPECIALTY = 'foodMaterial_food_specialty'
    FOODPROCESS_PROVINCE = 'foodProcess_province'
    FOODMATERIAL_PROVINCE = 'foodMaterial_province'
    PROVINCE_FOOD = 'province_food'
    FOODPROCESS_UNIT = 'foodProcess_unit'
    FOODMATERIAL_UNIT = 'foodMaterial_unit'
    FOODPROCESS_HIGH_CONT = 'foodProcess_high_cont'
    FOODMATERIAL_HIGH_CONT = 'foodMaterial_high_cont'
    NUTRIENT_HIGH_CONT = 'nutrient_high_cont'
    FOODPROCESS_LOW_CONT = 'foodProcess_low_cont'
    FOODMATERIAL_LOW_CONT = 'foodMaterial_low_cont'
    NUTRIENT_LOW_CONT = 'nutrient_low_cont'
    FOODPROCESS_ALIAS = 'foodProcess_alias'
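Many branches in the question parser differ only in the source label, the relationship type and the target label. As a minimal sketch, assuming a lookup table keyed by question type could stand in for part of the `elif` chain, the snippet below generates the same query text from a single template. The names `QueryKind`, `QUERY_SPECS`, `TEMPLATE` and `build_queries` are hypothetical, and only three question types are shown.

```python
from enum import Enum


class QueryKind(Enum):
    # Hypothetical stand-in for a few members of question_type_enum.
    FOODPROCESS_ALIAS = 'foodProcess_alias'
    FOODPROCESS_PROVINCE = 'foodProcess_province'
    FOODMATERIAL_PROVINCE = 'foodMaterial_province'


# question type -> (source label, relationship type, target label)
QUERY_SPECS = {
    QueryKind.FOODPROCESS_ALIAS: ('FoodProcess', 'alias_of', 'Alias'),
    QueryKind.FOODPROCESS_PROVINCE: ('FoodProcess', 'province_of', 'Province'),
    QueryKind.FOODMATERIAL_PROVINCE: ('FoodMaterial', 'province_of', 'Province'),
}

TEMPLATE = ("MATCH (m:{0})-[r:{1}]->(n:{2}) where m.name = '{3}' "
            "return m.name, r.name, n.name")


def build_queries(kind, names):
    """Build one query string per entity name for a known question type."""
    spec = QUERY_SPECS.get(kind)
    if spec is None:
        # Unknown question type: produce no queries.
        return []
    return [TEMPLATE.format(spec[0], spec[1], spec[2], name) for name in names]


example = build_queries(QueryKind.FOODPROCESS_ALIAS, ['宫保鸡丁'])
print(example[0])
```

Adding a new question type of this shape would then mean adding one table entry rather than another branch; queries that match on the target node or combine two labels would still need their own handling.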