
Biomed Res Int. 2021; 2021: 7431199.

Doctor Recommendation Model Based on Ontology Characteristics and Disease Text Mining Perspective

Chunhua Ju

¹Business Administration College, Zhejiang Gongshang University, Hangzhou, China

Shuangzhu Zhang

²School of Management Science and Engineering, Zhejiang Gongshang University, Hangzhou, China

Received 2021 May 21; Accepted 2021 Jul 20.

Data Availability Statement: The data were collected with assistance from the administrator of the WeiYi platform. Due to third-party rights, patient privacy, and commercial confidentiality, the data are not open source.

Abstract

Background

Patients can access medical services such as online disease diagnosis, medical treatment guidance, and medication guidance provided by doctors from all over the country without leaving home. Because online medical service scenarios are complex and demand professional knowledge, traditional recommendation methods in the medical field are confronted with problems such as low computational efficiency and poor effectiveness. At the same time, patients consulting online come from all regions, and most of them suffer from nonacute or cancerous diseases; hence, offline medical treatment may follow. Therefore, this paper proposes an online prediagnosis doctor recommendation model that integrates ontology characteristics and disease text. In particular, this recommendation model takes full account of the geographical location of patients.

Objective

The recommendation model takes real online consultation data as the research object to fully test its effectiveness. Specifically, the model recommends departments and doctors to patients based on the patients' symptoms, diagnosis, and geographical location, as well as the doctors' specialties and departments.

Methods

Using a web crawler, five hospital departments were selected from the online medical service platform. The names of the departments follow the standardized department names used in real hospitals (e.g., endocrinology, dermatology, gynecology, pediatrics, and neurology). As a result, a dataset consisting of 20000 consultation questions by patients was built. Using Python and MySQL, and replacing semantic dictionary retrieval or word frequency statistics, word vectors were used to measure the similarity between patients' prediagnosis and doctors' specialties. This forms a recommendation framework for medical departments and doctors based on the sentence similarity measurement obtained above, which provides recommendations on the intended departments and doctors.

Results

In the online medical field, compared with the traditional recommendation methods, the model proposed in this paper achieves higher recommendation accuracy and feasibility for both department and doctor recommendation.

Conclusions

The proposed online prediagnosis doctor recommendation model integrates ontology characteristics and disease text mining. The model gives relatively more accurate recommendations based on ontology characteristics such as patients' description texts and doctors' specialties. Furthermore, the model also takes full account of patients' location. As a result, the proposed online prediagnosis doctor recommendation model would improve patients' online consultation experience and offline treatment convenience, enriching the value of online prediagnosis data.

1. Introduction

As the emphasis of medical care gradually shifts from disease to patient, the role of patients' participation in online health improvement is becoming more prominent. Health services around the world differ not only across regions but also in terms of online health services [1, 2]. Specifically, there exist phenomena such as information asymmetry between doctors and patients and the geographically uneven distribution of medical resources [3]. Therefore, online doctor registration and intelligent department recommendation have also become important topics of medical informatization. According to a report released in 2019 by the Big Data Research Institute, the number of users in China's medical and health market was about 800 million by the end of 2018 [4]. With a large number of doctors and patients interacting online, a large amount of real consultation data has accumulated in the online health community. Therefore, it is of important theoretical and practical value to investigate how to make full use of online data to build models that improve patients' medical treatment experience by increasing the accuracy of patients' medical choices and the effectiveness of department recommendation.

The existing literature has studied the problem from the perspectives of department recommendation and doctor recommendation. The two methods of department recommendation are based on the expert system and on similarity calculation, respectively. As for department recommendation based on the expert system, on one hand, a medical knowledge base is established with the help of medical experts, and the diagnosis process of medical experts is simulated by applying a rule-based reasoning engine. As a result, patients' diseases are predicted, so as to achieve the target department recommendation for patients. Moreover, expert-based department recommendation built upon fuzzy logic and the RBF neural network effectively improves the recommendation accuracy [5, 6]. On the other hand, there exist many problems due to the large number of reasoning rules, such as low computational efficiency and the high maintenance cost of the knowledge base. As for department recommendation based on similarity calculation, the current literature uses various methods to measure similarity, such as the similarity between patients' symptoms and diseases' symptoms [7], TF-IDF sentence-based similarity and the TF-IDF algorithm based on multiple words [8, 9], and the combination of backward focus shifting with a professional medical corpus [10]. These similarity-based methods calculate the probability of having a disease and the descriptive words that may correspond to certain symptoms, respectively, realizing the goal of department recommendation to patients. Research on doctor recommendation is mainly based on content-based and collaborative filtering recommendation algorithms, focusing on user keywords, browsing history, evaluations, and other data [11, 12]. The user collaborative filtering algorithm assumes that a user and a group of other users who share similar interests would have the same product preferences [13–15]. Among them, the user collaborative filtering algorithm integrating projects mainly solves the problem of information overload by filtering attributes collaboratively [16]. Moreover, the application of customized relational networks and tags solves the problem of data sparsity in the matrix factorization recommendation model [17, 18], and the collaborative filtering recommendation method integrates contextual perception, project similarity, and user behavior, giving recommendation results from the perspectives of patients' contexts, projects, and user participation [19–21]. In addition, scholars have also conducted modeling research on doctor recommendation, disease diagnosis, and medical examination [22, 23] from the perspectives of the semantic characteristics of medical resources [24], user information types [25], user ratings and comment portraits [26], as well as the Bayesian algorithm [27].

The recommendation algorithms in the traditional medical field mainly have the following three problems. First, in terms of department recommendation, the algorithm based on the expert system causes issues such as the explosion of knowledge rule reasoning and the high maintenance cost of the knowledge base. Furthermore, the algorithm based on similarity may not effectively recognize synonyms, possibly decreasing recommendation accuracy. Second, in terms of doctor recommendation, the user-based collaborative filtering algorithm faces the problem that patients with similar symptoms may not be diagnosed with the same disease, due to the complexity and diversity of diseases. What is more, because there is no necessary relationship among patients' etiologies, the assumption of the project-based collaborative filtering algorithm that users would choose doctors with the same research field as their previous doctors may hardly be met. Third, although the relevant literature has studied how to reduce data sparsity [28–30], the collaborative filtering recommendation algorithm still cannot completely avoid the performance problems caused by data sparsity.

Based on the above discussion, it can be concluded that the existing recommendation algorithms cannot fully meet the requirements of recommendation in the Internet medical field. Patients can access medical services provided by doctors in online health communities all over the country without going out, including disease diagnosis, medical treatment guidance, and medication guidance. Meanwhile, patients consulting online come from far and near and may need offline medical treatment, making it necessary to take into account the factor of patients' location. Therefore, this paper proposes an online prediagnosis doctor recommendation model that integrates ontology characteristics and disease text mining, improving both the effectiveness of doctor recommendation within the environment of online medical services and the convenience of offline medical treatment for patients.

2. Research on the Doctor Recommendation Model

The doctor recommendation model is mainly divided into three steps. Step 1: data preprocessing. Perform word segmentation and stop word removal on the patient's natural language input. Step 2: hospital department recommendation. After screening patients' query data, create the most similar sentence set based on the key parts of the word vector or the similarity measurement for symptom descriptions, so as to achieve department recommendation. Step 3: doctor recommendation. Use SQL queries in the MySQL database to complete doctor recommendation (Figure 1).

Figure 1: Prediagnosis doctor recommendation model integrating ontology characteristics and disease text mining.

3. Data Cleaning Process

There are mainly two aspects of data available online. The first is patients' online consultations regarding disease symptoms, which mainly cover age, gender, symptom description, and other information. The second is doctors' online information, including doctors' names, titles, hospitals, departments, and specialties, as shown in Table 1. All data are in structured form, and fields such as disease description, prediagnosis, and specialties are stored as text. The model is then built after word segmentation and keyword extraction (Figure 2).


Table 1

Data sample of patients and doctors online.

Patient ID	Gender	Age	Province/city	Main complaint	Initial consultation department online
8070844	Female	65	Jiangsu	My period keeps coming. The B-ultrasound result shows that my endometrium is thick. I took progesterone and had curettage. So far, I have been taking medicine for 10 days. 3 days after the progesterone, I still had a large amount of blood flow, and my stomach ached. I am wondering what is wrong with me.	Gynecology
81305510	Female	42	Guangdong	Bilateral hydrosalpinx. I have no abortion history. I want to get pregnant now. What should I do?	Gynecology
12031251	Female	43	Heilongjiang	43-year-old, irregular menstruation for many years, 3 times in 2 months, each period lasting 7 to 8 days, the amount is small, and the color is dark brown. What medicine should I take?	Gynecology
57715499	Female	37	Henan	I had a miscarriage a month ago; however, I got pregnant during confinement. Can I keep the child?	Gynecology
72520784	Female	53	Shanghai	My mother is 53 years old. She feels nervous, unable to breathe, cannot lie down, and feels no strength.	Neurology

Doctor name	Title	Hospital	City	Specialties	Department
Niu^∗∗	Chief Physician	Ningbo First Hospital	Ningbo	Diagnosis and treatment of diabetes and thyroid disease	Endocrinology
Yang^∗	Associate Chief Physician	Shijiazhuang First Hospital	Shijiazhuang	Hemorrhagic cerebrovascular diseases such as cerebral aneurysm, arteriovenous malformation, arteriovenous fistula, and cavernous hemangioma; ischemic cerebrovascular diseases such as carotid artery stenosis, vertebral artery stenosis, intracranial artery stenosis, and moyamoya disease	Neurosurgery
Xu^∗∗	Chief Physician	Beijing Anzhen Hospital, Capital Medical University	Beijing	Diagnosis, surgical treatment, and perioperative treatment of various congenital heart diseases	Pediatric cardiac surgery
Wang^∗∗	Associate Chief Physician	Shenzhen Bao'an People's Hospital	Shenzhen	Diagnosis and treatment of diabetes and its complications, hyperthyroidism, and hypothyroidism; use of insulin pumps and dynamic blood glucose monitors	Endocrinology
Liu^∗∗	Chief Physician	Hospital of Traditional Chinese Medicine in Uygur, Xinjiang	Xinjiang	Neurology of traditional Chinese medicine	Neurology

4. Data on Ontology Characteristics of Doctors and Patients

The doctor-patient demographic data obtained from the WeiYi platform are mostly well-organized semistructured textual data. The first step is to transform the unstructured text into structured data through named entity recognition and information extraction. Organization names, people's names, and location names can be recognized by applying multiple open source Chinese language processing tools [31], such as FudanNLP developed by Fudan University [32], the NLPIR word segmentation system developed by the Chinese Academy of Sciences [33], and the LTP Chinese natural language processing platform of Harbin Institute of Technology [34]. In addition, records with missing values and duplicated records are deleted. For the problem of different doctors sharing the same name, fields such as "the hospital to which they belong" and "the department to which they belong" are used to tell them apart.

5. Data on Patients' Condition Description

Data on patients' online condition descriptions are specific evaluations expressed by patients in natural language. In their initial form, the data are full of problems: the contents are nonstandardized, repetitive, short, and single [35]. The authors marked the text content by part of speech and synonyms and then used a human tissue dictionary and a human anatomy dictionary to match the word segmentation results so as to extract disease symptoms and keywords for human body parts. As shown in Table 1, the patient's main complaint was that "it was caused by pelvic effusion 8 years ago, there was no abortion history and no pregnancy." Common clinical symptoms that the patient did not actually have appear in the description, which makes it difficult to extract keywords. For example, "no abortion history" was divided into "no" and "abortion history," resulting in the extraction of "abortion history" as a keyword even though the patient did not have this symptom. To deal with such situations, before word segmentation, the authors divide the description paragraph into short sentences or phrases at punctuation marks, and stop words are retained during word segmentation. Then, during keyword extraction, a target word is not considered a real keyword if it carries a negative modifier such as none, unaccompanied, or no.
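The negation handling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the symptom lexicon `SYMPTOM_TERMS`, the cue list `NEGATION_CUES`, and both function names are hypothetical, and whitespace tokenization stands in for a real Chinese word segmenter.

```python
import re

# Hypothetical symptom lexicon and negation cues, for illustration only.
SYMPTOM_TERMS = ["abortion history", "pelvic effusion", "pregnancy", "stomachache"]
NEGATION_CUES = ["no", "none", "never", "without", "unaccompanied"]


def split_clauses(text):
    """Split a description into short clauses at punctuation marks."""
    parts = re.split(r"[,.;!?]", text.lower())
    return [p.strip() for p in parts if p.strip()]


def extract_keywords(text):
    """Return symptom terms found in clauses that carry no negation cue."""
    keywords = []
    for clause in split_clauses(text):
        tokens = clause.split()
        negated = any(cue in tokens for cue in NEGATION_CUES)
        for term in SYMPTOM_TERMS:
            if term in clause and not negated and term not in keywords:
                keywords.append(term)
    return keywords


complaint = ("It was caused by pelvic effusion 8 years ago, "
             "there was no abortion history and no pregnancy.")
print(extract_keywords(complaint))
```

Treating a whole clause as negated is a deliberately coarse choice; a finer-grained variant would attach each negative modifier only to the term that follows it.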

6. Data on Doctors' Specialties

Data on doctors' specialties are structured textual data and suffer from the problems of synonymous naming and missing data. Synonymous naming refers to the problem that doctors in different hospitals use different names for their fields of expertise. Specifically, synonyms for fields of expertise include specialties, being good at, specializing in, being skilled in, being professional with, medical interest, and research direction. All synonymous names are merged into the same field. As for the problem of missing data, data from multiple sources are integrated to fill in the missing values, and records that cannot be completed are deleted.
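As a rough illustration of merging synonymous field names, the sketch below maps every label listed above onto one field. The canonical name `specialties`, the record layout, and the function name are assumptions made for the example.

```python
# The synonyms are the ones listed in the text; choosing "specialties"
# as the canonical field name is an assumption.
SPECIALTY_LABELS = {
    "specialties", "being good at", "specializing in", "being skilled in",
    "being professional with", "medical interest", "research direction",
}


def normalize_record(record):
    """Merge every synonymous specialty field into one 'specialties' field."""
    cleaned = {}
    specialties = []
    for field, value in record.items():
        if field.strip().lower() in SPECIALTY_LABELS:
            if value:
                specialties.append(value.strip())
        else:
            cleaned[field] = value
    # None marks a record whose specialty is still missing after merging.
    cleaned["specialties"] = "; ".join(specialties) if specialties else None
    return cleaned


raw = {"name": "Wang", "department": "Endocrinology",
       "Being good at": "diabetes and thyroid disease"}
print(normalize_record(raw))
```

Records that still have `None` in the merged field are the ones that would need completion from another data source or deletion.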

7. Doctor Recommendation

7.1. Department Recommendation

For questions input by patients, the keywords of each sentence can be obtained after word segmentation and stop word removal. Next, the corresponding question set can be obtained by locating the question sentences associated with each keyword. The authors divided the question set into a sample dataset and a test dataset, both containing patients' condition description text, the online prediagnosis department, etc. Then, the word2vec library is used to train a word vector model on the keywords of the sentences in the sample dataset, the similarity between each question input by a patient in the test dataset and the word vector model of the sample dataset is calculated, and lastly the questions in the sample dataset that are most similar to the test question are selected. Following the rule that higher similarity indicates the same department, after screening the similarity results one by one, the department with the highest similarity is the final recommendation result.
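The ranking step can be sketched as follows. This is only an illustration under stated assumptions: the similarity function is pluggable, and the default `jaccard` is a simple stand-in rather than the word-vector measure of Section 9.3; the sample data and function names are hypothetical.

```python
def jaccard(a, b):
    """Stand-in similarity on keyword sets (not the word-vector measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def recommend_department(query_keywords, samples, similarity=jaccard, top_n=5):
    """Rank labeled sample questions by similarity to the query and return
    the department of the most similar one, plus the top_n ranked matches."""
    scored = [(similarity(query_keywords, kw), dept) for kw, dept in samples]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:top_n]
    best_department = top[0][1] if top and top[0][0] > 0 else None
    return best_department, top


# Hypothetical labeled sample set: (keywords, prediagnosis department).
samples = [
    (["headache", "dizziness", "nausea"], "Neurology"),
    (["red rash", "itching", "swelling"], "Dermatology"),
    (["irregular menstruation", "stomachache"], "Gynecology"),
]
department, ranked = recommend_department(["headache", "nausea"], samples)
print(department)
```

Passing a different `similarity` callable is enough to swap in a word-vector based score without touching the ranking logic.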

8. Doctor Recommendation

The core significance of the development of online medical and health services is to reshape the medical service process and optimize the allocation of medical resources, so as to meet the medical and health needs of individual consumers. Due to their mobility, convenience, rapidness, personalization, and interaction, online medical services have become the main channel for consumers to seek medical help online and have been widely adopted by consumers. To some extent, this alleviates medical pressure and realizes the optimal allocation of medical resources. The patients using online medical services come from all regions, and the majority of them have conventional and chronic diseases, which sometimes makes it necessary for patients to confirm their diagnosis offline. Therefore, doctor recommendation that takes into account patients' location information is particularly important to improve the convenience of offline medical treatment and to attract more patients to use online medical services. Based on the SQL query function of the MySQL database, this paper matches keywords with doctors' specialties, department, and region information, integrates patients' location data, and recommends local doctors who meet the requirements according to the patient's region. For instance, a patient named Zhang San, living in Zhejiang province, with a condition described as thick endometrium, heavy menstrual flow, and stomachache, would be recommended to see a Chief Physician with the family name Wang from the Department of Gynecology at Zheyi Hospital.
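A location-aware query of this kind might look like the sketch below. The paper uses MySQL; here Python's built-in `sqlite3` is swapped in so the example is self-contained. The `doctors` table layout, the made-up toy rows, and the function name are all assumptions.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL doctor table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctors (name TEXT, title TEXT, hospital TEXT, "
    "province TEXT, department TEXT, specialties TEXT)"
)
conn.executemany(
    "INSERT INTO doctors VALUES (?, ?, ?, ?, ?, ?)",
    [  # toy rows invented for the example
        ("Wang", "Chief Physician", "Zhejiang First Hospital", "Zhejiang",
         "Gynecology", "thick endometrium, irregular menstruation"),
        ("Niu", "Chief Physician", "Ningbo First Hospital", "Zhejiang",
         "Endocrinology", "diabetes and thyroid disease"),
        ("Liu", "Chief Physician", "Example Hospital", "Xinjiang",
         "Gynecology", "irregular menstruation"),
    ],
)


def recommend_doctors(conn, province, department, keywords):
    """Return doctors in the patient's province and recommended department
    whose specialties mention at least one extracted keyword."""
    if not keywords:
        return []
    keyword_clause = " OR ".join("specialties LIKE ?" for _ in keywords)
    sql = (
        "SELECT hospital, department, name, title FROM doctors "
        "WHERE province = ? AND department = ? AND (" + keyword_clause + ")"
    )
    params = [province, department] + ["%" + kw + "%" for kw in keywords]
    return conn.execute(sql, params).fetchall()


rows = recommend_doctors(conn, "Zhejiang", "Gynecology",
                         ["thick endometrium", "stomachache"])
print(rows)
```

Only the placeholders are built dynamically; the keyword values themselves are passed as bound parameters, so patient text never becomes part of the SQL string.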

9. Sentence Similarity

9.1. Calculation of Similarity Based on Post Content

After obtaining the unique d-dimensional distributed vector representation of the disease description text, the similarity and distance between any two texts can be obtained through similarity calculation. The authors use the cosine formula to measure the similarity between two texts and the Mahalanobis distance to calculate the distance between the natural language descriptions of the two posts. Assume that the two paragraph vectors of the natural language descriptions are expressed as PV_a = (x_11, x_12, ⋯, x_1d) and PV_b = (x_21, x_22, ⋯, x_2d), where d is the dimension of the paragraph vectors. The similarity and distance are defined as follows:

$\begin{matrix} \mathrm{sim}(PV_a, PV_b) = \frac{PV_a \cdot PV_b}{\|PV_a\|_2 \, \|PV_b\|_2} = \frac{\sum_{i=1}^{d} x_{1i} x_{2i}}{\sqrt{\sum_{i=1}^{d} x_{1i}^{2}} \sqrt{\sum_{i=1}^{d} x_{2i}^{2}}}, \\ \mathrm{dis}(PV_a, PV_b) = \sqrt{(PV_a - PV_b)^{T} S^{-1} (PV_a - PV_b)}, \end{matrix}$

(1)

where S is the covariance matrix of the feature vectors PV_a and PV_b.
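Both measures can be computed in a few lines of plain Python. The sketch below is illustrative only: the example vectors and the covariance matrix `cov` are made up, and a real application would estimate the covariance from the whole collection of paragraph vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def mahalanobis(a, b, cov):
    """sqrt((a - b)^T cov^{-1} (a - b)) for a given covariance matrix."""
    diff = [x - y for x, y in zip(a, b)]
    weighted = solve(cov, diff)  # equals cov^{-1} @ diff without inverting
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


pv_a, pv_b = [1.0, 2.0], [2.0, 4.0]
cov = [[1.0, 0.0], [0.0, 4.0]]  # assumed covariance, estimated elsewhere
print(cosine_similarity(pv_a, pv_b))
print(mahalanobis(pv_a, pv_b, cov))
```

With an identity covariance matrix the Mahalanobis distance reduces to the ordinary Euclidean distance, which gives a quick sanity check.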

9.2. TF-IDF Sentence Similarity Based on Co-Occurring Words

This method assumes that the more vocabulary two sentences share, the higher their similarity [36]. Specifically,

$\begin{matrix} \mathrm{SimScore}(S_1, S_2) = \frac{|S_1 \cap S_2|}{|S_1 \cup S_2|} \sum_{w_i \in S_1 \cap S_2} \mathrm{weight}(w_i), \\ \mathrm{weight}(w_i) = \frac{\mathrm{Num}(w_i, k)}{N_k} \times \log\left(\frac{N_t}{\mathrm{Num}(w_i, t) + 1}\right). \end{matrix}$

(2)

Among them, |·| is the cardinality of a set, S_1 and S_2 are the word sets of the two sentences to be compared, w_i represents symptom word i in the department question-and-answer sentences, weight(w_i) is the TF-IDF [37] weight, Num(w_i, k) represents the number of sentences in which the symptom word w_i appears in the question-and-answer sentence set of department k, N_k represents the number of all questions and answers in department k, N_t represents the total number of questions and answers in the knowledge base, and Num(w_i, t) represents the number of sentences in the whole knowledge base in which the symptom word w_i appears. The TF-IDF sentence similarity calculation method based on co-occurring words belongs to the surface structure analysis methods. It uses only the surface information of the sentence, that is, the word frequency, part of speech, and other information about the words in the sentence, to calculate the sentence similarity, without considering synonyms. This reduces the accuracy of the sentence similarity.
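Equation (2) translates almost directly into code. The following is a minimal sketch on made-up data, with each sentence represented as a set of words; the function names and the toy knowledge base are assumptions.

```python
import math


def tfidf_weight(word, dept_sentences, all_sentences):
    """weight(w) = Num(w, k) / N_k * log(N_t / (Num(w, t) + 1))."""
    num_k = sum(1 for s in dept_sentences if word in s)
    num_t = sum(1 for s in all_sentences if word in s)
    return num_k / len(dept_sentences) * math.log(len(all_sentences) / (num_t + 1))


def sim_score(s1, s2, dept_sentences, all_sentences):
    """Overlap ratio of the two word sets times the summed TF-IDF weights
    of the words that the two sentences share."""
    s1, s2 = set(s1), set(s2)
    shared = s1 & s2
    if not shared:
        return 0.0
    overlap = len(shared) / len(s1 | s2)
    return overlap * sum(tfidf_weight(w, dept_sentences, all_sentences) for w in shared)


# Toy knowledge base: the first two sentences belong to department k.
neurology = [{"headache", "nausea"}, {"headache", "dizziness"}]
knowledge_base = neurology + [
    {"rash", "itching"}, {"thirst", "weight loss"},
    {"rash", "swelling"}, {"irregular menstruation"},
]
print(sim_score(neurology[0], neurology[1], neurology, knowledge_base))
print(sim_score({"dizzy"}, {"dizziness"}, neurology, knowledge_base))
```

The second call shows the weakness noted above: the synonyms "dizzy" and "dizziness" share no surface word, so the score is zero.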

9.3. Sentence Similarity Method Based on Word Vector

Word vector sentence similarity mainly uses the deep learning tool word2vec [38] to convert words into vectors and obtains the semantic similarity of the sentence pair to be compared by calculating the similarity between the vectors. The specific formula is as follows:

$\begin{matrix} \underset{w_i \in I,\, w_j \in R}{\mathrm{CosSim}}(w_i, w_j) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_{i}^{2}} \times \sqrt{\sum_{i=1}^{n} y_{i}^{2}}}, \\ \mathrm{SimScore}(S_1, S_2) = \frac{\sum_{w \in IR} \beta_w \, \mathrm{MaxSimValue}(\mathrm{CosSim}(w, IR))}{\sum_{w \in IR} \beta_w}. \end{matrix}$

(3)

Among them, IR = S_1 ∪ S_2; w_i and w_j are the two words to be compared, taken from sentence S_1 and sentence S_2, respectively; n represents the dimension of the word vector; x_i and y_i represent the values of the ith dimension of the word vectors of w_i and w_j, respectively; MaxSimValue(CosSim(w, ·)) represents the maximum cosine similarity between the word vector of word w and the word vectors of all words in the other sentence; and the parameter β_w is the TF-IDF weight of word w in the sentence. The greater the value of SimScore(S_1, S_2), the greater the similarity between the two sentences and the closer their semantics.
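A small sketch of equation (3) follows. The two-dimensional `VECTORS` table is a made-up stand-in for a trained word2vec model, the default weight of 1.0 for words missing from `beta` is an assumption, and both sentences are assumed to be nonempty with every word present in the table.

```python
import math

# Toy 2-dimensional word vectors standing in for a trained word2vec model.
VECTORS = {
    "dizzy": [0.9, 0.1],
    "dizziness": [0.8, 0.2],
    "rash": [0.1, 0.9],
    "itching": [0.2, 0.8],
}


def cos_sim(u, v):
    """Cosine similarity of two word vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def sim_score(s1, s2, beta=None):
    """Weighted mean, over the words of S1 union S2, of each word's best
    cosine match in the other sentence; beta holds per-word TF-IDF weights."""
    beta = beta or {}
    total, weight_sum = 0.0, 0.0
    for word in set(s1) | set(s2):
        other = s2 if word in s1 else s1
        best = max(cos_sim(VECTORS[word], VECTORS[o]) for o in other)
        w = beta.get(word, 1.0)  # assumed default weight of 1.0
        total += w * best
        weight_sum += w
    return total / weight_sum


print(sim_score(["dizzy"], ["dizziness"]))
print(sim_score(["dizzy"], ["rash"]))
```

Unlike the co-occurring word measure, the synonym pair scores close to 1 here because the two toy vectors point in nearly the same direction.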

10. Experiment

10.1. The Data Set

To evaluate the doctor recommendation method proposed in this paper, an experimental study was conducted. The data of the five most common departments were crawled from WeiYi, a well-known domestic online medical platform. The names of the departments follow the standardized department names used in real hospitals (e.g., endocrinology, dermatology, gynecology, pediatrics, and neurology). As a result, a dataset named T consisting of the online preclinical data of 20000 patients was built. For the experimental comparison of the various algorithms, two widely used evaluation indexes of recommendation performance were adopted in this paper: the accuracy rate (P) and the recall rate (R).

x.2. Parameter Setting

In the experiment, the dimension parameter of the word vector was set to 100. For the similarity results of the keyword set used for department recommendation, the top five questions with the highest sentence similarity were taken as the recommended result data (topN = 5), and the threshold of keyword set similarity was set to 0.8; that is, when the keywords and the test set data were used for the keyword similarity calculation, the result had to exceed 0.8 to be included in the hospital department recommendation set. If two or more hospital departments were recommended, this was treated as a special case with no recommendation.
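One possible reading of this decision rule is sketched below. The function name and the input layout, a list of (similarity, department) pairs, are assumptions; the defaults mirror the parameters stated above.

```python
def decide_department(scored_matches, top_n=5, threshold=0.8):
    """Apply the top-N and threshold rule to (similarity, department) pairs.

    Returns the single department whose matches exceed the threshold, or
    None when no department or more than one department qualifies."""
    ranked = sorted(scored_matches, key=lambda pair: pair[0], reverse=True)
    candidates = {dept for score, dept in ranked[:top_n] if score > threshold}
    if len(candidates) == 1:
        return next(iter(candidates))
    return None


matches = [(0.93, "Neurology"), (0.85, "Neurology"), (0.60, "Dermatology")]
print(decide_department(matches))
```

Returning `None` for ties keeps the special case explicit, so a caller can fall back to asking the patient for more detail.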

11. Results and Analysis

Among the 20000 patients surveyed, 15460 were female (77.3%). This may be because women are often expected to take care of family health and other responsibilities in addition to work; also, women tend to pay more attention to health information than men. A total of 16800/20000 patients (84.0%) were 30 to 45 years of age. Elderly people have limited experience in consulting physicians and obtaining medicines online, and children have not mastered online counseling skills; therefore, the elderly and children may not often consult physicians on the Internet themselves and may instead ask family members to make online inquiries. In the 20000 records, 12600 of the physicians (63.0%) were chief physicians or associate chief physicians, while 19400 hospitals (97.0%) were ranked 3A (see Table 2).

In order to verify the feasibility and effectiveness of the proposed recommendation algorithms for departments and doctors, the experiment compared them with the content-based recommendation algorithm and the user-based collaborative filtering algorithm. First, 100 pieces of data were randomly extracted from the dataset T for each hospital department name, and word vector training was then performed. After word segmentation and stop word removal on the data of the different departments, the keyword set was obtained, and the word vector model was trained using this keyword set (see Table 3). The word vector model consisted of patients' real consultation questions, and words outside those questions were considered noise words, that is, meaningless words unrelated to the patient's consultation. All three algorithms were used to measure keyword similarity and give hospital department recommendations (see the results of the three algorithms in Table 4).

Table 2

Summary of the characteristics of the collected data records (N = 20000).

Characteristic	Value, n (%)
Gender
Male	4540 (22.7)
Female	15460 (77.3)
Age (years)
25-30	1586 (7.9)
31-45	16800 (84.0)
46-50	1014 (5.1)
>55	600 (3.0)
Physician's professional title
Resident physician	2670 (13.35)
Attending physician	4330 (21.65)
Associate chief physician	8040 (40.2)
Chief physician	4560 (22.8)
Other	400 (2.0)
Hospital's ranking level
3A	19400 (97.0)
Other	600 (3.0)

Table 3

Word vector model and keyword examples.

Word vector-based model	Keyword set	Department
Word vector-based model	Headache, nausea, right middle, swelling, stuffy nose, right ear, tinnitus, etc.	Neurology
Keyword set	1. Migraines, nausea, loss of appetite 2. Headache, dizziness, protrusion of left eye, congestion of eyeball 3. Head distension, stuffiness, dizziness, palpitation, and restlessness 4. Palpitations and palpitations 10. Weak right hand, unable to clench a fist, palpitation, unable to breathe

Table 4

Comparison of accuracy and recall rate.

Algorithm method	Accuracy rate (%)	Recall rate (%)
Word vector-based	74	78
Content-based	63	67
Co-occurring word-based	54	56

As seen from Table 4, the similarity recommendation method proposed in this paper, which incorporates ontology features and disease text mining, performed best when applied to consultations about selecting the appropriate hospital department, since its accuracy rate and recall rate were much higher than those of the other two algorithms. This is because the word vector sentence similarity measurement strategy can better measure the semantic similarity of sentences. For example, consider the sentence pair "I went to the hospital to see the dentist and went home, dizzy, heavy head, runny nose" and "When I came back from the dentist, I started to feel dizziness with symptoms of heavy head and runny nose." If the measurement method based on co-occurring words is used, the similarity value is low, because the sentence pair contains synonym pairs such as (dizzy, dizziness), (heavy head, sinking head), and (runny nose, clear nose) rather than identical words. The content-based method performs relatively well on such synonym pairs, and the word vector method gives the best result, indicating that it captures the underlying semantics of the sentence more accurately. On one hand, this is because the method in this paper measures the similarity of keywords better. For instance, the keywords "headache, palpitation, insomnia" and the keywords "head distension and restlessness" were considered similar, a better result than that of the sentence similarity measurement based on co-occurring words. On the other hand, the proposed method took full account of factors such as the location of doctors and patients, as well as doctors' fields of expertise, which is not the case for the content-based recommendation method that only takes the patient's disease information into account.

As seen from Figures 3 and 4, the recommendation performance of the word vector method varied across hospital departments. The recommendation accuracy of the pediatric department was below 0.5, and that of the neurology, endocrinology, gynecology, and dermatology departments was above 0.5, among which the recommendation accuracy of gynecology was the most improved. What the four departments with relatively higher recommendation accuracy (neurology, endocrinology, gynecology, and dermatology) had in common was that the characteristics of their consultation questions were very typical and obvious. For example, high blood sugar, sudden weight loss, and thirst are typical for endocrinology; red rash, circular rash, redness, swelling, and itching are typical for dermatology; pregnancy and irregular menstruation are typical for gynecology. However, the situation is different for the pediatric department: if information indicating age, such as baby, child, or 6 months old, is not included in the consultation, the system may recommend other departments, reducing the accuracy accordingly.

Figure 3: Recommendation accuracy comparison of different departments.

Figure 4. Comparison of recommendation rates of various departments.
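A per-department comparison of this kind can be computed from pairs of true and recommended departments. The sketch below is an assumption about how such a breakdown might be produced, not the paper's evaluation code: the metric here is the share of consultations from each department that were routed to that department, and the records are made up for illustration.

```python
from collections import defaultdict


def per_department_accuracy(records):
    """Share of each department's consultations that were routed correctly.

    records: iterable of (true_department, recommended_department) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for true_dept, recommended_dept in records:
        totals[true_dept] += 1
        if recommended_dept == true_dept:
            hits[true_dept] += 1
    return {dept: hits[dept] / totals[dept] for dept in totals}


# Made-up records for illustration; these are not the paper's data.
records = [
    ("dermatology", "dermatology"),
    ("dermatology", "dermatology"),
    ("gynecology", "gynecology"),
    ("pediatrics", "dermatology"),  # no age cue, so routed elsewhere
    ("pediatrics", "pediatrics"),
    ("pediatrics", "gynecology"),
]

accuracy = per_department_accuracy(records)
for dept, value in sorted(accuracy.items()):
    print(f"{dept}: {value:.2f}")
```

Breaking the score down by department makes effects such as the missing age cue in pediatric consultations visible, where a single overall figure would hide them.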

Finally, the SQL query function of the MySQL database is used to integrate the patient's regional factors. According to the patient's region, we apply department and regional keyword matching and recommend doctors in hospitals in the patient's region who meet the patient's needs. For example, for "Zhang San, from Zhejiang, whose condition is described as a thickened uterine lining, heavy menstrual flow, and stomachache," the recommended doctor is "Zhejiang First Hospital-Gynecology-Dr. Wang (Chief Physician)." The process is shown in Figure 5.

Figure 5. Doctor recommendation framework.
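The regional matching step can be sketched as a parameterized SQL query. This is an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for MySQL so that the example is self-contained, and the `doctors` table, its columns, and all rows other than the Dr. Wang example from the text are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical doctors table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doctors ("
        "name TEXT, title TEXT, hospital TEXT, department TEXT, region TEXT)"
    )
    conn.executemany(
        "INSERT INTO doctors VALUES (?, ?, ?, ?, ?)",
        [
            ("Dr. Wang", "Chief Physician", "Zhejiang First Hospital",
             "Gynecology", "Zhejiang"),
            # The two rows below are invented for illustration.
            ("Dr. Li", "Attending Physician", "Beijing Example Hospital",
             "Gynecology", "Beijing"),
            ("Dr. Chen", "Chief Physician", "Zhejiang First Hospital",
             "Dermatology", "Zhejiang"),
        ],
    )
    return conn


def recommend_doctors(conn, department, region):
    """Return doctors whose department and region both match the patient."""
    rows = conn.execute(
        "SELECT hospital, department, name, title FROM doctors "
        "WHERE department = ? AND region = ? ORDER BY name",
        (department, region),
    ).fetchall()
    return [f"{hospital}-{dept}-{name} ({title})"
            for hospital, dept, name, title in rows]


conn = build_demo_db()
# The department would come from the similarity step; the region from the patient.
recommendations = recommend_doctors(conn, "Gynecology", "Zhejiang")
print(recommendations)
```

Passing the department and region as query parameters, not by string concatenation, keeps the query safe when the values come from patient input.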

12. Conclusion

Traditional manual medical guidance is increasingly unable to meet people's medical needs: registration is difficult, and the problem of patients not finding the right clinic has become increasingly prominent. Factors such as the need for professional diagnostic expertise and the information asymmetry between doctors and patients make it hard for patients to identify the appropriate clinic or doctor. Once a mistake is made, online consultation time is wasted, and costs rise for both hospitals and patients when the patient instead seeks treatment offline. To address these shortcomings of traditional medical department recommendation methods, this paper proposes an online prediagnosis doctor recommendation model that integrates ontology characteristics and disease text mining. The model gives relatively more accurate recommendations based on ontology characteristics such as patients' description texts and doctors' specialties. The experiments use primary real data from a comprehensive online medical consultation platform and compare the proposed method with content-based and collocate-based sentence similarity methods; the results verify the reliability and effectiveness of the proposed method. The model makes it more convenient for patients to seek medical treatment and at the same time reduces medical costs. As a result, it improves patients' online consultation experience and offline treatment convenience, enriching the value of online prediagnosis data.

13. Limitations

This paper is not without limitations. First, the study was carried out with data from only one online medical community, which calls its generalizability into question. Future studies may collect data from multiple online medical community platforms to verify the recommendation performance of the proposed algorithm. Second, because this study focuses solely on a recommendation model for Chinese patients, similar studies should be carried out in a Western context in the future. Third, because of the complexity of medical domain knowledge, follow-up research should not only incorporate techniques such as semantic analysis and sentiment analysis to expand the sample into general practice data but also consider introducing other information on users' behavior to optimize the target object. For intelligent department recommendation tasks, in addition to controlling data quality, deep learning algorithms such as LSTM should be applied to improve model accuracy. The intelligent department recommendation task can also be framed as a multilabel text classification task, so that multiple department categories can be recommended for patients' questions that span several departments. These steps are expected to further improve the accuracy of the proposed recommendation model and allow it to be applied to more online medical consultation platforms.

Algorithm 1. This module preprocesses the sample dataset. The aim is to segment words, remove stop words, and retain the key parts or key symptoms of patients' online condition descriptions.
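A minimal sketch of such a preprocessing step is given below. It is not the original module: it tokenizes English text with a regular expression so that it runs without third-party packages, whereas Chinese consultation text would need a dedicated word segmenter in place of `tokenize`. The `STOP_WORDS` set is a small hypothetical list.

```python
import re

# Hypothetical stop word list; a real system would load a much larger one.
STOP_WORDS = {
    "i", "a", "an", "the", "and", "or", "to", "of", "for", "from",
    "have", "has", "had", "my", "is", "am", "with", "been", "on",
}


def tokenize(text):
    """Split text into lowercase alphabetic tokens.

    Stand-in for word segmentation; Chinese text needs a real segmenter.
    """
    return re.findall(r"[a-z]+", text.lower())


def preprocess(text, stop_words=STOP_WORDS):
    """Segment the description and drop stop words, keeping symptom terms."""
    return [token for token in tokenize(text) if token not in stop_words]


print(preprocess("I have a red rash and itching"))
```

The surviving tokens are the symptom terms that the later similarity step compares.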

Algorithm 2. This module uses the word2vec library to train the dermatology word vector model on sample data such as "dermatology.xls."

Algorithm 3. This module has two main goals. First, it preprocesses the test data, including word segmentation and stop word removal, retaining the key parts or symptoms of the disease description. Second, it compares the word vectors of the test data with those of the training results, and the departments with high similarity are recommended to patients.
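The second goal, ranking departments by similarity, can be sketched as follows. This is an illustration under assumptions, not the original module: the two-dimensional vectors in `TOY_VECTORS` and the keyword lists in `DEPARTMENT_KEYWORDS` are invented, each department is represented by the centroid of its keyword vectors, and the query is assumed to be preprocessed already.

```python
import math

# Hypothetical 2-dimensional word vectors; real ones would come from word2vec.
TOY_VECTORS = {
    "rash": [0.90, 0.10],
    "itching": [0.80, 0.20],
    "redness": [0.85, 0.15],
    "thirst": [0.10, 0.90],
    "weight_loss": [0.20, 0.80],
    "high_blood_sugar": [0.15, 0.85],
}

# Hypothetical keywords standing in for each department's training data.
DEPARTMENT_KEYWORDS = {
    "dermatology": ["rash", "itching", "redness"],
    "endocrinology": ["thirst", "weight_loss", "high_blood_sugar"],
}


def centroid(tokens, vectors=TOY_VECTORS):
    """Average vector of the tokens that have an embedding, or None."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def recommend_departments(query_tokens, top_k=1):
    """Rank departments by cosine similarity to the query centroid."""
    query_vec = centroid(query_tokens)
    if query_vec is None:
        return []
    scored = [
        (dept, cosine(query_vec, centroid(keywords)))
        for dept, keywords in DEPARTMENT_KEYWORDS.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


ranking = recommend_departments(["rash", "itching"], top_k=2)
print(ranking)
```

Returning the top `top_k` departments, not only the best one, leaves room for questions that plausibly belong to more than one department.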

Acknowledgments

This project was funded by grants from the National Natural Science Foundation of China: Research on Consumer Credit Value Measurement Integrating Online Social Relationships in eCommerce (71571162). The data were collected with help from the administrator of the WeiYi platform.

Data Availability

The data were collected with assistance from the administrator of the WeiYi platform. Due to third-party rights, patient privacy, and commercial confidentiality, the data are not open source.

Ethical Approval

The data in this paper are divided into two parts. One part is the information crawled from the platform, such as patient comments and doctor profiles. This kind of information is open to the public, and anyone can use computer technology to obtain it from the platform. The other part is the patient's age, gender, geographical location, and other information provided by the WeiYi platform. The WeiYi platform is one of the hundreds of online medical platforms in China, with tens of thousands of registered hospitals, registered doctors, and hundreds of thousands of patients using the platform. The platform itself has a sound risk control system, and we have also signed a confidentiality agreement with the platform to define the scope of data use.

Disclosure

The paper was published in a reduced version at the IEEE 6th International Conference on Big Data Analysis (ICBDA) in 2021.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors' Contributions

SZ and CJ refined the topics and methods at the initial stage of paper writing. Then, SZ conducted the statistical analysis and wrote the paper under the guidance of CJ. Both authors reviewed, revised, and approved the final draft.

References

1. Balarajan Y., Selvaraj S., Subramanian S. Health care and equity in Bharat[J]Health care and disinterestedness in India. The Lancet. 2011;377, article 9764:505–515. doi: 10.1016/S0140-6736(10)61894-6. [PMC free article] [PubMed] [CrossRef] [Google Scholar]

2. Goh J. Yard., Gao G., Agarwal R. The creation of social value: Can can an online health community reduce rural-urban health disparities? MIS Quarterly. 2016;forty(ane):247–263. doi: 10.25300/misq/2016/40.1.11. [CrossRef] [Google Scholar]

3. Pan J., Shallcross D. Geographic distribution of hospital beds throughout China: a county-level econometric analysis. International Periodical for Equity in Health. 2016;15(1):p. 179. doi: 10.1186/s12939-016-0467-9. [PMC free article] [PubMed] [CrossRef] [Google Scholar]

v. Bo H. Pattern and realization of AISCP guiding system built in knowlege base. SUZHOU:SoochowUniversity; 2006. [CrossRef] [Google Scholar]

6. Ru H. In: The blueprint and implementation of the guidance system based on the reasoning algorithm. FEI H. E., editor. Anhui Academy; 2016. [CrossRef] [Google Scholar]

8. Ju C., Zhang S. Inquiry on md recommendation model for Pre-Diagnosis online based on Large data Mining. 2021 IEEE 6th International Conference on Big Data Analysis (ICBDA 2021); 2021. [Google Scholar]

nine. Chuan-Peng C., Zhi-Gang West. A method of sentence similarity computing based on Hownet. Calculator Applied science and Science. 2012;34(two):172–175. doi: 10.3969/j.issn.1007-130X.2012.02.031. [CrossRef] [Google Scholar]

10. Yifeng X., Lijun L., Qingsong H., Tiewei F. Research on TF-IDF weight improvement algorithm in intelligent guidance system. Computer Applied science and Applications. 2017;53(four):238–243. doi: 10.3778/j.issn.1002-8331.1506-0258. [CrossRef] [Google Scholar]

eleven. Hai-Ling X., Xiao West., Xiao-Dong L., Yan B.-P. Comparison written report of internet recommendation organisation. Journal of Software. 2009;20(ii):350–362. doi: 10.3724/SP.J.1001.2009.03388. [CrossRef] [Google Scholar]

12. Huang C.-G., Yin J., Wang J., Liu Y.-B., Wang J.-H. Uncertain Neighbors'Collaborative filtering recommendation algorithm. Chinese Periodical of Computers. 2010;33(8):1369–1377. doi: ten.3724/SP.J.1016.2010.01369. [CrossRef] [Google Scholar]

thirteen. Liang Z., Na Z. Improved collaborative filtering algorithm. Computer Systems & Applications. 2016;25(7):147–150. doi: 10.15888/j.cnki.csa.005224. [CrossRef] [Google Scholar]

14. Mingming J. Contain Topic Model into Collaborative Filtering. Beijing: Beijing Insititute of Technology; 2016. [Google Scholar]

15. Wu Y., Rui T., Ling L. News recommendation method by fusion of content-based recommendation and collaborative filtering. Periodical of Computer Applications. 2016;36(2):414–418. doi: 10.11772/j.issn.1001-9081.2016.02.0414. [CrossRef] [Google Scholar]

sixteen. López-Nores Thousand., Blanco-Fernández Y., Pazos-Arias J. J., Gil-Solla A. Property-based collaborative filtering for health-aware recommender systems. Skilful Systems with Applications. 2012;39(8):7451–7457. doi: x.1016/j.eswa.2012.01.112. [CrossRef] [Google Scholar]

17. Surong Y., Xiaoqing F., Yixing Fifty. Matrix factorization based social recommender model. Journal of Tsinghua University(Science and Engineering science) 2016;56(vii):793–800. doi: 10.16511/j.cnki.qhdxxb.2016.21.045. [CrossRef] [Google Scholar]

18. Bing F., Xiaoting N. Tag-based matrix factorization recommendation algorithm. Application Research of Computers. 2017;34(4):1021–1025. doi: 10.3969/j.issn.1001-3695.2017.04.015. [CrossRef] [Google Scholar]

nineteen. Huang Z. X., Lu Ten. D., Duan H. L., Zhao C. Collaboration-based medical knowledge recommendation. Bogus Intelligence in Medicine. 2012;55(1):13–24. doi: 10.1016/j.artmed.2011.ten.002. [PubMed] [CrossRef] [Google Scholar]

20. Kim J. H., Lee D. Southward., Chung One thousand. Y. Item recommendation based on context-enlightened model for personalized u-healthcare service. Multimedia Tools and Applications. 2014;71(2):855–872. doi: 10.1007/s11042-011-0920-0. [CrossRef] [Google Scholar]

21. Deshpande M., Karypis 1000. Item-based summit-N recom- mendation algorithms. ACM Transactions on Informa- tion Systems. 2014;22(1):143–177. [Google Scholar]

23. Hu B. South., Feng D., Cao Due west. C., LQ F., JH 1000. Mobile intelligent illness diagnosis organization based on Bayesian analysis. Journal of Computer Applications. 2008;28(6):fifteen–17. [Google Scholar]

24. Shoukun X., Weiwei W. Balance recommendation algorthm for medical resources based on semantic. Computer Applied science. 2015;41(ix):74–79. doi: x.3969/j.issn.one thousand-3428.2015.09.013. [CrossRef] [Google Scholar]

25. Yan Z., Shiyao 50., Tin Z. An improved recommendation algorithm for mobile health intendance organisation. Journal of Academy of Chinese Academy of Sciences. 2016;34(1):112–118. doi: 10.7523/j.issn.2095-6134.2017.01.015. [CrossRef] [Google Scholar]

26. Jiang G. G., Song D. Yard., Liao L. J., Zhu F. A Bayesian rec-ommender model for user rating and review profiling. Tsnghua Science and Technology. 2015;20(6):634–643. doi: 10.1109/TST.2015.7350016. [CrossRef] [Google Scholar]

28. Xiang-Wu Thousand., Shu-Dong L., Yu-Jie Z., Xun H. Inquiry on social recommender systems. Periodical of Software. 2015;26(6):1356–1372. doi: ten.13328/j.cnki.jos.004831. [CrossRef] [Google Scholar]

thirty. Yang W., Yong Z., Zhendong L., Guanci Y. Rating prediction algorithm based on semantic similarity and matrix factorization. Journal of Computer Applications. 2017;37(Supplement one):287–291. [Google Scholar]

31. Xiaoyu F., Yongxiang D., Pengwei Z., Xiao Z. Report for the construction method of scientist profile with multi source data fusion. Library and Information Service. 2018;62(15):31–xl. doi: ten.13266/j.issn.0252-3116.2018.15.004. [CrossRef] [Google Scholar]

32. Qiu X., Zhang Q., Huang X. Fudan NLP:a toolkit for Chinese natural language processing. Proceedings of the meeting of the Association for Computational Linguistics: system demonstrations; 2013; Sofia: the Association for Computational Linguistics. pp. 49–54. [Google Scholar]

33. Zhou L., Zhang D. NLPIR: A theoretical framework for applying natural language processing to information retrieval. Periodical of the American Guild for Computer science and Engineering science. 2003;54(2):115–123. doi: ten.1002/asi.10193. [CrossRef] [Google Scholar]

34. Ting 50., Wanxiang C., Zhenghua 50. Linguistic communication technology platform. Periodical of Chinese Information Processing. 2011;25(six):53–62. doi: ten.3969/j.issn.1003-0077.2011.06.008. [CrossRef] [Google Scholar]

35. Wang Yard. User information extraction and analysis big data environment. Beijing: Beijing University of Posts and Telecommunications; 2018. [Google Scholar]

Articles from BioMed Research International are provided here courtesy of Hindawi Limited


Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8379386/