Abstract: Extracting specific information from large volumes of text documents poses a tremendous challenge. Medical records tend to suffer from an information glut (masses of continuously increasing information), which delays the early detection of diseases. This research concentrates on modeling a domain-based technique for knowledge acquisition and text mining in a database. Term Frequency-Inverse Document Frequency (TF-IDF) was used to extract the text features of each entity; TF-IDF rescales the frequency of each term by considering how often the term appears across all the documents. A Random Forest Classifier (RFC) was used to model the classes of objects that associate symptoms with illnesses, to facilitate accurate prediction of likely diseases. The knowledge-base and text mining system calculates the likelihood of each disease from the given symptoms and displays the probability of each predicted disease.
Keywords: Domain Based, Knowledge Acquisition, Text Mining, TF-IDF, RFC.
DOI: 10.17148/IJARCCE.2021.10427
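
The TF-IDF rescaling mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common textbook weighting, tf(t, d) * log(N / df(t)); the abstract does not state which TF-IDF variant the authors used, so this is not necessarily their formula. The function name `tf_idf` and the symptom notes are hypothetical and invented for illustration only.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed weighting (textbook variant, not confirmed by the paper):
    weight = (count of term in doc / doc length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical symptom notes, invented for illustration only.
notes = [
    "fever cough fatigue",
    "fever headache nausea",
    "fever rash itching",
]
scores = tf_idf(notes)
```

In this toy input "fever" occurs in every note, so its weight is zero, while a term such as "cough" that occurs in only one note keeps a positive weight. This is the sense in which a term's frequency is rescaled by how often it appears across all documents. Feature vectors of this kind would then be passed to a classifier such as the Random Forest named in the abstract.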