Article Details

An Analysis of PCOS Disease Prediction Model Using Machine Learning Classification Algorithms


Shivani Aggarwal* and Kavita Pandey
Pages 1-11 (11)


Background: Polycystic ovary syndrome (PCOS) affects up to 18% of women of reproductive age and is the most commonly occurring hormone-related disorder. Its symptoms include irregular periods, increased facial and body hair growth, weight gain, darkening of the skin, diabetes, and trouble conceiving (infertility). Patients with PCOS also present a range of metabolic abnormalities, which increase the risk of insulin resistance, type 2 diabetes, and impaired glucose tolerance (a sign of prediabetes). Family members of women with PCOS are also at higher risk of developing the same metabolic abnormalities. Obesity and overweight status contribute to insulin resistance in PCOS.

Objective: Several new technologies are now available to help diagnose PCOS, among them machine learning algorithms, which can adapt as they are exposed to new data. These algorithms learn from past data to produce reliable and repeatable decisions. In this article, machine learning algorithms are used to identify the features that are important for diagnosing PCOS.

Methods: Several classification algorithms, namely Support Vector Machine (SVM), Logistic Regression, Gradient Boosting, Random Forest, Decision Tree, and K-Nearest Neighbor (KNN), use well-organized test datasets to classify large numbers of records. A dataset of 541 instances and 41 attributes was taken as the starting point for the prediction models, and manual feature selection was performed on it.

Results: After feature selection, a set of 12 attributes was identified that plays a crucial role in diagnosing PCOS.
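
The workflow summarized above (reduce 41 attributes to 12, then compare six classifiers) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses synthetic data that only matches the stated dataset shape (541 x 41), and substitutes an automated univariate filter (SelectKBest) for the manual feature selection described in the abstract. The hyperparameters, the 5-fold cross-validation, and the name `evaluate` are likewise illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with the same shape as the dataset in the abstract
# (541 instances, 41 attributes). It is NOT the PCOS data.
X, y = make_classification(
    n_samples=541, n_features=41, n_informative=12, n_redundant=5, random_state=0
)

# The six classifier families named in the abstract; settings are assumptions.
classifiers = {
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}


def evaluate(clf, k_features=12):
    """Mean 5-fold accuracy of clf after scaling and keeping k_features attributes.

    Assumption: SelectKBest replaces the paper's manual feature selection.
    """
    pipe = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=k_features), clf)
    return cross_val_score(pipe, X, y, cv=5, scoring="accuracy").mean()


scores = {name: evaluate(clf) for name, clf in classifiers.items()}
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {acc:.3f}")
```

Placing the selector inside the pipeline means the 12 attributes are re-chosen within each training fold, so the held-out fold does not influence which features are kept. Accuracies obtained on the synthetic data say nothing about performance on real PCOS records.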

Conclusion: Considerable research is in progress on diagnosing PCOS, but the features relevant to diagnosis have not so far been identified.


Keywords: Polycystic Ovary Syndrome (PCOS), Machine learning algorithms, Random Forest, Decision Tree, Gradient Boosting, K-Nearest Neighbor, Logistic Regression, Support Vector Machine, Feature Selection.


Affiliation: Department of Computer Science, Jaypee Institute of Information Technology, Noida
