Principal Investigator
Laboni Akter
Khulna University of Engineering and Technology
Position Title
Postgraduate Student
About this CDAS Project
PLCO
Project ID
Initial CDAS Request Approval
May 12, 2020
Early Prediction of Ovarian Cancer from Images along with Some Blood Biomarkers Using Machine Learning
Over the last few decades, a great deal of research has addressed the detection of ovarian cancer symptoms. Ovarian cancer is regarded as a silent killer because a large share of patients are diagnosed at a late stage, with no obvious symptoms earlier on; the disease must therefore be diagnosed at an early stage and treated on the basis of its symptoms. Ovarian tumors are categorized by various classification methods to guide treatment. The diagnosis of complex diseases such as tumors has been based on non-molecular characteristics such as tumor tissue, clinical stage, and pathological characteristics. In America and across the world, 27% of women die owing to unawareness of this disease. Timely diagnosis saves patients' lives. Ovarian cancer is the ninth most common cancer in women and ranks fifth as a cause of death among women. The main cause of the rising death rate among women is undiagnosed ovarian cancer, because it may advance to stage 3 or stage 4. To improve the situation, considerable effort has gone into early detection of ovarian cancer, because diagnosis at an early stage leads to a high survival rate. Ovarian cancer is the most deadly gynecologic malignancy; its low survival rate is the main cause of early death among patients. For example, the biomarker cancer antigen 125 (CA125), which is used for early detection of ovarian cancer, detects 50-60% of women with stage 1 ovarian cancer. Complex biometric samples of the disease can be diagnosed specifically through biomarkers. In recent years, as technology has advanced, mass spectrometry-based technologies have been used to detect cancers. This project aims to detect ovarian cancer at an early stage so that the death rate among women may decrease.
Detection and identification of ovarian cancer are performed through the selection of images. The proposed project helps detect the disease at different stages by acquiring images. An optimization algorithm is employed to select and extract features from the MRI images using three operators: employed, onlooker, and scout bees. Classification and detection of ovarian cancer are performed with a multi-layer CNN, which operates on the selected and extracted features. The proposed work mainly emphasizes evaluating and comparing performance metrics so that image quality can be improved. The performance parameters include the peak signal-to-noise ratio (PSNR). Images are selected by extracting features to increase the accuracy rate and to reduce the error rate and the processing time.
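
As a rough illustration of the peak signal-to-noise ratio named above, the sketch below computes it for two small grayscale images stored as nested lists. The function name `psnr`, the 8-bit peak value of 255, and the toy pixel values are assumptions made for this example; they are not taken from the project.

```python
import math

def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    ref = [p for row in reference for p in row]
    out = [p for row in processed for p in row]
    if len(ref) != len(out) or not ref:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 2x2 image patches (assumed 8-bit intensities).
original = [[52, 55], [61, 59]]
denoised = [[54, 55], [60, 58]]
score = psnr(original, denoised)
print(f"PSNR: {score:.2f} dB")
```

A higher value means the processed image is closer to the reference; identical images give an infinite PSNR.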

* Ovarian Cancer Data Collection
The related test data can help determine whether ovarian cancer is present and whether it has spread to other organs.

* Data Preprocessing
Data preprocessing is a technique used to convert raw data into a clean data set. Data gathered from different sources arrives in a raw format that is not suitable for analysis. A machine learning project rarely starts with clean, formatted data, so before any operation the data must be cleaned and put into a consistent format. Data preprocessing is the task that does this.
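
A minimal sketch of one possible cleaning step, assuming missing measurements are marked as `None`: it fills gaps with the column median and rescales the column to the range [0, 1]. The helper `clean_column` and the CA125 readings are hypothetical and serve only as an illustration; the project does not specify its preprocessing steps.

```python
from statistics import median

def clean_column(values):
    """Fill missing entries (None) with the column median, then min-max scale to [0, 1]."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = median(observed)
    filled = [fill if v is None else float(v) for v in values]
    lo, hi = min(filled), max(filled)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in filled]
    return [(v - lo) / (hi - lo) for v in filled]

# Hypothetical CA125 readings with one missing measurement.
ca125_raw = [12.0, None, 35.0, 410.0, 20.0]
ca125_clean = clean_column(ca125_raw)
print(ca125_clean)
```

Median imputation is used here because it is less sensitive to extreme readings than the mean; other strategies may suit the real data better.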

* Feature Extraction from Data
Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing. Large data sets are characterized by a large number of variables that require a lot of computing resources to process. Feature extraction is the name for methods that select and/or combine variables into features, effectively reducing the amount of data that must be processed while still accurately and completely describing the original data set.

* Feature Selection from Data
Feature selection is different from dimensionality reduction. Both seek to reduce the number of attributes in the dataset, but dimensionality reduction methods do so by creating new combinations of attributes, whereas feature selection methods include or exclude attributes present in the data without changing them.
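
The contrast can be sketched in a few lines, under the assumption that a simple variance threshold stands in for selection and a fixed weighted sum stands in for extraction. The function names, column names, weights, and values are hypothetical.

```python
from statistics import pvariance

def select_features(rows, names, min_variance):
    """Feature selection: keep original columns, unchanged, whose variance exceeds a threshold."""
    columns = list(zip(*rows))
    kept = [i for i, col in enumerate(columns) if pvariance(col) > min_variance]
    return [names[i] for i in kept], [[row[i] for i in kept] for row in rows]

def extract_feature(rows, weights):
    """Feature extraction: build one new feature as a weighted combination of the columns."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]

names = ["marker_a", "marker_b", "constant"]
rows = [[1.0, 10.0, 5.0], [2.0, 14.0, 5.0], [3.0, 18.0, 5.0]]

# Selection drops the constant column and leaves the others untouched.
kept_names, kept_rows = select_features(rows, names, min_variance=0.0)
# Extraction replaces the columns with a new combined value per row.
combined = extract_feature(rows, weights=[0.5, 0.5, 0.0])
print(kept_names, combined)
```

Selection returns a subset of the original columns, while extraction returns values that did not exist in the input.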

* Learning Algorithm
For Classification:
SVM, Neural Network, KNN, Logistic Regression, Random Forest, etc.
For Regression:
Decision Tree, Linear Regression, Neural Network, SVR, Polynomial Regression, etc.
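
To make one of the listed classifiers concrete, here is a minimal k-nearest-neighbours (KNN) sketch in plain Python. The two-dimensional feature vectors, the labels, and the function name `knn_predict` are invented for illustration and do not come from the PLCO data.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points (Euclidean distance)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    by_distance = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled feature vectors with their class labels.
train_x = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 5.5), (6.5, 7.0)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

low_case = knn_predict(train_x, train_y, (1.2, 1.4))
high_case = knn_predict(train_x, train_y, (6.4, 6.1))
print(low_case, high_case)
```

KNN is distance-based, so features on very different scales would normally be rescaled first.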

* Model Training
Two model training styles are most common: supervised and unsupervised learning. The choice between them depends on whether you must forecast specific attributes or group data objects by similarities.
Supervised learning: This style processes data with target attributes, that is, labeled data.
Unsupervised learning: In this style, an algorithm analyzes unlabeled data. The goal of training is to find hidden interconnections between data objects and to structure objects by similarities or differences.

* Model Performance Evaluation
For Classification: Accuracy, Sensitivity, Specificity
For Regression: R², MSE, RMSE

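Accuracy = (TP + TN) / (TP + TN + FP + FN)
Sensitivity = TP / (TP + FN)
Specificity = TN / (TN + FP)
where TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.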
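
The evaluation measures can be sketched as follows, using the standard definitions: sensitivity is TP / (TP + FN), specificity is TN / (TN + FP), and R² is one minus the ratio of the residual to the total sum of squares. The function names and the label and value lists are assumptions made for this example.

```python
import math

def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
    }

def regression_metrics(y_true, y_pred):
    """MSE, RMSE and R² from paired true and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are equal")
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1.0 - ss_res / ss_tot}

# Hypothetical labels: 1 = cancer, 0 = no cancer.
cls = classification_metrics([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                             [1, 1, 1, 0, 0, 0, 0, 0, 1, 1])
# Hypothetical continuous targets and predictions.
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(cls)
print(reg)
```

For screening, sensitivity and specificity are usually read together, since accuracy alone can look high when most cases are negative.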