Abstract:
A key step in preparing data for breast cancer classification is selecting a subset of
informative genes from microarray gene expression data. This step matters because it identifies genes whose expression patterns can
distinguish between types or stages of breast cancer. Two effective algorithms, CONSISTENCY-BFS and CFS-BFS,
have been developed for gene selection. Both are designed to identify the genes that best discriminate
between types and stages of breast cancer by analysing large volumes of genetic data. A notable advance is a refined
2-Stage Gene Selection technique designed for predicting breast cancer subtypes. The first stage of the 2-Stage Gene
Selection (GeS) approach uses the CFS-BFS algorithm to eliminate irrelevant, noisy,
and redundant genes. This initial filtering simplifies the dataset and identifies the genes
most likely to be informative about the breast cancer class. In the second stage, the CONSISTENCY-BFS algorithm refines
the selection further so that only the most pertinent genes are retained. This stage removes remaining
uncertainty and improves the overall efficiency of the algorithm. This approach represents a significant advancement in
the field of bioinformatics as it offers a more accurate and targeted method for selecting genes based on their relevance to breast
cancer classification. When the 2-Stage GeS is combined with a Hidden Weight Naive Bayes classifier, it yields more precise and
dependable results, as measured by recall, accuracy, F-score, and fallout.
Kaplan-Meier survival analysis was used to further validate the top four genes: E2F3, PSMC3IP, GINS1, and PLAGL2.
E2F3 and GINS1 are proposed as likely targets for precision therapy.
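
The two-stage idea can be illustrated with a minimal sketch. It is not the implementation used in this work, and it makes several simplifying assumptions: absolute Pearson correlation stands in for the correlation measure inside the CFS merit, a greedy forward search replaces best-first search (BFS), the second stage is a simple backward elimination on the inconsistency rate, and the gene names and expression values are toy data invented for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def pearson(x, y):
    """Absolute Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation signal
    return abs(cov / sqrt(vx * vy))


def cfs_merit(columns, labels, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    # Mean gene-class correlation (relevance).
    r_cf = sum(pearson(columns[g], labels) for g in subset) / k
    # Mean gene-gene correlation (redundancy); zero for a single gene.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(pearson(columns[a], columns[b]) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Stage 1 stand-in: greedy forward search (not best-first) on CFS merit."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        score, gene = max(
            (cfs_merit(columns, labels, selected + [g]), g) for g in remaining
        )
        if score <= best:
            break  # no candidate improves the merit, so stop
        selected.append(gene)
        remaining.remove(gene)
        best = score
    return selected


def inconsistency_rate(columns, labels, subset):
    """Fraction of samples that disagree with the majority label of their pattern."""
    groups = defaultdict(Counter)
    for i, label in enumerate(labels):
        pattern = tuple(columns[g][i] for g in subset)
        groups[pattern][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(labels)


def refine_by_consistency(columns, labels, subset):
    """Stage 2 stand-in: drop genes whose removal does not raise inconsistency."""
    kept = list(subset)
    target = inconsistency_rate(columns, labels, kept)
    for gene in list(subset):
        trial = [g for g in kept if g != gene]
        if trial and inconsistency_rate(columns, labels, trial) <= target:
            kept = trial
    return kept


# Toy, already discretized expression data (hypothetical genes and values).
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "geneA": [0, 0, 0, 1, 1, 1],  # informative
    "geneB": [0, 0, 0, 1, 1, 1],  # redundant copy of geneA
    "geneC": [0, 1, 0, 1, 0, 1],  # weakly related to the class
}

stage1 = greedy_cfs(columns, labels)
stage2 = refine_by_consistency(columns, labels, stage1)
print("stage 1:", stage1)
print("stage 2:", stage2)
```

In this toy setting the merit does not improve when a redundant or weakly relevant gene is added, so the first stage keeps a single informative gene. The second stage then checks that the retained subset still separates the classes without conflicting patterns.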