A leakage-aware entropy screening protocol for structured biomarker panel evaluation in ovarian cancer risk modelling.
Authors
P DLK, P M
Affiliations
- IT, Puducherry Technological University, Puducherry, India.
Abstract
Entropy-based feature screening is increasingly used in biomedical risk modeling, but the underlying covariance statistics are commonly calculated on the entire dataset before validation. This practice can introduce a small amount of information leakage and create a spurious impression of panel synergy, especially when administrative or post-outcome variables are present. This article presents a leakage-aware entropy screening protocol for structured biomarker panel evaluation. The workflow integrates (i) systematic auditing of administrative variables, (ii) fold-restricted von Neumann entropy estimation derived from covariance-based density matrices, and (iii) cross-validated predictive benchmarking within a unified implementation framework. Entropy computation is strictly confined to the training partitions of a stratified cross-validation, and panel definitions are fixed prior to analysis to avoid grouping bias. The protocol is demonstrated on a structured ovarian cancer dataset from the PLCO Cancer Screening Trial to illustrate leakage-controlled panel-level entropy estimation and validation-consistent benchmarking. The workflow provides an auditable and reproducible framework for multivariate panel screening and is adaptable to other structured biomedical datasets that require grouped feature evaluation.
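The fold-restricted entropy step (ii) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the density matrix is the panel covariance matrix divided by its trace, uses the natural logarithm, and runs on synthetic data. The function names (`von_neumann_entropy`, `stratified_folds`, `fold_restricted_entropy`), the hand-written stratified split, and the panel column indices are all hypothetical choices made for illustration.

```python
import numpy as np


def von_neumann_entropy(X):
    """Entropy of the trace-normalised covariance of X (rows are samples).

    Assumption: rho = cov / trace(cov), S = -sum(lambda * ln(lambda)).
    """
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    rho = cov / np.trace(cov)            # symmetric, PSD, trace 1
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > 1e-12]   # treat 0 * ln(0) as 0
    return float(-np.sum(eigvals * np.log(eigvals)))


def stratified_folds(y, n_splits, seed=0):
    """Assign each sample a fold id while keeping class proportions similar."""
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))
        fold_of[idx] = np.arange(len(idx)) % n_splits
    return fold_of


def fold_restricted_entropy(X, y, panel_cols, n_splits=5, seed=0):
    """Compute panel entropy on the training partition of each fold only."""
    fold_of = stratified_folds(y, n_splits, seed)
    entropies = []
    for k in range(n_splits):
        train_mask = fold_of != k        # held-out fold k is never touched
        X_train_panel = X[train_mask][:, panel_cols]
        entropies.append(von_neumann_entropy(X_train_panel))
    return np.array(entropies)


# Synthetic stand-in for a structured dataset (not PLCO data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (rng.random(200) < 0.3).astype(int)
panel = [0, 1, 2]                        # hypothetical panel, fixed in advance

fold_entropies = fold_restricted_entropy(X, y, panel)

# Sanity check: two perfectly collinear markers give a rank-1 density matrix.
a = rng.normal(size=100)
collinear_entropy = von_neumann_entropy(np.column_stack([a, 2.0 * a]))
```

Under these assumptions the entropy of a d-marker panel lies between 0 (all variance on one direction) and ln(d) (variance spread evenly), so the per-fold values can be compared across panels of equal size. Because the covariance is not rescaled here, markers with large variance dominate the result; whether and how to standardise within each training partition is a design choice the sketch leaves open.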
Publication Details
PubMed ID
41937921
Digital Object Identifier
10.1016/j.mex.2026.103880
Publication
MethodsX. 2026 Jun; Volume 16: Pages 103880
- PLCO-1651: Ovarian Cancer Classification using machine learning approaches (DSS Laskshmi Kumari P, 2024)