Radiomics Quality Score (RQS) 2.0 with Radiomics Readiness Levels (RRLs)

Understanding RQS 2.0 and Radiomics Readiness Levels

The Radiomics Readiness Levels (RRLs) framework is embedded within RQS 2.0 to establish a structured, step-by-step approach to radiomics research. Figure 1 outlines the journey from early-stage exploration to full clinical deployment, highlighting key milestones that enhance scientific rigor.

To begin, select the maximum RRL to evaluate and choose whether to assess handcrafted radiomics, deep learning, or both. Click the "Start" button to proceed.

Radiomics Quality Score Workflow Based on Radiomics Readiness Levels

RRL 1 – Foundational Exploration

No. Criteria Options
1 Unmet Clinical Need – Unmet clinical need (UCN) defined. ● UCN is agreed upon and defined by more than one center. ● UCN is defined using an established consensus method such as the Delphi method.
2 Hardware Description – Detailed description of the imaging hardware used, including model, manufacturer, and technical specifications.
3 Image Protocol Quality – Five levels of image protocol quality for TRIAC: ● Level 0: Protocol not formally approved. ● Level 1: Approved with a reference number in the institutional archive. ● Level 2: Approved with formal quality assurance (recommended minimum for prospective trials). ● Level 3: Established internationally; published in guidelines and peer-reviewed papers. ● Level 4: Future-proof (follows TRIAC Level 3, FAIR principles, retains raw data).
4 Inclusion and Exclusion Criteria – Detailed criteria for patient selection in studies, including rationale.
5 Diversity and Distribution – Identify potential biases before the project starts (demographic, socioeconomic, geographic, and medical profiles); a minimal distribution-audit sketch follows this table.
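
For criterion 5, a simple pre-study audit can quantify how the candidate cohort differs from the intended target population. The sketch below is a minimal Python example, assuming hypothetical pandas DataFrames cohort and target_population with sex and age columns; the tests and variable names are illustrative, not prescribed by RQS 2.0.

```python
# Minimal cohort-bias audit sketch (criterion 5).
# `cohort` and `target_population` are hypothetical pandas DataFrames
# with a categorical 'sex' column and a numeric 'age' column.
import pandas as pd
from scipy.stats import chi2_contingency, ks_2samp

def audit_distributions(cohort: pd.DataFrame, target_population: pd.DataFrame) -> None:
    # Categorical variable: chi-square test on the sex distribution.
    counts = pd.DataFrame({
        "cohort": cohort["sex"].value_counts(),
        "target": target_population["sex"].value_counts(),
    }).fillna(0)
    chi2, p_sex, _, _ = chi2_contingency(counts.T.values)
    print(f"Sex distribution: chi2={chi2:.2f}, p={p_sex:.3f}")

    # Continuous variable: Kolmogorov-Smirnov test on age.
    ks_stat, p_age = ks_2samp(cohort["age"], target_population["age"])
    print(f"Age distribution: KS={ks_stat:.2f}, p={p_age:.3f}")
```

Large or significant differences found here would be reported as potential biases under criterion 5 and revisited in the fairness evaluation at RRL 6.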

RRL 2 – Data Preparation

No. Criteria Options
6 Feature Robustness – Assess robustness via: 1. Imaging at multiple time points (test–retest). 2. Multiple segmentations (different physicians/algorithms/noise/perturbations). 3. Phantom study (identify inter-scanner/vendor differences). A minimal test–retest ICC sketch follows this table.
7 Preprocessing of Images – Apply steps to standardize images with clear reasoning.
8 Harmonization – Use image-level (e.g., CycleGANs) or feature-level (e.g., ComBat) harmonization techniques.
9 Compliance with International Standards – Use implementations that adhere to standards (e.g., IBSI) for radiomic feature extraction.
10 Automatic Segmentation – Use an automated segmentation algorithm for ROI definition.
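
Criterion 6's test–retest option can be screened programmatically. The following is a minimal sketch, assuming a hypothetical long-format DataFrame features with patient, session, feature, and value columns, and using the pingouin package to compute the intraclass correlation coefficient (ICC); the 0.9 cut-off is a commonly used but study-specific choice.

```python
# Test-retest robustness sketch (criterion 6): keep only features whose
# ICC across the two scan sessions exceeds a chosen threshold.
# `features` is a hypothetical long-format pandas DataFrame with columns
# 'patient', 'session' ('test'/'retest'), 'feature', and 'value'.
import pandas as pd
import pingouin as pg

def robust_features(features: pd.DataFrame, icc_threshold: float = 0.9) -> list[str]:
    robust = []
    for name, grp in features.groupby("feature"):
        icc = pg.intraclass_corr(
            data=grp, targets="patient", raters="session", ratings="value"
        )
        # ICC2 corresponds to a two-way random-effects, absolute-agreement model.
        icc2 = icc.loc[icc["Type"] == "ICC2", "ICC"].iloc[0]
        if icc2 >= icc_threshold:
            robust.append(name)
    return robust
```

Features failing the threshold would typically be excluded before the feature selection addressed in RRL 3 (see criterion 12).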

RRL 3 – Prototype Model Development

No. Criteria Options
11 Feature Reduction – Reduce features to lower the risk of overfitting (especially when features outnumber samples; check for correlations with volume); a feature-reduction sketch follows this table.
12 Feature Robustness for Feature Selection – Integrate robustness evaluation into feature selection using prior test–retest, phantom, or segmentation studies.
13 HCR + DL Combination – Compare and explore the synergistic combination of handcrafted radiomics and deep learning models.
14 Multivariable Analysis – Incorporate non‑radiomics features (clinical, genomic, proteomic) to yield a holistic model.
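
For criterion 11, the chain below illustrates one common reduction pattern: remove volume-confounded features, prune mutually correlated ones, and let an L1-penalised model select the remainder. It is a sketch only; X, volume, and y are hypothetical inputs and all thresholds are illustrative.

```python
# Feature-reduction sketch (criterion 11): drop volume-confounded and
# redundant features, then let an L1-penalised model pick the rest.
# `X` (radiomic features), `volume`, and `y` are hypothetical inputs.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

def reduce_features(X: pd.DataFrame, volume: pd.Series, y: pd.Series) -> list[str]:
    # 1. Remove features that mostly re-encode lesion volume.
    keep = [c for c in X.columns if abs(X[c].corr(volume)) < 0.8]
    X = X[keep]

    # 2. Remove pairwise-redundant features (|r| >= 0.95; keep the first of each pair).
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    X = X.drop(columns=[c for c in upper.columns if (upper[c] >= 0.95).any()])

    # 3. An L1-penalised logistic regression keeps a sparse subset.
    model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    model.fit(StandardScaler().fit_transform(X), y)
    return list(X.columns[model.coef_.ravel() != 0])
```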

RRL 4 – Internal Validation

No. Criteria Options
15 Single Center Validation – Validation performed on data from the same institute without retraining or adapting the cut-off value.
16 Cut-off Analyses – Identify optimal thresholds (e.g., using Youden’s Index) for classification or survival analysis; a cut-off and discrimination sketch follows this table.
17 Discrimination Statistics – Report discrimination metrics (e.g., ROC curve, sensitivity, specificity) with significance (p-values, CIs). ● Statistic reported ● Statistic reported with a resampling method
18 Calibration Statistics – Report calibration metrics (e.g., calibration-in-the-large, slope, plots).
19 Failure Mode Analysis – Document model limitations with examples of edge cases.
20 Open Science and Data – Make code and data publicly available. ● Open scans (+1) ● Open segmentations (+1) ● Open code (+1)
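
Criteria 16 and 17 can be addressed with a few lines of NumPy and scikit-learn. The sketch below derives a cut-off from Youden’s index and attaches a bootstrap confidence interval to the AUC; y_true and y_score are hypothetical arrays, and the number of bootstrap resamples is arbitrary.

```python
# Cut-off and discrimination sketch (criteria 16-17): Youden's index on the
# ROC curve plus a bootstrap confidence interval for the AUC.
# `y_true` (0/1 labels) and `y_score` (model outputs) are hypothetical arrays.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def youden_cutoff(y_true: np.ndarray, y_score: np.ndarray) -> float:
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    return float(thresholds[np.argmax(tpr - fpr)])  # J = sensitivity + specificity - 1

def bootstrap_auc_ci(y_true: np.ndarray, y_score: np.ndarray,
                     n_boot: int = 2000, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:  # a resample must contain both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 97.5])  # 95% bootstrap CI
```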

RRL 5 – Capability Testing

No. Criteria Options
21 Multi‑center Validation – Validation with data from multiple institutes, ensuring no overlap with the training data: ● One external institute ● Two or more external institutes ● Third‑party platform with completely unseen data
22 Comparison with ‘Current Clinical Standard’ – Assess model agreement or superiority versus the current gold standard (e.g., TNM staging).
23 Comparison to Previous Work – Compare performance with published HCR signatures or DL algorithms.
24 Potential Clinical Utility – Report on the current and potential clinical application (e.g., decision curve analysis); a net-benefit sketch follows this table.
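
Criterion 24 mentions decision curve analysis; the sketch below computes the underlying net-benefit quantities directly. y_true and y_prob are hypothetical inputs, and the threshold grid should be chosen to reflect clinically plausible decision thresholds.

```python
# Decision-curve sketch (criterion 24): net benefit of the model versus
# treat-all and treat-none strategies across threshold probabilities.
# `y_true` (0/1 outcomes) and `y_prob` (predicted probabilities) are hypothetical.
import numpy as np

def net_benefit(y_true: np.ndarray, y_prob: np.ndarray, thresholds: np.ndarray):
    n = len(y_true)
    prevalence = y_true.mean()
    rows = []
    for pt in thresholds:
        pred = y_prob >= pt
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        nb_model = tp / n - fp / n * pt / (1 - pt)
        nb_all = prevalence - (1 - prevalence) * pt / (1 - pt)
        rows.append((pt, nb_model, nb_all, 0.0))  # 0.0 = treat-none strategy
    return rows

# Example call: net_benefit(y_true, y_prob, np.arange(0.05, 0.95, 0.05))
```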

RRL 6 – Trustworthiness Assessment

No. Criteria Options
25 Explainability – Apply explainability tools (e.g., SHAP for HCR, GradCAM for DL) to clarify model predictions; a SHAP sketch follows this table.
26 Explainability Evaluation – Conduct qualitative and quantitative evaluations of interpretability methods (e.g., checking consistency under adversarial perturbations).
27 Biological Correlates – Detect and discuss biological correlates to deepen understanding of radiomics and underlying biology.
28 Fairness Evaluation and Mitigation – Evaluate model performance for biases and apply bias correction if needed. ● Fairness evaluated ● Bias correction applied
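
For criterion 25, the snippet below shows the kind of SHAP call that produces a global explanation of a handcrafted-radiomics classifier; clf and X_val are hypothetical (a fitted tree-based model and a validation feature table). For deep learning models, GradCAM-style saliency maps play the analogous role.

```python
# Explainability sketch (criterion 25): SHAP summary for a handcrafted-radiomics model.
import shap

def explain_model(clf, X_val):
    # `clf`: hypothetical fitted tree-based classifier; `X_val`: validation feature DataFrame.
    explainer = shap.TreeExplainer(clf)          # use shap.Explainer for non-tree models
    shap_values = explainer.shap_values(X_val)   # per-feature contribution for each case
    shap.summary_plot(shap_values, X_val)        # global importance / direction overview
    return shap_values
```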

RRL 7 – Prospective Validity

No. Criteria Options
29 Usability for Clinicians – Evaluate the tool’s usability, interface, and workflow integration.
30 Sample Size Calculation – Ensure statistical validity by calculating the appropriate sample size before prospective validation; a sample-size sketch follows this table.
31 Clinical Trial Pre-registration – Register the prospective trial (including its statistical plan) on a clinical trial database (e.g., ClinicalTrials.gov).
32 Prospective Validation – Carry out prospective (or in silico) validation to ensure clinical validity of the biomarker.
33 Real‑World Clinical Assessment – Conduct human‑in‑the‑loop assessments to evaluate the practical impact of the radiomics model.
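
Criterion 30 can be supported by a closed-form calculation. The sketch below estimates how many patients are needed to estimate sensitivity within a chosen confidence-interval half-width, scaled by the expected prevalence (a Buderer-style calculation); all target values in the example call are illustrative assumptions, not recommendations.

```python
# Sample-size sketch (criterion 30): patients needed to estimate sensitivity
# with a given confidence-interval half-width.
import math
from scipy.stats import norm

def n_for_sensitivity(expected_se: float, half_width: float,
                      prevalence: float, alpha: float = 0.05) -> int:
    z = norm.ppf(1 - alpha / 2)
    n_diseased = (z ** 2) * expected_se * (1 - expected_se) / half_width ** 2
    return math.ceil(n_diseased / prevalence)  # scale by prevalence to get total n

# Example (illustrative values): n_for_sensitivity(expected_se=0.85, half_width=0.05, prevalence=0.3)
```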

RRL 8 – Applicability and Sustainability

No. Criteria Options
34 Software Traceability – Implement and document a robust traceability process detailing development, changes, and version control.
35 Software Safeguards – Implement checks to prevent out‑of‑scope use or unreliable inputs.
36 Cost‑effectiveness Analysis – Report on the cost‑effectiveness of the clinical application (e.g., QALYs generated).
37 Performance Drift – Define a strategy to re-evaluate model performance periodically, since data shifts can degrade it over time; a drift-monitoring sketch follows this table.
38 Continuous Learning – Define a strategy for continuous learning and improvement over time.
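
One way to operationalise criterion 37 is to monitor how far the distribution of model outputs drifts from what was seen at development time. The sketch below computes the Population Stability Index (PSI); baseline_scores and recent_scores are hypothetical arrays, and the 0.2 alert level mentioned in the comment is a rule of thumb, not a standard.

```python
# Performance-drift sketch (criterion 37): Population Stability Index (PSI)
# between the development-time score distribution and a recent production batch.
import numpy as np

def psi(baseline_scores: np.ndarray, recent_scores: np.ndarray, bins: int = 10) -> float:
    edges = np.quantile(baseline_scores, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf            # cover out-of-range scores
    base = np.histogram(baseline_scores, edges)[0] / len(baseline_scores)
    recent = np.histogram(recent_scores, edges)[0] / len(recent_scores)
    base, recent = np.clip(base, 1e-6, None), np.clip(recent, 1e-6, None)
    return float(np.sum((recent - base) * np.log(recent / base)))

# A PSI above roughly 0.2 is often treated as a trigger for re-validation.
```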

RRL 9 – Clinical Deployment

No. Criteria Options
39 Define the Level of Automation in Clinical Practice – ● Level 0: No Automation (clinician performs the task). ● Level 1: Clinical Assistance (model prediction assists). ● Level 2: Partial Automation (model prediction considered before final recommendation). ● Level 3: Conditional Automation (model provides predictions under supervision). ● Level 4: High Automation (predictions provided; clinician intervenes in special cases). ● Level 5: Full Automation (predictions provided without human intervention).
40 Quality Management System – Implement and maintain a QMS (e.g., ISO 9001) to ensure consistent quality and compliance.
41 Regulatory Requirements – Evaluate alignment with the chosen regulatory pathway (e.g., FDA 510(k), PMA, EMA, European AI Act).
42 Product on the Market – Successfully introduce the radiomics product to the market ensuring regulatory approval and clinical adoption.
Total Score: [points satisfied] out of [points applicable] ([%])

[Chart: Evolution of Score by RRL Level]

[Chart: Points Satisfied per Stage]