Health Technology Assessment (Oct 2024)

MRI software and cognitive fusion biopsies in people with suspected prostate cancer: a systematic review, network meta-analysis and cost-effectiveness analysis

  • Alexis Llewellyn
  • Thai Han Phung
  • Marta O Soares
  • Lucy Shepherd
  • David Glynn
  • Melissa Harden
  • Ruth Walker
  • Ana Duarte
  • Sofia Dias

DOI
https://doi.org/10.3310/PLFG4210
Journal volume & issue
Vol. 28, no. 61

Abstract

Background
Magnetic resonance imaging localises cancer in the prostate, allowing for a targeted biopsy with or without transrectal ultrasound-guided systematic biopsy. Targeted biopsy methods include cognitive fusion, where prostate lesions suspicious on magnetic resonance imaging are targeted visually during live ultrasound, and software fusion, where computer software overlays the magnetic resonance image onto the ultrasound in real time. The effectiveness and cost-effectiveness of software fusion technologies compared with cognitive fusion biopsy are uncertain.
Objectives
To assess the clinical and cost-effectiveness of software fusion biopsy technologies in people with suspected localised and locally advanced prostate cancer.
Methods
A systematic review was conducted to evaluate the diagnostic accuracy, clinical efficacy and practical implementation of nine software fusion devices compared with cognitive fusion biopsies, and with each other, in people with suspected prostate cancer. Comprehensive searches including MEDLINE and Embase were conducted up to August 2022 to identify studies that compared software fusion and cognitive fusion biopsies in people with suspected prostate cancer. Risk of bias was assessed with the quality assessment of diagnostic accuracy studies-comparative tool. A network meta-analysis comparing software and cognitive fusion with or without concomitant systematic biopsy, and systematic biopsy alone, was conducted. Additional outcomes, including safety and usability, were synthesised narratively. A de novo decision model was developed to estimate the cost-effectiveness of targeted software fusion biopsy relative to cognitive fusion biopsy with or without concomitant systematic biopsy for prostate cancer identification in biopsy-naive people. Scenario analyses were undertaken to explore the robustness of the results to variation in the model data sources and alternative assumptions.
Results
Twenty-three studies (3773 patients with software fusion, 2154 cognitive fusion) were included, of which 13 informed the main meta-analyses. Evidence was available for seven of the nine fusion devices specified in the protocol and was at high risk of bias. The meta-analyses show that patients undergoing software fusion biopsy may have: (1) a lower probability of being classified as not having cancer, (2) similar probability of being classified as having non-clinically significant cancer (International Society of Urological Pathology grade 1) and (3) higher probability of being classified at higher International Society of Urological Pathology grades, particularly International Society of Urological Pathology 2. Similar results were obtained when comparing the same biopsy methods where both were combined with systematic biopsy. Evidence was insufficient to conclude whether any individual devices were superior to cognitive fusion, or whether some software fusion technologies were superior to others. Uncertainty in the relative diagnostic accuracy of software fusion versus cognitive fusion reduces the strength of any statements on its cost-effectiveness. The economic analysis suggests incremental cost-effectiveness ratios for software fusion biopsy versus cognitive fusion are within the bounds of cost-effectiveness (£1826 and £5623 per additional quality-adjusted life-year with or without concomitant systematic biopsy, respectively), but this finding needs cautious interpretation.
Limitations
There was insufficient evidence to explore the impact of effect modifiers.
Conclusions
Software fusion biopsies may be associated with increased cancer detection in relation to cognitive fusion biopsies, but the evidence is at high risk of bias. Sufficiently powered, high-quality studies are required. Cost-effectiveness results should be interpreted with caution given the limitations of the diagnostic accuracy evidence.
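Illustrative note on the incremental cost-effectiveness ratios quoted in the Results: an incremental cost-effectiveness ratio is the difference in mean cost between two strategies divided by the difference in mean quality-adjusted life-years. The following is a minimal sketch of that arithmetic only, not the assessment's decision model. The function names (icer, is_cost_effective) and every input value are hypothetical and chosen for illustration; none are taken from the report. The threshold is left as a parameter.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost per additional QALY of the new strategy versus the old one."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly == 0:
        raise ValueError("ICER is undefined when both strategies yield equal QALYs")
    return delta_cost / delta_qaly


def is_cost_effective(cost_new, cost_old, qaly_new, qaly_old, threshold):
    """Decision rule via incremental net monetary benefit.

    The new strategy is favoured when threshold * delta_qaly - delta_cost > 0.
    For a costlier, more effective strategy this is equivalent to the ICER
    falling below the threshold, and it also handles the other quadrants.
    """
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    return threshold * delta_qaly - delta_cost > 0


# Hypothetical lifetime means per person (NOT values from the assessment).
cost_sf, cost_cf = 10_500, 10_000   # pounds
qaly_sf, qaly_cf = 9.25, 9.0        # quality-adjusted life-years

ratio = icer(cost_sf, cost_cf, qaly_sf, qaly_cf)
print(f"ICER: £{ratio:.0f} per additional QALY")
print(is_cost_effective(cost_sf, cost_cf, qaly_sf, qaly_cf, threshold=20_000))
```

With these made-up inputs the new strategy costs £500 more and yields 0.25 additional quality-adjusted life-years, so the ratio is £2000 per additional quality-adjusted life-year. A ratio of this kind is only as reliable as the cost and effect differences behind it, which is why the abstract asks for cautious interpretation.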
Study registration
This trial is registered as PROSPERO CRD42022329259.
Funding
This award was funded by the National Institute for Health and Care Research (NIHR) Evidence Synthesis programme (NIHR award ref: 135477) and is published in full in Health Technology Assessment; Vol. 28, No. 61. See the NIHR Funding and Awards website for further information.
Plain language summary
Men with a magnetic resonance imaging scan that shows possible prostate cancer (PCa) are offered prostate biopsies, where samples of the prostate tissue are collected with a needle, to confirm the presence and severity of cancer. Different biopsy methods exist. In a cognitive fusion biopsy, clinicians will target abnormal looking parts of the prostate by looking at the magnetic resonance imaging scan alongside ‘live’ ultrasound images. During a software fusion (SF) biopsy, computer software is used to overlay the magnetic resonance imaging scan onto the ultrasound image. This study evaluated whether SF is better at detecting cancer compared with cognitive fusion biopsy, and whether it represents value for money for the National Health Service. We did a comprehensive review of the literature. We combined and re-analysed the evidence, and assessed its quality. We investigated whether SF biopsies offer sufficient value for money. Compared with cognitive fusion, patients receiving an SF biopsy may have: (1) a lower probability of having a ‘no cancer’ result, (2) similar probability of having a benign, non-clinically significant (CS) cancer result and (3) higher probability of detecting CS cancer. However, it is uncertain to what extent SF is more accurate than cognitive fusion, because of concerns about the quality of the evidence. We found no evidence that any SF devices were superior to others. Using additional, random biopsies alongside software or cognitive fusion would increase the detection of PCa.
We also looked for evidence on the value for money of the SF biopsies to detect PCa and found no relevant studies. We weighed the costs and the benefits of SF biopsy compared to cognitive fusion to determine whether it could be a good use of National Health Service money. The poor quality of information makes the value of the technologies largely unknown.
Scientific summary
Background
Prostate cancer (PCa) is the most commonly diagnosed cancer in men in the UK. In the NHS, people with suspected PCa are offered multiparametric magnetic resonance imaging (mpMRI). People with suspected PCa, according to MRI, are offered a biopsy procedure to confirm the presence and severity of cancer. Traditionally, patients were offered a systematic transrectal, ultrasound-guided prostate biopsy (or systematic biopsy). Since the introduction of mpMRI, specific areas of abnormal tissue can be targeted, by combining (or fusing) the results of mpMRI and ultrasound imaging. Several methods for fusing MRI and ultrasound images exist, including cognitive fusion (CF), in which a region of interest is identified prior to biopsy and the biopsy operator estimates where it might be on an ultrasound image, and software fusion (SF), where regions of interest on magnetic resonance images are identified and contoured before biopsy and overlaid with the prostate contours on ultrasound images during the biopsy. Systematic biopsy may be used in addition to targeted biopsy. A number of SF technologies are available. However, the effectiveness and cost-effectiveness of SF compared with CF is uncertain.
Objectives
This study aimed to assess the clinical and cost-effectiveness of SF biopsy systems in people with suspected localised and locally advanced PCa.
Methods
Systematic review
A systematic review of the diagnostic accuracy, clinical effectiveness, safety and practical implementation of nine SF systems compared with CF and with each other, in people with suspected PCa according to MRI, was conducted.
Comprehensive bibliographic searches, including MEDLINE and Embase and supplementary sources, were conducted up to 2 August 2022 for published and unpublished literature. Studies of people with suspected PCa who have had an MRI scan that indicates a significant lesion [Likert or prostate imaging – reporting and data system (PI-RADS) score of 3 or more], including biopsy-naive and repeat biopsy patients with a previous negative prostate biopsy, and comparing SF with CF or with another SF device, were included. The following SF technologies were included: ARTEMIS (InnoMedicus ARTEMIS), BioJet (Healthcare Supply Solutions Ltd), BiopSee (Medcom), bkFusion (BK Medical UK Ltd and MIM Software Inc.), Fusion Bx 2.0 (Focal Healthcare), FusionVu (Exact Imaging), iSR’obot Mona Lisa™ (Biobot iSR’obot), KOELIS Trinity (KOELIS and Kebomed) and UroNav Fusion Biopsy System (Philips). Previous versions were also eligible. In-bore (or in-gantry) biopsies were excluded. Prospective, randomised and non-randomised comparative studies were included, as was retrospective evidence where no prospective evidence could be found for an eligible SF device. To provide sufficient evidence for a network meta-analysis (NMA), within-patient comparisons or randomised controlled trials (RCTs) between SF and systematic biopsy, and between CF and systematic biopsy, were also eligible to inform indirect comparisons of diagnostic accuracy. Two researchers independently screened the titles and abstracts of all reports identified by the bibliographic searches and of all full-text papers subsequently obtained. Data extraction and quality assessment were conducted by at least one researcher and checked by a second. Risk of bias of diagnostic accuracy studies was assessed using quality assessment of diagnostic accuracy studies-comparative (QUADAS-C).
For diagnostic accuracy outcomes, studies reporting sufficient data were included in network meta-analyses comparing SF and CF with or without concomitant systematic biopsy, and systematic biopsy alone, where the odds of being categorised in each of the different cancer grades were allowed to vary by biopsy type. Results were reported as odds ratios with 95% credible intervals (CrIs). Additional diagnostic accuracy results that could not be pooled in a meta-analysis, and clinical effectiveness, safety and implementation outcomes, were synthesised narratively.
Economic analysis
Cost-effectiveness evidence comparing SF biopsy systems with CF for targeted prostate biopsy in men with suspected PCa was identified by the previously mentioned searches, with evidence narratively summarised and tabulated. Studies were appraised for their quality, generalisability and appropriateness to inform the decision problem as defined by the National Institute for Health and Care Excellence Diagnostics Assessment Report (NICE DAR) scope. A targeted search was conducted to identify evidence to support the development of a de novo decision model. The searches aimed to identify cost-effectiveness evidence of diagnostic strategies at the point of biopsy to support the model conceptualisation. Evidence was reviewed to (1) identify value components of the biopsy approaches, (2) characterise alternative mechanisms of evidence linkage from disease prevalence, diagnostic accuracy and choice of treatment to final outcomes, and (3) identify any UK-relevant sources of evidence. A de novo decision analytic model was developed to estimate the cost-effectiveness of SF compared to CF. The model evaluated two strategies for two alternative comparisons: (1) targeted SF biopsy versus targeted cognitive biopsy and (2) combined (targeted and systematic) SF biopsy versus combined cognitive biopsy.
The four strategies could not be incrementally compared due to the mechanism of evidence generation for the diagnostic accuracy, which relied on separate evidence networks. The de novo model consisted of two components: (1) a decision tree, which captured biopsy adverse events (AEs) and repeated biopsies and classified individuals according to their biopsy results and underlying true disease status, and (2) a long-term model to link classification to clinical management decisions and these to longer-term costs and consequences (e.g. disease progression and PCa mortality) so that differences in costs, life-year gains and quality-adjusted life-years (QALYs) were quantified over a lifetime horizon. The model required the development of (1) an extension to the evidence synthesis to allow quantifying the extent of test misclassification in the diagnostic model with SF biopsy and CF biopsy, and (2) an inference model to derive unobservable transition probabilities for the long-term model.
Results
The systematic review of clinical evidence included a total of 3733 patients who received SF and 2154 individuals with CF from 23 studies. Evidence was included for all devices specified in the protocol, except for Fusion Bx 2.0 and FusionVu. Overall, the evidence for all devices was at high risk of bias. Overall, biopsy-naive patients were under-represented. Fourteen studies were included in the meta-analyses.
Diagnostic accuracy
Across all analyses, results must be interpreted with caution due to the high risk of bias in the evidence base and wide uncertainty over the results. The meta-analyses show that patients undergoing SF biopsy may have: (1) a lower probability of being classified as not having cancer, (2) similar probability of being classified as having non-clinically significant cancer [International Society of Urological Pathology (ISUP) grade 1], and (3) higher probability of being classified at higher ISUP grades, particularly ISUP 2.
Similar results were obtained where both biopsy methods were combined with systematic biopsy. Additional meta-analyses of cancer detection rates suggest that, compared with CF biopsy, SF may identify more PCa (any grade) (OR 1.30; 95% CrI 1.06, 1.61). Adding systematic biopsy to CF or SF may increase the detection of all PCa and of clinically significant (CS) cancer, and from this evidence there is no suggestion that SF with concomitant systematic biopsy is superior to CF with systematic biopsy. Meta-analyses of cancer detection rates, by individual device, showed that compared with CF biopsy, BioJet and Urostation are associated with a higher detection of PCa overall. There was no evidence that any of the SF devices increased detection of CS cancer (except for BioJet, although this is based on one low-quality study), and overall, the evidence was insufficient to conclude whether any individual devices were superior to CF, or whether some SF technologies are more accurate than others.
Clinical effectiveness
There is no evidence that biopsy positivity rates and safety outcomes differ significantly between SF and CF, or between SF devices. There was some evidence that systems with rigid registration (BioJet or UroNav) are easier and faster to use than elastic registration (KOELIS Trinity), although this is informed by a single, small study and is not conclusive.
Cost-effectiveness
One full cost-effectiveness study of SF compared targeted SF to targeted CF. However, the findings of the study were not considered generalisable to the decision problem under assessment. Sixteen studies were identified, of which nine were selected to inform the conceptualisation and parameterisation of the de novo decision model.
The base-case cost-effectiveness analysis suggests that, for both the targeted biopsy and the combined biopsy comparisons, the SF strategy is on average costlier and yields more QALYs than the CF strategy, resulting in a probabilistic incremental cost-effectiveness ratio (ICER) of £6197 and £2199 per additional QALY for each comparison, respectively. These ICERs are below the lower bound of the cost-effectiveness threshold range recommended by NICE, suggesting that SF may be cost-effective compared to CF in both the targeted and the combined comparisons. However, these results should be interpreted cautiously given the uncertainties in the relative diagnostic accuracy evidence which informs the model. The probabilistic analysis suggests a higher probability of cost-effectiveness for SF versus CF at the range of cost-effectiveness thresholds recommended by NICE (0.64 and 0.68 at £20,000 and £30,000 per additional QALY for targeted SF biopsy).
Discussion
This assessment includes a broad, comprehensive literature search for software and CF technologies and has been conducted following recognised guidelines to ensure high quality. The review identified evidence on the diagnostic accuracy of nine SF technologies, and is the first systematic review to formally compare the relative accuracy of SF and CF, with and without systematic biopsy, as well as different SF devices, using both direct and indirect evidence in an NMA. Unlike recent systematic review evidence, our review found that SF increased detection of clinically insignificant cancer compared with CF. Our review has a number of limitations. The evidence included in the systematic review is at high risk of bias overall. There was variation in patient and study characteristics.
Biopsy-naive patients, who form the large majority of patients eligible for targeted biopsy, were under-represented, although there was insufficient evidence to evaluate whether the relative accuracy of software and CF differed between biopsy-naive and repeat biopsy patients. There was insufficient evidence to explore the impact of a number of other potential effect modifiers, including lesion location, operator experience, biopsy routes and anaesthesia methods. There were few studies per comparison, not all studies reported outcomes by all cancer grades, and most estimates from the meta-analyses were imprecise, particularly at higher cancer grades where data were most sparse. The network meta-analyses relied on the assumption that CF was equivalent across different centres, which is uncertain. No evidence was found for most of this assessment’s prespecified outcomes: biopsy sample suitability/quality, number of repeat biopsies performed, procedure completion rates, software failure rate, time to diagnosis, length of hospital stay, time taken for MR image preparation, subsequent PCa management, re-biopsy rate, hospitalisation, overall survival, progression-free survival (PFS), patient- and carer-reported outcomes [including tolerability and health-related quality of life (HRQoL)], and barriers and facilitators to implementation. The cost-effectiveness results are driven by the modelled differences in diagnostic accuracy between software and CF, particularly the increased correct detection of Cambridge Prognostic Group 1 (CPG 1) (resulting in net losses for SF) and CPG 2 (resulting in net gains for SF). The External Assessment Group (EAG)’s NMA and its extension underpinned the economic model, so its limitations apply to the cost-effectiveness estimates.
The magnitude of value realised for SF, compared with CF, depends on the balance between different degrees of misclassification and correct classification with the two technologies and on the prevalence of disease at each cancer grade. The value of SF is thus driven by comparative diagnostic accuracy (compared to ‘gold standard’) derived where evidence is particularly sparse (cancer grades above 2), and by prevalence, which is also affected by evidence sparsity. Therefore, the estimates of cost-effectiveness are affected by unquantified uncertainty and should be interpreted with caution.
Conclusions
Compared to CF biopsy, patients undergoing SF biopsy may show a lower probability of being classified as not having cancer, similar probability of being classified as having non-CS cancer, and a higher probability of being classified at higher ISUP grades, particularly ISUP 2. Both SF and CF biopsy can miss CS cancer lesions, and the addition of standard systematic biopsy increases the detection of all PCa and CS cancer for both fusion methods. There is insufficient evidence to conclude on the relative accuracy and clinical effectiveness of different software devices. Cost-effectiveness estimates comparing software to CF were generally favourable to SF, except where the technologies were assumed to have the same diagnostic accuracy. The drivers of economic value of SF, comparative diagnostic accuracy and prevalence, are affected by unquantified uncertainty. Judgements on the economic value of SF require integration of the uncertainties over the clinical evidence with the overall cost-effectiveness.
Recommendations for further research
High-quality, sufficiently powered RCT evidence comparing SF biopsy with CF biopsy is required to address limitations from the existing evidence. Improved reporting of diagnostic accuracy outcomes would enable future syntheses to make use of a larger body of evidence.
Study registration
This trial is registered as PROSPERO CRD42022329259.
Funding
This award was funded by the National Institute for Health and Care Research (NIHR) Evidence Synthesis programme (NIHR award ref: 135477) and is published in full in Health Technology Assessment; Vol. 28, No. 61. See the NIHR Funding and Awards website for further information.
