
A New Statistical Framework for Bridging Pathologic Response and Survival in Breast Cancer
Bayesian modeling links residual cancer burden shifts to survival, helping neoadjuvant breast cancer trials and FDA approvals predict real benefit.
For nearly 2 decades, oncologists have grappled with a deceptively simple paradox: while achieving a pathologic complete response (pCR) to neoadjuvant chemotherapy is consistently one of the strongest predictors of long-term survival at the individual patient level, improvements in pCR rates across clinical trial arms have not reliably translated into meaningful gains in event-free or overall survival.1
Lajos Pusztai, MD, DPhil, professor of medicine and co-leader of Genetics, Genomics, and Epigenetics at Yale Cancer Center, is one of the researchers at the forefront of this conversation. In an interview with Targeted Oncology, he discussed research presented at the 2026 AACR Annual Meeting that focuses on this disconnect.2 A Bayesian hierarchical modeling framework may finally resolve it and, in doing so, offer regulators and clinical trialists a more reliable statistical bridge between a neoadjuvant end point and long-term outcomes.
The Clinical Problem: A Decades-Old Paradox
The story begins in the early 2000s at MD Anderson Cancer Center, where Pusztai and colleagues first characterized the robust association between pCR and excellent individual-level outcomes.
"About 25 years ago, my colleagues and I at MD Anderson noticed that complete eradication of invasive cancer from both the breast and lymph nodes with preoperative therapies is associated with excellent long-term survival, and we called this pathologic complete response [pCR]," he explained.
The challenge emerged when researchers attempted to scale this observation to the trial level. Some randomized studies that demonstrated improved pCR rates yielded only marginal, statistically nonsignificant improvements in event-free survival, while in other trials the association appeared robust. The result was a pattern that frustrated investigators and regulators alike.
Pusztai attributes the unstable relationship between pCR rate gains and survival improvement to several converging biological realities. First, pathologic response exists on a continuum that the binary pCR versus residual disease classification fails to capture. "Residual cancer after chemotherapy can be extensive, moderate, or very small, and the extent of residual cancer has a major impact on prognosis," he noted. "To have a 1-cm residual cancer is better than 2 cm, and 2 cm is better than 3 cm in terms of prognosis." The extent of residual disease, not just its presence or absence, carries prognostic weight.
Second, different drug classes affect the distribution of residual disease in meaningfully different ways. "Some drugs move the very small residual disease into complete response," he explained. In that case, the patients a new drug shifts into pCR are those who would have had minimal residual disease with an older therapy, and these patients already have a good prognosis in the control arm. At the trial level, this type of improvement in pCR rate produces only a small improvement in arm-level survival. Other agents, such as immunotherapies and HER2-targeted drugs like trastuzumab (Herceptin), shift the entire residual disease distribution toward lower values, reducing the number of patients with extensive, moderate, and minimal residual cancer simultaneously. This response pattern is more clinically impactful and results in a larger arm-level improvement in survival, but it is invisible when only the pCR rate difference between trial arms is reported.
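The contrast between these two response patterns can be made concrete with a small arithmetic sketch. Every number below is hypothetical and chosen only for illustration: the survival rates per residual disease category, the control-arm distribution, and the two experimental arms are assumptions, not data from the study. Both hypothetical drugs raise the pCR rate by the same 10 percentage points, but only one shifts the whole distribution.

```python
# Hypothetical long-term survival rate for each pathologic response category.
SURVIVAL_BY_CLASS = {"pCR": 0.92, "minimal": 0.88, "moderate": 0.75, "extensive": 0.55}

# Hypothetical share of patients in each category, per trial arm.
CONTROL = {"pCR": 0.30, "minimal": 0.15, "moderate": 0.40, "extensive": 0.15}
# Drug A only converts minimal residual disease into pCR.
DRUG_A = {"pCR": 0.40, "minimal": 0.05, "moderate": 0.40, "extensive": 0.15}
# Drug B shifts the entire distribution toward less residual disease.
DRUG_B = {"pCR": 0.40, "minimal": 0.20, "moderate": 0.32, "extensive": 0.08}


def arm_survival(distribution):
    """Arm-level survival as the share-weighted average of category survival."""
    return sum(share * SURVIVAL_BY_CLASS[name] for name, share in distribution.items())


gain_a = arm_survival(DRUG_A) - arm_survival(CONTROL)
gain_b = arm_survival(DRUG_B) - arm_survival(CONTROL)

print(f"pCR gain, drug A: {DRUG_A['pCR'] - CONTROL['pCR']:.2f}")
print(f"pCR gain, drug B: {DRUG_B['pCR'] - CONTROL['pCR']:.2f}")
print(f"arm-level survival gain, drug A: {gain_a:.4f}")
print(f"arm-level survival gain, drug B: {gain_b:.4f}")
```

Under these assumed inputs, the two arms report an identical pCR difference while the arm-level survival gain differs severalfold, which is the pattern a pCR-only comparison cannot distinguish.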
A third complicating factor is the presence of competing treatment effects, particularly adjuvant endocrine therapy in hormone receptor-positive disease. Postoperative adjuvant therapy that is highly effective reduces the impact of success, or failure, of the preoperative treatment on survival by improving the outcome among those with residual disease. This is increasingly relevant in modern trials where patients with residual disease routinely receive adjuvant capecitabine for triple-negative cancers or HER2-targeting antibody-drug conjugates in HER2-positive disease.
The RCB Score: Moving Beyond Binary Assessment
To address the limitations of binary pCR classification, Pusztai and colleagues developed the residual cancer burden (RCB) index, a continuous composite score that integrates primary tumor size, tumor cellularity, number of involved lymph nodes, and the size of nodal metastases into a single quantitative measure.
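For readers who want to see how those inputs combine, the following is a minimal sketch of the RCB index as published in reference 1. The function and parameter names are illustrative, and the class cutoffs (1.36 and 3.28) are the commonly cited values; anything used in practice should be checked against the original publication or an official calculator rather than this sketch.

```python
import math


def rcb_index(d1_mm, d2_mm, pct_cancer, pct_in_situ, n_pos_nodes, d_met_mm):
    """Residual cancer burden index, following the formula in reference 1.

    d1_mm, d2_mm : bidimensional diameters of the primary tumor bed (mm)
    pct_cancer   : percent of the tumor bed area that is cancer (0-100)
    pct_in_situ  : percent of that cancer that is in situ disease (0-100)
    n_pos_nodes  : number of lymph nodes containing metastasis
    d_met_mm     : diameter of the largest nodal metastasis (mm)
    """
    d_prim = math.sqrt(d1_mm * d2_mm)
    # Fraction of the tumor bed occupied by invasive cancer.
    f_inv = (1 - pct_in_situ / 100) * (pct_cancer / 100)
    primary_term = 1.4 * (f_inv * d_prim) ** 0.17
    nodal_term = (4 * (1 - 0.75 ** n_pos_nodes) * d_met_mm) ** 0.17
    return primary_term + nodal_term


def rcb_class(score):
    """Map the continuous index to the commonly cited RCB classes."""
    if score == 0:
        return "RCB-0 (pCR)"
    if score <= 1.36:
        return "RCB-I"
    if score <= 3.28:
        return "RCB-II"
    return "RCB-III"


# Illustrative inputs only: a 10 x 10 mm tumor bed, 10% invasive cancer, no nodes.
example = rcb_index(10, 10, 10, 0, 0, 0)
print(round(example, 2), rcb_class(example))
```

The exponent of 0.17 compresses both terms, so the score grows slowly with tumor bed size and nodal burden rather than linearly.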
"It's a statistically sound formula that weighs the importance of tumor size, the cellularity of the cancer, and nodal metastases into a score that correlates closely with long-term recurrence-free survival," Pusztai explained. "Achieving a pCR is good, but achieving a minimal residual disease is better than a moderate amount, and having a moderate amount is better than an extensive amount of residual disease; with the RCB score, we now can quantify this spectrum.”
A New Statistical Approach: Borrowing from Ecology and Epidemiology
The centerpiece of the current AACR presentation is the application of Bayesian hierarchical network analysis to pooled patient-level data from about 6000 patients treated with neoadjuvant chemotherapy for early-stage breast cancer at 12 global sites (study cohorts), assembled by the I-SPY clinical trial consortium. The work was presented by first author Keli Santos-Parker, MD, MS, who holds a PhD in mathematics and is currently a surgery resident at the University of California, San Francisco. Bayesian hierarchical network meta-analysis is a statistical method that combines direct and indirect evidence from multiple studies and allows simultaneous comparison of many interventions that have not been directly compared head to head.3
Pusztai is direct about where the methodology originates. "This approach is not commonly used in oncology trial-level meta-analyses, but it's broadly used to assess ecological changes over time and across different regions of the world, in epidemiological studies, and in molecular biology to infer regulatory networks from high-throughput biological data." The approach, he notes, is used in fields where researchers have long been forced to account for systematic heterogeneity across sites, regions, and populations when comparisons are made.
Applied to the breast cancer neoadjuvant setting, the method accomplishes something that prior meta-analytic approaches could not. "It combines the information across all study cohorts and takes into account the variability within each cohort as separate units for comparison," Pusztai explained. It adjusts for systemic treatment differences between geographic regions and institutions—a meaningful source of noise, since standard-of-care drug preferences can differ between European and American centers and even between US coasts, translating into variable pCR rates and RCB distributions across the study cohorts.
Earlier efforts to correlate pCR rate improvement with survival improvement plotted the odds ratio of pCR versus the hazard ratio for survival trial by trial. In Pusztai's assessment, "this is simple but not powerful." The Bayesian hierarchical approach proposed by Santos-Parker, by contrast, borrows statistical strength across trials while rigorously modeling within-trial and within-cohort variability.
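The idea of borrowing strength can be illustrated with a deliberately simplified normal-normal shrinkage calculation. This is not the network model used in the abstract: it assumes each cohort reports one effect estimate (for example, a log odds ratio for pCR) with a known sampling variance, and it treats the between-cohort variance `tau2` as known, whereas a full Bayesian analysis would estimate it. All inputs are hypothetical.

```python
def partial_pool(estimates, variances, tau2):
    """Shrink cohort-level estimates toward a precision-weighted overall mean.

    estimates : per-cohort effect estimates (hypothetical log odds ratios)
    variances : per-cohort sampling variances of those estimates
    tau2      : between-cohort variance, assumed known in this sketch
    """
    weights = [1.0 / (v + tau2) for v in variances]
    overall = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    shrunk = []
    for y, v in zip(estimates, variances):
        # Noisier cohorts (large v relative to tau2) are pulled harder toward the mean.
        pull = v / (v + tau2)
        shrunk.append(pull * overall + (1 - pull) * y)
    return overall, shrunk


# Hypothetical cohorts: the third is small, so its estimate is imprecise.
estimates = [0.2, 0.5, 1.1]
variances = [0.02, 0.05, 0.40]
overall, shrunk = partial_pool(estimates, variances, tau2=0.05)

for raw, pooled in zip(estimates, shrunk):
    print(f"raw {raw:.2f} -> partially pooled {pooled:.2f}")
print(f"overall mean {overall:.2f}")
```

A trial-by-trial plot treats the third cohort's extreme estimate at face value; partial pooling discounts it in proportion to its uncertainty while still keeping each cohort as a separate unit.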
The result: a tight, reliable, and statistically significant correlation between improvements in distant recurrence-free survival and improvements in pCR rate, and an even stronger one with shifts in RCB distribution. The approach can also generate prediction thresholds for survival improvement based on shifts in RCB distribution between trial arms.3
"You actually can use an RCB change based threshold to predict if improvement in survival is likely or unlikely, and that's kind of important from the FDA point of view," Pusztai noted.
Regulatory and Clinical Implications
The FDA has approved agents for early breast cancer on the basis of pCR rate improvement but has in recent years grown more cautious about accepting pCR as a surrogate end point.4 The framework that Santos-Parker and colleagues are proposing offers a potential path forward. "What we are actually proposing," said Pusztai, "is a new statistical framework to translate improvement in pathologic response to trial arm level survival difference, a more accurate prediction if a particular degree of RCB distribution shift would impact the more important end point, recurrence with survival."
Crucially, the team shows that using the RCB score as an input, rather than binary pCR status, improves the predictive accuracy.1 "When you use the RCB score as an input, you get more reliable or more accurate predictions than just using pCR vs residual disease as input," Pusztai said. This finding reinforces the I-SPY team's longstanding advocacy for routine RCB collection in neoadjuvant clinical trials.
Next Steps: Independent Validation
The findings presented at AACR are based on study cohorts rather than separate clinical trial arms, and the analysis estimates performance with internal leave-one-cohort-out cross-validation, in which one cohort at a time is withheld from model training and then used to test predictive accuracy. The results are encouraging, but Pusztai emphasizes that true external validation is the priority next step. "To validate the method and define a robust threshold for predicting success, or failure, in survival, we need patient-level data from multiple large, randomized trials," he said. "Our next goal is to put together a consortium of interested pharmaceutical companies, clinical trial groups, and regulators for independent validation."
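The leave-one-cohort-out procedure described above can be sketched generically. The stand-in model here is a least-squares line through the origin, which is an assumption made for brevity and not the Bayesian model from the abstract; the cohort labels and the paired values (a hypothetical RCB shift and a hypothetical log hazard ratio) are invented for illustration.

```python
def fit_slope(rows):
    """Least-squares slope through the origin for (cohort, x, y) rows."""
    sxx = sum(x * x for _, x, _ in rows)
    sxy = sum(x * y for _, x, y in rows)
    return sxy / sxx


def leave_one_cohort_out(rows):
    """Hold out each cohort in turn, fit on the rest, score on the held-out cohort."""
    errors = {}
    for held_out in sorted({cohort for cohort, _, _ in rows}):
        train = [r for r in rows if r[0] != held_out]
        test = [r for r in rows if r[0] == held_out]
        slope = fit_slope(train)
        # Mean squared prediction error on data the model never saw.
        errors[held_out] = sum((y - slope * x) ** 2 for _, x, y in test) / len(test)
    return errors


# Hypothetical rows: (cohort, RCB shift, log hazard ratio).
rows = [
    ("A", 0.10, -0.05), ("A", 0.30, -0.16),
    ("B", 0.20, -0.09), ("B", 0.40, -0.22),
    ("C", 0.50, -0.24), ("C", 0.25, -0.13),
]

for cohort, mse in leave_one_cohort_out(rows).items():
    print(f"held-out cohort {cohort}: mean squared error {mse:.5f}")
```

Holding out whole cohorts, rather than random patients, tests whether the relationship transfers to a site the model has not seen, but every cohort still comes from the same pooled dataset, which is why independent trials are needed for external validation.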
In this envisioned collaboration, trial sponsors would contribute patient-level pCR and RCB results along with basic clinicopathologic information and long-term follow-up data. These data currently reside mostly in large pharmaceutical companies or with the US FDA. "However, validation of our method could be done in a trial- and drug-blinded manner; knowing what treatment was given that altered RCB distribution is not required," Pusztai said.
If validated, the framework could meaningfully reshape how neoadjuvant breast cancer trials are designed, how pathological end points are reported, and how regulators evaluate accelerated approval requests for early-stage disease. Even if the mathematics behind the analysis are complex, Pusztai concludes, "the overall message is actually not that complicated."
REFERENCES
1. Symmans WF, Peintinger F, Hatzis C, et al. Measurement of residual breast cancer burden to predict survival after neoadjuvant chemotherapy. J Clin Oncol. 2007;25(28):4414-4422.
2. Santos-Parker KS, Santos-Parker JR, Symmans WF, et al. A novel statistical framework for surrogate endpoint prediction of survival in neoadjuvant breast cancer trials. Presented at: 2026 AACR Annual Meeting; April 17-22, 2026; San Diego, CA. Abstract 1401/10.
3. Florez ID, De La Cruz-Mena JE, Veroniki AA. Network meta-analysis: a powerful tool for clinicians, decision-makers, and methodologists. J Clin Epidemiol. 2024.
4. Pathological Complete Response in Neoadjuvant Treatment of High-Risk Early-Stage Breast Cancer: Use as an Endpoint to Support Accelerated Approval. US FDA. Updated July 29, 2020. Accessed April 20, 2026.
































