AI and data fuel innovation in clinical trials and beyond

Laurel: Speaking of the pandemic, it really showed us how critical and uphill the race is to get new treatments and vaccines to patients. Could you explain what evidence generation is and how it fits into drug development?

Arnaub: Sure. As a concept, generating evidence in drug development is nothing new. It's the art of putting together data and analyses that successfully demonstrate the safety, the efficacy, and the value of your product to a group of different stakeholders: regulators, payers, providers and, ultimately and most importantly, patients. And to this day, I would say evidence generation is not just the readout of the trial itself. There are now different types of studies that pharmaceutical or medical device companies conduct, and these can be studies like literature reviews, observational data studies, or analyses that demonstrate disease burden or even treatment patterns. If you look at how most companies are designed, clinical development teams focus on designing a protocol and running the trial, and they're responsible for a successful readout of the trial. Most of that work sits within clinical development. But as a drug gets closer to launch, it's the health economics and outcomes research and epidemiology teams that help paint the picture of what the drug's value is, and of how we understand the disease more effectively.

So I think we're at a pretty interesting inflection point in the industry right now. Producing evidence is a multi-year activity, both during the trial and, in many cases, long after the trial. We've seen that to be particularly true for vaccine trials, but also in oncology and other therapeutic areas. During covid, the vaccine companies put together their evidence packages in record time, and it was an incredible effort. Now, I think what's happening is that the FDA is navigating a delicate balance: they want to promote the innovation we were talking about, advances in new therapies for patients, and they've built vehicles to speed therapies along, such as fast-track approvals. But we need confirmatory trials or long-term follow-up to really understand the evidence, to really understand the safety and effectiveness of these drugs. And that's why the concept we're talking about today is so important: how can we do this faster?

Laurel: This is certainly important when it comes to life-saving innovations, but as you mentioned earlier, with the combination of the rapid pace of technological innovation and the data being generated and reviewed, we're at a special inflection point here. So how has the generation of data and evidence evolved over the past couple of years, and how possible would it have been to create a vaccine and all those packages of evidence, the way we can now, five or 10 years ago?

Arnaub: It's important here to distinguish between clinical trial data and what's called real-world data. The randomized controlled trial is, and has remained, the gold standard for generating and submitting evidence. In a clinical trial we have a very tightly controlled set of parameters and a focus on a subset of patients, so there's a lot of specificity and granularity in what's captured, and there's a regular interval of assessment. But we also know the trial environment is not necessarily representative of how patients end up performing in the real world. And that term, “real world,” covers a bit of a Wild West of different things: claims data, meaning insurance companies' billing records; the electronic medical records coming out of providers, hospital systems, and labs; and, increasingly, newer forms of data from devices or even patient-reported data. RWD, or real-world data, is a large and diverse set of different sources that can capture patient performance as patients move in and out of different healthcare systems and environments.

Ten years ago, when I was first working in this field, the term “real-world data” didn't even exist. It was almost a dirty word; it's a term that has essentially been coined by the pharmaceutical and regulatory sectors in recent years. The other important dimension here is that regulators, through very important pieces of legislation like the 21st Century Cures Act, have kick-started and propelled the way real-world data can be used and incorporated to augment our understanding of treatments and of disease. So there's a lot of momentum here. Real-world data is used in 85%, 90% of FDA-approved new drug applications. So it's a world we have to navigate.

How do you maintain the rigor of the clinical trial and tell the full story, and then how do you integrate real-world data to complete that picture? It's a problem we've been focused on for the past two years, and during covid we even built a solution around it, called Medidata Link, that links the patient-level data in the clinical trial to all the non-trial data that exists in the world for each patient. As you can imagine, the reason this made a lot of sense during covid, and the reason we started it with a covid vaccine manufacturer, was so we could study long-term outcomes: we can tie the trial data to what we see post-trial. Does the vaccine hold up over the long term? Is it safe? Is it effective? That, I think, is something that's going to keep emerging, and it's been a big part of our evolution over the last two years in terms of data collection.
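To make the linkage idea concrete, here is a minimal, hypothetical sketch of joining trial records to post-trial real-world records on a privacy-preserving patient token. It illustrates the general technique only, not Medidata Link's actual implementation; all field names and data are invented:

```python
import hashlib
import pandas as pd

def patient_token(first: str, last: str, dob: str) -> str:
    """Derive a deterministic token from identifying fields.
    Production tokenization uses salted, certified schemes;
    this bare hash is for illustration only."""
    key = f"{first.strip().lower()}|{last.strip().lower()}|{dob}"
    return hashlib.sha256(key.encode()).hexdigest()

# Hypothetical patient-level data captured during the trial
trial = pd.DataFrame({
    "token": [patient_token("Ada", "Lovelace", "1990-12-10")],
    "arm": ["vaccine"],
    "day28_antibody_titer": [812],
})

# Hypothetical post-trial real-world records (e.g., insurance claims)
claims = pd.DataFrame({
    "token": [patient_token("Ada", "Lovelace", "1990-12-10")],
    "dx_code": ["U07.1"],           # ICD-10 code for covid-19
    "service_date": ["2022-03-01"],
})

# Link each trial participant to their long-term real-world follow-up
linked = trial.merge(claims, on="token", how="left")
print(linked)
```

Joining on a derived token rather than on raw identifiers is what allows trial outcomes to be followed post-trial without exposing who the patients are.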

Laurel: That kind of data collection certainly addresses part of the challenge of generating high-quality evidence. What other shortcomings have you seen in the industry?

Arnaub: I think the elephant in the room in pharmaceutical development is that despite all the data and all the advances in analytics, the likelihood of technical success, or regulatory success as it's called, for drugs is still very low. The overall probability of approval from phase one is consistently under 10% across a number of different therapeutic areas. It's below 5% in cardiovascular, just above 5% in oncology and neurology, and I think what underlies these failures is a lack of data to demonstrate efficacy. This is where many companies submit or include what regulators call a flawed study design, an inappropriate statistical endpoint, or, in many cases, trials that are underpowered, meaning the sample size was too small to reject the null hypothesis. So you're grappling with a number of key decisions if you just look at the trial itself and at some of the gaps where data should be more involved and influential in decision-making.
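To see why underpowering matters, here is a minimal sketch, using assumed numbers rather than figures from the interview, of the sample-size calculation a trial statistician would run for a two-arm study comparing response rates, using statsmodels:

```python
# Sample size needed to detect a 30% -> 40% improvement in response rate
# with a two-sided 5% significance level and 80% power.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

effect = proportion_effectsize(0.40, 0.30)  # Cohen's h for the two rates

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect,
    alpha=0.05,   # probability of a false positive
    power=0.80,   # probability of rejecting the null if the effect is real
)
print(f"Patients required per arm: {n_per_arm:.0f}")  # roughly 355
```

Enroll materially fewer patients than that and a real 10-point improvement will routinely fail to reach statistical significance, which is exactly the underpowered-trial failure mode described above.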

So when you design a trial, you're asking: “What are my primary and secondary endpoints? What inclusion or exclusion criteria do I select? What's my comparator? What's my use of a biomarker? And then, how do I understand outcomes? How do I understand the mechanism of action?” It's a myriad of different choices, a permutation of different decisions that all have to be made in parallel, and a lot of that data and information is coming from the real world; we talked about the kind of value an electronic health record might have. But the gap here, the problem, is: how was the data collected? How do you verify where it came from? Can we trust it?

So while the volume of data is a good thing, it brings discrepancies with it, and there is a significant risk of bias in a variety of different areas. There's selection bias, meaning differences in the types of patients you select for treatment. There's performance bias, detection bias, a number of issues with the data itself. So what we're trying to navigate here is how you bring these data sets together in a robust way and address some of the key drivers of drug failure that I was referring to earlier. Our own approach takes historical clinical trial data sets that have been curated on our platform and uses them to contextualize what we're seeing in the real world and to better understand how patients are responding to treatment. And that should, in theory, and in the work we've done so far, give clinical development teams a new way to use data to design a trial protocol, or to improve some of the statistical analysis work they do.
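One standard way to make that contextualization defensible, sketched here with invented data and under my own assumptions rather than as Medidata's actual method, is to model each real-world patient's propensity to resemble the historical trial population and reweight accordingly before comparing outcomes:

```python
# Reweight a real-world cohort to resemble a historical trial cohort
# (odds weighting on a "source membership" propensity score), one common
# remedy for the selection bias discussed above.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Invented baseline covariates: age and a disease-severity score.
trial_X = np.column_stack([rng.normal(55, 8, 500), rng.normal(2.0, 0.5, 500)])
rwd_X   = np.column_stack([rng.normal(62, 12, 800), rng.normal(2.6, 0.7, 800)])

X = np.vstack([trial_X, rwd_X])
in_trial = np.r_[np.ones(len(trial_X)), np.zeros(len(rwd_X))]

# Propensity of belonging to the trial population, given covariates
model = LogisticRegression().fit(X, in_trial)
ps = model.predict_proba(rwd_X)[:, 1]

# Odds weights up-weight real-world patients who look like trial patients
weights = ps / (1 - ps)

rwd_response = rng.binomial(1, 0.35, len(rwd_X))  # invented outcome flag
print("Unweighted RWD response rate: ", round(rwd_response.mean(), 3))
print("Trial-weighted response rate:",
      round(np.average(rwd_response, weights=weights), 3))
```

After weighting, outcome comparisons between the trial arm and the real-world cohort are made across populations with similar measured characteristics, which is the contextualizing role historical trial data can play.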