The AI Drug Discovery Data War

plus: Why Pharma Is Buying AI Startups

Happy Friday! It’s January 23rd.

How are you liking the change so far? Since we started leaning more into AI drug discovery, we’ve gotten some really helpful feedback. Thanks to everyone who took the time to share what’s working and what isn’t!

This space is much closer to our own expertise and interests, which means we can go deeper and offer more useful insights than we could with broader healthcare AI coverage. There are already plenty of great resources out there for that. Our goal here is to add something different.

Anyway, enjoy this week’s brief, and take a first look at what we’re building below.

Our picks for the week:

  • Featured Perspective: The AI Drug Discovery Data War

  • Product Pipeline: Why Pharma Is Buying AI Startups

Read Time: 5 minutes

FEATURED PERSPECTIVE

The AI Drug Discovery Arms Race Just Changed


For years, AI-driven drug discovery has been constrained by one hard limit: access to high-quality biological data at scale. Training better models has meant either stitching together fragmented public datasets or paying for access to proprietary ones.

This month, two announcements show those paths diverging more clearly than ever, and raise a bigger question about where innovation will actually come from.

On one side is Europe’s push for open science. On the other is a rapidly expanding (but closed) commercial data stack.

Open data makes a bid for scale: A few days ago, the Structural Genomics Consortium and its partners launched LIGAND-AI, a five-year, €60 million public-private project funded by the Innovative Health Initiative.

The goal is to generate billions of protein-ligand interaction data points, make them openly available, and use them to train and benchmark AI models that predict how molecules bind to human proteins.

Led by Pfizer and spanning 18 partners across nine countries, LIGAND-AI targets thousands of proteins linked to rare, neurological, and cancer indications.

All data will be released under FAIR principles, designed for reuse by any lab, startup, or company worldwide.

The bet is that openness itself becomes an accelerator, reducing duplication and letting innovation compound across the ecosystem.

Proprietary data goes massive: Last week we featured the closed counterpart. Illumina unveiled its Billion Cell Atlas, a proprietary effort to map how one billion human cells respond to CRISPR-based genetic perturbations.

Built with founding partners including AstraZeneca, Merck, and Eli Lilly and Company, the atlas is the first phase of a planned five-billion-cell resource.

The dataset spans more than 200 disease-relevant cell lines and is designed to train large AI models for target validation and virtual cell modeling, all hosted within Illumina’s own analytics platform.

Where the tension lies: These approaches solve different problems. Open efforts like LIGAND-AI promise transparency, benchmarking, and broad participation.

Proprietary atlases offer consistency, depth, and tight integration with industrial pipelines.

The risk is fragmentation, where models trained on closed datasets cannot be compared to those built on open ones.

The more interesting question is whether open science can move fast enough at scale to rival commercial platforms. The race is on… which side do YOU think has more potential?

Brain Booster

Before a new drug can be approved for public use, it must pass through several phases of clinical trials. What is the main goal of Phase I in this process?


Select the right answer! (See explanation below and source)

What Caught My Eye

PHARMA AI

The Line Between Pharma and AI Is Starting to Disappear

This week made something uncomfortable clear. Pharma is no longer experimenting with AI… it is absorbing it. AstraZeneca buying Modella and insitro acquiring CombinAbleAI seem to mark the beginning of a new trend: a shift from collaboration to control.

Platforms that once lived on the edge of R&D are being quickly acquired and privatized.

The reason is simple. AI is no longer just accelerating individual tasks like target discovery or molecule design. It is shaping how decisions are made across R&D. Owning the platform means owning the data pipelines, models, and iteration speed, not just licensing them.

I bring this up because it reminded me of last year’s report from CB Insights, which showed leading pharma companies running dozens of AI partnerships and investments at once.

At some point, the distinction between “pharma” and “AI-first pharma” may disappear.

When discovery, development, and strategy are all model-driven, the question becomes less about adopting AI and more about whether a drug company can exist without it.

Byte-Sized Break

📢 Other Happenings in Healthcare AI

  • Isomorphic Labs, a Google-backed AI drug discovery startup, has delayed its first clinical trials to late 2026 (from 2025) [Link]

  • ECRI, a nonprofit known for evaluating medical tech and safety, ranked AI chatbot misuse as 2026’s top health tech hazard, warning that unregulated, misleading responses from tools like ChatGPT could cause patient harm despite growing use in care. [Link]

  • The Gates Foundation and OpenAI launched Horizon 1000, a $50M initiative starting in Rwanda to bring AI-powered healthcare support to 1,000 African clinics by 2028, aiming to ease clinician workloads and improve care access. [Link]

Resources To Come

Just a quick peek (as promised last week): we’re building out a series of resources, trackers, and dashboards for you… for FREE!

Over the past year of doing this, we’ve collected and tracked a large body of data. All manually curated!

That means no going to ChatGPT and asking “Hey, give me a list of AI Healthcare Companies…” We vet every company, so be on the lookout. (I know I keep saying it, but it does take a long time.)

Coming soon…

Have a Great Weekend!

❤️ Help us create something you'll love—tell us what matters!

💬 We read all of your replies, comments, and questions.

👉 See you all next week! - Bauris

Trivia Answer: B) To determine the drug’s safety and appropriate dosage

Phase I clinical trials are the first stage of testing a new drug in humans. The primary goal is to evaluate the drug’s safety profile and identify a safe dosage range while monitoring for side effects. These trials typically involve a small number of participants and help researchers understand how the drug behaves in the human body before larger efficacy studies begin. [Source]

How did we do this week?
