AI for Biology: Why It’s a Harder Problem Than Self-Driving

More time and more data ≠ a better AI system

TL;DR

  • Self-driving AI works in a world where inputs and outputs are predictable. Biology does not work that way: outcomes vary even under identical conditions.

  • Twin studies and real-world biology show irreducible variance: the same DNA, same environment, but different health outcomes.

  • For investors, the opportunity isn’t in companies that promise certainty—it’s in those that design products to navigate uncertainty better.

Using AI to solve human biology is often sold with the same inevitability as self-driving cars: messy today, solved tomorrow. Just give it more data and one day it will be able to predict disease outcomes.

But we believe that framing is dangerously wrong.

Most of today’s notable AI use cases succeed in environments that are relatively predictable and controllable. In contrast, digital biology operates in an environment where outcomes are variable and often unpredictable. This difference—the contrast between systems that behave consistently versus those that respond variably—is what makes building AI for biology especially difficult.

Deterministic vs. Stochastic: The Biggest Barrier

For readers less familiar with the terms: deterministic systems behave consistently—the same input produces the same output every time. Stochastic systems behave variably—the same input can lead to different outcomes.

Here’s the simplest way to put it, using self-driving cars as a comparison:

  • Self-driving: The AI controls a deterministic agent (the car). When it decides to brake, the car brakes, and nothing else happens.

  • Digital biology: The AI “intervenes” on a stochastic agent (the body). When it decides to treat, a range of outcomes is possible. For example, three patients may receive the same cancer therapy: one responds well, another shows no improvement, and a third suffers severe side effects. The same input produces multiple possible outputs.

In driving: deviation from the ideal is error.
In biology: deviation from the ideal is expected.

That’s why digital biology can’t converge to a clean solution the way self-driving can. Biology isn’t noisy data waiting to be tamed—it’s a stochastic system where randomness is part of the design.
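
To make the contrast concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than a real model: `brake` is a toy deterministic function, `treat` is a toy stochastic one, and the outcome weights are invented, not clinical data.

```python
import random

def brake(speed_kmh: float, pedal: float) -> float:
    """Deterministic toy model: the same inputs always give the same new speed."""
    return max(0.0, speed_kmh - 40.0 * pedal)

# Hypothetical outcome probabilities for a single therapy (not real clinical data).
OUTCOMES = ["responds", "no improvement", "severe side effects"]
WEIGHTS = [0.5, 0.35, 0.15]

def treat(rng: random.Random) -> str:
    """Stochastic toy model: the same therapy yields an outcome drawn from a distribution."""
    return rng.choices(OUTCOMES, weights=WEIGHTS, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Apply the identical input 1,000 times to each system and collect the distinct results.
brake_results = {brake(100.0, 0.5) for _ in range(1000)}
treat_results = {treat(rng) for _ in range(1000)}

print("distinct braking results:", brake_results)
print("distinct treatment results:", treat_results)
```

The braking call collapses to a single value, while the treatment call spreads across several outcomes even though nothing about the input changed.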

Twin Studies: Proof in the Real World

Identical twins are the cleanest natural experiment we have for demonstrating the stochasticity of human biology. For twins raised together, we can safely assume:

  • Same DNA.

  • Same early environment.

And yet we still observe:

  • Divergent health outcomes over time.

Despite the similarities, one twin may develop cancer while the other does not. One may face autoimmune disease while the other remains healthy. Over time, factors like epigenetic drift, immune exposures, and microbiome differences accumulate, creating divergent outcomes even in genetically matched individuals.

Twin studies show that even with matched inputs, biology yields different outputs. This isn’t missing data. It’s irreducible variance.
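
A small simulation can illustrate the arithmetic behind discordant twins. This is a sketch under loud assumptions: the risk `P = 0.3` is hypothetical, and each twin's outcome is treated as an independent draw at the same risk, which is a deliberate simplification (real twin outcomes are correlated). Under those assumptions, the share of pairs where exactly one twin gets sick should be close to 2p(1 − p).

```python
import random

def simulate_twin_pairs(p: float, n_pairs: int, seed: int = 0) -> float:
    """Return the fraction of pairs in which exactly one twin develops the disease.

    Assumption: both twins have the same risk p and their outcomes are
    independent draws. This is a simplification for illustration only.
    """
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_pairs):
        twin_a = rng.random() < p
        twin_b = rng.random() < p
        if twin_a != twin_b:
            discordant += 1
    return discordant / n_pairs

P = 0.3  # hypothetical lifetime risk, not taken from any study
observed = simulate_twin_pairs(P, n_pairs=100_000)
expected = 2 * P * (1 - P)  # probability that exactly one of two draws is positive

print(f"simulated discordance: {observed:.3f}, analytic: {expected:.3f}")
```

Even though both "twins" enter the simulation with identical inputs, a large share of pairs end up with different outcomes.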

Beyond Determinism: Other Factors That Make Biology Harder

Even if we accept biology’s stochastic nature, other layers add difficulty:

  • Environment: Driving faces probabilistic challenges like other drivers and weather. Biology faces probabilistic factors like comorbidities, behaviors, and micro-environments.

  • Feedback loops: Driving offers fast, high-volume real-world validation. Biology’s feedback is slow, confounded, and ethically constrained.

  • Observability: Many biological states can’t be measured in real time, unlike sensors in cars.

  • Intervention tools: Software commands guarantee execution; biology’s interventions don’t.

  • Path to progress: For driving, more data steadily improves control. For biology, variance remains irreducible.

All of this adds layers of complexity and confounding that sharply increase the difficulty of building AI for biology, even if biology could in principle be made deterministic. It’s not just a matter of more data: the nature of the system itself resists clean solutions.
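
The "more data" point can be sketched numerically. Assume, purely for illustration, that an outcome is a coin flip with a true risk of 0.3 (`P_TRUE`, a made-up number). More training data helps us estimate that risk, but the error in predicting any one individual's outcome, measured here with the Brier score, levels off near p(1 − p) instead of falling to zero.

```python
import random

def brier_after_training(p_true: float, n_train: int, n_test: int, seed: int) -> float:
    """Estimate the risk from n_train samples, then score it on n_test fresh outcomes.

    Returns the Brier score: the mean squared gap between the predicted
    probability and what actually happened (0 or 1).
    """
    rng = random.Random(seed)
    train = [rng.random() < p_true for _ in range(n_train)]
    test = [rng.random() < p_true for _ in range(n_test)]
    p_hat = sum(train) / n_train  # best single-number estimate of the risk
    return sum((p_hat - y) ** 2 for y in test) / n_test

P_TRUE = 0.3                    # hypothetical true risk, for illustration only
FLOOR = P_TRUE * (1 - P_TRUE)   # irreducible part of the Brier score

scores = {n: brier_after_training(P_TRUE, n, n_test=20_000, seed=1)
          for n in (10, 1_000, 100_000)}

for n, score in scores.items():
    print(f"training samples: {n:>7}  Brier score: {score:.3f}  floor: {FLOOR:.3f}")
```

In this toy setup, going from a thousand to a hundred thousand training samples barely moves the score, because what remains is the randomness of the outcome itself.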

What It Would Take to Make Biology "Deterministic"

For the sake of discussion, making biology deterministic would require:

  1. Perfect biological observability: real-time multi-omic and microenvironment data. In practice, this isn’t feasible because signals decay quickly, most measurements are destructive, and continuous monitoring across every cell type is beyond current technology. Even advanced sensors struggle with resolution and timing, leaving critical blind spots.

  2. Fully causal models: we would need to explain and predict every biological interaction. But biology is too complex, with countless nonlinear pathways and context-dependent effects that make fully causal modeling an unattainable goal.

  3. Intervention control: interventions such as gene editing, cell therapy, or synthetic biology would have to guarantee the intended response. In reality, interventions often have off-target effects, immune complications, and unpredictable long-term outcomes, making guaranteed control unrealistic.

  4. Mass experimentation at scale: experiments would have to run free of ethical and practical constraints. Large-scale human experimentation cannot be run without such limits, and animal models can’t fully capture human biology, leaving a permanent gap in validation.

  5. Elimination of inherent stochastic processes: gene expression, immune responses, and cellular interactions include random fluctuations as a built-in feature. This means even identical conditions can yield different outcomes, limiting predictability.

In other words, you’d need the body to operate like a machine. Not impossible, but highly improbable.

Physics offers a useful analogy here: under the uncertainty principle, you can’t know both the position and the momentum of a particle with arbitrary precision at the same time, a hard limit of nature. Biology has its own version of this problem: you can’t measure all the relevant states of a living system simultaneously.

These are not gaps in our technology. They are ceilings on knowability.

What This Means for Investors

AI in biology will absolutely improve our ability to predict, diagnose, and personalize. But investors should be cautious when evaluating claims.

This space is not uninvestable, but the winners will be the companies that acknowledge biology’s inherent variance and design their products with those limits in mind. When you hear a pitch, ask:

  • Does the team understand the stochastic nature of biology?

  • Are they modeling distributions rather than certainties?

  • Have they built their product strategy around working with irreducible uncertainty rather than pretending it doesn’t exist?
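
As a rough illustration of what "modeling distributions rather than certainties" can look like in practice, the sketch below contrasts a single-answer prediction with a full outcome distribution. The cohort is invented for the example, and the function names `point_prediction` and `outcome_distribution` are hypothetical, not any company's actual method.

```python
from collections import Counter

def point_prediction(outcomes: list[str]) -> str:
    """The 'certainty' framing: report only the single most common outcome."""
    return Counter(outcomes).most_common(1)[0][0]

def outcome_distribution(outcomes: list[str]) -> dict[str, float]:
    """The 'distribution' framing: report the share of each observed outcome."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {outcome: count / total for outcome, count in counts.items()}

# Hypothetical cohort of 100 patients on the same therapy (made-up numbers).
cohort = (["responds"] * 55
          + ["no improvement"] * 30
          + ["severe side effects"] * 15)

point = point_prediction(cohort)
dist = outcome_distribution(cohort)

print("point prediction:", point)
for outcome, share in dist.items():
    print(f"{outcome}: {share:.0%}")
```

The point prediction hides the patients who see no improvement or suffer severe side effects, while the distribution keeps that risk visible, which is the kind of output a product built around uncertainty would surface.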

Backing AI biology requires conviction that a company isn’t overselling determinism. The real opportunity lies in those that navigate uncertainty better, and not those that promise to eliminate it.

Join us at The Healthcare Syndicate as we back the most ambitious founders 10Xing the standard of healthcare!
