Computational Biology · AI/ML · Opinion

Biology Is Noise. AI Is Pattern. Humans Are the Bridge.

Biological systems are fundamentally stochastic. Understanding what that means for AI in biology is not a philosophical exercise. It is a practical prerequisite for building tools that actually work.

There is a moment that every computational biologist experiences at least once, usually early in their career and usually with a look of confusion on their face. You run your analysis pipeline on two biological replicates from the same organism, same growth condition, same timepoint. The data come back different. Not a little different. Noticeably, frustratingly, meaningfully different. You check your code. You check your parameters. You check your inputs. Everything is correct. The biology is just different.

That experience is not a quality control failure. It is an introduction to one of the most fundamental properties of living systems: stochasticity. Biology is not a deterministic machine that produces the same output given the same input. It is a probabilistic process operating at the edge of thermodynamic noise, and understanding that has enormous consequences for what we can reasonably expect artificial intelligence to do inside it.


What Biological Stochasticity Actually Means

The term stochasticity gets used loosely in biology. It is worth being precise. Biological stochasticity refers to randomness that is intrinsic to the system, not an artifact of measurement error or experimental imprecision. Even if you could measure a biological system perfectly, with zero technical noise, you would still observe variability. That variability is real. It is the biology.

Intrinsic Noise: The Molecular Layer

Gene expression is the canonical example. A given gene in a given cell does not produce a fixed number of mRNA transcripts per unit time. Transcription is governed by the stochastic binding and unbinding of transcription factors to promoters, events that occur at the scale of individual molecules. When a cell contains only a handful of molecules of a given regulatory protein, the timing and frequency of binding events are set by random thermal motion. The result is that genetically identical cells in the same environment can exhibit dramatically different levels of gene expression at any given moment.

Seminal work by Elowitz and colleagues demonstrated this directly in 2002. They placed two distinguishable fluorescent reporters under identical promoters in isogenic E. coli and showed that the reporters were expressed at different levels from cell to cell, and even within the same cell, and that this variability was not noise in the measurement but genuine heterogeneity in gene expression. The part of that variability that is uncorrelated between the two reporters in a single cell is intrinsic noise. It cannot be engineered away. It is a feature of molecular biology at low copy numbers.
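The low-copy-number effect described above can be illustrated with a small simulation. What follows is a minimal sketch, not a model of any real gene: it assumes the simplest possible birth-death process (constant transcription rate, first-order degradation) and uses the Gillespie algorithm to simulate genetically identical cells one at a time. The rate constants and function names are hypothetical, chosen only so that the expected copy number is about ten per cell.

```python
import random
import statistics

def simulate_mrna_count(k_tx, k_deg, t_end, rng):
    """Gillespie simulation of one cell.

    mRNA is produced at constant rate k_tx and each copy is degraded
    at rate k_deg. Returns the copy number at time t_end.
    """
    t, n = 0.0, 0
    while True:
        total_rate = k_tx + k_deg * n
        t += rng.expovariate(total_rate)      # waiting time to the next event
        if t > t_end:
            return n
        if rng.random() * total_rate < k_tx:
            n += 1                            # transcription event
        else:
            n -= 1                            # degradation event (only possible when n > 0)

def simulate_population(n_cells=500, k_tx=2.0, k_deg=0.2, t_end=50.0, seed=0):
    """Simulate identical cells with identical (hypothetical) rate constants."""
    rng = random.Random(seed)
    return [simulate_mrna_count(k_tx, k_deg, t_end, rng) for _ in range(n_cells)]

counts = simulate_population()
mean_count = statistics.fmean(counts)
fano = statistics.variance(counts) / mean_count
print(f"mean copies per cell: {mean_count:.1f}")
print(f"range across identical cells: {min(counts)} to {max(counts)}")
print(f"Fano factor (variance / mean): {fano:.2f}")
```

Every simulated cell has the same parameters and the same environment, yet the copy numbers spread widely around the mean. For this simple process the spread is roughly Poisson, so the Fano factor sits near one; real promoters that switch between active and inactive states are typically noisier than this.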

Extrinsic Noise: The Environmental Layer

Layered on top of intrinsic noise is extrinsic noise: variability that arises from differences in the cellular environment. Cells in the same culture vessel do not all experience exactly the same local concentrations of nutrients, oxygen, and signaling molecules. They are at different stages of the cell cycle. They have different numbers of ribosomes. Their histories of gene expression differ. Extrinsic noise sources can correlate across genes within a cell, which is why two genes in the same cell often show correlated fluctuations even when they are not directly regulated by each other.

The distinction between intrinsic and extrinsic noise matters practically. Intrinsic noise sets a floor on how precisely gene expression can be controlled. Extrinsic noise can in principle be reduced by better experimental control, but in living organisms it is never eliminated.
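The intrinsic and extrinsic components can be estimated separately from dual-reporter data. The sketch below applies the standard dual-reporter estimators from Elowitz et al. (reference 01) to simulated cells. The simulation itself is an assumption made for illustration: each cell gets a shared multiplicative factor (extrinsic) and two independent per-reporter factors (intrinsic), with arbitrary magnitudes.

```python
import random

def noise_decomposition(c, y):
    """Dual-reporter noise estimates from paired per-cell reporter levels.

    c[i] and y[i] are the two reporter intensities measured in cell i.
    Returns squared intrinsic, extrinsic, and total noise.
    """
    n = len(c)
    mean_c = sum(c) / n
    mean_y = sum(y) / n
    norm = mean_c * mean_y
    # Intrinsic: how much the two reporters disagree inside the same cell.
    intrinsic_sq = sum((ci - yi) ** 2 for ci, yi in zip(c, y)) / n / (2 * norm)
    # Extrinsic: how much the two reporters co-vary across cells.
    extrinsic_sq = (sum(ci * yi for ci, yi in zip(c, y)) / n - norm) / norm
    return intrinsic_sq, extrinsic_sq, intrinsic_sq + extrinsic_sq

def simulate_cells(n_cells=5000, seed=0):
    """Hypothetical cells: one shared factor per cell, one private factor per reporter."""
    rng = random.Random(seed)
    c, y = [], []
    for _ in range(n_cells):
        shared = rng.lognormvariate(0.0, 0.3)          # extrinsic: cell-wide state
        c.append(100.0 * shared * rng.gauss(1.0, 0.1))  # intrinsic: reporter 1
        y.append(100.0 * shared * rng.gauss(1.0, 0.1))  # intrinsic: reporter 2
    return c, y

c, y = simulate_cells()
intrinsic_sq, extrinsic_sq, total_sq = noise_decomposition(c, y)
print(f"intrinsic noise^2: {intrinsic_sq:.4f}")
print(f"extrinsic noise^2: {extrinsic_sq:.4f}")
print(f"total noise^2:     {total_sq:.4f}")
```

With these assumed magnitudes the extrinsic term dominates, which is why the two reporters rise and fall together across cells. The intrinsic term is what remains even if every extrinsic source could be held fixed.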

Stochasticity at the Systems Level

Stochastic effects are not confined to individual genes. They propagate through biological networks. A random fluctuation in the abundance of one protein can cascade through a signaling pathway, shifting the state of the entire cell. This is not a bug in biological design. Theoretical work has shown that noise can be actively exploited by biological systems to generate phenotypic diversity within a clonal population, essentially a bet-hedging strategy that increases the probability of some individuals surviving an unpredictable environmental challenge.

The practical implication is that a biological system's response to a perturbation cannot be predicted from a single measurement or a single experiment. The response is a distribution, not a point estimate.

A biological system's response to a perturbation is a distribution, not a point estimate. Any tool that treats it otherwise is already wrong before it runs.


What AI Brings to a Noisy System

Given that biological systems are fundamentally stochastic, the entry of machine learning into biology raises a specific and underappreciated question: what exactly is the model learning, and is that the thing we want it to learn?

Machine learning algorithms learn from data. In biology, that data is generated by noisy, stochastic processes. The model's learned representation is therefore, at best, an approximation of the underlying biology, and at worst, a representation of the particular noise structure of the training dataset. This is not an argument against AI in biology. It is a requirement for intellectual honesty about what AI models in biology actually represent.

What AI Does Well in Stochastic Systems

The case for AI in biology is real and substantial. Machine learning models are exceptionally good at identifying statistical regularities across large datasets. When those regularities are genuinely biological, the models find them. AlphaFold2 is the most dramatic example: by learning from the accumulated evolutionary record of protein sequences and structures, it found regularities that capture the physical constraints of protein folding with a precision that had eluded explicit computational methods for decades. The model did not need to know about stochasticity in protein folding pathways. It learned a mapping from sequence to structure that is accurate because structure is largely determined by sequence, and the training data was large enough to average across the noise.

The same logic applies across a range of problems where signal genuinely dominates noise in sufficiently large datasets: variant effect prediction, drug-target interaction modeling, cell type classification from single-cell transcriptomics. In each of these domains, AI has demonstrated that it can extract meaningful signal from noisy biological data, not by eliminating the noise but by learning in spite of it.

Where AI Breaks Down

The breakdown cases are instructive. They tend to occur at precisely the boundary where biological stochasticity is not a background nuisance but the phenomenon of interest.

Consider cell fate decisions in development. A multipotent progenitor cell will differentiate into one of several possible cell types. The outcome is probabilistic. Genetically identical cells in the same signaling environment do not all take the same fate. The stochastic dynamics of a small number of key transcription factors tip the decision. If you train a classifier to predict cell fate from gene expression snapshots, the model will learn the average transcriptomic signatures associated with each fate. But the classifier cannot tell you which individual cell will take which path, because that information is not fully encoded in the gene expression snapshot. Some of the variance is genuinely irreducible. The model will make persistent errors near the decision boundary that cannot be resolved by adding more data or improving the architecture, because the data itself does not contain the information.
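The ceiling on classifier performance can be made concrete with a toy simulation. This is a sketch under strong assumptions, not a model of any real lineage: a single hypothetical marker level stands in for the whole expression snapshot, and the true probability of taking fate A is a logistic function of that marker. The classifier here is an oracle that knows the true probability exactly, which is the best any model could do.

```python
import math
import random

def simulate_fates(n_cells=20000, seed=0):
    """Each cell: (marker level, true P(fate A | snapshot), realized fate)."""
    rng = random.Random(seed)
    cells = []
    for _ in range(n_cells):
        x = rng.gauss(0.0, 1.0)                  # hypothetical marker level
        p_a = 1.0 / (1.0 + math.exp(-2.0 * x))   # assumed true fate probability
        took_fate_a = rng.random() < p_a         # the stochastic decision itself
        cells.append((x, p_a, took_fate_a))
    return cells

def oracle_accuracy(cells):
    """Accuracy of a classifier that knows the true probability for every cell."""
    correct = sum((p_a >= 0.5) == fate for _, p_a, fate in cells)
    return correct / len(cells)

def boundary_accuracy(cells, width=0.25):
    """Same oracle, restricted to cells whose marker level is near the boundary."""
    near = [(p_a >= 0.5) == fate for x, p_a, fate in cells if abs(x) < width]
    return sum(near) / len(near)

cells = simulate_fates()
print(f"oracle accuracy, all cells:      {oracle_accuracy(cells):.2f}")
print(f"oracle accuracy, boundary cells: {boundary_accuracy(cells):.2f}")
```

Even with perfect knowledge of the probabilities, the oracle is wrong on a substantial fraction of cells, and close to chance for cells near the boundary. A trained model can approach this ceiling but not exceed it, no matter how much snapshot data it sees.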

The same problem appears in antimicrobial resistance, where stochastic gene expression can produce drug-tolerant persister cells from an otherwise susceptible population. A model trained to predict resistance from genomic features will miss the phenomenon almost entirely because persistence is not primarily a genomic phenomenon. It is a phenotypic state produced by noise.

The places where AI models in biology fail most systematically are the places where stochasticity is not background noise to be averaged over. It is the signal.


The Compounding Problem: Training Data Is a Snapshot of a Moving Target

There is a second stochasticity problem that receives less attention than it deserves, and it operates at a different timescale. Biological systems evolve. Microbial populations under selection change their allele frequencies over days. Tumors accumulate mutations over months. Ecosystems shift their composition over years. The stochasticity is not just within experiments. It is across them.

An AI model trained on transcriptomic data from a bacterial strain cultured under one set of conditions will encounter a different organism if that strain is passaged for fifty generations under selection. The model has not changed. The biology has. This is a form of distribution shift that is not primarily a data collection artifact. It is an intrinsic property of living systems. It means that biological AI models have a time-validity problem that is sharper than in most other application domains, because the drift comes from the system itself evolving rather than from how the data were collected.

In my own work building machine learning models for metabolic network optimization in Novosphingobium, I encountered this directly. Models trained on one set of experimental conditions showed degraded predictive performance when applied to strains that had been adapted to different carbon sources, even though the genomic differences were small. The biology had moved underneath the model. Keeping a model current in a living biological system is not a one-time training problem. It is an ongoing process that requires human judgment about when the model needs to be retrained, augmented, or retired.
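One simple way to support that judgment is to track prediction error on newly measured samples against the error seen at validation time. The sketch below is a hypothetical monitor, not a description of any particular pipeline: the function name, the z-score rule, and the error values are all assumptions for illustration. It flags a batch for human review; it does not decide on its own whether to retrain.

```python
import statistics

def needs_review(baseline_errors, recent_errors, z_threshold=3.0):
    """Flag a batch whose mean error is well above the validation-time baseline.

    Returns (flag, z), where z is the excess of the recent mean error over the
    baseline mean, in units of the standard error expected for a batch this size.
    """
    base_mean = statistics.fmean(baseline_errors)
    base_sd = statistics.stdev(baseline_errors)
    standard_error = base_sd / len(recent_errors) ** 0.5
    z = (statistics.fmean(recent_errors) - base_mean) / standard_error
    return z > z_threshold, z

# Hypothetical absolute prediction errors (arbitrary units).
baseline = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11]   # at validation time
same_conditions = [0.11, 0.10, 0.12, 0.09]                     # original strain
adapted_strain = [0.18, 0.21, 0.17, 0.20]                      # after adaptation

for label, batch in [("same conditions", same_conditions),
                     ("adapted strain", adapted_strain)]:
    flagged, z = needs_review(baseline, batch)
    print(f"{label}: z = {z:.1f}, flag for review = {flagged}")
```

The threshold is a design choice, and a flag is only a prompt. Whether the right response is retraining, adding features, or retiring the model is still a call for the scientist who knows what changed in the biology.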


Why Humans Are Not Optional

The argument for keeping humans in the loop in biological AI is not sentimental. It is structural. There are specific things that human judgment provides in the context of stochastic biological systems that current AI architectures do not and cannot replicate.

Distinguishing Noise from Biology

An experienced biologist who looks at a dataset with an unexpected variance structure asks a different question than a model does. The model asks: what pattern does this data contain? The biologist asks: is this variance telling me something about the biology, or is this a plate effect, a batch effect, or a contamination? That question requires contextual knowledge that is often not in the data. It requires knowing that the cells on the edge of the plate tend to behave differently, that two of the samples were processed on a different day, that the growth medium for that batch was prepared by a different person. Models trained on data do not know these things unless they are encoded as features, and they often are not, because no one thought to record them at the time.

Formulating the Right Question

AI models in biology are answers to questions. The questions themselves are not generated by the models. They are generated by scientists who understand enough biology to know which questions matter and why. In a stochastic system where multiple outcomes are always possible, deciding which outcome is biologically relevant requires judgment that is embedded in a scientific tradition, a body of domain knowledge, and an understanding of what the research is ultimately for. That judgment is irreducibly human, not because machines cannot process information, but because the value system that determines which questions are worth asking is a human construction.

Acting Under Irreducible Uncertainty

The most critical role for humans in biological AI is at the point of decision under uncertainty. A model can output a probability distribution. It cannot decide what to do given that distribution in a context where the consequences of being wrong are asymmetric, where regulatory constraints apply, where patient safety is at stake, or where the right action depends on ethical considerations that are not encodable as a loss function.

In a drug development pipeline, a model might correctly identify that a compound has a 70% probability of being efficacious in a target population. The decision of whether to advance that compound to the next stage of clinical development is not a 70% decision. It involves cost considerations, alternative pipeline candidates, competitive landscape, patient population characteristics, and risk tolerance decisions that are made by humans who are accountable for the outcome in ways that models are not.

Knowing When the Model Is Wrong

Perhaps the most underappreciated human contribution in biological AI is the capacity to recognize when a model's output is biologically implausible and refuse to trust it. Models fail silently. They produce outputs that look statistically reasonable but are biologically nonsensical. Catching these failures requires domain knowledge that runs deep enough to recognize when a predicted pathway flux is thermodynamically impossible, when a predicted protein-protein interaction violates known structural constraints, or when a model's confidence intervals do not reflect the genuine uncertainty in the underlying biology. This is not a skill that scales by adding more compute. It scales by training more scientists.

Models fail silently. They produce outputs that look statistically reasonable and are biologically nonsensical. Catching that requires a scientist, not more data.


A Practical Framework for Human-in-the-Loop Biological AI

The question is not whether humans should be in the loop in biological AI. They must be. The question is how to design systems that integrate human judgment effectively rather than treating it as an afterthought or an override mechanism for edge cases.

From my experience building scientific software and computational pipelines, three principles hold up across different biological contexts.

First, uncertainty must be a first-class output. A model that outputs predictions without calibrated uncertainty estimates is not ready for use in biology. Scientists need to know not just what the model predicts but how confident it is, and they need that confidence to be honest. Overconfident models are more dangerous in stochastic biological systems than no model at all, because they create false precision that substitutes for judgment rather than informing it.

Second, the interface between model and scientist must be interpretable. A model whose reasoning is opaque cannot be effectively interrogated by a human who suspects it is wrong. Interpretability is not a feature request. It is a requirement for scientific use. The scientist needs to be able to ask: why did the model predict this? And the answer needs to be in terms that connect to biological mechanisms, not just feature importance scores.

Third, the feedback loop must be closed. Models trained on historical biological data and deployed without mechanisms for the scientist to flag errors, provide corrections, and trigger retraining will drift out of validity as the biology moves. The human in the loop is not just a check on the model at inference time. They are the mechanism by which the model is kept current in a world where the training distribution is always changing.
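The first principle is checkable. If a model states 90% prediction intervals, then roughly 90% of later measurements should fall inside them. The sketch below is a minimal coverage check with made-up numbers; the intervals, the observations, and the nominal level are hypothetical.

```python
def interval_coverage(intervals, observed):
    """Fraction of observations that fall inside their stated prediction interval."""
    hits = sum(low <= value <= high for (low, high), value in zip(intervals, observed))
    return hits / len(observed)

# Hypothetical example: a model claims 90% intervals of (4.0, 6.0) for ten samples.
nominal_level = 0.90
stated_intervals = [(4.0, 6.0)] * 10
measurements = [4.5, 5.2, 6.4, 5.9, 3.1, 5.0, 7.2, 4.1, 6.8, 5.5]

coverage = interval_coverage(stated_intervals, measurements)
print(f"nominal coverage: {nominal_level:.0%}, observed coverage: {coverage:.0%}")
if coverage < nominal_level:
    print("intervals are too narrow for these data: the model is overconfident")
```

Ten samples is far too few to draw a firm conclusion, and the comparison ignores sampling error in the coverage estimate itself. The point is only that calibration is an empirical property that can be monitored alongside accuracy, using the same feedback channel the third principle calls for.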


Deciding from a Distribution: What That Actually Looks Like

Saying that models should output distributions rather than point estimates is the easy part. The harder question is: what does a scientist or a decision-maker actually do with a distribution? This is not a rhetorical problem. It is a practical one, and there are well-developed precedents for it across several high-stakes fields that biology can learn from directly.

Weather Forecasting: The Oldest Proof of Concept

Numerical weather prediction has operated on probabilistic outputs for decades. A modern forecast does not tell you it will rain on Thursday. It tells you there is a 70% probability of precipitation above a certain threshold. Meteorologists do not collapse that distribution into a binary prediction before presenting it. They communicate the distribution directly, and decision-makers have learned to act on it. A farmer plants or does not plant based on that probability and the asymmetric cost of being wrong in either direction. An airline operations team uses it to pre-position maintenance crews. A city emergency manager decides whether to pre-treat roads based on the tails of the temperature distribution, not just the mean.

The key insight from weather forecasting is that the distribution is not communicated raw. Decades of research in probabilistic communication have established how to present uncertainty so that non-specialist decision-makers can act on it without misinterpreting it. The work of translating a probability distribution into an actionable decision is shared between the model, the communicator, and the decision-maker. None of them can do it alone.

Clinical Oncology: Survival Curves as Distributions

Oncologists have long made treatment decisions from distributional outputs, even if they do not always describe it that way. A Kaplan-Meier survival curve is an estimate of the distribution of time to event. When an oncologist presents a patient with a choice between two treatment regimens, the conversation is not "this drug will give you three more years." It is "this drug shifts the median survival from fourteen months to twenty-two months, but the variance is wide and we cannot predict where you individually will fall in that distribution." The patient then makes a decision that integrates that probabilistic information with their own values, their tolerance for treatment side effects, and their personal assessment of what the outcomes in the tails of the distribution mean for them.

This is distributional decision-making in practice. The model, in this case a statistical survival analysis, outputs a distribution. The clinician interprets it in light of patient-specific context. The patient makes a values-weighted choice under irreducible uncertainty. Each role is distinct and none is replaceable by either of the others.

Prognostic tools such as PREDICT Breast Cancer and, before it, Adjuvant! Online formalize this further. They estimate survival probabilities at fixed horizons such as ten years under different treatment options, and clinical guidance in several countries has drawn on them explicitly, not as a replacement for oncologist judgment, but as a structured input to it.

Genomic Medicine: Polygenic Risk Scores

Polygenic risk scores, now increasingly used in cardiovascular medicine and beginning to enter psychiatric and oncological practice, are another example of distributional output in clinical decision-making. A polygenic risk score does not tell you that a patient will develop coronary artery disease. It tells you that their genetic architecture places them at the 87th percentile of the population distribution for lifetime risk. That is a statement about where they sit in a distribution, not a prediction of what will happen to them individually.

How clinicians act on that information is a genuine area of ongoing research and debate. The Khera et al. 2018 study in Nature Genetics reported that individuals in roughly the top 8% of the polygenic score distribution for coronary artery disease had at least a threefold increase in risk, comparable to that of carriers of monogenic risk variants, and this finding has begun to shift guidelines toward treating high polygenic risk as a clinically actionable finding. But the decision of when and how to intervene still requires integrating the distributional risk estimate with the patient's age, existing risk factors, medication tolerance, and values. The distribution narrows the space of reasonable decisions. It does not make the decision.

From Flux Balance Analysis to Flux Distributions: A Case Study in Letting Go of the Point Estimate

For anyone who has worked in metabolic modeling, flux balance analysis is probably the tool they reached for first. It is elegant in a specific way: given a stoichiometric model of a metabolic network and a set of constraints, it returns one flux distribution that maximizes a defined objective, usually growth rate or product yield. You get a number. The flux through the reaction producing your target bioproduct is 4.7 millimoles per gram dry weight per hour. Now go engineer the strain toward that.

The problem is that this number is a mathematical construct, not a biological prediction. It is the optimal solution under a set of assumptions about the cell's objective function, the accuracy of the stoichiometric coefficients, the tightness of the exchange flux bounds, and the absence of regulatory constraints that the model does not encode. Real cells do not solve linear programs. They run stochastic gene expression, imperfect enzyme kinetics, and metabolite concentrations that fluctuate continuously. The point estimate that FBA returns has real value as a bound on what is theoretically achievable, but treating it as a prediction of what a cell will actually do is a category error that metabolic engineers have been quietly correcting for twenty years.

The field has developed several responses to this. Flux variability analysis, introduced by Mahadevan and Schilling in 2003, was an early acknowledgment that the single FBA solution is one point in a feasible solution space that may be very large. FVA computes, for each reaction, the full range of flux values compatible with optimal or near-optimal growth. What it returns is not a point but an interval: this reaction can carry anywhere from 1.2 to 9.8 millimoles per gram dry weight per hour while the cell remains at 95% of maximum growth. That is a range of feasible states rather than a single answer, though not yet a distribution, since it says nothing about how likely each value within the interval is. Even so, it changes the engineering question substantially.
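The idea behind FVA can be shown on a network small enough to enumerate. The sketch below is a brute-force illustration, not how FVA is implemented in practice, where each bound comes from solving a linear program. The toy network is entirely hypothetical: substrate uptake is capped at 10 units, the substrate can leave through two parallel routes that both feed biomass or through a byproduct drain, and the routes have assumed capacities of 8 and 6.

```python
from itertools import product

STEP = 0.5                       # grid resolution for the enumeration
UPTAKE_MAX = 10.0                # cap on substrate uptake (hypothetical)
UPPER_BOUNDS = {"route_1": 8.0, "route_2": 6.0, "byproduct": 10.0}

def feasible_states():
    """Enumerate steady states on a grid: uptake = route_1 + route_2 + byproduct."""
    grids = [[i * STEP for i in range(int(ub / STEP) + 1)]
             for ub in UPPER_BOUNDS.values()]
    for v1, v2, v_by in product(*grids):
        if v1 + v2 + v_by <= UPTAKE_MAX:
            yield {"route_1": v1, "route_2": v2, "byproduct": v_by,
                   "biomass": v1 + v2}          # both routes feed biomass

def flux_ranges(fraction_of_optimum=0.95):
    """Return the best biomass flux and, per reaction, its range near the optimum."""
    states = list(feasible_states())
    best = max(s["biomass"] for s in states)
    near_optimal = [s for s in states
                    if s["biomass"] >= fraction_of_optimum * best - 1e-9]
    ranges = {name: (min(s[name] for s in near_optimal),
                     max(s[name] for s in near_optimal))
              for name in UPPER_BOUNDS}
    return best, ranges

best, ranges = flux_ranges()
print(f"maximum biomass flux: {best}")
for name, (low, high) in ranges.items():
    print(f"{name}: {low} to {high}")
```

On this toy network the optimum is 10, and at 95% of it route 1 can carry anywhere from 3.5 to 8.0, route 2 from 1.5 to 6.0, and the byproduct drain from 0 to 0.5. The optimum pins down the total, but it leaves the split between the two routes wide open, which is exactly the ambiguity a single FBA solution hides.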

MCMC-based flux sampling goes further, sampling uniformly from the feasible flux polytope to characterize the distribution of metabolic states that are consistent with the model's constraints. The output is not a single optimal flux vector but a population of flux vectors, each representing a biologically plausible metabolic state. Ensemble modeling approaches extend this further by propagating uncertainty in model parameters, kinetic constants, and regulatory interactions through the model to produce distributions over predicted phenotypes.
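A toy version of flux sampling shows what the output looks like. The sketch below swaps in plain rejection sampling for MCMC, which is only workable because the example has three dimensions; genome-scale models need purpose-built samplers. The network is hypothetical: uptake capped at 10 units, two parallel routes to biomass with assumed capacities of 8 and 6, and a byproduct drain.

```python
import random
import statistics

UPTAKE_MAX = 10.0   # hypothetical cap on substrate uptake

def sample_fluxes(n_samples=2000, seed=0):
    """Uniform samples from the feasible region, by rejection sampling.

    Propose points uniformly in the bounding box of the three fluxes and keep
    those that satisfy the steady-state uptake constraint.
    """
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        v1 = rng.uniform(0.0, 8.0)       # route 1 to biomass
        v2 = rng.uniform(0.0, 6.0)       # route 2 to biomass
        v_by = rng.uniform(0.0, 10.0)    # byproduct drain
        if v1 + v2 + v_by <= UPTAKE_MAX:
            samples.append((v1, v2, v_by))
    return samples

samples = sample_fluxes()
biomass = [v1 + v2 for v1, v2, _ in samples]
cuts = statistics.quantiles(biomass, n=20)      # 19 cut points, 5% apart
p05, median, p95 = cuts[0], cuts[9], cuts[18]
print(f"biomass flux across sampled states: 5% = {p05:.1f}, "
      f"median = {median:.1f}, 95% = {p95:.1f} (FBA optimum is 10)")
```

The FBA optimum for this network is 10, but most sampled states sit well below it. Uniform sampling is a statement about what the constraints allow, not about what cells prefer, so the resulting distribution describes model uncertainty as much as biology.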

What Changes When You Replace the FBA Point Estimate with a Distribution

The practical consequences for decision-making in strain engineering are significant and worth being specific about.

Target identification shifts from single reactions to reaction classes. A classical FBA analysis might identify one reaction as the single optimal knockout target for improving yield. A flux sampling analysis reveals that a cluster of reactions in the same pathway all carry correlated, high-variance flux distributions, meaning any one of them is a plausible engineering target, but the choice among them cannot be resolved by the model. That is useful information. It tells the engineer that the model does not have the resolution to distinguish between these targets and that experimental data, not more modeling, is required to make the call.

High-variance reactions are flagged as sites of biological regulation or model uncertainty. When flux sampling shows that a reaction has very wide flux variability across feasible states, it means one of two things: either the cell genuinely has flexibility at that step and uses it adaptively, or the model's constraints at that step are too loose to be informative. Both conclusions are actionable. The first suggests a regulatory mechanism worth investigating experimentally. The second suggests where the model needs to be improved with tighter experimental bounds.

Yield predictions become risk-adjusted, not target-fixed. Classical FBA tells you the maximum theoretical yield of your target compound. Flux sampling tells you the distribution of yields across the feasible metabolic space. Most of that distribution almost always sits below the theoretical maximum, and it has a specific shape. If that distribution is narrow and centered near the theoretical maximum, the pathway is well-constrained and yield improvement is primarily an expression engineering problem. If the distribution is wide and skewed, it signals that the metabolic network has many competing routes for carbon flux and that the engineering problem is fundamentally about flux rerouting, not just expression tuning. These are different experiments. Knowing which one you are running before you start saves months.

DBTL cycle prioritization becomes probabilistic. In a design-build-test-learn workflow, distributional metabolic modeling allows you to rank candidate designs not by their predicted optimal performance but by the full shape of their performance distributions. A design with a slightly lower median yield but a much tighter distribution may be the better engineering bet than one with a higher median but high variance, because variance in a DBTL context translates directly to unpredictability across test conditions. This follows the same logic as the asymmetry assessment and tail focus components of the decision framework described later in this piece: you are not just optimizing the expected outcome, you are managing the risk of the tail outcomes.
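The ranking point is easy to see with numbers. The sketch below compares two hypothetical designs using made-up predicted yields (as fractions of theoretical maximum). Ranking by median and ranking by the 10th percentile give different answers; which criterion to use is an assumption the engineer has to make, not something the model supplies.

```python
import statistics

def summarize(yields):
    """Median, 10th percentile, and 10-90% spread of a set of predicted yields."""
    deciles = statistics.quantiles(yields, n=10)   # 9 cut points
    return {
        "median": statistics.median(yields),
        "p10": deciles[0],
        "spread": deciles[-1] - deciles[0],
    }

# Hypothetical predicted yields for two candidate designs.
designs = {
    "design_a": [0.20, 0.35, 0.50, 0.62, 0.70, 0.74, 0.78, 0.85, 0.90, 0.95],
    "design_b": [0.58, 0.60, 0.61, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.70],
}

summaries = {name: summarize(values) for name, values in designs.items()}
best_by_median = max(summaries, key=lambda name: summaries[name]["median"])
best_by_p10 = max(summaries, key=lambda name: summaries[name]["p10"])

for name, s in summaries.items():
    print(f"{name}: median = {s['median']:.3f}, p10 = {s['p10']:.3f}, "
          f"spread = {s['spread']:.3f}")
print(f"best by median: {best_by_median}")
print(f"best by 10th percentile: {best_by_p10}")
```

Design A wins on median and design B wins on the lower tail. If a failed build costs a full cycle, the lower-tail ranking may matter more; if the program can afford misses in exchange for occasional high performers, the median or upper tail may. The distribution makes that trade visible rather than resolving it.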

The challenge that flux distributions introduce, and where the human judgment requirement becomes most acute, is that a distribution does not tell you which strain to build next. It tells you which region of metabolic space is most likely to contain productive phenotypes. Translating that into a specific genetic intervention requires mechanistic knowledge of which genes control which fluxes, which regulatory interactions will resist the intervention, and what the realistic bounds on enzyme expression changes are in the host organism. That knowledge lives in the scientist's head, informed by literature and experimental history that no model fully encodes.

I encountered this directly in metabolic modeling work on Novosphingobium. Flux distributions across the central carbon metabolic network showed wide variability in reactions branching into the methylmalonyl-CoA pathway, the precursor route for several target bioproducts. The distribution identified the branch point as a high-priority region but could not distinguish between two competing interventions: increasing the expression of the branch point enzyme versus blocking the competing drain into the TCA cycle. That decision required returning to the literature on enzyme kinetics in related organisms and making a judgment call about which intervention was more likely to be tolerated by the cell's regulatory network. The model narrowed the question. The scientist answered it.

A Framework for Acting on Distributional Outputs

Across these precedents, a consistent decision-making structure emerges. It has four components that are worth making explicit for biological AI contexts.

Asymmetry assessment. The first question is whether the costs of errors are symmetric. If a false negative (missing a real signal) and a false positive (acting on a noise artifact) carry the same cost, the decision threshold sits near the center of the distribution. In most biological contexts, they do not. The cost of advancing a toxic compound to clinical trials is categorically different from the cost of dropping a promising one. The cost of missing an antimicrobial resistance event in a hospital outbreak is different from the cost of an unnecessary isolation measure. Distributional outputs force this asymmetry to be made explicit, which is one of their most important contributions to better decision-making.

Tail focus. For many high-stakes biological decisions, what matters is not the modal prediction but the behavior in the tails. In strain engineering, the question is not what happens to the average cell in a population. It is what fraction of the population achieves the target phenotype, and what fraction behaves pathologically. A model that outputs only a mean is discarding exactly the information needed to answer that question.

Scenario planning over single predictions. Rather than collapsing a distribution to a point estimate for planning purposes, experienced decision-makers in high-uncertainty domains use scenario planning. The distribution is discretized into representative scenarios, each with an associated probability, and decisions are stress-tested against each scenario. This approach, closely related to sensitivity analysis in strategic planning and to scenario-based risk modeling in finance, is increasingly being applied in drug development portfolio management and is directly applicable to biological experimental design.

Explicit value weighting. Distributions become decisions only when weighted by values. Two decision-makers looking at the same survival curve may make different treatment choices because they weight years of life against quality of life differently. Two biotech companies looking at the same compound efficacy distribution may make different go/no-go calls because their pipeline alternatives and capital constraints differ. Distributional outputs do not remove this value-weighting step. They clarify that it is the human's job, not the model's.
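The asymmetry assessment and value weighting steps have a simple arithmetic core. If acting on a signal that turns out to be noise costs c_fp, and failing to act on a signal that is real costs c_fn, then acting has the lower expected cost whenever the probability that the signal is real exceeds c_fp / (c_fp + c_fn). The sketch below encodes that rule. The cost ratios are hypothetical placeholders, and choosing them is precisely the human judgment the framework describes.

```python
def action_threshold(cost_false_positive, cost_false_negative):
    """Probability above which acting has lower expected cost than not acting.

    Acting on noise costs cost_false_positive; ignoring a real signal costs
    cost_false_negative. Act when (1 - p) * c_fp < p * c_fn.
    """
    return cost_false_positive / (cost_false_positive + cost_false_negative)

def should_act(p_real, cost_false_positive, cost_false_negative):
    """Apply the expected-cost rule to a model's probability estimate."""
    return p_real > action_threshold(cost_false_positive, cost_false_negative)

# Hypothetical cost ratios, for illustration only.
scenarios = [
    # (description, model probability, cost of acting wrongly, cost of missing)
    ("symmetric costs", 0.70, 1.0, 1.0),
    ("advance compound, failure is very costly", 0.70, 9.0, 1.0),
    ("isolate patient, missed outbreak is very costly", 0.10, 1.0, 20.0),
]

for description, p, c_fp, c_fn in scenarios:
    threshold = action_threshold(c_fp, c_fn)
    decision = "act" if should_act(p, c_fp, c_fn) else "do not act"
    print(f"{description}: p = {p:.2f}, threshold = {threshold:.2f} -> {decision}")
```

The same 70% estimate leads to opposite decisions under different cost structures, and a 10% estimate can justify action when a miss is expensive enough. Real decisions involve costs that are uncertain, multi-dimensional, and sometimes not expressible as numbers at all, so this rule is a way of exposing the assumptions, not a substitute for making them.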

Where Biology Is Still Catching Up

Despite these precedents, most biological AI tools still communicate their outputs as point estimates with optional confidence intervals that are frequently ignored. This is a design failure, not a user failure. Tools that surface uncertainty only in supplementary figures train their users to ignore it. The norm in biological AI needs to shift toward what weather forecasting and clinical decision support have already established: uncertainty is part of the primary output, not a footnote to it.

Bayesian deep learning approaches, conformal prediction, and ensemble methods all provide routes to calibrated uncertainty quantification in biological AI models. The technical toolkit exists. What lags behind is the design culture and the training, for both tool builders and tool users, that makes distributional thinking the default rather than the exception.
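Of these, split conformal prediction is the simplest to show. The sketch below wraps a fixed predictor with an interval whose half-width comes from held-out calibration residuals. Everything about the data is an assumption for illustration: the "model" is a hypothetical stand-in function, and the measurements are simulated with constant Gaussian noise.

```python
import math
import random

def conformal_halfwidth(cal_predictions, cal_observations, alpha=0.1):
    """Split conformal: interval half-width from calibration residuals.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual, which
    gives at least 1 - alpha marginal coverage when calibration and future
    data are exchangeable.
    """
    scores = sorted(abs(y - p) for p, y in zip(cal_predictions, cal_observations))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    return scores[rank - 1] if rank <= n else math.inf

def toy_model(x):
    """Hypothetical stand-in for an already trained predictor."""
    return 2.0 * x

def make_data(n, rng):
    """Simulated measurements: linear signal plus unit Gaussian noise."""
    xs = [rng.uniform(0.0, 5.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

rng = random.Random(0)
cal_x, cal_y = make_data(500, rng)      # held-out calibration set
test_x, test_y = make_data(2000, rng)   # new samples from the same process

q = conformal_halfwidth([toy_model(x) for x in cal_x], cal_y, alpha=0.1)
covered = sum(abs(y - toy_model(x)) <= q for x, y in zip(test_x, test_y))
coverage = covered / len(test_x)
print(f"interval: prediction +/- {q:.2f}, empirical coverage: {coverage:.1%}")
```

The guarantee rests on exchangeability between calibration data and the samples the model later sees. That is the assumption an evolving biological system breaks, so conformal intervals need the same ongoing revalidation as the model they wrap.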

The single-cell field is probably furthest ahead here. Tools like scVelo estimate transition probabilities between cell states from RNA velocity, and pseudotime methods like Monocle place cells along continuous trajectories rather than forcing deterministic cell fate assignments. The community has developed visual conventions for communicating this uncertainty, and researchers in the field have developed interpretive norms for it. That is what distributional thinking in biological AI looks like when it matures. It is not just a better model. It is a new relationship between the model and the scientist who uses it.

A distribution without a decision framework is just a plot. The work is in building the shared language between model outputs and human judgment that lets one inform the other.


Conclusion: Noise Is Not a Problem to Be Solved

The most important reframe I have arrived at, after years of building computational tools for biological systems, is that stochasticity in biology is not primarily a problem to be solved. It is a property to be understood. Evolution has had billions of years to build deterministic biological systems, had determinism been advantageous. It has not, in large part because noise is useful. It generates diversity, enables bet-hedging, and allows populations to explore phenotypic space without committing every individual to a single strategy.

AI tools that treat biological variability as noise to be filtered are systematically discarding information. The best tools in biological AI are the ones that model the distribution, not just the mean, and that give scientists the information they need to interpret and act on that distribution intelligently.

The future of AI in biology is not the replacement of biological expertise. It is the amplification of it. Models handle the pattern recognition at scale. Scientists handle the judgment, the question formulation, the interpretation, and the decision-making under irreducible uncertainty. That division of labor is not a transitional arrangement while AI catches up. It reflects something fundamental about what science is and what living systems are.

The biology will always be noisier than the model. The job of the scientist is to know what that means.



Blaise Manga Enuh, PhD

Computational biologist and bioinformatics engineer at the Great Lakes Bioenergy Research Center. I build ML models, bioinformatics pipelines, and scientific software tools at the intersection of microbial biology and machine learning.
