Scientific Software · AI/ML · Opinion

The Evolution of Software Architecture in Science and What AI Changes Next

From monolithic Fortran subroutines to cloud-native ML pipelines, scientific software has been quietly transformed. Here is what that history tells us about where we are going.

When I built my recent scientific software tool, a Python application for automated microbial growth kinetic modeling, I made a decision that felt obvious at the time: I would give it a web-based interface. Not because anyone asked me to. Because I had watched too many brilliant pieces of scientific software die quiet deaths on someone's hard drive, never used by the people they were built for, because they required a command line and three environment variables to run.

That decision made me think harder than I expected about what scientific software actually is, who it is for, and how the answers to those questions have changed dramatically over the past several decades. This post is my attempt to trace that arc and to think seriously about what AI means for its next chapter.


A Brief, Honest History

Scientific computing did not begin with usability in mind. It began with necessity. In the mid-twentieth century, researchers needed to solve systems of equations that no human could compute by hand at useful scale. The tools they built were engineered for correctness and performance, not for the scientist in the next lab who did not know Fortran.

Era 1: The Monolithic Era (1950s to 1980s)

Scientific software in this period was inseparable from the hardware it ran on. Fortran dominated because it compiled efficiently to the numerical hardware of the day. Programs were monolithic: a single, large codebase in which data structures, computation, and output were tightly coupled. Modularity was aspirational. Documentation was sparse. Reuse was rare.

The implicit assumption was that the person writing the software and the person using it were the same person, or at least colleagues in the same building who shared an operating system and a manual. Software was not a product. It was a working artifact of a specific experiment.

Era 2: The Scripting and Modularity Era (1990s to 2000s)

The rise of Perl, Python, and R in the life sciences fundamentally changed what it meant to write scientific software. These languages lowered the barrier to entry enormously. A biologist who could not write C could write a Python script. A statistician who had never seen a compiler could build a fully functional analysis pipeline in R.

This era also introduced the idea of the scientific software package, a reusable, distributable unit of computational functionality. Bioconductor, Biopython, SciPy: these projects established that scientific software could be shared, versioned, and built upon. The community-maintained package repository became the new infrastructure of computational science.

The limitation was fragmentation. Reproducibility was still mostly aspirational. Dependency hell was real. Every lab had its own flavour of the same pipeline, slightly different, not quite compatible, passed down through generations of graduate students like oral tradition.

Era 3: The Pipeline and Cloud Era (2010s)

Next-generation sequencing broke scientific software in the best possible way. The data volumes produced by instruments like Illumina's HiSeq were simply too large to handle with the tools and architectures of the previous era. This forced a reckoning with scalability, and out of that reckoning came a new generation of scientific software design principles.

Workflow management systems such as Snakemake, Nextflow, and CWL became the grammar of large-scale bioinformatics. Containerisation with Docker and Singularity solved the reproducibility problem that had haunted the scripting era. Cloud computing decoupled analysis from local hardware. For the first time, a pipeline built at one institution could run identically at another.

The architecture shift was profound: from monolithic scripts to modular, composable, orchestrated workflows. Software became infrastructure. The computational biologist became, in important ways, a software engineer.
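That shift is easiest to see in miniature. Below is a toy sketch of the composable-step idea in plain Python; the step names and data are invented for illustration, and real workflow managers such as Snakemake or Nextflow add dependency tracking, caching, and cluster execution on top of this pattern.

```python
# Hypothetical pipeline steps: each is a small, pure function with
# explicit inputs and outputs, so steps can be tested, swapped, or reused.

def quality_filter(reads):
    """Keep reads at or above a (made-up) quality threshold of 30."""
    return [r for r in reads if r["quality"] >= 30]

def count_by_sample(reads):
    """Aggregate read counts per sample."""
    counts = {}
    for r in reads:
        counts[r["sample"]] = counts.get(r["sample"], 0) + 1
    return counts

def pipeline(reads):
    # Composition replaces the monolithic script: the workflow is just
    # the ordered application of independent steps.
    return count_by_sample(quality_filter(reads))

reads = [
    {"sample": "A", "quality": 35},
    {"sample": "A", "quality": 12},
    {"sample": "B", "quality": 40},
]
print(pipeline(reads))  # prints {'A': 1, 'B': 1}
```

The payoff is that each step can be validated in isolation and recombined into new workflows, which is exactly what the orchestration layer of a real workflow manager formalises.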

The history of scientific software is really the history of scientists being forced, reluctantly and then gratefully, to adopt the engineering practices that the rest of the software world had already worked out.


Where We Are Now And What Still Breaks

We are living in a genuinely interesting moment. The tools available to a computational biologist in 2026 are extraordinary by historical standards. Cloud HPC is accessible. Containerised workflows are reproducible. Package management is mostly solved. Python has won.

And yet. Walk into most wet labs and you will find scientists who cannot use any of it. You will find researchers running analyses through someone else's GUI, not knowing what is happening under the hood, unable to modify parameters or interpret edge cases. You will find postdocs maintaining analysis scripts that they inherited and do not fully understand. You will find tools that are cited thousands of times and have not been updated in six years.

The fundamental unsolved problem of scientific software is not technical. It is the gap between the people who build the tools and the people who need them. Every era of scientific software has made that gap smaller. None has closed it.

The Reproducibility Problem Is Still Not Solved

Containers and workflow managers have made reproducibility much easier but they have not made it the default. Most published analyses still cannot be reproduced without significant effort. The cultural norm of sharing code and data has improved, but the norm of sharing code and data in a form that actually runs remains aspirational in large parts of biology.

Scientific Software Has a Maintenance Crisis

The incentive structure of academic science systematically undervalues software maintenance. A researcher who spends six months writing a well-documented, well-tested bioinformatics tool produces one line on their CV. A researcher who spends the same six months writing two papers produces two lines. The result is a landscape littered with tools that work brilliantly for the dataset they were tested on and quietly produce wrong answers on everything else.


What AI Actually Changes

The conversation about AI in science often swings between two poles. On one end: breathless claims that AI will automate discovery and replace researchers. On the other: dismissive arguments that AI is just statistics and will not change anything fundamental. Both are wrong, and the truth is more interesting.

AI does not change what scientific software needs to do. It changes what scientific software can do and it changes the architecture required to do it.

From Rule-Based to Learning-Based Pipelines

The dominant paradigm in bioinformatics for the past two decades has been rule-based: we define the algorithm, encode our biological knowledge into explicit computational steps, and apply it to data. This works well for well-understood problems. It breaks down at the edges of knowledge, which is exactly where the interesting biology lives (I think :)).

AI-native pipelines flip this relationship. Instead of encoding what we know into rules, we train models on data and let them learn the patterns we have not been able to articulate explicitly. AlphaFold2 is the canonical example: not because it solved protein folding using our understanding of protein folding, but because it learned a representation of the problem that no human had written down. The biology was in the data. The model found it.

Foundation Models as Infrastructure

The most important architectural shift in scientific software right now is the emergence of biological foundation models like ESM-2, Nucleotide Transformer, scGPT, and their successors. These models are pre-trained on biological sequence or omics data at enormous scale, and they learn representations that transfer across tasks.

This changes the architecture of scientific software in a fundamental way. Previously, each new analysis required building a new tool, encoding new domain knowledge, validating new heuristics. With foundation models, an enormous amount of that work is done once, at scale, by the pre-training process. The researcher's job shifts from building representations to fine-tuning them, from constructing the map to navigating it.

This is not a small change. It is a structural reorganisation of where scientific knowledge lives and how it is accessed.
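To make that reorganisation concrete, here is a deliberately toy sketch of the pattern: a frozen encoder produces embeddings, and the researcher fits only a small task head on top. Everything below is invented for illustration; the random projection merely stands in for a real pre-trained model such as ESM-2, and a real workflow would fine-tune or probe actual learned representations.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 16

# Stand-in for a pre-trained foundation model: a fixed random projection,
# one row per letter A..Z. In practice this would be something like ESM-2,
# trained once at scale; the point is only that its weights are FROZEN
# from the downstream researcher's perspective.
W_FROZEN = rng.standard_normal((26, EMBED_DIM))

def frozen_encoder(sequences):
    """Map each sequence to a fixed-length embedding (mean-pooled over residues)."""
    embs = []
    for seq in sequences:
        idx = [ord(c) - ord("A") for c in seq]
        embs.append(W_FROZEN[idx].mean(axis=0))
    return np.stack(embs)

# The researcher's remaining job: fit a small task head on labeled data.
# A single least-squares layer stands in for fine-tuning or linear probing.
def fit_head(embeddings, labels):
    head, *_ = np.linalg.lstsq(embeddings, labels, rcond=None)
    return head

seqs = ["MKV", "GGS", "MKL", "GGA"]   # made-up protein fragments
y = np.array([1.0, 0.0, 1.0, 0.0])    # made-up property labels
X = frozen_encoder(seqs)
head = fit_head(X, y)
print(np.round(X @ head, 2))
```

The division of labour is the point: the expensive representation (here, `W_FROZEN`) is built once and shared, while the cheap, task-specific head is all the downstream user trains.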

The Closing of the Usability Gap

The most underappreciated consequence of AI in scientific software may be usability. Large language models can generate code, explain parameters, interpret outputs, and translate between the language of biology and the language of computation in ways that were not possible three years ago.

The wet-lab scientist who could not run a bioinformatics pipeline in 2022 can increasingly do so in 2026. Not because the pipeline got simpler, but because there is now an intelligent layer between the pipeline and the user. This is not a replacement for computational expertise. It is an amplifier of biological expertise. The researcher who knows what question to ask gains enormously from tools that help them ask it computationally.

The next frontier in scientific software is not a more powerful algorithm. It is a more intelligent interface, one that meets the scientist where they are, not where the developer assumed they would be.


What This Means in Practice: My Own Experience

I put in extra effort to incorporate software engineering knowledge into my kinetic modeling tool because I realised that the common way of writing programs, as long, monolithic, hard-coded FinalAnalysis.py scripts or never-ending Python notebooks, was a bottleneck. They never went beyond the computer of the researcher who wrote them, and in many cases could not be used by anyone else, whether because of package incompatibilities or just minor differences in input data. And that is before considering reproducibility or execution speed.
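The difference can be sketched with a toy logistic growth model (the function, parameter names, and defaults below are illustrative, not my tool's actual code): instead of hard-coding file paths and constants into a FinalAnalysis.py, every assumption becomes an explicit, documented parameter.

```python
import argparse
import math

# The monolithic version would hard-code everything:
#   data = open("my_plate_reader_export.csv"); n0 = 0.05; r = 0.5; ...
# The reusable version makes each assumption a named, documented parameter.

def logistic_growth(t, n0, k, r):
    """Logistic growth: density at time t given initial density n0,
    carrying capacity k, and growth rate r (a standard kinetic form)."""
    return k / (1 + ((k - n0) / n0) * math.exp(-r * t))

def main(argv=None):
    parser = argparse.ArgumentParser(description="Toy growth-curve evaluator")
    parser.add_argument("--n0", type=float, default=0.05, help="initial density")
    parser.add_argument("--k", type=float, default=1.0, help="carrying capacity")
    parser.add_argument("--r", type=float, default=0.5, help="growth rate (1/h)")
    parser.add_argument("--t", type=float, required=True, help="time (h)")
    args = parser.parse_args(argv)
    print(f"{logistic_growth(args.t, args.n0, args.k, args.r):.4f}")

if __name__ == "__main__":
    main(["--t", "10"])  # demo invocation; on the command line: tool --t 10
```

Nothing here is sophisticated, but the structure is what lets someone who is not the author run it on their own data without reading the source.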

Building a better tool and a web interface solves one layer of that problem. But the deeper insight was this: the gap between the tool and the user is not just a structural or UX problem. It is also an epistemological one. Subsequent users of the software will not always have the same combination of skills that I have. The software needs to be a translation layer, not just an execution environment.

I believe AI will take us further. The tools I am building now and the transformer model I am developing for protein structure-function prediction are designed with that lesson in mind. Not: how do I build the most powerful model? But: how do I build a system whose outputs a user can effortlessly understand, interpret, question, and act on?

I think that is the right question for this moment in scientific AI. Not capability; we have more capability than we know how to use. Interpretability, interoperability, and accessibility. Those are the hard problems.


Where We Are Going

Five things I expect to be true about scientific software in ten years:

  • Foundation models will be the new packages. Just as Biopython standardised sequence analysis, domain-specific biological foundation models will become the assumed starting point for most new computational biology tools: extended, fine-tuned, and specialised rather than built from scratch.
  • Software and data will be inseparable. The current model of publishing code separately and hoping someone can reproduce it will give way to integrated, executable research objects in which data, models, and analysis are versioned and distributed together.
  • The interface layer will become the competitive advantage. As underlying model capabilities become commoditised, the tools that win will win on usability: not the most accurate model, but the most useful one for the scientist who needs to make a decision on Monday morning.
  • Autonomous agents will run routine pipelines. The parts of bioinformatics workflows that are well-defined and repetitive (QC, alignment, genome annotation, binning, variant calling, basic differential expression) will increasingly be handled by AI agents that execute, monitor, and report without human supervision. I don't think this is a threat to computational biologists. It is a reassignment of their attention toward harder problems.
  • Interpretability will become a first-class research output. As models become more capable, the pressure to understand what they have learned and why will intensify. Scientific AI tools that cannot explain their outputs will not be trusted in regulatory, clinical, or high-stakes research contexts. Interpretability is not a nice-to-have. It is a scientific requirement.

The arc of scientific software bends toward accessibility. Every transition from Fortran to scripting languages, from scripts to pipelines, from local to cloud has brought more scientists into computational practice. AI is the next step in that progression, not the end of it.

The researchers who will shape this transition most are not the ones who know the most about AI in the abstract. They are the ones who sit at the boundary: people who understand both the biology and the computation, who know what questions matter and why, and who can build tools that are not just powerful but actually used.

That, I think, is the most interesting place to be right now.


Blaise Manga Enuh, PhD

Computational and Systems Biology Research Associate at the Great Lakes Bioenergy Research Center. I obtain insights from omics data and build ML models, bioinformatics pipelines, and scientific software tools at the intersection of microbial biology and machine learning.
