From Describing Experiments to Running Them
AI agents in drug discovery are software systems that pair large language models with specialized scientific tools, translating human instructions into complete workflows that design molecules, screen compounds and analyze results. The shift now underway is from theoretical suggestions to practical actions, with experimental validation, in real pharmaceutical research environments: agents are no longer just intellectual sparring partners; they are starting to behave like junior scientists. Rather than only drafting protocols and hypothesizing about molecular interactions, they install tools, authenticate APIs and run autonomous drug-screening jobs end to end. That raises a blunt question for the industry: when AI agents can carry out real work, not just comment on it, how much of early-stage discovery should they own? The latest launches from NVIDIA and Boltz argue that a sizable slice is ready to move.

BioNeMo Agent Toolkit: Turning Frontier Models Into Working Tools
NVIDIA’s BioNeMo Agent Toolkit is the clearest signal that AI-driven pharmaceutical research is entering an operational phase. Instead of building yet another agent, NVIDIA is shipping the scientific toolbox: validated skills for protein-structure prediction, molecular docking, generative chemistry and genomic analysis that any agent can call on its own. The toolkit debuts with nearly 50 partners, including Eli Lilly, Thermo Fisher Scientific and Dassault Systèmes, so this is not a toy ecosystem; it is being plugged directly into high-stakes pipelines where failed bets cost billions. The company is explicit that general-purpose models, without help, cannot break a request like “design me a binder” into the five to seven domain-specific tasks it requires. BioNeMo’s harness-agnostic design means developers on any platform can access the same accelerated, governed tools, a practical step toward making agent-driven drug discovery reproducible rather than experimental theatre.
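
The idea of a harness-agnostic toolbox can be illustrated with a minimal Python sketch. It is not the BioNeMo Agent Toolkit API: the `Skill` and `SkillRegistry` classes, the three skill names and their stub outputs are all hypothetical, chosen only to show how a request might be split into documented steps that any agent harness could call in sequence. The article does not list the actual five to seven tasks, so the three steps here are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Skill:
    """A documented tool: declared inputs and outputs plus a callable (hypothetical shape)."""
    name: str
    inputs: List[str]
    outputs: List[str]
    run: Callable[[dict], dict]


class SkillRegistry:
    """Holds skills by name so any agent harness can look them up and chain them."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def run_plan(self, plan: List[str], state: dict) -> dict:
        """Run skills in order, refusing to proceed when a declared input is missing."""
        for name in plan:
            skill = self._skills[name]
            missing = [key for key in skill.inputs if key not in state]
            if missing:
                raise ValueError(f"{name} is missing inputs: {missing}")
            state = {**state, **skill.run(state)}
        return state


# Stub implementations: they return placeholder strings, not scientific results.
def predict_structure(state: dict) -> dict:
    return {"structure": f"structure-of-{state['target']}"}


def generate_candidates(state: dict) -> dict:
    return {"candidates": [f"candidate-{i}" for i in range(3)]}


def dock_candidates(state: dict) -> dict:
    return {"ranked": sorted(state["candidates"])}


registry = SkillRegistry()
registry.register(Skill("predict_structure", ["target"], ["structure"], predict_structure))
registry.register(Skill("generate_candidates", ["structure"], ["candidates"], generate_candidates))
registry.register(Skill("dock_candidates", ["structure", "candidates"], ["ranked"], dock_candidates))

result = registry.run_plan(
    ["predict_structure", "generate_candidates", "dock_candidates"],
    {"target": "EGFR"},
)
print(result["ranked"])
```

Declaring inputs and outputs on each skill is what lets the registry stop early with a clear error instead of letting an agent improvise around a missing value.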

Boltz’s Agent-First API: Drug Screens by Conversation
If BioNeMo is the toolbox, Boltz’s drug-discovery API is the conversational workbench. BoltzProt-1 for protein design and BoltzMol-1 for small-molecule hit discovery are exposed through an API built “for agents as much as for people,” in CEO Gabriele Corso’s words. Boltz’s own scientists have already been reaching these models through coding agents like Claude Code, Codex and Gemini, making agents the primary interface for serious work rather than a novelty. In one test, an AI coding agent was asked in plain English to run a small hit-discovery screen against EGFR kinase using commercially available compounds; it installed and authenticated the command-line tool, refused to invent missing data, estimated the cost, and returned ranked structures in about 3.5 minutes. The run applied the recommended medicinal-chemistry filters, plus Lipinski and PAINS, at an estimated USD 0.025 (approx. RM0.12) per molecule, or USD 0.20 (approx. RM0.92) for eight compounds. The final bill was only USD 0.10 (approx. RM0.46), because Boltz charges only for scored molecules.
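
The arithmetic and one of the filters in that run can be sketched in a few lines of Python. This is an illustration, not Boltz's implementation: the `Descriptors` class and function names are hypothetical, the descriptor values are assumed to be precomputed elsewhere, and the Lipinski check uses the commonly cited rule-of-five thresholds with at most one violation allowed. The article does not say why only some molecules were scored, so the count of four below is simply inferred from the USD 0.10 bill at USD 0.025 per molecule.

```python
from dataclasses import dataclass
from decimal import Decimal

# Per-molecule price quoted in the article.
PRICE_PER_MOLECULE_USD = Decimal("0.025")


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one compound (hypothetical input shape)."""
    mol_weight: float        # daltons
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many rule-of-five thresholds the compound exceeds."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_lipinski(d: Descriptors, max_violations: int = 1) -> bool:
    """Common convention: tolerate at most one violation."""
    return lipinski_violations(d) <= max_violations


def screen_cost(n_molecules: int) -> Decimal:
    """Cost in USD for a given number of molecules at the quoted price."""
    return PRICE_PER_MOLECULE_USD * n_molecules


estimate = screen_cost(8)   # all eight submitted compounds
billed = screen_cost(4)     # only scored molecules are charged (count inferred)

print(f"estimate: USD {estimate:.2f}, billed: USD {billed:.2f}")
```

Using `Decimal` rather than floats keeps small currency amounts exact, which matters when per-molecule prices are fractions of a cent.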

From Simulations to Experimental Validation
The real breakthrough is not just that AI agents can talk to scientific tools; it is that their outputs are being checked against the wet lab. BoltzMol-1 has experimental validation across 10 targets spanning GPCRs, kinases, ion channels and protein–protein interactions, with confirmed binders highlighted at the predicted sites. On the BioNeMo side, NVIDIA is blunt about how to convince skeptics: give agents “validated domain-specific scientific tools that researchers already use,” fully documented with inputs, outputs and troubleshooting so the agent can call the right skill and understand its limits. This is a move away from pure simulation culture toward AI workflows that can be reproduced and falsified, the core currency of science. A near-autonomous AI chemist that paired GPT-5.4 with Molecule.one’s Maria agent ran more than 10,000 reactions, which human chemists then replicated at the bench; it shows that this pattern is emerging across AI-driven pharmaceutical research, not just in one vendor stack.

Trust, Adoption and the Coming Agent Workday
Industry behavior suggests quiet confidence in agent-driven workflows even as surveys still record skepticism. A recent poll found only 1% of life-sciences leaders saw AI delivering value in the wet lab, yet 45% of scientists admitted using public generative-AI tools via personal accounts, a shadow-AI habit that reveals demand outrunning official policy. BioNeMo’s nearly 50 partners, and the fact that Boltz’s agent plugins are becoming the main way its chemists and protein engineers run models, show where that demand is heading: into governed, domain-specific agent stacks rather than improvised side channels. One portfolio leader describes the future as “little experts” embedded into lab notebooks, taking routine tasks while humans focus on interpretation. The question is not whether agent-driven drug discovery workflows will exist, but how quickly teams will treat agents as standard colleagues. Given the experimental validation emerging around these toolkits, pretending agents are only for literature searches is starting to look like denial rather than caution.