AI in Biology: How Evo 2 Is Mapping the Genetic Code of Life
What if we could decode life? What would happen if scientists could read every genetic code? The race to decode life has just seen a major breakthrough: researchers have developed Evo 2, an AI model built to understand the genetic code of life. It is a milestone for AI in biology.
This is not just another AI tool. It is among the most powerful AI models built for biology so far. The system can read the DNA of thousands of species, detect hidden patterns in genes, and even design new genetic sequences. The work was described in a recent article published in the journal Nature.
The model was built by researchers at Arc Institute in partnership with NVIDIA. Scientists from Stanford University, the University of California, Berkeley, and the University of California, San Francisco also took part. The system is meant to help researchers explore the large, complex genetic datasets collected over the past two decades.
Biologists often say DNA is the language of life. Every living species stores its instructions for survival in its DNA. But decoding those instructions is not easy. Researchers can need years to understand what a single mutation might do. That is where Evo 2 comes in.
Learning from the tree of life
Evo 2 was trained on a huge dataset of more than 9.3 trillion nucleotides. These sequences were collected from more than 128,000 genomes, including those of plants, animals, and other organisms.
With such a large dataset, the AI learned patterns of evolution. These patterns help us understand how genes work and how a small change in DNA can affect life.
Patrick Hsu, a co-founder of Arc Institute and a professor at UC Berkeley, said the team's goal is to build machines that can understand biology in a broad way. According to him, the Evo models allow computers to read and write the language of DNA, a capability that can be applied to many kinds of research experiments.
A large model built with serious computing power
Training Evo 2 required huge computing power. The model was trained for several months on the NVIDIA DGX Cloud with support from Amazon Web Services, using more than 2,000 NVIDIA H100 GPUs. Researchers also developed a new AI architecture, StripedHyena 2, that enables the system to analyze extremely long DNA sequences. Evo 2 can process up to one million nucleotides at once, helping it detect links between distant parts of the genome.
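To make the one-million-nucleotide figure concrete, the short Python sketch below splits a longer sequence into overlapping windows that each fit within that limit. It is only an illustration: the article does not describe how inputs to Evo 2 are prepared, and the names `MAX_CONTEXT` and `window_spans`, as well as the overlap value, are assumptions made for this example.

```python
# Illustrative only: not Evo 2's actual preprocessing.
MAX_CONTEXT = 1_000_000  # context length quoted in the article, in nucleotides


def window_spans(seq_len, window=MAX_CONTEXT, overlap=100_000):
    """Return (start, end) spans covering a sequence of length seq_len.

    Consecutive spans share `overlap` positions so that features near a
    window boundary appear whole in at least one window (assumed value).
    """
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    if seq_len <= 0:
        return []
    step = window - overlap
    spans = []
    start = 0
    while True:
        end = min(start + window, seq_len)
        spans.append((start, end))
        if end >= seq_len:
            break
        start += step
    return spans


if __name__ == "__main__":
    # A hypothetical 2.5-million-nucleotide chromosome.
    for start, end in window_spans(2_500_000):
        print(start, end)
```

A sequence shorter than the limit yields a single span, which is the case where the whole sequence can be read in one pass.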
Brian Hie of Stanford University, one of the study’s senior authors, says evolution has left important clues in DNA. By studying millions of sequences, the model learns how biological molecules behave.
Predicting disease mutations
One of Evo 2’s most useful abilities is predicting whether genetic mutations are harmful. In tests involving the cancer-related gene BRCA1, the model achieved more than 90 percent accuracy in identifying mutations that may cause disease.
For scientists studying genetic disorders, this could save significant time and resources. Instead of testing thousands of mutations in the lab, researchers can focus on the most important ones first. The model has also been used to study genetic risk linked to Alzheimer’s disease and to examine genetic variation in domesticated animals.
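One common way to use a sequence model for this kind of triage is to compare how likely the model finds a mutated sequence versus the reference sequence, then look first at the variants that lower the score the most. The Python sketch below illustrates that ranking idea. It does not call Evo 2, and the article does not say this is how Evo 2 was evaluated: `toy_log_likelihood` is a hypothetical stand-in scorer based on simple k-mer frequencies, and the sequences and variants are made up.

```python
import math
from collections import Counter


def kmer_counts(sequences, k=3):
    """Count every k-mer in a list of DNA strings (the toy 'training' step)."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def toy_log_likelihood(seq, counts, k=3):
    """Hypothetical stand-in for a model score: sum of log smoothed k-mer frequencies."""
    total = sum(counts.values())
    vocab = 4 ** k  # number of possible DNA k-mers, used for add-one smoothing
    score = 0.0
    for i in range(len(seq) - k + 1):
        score += math.log((counts[seq[i:i + k]] + 1) / (total + vocab))
    return score


def apply_substitution(seq, pos, alt):
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def rank_variants(reference, variants, counts, k=3):
    """Sort (pos, alt) variants by score change, most negative change first."""
    ref_score = toy_log_likelihood(reference, counts, k)
    scored = []
    for pos, alt in variants:
        mutant = apply_substitution(reference, pos, alt)
        delta = toy_log_likelihood(mutant, counts, k) - ref_score
        scored.append((delta, pos, alt))
    return sorted(scored)


if __name__ == "__main__":
    counts = kmer_counts(["ATGATGATGATG"])          # made-up training data
    reference = "ATGATGATG"                         # made-up reference sequence
    variants = [(0, "A"), (4, "C"), (8, "T")]       # made-up substitutions
    for delta, pos, alt in rank_variants(reference, variants, counts):
        print(f"pos={pos} alt={alt} delta={delta:.3f}")
```

In a real workflow the stand-in scorer would be replaced by a trained model, and the ranked list would only suggest which mutations to test in the lab first, not replace the experiments.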
Designing new genetic tools
Evo 2 can also generate new DNA sequences. Researchers at the Arc Institute have already used it to design synthetic bacteriophages, viruses that infect bacteria and could potentially help fight antibiotic-resistant infections.
The technology may also support the development of gene therapy. Hani Goodarzi from the University of California, San Francisco, says scientists could design genetic switches that activate therapies only in specific cells, such as neurons or liver cells, thereby reducing unwanted side effects.
Open access and safety
The Evo 2 team has released the model as open source, allowing scientists worldwide to access its code, data, and model weights. Dave Burke of Arc Institute says the model can act like the core of an operating system that researchers can build upon.
To address safety concerns, the team excluded pathogens that infect humans and other complex organisms from the training data. The system is also designed to avoid providing useful information about these pathogens. Responsible development efforts involved experts, including Tina Hernandez-Boussard of Stanford.
A growing role for AI in biology
Evo 2 highlights how quickly AI in biology is advancing. With access to large genetic datasets and powerful computing, scientists can now explore the genetic code in new ways. For researchers studying disease, developing medicines, or designing biological tools, models like Evo 2 may soon become essential partners in the lab.


