It’s early in the morning, 7 a.m. to be exact, but Yi Liu is wide awake. We’ve scheduled an interview with the senior AI engineer to talk about his work in life science and his experience with machine learning.
And Yi does have some remarkable experience in the field. He graduated with a Ph.D. in biomedical informatics from Stanford and is the recipient of a Stanford Graduate Fellowship. In his thesis, he used statistics and machine learning methods to “model various aspects of B-cell lineages and repertoires in humans.” He then went to Vicarious AI, where he worked on computer vision for three years.
Applying AI to understand our genome
Yi recently joined Calico, a life science company known for its goal of combating the effects of aging. With the help of genetic research and artificial intelligence, Calico wants to improve our understanding of the human body and develop new ways to fight age-related diseases.
Decoding the human genome has been a huge step in this direction. Now that we know the what, researchers have begun investigating the how. Which gene or combination of genes is responsible for what? Could deactivating certain genes help to fight diseases like cancer?
CRISPR to the rescue
A recent addition to the toolbox of genetic engineers is CRISPR/Cas9. Derived from a DNA sequence that bacteria use to defend themselves against viruses, CRISPR/Cas9 has been developed into a genetic engineering scalpel. It can cut or modify targeted DNA sequences.
“CRISPR is getting a lot of press these days and rightfully so,” Yi explained. “It’s proven itself to be a reliable and effective method for performing large-scale knock-out and knock-in experiments.” In these experiments, scientists use CRISPR/Cas9 to disable genes (knock-out) or insert new sequences (knock-in) and then study the effects.
Genetic experiments generate large amounts of data — data that AI can help analyze. Tools like CRISPR/Cas9 raise “new questions related to experimental design and data interpretation,” Yi told us. “Machine learning and AI play a huge role in the resolution and investigation of these questions.”
Yi thinks we’re just at the beginning of genomic research. Scientists are exploring the potential of CRISPR/Cas9 beyond its original scope. In addition to gene knock-out experiments, it can also be used to pursue more fine-grained experiments.
“There are variants of the experimental setup where you’re not just deleting a gene. Instead, you’re modulating the expression and effectiveness of various genes,” Yi told us. “There are so many possibilities and we don’t know where the limit is.”
Life science and AI: evolving together
For Yi, AI is one of the driving factors in exploring these limits. “As far as machine learning is concerned, I see it as a broader ecosystem of tools.”
These tools include frameworks such as TensorFlow and Torch. Just as CRISPR is growing beyond its original scope, Yi sees modern deep learning frameworks outgrowing their roots and developing use cases outside of their original gradient-based optimization context: “The focus recently has been on getting the learning right, because we have this new tool: deep learning.”
However, this focus on learning has to be balanced with a focus on inference, the application of trained models to real-world problems. Yi sees a bright future for software frameworks: “They will continue to play a big role when the inference part shifts back into focus.”
Capsules, an idea championed by Geoffrey Hinton, are another promising tool in the AI ecosystem. Yi referred us to a new paper called “Matrix capsules with EM routing,” which outlines how capsules could be used in image recognition. A capsule is an isolated part of the network that specializes in detecting a specific pattern in the input.
In image detection, capsules can detect features independent of their orientation in space much better than traditional networks. “The higher level goal is to route the signals into different parts of the network that are the most suitable for processing it,” Yi explained.
Transfer learning could reduce data costs
Another concept that Yi believes should receive more research attention is transfer learning. “Sometimes, you can take advantage of intrinsic similarities between different tasks,” he explained. The idea is to collect knowledge from one problem domain and apply it to another. This kind of knowledge transfer can be especially useful in areas where data acquisition is costly.
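To make the idea concrete, here is a minimal sketch of one simple form of transfer learning: fitting a model on a data-rich source task, then fine-tuning it briefly on a related task with only a handful of samples. Everything in it is an illustrative assumption and not something Yi described: the two toy tasks (straight lines with similar slopes), the helper names `fit_line` and `mse`, and the learning rate and step counts.

```python
def fit_line(xs, ys, w=0.0, b=0.0, lr=0.05, steps=20):
    """Fit y = w*x + b by gradient descent on mean squared error.

    Passing w and b from a previous fit starts from that knowledge
    instead of from zero.
    """
    n = len(xs)
    for _ in range(steps):
        grad_w = 2 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Source task (hypothetical): data is cheap, so we have many samples.
source_x = [i / 10 for i in range(-20, 21)]
source_y = [2.0 * x + 1.0 for x in source_x]

# Target task (hypothetical): related to the source, but only three samples.
target_x = [-1.0, 0.0, 1.0]
target_y = [2.2 * x + 1.5 for x in target_x]

# Held-out target points used only for evaluation.
test_x = [-2.0, -0.5, 0.5, 2.0]
test_y = [2.2 * x + 1.5 for x in test_x]

# Learn the source task thoroughly.
w_src, b_src = fit_line(source_x, source_y, steps=500)

# Same small training budget on the target task, two starting points.
w_ft, b_ft = fit_line(target_x, target_y, w=w_src, b=b_src, steps=5)
w_scratch, b_scratch = fit_line(target_x, target_y, steps=5)

err_transfer = mse(test_x, test_y, w_ft, b_ft)
err_scratch = mse(test_x, test_y, w_scratch, b_scratch)
print("error with transfer:", err_transfer)
print("error from scratch: ", err_scratch)
```

With the same five training steps on the target data, the model that starts from the source-task parameters ends up with a lower held-out error than the one that starts from zero. The benefit depends entirely on the two tasks being similar, which is exactly the condition Yi points to.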
While training recommendation engines is cheap with data from customer behavior analyses, obtaining genetic samples is much more challenging. In some cases, like rare diseases, the availability of human data is severely limited. CRISPR/Cas9 has made genetic engineering much easier than it was a few years ago. But even so, obtaining the large amounts of data necessary to train a deep learning model remains incredibly expensive.
Being able to apply knowledge from other, related problems to the task of DNA analysis could help bring down costs. But Yi thinks that the AI community is still missing important puzzle pieces for effective transfer learning: “The internal representation of the world in machine learning systems is highly structured and tightly coded.”
Since these structures have little overlap among different problems, it remains very difficult to apply knowledge gained from one problem to another. In fact, we might be stuck at an even lower level: “I’m not sure if we even have a language to properly formalize that state of the world and what it means to transfer learning from one task to something that’s very different.”
Soon, AI might do the science for us
Nevertheless, Yi is optimistic about the future of AI and life science. He envisions a future where “an AI system handles the proposal of a scientific project, its execution, and also the analysis of the experiments.”
A system like this would be able to follow scientific standards much more closely than humans typically do, for instance by ensuring that experiments can be reproduced under exactly the same conditions. In a systematic search for genetic variants, a more self-reliant AI could confirm or disprove its own conclusions by designing and executing a suitable experiment.
Yi agrees that this sort of machine autonomy might feel strange at first, but he emphasizes its advantages: “We will see that certain aspects of our jobs that currently consume 5% of our time will get blown up to 80% or 90%, while the rest of the work gets more and more automated.”
Scientists would be spending less time on developing and executing experiments. Instead, they could focus on deriving conclusions from the results. Let machines handle the repetitive detail work, while researchers focus on the big picture.
Life science could bring massive benefits to humanity
As both life science and AI research are moving forward, Yi is proud to be at the core of this process. “In life science, there are many opportunities to bring massive amounts of benefits to humanity,” he said, and you could hear the excitement in his voice.
Although Yi wasn’t able to talk about the specifics of his work at Calico, he conveyed his expertise and excitement about AI and life science to us perfectly. Even at 7 in the morning.