You’ve probably heard over and over that the mitochondrion is the powerhouse of the cell. Similarly, proteins can be thought of as the powerhouses of life, playing a diverse range of roles that allow your body to function. Proteins are macromolecules present in all of your cells that are involved in critical processes such as tissue repair and growth, metabolic reactions (like digestion), and hormonal signaling. The study of the structure and function of proteins and how they interact with each other is known as proteomics. In this month’s edition, we feature cutting-edge research in proteomics, from exciting new discoveries to innovative technology.
If you would like further updates on the field of genetics research, please consider subscribing to Behind the Genes and sharing it with others using the button below! By doing so, you will receive a new edition every month. We would truly appreciate it if you could spread the word about us as well: our impact grows with each additional reader.
Research Map
Explore
A glimpse into cutting-edge proteomics research

MS‐based proteomics to examine skeletal muscle adaptations to exercise
It’s no surprise that exercise causes physiological changes in your body, like strengthening your muscles and bones. These changes result from alterations to tissue-specific proteins that occur in response to environmental stimuli. Skeletal muscle is by far the most responsive tissue – to both acute and long-term exercise – as it undergoes changes to improve the body’s response to fatigue and accelerate energy production. To trace the effects of acute exercise and chronic training on skeletal muscle, researchers can use mass spectrometry (MS)-based proteomics, a method that can provide information about structural changes to proteins. Each protein produces a characteristic signature based on the masses of its fragments, which can be used to identify it and track its activity. In this review article, the authors discuss how advances in MS-based proteomics are providing a more comprehensive understanding of the physiological effects of exercise on skeletal muscle. [Cervone et al., 2023] 🇩🇰
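To make the idea of a mass-based signature concrete, here is a minimal sketch of how the mass of a protein fragment (a peptide) can be computed from its sequence, along with the mass-to-charge ratio (m/z) that a mass spectrometer measures. The residue masses are standard monoisotopic values; the function names and the example peptide are our own illustrations and do not come from the review.

```python
# Monoisotopic masses (in daltons) of the 20 standard amino acid residues.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water is added for the free ends of the chain
PROTON = 1.007276  # mass of each proton that gives the peptide its charge


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """m/z of the peptide after picking up `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "PEPTIDE"  # illustrative sequence, not from the review
    print(f"{peptide}: mass = {peptide_mass(peptide):.4f} Da")
    for z in (1, 2):
        print(f"  charge {z}+: m/z = {mass_to_charge(peptide, z):.4f}")
```

Note that leucine (L) and isoleucine (I) have identical masses, which is one reason real workflows rely on patterns across many fragments rather than on a single mass value.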
Decrypting drug actions and protein modifications by dose- and time-resolved proteomics
Cancer drugs often target proteins and small molecules that are unusually active and likely to be involved in tumor development, but why do these drugs actually work? The answer is not well understood, especially since a drug designed to target one molecule may work only because it simultaneously interferes with an unknown pathway. This knowledge gap inspired the development of DecryptM, a proteomic technology that captures the “fingerprints” of drug response (changes in proteins) to create mini profiles of drugs and how they engage with cancer cells over time. Since its development, DecryptM has been used to profile 31 cancer drugs in 13 cancer cell line models and has generated 1.8 million dose-response graphs (which show how cancer cells respond to drugs at different doses). This open-source data is publicly accessible and can serve as a foundation for future research in drug discovery. [Zecha et al., 2023] 🇩🇪
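For readers curious about what a dose-response graph looks like numerically, below is a small sketch using the four-parameter logistic curve, a common general-purpose model for this kind of data. We are assuming this model purely for illustration; the parameter names and example values are hypothetical and are not taken from the paper.

```python
def dose_response(dose, bottom, top, ec50, hill):
    """Four-parameter logistic curve for a drug that lowers a signal.

    bottom: response at very high dose
    top:    response with no drug
    ec50:   dose at which the response is halfway between top and bottom
    hill:   steepness of the curve (must be positive)
    """
    return bottom + (top - bottom) / (1.0 + (dose / ec50) ** hill)


if __name__ == "__main__":
    # Hypothetical parameters for one protein modification site.
    params = dict(bottom=0.2, top=1.0, ec50=100.0, hill=1.5)
    for dose in (1, 10, 100, 1000, 10000):  # arbitrary units, e.g. nM
        print(f"dose {dose:>6}: response = {dose_response(dose, **params):.3f}")
```

The dose at the midpoint of the curve (the EC50) is a handy single-number summary of how potently a drug affects a given protein.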
Transmembrane coupling of liquid-like protein condensates
If you thought biology was bizarre before, it’s about time you heard about transmembrane coupling. Proteins may randomly arrange themselves into liquid-like clusters called condensates, a property that is important for biological processes like structuring cellular membranes. For example, these proteins may organize the lipids that make up the membrane. However, until recently, we haven’t been able to study how condensates on opposite sides of a membrane interact with each other. In this paper, scientists used a free-standing membrane model to study interactions between condensates and found that liquid-like protein condensates on either side coupled with each other into stable arrangements. This suggests that these condensates are involved in signaling across the membrane to arrange biological structures within the membrane. [Lee et al., 2023] 🇺🇸🇰🇷
Innovate
Soaring to the next level with novel tools

The promises of large language models for protein design and modeling
We can treat protein sequences like sentences made from an alphabet of 20 amino acids, with “words” or sections that have certain meanings or functions. Large language models (LLMs) are a type of unsupervised machine learning model that learns semantic relationships between words, or in this case, between amino acids or strings of amino acids. They are a promising way to predict the function of proteins and even design novel proteins. However, applying LLMs to protein sequences is not as straightforward as using them to model sentences. For starters, how can protein sequences be divided into “words” that are known to have some function? Even if we do identify certain recurring segments or important sequences, how do we know that the model will pick up on their correct function? In this paper, researchers investigate how the possibilities of protein modeling are being redefined by the advancement of large language models. [Valentini et al., 2023] 🇮🇹🇺🇸
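The “words” question can be made concrete with a tiny sketch of two simple ways to split a protein sequence into tokens: one token per amino acid, or overlapping fixed-length chunks (k-mers). Both are generic illustrations that we are assuming for the sake of example; the function names and the short sequence are hypothetical and not taken from the paper.

```python
def residue_tokens(sequence):
    """One token per amino acid: the simplest possible 'word'."""
    return list(sequence)


def kmer_tokens(sequence, k):
    """Overlapping chunks of k amino acids, sliding one residue at a time."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    seq = "MKTAYIAK"  # illustrative fragment, not a real annotated protein
    print(residue_tokens(seq))
    print(kmer_tokens(seq, 3))
```

Neither scheme guarantees that a token corresponds to a functional unit, which is exactly the difficulty the paragraph above describes.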
Large language models improve annotation of prokaryotic viral proteins
In metagenomic databases, genomic sequences are annotated to identify important features within the sequence and their corresponding functions – this is an important step to allow researchers to make use of the data. However, annotations of viral genomes in metagenomic databases are not always up to standards, which presents an obstacle to our understanding of virus function. Protein language models are computational tools that can assign labels to portions of viral sequences to better capture prokaryotic viral protein function. This serves as a method to better detect viral proteins and extract meaning from metagenomic samples of viruses. [Flamholz et al., 2024] 🇺🇸
Tpgen: a language model for stable protein design with a specific topology structure
Naturally existing proteins are only a fraction of all the proteins that could exist if we generated sequences artificially. Unfortunately, we cannot just randomly string letters of amino acids together to create a protein sequence, as not all sequences will create functional and topologically stable proteins. So, how can we generate sequences that we know will fold into proteins and have some sort of function? TopoProGenerator is a generative neural network that can design topologically stable proteins, as it is trained on shorter protein sequences (maximum of 65 amino acids long) that tend to be more stable. The model uses reinforcement learning, which is a trial-and-error method of improving performance, and adversarial learning, which allows the model to detect deceptive or incorrect inputs. TopoProGenerator adds amino acids to the sequence based on the previous amino acid until it reaches a signal to terminate or the maximum length. [Min et al., 2024] 🇨🇳
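The one-amino-acid-at-a-time generation loop described above can be sketched in a few lines. This is only a toy: `next_token_weights` is a hypothetical stand-in for a trained model and simply returns made-up weights based on how long the sequence already is, so the output sequences are random and not expected to fold. Only the loop structure (append a residue, stop on a termination signal or at 65 residues) mirrors the description.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STOP = "*"        # assumed symbol for the "terminate" signal
MAX_LENGTH = 65   # maximum sequence length mentioned in the summary


def next_token_weights(prefix):
    """Hypothetical stand-in for a trained model's next-residue scores.

    A real model would score each amino acid given the residues so far;
    here every residue is equally likely and the stop signal becomes
    more likely as the sequence grows.
    """
    weights = {aa: 1.0 for aa in AMINO_ACIDS}
    weights[STOP] = 0.5 + 0.1 * len(prefix)
    return weights


def generate(rng, max_length=MAX_LENGTH):
    """Build a sequence one residue at a time until STOP or max_length."""
    sequence = ""
    while len(sequence) < max_length:
        weights = next_token_weights(sequence)
        tokens = list(weights)
        token = rng.choices(tokens, weights=[weights[t] for t in tokens])[0]
        if token == STOP:
            break
        sequence += token
    return sequence


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(generate(rng))
```

In a real system, the made-up weights would be replaced by the network’s predictions, and training signals such as reinforcement learning would push those predictions toward stable designs.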
Computational drug development for membrane protein targets
Sometimes, we may need to target proteins that are embedded in the cell membrane to inhibit processes like signaling and intracellular communication. Deep learning models can be used to develop drugs that target these proteins, as they can predict protein structures based on their functions, which in turn allows us to target these structures. However, although these models can predict membrane protein structures with incredible resolution and speed, they do not perform as well at developing drugs that target these proteins. This challenge persists because of a gap in our knowledge of dynamic transmembrane signaling networks, which control the way that cells communicate with each other. Overcoming this barrier will require a blend of experimental and computational approaches. We can use microscopes to quantify how molecules interact with each other in the membrane, which gives insight into how these molecules will modulate the effectiveness of drugs. We can use cryo-electron microscopy to determine structural changes of proteins. Lastly, computational tools will come in handy to analyze data, predict the structure and function of cell signaling networks, and generate potential drugs. [Li et al., 2024] 🇨🇳🇨🇭
Digest
Review articles that showcase broader concepts

Sparks of function by de novo protein design
Proteins start off as a chain of amino acids that fold into a 3D structure and have a particular function. The goal of protein design in the lab is to reverse this: start with a desired function, find a corresponding structure, and determine which sequence folds into that structure. You can imagine the major implications of designing proteins with any function we want — we could create antibodies to boost our immune system, or make enzymes that ensure certain biochemical processes are happening in your body, and so on. To actually create functional proteins, however, is a complex task that now involves the use of deep learning techniques. In this review, researchers discuss advances in classical de novo protein design due to the integration of deep learning and anticipate future challenges in this area, such as the flexibility of protein design. [Chu et al., 2024] 🇺🇸
For thousands of years, dairy products have been treasured for their delicious taste and nutritious properties, but now, people are turning to plant-based alternatives. Why now? We have seen increased awareness of sustainable food practices in the food industry (plus an increase in dairy allergies). Creating plant-based foods to simulate foods that were previously made from dairy is no easy task – dairy and plant proteins have very different structural properties, textures, and gelation mechanisms, which dictate how a system of these proteins can be converted to a gel. This article reviews and compares the structures of plant and dairy proteins to determine which plant proteins should be used to replace dairy proteins, depending on their gelation properties and ability to simulate the desired taste. 🇮🇪
Metamorphic proteins and how to find them
When you hear metamorphosis, you might think of a caterpillar maturing into a butterfly. As it turns out, some classes of proteins also possess this ability to jump between different states, making them “metamorphic.” Proteins are essentially folded chains of amino acids, and they can simply unfold and refold to assume a different 3D structure. Metamorphic proteins can switch between the two native states that are encoded by their amino acid sequence, giving them a different function in each state. However, this conversion process is not very well understood. In this review, the authors discuss advances in identifying metamorphic proteins, particularly the prospects and challenges of using artificial intelligence to predict their structures. [Porter et al., 2024] 🇺🇸🇨🇱
Research Community
This month’s edition featured research from 9 countries!
Spread the Word!
Thank you so much for reading this month’s edition of Behind the Genes! We would appreciate it if you could share this newsletter with anyone you know to encourage more research in this field.