Modern genomics employs a diverse toolkit to untangle the complex relationships among genes, environment, and phenotype. Two recent studies, focused on different organisms and traits, demonstrate the complementary nature of modern research strategies. One study represents a bottom-up functional genomics approach, where a specific gene is identified and its mechanism is validated through direct genetic manipulation in soybeans (Wu et al., 2025). The other is a top-down epigenome-wide association study (EWAS), which searches for statistical correlations between epigenetic patterns, environmental factors, and a complex human disease (Lee et al., 2025). Together, these papers illustrate the contrast between mechanistic and associative research in the broader field of genomics.

The two studies used different types of data and analytical tools. Wu et al. (2025) integrated multiple data types to build a case for the role of the GmERF205 gene in soybeans. The researchers began with functional genomic data, using RNA-sequencing to identify genes highly expressed during drought. This was followed by biochemical evidence, where they measured the activity of antioxidant enzymes to understand the cellular effects of the gene’s overexpression. The core of their work involved direct genetic manipulation, using CRISPR/Cas9 gene-editing and Agrobacterium-mediated transformation to create transgenic plants.
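To make the idea of picking out drought-responsive genes from expression data concrete, here is a minimal sketch in Python. It ranks genes by log2 fold change between a drought and a control condition. The gene names, the expression values, and the pseudocount of 1 are all invented for illustration; this is not the pipeline used by Wu et al. (2025), and a real analysis would work on replicated, normalized counts with a proper statistical test.

```python
import math

# Hypothetical normalized expression values; names and numbers are made up.
control = {"gene_A": 4.0, "gene_B": 10.0, "gene_C": 7.0, "gene_D": 0.0}
drought = {"gene_A": 63.0, "gene_B": 21.0, "gene_C": 7.0, "gene_D": 3.0}


def log2_fold_change(treated, baseline, pseudocount=1.0):
    """Return log2((treated + pc) / (baseline + pc)).

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detected expression in the baseline.
    """
    return math.log2((treated + pseudocount) / (baseline + pseudocount))


def rank_by_induction(treated, baseline):
    """Return (gene, log2FC) pairs sorted from most to least induced."""
    scores = {g: log2_fold_change(treated[g], baseline[g]) for g in baseline}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = rank_by_induction(drought, control)
for gene, lfc in ranked:
    print(f"{gene}\t{lfc:+.2f}")
```

Genes at the top of such a list would be the natural candidates to carry forward into functional experiments.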

The study by Lee et al. (2025) was an observational analysis of a human population. The primary dataset consisted of epigenetic data from the Infinium Methylation 850k array, which measures DNA methylation levels. This was combined with detailed phenotypic data and environmental data from a food frequency questionnaire. Their primary tools were statistical: they used R packages to fit linear models and identify differentially methylated positions associated with obesity and diet.

The experimental designs reflected the different goals of each study. The soybean study did not require considerations like allelic diversity or population stratification because it was not a population study (Wu et al., 2025). Instead of sampling a diverse population, the researchers used a single soybean variety and created genetically modified versions. This controlled genetic background allowed them to isolate the specific effect of the GmERF205 gene. Their phenotyping was experimental and highly detailed, involving the measurement of plant growth and physiological responses under controlled drought conditions. Conversely, the human study’s design was centered on population-level analysis (Lee et al., 2025). Sample size was a critical parameter, and the researchers utilized a large cohort of 1,526 individuals. Because their study involved humans, they had to address the potential for population stratification, which they managed by using a relatively homogeneous Korean cohort and by statistically adjusting for potential confounding variables, including estimated blood cell-type proportions.
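A small example may help show why adjusting for cell-type proportions matters. The sketch below uses entirely synthetic numbers in which methylation at one site depends only on a cell-type fraction, yet BMI happens to correlate with that fraction. The unadjusted slope looks like a BMI effect; after removing the covariate from both variables (the residualization trick, which gives the same coefficient as a two-predictor regression), it vanishes. The variable names and values are assumptions for illustration, not data or code from Lee et al. (2025), who worked in R with many covariates.

```python
def mean(xs):
    return sum(xs) / len(xs)


def ols(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(x, y):
    """Part of y that is not explained by a straight-line fit on x."""
    slope, intercept = ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def adjusted_slope(exposure, outcome, covariate):
    """Slope of outcome on exposure after regressing the covariate out of both."""
    return ols(residuals(covariate, exposure), residuals(covariate, outcome))[0]


# Synthetic cohort: BMI correlates with a hypothetical cell-type fraction.
bmi = [20.0, 22.0, 24.0, 26.0, 28.0, 30.0]
cell_fraction = [0.50, 0.58, 0.55, 0.66, 0.62, 0.70]
# Methylation is driven by the cell fraction alone, with no direct BMI effect.
methylation = [0.2 + 0.5 * f for f in cell_fraction]

crude = ols(bmi, methylation)[0]
adjusted = adjusted_slope(bmi, methylation, cell_fraction)
print(f"crude slope:    {crude:.5f}")
print(f"adjusted slope: {adjusted:.5f}")
```

In this toy setting the crude association is purely a confounding artifact, which is the kind of false signal covariate adjustment is meant to guard against.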

Both research teams faced significant challenges. For the soybean study, a primary challenge was identifying a single, impactful gene from the large ERF transcription factor family. The authors addressed this by using RNA-sequencing data to prioritize candidates that were most responsive to drought stress, effectively narrowing the field (Wu et al., 2025). The human study explicitly detailed its limitations. A major challenge was the temporal discrepancy between the dietary data and the methylation data, which were collected four years apart. The authors acknowledged that this prevents definitive causal claims and proposed future longitudinal studies with concurrent data collection. Another key challenge was correcting for the confounding effect of different blood cell types in their samples, which they addressed using a reference-based statistical deconvolution algorithm (Lee et al., 2025).
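The intuition behind reference-based deconvolution can be sketched with a deliberately simplified two-cell-type case. If a bulk blood sample is modeled as a mixture of two reference methylation profiles, the mixing fraction can be estimated by least squares and clipped to the range 0 to 1. The reference profiles and the true fraction below are invented, and the function name is hypothetical; real algorithms handle several cell types at once with constrained optimization, so this should be read as an illustration of the principle rather than the method Lee et al. (2025) applied.

```python
def estimate_fraction(mixture, ref_a, ref_b):
    """Estimate p in the model mixture = p * ref_a + (1 - p) * ref_b.

    Uses the closed-form least-squares solution and clips it to [0, 1],
    since a cell-type proportion cannot fall outside that range.
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(mixture, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference profiles are identical; fraction is not identifiable")
    return min(1.0, max(0.0, num / den))


# Hypothetical reference methylation levels at four CpG sites for two cell types.
ref_type_a = [0.90, 0.10, 0.80, 0.20]
ref_type_b = [0.10, 0.70, 0.30, 0.60]

# Simulate a bulk sample containing 25% of cell type A (noise-free for clarity).
true_fraction = 0.25
bulk = [true_fraction * a + (1 - true_fraction) * b
        for a, b in zip(ref_type_a, ref_type_b)]

estimate = estimate_fraction(bulk, ref_type_a, ref_type_b)
print(f"estimated fraction of type A: {estimate:.3f}")
```

Estimated fractions like this one are what then enter the association model as covariates.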

Despite their different approaches, both studies have enriched the field of population genomics. The soybean study provides a powerful example of functional validation. While population genomics can identify a genetic region associated with a trait, it cannot prove which gene in that region is responsible. Functional studies provide that crucial mechanistic link, demonstrating a specific gene’s causal role and providing a validated target that can now be screened for in diverse populations (Wu et al., 2025). The EWAS of the Korean cohort pushes the boundaries of population genomics into the realm of epigenomics. It demonstrates that different aspects of a complex disease have distinct epigenetic signatures tied to environmental factors like diet. This highlights the importance of studying gene-environment interactions and moves the field beyond the static DNA sequence to understand the dynamic regulatory layers that connect lifestyle to health outcomes (Lee et al., 2025).

These two distinct approaches are complementary. Association studies like the EWAS generate hypotheses about which pathways are important, while functional studies like the soybean research provide the definitive proof of a gene’s role, ultimately creating a more complete picture of how complex traits are controlled.

References

Lee, J., Choi, HK., Park, SH. et al. Epigenome-wide association study of BMI and waist-to-hip ratio and their associations with dietary patterns in Korean adults. Sci Rep 15, 28681 (2025). https://doi.org/10.1038/s41598-025-13868-6

Wu, N., Feng, Y., Jiang, T. et al. Genome-wide study and expression analysis of soybean ERF transcription factors and overexpression of GmERF205 enhances drought resistance in soybean. BMC Genomics 26, 726 (2025). https://doi.org/10.1186/s12864-025-11829-x
