15–17 Sept 2025
Centro Polifunzionale Studenti Università di Bari
Europe/Rome timezone
CSS/ITALY 2025

Archetypal Decomposition of Metagenomic Profiles for Cross-Study Diagnostic Classification

16 Sept 2025, 10:30
20m
Centro Polifunzionale Studenti Università di Bari

Speaker

M. Seppi

Description

Shotgun metagenomics enables the quantitative profiling of microbial communities in biological samples, providing a rich, high-dimensional description of microbiota composition. These microbial profiles are increasingly used to investigate associations with host health. However, the high dimensionality of such data (thousands of microbial or functional features per sample) and the heterogeneity introduced by aggregating data across studies pose significant challenges for analysis and interpretation.
In this work, we explore the use of Archetypal Analysis (AA) as a geometry-aware dimensionality reduction method to extract interpretable low-dimensional structure from metagenomic data, with a focus on inflammatory gastrointestinal conditions. AA approximates each sample as a convex combination of a small number of archetypes, which correspond to extreme points of the data cloud. Unlike principal component analysis (PCA), which captures directions of maximal variance, AA emphasizes the boundary geometry of the dataset, so the directions it identifies can be interpreted as compositional extremes.
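The convex-combination idea can be illustrated with a minimal sketch. It is not the implementation used in this work: the solver (alternating projected gradient steps), the function names `project_to_simplex` and `archetypal_analysis`, and the synthetic Dirichlet data are all assumptions made for illustration. Each sample is approximated as `A[i] @ Z`, where the rows of `A` are non-negative weights summing to one and the archetypes `Z = B @ X` are themselves convex combinations of samples.

```python
import numpy as np


def project_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    n, k = V.shape
    U = np.sort(V, axis=1)[:, ::-1]            # rows sorted in descending order
    css = np.cumsum(U, axis=1) - 1.0
    idx = np.arange(1, k + 1)
    rho = (U - css / idx > 0).sum(axis=1)      # size of the support, always >= 1
    theta = css[np.arange(n), rho - 1] / rho   # per-row shift
    return np.maximum(V - theta[:, None], 0.0)


def archetypal_analysis(X, k, n_iter=200, seed=0):
    """Approximate X (n x d) as A @ Z with Z = B @ X (illustrative solver).

    A (n x k) and B (k x n) have non-negative rows that sum to one.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = project_to_simplex(rng.random((n, k)))
    B = project_to_simplex(rng.random((k, n)))
    L_X = np.linalg.norm(X, 2) ** 2            # squared spectral norm of X
    for _ in range(n_iter):
        # Update sample weights A with archetypes fixed.
        Z = B @ X
        grad_A = (A @ Z - X) @ Z.T
        step_A = 1.0 / (np.linalg.norm(Z, 2) ** 2 + 1e-12)
        A = project_to_simplex(A - step_A * grad_A)
        # Update archetype weights B with A fixed.
        grad_B = A.T @ (A @ B @ X - X) @ X.T
        step_B = 1.0 / (np.linalg.norm(A, 2) ** 2 * L_X + 1e-12)
        B = project_to_simplex(B - step_B * grad_B)
    return A, B, B @ X


# Synthetic relative-abundance profiles: 60 samples, 5 features (made-up data).
rng = np.random.default_rng(1)
X = rng.dirichlet(np.ones(5), size=60)
A, B, Z = archetypal_analysis(X, k=3)
print("relative reconstruction error:",
      np.linalg.norm(X - A @ Z) / np.linalg.norm(X))
```

Because `B` is row-stochastic, each archetype lies inside the convex hull of the samples; when the rows of `X` are relative abundances, the archetypes are themselves valid compositions.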
We apply AA to an aggregated dataset of metagenomic profiles drawn from multiple independent studies and uniformly reprocessed through a common pipeline. Removing archetypes that are specific to individual studies improves cross-study comparability and yields a robust shared representation that preserves biologically relevant information. In this reduced space, the healthy/diseased status of samples emerges naturally as a prominent axis of separation, enabling the training of a classifier that generalizes well across studies.
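One way the removal of study-specific archetypes could look in practice is sketched below. The abstract does not state the criterion used, so everything here is an assumption: the hypothetical helper `drop_study_specific` flags an archetype as study-specific when a single study contributes at least a `threshold` share of its total weight, drops it, and renormalizes the remaining weights so that each sample is again a convex combination.

```python
import numpy as np


def drop_study_specific(A, study, threshold=0.8):
    """Remove archetypes dominated by one study (hypothetical criterion).

    A: (n, k) archetype weights, rows non-negative and summing to one.
    study: length-n array of study labels.
    Returns the renormalized weights and a boolean mask of kept archetypes.
    """
    study = np.asarray(study)
    labels = np.unique(study)
    # mass[s, j]: total weight that samples of study s place on archetype j
    mass = np.stack([A[study == s].sum(axis=0) for s in labels])
    share = mass / np.maximum(mass.sum(axis=0, keepdims=True), 1e-12)
    keep = share.max(axis=0) < threshold
    A_kept = A[:, keep]
    # Renormalize; a sample with no weight on kept archetypes stays all-zero.
    A_shared = A_kept / np.maximum(A_kept.sum(axis=1, keepdims=True), 1e-12)
    return A_shared, keep


# Toy weights: the third archetype is used only by study "a".
A_toy = np.array([[0.5, 0.2, 0.3],
                  [0.4, 0.2, 0.4],
                  [0.5, 0.5, 0.0],
                  [0.6, 0.4, 0.0]])
study_toy = np.array(["a", "a", "b", "b"])
A_shared, keep = drop_study_specific(A_toy, study_toy)
print(keep)      # mask of archetypes retained in the shared representation
print(A_shared)  # renormalized weights, usable as classifier features
```

This raw-share criterion is sensitive to unequal study sizes; a real analysis would likely normalize by the number of samples per study, and would evaluate the downstream classifier by holding out entire studies.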
These results suggest that archetypal geometry can serve as a powerful tool in microbiome-based diagnostics, particularly when data integration across studies is necessary.
