EPSRC Doctoral Prize Fellow at the University of Bristol

News

2025-08-06

Published Stratify: Unifying Multi-Step Forecasting Strategies in ECML / Springer Nature!

2025-05-23

Published Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model in Astronomy & Astrophysics!

2025-03-25

Presented Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts at ICLR 2025 - First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models!

2025-03-19

Published Euclid Quick Data Release (Q1). Active galactic nuclei identification using diffusion-based inpainting of Euclid VIS images on Arxiv!

2024-10-03

I have successfully defended my thesis and have been awarded my PhD in Interactive AI!

2024-09-19

Published Euclid preparation - XLIII. Measuring detailed galaxy morphologies for Euclid with machine learning in Astronomy & Astrophysics!

2024-06-03

I have started my 2-year Fellowship at the University of Bristol working on Diffusion Models and Digital Twins!

2024-02-15

Published Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting on Arxiv!

2024-02-15

Presented a talk on 'Optimizing Data Efficiency: Using Active Learning Strategies and the QUEST Method for Efficient Classification and Labeling in Large Datasets' at the Galaxies & AGN with the First Euclid Data and Beyond in Bologna.

2023-03-28

Presented a pecha-kucha talk on my PhD research at the Interactive AI CDT Spring Research Conference.

Publications

Keywords: active-learning, classification, computer-vision, diffusion, euclid-consortium, foundation-models, meta-learning, mixture-of-experts, multistep-forecasting, regression, software, time-series, transfer-learning

2025

Top: Respective noised images produced by the cosine-beta schedule at different timesteps. Each image is a sample from the respective signal-to-noise bin directly below it. Due to the scales of pixel values, the introduced noise has a more significant impact on the typically fainter, low S/N images, leading to the images converging to Gaussian noise much sooner into forward process. The relationship of S/N and rate of convergence results in the entirety of the top left of the grid of images being pure noise, indicating inefficient training for lower S/N images. This highlights the difficulty in applying off-the-shelf pipelines to the complexities of real-world astronomical data that feature high dynamic range and varying quality over images. Bottom: Distribution of S/N of galaxy images. Even though the sample is dominated by lower S/N images, a non-negligible number of sources with S/N→1000 remains in the training set.
Euclid Quick Data Release (Q1). Active galactic nuclei identification using diffusion-based inpainting of Euclid VIS images
G. Stevens, S. Fotopoulou, M.N. Bremer, T. Matamoro Zatarain, K. Jahnke, B. Margalef-Bentabol, M. Huertas-Company, M.J. Smith, M. Walmsley, M. Salvato, M. Mezcua, A. Paulino-Afonso, M. Siudek, M. Talia, F. Ricci, W. Roster, Euclid Consortium
Arxiv Preprint
PDF DOI BIB ABSTRACT
Keywords: diffusion, computer-vision, euclid-consortium, classification
Light emission from galaxies exhibit diverse brightness profiles, influenced by factors such as galaxy type, structural features and interactions with other galaxies. Elliptical galaxies feature more uniform light distributions, while spiral and irregular galaxies have complex, varied light profiles due to their structural heterogeneity and star-forming activity. In addition, galaxies with an active galactic nucleus (AGN) feature intense, concentrated emission from gas accretion around supermassive black holes, superimposed on regular galactic light, while quasi-stellar objects (QSO) are the extreme case of the AGN emission dominating the galaxy. The challenge of identifying AGN and QSO has been discussed many times in the literature, often requiring multi-wavelength observations. This paper introduces a novel approach to identify AGN and QSO from a single image. Diffusion models have been recently developed in the machine-learning literature to generate realistic-looking images of everyday objects. Utilising the spatial resolving power of the Euclid VIS images, we created a diffusion model trained on one million sources, without using any source pre-selection or labels. The model learns to reconstruct light distributions of normal galaxies, since the population is dominated by them. We condition the prediction of the central light distribution by masking the central few pixels of each source and reconstruct the light according to the diffusion model. We further use this prediction to identify sources that deviate from this profile by examining the reconstruction error of the few central pixels regenerated in each source's core. Our approach, solely using VIS imaging, features high completeness compared to traditional methods of AGN and QSO selection, including optical, near-infrared, mid-infrared, and X-rays.
@misc{stevens2025EuclidInpaintingAGN, author = {{Stevens}, G. and {Fotopoulou}, S. and {Bremer}, M.~N. and {Matamoro Zatarain}, T. and {Jahnke}, K. and {Margalef-Bentabol}, B. and {Huertas-Company}, M. and {Smith}, M.~J. and {Walmsley}, M. and {Salvato}, M. and {Mezcua}, M. and {Paulino-Afonso}, A. and {Siudek}, M. and {Talia}, M. and {Ricci}, F. and {Roster}, W. and the {Euclid Collaboration}.}, title = "{Euclid Quick Data Release (Q1). Active galactic nuclei identification using diffusion-based inpainting of Euclid VIS images}", year = {2025}, eprint = {2503.15321} }
Summary of the strategies in MSF. In bold are our contributions. We extend the single-output Rectify strategy into its multi-output [13] variant, analogous to RecMO, DirMO [10], and DirRecMO [9]. Stratify is a framework which generalises all existing strategies and introduces novel strategies with improved performance. Lines show the evolution and fusion of previous strategies to form new ones.
Stratify: Unifying Multi-Step Forecasting Strategies
R. Green, G. Stevens, Z. Abdallah, T. de Menezes e Silva Filho
ECML / Springer Nature
PDF DOI BIB ABSTRACT
Keywords: time-series, multistep-forecasting, regression
A key aspect of temporal domains is the ability to make predictions multiple time-steps into the future, a process known as multi-step forecasting (MSF). At the core of this process is selecting a forecasting strategy; however, with no existing frameworks to map out the space of strategies, practitioners are left with ad-hoc methods for strategy selection. In this work, we propose Stratify, a parameterised framework that addresses multi-step forecasting, unifying existing strategies and introducing novel, improved strategies. We evaluate Stratify on 18 benchmark datasets, five function classes, and short to long forecast horizons (10, 20, 40, 80) in the univariate setting. In over 84% of 1080 experiments, novel strategies in Stratify improved performance compared to all existing ones. Importantly, we find that no single strategy consistently outperforms others in all task settings, highlighting the need for practitioners to explore the Stratify space to carefully search and select forecasting strategies based on task-specific requirements. Our results are the most comprehensive benchmarking of known and novel forecasting strategies. We share our code to reproduce our results.
@article{green2025stratify, title = {Stratify: unifying multi-step forecasting strategies}, volume = {39}, issn = {1573-756X}, shorttitle = {Stratify}, doi = {10.1007/s10618-025-01135-1}, number = {5}, journal = {Data Mining and Knowledge Discovery}, author = {Green, Riku and Stevens, Grant and Abdallah, Zahraa S. and Silva Filho, Telmo M.}, month = aug, year = {2025}, pages = {64}}
UMAP visualisations of the embeddings from AstroPT trained on VIS+NISP+SEDs with example cutouts and SEDs.
Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model
M. Siudek, M. Huertas-Company, M. Smith, G. Martinez-Solaeche, F. Lanusse, S. Ho, E. Angeloudi, P. A. C. Cunha, H. Domínguez Sánchez, M. Dunn, Y. Fu, P. Iglesias-Navarro, J. Junais, J. H. Knapen, B. Laloux, M. Mezcua, W. Roster, G. Stevens, J. Vega-Ferrero, Euclid Consortium
Astronomy & Astrophysics
PDF DOI BIB ABSTRACT
Keywords: foundation-models, computer-vision, euclid-consortium, classification
Modern astronomical surveys, such as the Euclid mission, produce high-dimensional, multi-modal data sets that include imaging and spectroscopic information for millions of galaxies. These data serve as an ideal benchmark for large, pre-trained multi-modal models, which can leverage vast amounts of unlabelled data. In this work, we present the first exploration of Euclid data with AstroPT, an autoregressive multi-modal foundation model trained on approximately 300000 optical and infrared Euclid images and spectral energy distributions (SEDs) from the first Euclid Quick Data Release. We compare self-supervised pre-training with baseline fully supervised training across several tasks: galaxy morphology classification; redshift estimation; similarity searches; and outlier detection. Our results show that: (a) AstroPT embeddings are highly informative, correlating with morphology and effectively isolating outliers; (b) including infrared data helps to isolate stars, but degrades the identification of edge-on galaxies, which are better captured by optical images; (c) simple fine-tuning of these embeddings for photometric redshift and stellar mass estimation outperforms a fully supervised approach, even when using only 1% of the training labels; and (d) incorporating SED data into AstroPT via a straightforward multi-modal token-chaining method improves photo-z predictions, and allow us to identify potentially more interesting anomalies (such as ringed or interacting galaxies) compared to a model pre-trained solely on imaging data.
@article{Siudek2025EuclidFoundation, author = {{Siudek}, M. and {Huertas-Company}, M. and {Smith}, M. and {Martinez-Solaeche}, G. and {Lanusse}, F. and {Ho}, S. and {Angeloudi}, E. and {Cunha}, P.~A.~C. and {Domínguez Sánchez}, H. and {Dunn}, M. and {Fu}, Y. and {Iglesias-Navarro}, P. and {Junais}, J. and {Knapen}, J.~H. and {Laloux}, B. and {Mezcua}, M. and {Roster}, W. and {Stevens}, G. and {Vega-Ferrero}, J. and the {Euclid Collaboration}.}, title = "{Euclid Quick Data Release (Q1) Exploring galaxy properties with a multi-modal foundation model}", journal={Astronomy \& Astrophysics}, year={2025}, publisher={EDP sciences}, DOI= "10.1051/0004-6361/202554611", }
Illustration of vanilla MoE and our proposed MixER layer. (Left) Vanilla MoE setting where a single input x is passed through a gating network whose outputs enable the router to assign computation to a specific expert (Chen et al., 2022). (Right) Our sparse MixER layer requires a context vector ξ alongside the input x. The gating network computes expert affinities based on this context vector. Contrary to MoE, the MixER layer disregards the softmax-weighted output aggregation.
Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts
R. Desmond Nzoyem, G. Stevens, A. Sahota, D.AW Barton, T. Deakin
ICLR 2025 - First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models
PDF ARXIV BIB ABSTRACT
Keywords: foundation-models, mixture-of-experts, meta-learning
As foundational models reshape scientific discovery, a bottleneck persists in dynamical system reconstruction (DSR): the ability to learn across system hierarchies. Many meta-learning approaches have been applied successfully to single systems, but falter when confronted with sparse, loosely related datasets requiring multiple hierarchies to be learned. Mixture of Experts (MoE) offers a natural paradigm to address these challenges. Despite their potential, we demonstrate that naive MoEs are inadequate for the nuanced demands of hierarchical DSR, largely due to their gradient descent-based gating update mechanism which leads to slow updates and conflicted routing during training. To overcome this limitation, we introduce MixER: Mixture of Expert Reconstructors, a novel sparse top-1 MoE layer employing a custom gating update algorithm based on K-means and least squares. Extensive experiments validate MixER's capabilities, demonstrating efficient training and scalability to systems of up to ten parametric ordinary differential equations. However, our layer underperforms state-of-the-art meta-learners in high-data regimes, particularly when each expert is constrained to process only a fraction of a dataset composed of highly related data points. Further analysis with synthetic and neuroscientific time series suggests that the quality of the contextual representations generated by MixER is closely linked to the presence of hierarchical structure in the data.
@inproceedings{nzoyemmixer, title={Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts}, author={{Desmond Nzoyem}, R. and {Stevens}, G. and {Sahota}, A. and {Barton}, D.AW and {Deakin}, T.}, booktitle={First Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models, ICLR 2025} }

2024

Vote fraction mean deviations δi of the model predictions and the volunteer labels for the different morphology answers i (see Eq. (3)). The model was trained with all galaxies from the complete set. The deviations are displayed for all galaxies of the test set and for galaxies within a magnitude interval with m = m_I814W. Lower δi indicates better performance. The black dashed line marks 12% vote fraction mean deviation.
Euclid preparation - XLIII. Measuring detailed galaxy morphologies for Euclid with machine learning
B. Aussel, S. Kruk, M. Walmsley, M. Huertas-Company, M. Castellano, C.J. Conselice, M. Delli Veneri, H. Domínguez-Sánchez, P.-A. Duc, U. Kuchner, A. La Marca, B. Margalef-Bentabol, F.R. Marleau, G. Stevens, Y. Toba, C. Tortora, L. Wang, Euclid Consortium
Astronomy & Astrophysics

PDF DOI BIB ABSTRACT
Keywords: transfer-learning, computer-vision, euclid-consortium, classification, regression
The Euclid mission is expected to image millions of galaxies at high resolution, providing an extensive dataset with which to study galaxy evolution. Because galaxy morphology is both a fundamental parameter and one that is hard to determine for large samples, we investigate the application of deep learning in predicting the detailed morphologies of galaxies in Euclid using Zoobot, a convolutional neural network pretrained with 450 000 galaxies from the Galaxy Zoo project. We adapted Zoobot for use with emulated Euclid images generated based on Hubble Space Telescope COSMOS images and with labels provided by volunteers in the Galaxy Zoo: Hubble project. We experimented with different numbers of galaxies and various magnitude cuts during the training process. We demonstrate that the trained Zoobot model successfully measures detailed galaxy morphology in emulated Euclid images. It effectively predicts whether a galaxy has features and identifies and characterises various features, such as spiral arms, clumps, bars, discs, and central bulges. When compared to volunteer classifications, Zoobot achieves mean vote fraction deviations of less than 12% and an accuracy of above 91% for the confident volunteer classifications across most morphology types. However, the performance varies depending on the specific morphological class. For the global classes, such as disc or smooth galaxies, the mean deviations are less than 10%, with only 1000 training galaxies necessary to reach this performance. On the other hand, for more detailed structures and complex tasks, such as detecting and counting spiral arms or clumps, the deviations are slightly higher, of namely around 12% with 60 000 galaxies used for training. In order to enhance the performance on complex morphologies, we anticipate that a larger pool of labelled galaxies is needed, which could be obtained using crowd sourcing. We estimate that, with our model, the detailed morphology of approximately 800 million galaxies of the Euclid Wide Survey could be reliably measured and that approximately 230 million of these galaxies would display features. Finally, our findings imply that the model can be effectively adapted to new morphological labels. We demonstrate this adaptability by applying Zoobot to peculiar galaxies. In summary, our trained Zoobot CNN can readily predict morphological catalogues for Euclid images.
@article{ Aussel2024euclid, title={Euclid preparation-XLIII. Measuring detailed galaxy morphologies for Euclid with machine learning}, author={{Aussel}, B. and {Kruk}, S. and {Walmsley}, M. and {Castellano}, M. and {Conselice}, C.J. and {Delli Veneri}, M. and {Dominguez Sanchez}, H. and {Duc}, P.-A. and {Knapen}, J.H. and {Kuchner}, U. and {La Marca}, A. and {Margalef-Bentabol}, B. and {Marleau}, F.R. and {Stevens}, G. and {Toba}, Y. and {Tortora}, C. and {Wang}, L. and the {Euclid Collaboration}}, journal={Astronomy \& Astrophysics}, volume={689}, pages={A274}, year={2024}, publisher={EDP sciences}, DOI= "10.1051/0004-6361/202449609", }
The top-1 accuracy (the proportion of within-task instances where a strategy is optimal) aggregated over all datasets and task settings. DIRMO and RECMO include all σ parameters.
Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting
R. Green, G. Stevens, T. de Menezes e Silva Filho, Z. Abdallah
Arxiv Preprint
PDF DOI BIB ABSTRACT
Keywords: time-series, multistep-forecasting, regression
Multi-step forecasting (MSF) in time-series, the ability to make predictions multiple time steps into the future, is fundamental to almost all temporal domains. To make such forecasts, one must assume the recursive complexity of the temporal dynamics. Such assumptions are referred to as the forecasting strategy used to train a predictive model. Previous work shows that it is not clear which forecasting strategy is optimal a priori to evaluating on unseen data. Furthermore, current approaches to MSF use a single (fixed) forecasting strategy. In this paper, we characterise the instance-level variance of optimal forecasting strategies and propose Dynamic Strategies (DyStrat) for MSF. We experiment using 10 datasets from different scales, domains, and lengths of multi-step horizons. When using a random-forest-based classifier, DyStrat outperforms the best fixed strategy, which is not knowable a priori, 94% of the time, with an average reduction in mean-squared error of 11%. Our approach typically triples the top-1 accuracy compared to current approaches. Notably, we show DyStrat generalises well for any MSF task.
@misc{green2024timeseries, title={Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting}, author={{Green}, R. and {Stevens}, G. and {Abdallah}, Z. and {de Menezes e Silva Filho}, T.}, year={2024}, eprint={2402.08373} }

2021

AstronomicAL: an interactive dashboard for visualisation, integration and classification of data with Active Learning
G. Stevens, S. Fotopoulou, M.N. Bremer, O. Ray
Journal for Open Source Software

PDF DOI BIB ABSTRACT PROJECT
Keywords: active-learning, software, classification
AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning. This technique prioritises data that offer high information gain, leading to improved performance using substantially less data. The system allows users to visualise and integrate data from different sources and deal with incorrect or missing labels and imbalanced class sizes. AstronomicAL enables experts to visualise domain-specific plots and key information relating both to broader context and details of a point of interest drawn from a variety of data sources, ensuring reliable labels. In addition, AstronomicAL provides functionality to explore all aspects of the training process, including custom models and query strategies. This makes the software a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies. We illustrate using the system with an astronomical dataset due to the field’s immediate need; however, AstronomicAL has been designed for datasets from any discipline. Finally, by exporting a simple configuration file, entire layouts, models, and assigned labels can be shared with the community. This allows for complete transparency and ensures that the process of reproducing results is effortless.
@article{Stevens_2021, doi = {10.21105/joss.03635}, year = 2021, month = {sep}, publisher = {The Open Journal}, volume = {6}, number = {65}, pages = {3635}, author = {{Stevens}, G. and {Fotopoulou}, S. and {Bremer}, M.~N. and {Ray}, O.}, title = {{AstronomicAL}: an interactive dashboard for visualisation, integration and classification of data with Active Learning}, journal = {Journal of Open Source Software} }