Published 'Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning' on arXiv!
2024-02-15
Published 'Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting' on arXiv!
2024-02-15
Presented a talk on 'Optimizing Data Efficiency: Using Active Learning Strategies and the QUEST Method for Efficient Classification and Labeling in Large Datasets' at the Galaxies & AGN with the First Euclid Data and Beyond in Bologna.
I recently finished my 6-month placement as an AI Research Engineer at Imagination Technologies, where I worked on sparsity, LiDAR data and CUDA implementations of custom DL layers. A patent application for my work has been submitted and is awaiting approval.
2022-04-27
Presented a talk on 'Using active learning to create reliable and robust classifiers for Euclid' at the 2022 Annual Euclid Consortium Meeting in Oslo.
From June, I will be starting a 6-month placement as an AI Research Engineer at Imagination Technologies!
2021-12-16
Presented a talk on 'Using active learning to create reliable and robust classifiers for Euclid' at the 2021 Euclid Consortium UK Meeting.
2021-10-18
Presented a research poster on AstronomicAL at the 2021 IAP Colloquium, which was dedicated to critical analysis of Machine Learning methods in Astronomy.
2021-09-03
Published 'AstronomicAL: an interactive dashboard for visualisation, integration and classification of data with Active Learning' in the Journal of Open Source Software!
Welcome to my site.
I am currently an EPSRC Doctoral Prize Fellow at the University of Bristol. I specialise in machine learning techniques applied to astronomical data, with a focus on improving active learning performance and exploring the utility of weak supervision. My work has led to my involvement in, and consulting on, the morphology classification pipeline for Euclid, the recently launched ESA space telescope.
Industry Placement
I have recently completed my 6-month placement as an AI Research Engineer at Imagination Technologies.
Creating novel query strategies to improve accuracy and reduce labelling costs for active learning.
Combining the use of weak supervision methods with active learning to improve performance on datasets where labels are scarce, noisy, or difficult to obtain.
Using active learning for galaxy morphology classification with noisy image data and unreliable labels.
Source classification (star, galaxy, AGN, QSO separation) using Active Learning and Outlier Detection methods.
Creating interactive software for researchers to make use of cutting-edge machine learning techniques.
I have a strong interest in creating software that helps researchers apply machine learning methods to their respective fields. This builds on the experience I gained as an undergraduate and is also one of the key aims of the CDT.
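As a rough illustration of what a query strategy does in the active learning work listed above, here is a minimal sketch of entropy-based uncertainty sampling, a standard baseline. It is not one of the novel strategies mentioned, and the function names (`entropy`, `select_queries`) and the toy probabilities are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_queries(pool_probs: Sequence[Sequence[float]], batch_size: int) -> List[int]:
    """Return indices of the pool items the classifier is least sure about.

    pool_probs[i] holds the classifier's predicted class probabilities for
    the i-th unlabelled item. Items are ranked by entropy, highest first.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]


if __name__ == "__main__":
    # Toy pool: star/galaxy probabilities for three unlabelled sources.
    pool = [[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]]
    print(select_queries(pool, batch_size=2))
```

The selected items would then be sent to a human labeller, the classifier retrained, and the loop repeated, so that labelling effort goes to the examples expected to be most informative.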
Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning
B. Aussel, S. Kruk, M. Walmsley, M. Huertas-Company, M. Castellano, C.J. Conselice, M. Delli Veneri, H. Domínguez-Sánchez, P.-A. Duc, U. Kuchner, A. La Marca, B. Margalef-Bentabol, F.R. Marleau, G. Stevens, Y. Toba, C. Tortora, L. Wang, Euclid Consortium
arXiv preprint
The Euclid mission is expected to image millions of galaxies with high resolution, providing an extensive dataset to study galaxy evolution. We investigate the application of deep learning to predict the detailed morphologies of galaxies in Euclid using Zoobot, a convolutional neural network pretrained with 450000 galaxies from the Galaxy Zoo project. We adapted Zoobot for emulated Euclid images, generated based on Hubble Space Telescope COSMOS images, and with labels provided by volunteers in the Galaxy Zoo: Hubble project. We demonstrate that the trained Zoobot model successfully measures detailed morphology for emulated Euclid images. It effectively predicts whether a galaxy has features and identifies and characterises various features such as spiral arms, clumps, bars, disks, and central bulges. When compared to volunteer classifications, Zoobot achieves mean vote fraction deviations of less than 12% and an accuracy above 91% for the confident volunteer classifications across most morphology types. However, the performance varies depending on the specific morphological class. For the global classes such as disk or smooth galaxies, the mean deviations are less than 10%, with only 1000 training galaxies necessary to reach this performance. For more detailed structures and complex tasks like detecting and counting spiral arms or clumps, the deviations are slightly higher, around 12% with 60000 galaxies used for training. In order to enhance the performance on complex morphologies, we anticipate that a larger pool of labelled galaxies is needed, which could be obtained using crowdsourcing. Finally, our findings imply that the model can be effectively adapted to new morphological labels. We demonstrate this adaptability by applying Zoobot to peculiar galaxies. In summary, our trained Zoobot CNN can readily predict morphological catalogues for Euclid images.
@misc{euclidcollaboration2024euclid, title={Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning}, author={Euclid Collaboration and B. Aussel and S. Kruk and M. Walmsley and M. Huertas-Company and M. Castellano and C. J. Conselice and M. Delli Veneri and H. Domínguez Sánchez and P. -A. Duc and U. Kuchner and A. La Marca and B. Margalef-Bentabol and F. R. Marleau and G. Stevens and Y. Toba and C. Tortora and L. Wang and N. Aghanim and B. Altieri and A. Amara and S. Andreon and N. Auricchio and M. Baldi and S. Bardelli and R. Bender and C. Bodendorf and D. Bonino and E. Branchini and M. Brescia and J. Brinchmann and S. Camera and V. Capobianco and C. Carbone and J. Carretero and S. Casas and S. Cavuoti and A. Cimatti and G. Congedo and L. Conversi and Y. Copin and F. Courbin and H. M. Courtois and M. Cropper and A. Da Silva and H. Degaudenzi and A. M. Di Giorgio and J. Dinis and F. Dubath and X. Dupac and S. Dusini and M. Farina and S. Farrens and S. Ferriol and S. Fotopoulou and M. Frailis and E. Franceschi and P. Franzetti and M. Fumana and S. Galeotta and B. Garilli and B. Gillis and C. Giocoli and A. Grazian and F. Grupp and S. V. H. Haugan and W. Holmes and I. Hook and F. Hormuth and A. Hornstrup and P. Hudelot and K. Jahnke and E. Keihänen and S. Kermiche and A. Kiessling and M. Kilbinger and B. Kubik and M. Kümmel and M. Kunz and H. Kurki-Suonio and R. Laureijs and S. Ligori and P. B. Lilje and V. Lindholm and I. Lloro and E. Maiorano and O. Mansutti and O. Marggraf and K. Markovic and N. Martinet and F. Marulli and R. Massey and S. Maurogordato and E. Medinaceli and S. Mei and Y. Mellier and M. Meneghetti and E. Merlin and G. Meylan and M. Moresco and L. Moscardini and E. Munari and S. -M. Niemi and C. Padilla and S. Paltani and F. Pasian and K. Pedersen and W. J. Percival and V. Pettorino and S. Pires and G. Polenta and M. Poncet and L. A. Popa and L. Pozzetti and F. Raison and R. Rebolo and A. Renzi and J. 
Rhodes and G. Riccio and E. Romelli and M. Roncarelli and E. Rossetti and R. Saglia and D. Sapone and B. Sartoris and M. Schirmer and P. Schneider and A. Secroun and G. Seidel and S. Serrano and C. Sirignano and G. Sirri and L. Stanco and J. -L. Starck and P. Tallada-Crespí and A. N. Taylor and H. I. Teplitz and I. Tereno and R. Toledo-Moreo and F. Torradeflot and I. Tutusaus and E. A. Valentijn and L. Valenziano and T. Vassallo and A. Veropalumbo and Y. Wang and J. Weller and A. Zacchei and G. Zamorani and J. Zoubian and E. Zucca and A. Biviano and M. Bolzonella and A. Boucaud and E. Bozzo and C. Burigana and C. Colodro-Conde and D. Di Ferdinando and R. Farinelli and J. Graciá-Carpio and G. Mainetti and S. Marcin and N. Mauri and C. Neissner and A. A. Nucita and Z. Sakr and V. Scottez and M. Tenti and M. Viel and M. Wiesmann and Y. Akrami and V. Allevato and S. Anselmi and C. Baccigalupi and M. Ballardini and S. Borgani and A. S. Borlaff and H. Bretonnière and S. Bruton and R. Cabanac and A. Calabro and A. Cappi and C. S. Carvalho and G. Castignani and T. Castro and G. Cañas-Herrera and K. C. Chambers and J. Coupon and O. Cucciati and S. Davini and G. De Lucia and G. Desprez and S. Di Domizio and H. Dole and A. Díaz-Sánchez and J. A. Escartin Vigo and S. Escoffier and I. Ferrero and F. Finelli and L. Gabarra and K. Ganga and J. García-Bellido and E. Gaztanaga and K. George and F. Giacomini and G. Gozaliasl and A. Gregorio and D. Guinet and A. Hall and H. Hildebrandt and A. Jimenez Munoz and J. J. E. Kajava and V. Kansal and D. Karagiannis and C. C. Kirkpatrick and L. Legrand and A. Loureiro and J. Macias-Perez and M. Magliocchetti and R. Maoli and M. Martinelli and C. J. A. P. Martins and S. Matthew and M. Maturi and L. Maurin and R. B. Metcalf and M. Migliaccio and P. Monaco and G. Morgante and S. Nadathur and Nicholas A. Walton and A. Peel and A. Pezzotta and V. Popa and C. Porciani and D. Potter and M. Pöntinen and P. Reimberg and P. -F. Rocci and A. G. 
Sánchez and A. Schneider and E. Sefusatti and M. Sereno and P. Simon and A. Spurio Mancini and S. A. Stanford and J. Steinwagner and G. Testera and M. Tewes and R. Teyssier and S. Toft and S. Tosi and A. Troja and M. Tucci and C. Valieri and J. Valiviita and D. Vergani and I. A. Zinchenko}, year={2024}, eprint={2402.10187} }
Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting
R. Green, G. Stevens, T. de Menezes e Silva Filho, Z. Abdallah
arXiv preprint
Multi-step forecasting (MSF) in time-series, the ability to make predictions multiple time steps into the future, is fundamental to almost all temporal domains. To make such forecasts, one must assume the recursive complexity of the temporal dynamics. Such assumptions are referred to as the forecasting strategy used to train a predictive model. Previous work shows that it is not clear which forecasting strategy is optimal a priori to evaluating on unseen data. Furthermore, current approaches to MSF use a single (fixed) forecasting strategy. In this paper, we characterise the instance-level variance of optimal forecasting strategies and propose Dynamic Strategies (DyStrat) for MSF. We experiment using 10 datasets from different scales, domains, and lengths of multi-step horizons. When using a random-forest-based classifier, DyStrat outperforms the best fixed strategy, which is not knowable a priori, 94% of the time, with an average reduction in mean-squared error of 11%. Our approach typically triples the top-1 accuracy compared to current approaches. Notably, we show DyStrat generalises well for any MSF task.
@misc{green2024timeseries, title={Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting}, author={Riku Green and Grant Stevens and Telmo de Menezes e Silva Filho and Zahraa Abdallah}, year={2024}, eprint={2402.08373} }
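To make the idea of a "forecasting strategy" in the abstract above concrete, the sketch below contrasts two fixed strategies that are common in the forecasting literature: recursive (one one-step model applied repeatedly) and direct (a separate model per horizon). This is an illustration only, not the DyStrat method; the abstract does not say which strategies DyStrat chooses between, and the simple least-squares model and all function names here are assumptions.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0, mean_y  # constant input: fall back to predicting the mean
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def recursive_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Fit one one-step-ahead model and feed its predictions back in."""
    if len(series) < horizon + 2:
        raise ValueError("series too short for the requested horizon")
    a, b = fit_line(series[:-1], series[1:])
    preds, last = [], series[-1]
    for _ in range(horizon):
        last = a * last + b
        preds.append(last)
    return preds


def direct_forecast(series: Sequence[float], horizon: int) -> List[float]:
    """Fit a separate model for each step h, mapping y[t] to y[t + h]."""
    if len(series) < horizon + 2:
        raise ValueError("series too short for the requested horizon")
    preds = []
    for h in range(1, horizon + 1):
        a, b = fit_line(series[:-h], series[h:])
        preds.append(a * series[-1] + b)
    return preds


if __name__ == "__main__":
    series = [float(t) for t in range(1, 11)]  # toy linear trend 1..10
    print("recursive:", recursive_forecast(series, horizon=3))
    print("direct:   ", direct_forecast(series, horizon=3))
```

On a clean linear trend the two strategies agree; on noisy real data they generally differ, because recursive forecasts compound one-step errors while direct models each see fewer training pairs. That difference is what makes the choice of strategy matter.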
On-campus presentation to 30 local sixth form students who intend to study Engineering at university. This presentation immediately followed the 'AI & ML: Cutting Through The Hype' talk and showed how ML tasks are often not as straightforward as they may seem. The talk is highly interactive, so that students can discover the problems that arise for themselves and see why certain solutions may not be sufficient.
Webinar presented to 60 sixth form students who intend to study Engineering at university. The presentation starts with an introduction to what Computer Science is (and is not) like at university, followed by a (very brief) overview of the foundations of Machine Learning and AI. Unfortunately, the adoption of these tools has led to a great deal of exaggeration and overuse of buzzwords throughout the industry, making it seem like companies are doing highly complicated, ground-breaking things when most of the time they are doing nothing more than the Maths the students use in their A-Level studies. I also use the Dot-Com Boom and the AI Winter as examples of how overhyping can damage research progress and the economy.