Publications
2024
- Euclid preparation - XLIII. Measuring detailed galaxy morphologies for Euclid with machine learning
B. Aussel, S. Kruk, M. Walmsley, M. Huertas-Company, M. Castellano, C.J. Conselice, M. Delli Veneri, H. Domínguez-Sánchez, P.-A. Duc, U. Kuchner, A. La Marca, B. Margalef-Bentabol, F.R. Marleau, G. Stevens, Y. Toba, C. Tortora, L. Wang, Euclid Consortium
Astronomy & Astrophysics
Keywords: transfer-learning, galaxy-classification, morphology-classification, computer-vision, euclid-consortium, classification, regression

The Euclid mission is expected to image millions of galaxies at high resolution, providing an extensive dataset with which to study galaxy evolution. Because galaxy morphology is both a fundamental parameter and one that is hard to determine for large samples, we investigate the application of deep learning to predicting the detailed morphologies of galaxies in Euclid using Zoobot, a convolutional neural network pretrained with 450 000 galaxies from the Galaxy Zoo project. We adapted Zoobot for use with emulated Euclid images, generated from Hubble Space Telescope COSMOS images, with labels provided by volunteers in the Galaxy Zoo: Hubble project. We experimented with different numbers of galaxies and various magnitude cuts during training. We demonstrate that the trained Zoobot model successfully measures detailed galaxy morphology in emulated Euclid images: it effectively predicts whether a galaxy has features and identifies and characterises various features, such as spiral arms, clumps, bars, discs, and central bulges. Compared to volunteer classifications, Zoobot achieves mean vote-fraction deviations of less than 12% and an accuracy above 91% for the confident volunteer classifications across most morphology types. The performance varies, however, with the specific morphological class. For global classes, such as disc or smooth galaxies, the mean deviations are less than 10%, with only 1000 training galaxies needed to reach this performance. For more detailed structures and complex tasks, such as detecting and counting spiral arms or clumps, the deviations are slightly higher, around 12% with 60 000 galaxies used for training. To enhance performance on complex morphologies, we anticipate that a larger pool of labelled galaxies is needed, which could be obtained through crowdsourcing. We estimate that, with our model, the detailed morphology of approximately 800 million galaxies in the Euclid Wide Survey could be reliably measured, and that approximately 230 million of these galaxies would display features. Finally, our findings imply that the model can be effectively adapted to new morphological labels; we demonstrate this adaptability by applying Zoobot to peculiar galaxies. In summary, our trained Zoobot CNN can readily predict morphological catalogues for Euclid images.

BibTeX: @article{Aussel2024euclid, author = {{Euclid Collaboration:} and {Aussel, B.} and {Kruk, S.} and {Walmsley, M.} and {Huertas-Company, M.} and {Castellano, M.} and {Conselice, C. J.} and {Veneri, M. Delli} and {Sánchez, H. Domínguez} and {Duc, P.-A.} and {Knapen, J. H.} and {Kuchner, U.} and {La Marca, A.} and {Margalef-Bentabol, B.} and {Marleau, F. R.} and {Stevens, G.} and {Toba, Y.} and {Tortora, C.} and {Wang, L.} and {Aghanim, N.} and {Altieri, B.} and {Amara, A.} and {Andreon, S.} and {Auricchio, N.} and {Baldi, M.} and {Bardelli, S.} and {Bender, R.} and {Bodendorf, C.} and {Bonino, D.} and {Branchini, E.} and {Brescia, M.} and {Brinchmann, J.} and {Camera, S.} and {Capobianco, V.} and {Carbone, C.} and {Carretero, J.} and {Casas, S.} and {Cavuoti, S.} and {Cimatti, A.} and {Congedo, G.} and {Conversi, L.} and {Copin, Y.} and {Courbin, F.} and {Courtois, H. M.} and {Cropper, M.} and {Da Silva, A.} and {Degaudenzi, H.} and {Di Giorgio, A. 
M.} and {Dinis, J.} and {Dubath, F.} and {Dupac, X.} and {Dusini, S.} and {Farina, M.} and {Farrens, S.} and {Ferriol, S.} and {Fotopoulou, S.} and {Frailis, M.} and {Franceschi, E.} and {Franzetti, P.} and {Fumana, M.} and {Galeotta, S.} and {Garilli, B.} and {Gillis, B.} and {Giocoli, C.} and {Grazian, A.} and {Grupp, F.} and {Haugan, S. V. H.} and {Holmes, W.} and {Hook, I.} and {Hormuth, F.} and {Hornstrup, A.} and {Hudelot, P.} and {Jahnke, K.} and {Keihänen, E.} and {Kermiche, S.} and {Kiessling, A.} and {Kilbinger, M.} and {Kubik, B.} and {Kümmel, M.} and {Kunz, M.} and {Kurki-Suonio, H.} and {Laureijs, R.} and {Ligori, S.} and {Lilje, P. B.} and {Lindholm, V.} and {Lloro, I.} and {Maiorano, E.} and {Mansutti, O.} and {Marggraf, O.} and {Markovic, K.} and {Martinet, N.} and {Marulli, F.} and {Massey, R.} and {Maurogordato, S.} and {Medinaceli, E.} and {Mei, S.} and {Mellier, Y.} and {Meneghetti, M.} and {Merlin, E.} and {Meylan, G.} and {Moresco, M.} and {Moscardini, L.} and {Munari, E.} and {Niemi, S.-M.} and {Padilla, C.} and {Paltani, S.} and {Pasian, F.} and {Pedersen, K.} and {Percival, W. J.} and {Pettorino, V.} and {Pires, S.} and {Polenta, G.} and {Poncet, M.} and {Popa, L. A.} and {Pozzetti, L.} and {Raison, F.} and {Rebolo, R.} and {Renzi, A.} and {Rhodes, J.} and {Riccio, G.} and {Romelli, E.} and {Roncarelli, M.} and {Rossetti, E.} and {Saglia, R.} and {Sapone, D.} and {Sartoris, B.} and {Schirmer, M.} and {Schneider, P.} and {Secroun, A.} and {Seidel, G.} and {Serrano, S.} and {Sirignano, C.} and {Sirri, G.} and {Stanco, L.} and {Starck, J.-L.} and {Tallada-Crespí, P.} and {Taylor, A. N.} and {Teplitz, H. I.} and {Tereno, I.} and {Toledo-Moreo, R.} and {Torradeflot, F.} and {Tutusaus, I.} and {Valentijn, E. A.} and {Valenziano, L.} and {Vassallo, T.} and {Veropalumbo, A.} and {Wang, Y.} and {Weller, J.} and {Zacchei, A.} and {Zamorani, G.} and {Zoubian, J.} and {Zucca, E.} and {Biviano, A.} and {Bolzonella, M.} and {Boucaud, A.} and {Bozzo, E.} and {Burigana, C.} and {Colodro-Conde, C.} and {Di Ferdinando, D.} and {Farinelli, R.} and {Graciá-Carpio, J.} and {Mainetti, G.} and {Marcin, S.} and {Mauri, N.} and {Neissner, C.} and {Nucita, A. A.} and {Sakr, Z.} and {Scottez, V.} and {Tenti, M.} and {Viel, M.} and {Wiesmann, M.} and {Akrami, Y.} and {Allevato, V.} and {Anselmi, S.} and {Baccigalupi, C.} and {Ballardini, M.} and {Borgani, S.} and {Borlaff, A. S.} and {Bretonnière, H.} and {Bruton, S.} and {Cabanac, R.} and {Calabro, A.} and {Cappi, A.} and {Carvalho, C. S.} and {Castignani, G.} and {Castro, T.} and {Cañas-Herrera, G.} and {Chambers, K. C.} and {Coupon, J.} and {Cucciati, O.} and {Davini, S.} and {De Lucia, G.} and {Desprez, G.} and {Di Domizio, S.} and {Dole, H.} and {Díaz-Sánchez, A.} and {Vigo, J. A. Escartin} and {Escoffier, S.} and {Ferrero, I.} and {Finelli, F.} and {Gabarra, L.} and {Ganga, K.} and {García-Bellido, J.} and {Gaztanaga, E.} and {George, K.} and {Giacomini, F.} and {Gozaliasl, G.} and {Gregorio, A.} and {Guinet, D.} and {Hall, A.} and {Hildebrandt, H.} and {Muñoz, A. Jimenez} and {Kajava, J. J. E.} and {Kansal, V.} and {Karagiannis, D.} and {Kirkpatrick, C. C.} and {Legrand, L.} and {Loureiro, A.} and {Macias-Perez, J.} and {Magliocchetti, M.} and {Maoli, R.} and {Martinelli, M.} and {Martins, C. J. A. P.} and {Matthew, S.} and {Maturi, M.} and {Maurin, L.} and {Metcalf, R. 
B.} and {Migliaccio, M.} and {Monaco, P.} and {Morgante, G.} and {Nadathur, S.} and {Walton, Nicholas A.} and {Peel, A.} and {Pezzotta, A.} and {Popa, V.} and {Porciani, C.} and {Potter, D.} and {Pöntinen, M.} and {Reimberg, P.} and {Rocci, P.-F.} and {Sánchez, A. G.} and {Schneider, A.} and {Sefusatti, E.} and {Sereno, M.} and {Simon, P.} and {Mancini, A. Spurio} and {Stanford, S. A.} and {Steinwagner, J.} and {Testera, G.} and {Tewes, M.} and {Teyssier, R.} and {Toft, S.} and {Tosi, S.} and {Troja, A.} and {Tucci, M.} and {Valieri, C.} and {Valiviita, J.} and {Vergani, D.} and {Zinchenko, I. A.}}, title = {Euclid preparation - XLIII. Measuring detailed galaxy morphologies for Euclid with machine learning}, DOI= "10.1051/0004-6361/202449609", url= "https://doi.org/10.1051/0004-6361/202449609", journal = {A\&A}, year = 2024, volume = 689, pages = "A274", }
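As a rough illustration of the transfer-learning setup described in the abstract, the sketch below fine-tunes a generic pretrained CNN backbone to regress volunteer vote fractions. It is a minimal, assumed example built on torchvision rather than the actual Zoobot API, and it simplifies the training objective (the real pipeline treats the Galaxy Zoo decision tree differently); names such as N_ANSWERS and train_step are purely illustrative.

```python
# Hypothetical sketch: adapt a pretrained CNN to predict Galaxy Zoo-style
# vote fractions. This is NOT the Zoobot API, just a generic illustration
# of the freeze-backbone / replace-head transfer-learning recipe.
import torch
import torch.nn as nn
from torchvision import models

N_ANSWERS = 34  # assumed number of decision-tree answers (illustrative only)

backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False            # freeze the pretrained features
backbone.fc = nn.Sequential(               # replace the classification head
    nn.Linear(backbone.fc.in_features, N_ANSWERS),
    nn.Softmax(dim=1),                     # simplification: softmax over all answers
)

optimiser = torch.optim.Adam(backbone.fc.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()                     # simplified loss on vote fractions

def train_step(images, vote_fractions):
    """One optimisation step on a batch of (emulated Euclid) images."""
    optimiser.zero_grad()
    preds = backbone(images)
    loss = loss_fn(preds, vote_fractions)
    loss.backward()
    optimiser.step()
    return loss.item()
```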
- Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting
R. Green, G. Stevens, T. de Menezes e Silva Filho, Z. Abdallah
arXiv preprint
Keywords: time-series, multistep-forecasting, regression

Multi-step forecasting (MSF) in time series, the ability to make predictions multiple time steps into the future, is fundamental to almost all temporal domains. To make such forecasts, one must make assumptions about the recursive complexity of the temporal dynamics; these assumptions are referred to as the forecasting strategy used to train a predictive model. Previous work shows that it is not clear which forecasting strategy is optimal before evaluating on unseen data. Furthermore, current approaches to MSF use a single (fixed) forecasting strategy. In this paper, we characterise the instance-level variance of optimal forecasting strategies and propose Dynamic Strategies (DyStrat) for MSF. We experiment using ten datasets spanning different scales, domains, and lengths of multi-step horizons. When using a random-forest-based classifier, DyStrat outperforms the best fixed strategy, which is not knowable a priori, 94% of the time, with an average reduction in mean squared error of 11%. Our approach typically triples the top-1 accuracy compared to current approaches. Notably, we show that DyStrat generalises well to any MSF task.

BibTeX: @misc{green2024timeseries, title={Time-Series Classification for Dynamic Strategies in Multi-Step Forecasting}, author={Riku Green and Grant Stevens and Telmo de Menezes e Silva Filho and Zahraa Abdallah}, year={2024}, eprint={2402.08373} }
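The core idea described in the abstract is to treat strategy choice as a per-instance classification problem. The sketch below is a minimal, assumed illustration of that idea with scikit-learn, not the paper's implementation: two fixed strategies (recursive and direct) are trained, each training window is labelled with whichever strategy yields the lower error, and a random-forest classifier routes unseen windows to a strategy. All names (RecursiveStrategy, DirectStrategy, router) and settings such as the window length are illustrative, and a real setup would label windows with held-out rather than in-sample errors.

```python
# Hypothetical sketch of dynamic strategy selection for multi-step forecasting.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

WINDOW, HORIZON = 24, 6  # assumed window and forecast-horizon lengths

def make_windows(series):
    """Slice a 1-D series into (input window, multi-step target) pairs."""
    X, Y = [], []
    for i in range(len(series) - WINDOW - HORIZON):
        X.append(series[i:i + WINDOW])
        Y.append(series[i + WINDOW:i + WINDOW + HORIZON])
    return np.array(X), np.array(Y)

class RecursiveStrategy:
    """One-step model applied recursively over the horizon."""
    def fit(self, X, Y):
        self.model = RandomForestRegressor().fit(X, Y[:, 0])
        return self
    def predict(self, X):
        out = []
        for window in X:
            w, preds = list(window), []
            for _ in range(HORIZON):
                p = self.model.predict([w])[0]
                preds.append(p)
                w = w[1:] + [p]          # feed the prediction back in
            out.append(preds)
        return np.array(out)

class DirectStrategy:
    """One independent model per horizon step."""
    def fit(self, X, Y):
        self.models = [RandomForestRegressor().fit(X, Y[:, h]) for h in range(HORIZON)]
        return self
    def predict(self, X):
        return np.column_stack([m.predict(X) for m in self.models])

series = np.sin(np.arange(600) / 8) + 0.1 * np.random.randn(600)  # toy data
X, Y = make_windows(series)
strategies = [RecursiveStrategy().fit(X, Y), DirectStrategy().fit(X, Y)]

# Label each window with the strategy that minimises its squared error,
# then train a router; at test time the router picks a strategy per instance.
errors = np.stack([((s.predict(X) - Y) ** 2).mean(axis=1) for s in strategies])
router = RandomForestClassifier().fit(X, errors.argmin(axis=0))
chosen = router.predict(X[:5])  # indices into `strategies` for new windows
```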
2021
- AstronomicAL: an interactive dashboard for visualisation, integration and classification of data with Active Learning
G. Stevens, S. Fotopoulou, M.N. Bremer, O. Ray
Journal of Open Source Software
Keywords: active-learning, interactive, dashboard, software, galaxy-classification, classification

AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning. This technique prioritises data that offer high information gain, leading to improved performance using substantially less data. The system allows users to visualise and integrate data from different sources and to deal with incorrect or missing labels and imbalanced class sizes. AstronomicAL enables experts to visualise domain-specific plots and key information relating both to the broader context and to the details of a point of interest, drawn from a variety of data sources, ensuring reliable labels. In addition, AstronomicAL provides functionality to explore all aspects of the training process, including custom models and query strategies. This makes the software a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies. We illustrate the system using an astronomical dataset due to the field's immediate need; however, AstronomicAL has been designed for datasets from any discipline. Finally, by exporting a simple configuration file, entire layouts, models, and assigned labels can be shared with the community. This allows for complete transparency and ensures that the process of reproducing results is effortless.

BibTeX: @article{Stevens_2021, doi = {10.21105/joss.03635}, year = 2021, month = {sep}, publisher = {The Open Journal}, volume = {6}, number = {65}, pages = {3635}, author = {Grant Stevens and Sotiria Fotopoulou and Malcolm Bremer and Oliver Ray}, title = {{AstronomicAL}: an interactive dashboard for visualisation, integration and classification of data with Active Learning}, journal = {Journal of Open Source Software} }
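AstronomicAL itself is an interactive dashboard rather than a library call, but the active-learning loop it is built around can be sketched generically. The following is an assumed, minimal pool-based example using scikit-learn with least-confident uncertainty sampling; in AstronomicAL the query step is answered by a human expert through the dashboard rather than by reading labels from y_pool as done here.

```python
# Generic pool-based active-learning loop (illustrative only, not AstronomicAL code).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X_pool, y_pool = make_classification(n_samples=2000, n_features=10, random_state=0)
rng = np.random.default_rng(0)
labelled = list(rng.choice(len(X_pool), size=10, replace=False))  # small seed set

clf = RandomForestClassifier(random_state=0)
for _ in range(20):                           # 20 querying rounds
    clf.fit(X_pool[labelled], y_pool[labelled])
    probs = clf.predict_proba(X_pool)
    uncertainty = 1 - probs.max(axis=1)       # least-confident sampling
    uncertainty[labelled] = -1                # never re-query labelled points
    query = int(uncertainty.argmax())         # most informative unlabelled point
    labelled.append(query)                    # an expert would supply this label
```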