## Session 11 - Machine learning and statistical inference techniques

Enrico Camporeale (CWI, The Netherlands), Simon Wing (JHUAPL, USA & CWI, The Netherlands), Jay Johnson (PPPL, USA), Jacob Bortnik (UCLA, USA)
Thursday 17/11, 10:00-13:30
Mercator

The science of 'making predictions' has historically been based on statistical inference and, more recently, on machine learning techniques. Other areas that are concerned with predictions and that overlap in scope and methodology include system identification, data assimilation, information theory, signal processing, and uncertainty quantification. All of these disciplines have been studied and developed in contexts typically unrelated to Space Weather. However, the science behind Space Weather is becoming increasingly multidisciplinary, and the ease of accessing and processing large volumes of data makes these techniques very attractive for the Space Weather community. This session is devoted to contributions that use any of these approaches for Space Weather forecasting.

Poster Viewing
Thursday November 17, 10:00 - 11:00, Poster Area

Talks
Thursday November 17, 11:00 - 13:30, Mercator

11:00 Blind source separation for better tailored space weather products Dudok de Wit, T et al. Oral T. Dudok de Wit LPC2E, University of Orléans Many data products for space weather are derived from multiple observations made by networks of instruments (e.g. geomagnetic indices) or come from synoptic observations (e.g. sunspot observations). These are then combined or averaged in order to obtain an end product with desirable properties. Such combinations are often done empirically. Could we infer from the data (rather than impose) other combinations that would lead to better tailored products? Recently, blind source separation (BSS) has become a powerful concept for reducing such large ensembles of partly redundant observations into smaller subsets of “sources” that capture their salient features. Principal Component Analysis is just a very simple way of doing BSS. More advanced methods exist that allow specific constraints, such as positivity or sparsity, to be imposed on the sources in an effort to obtain physically meaningful solutions. To illustrate potential applications of BSS to space weather products, three examples will be addressed: 1) use synoptic radio observations at centimetric wavelengths to define a new quantity that is a better proxy for the solar UV flux than the usual F10.7 index; 2) use energetic electron fluxes recorded by ACE/EPAM in the solar wind to monitor the nature of the electron acceleration processes, and the hardness of the energy spectrum; 3) provide a roadmap for geomagnetic proxies that better describe specific current systems such as the ring current.

11:10 FLARECAST Prediction Algorithms: Machine-learning methods for flare prediction and feature selection Piana, M et al.
Oral Anna Maria Massone[1], Federico Benvenuto[2], Annalisa Perasso[1], Michele Piana[1,2], Kostas Florios[3], D Shaun Bloomfield[4,5] and the FLARECAST Team [1]CNR - SPIN, Genova, Italy; [2]Dipartimento di Matematica, Università di Genova, Italy; [3]Academy of Athens, Greece; [4]Trinity College Dublin, Ireland; [5]Northumbria University, Newcastle Upon Tyne, UK FLARECAST is an H2020 project with the aim to construct a technological platform realizing machine learning methods for flare prediction. Over the last decades, machine-learning methods have dramatically increased the capability and efficiency of information extraction from data. Rough taxonomies of this wide framework can be based on different aspects. For example, the distinction between supervised and unsupervised methods informs about the way methods learn, while the distinction between regression and classification refers to the ability of the methods to provide either real-valued or binary prediction outcomes. Alternatively, some methods stand out because, besides prediction, they also afford the capability of weighing the prediction power of the feature(s) they analyze. Here we discuss the problem of feature selection for machine-learning methods when applied to solar flare prediction. We compare the prediction capability of the various algorithms in the following two cases: i) catalogue data provided by the US National Oceanic and Atmospheric Administration (NOAA) Space Weather Prediction Center (SWPC); ii) magnetic measurements from the Solar Dynamics Observatory Helioseismic and Magnetic Imager (SDO/HMI), provided via the FLARECAST platform. Special focus will be given to a novel hybrid machine-learning method that combines the feature selection power of constrained regularization with the data-driven classification power of fuzzy clustering. This research was supported by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 640216 (FLARECAST project). 
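
As a rough illustration of what "feature selection through regularization" can mean in practice, the sketch below fits a logistic-regression classifier with an L1 penalty using proximal gradient descent. It is not the FLARECAST hybrid method described above: the toy data, the function names (`fit_l1_logistic`, `soft_threshold`) and the values of `lam`, `lr` and `steps` are all assumptions chosen for illustration. The first feature separates the two classes, while the second is constructed to carry no information about the label.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink towards zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, lr=0.5, steps=500):
    """Logistic regression with an L1 penalty on the weights (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        # Residuals p_i - y_i of the current model.
        resid = [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - yi
                 for x, yi in zip(X, y)]
        grad_w = [sum(r * x[j] for r, x in zip(resid, X)) / n for j in range(d)]
        grad_b = sum(resid) / n
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - lr * g, lr * lam) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

# Hypothetical toy data: column 0 is informative, column 1 is not.
X = [[-2.0, 1.0], [-1.5, -1.0], [-1.0, -1.0], [-0.5, 1.0],
     [0.5, 1.0], [1.0, -1.0], [1.5, -1.0], [2.0, 1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_l1_logistic(X, y)
print("weights:", w, "bias:", b)
```

In this constructed example the weight of the uninformative feature is driven to zero while the informative one keeps a positive weight, which is the sense in which a sparsity-inducing penalty both classifies and ranks features. Real flare data would need feature scaling and a held-out validation set to choose `lam`.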
11:20 Characterization of active regions' time evolution in view of solar flare prediction Attie, R et al. Oral Raphael Attie[1], Ruben De Visscher[2], Veronique Delouille[1] [1]Royal Observatory of Belgium; [2]Awingu NV We consider the problem of flare prediction, and more precisely of differentiating the photospheric evolution of active regions (AR) prone to produce strong flares from that of quieter ARs. Our approach summarizes the information contained in magnetogram and continuum images using so-called ‘predictors’ and feeds these into supervised classification algorithms. While such an approach has been largely exploited in the literature, there are a number of statistical and numerical issues related to it that need to be addressed properly. To tackle these issues we have built a new framework, called PREDISOL, with three related components: 1- A highly efficient database and management system, 2- a processing pipeline that automates: the remapping of solar images, the determination of regions of interest surrounding and co-rotating with the ARs as far as the limb, and the computation of predictors in those regions of interest. The current predictors in the pipeline are related to AR size and statistics of the magnetic field and neutral line. 3- A supervised classification of ARs based on the time series of predictors. The catalogues of ARs and flare events combined in PREDISOL cover the SOHO-MDI epoch 1996 - 2012. In order to augment the set of physical predictors, we implemented a set of modules that interact with the first two components of PREDISOL and exploit a tracking algorithm known as “Magnetic Balltracking” to characterize in a precise way the time evolution of ARs. We present some case studies and show how they enable a more in-depth analysis of the evolution of the magnetic flux in active regions, and hence should lead to the definition of more informative predictors of solar flaring activity.
11:30 Solar Flare Forecasting from Magnetic Feature Properties Generated by the SMART Algorithm Bloomfield, D et al. Oral D. Shaun Bloomfield[1,2], Katarina Domijan[3], Peter T. Gallagher[1] [1]Trinity College Dublin, College Green, Dublin 2, Ireland; [2]Northumbria University, Newcastle Upon Tyne, NE1 8ST, UK; [3]National University of Ireland Maynooth, Maynooth, Co. Kildare, Ireland Solar flare forecasting has historically used categorical classifications of active regions from ground-based data, such as the white-light structural McIntosh scheme. Recent satellite missions have provided a continuous series of magnetogram images from which several characteristics can be derived. Here, we study the predictive capabilities for flare forecasting of multiple magnetic feature (MF) properties extracted by the SolarMonitor Active Region Tracking (SMART) algorithm from SOHO/MDI data. We use a simple feature selection method to identify the most useful SMART MF properties for separating flaring/non-flaring detections and a well-known linear classifier (logistic regression) to derive classification rules to predict flaring/non-flaring classes of future SMART MF observations. We obtain significantly better results than those previously published on the same data set with a neural network approach. In addition, we separately analyze a subset of these MF detections that are matched to NOAA active regions, which is more comparable to data sets published in the literature. This research was part-supported by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 640216 (FLARECAST project).

11:40 Application of data assimilation techniques to heliospheric modelling: two preliminary studies Innocenti, M et al. Invited Oral M. E. Innocenti[1], G. Lapenta[1], B. Vrsnak[2], C. Skandrani[3], M. Temmer[4], A. Veronig[4], L. Bettarini[5], S. Markidis[6] and M.
Skender[7] [1]Center for mathematical Plasma Astrophysics, University of Leuven (KULeuven), Leuven, Belgium; [2]Faculty of Geodesy, Hvar Observatory, Zagreb, Croatia; [3]Noveltis, Ramonville-Saint-Agne, France; [4]Institute of Physics, University of Graz, Graz, Austria; [5]Solar-Terrestrial Center of Excellence-SIDC, Royal Observatory of Belgium, Brussels, Belgium; [6]Department of Computational Science and Technologies, KTH Royal Institute of Technology, Stockholm, Sweden; [7]Istituto Nazionale di Astrofisica, Osservatorio Astronomico di Capodimonte, Napoli Data assimilation techniques are a way to obtain a better estimate of the state of a system by combining modelling (e.g., simulations) and measurements of relevant quantities. Data assimilation methods are routinely used in fields such as meteorology, ionospheric modelling, radiation belt dynamics, and oceanic studies, where a variety of observations are available. Their application to heliospheric or solar modelling is still in its infancy. We present here two preliminary studies. In Innocenti et al., 2011, Kalman filtering techniques are applied to an empirical solar wind forecasting model [Vrsnak et al., 2007]. It is shown that Kalman filtering can improve the quality of the forecasts and extend the period of applicability of the baseline model. In a subset of cases, some degree of robustness towards solar transient activity (i.e. Coronal Mass Ejections, CMEs) not accounted for in the original model is also provided. This is a very significant point since background solar wind models struggle with the inclusion of CMEs in the simulations. In Skandrani et al., 2014, the representer technique is used to assess how process and model state errors propagate in a magnetohydrodynamic (MHD) code, FLIP-MHD, used for the simulation of solar wind propagation from the source surface to Earth.
The aim is to understand the impact of source surface input parameters on the evolution of MHD heliospheric models and the potential of data assimilation techniques in solar wind forecasting. The representer technique allows one to understand how far from the observation point the improvement gained from the assimilation of a measurement propagates.

Skandrani, C., Innocenti, M., Bettarini, L., Crespon, F., Lamouroux, J., and Lapenta, G. (2014). FLIP-MHD-based model sensitivity analysis. Nonlinear Processes in Geophysics, 21(2):539–553.

Innocenti, M., Lapenta, G., Vrsnak, B., Crespon, F., Skandrani, C., Temmer, M., Veronig, A., Bettarini, L., Markidis, S., and Skender, M. (2011). Improved forecasts of solar wind parameters using the Kalman filter. Space Weather, 9(10).

11:50 Solar wind types from a machine learning point of view Heidrich-Meisner, V et al. Oral Verena Heidrich-Meisner, Robert F. Wimmer-Schweingruber Department of Extraterrestrial Physics, Christian-Albrechts Universität zu Kiel The steady solar wind is typically divided into two main categories, often called slow and fast solar wind. However, these names are misleading since the solar wind speed does not represent the decisive difference between them. Nevertheless, as a simple approximation the solar wind speed is still frequently used as the sole parameter to distinguish between solar wind types. More sophisticated approaches base the distinction on elemental and charge state composition of the solar wind plasma (see for example Zurbuchen et al. (2002) and Zhao et al. (2009)). A qualitatively different approach is employed in Xu & Borovsky (2015). Here, the categorization of solar wind plasma is based on decision boundaries in a 3-dimensional space of typically observed solar wind parameters, i.e. the Alfvén speed, the proton specific entropy and the proton temperature compared to a velocity-dependent expected temperature.
In both cases, the decision boundaries are defined heuristically and require expert knowledge and experience. Although this ensures that the resulting categories are physically plausible, it can also introduce a perception bias. In order to investigate this effect we here adopt a machine learning perspective and apply clustering and supervised learning methods to solar wind plasma data measured by the Solar Wind Ion Composition Spectrometer (SWICS), the Solar Wind Electron, Proton, and Alpha Monitor (SWEPAM), and the Magnetometer (MAG) on the Advanced Composition Explorer (ACE). The resulting solar wind characterizations are then compared to the aforementioned existing solar wind categorization schemes.

12:00 5' Break

12:05 NARMAX approach to the Space Weather forecast: results and capabilities Balikhin, M et al. Invited Oral Michael A. Balikhin, Richard J. Boynton, Simon N. Walker The University of Sheffield, UK Methods developed in the field of System Science are effective tools in the studies of complex physical objects for which models based on first principles are not developed yet. Among the various System Science techniques NARMAX (Nonlinear Autoregressive Moving Average Model with eXogenous inputs) is one of the most powerful and widely used methods. In this presentation we will review the NARMAX methodology and the results of its application to the modelling and forecast of Space Weather parameters. We will also discuss how the NARMAX capabilities can be used further to address important problems in the Space Weather field. Finally, it is shown how the application of NARMAX can advance our physical knowledge about magnetospheric processes.

12:20 Information theoretical approach to discovering solar wind drivers of the outer radiation belt Wing, S et al.
Oral Simon Wing[1], Enrico Camporeale[2], Jay Johnson[3], Geoffrey Reeves[4] [1]The Johns Hopkins University; [2]Center for Mathematics and Computer Science (CWI); [3]Princeton University; [4]Los Alamos National Laboratory The solar wind–magnetosphere system is nonlinear. The solar wind drivers of geosynchronous electrons with energy range of 1.8–3.5 MeV are investigated using mutual information (MI), conditional mutual information (CMI), and transfer entropy (TE). These information theoretical tools can establish linear and nonlinear relationships as well as information transfer. The information transfer from solar wind velocity (Vsw) to geosynchronous MeV electron flux (Je) peaks with a lag time (τ) of 2 days. As previously reported, Je is anticorrelated with solar wind density (nsw) with a lag of 1 day. However, this lag time and anticorrelation can be attributed mainly to the Je(t + 2 days) correlation with Vsw(t) and nsw(t + 1 day) anticorrelation with Vsw(t). Analyses of solar wind driving of the magnetosphere need to consider the large lag times, up to 3 days, in the (Vsw, nsw) anticorrelation. Using CMI to remove the effects of Vsw, the response of Je to nsw is 30% smaller and has a lag time < 24 hr, suggesting that the MeV electron loss mechanism due to nsw or solar wind dynamic pressure has to operate in < 24 hr. nsw transfers about 36% as much information as Vsw (the primary driver) to Je. Nonstationarity in the system dynamics is investigated using windowed TE. When the data are ordered according to high or low transfer entropy, it is possible to understand details of the triangle distribution that has been identified between Je(t + 2 days) vs. Vsw(t).

12:30 Machine Learning in Radiation Belt Physics Shprits, Y et al. Invited Oral Yuri Shprits[1,2], Irina Zhelavskaya[1,2], Maria Spasojevic[3] [1]GFZ, Potsdam; [2]UCLA; [3]Stanford University New methods of data analysis are discussed in this presentation.
In particular we discuss machine learning using the example of the Neural network for Upper-hybrid Resonance Detection (NURD), which was trained to infer the plasma density from Van Allen Probes spectrograms. We demonstrate that we can accurately infer the values for plasma density for the entire Van Allen Probes mission. This data set can be used in the future to train the new generation of plasma density models. Other applications of Neural Networks for Radiation Belt prediction and understanding are discussed and illustrated.

12:45 On the use of the local ensemble transform Kalman filter (LETKF) for ionospheric data assimilation Angling, M et al. Oral M J Angling, S Elvidge Space Environment and Radio Engineering Group, University of Birmingham, UK The local ensemble transform Kalman filter (LETKF) is an ensemble Kalman filter variant first described by Hunt et al. [1]. The LETKF combines the ensemble transform Kalman filter (ETKF) [2] with the local ensemble Kalman filter (LEKF) [3]. The localization in the LEKF allows the analysis to be performed around each grid point and in parallel. The individual analyses are then combined to form the global analysis. The ETKF uses ensemble perturbation matrices, where the ensemble mean (or some other control) is removed from each ensemble member. The distance from the control to the ensemble member provides information about the spread of the ensemble, from which one can estimate the model covariances. The LETKF results are equivalent to the LEKF [4] results but are calculated in a more efficient manner, similar to the ETKF. The Advanced European electron density (Ne) Assimilation System (AENeAS) is a new ionosphere/thermosphere LETKF assimilation model being developed at the University of Birmingham, UK. AENeAS uses the Thermosphere Ionosphere Electrodynamics General Circulation Model (TIE-GCM) [5] as its background model. This is extended to GPS altitudes using the NeQuick [6] topside.
The paper will describe the use of the LETKF in AENeAS, results from current testing and plans for its future development.

References

1. Hunt, B.R. et al. Efficient data assimilation for spatiotemporal chaos: A local ensemble transform Kalman filter. Physica D: Nonlinear Phenomena, 2007, 230 (1-2), pp. 112–126. DOI: 10.1016/j.physd.2006.11.008
2. Bishop, C.H. et al. Adaptive Sampling with the Ensemble Transform Kalman Filter. Part I: Theoretical Aspects. Monthly Weather Review, 2001, 129, pp. 420–436.
3. Ott, E. et al. A local ensemble Kalman filter for atmospheric data assimilation. Tellus, 2004, 56, pp. 415–428.
4. Ott, E. et al. Exploiting Local Low Dimensionality of the Atmospheric Dynamics for Efficient Ensemble Kalman Filtering. 2002.
5. Richmond, A.D. et al. A thermosphere/ionosphere general circulation model with coupled electrodynamics. Geophysical Research Letters, 1992, pp. 601–604.
6. Nava, B. et al. A new version of the NeQuick ionosphere electron density model. Journal of Atmospheric and Solar Terrestrial Physics, 2008, 70 (15), pp. 1856–1862. DOI: 10.1016/j.jatsp.2008.01.015

12:55 Development of new geomagnetic index forecasts using the Markov Chain method Jackson, D et al. Oral Edward Pope[1], David Stephenson[2], David Jackson[1] and Suzy Bingham[1] [1]Met Office; [2]University of Exeter Forecasts of geomagnetic indices such as Kp and ap are very important in space weather since, together with solar index information, they can be used to drive forecast models of the radiation belts, magnetosphere, thermosphere and ionosphere. These indices act as a proxy for space weather effects which would otherwise have to be represented by coupling the above models with solar and heliospheric models, which is both computationally expensive and limited by our incomplete knowledge of the relevant physical processes and coupling mechanisms.
A range of statistical methods has been used to forecast geomagnetic indices, including neural networks, autoregressive and moving average approaches. Here, we present a new approach based on Markov chains. The Markov chain model considers current conditions and uses a matrix of transition probabilities to predict the future state. As such the approach can be well suited to indices such as Kp, which often show abrupt increases at the onset of geomagnetic storms, followed by a gradual return to lower values. Here we present initial results, including a discussion of optimal methods of calculating and updating the transition probabilities and a comparison of the results against reference forecasts (such as climatology).

13:05 An application of machine learning to geomagnetic index prediction: aiding human space weather forecasting Billingham, L et al. Oral Laurence Billingham, Gemma Kelly British Geological Survey Geomagnetic indices are ubiquitous parameterizations of storm-time magnetic conditions. Their prediction is one goal of space weather forecasting and they are required inputs for a variety of models. Despite much recent progress, human space weather forecasters, unlike terrestrial weather forecasters, cannot yet rely on predictions from physical models able to capture the relevant physics of processes across the whole system. Empirical models derived through machine learning algorithms can prove useful forecaster aids, providing robust, accurate, and fast predictions of global geomagnetic disturbances. The Kp index is a pervasive quantification of geomagnetic disturbance level; it maps to other measures of activity such as the NOAA G-scale and the ap index. We build models predicting Kp, ap, or G-scale with lead times of up to 24 hours. We have trained a variety of models based on different features of solar, solar wind and geomagnetic data.
Comparing regression and classification algorithms, we show some potential advantages of categorical models, even when target values might be treated as real numbers susceptible to regression. We investigate two broad classes of model: ones that partition their parameter space and make a series of local fits; and models which produce a single optimal fit across their global parameter space. The differing feature importances between local and global models might provide insight into the loading and dissipation processes in the coupled solar wind-magnetosphere-ionosphere system. Index prediction presents a number of challenges for statistical models. Potentially infrastructure-damaging large storms are the most important events to predict. However, distributions of geomagnetic activity are positively skewed with very heavy tails: these large events are rare and algorithms tend to treat them as noise. We present strategies for dealing with the rarity of large storms, both in predicting them accurately and in being able to quantify a model’s large-storm predictive power. We also demonstrate schemes for data cleaning and assimilation including the integration of disparate data types within a single model. Our trained models are compared to one another and against an existing operational model. In many cases we show that the machine learning methods can predict storms better than the existing approach.

13:15 Gaussian Process Models for Prediction of the Dst Index. Chandorkar, M et al. Oral Mandar Chandorkar[1], Enrico Camporeale[1], Simon Wing[2] [1]Center for Mathematics and Computer Science (CWI), Amsterdam; [2]Johns Hopkins University, Applied Physics Lab In this talk we train Gaussian Process regression models for One Step Ahead (OSA) prediction of the Disturbance Storm Time ($Dst$) geomagnetic index. We propose two variants of the Gaussian Process model.
(1) Gaussian Process Auto Regressive (GP-AR); (2) Gaussian Process Auto Regressive with eXogenous inputs (GP-ARX). We compare the performance of these models with the current state of the art in one step ahead $Dst$ prediction on a set of 63 benchmark storms from 1998-2006.
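
To make the idea of an autoregressive Gaussian Process concrete, here is a minimal sketch of one-step-ahead prediction in which the inputs are the previous `p` values of a series. It is not the authors' model: the squared-exponential kernel, the fixed hyperparameters (`length`, `noise`), the function name `gp_ar_predict` and the synthetic sine series standing in for $Dst$ are all assumptions made for illustration.

```python
import math

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between two lag vectors (unit signal variance).
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    # Lower-triangular factor L with A = L L^T (A symmetric positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def chol_solve(L, b):
    # Solve (L L^T) x = b by forward then backward substitution.
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_ar_predict(series, p=2, noise=1e-2, length=1.0):
    """Posterior mean of the next value given the last p values (GP-AR sketch)."""
    # Training pairs: lag vector [x_{t-p}, ..., x_{t-1}] -> target x_t.
    X = [series[t - p:t] for t in range(p, len(series))]
    y = [series[t] for t in range(p, len(series))]
    mean_y = sum(y) / len(y)
    K = [[rbf(a, c, length) + (noise if i == j else 0.0)
          for j, c in enumerate(X)] for i, a in enumerate(X)]
    alpha = chol_solve(cholesky(K), [v - mean_y for v in y])
    x_star = series[-p:]
    k_star = [rbf(x_star, a, length) for a in X]
    return mean_y + sum(k * a for k, a in zip(k_star, alpha))

# Synthetic stand-in for an index time series (assumption, not real Dst data).
series = [math.sin(0.3 * t) for t in range(40)]
pred = gp_ar_predict(series, p=2)
print("predicted:", pred, "true:", math.sin(0.3 * 40))
```

Under the same assumptions, a GP-ARX variant would append lagged exogenous inputs (for example solar wind measurements) to each lag vector before evaluating the kernel. In practice the kernel hyperparameters would be fitted to data rather than fixed by hand.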