Journal of the Korean Astronomical Society - Vol. 52 , No. 4

Journal of the Korean Astronomical Society - Vol. 52, No. 4, pp.133-144
Abbreviation: JKAS
ISSN: 1225-4614 (Print) 2288-890X (Online)
Print publication date 31 Aug 2019
Received 24 Apr 2019 Accepted 15 Jul 2019
DOI: https://doi.org/10.5303/JKAS.2019.52.4.133

1School of Space Research, Kyung Hee University, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea (dalim@khu.ac.kr)(moonyj@khu.ac.kr)
2Korea Astronomy and Space Science Institute, 776, Daedeokdae-ro, Yuseong-gu, Daejeon, 34055, Korea
3Electronics and Telecommunications Research Institute, 218, Gajeong-ro, Yuseong-gu, Daejeon, 34129, Korea
4Department of Astronomy & Space Science, Kyung Hee University, 1732, Deogyeong-daero, Giheung-gu, Yongin-si, Gyeonggi-do 17104, Korea

Correspondence to : Y.-J. Moon

Published under Creative Commons license CC BY-SA 4.0


We develop forecast models of daily probabilities of major flares (M- and X-class) based on empirical relationships between photospheric magnetic parameters and daily flaring rates from May 2010 to April 2018. In this study, we consider ten magnetic parameters characterizing the size, distribution, and non-potentiality of vector magnetic fields from the Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), together with Geostationary Operational Environmental Satellites (GOES) X-ray flare data. The magnetic parameters are classified into three types: the total unsigned parameters, the total signed parameters, and the mean parameters. We divide the data into two sets chronologically: 70% for training and 30% for testing. The empirical relationships between the parameters and flaring rates are used to predict flare occurrence probabilities for a given magnetic parameter value. Major results of this study are as follows. First, major flare occurrence rates are well correlated with the ten parameters, with correlation coefficients above 0.85. Second, logarithmic values of flaring rates are well approximated by linear equations. Third, models using the total unsigned and signed parameters perform better in predicting flares than those using the mean parameters, in terms of verification measures of probabilistic and converted binary forecasts. We conclude that the total quantity of non-potentiality of magnetic fields is crucial for flare forecasting among the magnetic parameters considered in this study. When the model is applied operationally, it can use the data at 21:00 TAI, with a slight underestimation of 2-6.3%.

Keywords: Sun: activity, Sun: flares, Sun: magnetic fields


Solar flares rapidly release a tremendous amount of energy in the form of electromagnetic radiation, high energy particles, and shock waves, which result in space weather hazards. In the space age, precautions against economic risk by predicting solar flare occurrence are essential (Tsurutani et al. 2005; Schwenn 2006; Bocchialini et al. 2018).

Most solar energetic events such as solar flares explode out of active regions (ARs), which are areas of complex and intense photospheric magnetic fields. Thus, characteristics of solar ARs are considered to be closely related to solar flares. Several different types of AR information have been used for solar flare forecasts. Many studies have considered morphological characteristics of ARs such as size, shape and complexity using Mount Wilson magnetic classification and McIntosh classification (Hale et al. 1919; McIntosh 1990; Bornmann & Shaw 1994; Sammis et al. 2000; Gallagher et al. 2002; McAteer et al. 2005; Qahwaji & Colak 2007; Li et al. 2008; Colak & Qahwaji 2009; Bloomfield et al. 2012; Lee et al. 2012; Li & Zhu 2013; Lee et al. 2016; McCloskey et al. 2016; Shin et al. 2016). In addition, various magnetic parameters characterizing distribution and non-potentiality of ARs from magnetograms have been studied for flare forecasting (Leka & Barnes 2003a,b; Cui et al. 2006; Leka & Barnes 2007; Barnes et al. 2007; Schrijver 2007; Yu et al. 2009; Yuan et al. 2010; Falconer et al. 2011; Ahmed et al. 2013; Huang & Wang 2013; Falconer et al. 2014; Bobra & Couvidat 2015; Barnes et al. 2016; Nishizuka et al. 2017; Raboonik et al. 2017; Liu et al. 2017; Nishizuka et al. 2018; Huang et al. 2018; Leka et al. 2018; Park et al. 2018). Magnetic parameters have been also calculated by magnetohydrodynamic (MHD) simulations and used for predicting flares (Aulanier et al. 2013; Guennou et al. 2017; Toriumi & Takasao 2017).

There have been many studies on flare probability based on AR characteristics and flare probability forecasting models. Giovanelli (1939) examined the probabilities of a solar eruption in relation to a sunspot group's size, type, and development. Gallagher et al. (2002) developed a flare prediction system using the McIntosh classification, which gives daily flare probabilities based on Poisson statistics. Leka & Barnes (2003a) considered magnetic parameters such as the vertical current, the current helicity, the twist parameter α, and the magnetic shear angles for discriminating between flaring and flare-quiet ARs, which were examined in a series of papers (Leka & Barnes 2003b; Barnes & Leka 2006; Leka & Barnes 2007). To compute these magnetic parameters, they first used vector magnetograms from the Imaging Vector Magnetograph (IVM) at the University of Hawai'i Mees Solar Observatory. As a forecast result of their studies, Barnes et al. (2007) provided flare probabilities using Bayes's theorem. Falconer et al. (2011) found an empirical relationship between flare event rates and a proxy of magnetic free energy based on line-of-sight magnetograms from the Solar and Heliospheric Observatory (SOHO; Domingo 1995)/Michelson Doppler Imager (MDI; Scherrer et al. 1995). They predicted flare probabilities from a power law relationship between M- and X-class flare occurrence rates and the free magnetic energy proxy. Bloomfield et al. (2012) determined the Poisson flare probabilities from McIntosh sunspot classes. Among a variety of forecast verification scores, they proposed the true skill statistic (TSS) as a standard score for comparing flare forecasts. They also presented optimum thresholds to convert probabilities into binary predictions and forecast verification measures using these thresholds. Lee et al. (2012) also reported Poisson flare occurrence probabilities for the McIntosh classification. They found that these flare probabilities tend to increase with AR area, and that flaring probabilities for ARs with increasing area are higher than those for ARs with steady or decreasing area.

The Solar Dynamics Observatory (SDO; Pesnell et al. 2012) was launched in 2010, and the Helioseismic and Magnetic Imager (HMI), one of the three instruments on SDO, provides full-disk photospheric vector magnetic fields with a 12-minute cadence (Scherrer et al. 2012; Schou et al. 2012; Hoeksema et al. 2014).

The HMI team developed a set of derivative data called Space-weather HMI Active Region Patches (SHARPs) data (Bobra et al. 2014). These data contain automatically identified HMI Active Region Patches (HARPs) and magnetic parameters that summarize the size, distribution, and non-potentiality of the vector magnetic fields in each HARP; these parameters have been adapted from numerous studies (Leka & Barnes 2003a; Schrijver 2007; Fisher et al. 2012). Machine learning algorithms have been applied to SHARP data for binary flare forecasting (Bobra & Couvidat 2015; Nishizuka et al. 2017; Raboonik et al. 2017; Liu et al. 2017; Nishizuka et al. 2018). Probabilistic forecasting has been performed by Kontogiannis et al. (2017) and Leka et al. (2018) using Bayesian probabilities. Although there have been many categorical forecasts using vector magnetic fields, probabilistic forecasts have rarely been considered. SHARP magnetic parameters have also been ranked by their performance in discriminating between flaring and non-flaring events based on machine learning algorithms (Bobra & Couvidat 2015; Liu et al. 2017).

In this paper, we study empirical relationships between photospheric SHARP magnetic parameters and daily major flaring rates. Furthermore, we will develop forecast models of daily probability of major flares (M- and X-class) based on these empirical relationships. We will also identify the rank of SHARP magnetic parameters examining the performance of the models developed from the parameters. In general, the occurrence probability could present more continuous information on potentiality of flare occurrence than the binary (flaring/non-flaring) forecasts. Furthermore, probability forecasts can be converted into binary forecasts using proper threshold values (Colak & Qahwaji 2009; Crown 2012; Bloomfield et al. 2012; Park et al. 2017; Murray et al. 2017; Leka et al. 2018).

The paper is organized as follows. Section 2 provides a detailed description of the data. The empirical relationships using our model are explained in detail in Section 3. Forecasting models and their results are presented in Section 4. We summarise and conclude with a discussion in Section 5.

2.1. Geostationary Operational Environmental Satellites (GOES) X-ray flares

GOES have measured solar X-rays in the passbands of 1 - 8 Å and 0.5 - 4 Å. Solar X-ray flares are classified according to the peak flux of X-rays with wavelength bands 1 to 8 Å as measured by GOES. We use GOES major (M- and X-class) X-ray flare data from May 2010 to April 2018 with their locations being identified by the Lockheed Martin Solar and Astrophysics Laboratory (LMSAL).1 Our data include 448 M-class flares and 27 X-class flares.

2.2. SDO/HMI and Magnetic Parameters

SHARP magnetic parameters have been used for flare forecasting based on machine learning algorithms (Bobra & Couvidat 2015; Liu et al. 2017). Among the SHARP parameters, we consider the ten parameters whose linear Pearson correlation coefficients (CCs) with flaring rates are higher than 0.85, as described in Section 3. The ten parameters are classified into three types: the total unsigned parameters (TOTUSJH HC,total, TOTUSJZ Jz,total, TOTPOT ρtot, and USFLUX Φ), the total signed parameters (SAVNCPP Jz,sum and ABSNJZH HC,abs), and the mean parameters (MEANPOT ρ, SHRGT45 Ashear, MEANSHR Γ, and MEANGAM γ). The descriptions and formulas of these parameters are listed in Table 1.

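Table 1
Descriptions and formulas of ten SHARP magnetic parameters. Constants have been omitted.

| Keyword | Description | Formula |
|---|---|---|
| TOTUSJH | Total unsigned current helicity | $H_{C,\mathrm{total}} = \sum \lvert B_z \cdot J_z \rvert$ |
| TOTUSJZ | Total unsigned vertical current | $J_{z,\mathrm{total}} = \sum \lvert J_z \rvert \, dA$ |
| TOTPOT | Total photospheric magnetic free energy density | $\rho_{\mathrm{tot}} = \sum \left( \mathbf{B}^{\mathrm{obs}} - \mathbf{B}^{\mathrm{pot}} \right)^2 dA$ |
| USFLUX | Total unsigned magnetic flux | $\Phi = \sum \lvert B_z \rvert \, dA$ |
| SAVNCPP | Sum of the net current emanating from each polarity | $J_{z,\mathrm{sum}} = \left\lvert \sum^{B_z^{+}} J_z \, dA \right\rvert + \left\lvert \sum^{B_z^{-}} J_z \, dA \right\rvert$ |
| ABSNJZH | Absolute value of the net current helicity | $H_{C,\mathrm{abs}} = \left\lvert \sum B_z \cdot J_z \right\rvert$ |
| MEANPOT | Mean photospheric magnetic free energy density | $\bar{\rho} = \frac{1}{N} \sum \left( \mathbf{B}^{\mathrm{obs}} - \mathbf{B}^{\mathrm{pot}} \right)^2$ |
| SHRGT45 | Fractional area with shear > 45° | $A_{\mathrm{shear}} = (\text{area with shear} > 45^{\circ}) / (\text{HARP area})$ |
| MEANSHR | Mean shear angle | $\bar{\Gamma} = \frac{1}{N} \sum \arccos \left( \frac{\mathbf{B}^{\mathrm{obs}} \cdot \mathbf{B}^{\mathrm{pot}}}{\lvert B^{\mathrm{obs}} \rvert \, \lvert B^{\mathrm{pot}} \rvert} \right)$ |
| MEANGAM | Mean angle of field from radial | $\bar{\gamma} = \frac{1}{N} \sum \arctan \left( \frac{B_h}{B_z} \right)$ |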

We use 00:00 TAI definitive HARPs in cylindrical equal area (CEA) coordinates (hmi.sharp_720s_cea data series) whose longitudes are within ± 60 degrees of the central meridian, together with the corresponding ten magnetic parameters, from the Joint Science Operations Center (JSOC).2 According to Hoeksema et al. (2014), the number of high-confidence pixels in SHARP data decreases significantly beyond ± 60 degrees of the central meridian. In this study, a HARP is regarded as the unit region in which flares occur.


To develop a solar flare occurrence probability forecasting model, we need relationships between ten magnetic parameters and major flare occurrence rates. For identifying these relationships, we need a data set of major flare occurrence history and corresponding ten parameter values for each HARP.

3.1. Flare Identification

We identify each HARP at 00:00 TAI that produced one or more major flares within a day by using the flare locations corrected for differential rotation rates. When a flare event is located in a HARP's box, the flare event is considered to have occurred in that HARP. We assume that all flare events are independent of one another.

3.2. Data Set

Our data are divided into two sets (training and test) in chronological order. The first 70% of the data, HARPs from 1 May 2010 to 20 April 2015 covering the ascending and maximum phases of solar cycle (SC) 24, are used for finding the relationships between parameters and flaring rates. The remaining 30% of the data, HARPs from 21 April 2015 to 30 April 2018 covering part of the descending phase of SC 24, are used for testing them. Accordingly, the training data consist of 11040 samples (1889 different HARPs) and the test data consist of 4724 samples (898 different HARPs). The training data consist of 224 event samples and 10816 non-event samples. The test data consist of 38 event samples and 4686 non-event samples. The sample ratio is as unbalanced as in previous major flare forecasting models, because major flares are rare events and solar cycle 24 is unprecedentedly quiet.

3.3. Major Flare Occurrence Rates as a Function of Magnetic Parameters

We want to identify relationships between ten magnetic parameters and major flare occurrence rates. For each parameter, we divide our data into 50 subgroups with equal number of HARPs. Then, we determine the average parameter value and the number of major flares within a day from each daily HARP in each subgroup. The mean major flare occurrence rate (Ri) of i-th group (Gi) is given by

$$R_i = \frac{\text{\# of major flares of } G_i}{\text{\# of HARPs of } G_i}. \tag{1}$$
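The grouping step and Equation (1) can be illustrated with a minimal sketch. It assumes that each daily HARP sample is given as a parameter value and a count of major flares within the following day, and that groups are formed by sorting on the parameter value; the function name `binned_flare_rates` and the input layout are hypothetical and are not taken from the paper.

```python
def binned_flare_rates(param_values, flare_counts, n_bins=50):
    """Split samples into n_bins groups of (nearly) equal size after sorting
    by parameter value. Returns a list of (mean parameter value, R_i), where
    R_i is the number of major flares divided by the number of HARP samples
    in group i."""
    pairs = sorted(zip(param_values, flare_counts))
    n = len(pairs)
    groups = []
    for i in range(n_bins):
        # Integer arithmetic keeps the group sizes within one sample of each other.
        chunk = pairs[i * n // n_bins:(i + 1) * n // n_bins]
        if not chunk:
            continue  # happens only when there are fewer samples than bins
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        rate = sum(c for _, c in chunk) / len(chunk)
        groups.append((mean_x, rate))
    return groups
```

With 11040 training samples and 50 groups, each group would hold about 220 samples under this scheme.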

In order to find relationships, these rates are plotted as a function of each parameter on log-log scales. Just as Falconer et al. (2011) presented a power law function relating major flare occurrence rates to their parameter from line-of-sight magnetic fields, our models also consider a power law function of each parameter, as shown in Figure 1. The occurrence rates range from 0.001 to about 1 for all parameters. Following Falconer et al. (2011), only flaring rates above 0.01 are used, in order to obtain more accurate fitting functions. The fitting function is

$$\log R = a \log x + b, \tag{2}$$

where R is a mean flare occurrence rate, x is a mean parameter value of a group, a is a power law slope, and b is a fitting constant. To examine the dependence on the binning size, we also consider three other cases of binning size = 10, 20, 100. Power law functions fit the data well for all four cases and their differences are very small. Thus, we use the power law function with binning size = 50; its fitting coefficients and uncertainties for the ten SHARP parameters are shown in Table 2.

Figure 1.
Major flare occurrence rates versus ten magnetic parameters in log-log scale. The occurrence rate of each group is indicated by diamond symbols. Vertical error bars represent the uncertainty of the occurrence rate of each group ($\sigma = \sqrt{R(1-R)/N}$). Horizontal error bars represent the root mean square error (RMSE) between parameter values of each group and its mean value. Red lines correspond to power law fits. Data with flare rates above 0.01 (blue dashed line) are used in the fitting procedure.
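A power law fit of this kind can be sketched as an ordinary least-squares line in log-log space. The sketch assumes base-10 logarithms and an unweighted fit of log R against log x; the text does not state the fitting method or the logarithm base, so both are assumptions, and the function name `fit_power_law` is hypothetical.

```python
import math

def fit_power_law(mean_params, rates, min_rate=0.01):
    """Fit log10(R) = a * log10(x) + b by unweighted least squares,
    using only groups whose flaring rate is above min_rate."""
    pts = [(math.log10(x), math.log10(r))
           for x, r in zip(mean_params, rates)
           if r > min_rate and x > 0]
    n = len(pts)
    if n < 2:
        raise ValueError("need at least two groups above min_rate")
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    s_uu = sum((u - mean_u) ** 2 for u, _ in pts)
    s_uv = sum((u - mean_u) * (v - mean_v) for u, v in pts)
    a = s_uv / s_uu          # power law slope
    b = mean_v - a * mean_u  # fitting constant (intercept in log space)
    return a, b
```

Groups with rates at or below the cut are dropped before fitting, which mirrors the 0.01 cut described in the text.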

Table 2 
Correlations between flare occurrence rates and magnetic parameters
Parameter CC a b RMSE
HC,total 0.91 1.61 ± 0.003 −6.34 ± 0.012 0.25
Jz,total 0.95 1.56 ± 0.003 −22.21 ± 0.045 0.17
ρtot 0.87 1.11 ± 0.002 −27.25 ± 0.058 0.27
Φ 0.86 1.34 ± 0.004 −30.95 ± 0.08 0.27
Jz,sum 0.90 1.16 ± 0.002 −16.0 ± 0.029 0.24
HC,abs 0.86 0.98 ± 0.002 −3.31 ± 0.004 0.32
ρ 0.90 1.81 ± 0.006 −8.1 ± 0.023 0.22
Ashear 0.88 1.85 ± 0.006 −3.87 ± 0.009 0.24
Γ 0.86 4.64 ± 0.017 −8.46 ± 0.026 0.24
γ 0.86 4.78 ± 0.013 −9.14 ± 0.021 0.27
Correlation coefficients (CCs) between major flare occurrence rates and ten magnetic parameters, their fitting coefficients, and root mean square errors (RMSEs) between flaring rates and fitting lines.

In Table 2, the CCs between flaring rates and ten SHARP parameters are high (above 0.85), implying that they are well correlated with each other. Among the ten parameters, the total unsigned vertical current has the minimum root mean square error (RMSE) between the power law fitting function and its original value (Figure 1a and 1b).

4.1. Forecast Models Based on Empirical Relationships between Parameters and Flaring Rates

We develop forecasting models of daily probability of major flares using the relationships between ten magnetic parameters and daily flaring rates. For a given x, we can predict a major flare occurrence rate from the empirical fitting functions. From these predicted rates, we calculate flare probabilities using the Poisson distribution (Wheatland 2000; Moon et al. 2001; Gallagher et al. 2002; Bloomfield et al. 2012; Lee et al. 2012). The probability (P) of at least one flare occurring in a day is given by

$$P = 1 - e^{-R}, \tag{3}$$

where R is the predicted daily major flare occurrence rate.
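The conversion from a parameter value to a daily probability can be sketched in a few lines. The sketch assumes that the empirical fit has the form log10(R) = a log10(x) + b with base-10 logarithms, and that the probability of at least one event for a Poisson process with daily rate R is 1 - exp(-R); the function name `flare_probability` is hypothetical.

```python
import math

def flare_probability(x, a, b):
    """Daily probability of at least one major flare for parameter value x,
    given power law coefficients a (slope) and b (constant)."""
    rate = 10 ** (a * math.log10(x) + b)  # predicted flares per day
    return 1.0 - math.exp(-rate)          # Poisson: P(N >= 1) = 1 - P(N = 0)
```

For small rates the probability is close to the rate itself, and it saturates toward 1 as the rate grows, so the forecast always stays within [0, 1).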

4.2. Verification Measures
4.2.1. Verification Measures of Probability

Forecast models are evaluated with verification measures, each of which is a single number that measures forecast performance. We consider the mean squared error (MSE), the Brier skill score (BSS), and reliability plots. These measures have been used for verifying the performance of probabilistic forecast models (Wheatland 2005; Barnes et al. 2007, 2016). The MSE, which is a measure of accuracy, is given by

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2, \tag{4}$$

where N is the number of forecasts, Pi is the predicted probability, and Oi is the observation that events occurred (Oi = 1) or did not occur (Oi = 0). For a perfect forecast, the MSE is 0. The BSS, which represents the relative skill compared to the model using the climatological event rate during the testing interval, is given by

$$\mathrm{BSS} = 1 - \frac{\mathrm{MSE}}{\mathrm{MSE}_{\mathrm{clim}}}, \tag{5}$$

where MSEclim is the MSE of a constant forecast equal to the climatological event rate. For a perfect forecast, the BSS is 1, and "no skill" relative to the climatological forecast results in 0. The reliability plots show observed occurrence rates against predicted occurrence rates. For a perfect forecast, all points in the reliability plot lie on the diagonal line.
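The two scores can be sketched as follows, using the standard definitions: the MSE is the mean of (Pi - Oi)^2, and the BSS is one minus the ratio of the forecast MSE to the MSE of a constant forecast equal to the mean observed event rate. The function names are hypothetical, and taking the climatology from the same sample that is being scored is an assumption of this sketch.

```python
def mean_squared_error(probs, obs):
    """Mean of (P_i - O_i)^2 over all forecasts; obs entries are 0 or 1."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)

def brier_skill_score(probs, obs):
    """Skill relative to a constant forecast at the climatological event rate.
    Undefined (division by zero) if obs contains only one class."""
    clim = sum(obs) / len(obs)
    mse_clim = mean_squared_error([clim] * len(obs), obs)
    return 1.0 - mean_squared_error(probs, obs) / mse_clim
```

A negative BSS, as obtained for the mean parameters in Table 3, means that the model has a larger MSE than the constant climatological forecast.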

From the test samples, we calculate the MSE and BSS of ten forecast models, which are listed in Table 3. The total unsigned current helicity has the best performance in terms of both MSE and BSS. In view of the BSS, the total unsigned parameters have higher value than the signed and mean parameters; more specifically, HC,total > HC,abs, Jz,total > Jz,sum, and ρtot > ρ.

Table 3 
Verification measures of probabilistic forecasts
Parameter MSE (Perfect = 0) BSS (Perfect = 1)
HC,total 0.006 ± 0.0001 0.22 ± 0.005
Jz,total 0.007 ± 0.0001 0.17 ± 0.004
ρtot 0.007 ± 0.0001 0.12 ± 0.007
Φ 0.007 ± 0.0001 0.12 ± 0.004
Jz,sum 0.007 ± 0.0001 0.14 ± 0.007
HC,abs 0.007 ± 0.0001 0.15 ± 0.007
ρ 0.010 ± 0.0001 −0.26 ± 0.014
Ashear 0.009 ± 0.0001 −0.15 ± 0.009
Γ 0.009 ± 0.0001 −0.18 ± 0.010
γ 0.009 ± 0.0001 −0.14 ± 0.007
MSE: mean squared error; BSS: Brier skill score.

Figure 2 shows reliability plots and their RMSEs (i.e., standard deviation of the residuals from the y = x line) between observed rates and predicted ones. Most of the data points for the total unsigned and signed parameters are relatively close to the diagonal line compared to those for the mean parameters. The total signed parameters can predict a wider range of probabilities than the other types of parameters. The Φ gives the best reliability (RMSE = 0.03), but the reliability plots and their RMSEs can depend on the selection of probability bins.

Figure 2. 
Observed flaring rates versus the predicted rates for major flares in log-log scale. The uncertainties of observed rates for each group are represented by error bars. Perfect reliability occurs when all points lie on the x = y line. Root mean square errors (RMSEs) between predicted rates and observed ones are as follows: (a) 0.07, (b) 0.07, (c) 0.06, (d) 0.03, (e) 0.12, (f) 0.13, (g) 0.18, (h) 0.18, (i) 0.19, and (j) 0.29.

As an example, we present the probabilities of HARP 7115 (NOAA 12673) predicted by our models with top six BSS at the cadence of 1 hour from 2017 September 2 to 6 in Figure 3. Our forecast models present their own probabilities of major flare occurrence within a day after the observation time. Our predicted probabilities from six models are compared with GOES-15 X-ray fluxes (5 minute data). During this period, several major flares occurred in HARP 7115 (NOAA 12673). On September 2, probabilities that major flares occur are mostly lower than 20%. Actually, there is no major flare during that period. The models of the total signed parameters (HC,abs and Jz,sum) show a rapid and significant increase of forecast probabilities around September 4 when the first major flare occurred as well as persistence of high probabilities (greater than 50%) over the next three days. On the other hand, the models of the total unsigned parameters show relatively low probabilities (mostly less than 50%) with a gradually increasing trend. This suggests that the total signed parameters may help the empirical models in this study to better perform in some cases such as HARP 7115.

Figure 3. 
Predicted major flare occurrence probabilities of HARP 7115 (NOAA 12673) using the model of (a) HC,total, (b) Jz,total, (c) HC,abs, (d) Jz,sum, (e) ρtot , and (f) Φ from 2017 September 2 to 6 and GOES-15 X-ray flux (5 minute data) in the passbands of 1 - 8 Å (red). These are updated once per hour. The dashed line indicates the M-class threshold.

Our forecast models predict the daily flare probability at 00:00 TAI using SDO/HMI observations. However, operational forecasting is complicated by the delay between the observations and the availability of the magnetic parameters. At present, the fastest available data can be obtained in less than three hours after the observation time. The calculation of the probability from our models needs only a few minutes. Thus, the delay from observations to forecasts will not exceed 3 hours. For operational forecasting, we can predict the daily flaring probability at 00:00 TAI using the observation at 21:00 TAI of the previous day. In order to examine how much this operational model differs from the original model, we investigate the relationship between Jz,total at 21:00 TAI, which has the highest CC between parameter values and flaring rates in Table 2, and the flaring rate for a day (00:00-23:59 TAI). In this case, the fitting coefficients, a and b, are 1.53 and −21.83, respectively, which differ by about 2% from the fitting coefficients of the original model with Jz,total in Table 2. The predicted probabilities from the operational model are underestimated by 2-6.3% compared to those from the original model. If operational forecasts are provided with this model, this uncertainty must be taken into account together with the original uncertainties.

4.2.2. Verification Measures of Binary

The probabilistic forecast results have been converted into contingency tables by using probability thresholds (Colak & Qahwaji 2009; Crown 2012; Bloomfield et al. 2012). A contingency table consists of four components: the number of TP (flare event predicted and occurred), FN (no flare event predicted and flare occurred), FP (flare event predicted and did not occur), and TN (no flare event predicted and did not occur) as shown in Table 4. Verification measures for binary forecasts are obtained by combining these four components of the contingency table. We build contingency tables using probability thresholds and calculate six measures.

Table 4
Contingency table

| | Observation: Flare | Observation: No flare |
|---|---|---|
| Forecast: Flare | TP | FP |
| Forecast: No flare | FN | TN |

In this study, we consider the following verification measures: the proportion correct (PC), the probability of detection (POD), the false alarm ratio (FAR), the Heidke skill score (HSS), the true skill statistic (TSS), and the symmetric extremal dependence index (SEDI). The first five measures have been widely used for the evaluation of flare forecasts, and the last one has been rarely used but is meaningful for validating forecasts of rare events such as major flares (Ferro & Stephenson 2011; Kubo et al. 2017). These six performance measures are described in Table 5.

Table 5
Verification Measures

| Measure | Equation | Perfect |
|---|---|---|
| SEDI | $\dfrac{\log \mathrm{POFD} - \log \mathrm{POD} - \log(1 - \mathrm{POFD}) + \log(1 - \mathrm{POD})}{\log \mathrm{POFD} + \log \mathrm{POD} + \log(1 - \mathrm{POFD}) + \log(1 - \mathrm{POD})}$ | 1 |

Proportion correct (PC) measures what fraction of the forecasts were correct. The fraction of flare events observed that were correctly forecast is given by the probability of detection (POD) or hit rate. POD should be used in conjunction with a false alarm ratio (FAR), which quantifies what fraction of the predicted flare events was actually not observed. The probability of false detection (POFD) is the fraction of non-flare observations for which flares were incorrectly forecast to occur. The true skill statistic (TSS) describes how well the forecast discriminated flare events from non-flare events. The Heidke skill score (HSS) gives the accuracy of the forecast relative to that of random chance. Recently, the symmetric extremal dependence index (SEDI) has been proposed by Ferro & Stephenson (2011); this skill score is designed for forecasts of rare events.
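As an illustration, the measures named above can be computed from the four contingency-table counts. The sketch uses the standard textbook definitions of PC, POD, FAR, POFD, TSS, and HSS, which are assumed to match those used in the paper, together with the SEDI expression given in Table 5; the function name `binary_measures` is hypothetical.

```python
import math

def binary_measures(tp, fn, fp, tn):
    """Verification measures from contingency-table counts.
    SEDI is undefined when POD or POFD equals 0 or 1 (log of zero)."""
    n = tp + fn + fp + tn
    pod = tp / (tp + fn)    # fraction of observed flares that were forecast
    pofd = fp / (fp + tn)   # fraction of non-flare cases forecast as flares
    far = fp / (tp + fp)    # fraction of flare forecasts that did not verify
    pc = (tp + tn) / n
    tss = pod - pofd
    hss = 2 * (tp * tn - fn * fp) / ((tp + fn) * (fn + tn) + (tp + fp) * (fp + tn))
    lf, lp = math.log(pofd), math.log(pod)
    l1f, l1p = math.log(1 - pofd), math.log(1 - pod)
    sedi = (lf - lp - l1f + l1p) / (lf + lp + l1f + l1p)
    return {"PC": pc, "POD": pod, "FAR": far, "POFD": pofd,
            "TSS": tss, "HSS": hss, "SEDI": sedi}
```

Because TSS depends only on POD and POFD, it is insensitive to the event/non-event ratio, whereas PC and HSS change with that ratio.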

We determine the values of six measures and their thresholds to achieve optimum TSS and HSS in contingency tables, which are summarized with their uncertainties in Tables 6 and 7, respectively. In view of the optimized TSS, the total unsigned parameters (HC,total, ρtot, Jz,total, and Φ) achieved higher values of verification measures than the total signed (HC,abs and Jz,sum) and mean parameters (ρ, Ashear, Γ, and γ). HC,total is one of the parameters having the highest TSS, which is consistent with its having the highest F-score in the classification between flaring events and non-flaring events (Bobra & Couvidat 2015). The rank order of the optimized TSS is also consistent with that of the F-score from Bobra & Couvidat (2015). When the probability threshold is small, TSS values are very high but FAR values are also high. In view of the optimized HSS, most of the total unsigned parameters also have higher values than the signed and mean parameters (HC,total > HC,abs, Jz,total > Jz,sum, and ρtot > ρ). Bloomfield et al. (2012) suggested that TSS is optimized when FN/FP is similar to the ratio of flare event number to non-flare event number, and that the HSS is optimized when FN/FP ≈ 1. The thresholds of the optimum TSS and HSS proposed by Bloomfield et al. (2012) differ slightly (by 0 to 0.03) from the thresholds of the optimum TSS and HSS in our models.
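One way to find such a threshold is a simple scan over candidate values, sketched here for TSS. The 0.005 step, the convention that a flare is forecast when the probability is at or above the threshold, and the function names are assumptions of this sketch rather than details given in the text.

```python
def tss_at_threshold(probs, obs, threshold):
    """TSS of the binary forecast 'flare if P >= threshold'.
    Assumes obs contains at least one event (1) and one non-event (0)."""
    tp = sum(1 for p, o in zip(probs, obs) if p >= threshold and o == 1)
    fn = sum(1 for p, o in zip(probs, obs) if p < threshold and o == 1)
    fp = sum(1 for p, o in zip(probs, obs) if p >= threshold and o == 0)
    tn = sum(1 for p, o in zip(probs, obs) if p < threshold and o == 0)
    return tp / (tp + fn) - fp / (fp + tn)

def best_tss_threshold(probs, obs, step=0.005):
    """Scan thresholds step, 2*step, ... below 1 and return the one with the
    highest TSS (the smallest such threshold in case of ties)."""
    candidates = [i * step for i in range(1, int(round(1 / step)))]
    return max(candidates, key=lambda t: tss_at_threshold(probs, obs, t))
```

The same scan with HSS in place of TSS would give the thresholds for the optimum HSS.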

Table 6 
Verification measure values and probability thresholds chosen to achieve the optimum true skill statistic (TSS) from ten SHARP magnetic parameters
Parameter Threshold PC POD FAR TSS HSS SEDI
HC,total 0.04 0.93 ± 0.002 0.97 ± 0.003 0.89 ± 0.004 0.91 ± 0.004 0.18 ± 0.006 0.97 ± 0.0007
ρtot 0.055 0.93 ± 0.0008 0.97 ± 0.003 0.9 ± 0.002 0.91 ± 0.003 0.18 ± 0.003 0.97 ± 0.0004
Jz,total 0.045 0.93 ± 0.001 0.97 ± 0.003 0.9 ± 0.002 0.91 ± 0.003 0.18 ± 0.004 0.96 ± 0.0005
Φ 0.035 0.9 ± 0.006 0.97 ± 0.005 0.93 ± 0.006 0.88 ± 0.005 0.13 ± 0.01 0.96 ± 0.001
HC,abs 0.07 0.93 ± 0.001 0.87 ± 0.007 0.91 ± 0.003 0.8 ± 0.007 0.16 ± 0.004 0.91 ± 0.004
Jz,sum 0.03 0.84 ± 0.001 0.95 ± 0.004 0.95 ± 0.001 0.79 ± 0.005 0.07 ± 0.002 0.91 ± 0.003
ρ 0.03 0.8 ± 0.0007 0.97 ± 0.003 0.96 ± 0.0008 0.78 ± 0.003 0.06 ± 0.001 0.91 ± 0.0007
Ashear 0.015 0.74 ± 0.006 0.95 ± 0.006 0.97 ± 0.0006 0.68 ± 0.003 0.04 ± 0.001 0.84 ± 0.003
Γ 0.015 0.7 ± 0.037 0.95 ± 0.035 0.98 ± 0.004 0.65 ± 0.007 0.03 ± 0.008 0.82 ± 0.007
γ 0.025 0.77 ± 0.003 0.84 ± 0.008 0.97 ± 0.0009 0.61 ± 0.008 0.04 ± 0.001 0.77 ± 0.007

Table 7 
Verification measure values and probability thresholds chosen to achieve the optimum Heidke skill score (HSS) from ten SHARP magnetic parameters
Parameter Threshold PC POD FAR TSS HSS SEDI
HC,total 0.195 0.99 ± 0.0001 0.45 ± 0.01 0.55 ± 0.01 0.44 ± 0.01 0.44 ± 0.009 0.76 ± 0.007
HC,abs 0.21 0.99 ± 0.0008 0.45 ± 0.03 0.59 ± 0.03 0.44 ± 0.03 0.43 ± 0.009 0.76 ± 0.02
Jz,total 0.15 0.99 ± 0.0004 0.61 ± 0.02 0.68 ± 0.007 0.59 ± 0.02 0.41 ± 0.008 0.83 ± 0.009
Jz,sum 0.22 0.99 ± 0.0003 0.45 ± 0.02 0.61 ± 0.03 0.44 ± 0.02 0.41 ± 0.009 0.75 ± 0.008
ρtot 0.205 0.98 ± 0.001 0.58 ± 0.05 0.75 ± 0.03 0.57 ± 0.05 0.34 ± 0.007 0.8 ± 0.02
Φ 0.135 0.98 ± 0.001 0.55 ± 0.04 0.77 ± 0.009 0.54 ± 0.04 0.32 ± 0.006 0.78 ± 0.02
ρ 0.185 0.96 ± 0.007 0.53 ± 0.09 0.89 ± 0.009 0.49 ± 0.08 0.17 ± 0.004 0.71 ± 0.06
Ashear 0.12 0.94 ± 0.003 0.58 ± 0.03 0.93 ± 0.002 0.52 ± 0.03 0.11 ± 0.003 0.71 ± 0.02
Γ 0.1 0.93 ± 0.02 0.68 ± 0.17 0.93 ± 0.005 0.61 ± 0.15 0.12 ± 0.003 0.78 ± 0.12
γ 0.12 0.95 ± 0.003 0.37 ± 0.02 0.94 ± 0.002 0.32 ± 0.02 0.09 ± 0.003 0.54 ± 0.02

Another interesting measure is the relative operating characteristic (ROC) curve, in which POD is plotted as a function of POFD, together with the ROC area, which is the area under the ROC curve. For a perfect forecast, the curve travels from the bottom left to the top left of the diagram, then across to the top right of the diagram. In this case, the ROC area is 1. The ROC areas of the ten parameters with their uncertainties are as follows: 0.98 ± 0.001 from HC,total, 0.98 ± 0.001 from Jz,total, 0.97 ± 0.001 from ρtot, 0.97 ± 0.001 from Φ, 0.95 ± 0.002 from Jz,sum, 0.96 ± 0.001 from HC,abs, 0.93 ± 0.001 from ρ, 0.91 ± 0.002 from Ashear, 0.9 ± 0.002 from Γ, and 0.88 ± 0.002 from γ. As shown in Figure 4, the total unsigned parameters have higher ROC areas than the total signed parameters and the mean parameters, indicating better performance in terms of the ROC curve measure. The differences in ROC areas among the total unsigned parameters are very small.
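The ROC area can be sketched by sweeping the probability threshold over the distinct forecast values and integrating POD over POFD with the trapezoidal rule. The choice of thresholds and the integration rule are assumptions of this sketch, and the function name `roc_area` is hypothetical.

```python
def roc_area(probs, obs):
    """Area under the ROC curve (POD versus POFD) by the trapezoidal rule.
    Assumes obs contains at least one event (1) and one non-event (0)."""
    n_pos = sum(obs)
    n_neg = len(obs) - n_pos
    points = [(0.0, 0.0)]  # threshold above every forecast: nothing predicted
    for t in sorted(set(probs), reverse=True):
        tp = sum(1 for p, o in zip(probs, obs) if p >= t and o == 1)
        fp = sum(1 for p, o in zip(probs, obs) if p >= t and o == 0)
        points.append((fp / n_neg, tp / n_pos))  # (POFD, POD)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

The lowest threshold predicts a flare for every sample, so the curve always ends at (1, 1); a forecast with no discrimination gives an area of 0.5.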

Figure 4. 
Receiver operating characteristic (ROC) curve in which probability of detection (POD) is plotted as a function of probability of false detection (POFD). When perfect forecast occurs, the curve travels from bottom left to top left of the diagram, then continues to the top right of the diagram.

4.2.3. Uncertainties of Verification Measures

Several forecast studies have mentioned the necessity of uncertainties for forecast verification measures (Barnes et al. 2016; Kubo et al. 2017; Leka et al. 2018) because the verification measures are calculated from a finite number of samples. To estimate uncertainties of verification measures, we use a bootstrap method, which accounts for random errors using resampling. We set the size of the resampled data to the difference between the numbers of non-events and events. We make a resampled data set of size 4648 by random selection. Then, we calculate the verification measures from the resampled data. This process is repeated 1000 times, and the uncertainties are estimated by the standard deviation of the resampled values of the verification measures in Tables 6 and 7.
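The resampling procedure can be sketched as follows. Sampling with replacement, the fixed random seed, and the use of the sample standard deviation are assumptions of this sketch; the function names are hypothetical, and `mse` stands in for any verification measure that takes forecasts and observations.

```python
import random
import statistics

def mse(probs, obs):
    """Example measure: mean squared error of probabilistic forecasts."""
    return sum((p - o) ** 2 for p, o in zip(probs, obs)) / len(probs)

def bootstrap_std(probs, obs, measure, size, n_iter=1000, seed=0):
    """Standard deviation of `measure` over n_iter resampled data sets,
    each of `size` (forecast, observation) pairs drawn with replacement."""
    rng = random.Random(seed)
    indices = range(len(probs))
    values = []
    for _ in range(n_iter):
        sample = rng.choices(indices, k=size)
        values.append(measure([probs[i] for i in sample],
                              [obs[i] for i in sample]))
    return statistics.stdev(values)
```

For the test set described in the text, `size` would be 4648 and `n_iter` would be 1000. Measures such as TSS fail on a resample that contains no events, which is very unlikely at that size but worth guarding against for small samples.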


We have presented forecasting models of major flare probability based on the power law relationships between ten magnetic parameters and major flare occurrence rates. The magnetic parameters are calculated from SDO/HMI vector magnetic fields. The data are taken from May 2010 to April 2018 and divided into two sets (a training set of 11,040 HARP samples and a test set of 4,724 HARP samples) in chronological order, which is appropriate for forecasting purposes. All values of the magnetic parameters are divided into 50 subgroups to estimate the corresponding flare occurrence rates. From this, we considered the power law relationships between magnetic parameters and flaring rates.

The major results of this study are summarized as follows. First, major flare occurrence rates are well correlated with the ten magnetic parameters (CC ≥ 0.86). Second, the logarithmic values of flaring rates are well approximated by linear equations. Third, the total unsigned parameters achieved relatively higher values of the optimized TSS and HSS than the total signed and mean parameters. Among the total unsigned parameters, HC,total, ρtot, and Jz,total also have the highest-ranked univariate F-scores in Bobra & Couvidat (2015). Our results are consistent with Toriumi & Takasao (2017), who found that HC,total, ρtot, and Jz,total are more strongly proportional to the magnetic free energy, calculated from 3D reconstructed magnetic fields, than the other parameters.

Although we calculate various verification measures for the ten magnetic parameters, it is difficult to determine which parameter outperforms the others. Most parameters have high values of the measures, with only small differences among them. The result also depends on how the forecasts are verified: the total unsigned parameters perform slightly better than the total signed parameters in terms of optimized TSS and HSS, but the total signed parameters can predict higher probabilities than the total unsigned parameters in practical forecasts for a given AR. The performance of the converted binary forecasts depends on the probability threshold. When the probability threshold is small, TSS is very high but FAR is also high. On the other hand, all verification measures take moderate values when HSS is optimized. Thus, a decision-maker or user may select a threshold or model suited to their own purpose.
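The dependence of the converted binary forecasts on the probability threshold can be illustrated with a short sketch. The function names and the threshold grid are hypothetical, the data in the example are synthetic, and FAR is taken here as the false alarm ratio FP/(TP+FP), which is an assumption about the definition used.

```python
import numpy as np

def contingency(obs, prob, threshold):
    """Convert probabilities to yes/no forecasts and count TP, FP, FN, TN."""
    obs = np.asarray(obs, dtype=bool)
    yes = np.asarray(prob, dtype=float) >= threshold
    tp = int(np.sum(obs & yes))
    fp = int(np.sum(~obs & yes))
    fn = int(np.sum(obs & ~yes))
    tn = int(np.sum(~obs & ~yes))
    return tp, fp, fn, tn

def skill_scores(tp, fp, fn, tn):
    """Return TSS, HSS, and FAR; a score with a zero denominator is NaN."""
    nan = float("nan")
    pod = tp / (tp + fn) if tp + fn else nan
    pofd = fp / (fp + tn) if fp + tn else nan
    denom = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    hss = 2.0 * (tp * tn - fp * fn) / denom if denom else nan
    far = fp / (tp + fp) if tp + fp else nan  # false alarm ratio (assumed)
    return {"TSS": pod - pofd, "HSS": hss, "FAR": far}

def best_threshold(obs, prob, score="TSS", grid=None):
    """Scan a grid of thresholds and return the one maximizing `score`."""
    grid = np.linspace(0.01, 0.99, 99) if grid is None else np.asarray(grid)
    values = np.array(
        [skill_scores(*contingency(obs, prob, t))[score] for t in grid]
    )
    i = int(np.nanargmax(values))
    return float(grid[i]), float(values[i])

# Synthetic example for illustration only.
rng = np.random.default_rng(3)
obs = rng.random(2000) < 0.05
prob = np.clip(np.where(obs, 0.30, 0.05) + rng.normal(0, 0.08, 2000), 0, 1)
for name in ("TSS", "HSS"):
    t, v = best_threshold(obs, prob, score=name)
    print(f"optimized {name}: threshold = {t:.2f}, "
          f"scores = {skill_scores(*contingency(obs, prob, t))}")
```

Printing the full set of scores at each optimized threshold makes the trade-off between a high TSS and a high FAR visible for the user's own data.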

The parameters that show relatively good performance are conventionally regarded as non-potential parameters. This result supports the importance of non-potential magnetic fields in ARs (e.g., Low 1994; Canfield et al. 1999; Schrijver 2009; Jing et al. 2012; Zhang 2016). Bao et al. (1999) found that the time variations of current helicity in highly flaring active regions are more significant than those in poorly flaring active regions. It is already well known that the magnetic free energy plays an important role in producing major flares (Canfield et al. 1999; Moore et al. 2001). Ji et al. (2003) reported a quite strong correlation between vertical current and flares. Liu et al. (2016) suggested that the total unsigned vertical current and the photospheric magnetic free energy are responsible for flare productivity. Our results, together with these previous results, demonstrate the importance of the total quantities of non-potential magnetic properties for flare forecasting (Welsch et al. 2009; Bobra & Couvidat 2015; Liu et al. 2017; Toriumi & Takasao 2017).


We thank the referees for helpful comments. This work was supported by the BK21+ program through the National Research Foundation (NRF) funded by the Ministry of Education of Korea, the Basic Science Research Program through the NRF (grants NRF-2016R1A2B4013131, NRF-2016R1A6A3A11932534, NRF-2019R1A2C1002634), NRF grant NRF-2013M1A3A3A02042232, the Korea Astronomy and Space Science Institute under the R&D program supervised by the Ministry of Science, ICT and Future Planning, the Korea Astronomy and Space Science Institute under the R&D program Development of a Solar Coronagraph on International Space Station (Project No. 2019-1-850-02) supervised by the Ministry of Science, ICT and Future Planning, and Institute for Information & communications Technology Promotion (IITP) grant (MSIP) (2018-0-01422, 'Study of Analysis and Prediction Techniques for Solar Flares'). The data used here are courtesy of NASA/SDO and the HMI science team, as well as the GOES team.

1. Ahmed, O. W., Qahwaji, R., Colak, T., et al., (2013), Solar Flare Prediction Using Advanced Feature Extraction, Machine Learning, and Feature Selection, SoPh, 283, p157.
2. Aulanier, G., Démoulin, P., Schrijver, C. J., Janvier, M., Pariat, E., & Schmieder, B., (2013), The Standard Flare Model in Three Dimensions II. Upper Limit on Solar Flare Energy, A&A, 549, pA66.
3. Bao, S. D., Zhang, H. Q., Ai, G. X., & Zhang, M., (1999), A Survey of Flares and Current Helicity in Active Regions, A&AS, 139, p311.
4. Barnes, G., Leka, K. D., Schrijver, C. J., et al., (2016), A Comparison of Flare Forecasting Methods. 1. Results from the "All-Clear" Workshop, ApJ, 829, p89.
5. Barnes, G., & Leka, K. D., (2006), Photospheric Magnetic Field Properties of Flaring Versus Flare-quiet Active Regions. 3. Magnetic Charge Topology Models, ApJ, 646, p1303.
6. Barnes, G., Leka, K. D., Schumer, E. A., & Della-Rose, D. J., (2007), Probabilistic Forecasting of Solar Flares from Vector Magnetogram Data, SpWea, 5, pS09002.
7. Bloomfield, D. S., Higgins, P. A., McAteer, R. T. J., & Gallagher, P. T., (2012), Toward Reliable Benchmarking of Solar Flare Forecasting Methods, ApJL, 747, pL41.
8. Bobra, M. G., & Couvidat, S., (2015), Solar Flare Prediction Using SDO/HMI Vector Magnetic Field Data with a Machine-learning Algorithm, ApJ, 798, p135.
9. Bobra, M. G., Sun, X., Hoeksema, J. T., et al., (2014), The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: SHARPs - Space-Weather HMI Active REgion Patches, SoPh, 289, p3549.
10. Bocchialini, K., Grison, B., Menvielle, M., et al., (2018), Statistical Analysis of Solar Events Associated with Storm Sudden Commencements over One Year of Solar Maximum During Cycle 23: Propagation from the Sun to the Earth and Effects, SoPh, 293, p75.
11. Bornmann, P. L., & Shaw, D., (1994), Flare Rates and the McIntosh Active-region Classifications, SoPh, 150, p127.
12. Canfield, R. C., Hudson, H. S., & McKenzie, D. E., (1999), Sigmoidal Morphology and Eruptive Solar Activity, GeoRL, 26, p627.
13. Colak, T., & Qahwaji, R., (2009), Automated Solar Activity Prediction: A Hybrid Computer Platform Using Machine Learning and Solar Imaging for Automated Prediction of Solar Flares, SpWea, 7, pS06001.
14. Cui, Y., Li, R., Zhang, L., He, Y., & Wang, H., (2006), Correlation between Solar Flare Productivity and Photospheric Magnetic Field Properties, SoPh, 237, p45.
15. Crown, M. D., (2012), Validation of the NOAA Space Weather Prediction Center's Solar Flare Forecasting Look-up Table and Forecaster-issued Probabilities, SpWea, 10, pS06006.
16. Domingo, V., Fleck, B., & Poland, A. I., (1995), The SOHO Mission: An Overview, SoPh, 162, p1.
17. Falconer, D. A., Barghouty, A. F., Khazanov, I., & Moore, R. L., (2011), A Tool for Empirical Forecasting of Major Flares, Coronal Mass Ejections, and Solar Particle Events from a Proxy of Active-region Free Magnetic Energy, SpWea, 9, pS04003.
18. Falconer, D. A., Moore, R. L., Barghouty, A. F., & Khazanov, I., (2014), MAG4 Versus Alternative Techniques for Forecasting Active Region Flare Productivity, SpWea, 12, p306.
19. Ferro, C. A. T., & Stephenson, D. B., (2011), Extremal Dependence Indices: Improved Verification Measures for Deterministic Forecasts of Rare Binary Events, Wea. Forecasting, 26, p699.
20. Fisher, G. H., Bercik, D. J., Welsch, B. T., & Hudson, H. S., (2012), Global Forces in Eruptive Solar Flares: The Lorentz Force Acting on the Solar Interior, SoPh, 277, p59.
21. Gallagher, P. T., Moon, Y.-J., & Wang, H., (2002), Active-region Monitoring and Flare Forecasting. 1. Data Processing and First Results, SoPh, 209, p171.
22. Giovanelli, R. G., (1939), The Relationships between Eruptions and Sunspots, ApJ, 89, p555.
23. Guennou, C., Pariat, E., Leake, J. E., & Vilmer, N., (2017), Testing Predictors of Eruptivity Using Parametric Flux Emergence Simulations, JSWSC, 7, pA17.
24. Hale, G. E., Ellerman, F., Nicholson, S. B., & Joy, A. H., (1919), The Magnetic Polarity of Sun-spots, ApJ, 49, p153.
25. Hoeksema, J. T., Liu, Y., Hayashi, K., et al., (2014), The Helioseismic and Magnetic Imager (HMI) Vector Magnetic Field Pipeline: Overview and Performance, SoPh, 289, p3483.
26. Huang, X., & Wang, H.-N., (2013), Solar Flare Prediction Using Highly Stressed Longitudinal Magnetic Field Parameters, RAA, 13, p351.
27. Huang, X., Wang, H., Xu, L., et al., (2018), Deep Learning Based Solar Flare Forecasting Model. 1. Results for Line-of-sight Magnetograms, ApJ, 856, p7.
28. Ji, H. S., Song, M. T., Zhang, Y. A., & Song, S. M., (2003), The Horizontal and Vertical Electric Currents in Three Solar Active Regions and Their Relations with Flares, Chin. Astron. Astrophys., 27, p79.
29. Jing, J., Park, S.-H., Liu, C., et al., (2012), Evolution of Relative Magnetic Helicity and Current Helicity in NOAA Active Region 11158, ApJL, 752, pL9.
30. Kontogiannis, I., Georgoulis, M. K., Park, S.-H., & Guerra, J. A., (2017), Non-neutralized Electric Currents in Solar Active Regions and Flare Productivity, SoPh, 292, p159.
31. Kubo, Y., Den, M., & Ishii, M., (2017), Verification of Operational Solar Flare Forecast: Case of Regional Warning Center Japan, JSWSC, 7, pA20.
32. Kusano, K., Bamba, Y., Yamamoto, T. T., Toriumi, S., & Asai, A., (2012), Magnetic Field Structures Triggering Solar Flares And Coronal Mass Ejections, ApJ, 760, p31.
33. Lee, K., Moon, Y.-J., Lee, J.-Y., et al., (2012), Solar Flare Occurrence Rate and Probability in Terms of the Sunspot Classification Supplemented with Sunspot Area and Its Changes, SoPh, 281, p639.
34. Lee, K., Moon, Y.-J., & Nakariakov, V. M., (2016), Dependence of Occurrence Rates of Solar Flares and Coronal Mass Ejections on the Solar Cycle Phase and the Importance of Large-scale Connectivity, ApJ, 831, p131.
35. Leka, K. D., & Barnes, G., (2003a), Photospheric Magnetic Field Properties of Flaring Versus Flare-quiet Active Regions. 1. Data, General Approach, and Sample Results, ApJ, 595, p1277.
36. Leka, K. D., & Barnes, G., (2003b), Photospheric Magnetic Field Properties of Flaring Versus Flare-quiet Active Regions. 2. Discriminant Analysis, ApJ, 595, p1296.
37. Leka, K. D., & Barnes, G., (2007), Photospheric Magnetic Field Properties of Flaring Versus Flare-quiet Active Regions. 4. A Statistically Significant Sample, ApJ, 656, p1173.
38. Leka, K. D., Barnes, G., & Wagner, E., (2018), The NWRA Classification Infrastructure: Description and Extension to the Discriminant Analysis Flare Forecasting System (DAFFS), JSWSC, 8, pA25.
39. Li, R., Cui, Y., He, H., & Wang, H., (2008), Application of Support Vector Machine Combined with K-nearest Neighbors in Solar Flare and Solar Proton Events Forecasting, AdSpR, 42, p1469.
40. Li, R., & Zhu, J., (2013), Solar Flare Forecasting Based on Sequential Sunspot Data, RAA, 13, p1118.
41. Liu, C., Deng, N., Wang, J. T. L., & Wang, H., (2017), Predicting Solar Flares Using SDO/HMI Vector Magnetic Data Products and the Random Forest Algorithm, ApJ, 843, p104.
42. Liu, L., Wang, Y., Wang, J., Shen, C., Ye, P., Liu, R., Che, J., Zhang, Q., & Wang, S., (2016), Why Is a Flare-rich Active Region CME-poor?, ApJ, 826, p119.
43. Low, B. C., (1994), Magnetohydrodynamic Processes in the Solar Corona: Flares, Coronal Mass Ejections, and Magnetic Helicity, PhPl, 1, p1684.
44. McAteer, R. T. J., Gallagher, P. T., & Ireland, J., (2005), Statistics of Active Region Complexity: A Large-scale Fractal Dimension Survey, ApJ, 631, p628.
45. McCloskey, A. E., Gallagher, P. T., & Bloomfield, D. S., (2016), Flaring Rates and the Evolution of Sunspot Group McIntosh Classifications, SoPh, 291, p1711.
46. McIntosh, P. S., (1990), The Classification of Sunspot Groups, SoPh, 125, p251.
47. Moon, Y.-J., Choe, G. S., Yun, H. S., & Park, Y. D., (2001), Flaring Time Interval Distribution and Spatial Correlation of Major X-ray Solar Flares, JGR, 106, p29951.
48. Moore, R. L., Sterling, A. C., Hudson, H. S., & Lemen, J. R., (2001), Onset of the Magnetic Explosion in Solar Flares and Coronal Mass Ejections, ApJ, 552, p833.
49. Murray, S. A., Bingham, S., Sharpe, M., & Jackson, D. R., (2017), Flare Forecasting at the Met Office Space Weather Operations Centre, SpWea, 15, p577.
50. Nishizuka, N., Sugiura, K., Kubo, Y., et al., (2017), Solar Flare Prediction Model with Three Machine-learning Algorithms Using Ultraviolet Brightening and Vector Magnetograms, ApJ, 835, p156.
51. Nishizuka, N., Sugiura, K., Kubo, Y., et al., (2018), Deep Flare Net (DeFN) Model for Solar Flare Prediction, ApJ, 858, p113.
52. Park, E., Moon, Y.-J., Shin, S., et al., (2018), Application of the Deep Convolutional Neural Network to the Forecast of Solar Flare Occurrence Using Full-disk Solar Magnetograms, ApJ, 869, p91.
53. Park, J., Moon, Y.-J., Choi, S., et al., (2017), Application of Decision-making to a Solar Flare Forecast in the Cost-loss Ratio Situation, SpWea, 15, p704.
54. Pesnell, W. D., Thompson, B. J., & Chamberlin, P. C., (2012), The Solar Dynamics Observatory (SDO), SoPh, 275, p3.
55. Qahwaji, R., & Colak, T., (2007), Automatic Short-term Solar Flare Prediction Using Machine Learning and Sunspot Associations, SoPh, 241, p195.
56. Raboonik, A., Safari, H., Alipour, N., & Wheatland, M. S., (2017), Prediction of Solar Flares Using Unique Signatures of Magnetic Field Images, ApJ, 834, p11.
57. Sammis, I., Tang, F., & Zirin, H., (2000), The Dependence of Large Flare Occurrence on the Magnetic Structure of Sunspots, ApJ, 540, p583.
58. Scherrer, P. H., Bogart, R. S., Bush, R. I., et al., (1995), The Solar Oscillations Investigation - Michelson Doppler Imager, SoPh, 162, p129.
59. Scherrer, P. H., Schou, J., Bush, R. I., et al., (2012), The Helioseismic and Magnetic Imager (HMI) Investigation for the Solar Dynamics Observatory (SDO), SoPh, 275, p207.
60. Schou, J., Scherrer, P. H., Bush, R. I., et al., (2012), Design and Ground Calibration of the Helioseismic and Magnetic Imager (HMI) Instrument on the Solar Dynamics Observatory (SDO), SoPh, 275, p229.
61. Schrijver, C. J., (2007), A Characteristic Magnetic Field Pattern Associated with All Major Solar Flares and Its Use in Flare Forecasting, ApJL, 655, pL117.
62. Schrijver, C. J., (2009), Driving Major Solar Flares and Eruptions: A Review, AdSpR, 43, p739.
63. Schwenn, R., (2006), Space Weather: The Solar Perspective, LRSP, 3, p2.
64. Shin, S., Lee, J.-Y., Moon, Y.-J., et al., (2016), Development of Daily Maximum Flare-Flux Forecast Models for Strong Solar Flares, SoPh, 291, p897.
65. Toriumi, S., & Takasao, S., (2017), Numerical Simulations of Flare-productive Active Regions: δ-sunspots, Sheared Polarity Inversion Lines, Energy Storage, and Predictions, ApJ, 850, p39.
66. Tsurutani, B. T., Judge, D. L., Guarnieri, F. L., Gangopadhyay, P., Jones, A. R., et al., (2005), The October 28, 2003 Extreme EUV Solar Flares and Resultant Extreme Ionospheric Effect: Comparison to Other Halloween Events and the Bastille Day Event, GeoRL, 32, pL03S09.
67. Welsch, B. T., Li, Y., Schuck, P. W., & Fisher, G. H., (2009), What Is the Relationship between Photospheric Flow Fields and Solar Flares?, ApJ, 705, p821.
68. Wheatland, M. S., (2000), The Origin of the Solar Flare Waiting-time Distribution, ApJL, 536, pL109.
69. Wheatland, M. S., (2005), A Statistical Solar Flare Forecast Method, SpWea, 3, pS07003.
70. Yu, D., Huang, X., Wang, H., & Cui, Y., (2009), Short-term Solar Flare Prediction Using a Sequential Supervised Learning Method, SoPh, 255, p91.
71. Yuan, Y., Shih, F. Y., Jing, J., & Wang, H.-M., (2010), Automated Flare Forecasting Using a Statistical Learning Technique, RAA, 10, p785.
72. Zhang, H., (2016), Photospheric Magnetic Free Energy Density of Solar Active Regions, SoPh, 291, p3501.