Exploring LHC Run 1 and 2 data using the Madala hypothesis

Since its experimental discovery in 2012, the Standard Model (SM) Higgs boson has been an interesting particle to study with the intention of exploring new physics ideas beyond the SM (BSM). Its properties are still not well understood, and there are several features in LHC Run 1 and Run 2 data which point to the possibility of extensions to the SM Higgs sector. This work explores the Madala hypothesis, which is the introduction of a heavy scalar (the Madala boson) to the SM, in addition to a real scalar S and a dark matter (DM) candidate χ. This hypothesis has previously been used to explain several anomalous features observed in the LHC Run 1 data. This work extends the study to Run 2 data, and shows that the particle spectrum predicted in the Madala hypothesis is indeed compatible with LHC data. Further study prospects and striking signatures for searches are presented.


Introduction
In 2012, the Standard Model (SM) had its particle spectrum completed by the discovery of the SM Higgs boson (h) by the ATLAS [1] and CMS [2] collaborations at the Large Hadron Collider (LHC). The Madala hypothesis is an extension of the SM, and is one of the many hypotheses in the literature that predict physics beyond the SM (BSM). At its core, the Madala hypothesis extends the SM by introducing two new scalars that are heavier than the SM Higgs boson. As discussed below, the hypothesis is merely a simplified model with the aim of explaining several particular features of the LHC Run 1 and 2 data. It can be discussed in the context of UV-complete and gauge symmetric BSM scenarios that predict extra scalars (such as 2HDMs, left-right symmetric models, etc.), but the focus of this short paper is to treat it as simply as possible with the purpose of exploring the data.
A first study of the hypothesis was done in 2015, in which a new heavy scalar H (the Madala boson) was introduced to explain several anomalous features in the LHC Run 1 data [3]. In that work, H was considered to have Higgs-like couplings to the SM particles, as well as to be a source of resonant di-Higgs production. It was therefore only considered in the mass range $2m_h < m_H < 2m_t$, since anything heavier would have been dominated by $H \to t\bar{t}$ decays and anything lighter would not allow for resonant di-Higgs decays. This also led to the assumption that H is produced dominantly through gluon fusion (ggF); however, the strength of the g-g-H interaction was considered to be a free parameter, and the SM-like value could be rescaled by a factor $\beta_g^2$. The driving force behind the existence of H was the Higgs $p_T$ spectrum, to which H contributes through the decay H → hχχ via the effective vertex shown in Figure 1(a). The DM candidate was a scalar, for simplicity, and satisfied all cosmological and detector constraints at a mass close to $\frac{1}{2} m_h$. With this model in place, a simultaneous fit was done to a collection of relevant ATLAS and CMS results, and a best fit mass of H was found at $m_H = 272^{+12}_{-9}$ GeV, with $\beta_g = 1.5 \pm 0.6$. This result represented a 3σ improvement over the SM prediction.
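As a rough numerical illustration of the mass window and rescaling factor quoted above, the short calculation below assumes the rounded values $m_h \approx 125$ GeV and $m_t \approx 173$ GeV. These inputs are not stated in the text and are used only to show the order of magnitude of the bounds.

```latex
\begin{align}
  2m_h &\approx 2 \times 125~\text{GeV} = 250~\text{GeV}, \\
  2m_t &\approx 2 \times 173~\text{GeV} = 346~\text{GeV}, \\
  \Rightarrow\quad 250~\text{GeV} &\lesssim m_H \lesssim 346~\text{GeV}, \\
  \beta_g = 1.5 \quad &\Rightarrow \quad \beta_g^2 = 2.25 .
\end{align}
```

Under these assumed inputs, the best fit mass of 272 GeV lies inside the window, and the central value of $\beta_g$ corresponds to a rescaling factor of 2.25.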
In 2016, a study [6] examined how to interpret the effective vertex in Figure 1(a). The results from the 2015 study seemed to indicate that the branching ratio (BR) of the H → hχχ decay mode would have to be quite large to explain the data. It is not natural for a 3-body decay to have such a large BR. For this reason, the diagram shown in Figure 1(b) was proposed to explain the nature of the effective vertex. In this case, a scalar DM mediator S was introduced, and the decays of H changed such that the dominant modes are H → SS, Sh, hh. The S boson has a mass in the range $m_h < m_S < m_H - m_h$, such that it is more kinematically accessible through the decays of H mentioned above. It was also considered that S has Higgs-like couplings to the SM, although its direct production is suppressed. In this case, an S boson with a mass around 160 GeV would decay dominantly to W bosons, as can be seen in Figure 2. The BRs to SM particles would, however, have to be suppressed by the S → χχ BR, which is a free parameter of the theory.
The Madala hypothesis differs from many BSM hypotheses which predict heavy scalars, in that the Madala boson should dominantly decay to pairs of h and S. The initial statistical study done in 2015 provided some insight into the potential parameter space of the model. However, since then a plethora of newer results from ATLAS and CMS have been presented (several of these at $\sqrt{s} = 13$ TeV). For this reason, it is important to identify whether or not the results which are available at the time of writing this short paper are compatible with the results from the 2015 study.

Statistical methodology
The experimental results which are relevant to the Madala hypothesis are shown in Table 1. Such a diverse set of data and final states needs to be carefully combined in order for interesting information to be extracted.

Figure 2: The BRs (WW, ZZ, bb, ττ, gg, γγ, Zγ, cc, µµ) of a Higgs-like boson in the mass range above and below $m_h$, taken from the LHC Higgs Cross Section Working Group [7]. The S boson (having a mass higher than $m_h$) would decay dominantly to the massive vector bosons, increasingly so at higher mass.

A generic approach has been adopted in order to deal with statistics. That is, all experimental results are interpreted in units of $\chi^2$. This simple approach is necessary because not enough information is presented as part of the experimental results. The experimental results considered here are usually presented in two ways. Firstly, in the case where measurements are considered, a $\chi^2$ is calculated as Pearson's test statistic:

$$\chi^2 = \frac{\left(\mu^{\text{th}} - \mu^{\text{exp}}\right)^2}{\left(\Delta\mu^{\text{th}}\right)^2 + \left(\Delta\mu^{\text{exp}}\right)^2} \qquad (1)$$

Here, a theoretical prediction $\mu^{\text{th}}$ is compared against an experimental measurement $\mu^{\text{exp}}$, along with their respective uncertainties $\Delta\mu^{\text{th}}$ and $\Delta\mu^{\text{exp}}$. In the denominator, the experimental and theoretical uncertainties have been added in quadrature since they are independent of each other.

Secondly, experimental results can come in the form of limits. For searches where no significant excess is seen, a limit at the 95% confidence level (CL) is commonly what is presented. In this case, Pearson's test statistic is modified. The difference between the expected and observed limits is treated as a signal with a large error, and therefore limits contribute very weakly to a $\chi^2$. The contribution is written as follows:

$$\chi^2 = \frac{\left(L^{\text{obs}} - L^{\text{exp}} - \mu^{\text{th}}\right)^2}{\left(L^{\text{exp}}/1.96\right)^2} \qquad (2)$$

where $L^{\text{exp}}$ and $L^{\text{obs}}$ are the experimentally calculated expected and observed limits, respectively. Here again, $\mu^{\text{th}}$ is a theoretical prediction, and its error is considered to be negligible compared with the experimental uncertainty, which is calculated as $L^{\text{exp}}/1.96$. This method of calculating Pearson's test statistic has been tested in various cases, and found to be consistent with the standard definition given in Equation 1, assuming that the calculated limits are statistically Gaussian.
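The two χ² contributions described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the fit: the function names are hypothetical, and the limit contribution assumes the form (L_obs − L_exp − µ_th)² / (L_exp/1.96)², which is our reading of the description in the text.

```python
def chi2_measurement(mu_th, mu_exp, dmu_th, dmu_exp):
    """Pearson-style chi^2 for a measurement (Equation 1).

    The theoretical and experimental uncertainties are added in
    quadrature, which assumes they are independent.
    """
    return (mu_th - mu_exp) ** 2 / (dmu_th ** 2 + dmu_exp ** 2)


def chi2_limit(mu_th, l_obs, l_exp):
    """Modified chi^2 for a 95% CL limit (assumed form of Equation 2).

    The excess of the observed over the expected limit is treated as a
    signal with uncertainty l_exp / 1.96; the theoretical uncertainty
    is neglected.
    """
    sigma = l_exp / 1.96
    return (l_obs - l_exp - mu_th) ** 2 / sigma ** 2


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from any analysis.
    print(chi2_measurement(mu_th=1.0, mu_exp=1.5, dmu_th=0.1, dmu_exp=0.3))
    print(chi2_limit(mu_th=0.0, l_obs=2.0, l_exp=1.0))
```

With no predicted signal, an observed limit equal to the expected one contributes zero, while an observed limit twice the expected one contributes 1.96² ≈ 3.84, which is the weak sensitivity of limits mentioned in the text.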
Combinations of results can be performed using this $\chi^2$ method, and the results of the 2015 fit to data [3] that constrained the parameters of the Madala hypothesis used this procedure. With this in mind, the newer search results can undergo the same treatment in order to understand whether or not the results are compatible with the 2015 fit result.

Compatibility checks
The fit result of the Madala boson mass in 2015 was found to be $m_H = 272^{+12}_{-9}$ GeV, with $\beta_g = 1.5 \pm 0.6$, as mentioned in section 1. Using this as a benchmark, one can use the statistical methodology described in section 2 to combine the di-Higgs and di-boson search results in Table 1 to try to understand whether the 2015 fit result is compatible with an updated dataset. These results all contain an interpretation that a heavy resonance H is decaying to a pair of Higgs bosons (in the di-Higgs case) or a pair of massive vector bosons (WW or ZZ, in the di-boson case). Since none of these results consider resonance masses lower than 260 GeV, the full parameter space considered for the Madala boson cannot be explored.
The results of a combination of the experimental data can be seen in Figure 3. On the vertical axis of each of these plots is a best fit value of cross section times BR for the associated search channel. On the horizontal axis the mass of the Madala boson is scanned. Bands have been drawn around the combined result, which represent a 1σ uncertainty in the result. Since we would expect differences in cross section for different center of mass energies, the results are separated into whether they come from Run 1 or Run 2. As can be seen, the combined result often deviates from the null hypothesis (i.e. that no resonance exists), and this most often happens in regions below $m_H = 300$ GeV. The region around $m_H = 272$ GeV shows an enhancement in cross section times BR in every case except the Run 2 H → ZZ result shown in Figure 3(d). By and large, these results are compatible with the 2015 fit result, and a more detailed study could provide us with a better constraint on the best fit value for $m_H$.
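One way such a combination could be formed at a single mass point is sketched below. It is an illustration under stated assumptions, not necessarily the procedure behind Figure 3: each search is assumed to contribute a limit-type χ² of the form (L_obs − L_exp − µ)² / (L_exp/1.96)², the contributions are summed, and the sum is minimised analytically with respect to the common cross section times BR, µ. The function name `combine_limits` and the input numbers are hypothetical.

```python
import math


def combine_limits(limits):
    """Combine several 95% CL limits at one mass point.

    `limits` is a list of (l_obs, l_exp) pairs in the same units
    (e.g. cross section times BR in pb). Each pair is treated as a
    Gaussian measurement of the signal, with central value
    l_obs - l_exp and uncertainty l_exp / 1.96.

    Returns (best_fit, sigma): the value of mu that minimises the
    summed chi^2 and its 1 sigma uncertainty (Delta chi^2 = 1).
    """
    weights = [(1.96 / l_exp) ** 2 for _, l_exp in limits]
    excesses = [l_obs - l_exp for l_obs, l_exp in limits]
    total_weight = sum(weights)
    best_fit = sum(w * x for w, x in zip(weights, excesses)) / total_weight
    sigma = 1.0 / math.sqrt(total_weight)
    return best_fit, sigma


if __name__ == "__main__":
    # Illustrative (observed, expected) limits; not real search results.
    toy_limits = [(3.0, 1.96), (1.0, 1.96)]
    best_fit, sigma = combine_limits(toy_limits)
    print(f"best fit = {best_fit:.3f} +/- {sigma:.3f}")
```

Because every term is quadratic in µ, the minimum is simply the inverse-variance weighted mean of the individual excesses, so no numerical minimiser is needed. Repeating the calculation for each scanned mass would give a curve with a 1σ band of the kind described above.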
As mentioned above in section 1, the initial driving force behind the investigation of the Madala boson is the Higgs $p_T$ spectrum. It is therefore also important to determine whether the measured Higgs $p_T$ spectra are compatible with the hypothesis.

To study the Higgs $p_T$, a set of Monte Carlo (MC) samples was made to reproduce its different components. The SM Higgs $p_T$ spectrum was separated into its different production mechanisms. The ggF spectrum was generated using the NNLOPS procedure [32], which is accurate to next-to-next-to-leading order (NNLO) in QCD. The associated production modes (vector boson fusion (VBF), Vh and tth, labelled together as Xh) were generated at next-to-leading order (NLO) using MG5_aMC@NLO [33]. These spectra are scaled to the cross sections provided by the LHC Higgs Cross Section Working Group [7] (from which the theoretical uncertainty also comes). The events are passed through an event selection identical to the fiducial selection used by the experimental collaborations. The dominant ggF prediction was further scaled by the experimentally measured signal strength for each decay mode, $\mu_{ggF}$.
Table 3: The measured µ values for tth production in multileptonic analysis channels. A combination is estimated as the error weighted mean of each quoted combined result; the error weighted mean is 1.92 ± 0.38.

The BSM prediction was considered to be the Madala hypothesis prediction of gg → H → hχχ through an effective vertex, as shown in Figure 1(a). This was generated using Pythia 8.2 [34], and scaled to the LHC Higgs Cross Section Working Group N$^3$LO ggF cross sections for a high mass Higgs-like scalar. The events were passed through the fiducial selections as in the case of the SM prediction. Since the Run 1 fit result had a best fit mass of $m_H = 272$ GeV with $m_\chi = 60$ GeV, the mass points considered for this study were $m_H = 270$ GeV and $m_\chi = 60$ GeV. The SM and BSM components were added together and then tested for compatibility with the data. A $\chi^2$ value was calculated for each bin per channel, as in Equation 1. The normalisation of the BSM spectrum is used to minimise the $\chi^2$, with the interpretation that it is scaled by the free parameter $\beta_g^2$. The results of this fit are shown in Table 2. The ATLAS Run 1 h → WW and ATLAS Run 2 h → γγ results are compatible with the 2015 best fit point of $\beta_g^2 = 2.25$. The CMS Run 1 h → WW result is not improved by the BSM hypothesis. For reference, the $p_T$ spectra at their best fit values are shown in Figure 4 for the two spectra which are improved by the BSM hypothesis.
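The normalisation fit described above can be illustrated with a short sketch. It assumes that the prediction in each bin is the SM yield plus a single scale factor (interpreted as β_g²) times the BSM yield, and that the per-bin uncertainty already combines the experimental and theoretical errors in quadrature. The function name and all numbers are hypothetical toy inputs, not values from the analysis.

```python
import math


def fit_bsm_normalisation(data, sm, bsm, sigma):
    """Fit the BSM scale factor in a binned spectrum by chi^2 minimisation.

    The per-bin model is sm[i] + scale * bsm[i]. Since the chi^2 is
    quadratic in `scale`, the minimum has a closed form.

    Returns (scale, scale_error, chi2_min).
    """
    numerator = sum(b * (d - s) / e ** 2
                    for d, s, b, e in zip(data, sm, bsm, sigma))
    denominator = sum((b / e) ** 2 for b, e in zip(bsm, sigma))
    scale = numerator / denominator
    scale_error = 1.0 / math.sqrt(denominator)  # from Delta chi^2 = 1
    chi2_min = sum(((d - s - scale * b) / e) ** 2
                   for d, s, b, e in zip(data, sm, bsm, sigma))
    return scale, scale_error, chi2_min


if __name__ == "__main__":
    # Toy spectrum in four p_T bins (arbitrary units, invented numbers).
    sm = [10.0, 6.0, 3.0, 1.0]
    bsm = [0.5, 1.0, 1.0, 0.5]
    data = [11.0, 8.0, 5.0, 2.0]   # constructed as sm + 2 * bsm
    sigma = [1.0, 1.0, 1.0, 1.0]
    scale, err, chi2_min = fit_bsm_normalisation(data, sm, bsm, sigma)
    print(f"scale = {scale:.2f} +/- {err:.2f}, chi2 = {chi2_min:.2f}")
```

In this toy case the data were built with a scale factor of 2, so the fit recovers that value with a vanishing χ². A fitted value can then be compared against a reference point such as β_g² = 2.25 using the returned uncertainty.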

The future of the Madala hypothesis
The Madala hypothesis is dependent on the availability of experimental results against which it can be tested. In 2015 a limited set of results was used, and its parameters were constrained using what was available at the time. The ATLAS and CMS collaborations, however, are always hard at work producing updated results with larger datasets. This short paper has compiled the results which were available before Moriond 2017, and shown that the newer results are compatible with the fit result obtained in 2015.
The results presented in this short paper are, however, not a complete re-fit of all available data. This task will be the focus of a future work, at a time when results are presented with the full 2015 and 2016 datasets. In addition to this, several aspects of the hypothesis have not been explored in this short paper. Most notably, the Madala hypothesis predicts that an enhanced