An Updated Mass–Radius Analysis of the 2017–2018 NICER Data Set of PSR J0030+0451

In 2019 the NICER collaboration published the first mass and radius inferred for PSR J0030+0451, thanks to NICER observations, and consequent constraints on the equation of state characterizing dense matter. Two independent analyses found a mass of ∼1.3–1.4 M ⊙ and a radius of ∼13 km. They also both found that the hot spots were all located on the same hemisphere, opposite to the observer, and that at least one of them had a significantly elongated shape. Here we reanalyze, in greater detail, the same NICER data set, incorporating the effects of an updated NICER response matrix and using an upgraded analysis framework. We expand the adopted models and also jointly analyze XMM-Newton data, which enables us to better constrain the fraction of observed counts coming from PSR J0030+0451. Adopting the same models used in previous publications, we find consistent results, although with more stringent inference requirements. We also find a multimodal structure in the posterior surface. This becomes crucial when XMM-Newton data is accounted for. Including the corresponding constraints disfavors the main solutions found previously, in favor of the new and more complex models. These have inferred masses and radii of ∼[1.4 M ⊙, 11.5 km] and ∼[1.7 M ⊙, 14.5 km], depending on the assumed model. They display configurations that do not require the two hot spots generating the observed X-rays to be on the same hemisphere, nor to show very elongated features, and point instead to the presence of temperature gradients and the need to account for them.


INTRODUCTION
The Neutron Star Interior Composition Explorer (NICER) is an instrument installed on the International Space Station to detect the soft thermal X-ray emission of rotation-powered millisecond pulsars (MSPs).
By modeling this emission, originating from the hot polar caps, we are able to infer neutron star (NS) masses and radii, hence constraining the equation of state (EoS) governing dense and cold matter.So far the NICER collaboration has released results for two sources: PSR J0030+0451 (Miller et al. 2019;Riley et al. 2019) and the high mass pulsar PSR J0740+6620 (Miller et al. 2021;Riley et al. 2021;Salmi et al. 2022).
The X-ray emission of the MSPs that NICER targets is characterised by pulsations generated by return currents heating up the NS surface at the magnetic poles.
Special and general relativistic effects on the propagation of radiation ensure that the phase-resolved spectrum of the pulsations carries information about the space-time surrounding the NS.By modeling all of the relevant relativistic effects, a technique known as Pulse Profile Modeling (PPM), it is possible to infer mass and radius (for a general introduction to PPM see Watts 2019;Bogdanov et al. 2019bBogdanov et al. , 2021)).As a by-product of the analysis, PPM also allows us to infer the properties (size, shape, location) of the hot emitting regions.
In this paper we revisit the first source for which NICER results were released, PSR J0030+0451, using the X-ray Pulse Simulation and Inference (X-PSI)1 package (Riley et al. 2023).Our analysis is an update to Riley et al. (2019, hereafter R19), since validated by Afle et al. (2023) who also used X-PSI; the independent analysis of Miller et al. (2019) used a different pipeline.
PSR J0030+0451 is an isolated MSP rotating at ∼ 205 Hz, located at a distance of ∼ 325 pc (Arzoumanian et al. 2018;Ding et al. 2023).The first X-PSI analysis of the NICER data set for PSR J0030+0451 (R19) found that several models for the size and shape of the hot polar caps could reproduce the data quite well.However, when the models were ranked according to their evidence, the preferred model was one in which emission originated from one small (circular) and one extended (arc-like) hot-spot, located in the same rotational hemisphere.An independent analysis of the same data by Miller et al. (2019), using a different parameter estimation code, found qualitatively similar spot shapes and locations on the star, although they used ovals instead of arcs.The shape and location of the hot emitting regions suggested the need for profound changes to the standard picture of the NSs's magnetic field, as they were incompatible with a classical centered dipole (Bilous et al. 2019;Chen et al. 2020;Kalapotharakos et al. 2021).
For the preferred emission pattern and viewing geometry, the mass and radius inferred by R19 for PSR J0030+0451 was 1.34 +0.15  −0.16 M ⊙ and 12.71 +1.14  −1.19 km, where the uncertainties, here and throughout this work, are specified at approximately the 16% and 84% quantiles in the 1D marginal posterior (for comparison, Miller  et al. 2019 inferred 1.44 +0.15  −0.14 M ⊙ and 13.02 +1.24 −1.06 km).The mass and radius posteriors have since been used in a variety of studies to constrain the EoS (see for example Raaijmakers et al. 2021;Tang et al. 2021;Biswas 2022;Sabatucci et al. 2022;Rutherford et al. 2023;Sun et al. 2023).The fact that PSR J0030+0451 has an inferred radius similar to that inferred for the ∼ 2.1 M ⊙ pulsar PSR J0740+6620 (Fonseca et al. 2021;Miller et al. 2021;Riley et al. 2021;Salmi et al. 2022), despite the much lower inferred mass, is particularly notable (Raaijmakers et al. 2021).The shift in radius as mass increases can be used to distinguish between EoS models (Drischler et al. 2021).
Improving the robustness of the NICER PPM results and updating them whenever possible is therefore crucial for advancing our understanding of the EoS.In a companion paper (Vinciguerra et al. 2023, hereafter V23a) parameter recovery simulations, tailored to PSR J0030+0451, were carried out to test the reliability of our results for complex hot spot models, and the sensitivity to stochastic fluctuations and inference setups.This study revealed some sensitivity of PPM posteriors to noise, analysis settings and random sampling processes.Their effects are interconnected: depending on the noise realisation, specific analysis settings may or may not be sufficient to assert a certain degree of stability in the results and, in general, computationally cheap analysis settings are more prone to variability due to random sampling processes.The other major finding of this work was the presence of a clearly multi-modal structure in the posterior.These posterior maxima are often considerably different in terms of associated probabilities, so they do not always appear as clear multimodal structures in the posterior plots.Sometimes they may be present as tails of the distributions, other times they may even be completely obliterated by the main mode.However their properties can be considerably different, and the application of independent constraints may completely shift their importance.
Due to limitations in the computational resources, the V23a simulation study focused on synthetic data sets based on only two different parameter vectors.This implies that some of its findings may not be general.In particular, comparing the results of V23a with the ones reported in R19, we note some differences that may be tied to the specific choice of parameter vectors.For example, a negligible difference in evidence between different models was found by V23a but not by R19.V23a also reported that different models could recover consistent mass and radius, given a specific synthetic data set; while R19 showed that when analysing the original NICER data set, presented in Bogdanov et al. (2019a), employing different models drastically changes the inferred mass and radius posterior distributions.In this paper we apply what we have learnt from V23a and revisit the PPM for PSR J0030+0451, making a number of improvements over the analysis presented in R19.
X-PSI has undergone a series of major updates since the analysis presented in R19: for example, the surface pattern model suite is more comprehensive, and it is now possible to perform joint analysis with data sets observed by multiple instruments.In R19, the inferred background was checked a posteriori against constraints derived from XMM-Newton observations (Bogdanov & Grindlay 2009).In this paper we include XMM-Newton constraints directly in the inference analysis, as was done for PSR J0740+6620 in Riley et al. (2021); Miller et al. (2021); Salmi et al. (2022).With more computational resources, we were also able to carry out a wider range and higher resolution study.This enables us to explore the robustness of the findings in more depth.Another improvement is to the instrument response model of NICER, which has been updated since the analysis presented in R19.What we present here is the analysis of a revised NICER data set (hereafter B19v006), derived from the one presented in Bogdanov et al. (2019a) and covering the same observations, but modified to be consistent with the latest NICER response matrix.This therefore becomes the baseline to which inference of PSR J0030+0451 data sets containing newer observations should eventually be compared.
Below, for our PPM inferences, we will consider four different models which describe PSR J0030+0451's system, adopting increasingly more complex surface patterns.With NICER-only analyses, we test the sensitivity of the identified posteriors to various assumptions and analysis settings.We also check the impact of random sampling processes, by repeating multiple times inferences identically set up.Given the various analyses initiated with NICER-only data, we decided to appoint four reference runs, one for each of the adopted models.These are used for the general discussion and have been performed with computationally expensive analysis settings; however, the robustness of our findings is also in this case not always guaranteed.We only performed one production run per model, when jointly analysing NICER and XMM-Newton data; in this case there is therefore no need for references.Since joint inferences require considerably higher computing resources, the analysis settings are, most of the time, less optimal compared to the NICER-only case.The corresponding posteriors are therefore not proven to be robust, yet.To help and guide the interpretation of our findings, more details and a wider context are given throughout the text that follows.
In Section 2 we describe the models adopted for this analysis and the most significant upgrades to X-PSI since the R19 analysis.In Section 3 we outline the most relevant properties and changes to the NICER and XMM-Newton data sets that are analyzed.The results of our analysis are then presented in Section 4 and discussed in Section 5. We present our final remarks in Section 6.

METHODOLOGY: X-PSI UPGRADES AND MAIN SETUPS
In this work, we use the package X-PSI to analyse the X-ray emission produced by PSR J0030+0451 and detected by NICER.NICER data are registered as events (counts) per PI (instrumental-energy) channel, where each event is characterised by a specific detection time.As described in R19, data collected over many (O(10 8 )) rotational cycles are folded over the spin period of the MSP of interest, in our case PSR J0030+0451, and are then binned into 32 phase bins.The data analyzed by X-PSI therefore have the form of number of events (counts) per instrumental channel and rotational phase bin.
X-PSI is a software package which performs parameter estimation by modeling the thermal emission generated at the NS surface and detected by NICER, following the methodology described in Bogdanov et al. (2019b); Riley et al. (2019); Bogdanov et al. (2021); Riley et al. (2021).Each hot spot on the NS surface is modeled by overlapping spherical caps (see Section 2.1 for more details).To account for the potential well and the shape of the NS, our analysis relies on relativistic ray-tracing described by the Oblate Schwarzschild plus Doppler approximation, introduced by Morsink et al. (2007); Al-Gendy & Morsink (2014).The final intensity at the observer is then calculated accounting for the effects of the NS atmosphere and the interstellar medium.Such a signal is generated for every parameter vector sampled by MultiNest (Feroz & Hobson 2008a;Feroz et al. 2009Feroz et al. , 2019)), the algorithm adopted within X-PSI to explore the model parameter space through nested sampling algorithm (Skilling 2004).This simulation process is essential for our inference analysis, which then compares these synthetic signals against the actual NICER data within a Bayesian framework.
In this work we use X-PSI versions v0.7 and v1 (for inference v0.7.3, v0.7.9 and v0.7.10 -which differ from one another only by upgrades and minor bug fixes that are not expected to affect the results -and v1.0.0 -v2.0.0 for post-processing).Compared to the version used in R19 (X-PSI v0.1) these X-PSI versions incorporate the possibility of multiple images, as described in Section 2.2.3 of Riley et al. (2021).Note that we do not expect this modification to significantly affect our results, since it is relevant for compactness values C = M R −1 > 0.284 (for brevity, we assume c = G = 1) and R19 found negligible posterior probability for compactness above ∼ 0.2 in all tested models (see Figure 19 of that paper).

X-PSI Models
A summary of the method underlying the analyses performed with X-PSI and its recent upgrades and modifications can be found in Section 2 of V23a.We expand on that overview by presenting below additional analysis specifications that are necessary to introduce the wider range of tests included in this work.

Atmosphere and Interstellar Medium
Throughout this work we assume that the hot spots of PSR J0030+0451 have a fully ionized and non-magnetic NSX hydrogen atmosphere (Ho & Lai 2001;Ho & Heinke 2009).As in Riley et al. (2021), Salmi et al. (2023) and V23a, the intensity of the radiation field is calculated by interpolating it from a table (extended compared to that used in R19), which expresses it for different values of effective temperature, surface gravity, photon energy and the cosine of emission angle calculated from the surface normal (for more details see Section 2.4.1 of R19).A detailed discussion of the implications of this choice (and possible alternatives) is presented in Salmi et al. (2023), specifically for PSR J0030+0451 and the data set analyzed here.For the first time, we test the potential impact of adding possible in-band emission from the NS surface, outside the hot spots.For this preliminary test, we still assume a fully ionized hydrogen atmosphere for the hot spots, but model the radiation generated from the remaining part of the surface as black-body emission, to reduce the corresponding computational cost.Even with this simplification, adding such a component can indeed considerably increase the computational resources required 2 .
The attenuating effects that the interstellar medium has at different energies are simulated with the same pre-computed tables, based on the tbnew model 3 , used in R19.As outlined in R19, we parameterize the effect of the interstellar medium with a single variable, the neutral hydrogen column density N H [cm −2 ].Other chemical abundances are then assumed from it, following Wilms et al. (2000).

Parametrizations of Hot Spots
To model the thermal emission of the hot spots in the NICER band, we use parameterized models that are motivated by theoretical studies of return currents and polar cap heating in rotation-powered pulsars (Harding & Muslimov 2001, 2011;Timokhin & Arons 2013;Kalapotharakos et al. 2014;Gralla et al. 2017;Lockhart et al. 2019;Kalapotharakos et al. 2021).In particular we define each hot spot with one or two overlapping spherical caps on the NS surface.One of these components is always emitting at a constant and uniform temperature; the other can also radiate (forming a hot spot with two temperatures) or mask the X-rays of the emitting component (widening the range of possible hot spot shapes to include arcs or rings).
X-PSI analyses of NICER data have always assumed the presence of two non-overlapping hot spots, motivated by their theoretical connection to the magnetic poles and the data structure of the pulse profile.We refer to these two hot spots as the primary and the secondary.Within the X-PSI framework we can define different models, given the set up just described.More details about the naming convention adopted for our 2 Setting what we consider an adequate number of cells to present such surface (sqrt num cells = 512), the core hours required for an ST-U inference run roughly quadrupled.For the ST+PST inference, we had to lower by a lot the number of cells (sqrt num cells = 64) and still also set the sampling efficiency to 0.8, to make the run affordable (about twice the core hours required with a similar run, excluding the emission from the remaining part of the NS surface). 3https://pulsar.sternwarte.uni-erlangen.de/wilms/research/tbabs/ models and a schematic representation of them is reported in Section 2.3.3 and Figure 1 of V23a.
In this work we adopt nested models that describe the surface patterns with increasing complexity: • ST-U: where each of the two hot spots is described by a single spherical cap.The primary is defined as the hot spot with lower colatitude with respect to the angular momentum vector (spin axis)4 ; • ST+PST: where the primary (ST) is described by a single spherical cap and the secondary (PST) by two components, one emitting and one masking (see Figure 2 of V23a).Within this model, for comparison we adopt both the prior used for analyses in R19 and the more Comprehensive Hot spot (CoH) prior described in Section 2.3.4 of V23a.The latter allows for the hot spot described by a single spherical cap to overlap with the masking component of the PST one; • ST+PDT: where the primary (ST) is described by a single spherical cap and the secondary (PDT) by two components, both emitting; • PDT-U: where each of the two hot spots are described by two emitting spherical caps.The primary is defined as the hot spot with lower colatitude.
The first two are described in detail in V23a; a more detailed explanation of the others is given below.R19 suggested that one of the hot spots was already well represented by our simplest description (i.e. by a circular component with uniform temperature).For this reason, starting from the simplest model here considered, ST-U, we first increase the complexity of the description of only one of the hot spots: first allowing the creation of different shapes with the ST+PST model, and then allowing two overlapping circular components to emit at different temperatures, with the ST+PDT model.Only the most complex model, PDT-U allows for both hot spots to be described with such complexity.In this project, for the first time within the NICER PPM analyses, we consider also the emission from the remaining portion of the NS surface, introducing the elsewhere temperature (T else ), which can be added to all models.
In Table 1, we briefly list all the parameters used in this work to describe the ST-U (+T else ) model.Within our naming convention, the -U, for unshared, signifies that the values of model parameters describing one hot spot are completely independent from the other 5 .Subscripts p and s are used to mark when a parameter refers to the primary or the secondary hot spot.The parameters in this table are in general common to all the models used in this paper; if two components are used to describe a hot spot, then the quantities below refer to either the only emitting spherical cap involved or -if both radiate -to the superseding one.The only exception is the phase ϕ, which refers to the masking component, if present.
When a hot spot is modeled by two spherical caps, we need to define a few more parameters.We divide them into two cases: PST (Protruding Single Temperature) if only one component is emitting and PDT (Protruding Double Temperature) if both of them are.We describe all these parameters in Table 2.A more detailed explanation of the adopted parametrization in models using  Note-Definition of the parameters that are common to all models adopted in this work.More detailed descriptions can be found in Section 2.3.3 of V23a.
† Hereafter also more simply referred to as radius.
PST and PDT hot spots can be found in Section 2.5.6 of R19.
In this work, unlike in R19 and V23a, we coherently incorporate XMM-Newton data into some of our analyses.XMM-Newton is an imaging X-ray telescope.For this reason it performs better in identifying the photons that are generated by a particular source.Introducing this data set, with associated background, and trying to simultaneously fit both NICER and XMM-Newton data, allows us to better evaluate which fraction of the registered events (for both data sets) actually come from PSR J0030+0451 and which are due to external contributions (see Section 4.2 of Riley et al. 2021 for more details).We describe below how we model uncertainties in the instrument responses of both NICER and XMM-Newton.
Modeling of the instrument response: When we analyze only the NICER data set B19v006, we use the same parametrization described in Section 2.2 of V23a and summarised below.To model uncertain- ties in the instrument response, we adopt a variable , where D is the pulsar distance in kpc and α XTI the energy independent factor that scales the reference response matrix, and hence the total effective area, of the NICER X-ray Timing Instrument (XTI).β represents the only combination of α XTI and D to which our analysis is sensitive.When both XMM-Newton and NICER data sets are used, we instead sample from α XTI , α MOS and D. As described in Table 1, α MOS represents the energy-independent scaling factor applied to the reference instrument responses of XMM-Newton two cameras: MOS1 and MOS2 (i.e.we assume α MOS = α MOS1 = α MOS2 , which may not actually be the case).We make this parametrization choice to describe the correlation between the NICER and XMM-Newton instrument responses as explained in see Section 2.4 of Riley et al. (2021).As in Salmi et al. (2022), in this work we adopt the compressed effective area prior (see Section 4.2 of Riley et al. 2021).
In this paper we chose to apply pretty conservative priors for the α parameters.These are comparable to what was used in R19 and more constraining compared to what was adopted for the headline results of Riley et al. (2021), but still broad enough to allow followup tests, adopting importance sampling, with more restricted priors, which could reflect more accurately the current understanding of NICER and XMM-Newton calibration uncertainties.

Priors and Settings
For this paper we use the same priors and settings introduced in Section 2.2 of V23a.These include a flat prior in the joint mass and radius parameter space, where hard boundaries are applied: the mass is constrained to be in the range ∈ (1.0, 3.0) M ⊙ ; while the radius must be less then 16 km.Similarly to R19; V23a, Salmi et al. (2022Salmi et al. ( , 2023) ) we also limit the compactness and the surface gravity.Differently from R19, we adopt the instrumental PI channels in the range [30, 300) and we consider isotropic priors (flat in cosine) for the inclination angle i and the colatitudes of the hot spots' centers θ p and θ s .Priors describing the additional emitting component in the ST+PDT and PDT-U models are constrained by the overlapping condition, which imposes overlaps between the superseding and the ceding spherical caps.These priors are therefore coupled to the value of the parameters describing shape and location of the superceding component.These dependencies are similarly implemented as for the ST+PST model, starting from the angular radius of the ceding region ζ c (see Section 2.5.6 of R19 for more details).When used, the temperature of the additional elsewhere component log 10 (T else /K) is sampled uniformly between the bounds (5.0, 6.5).In this work, as well as in V23a, we use both highresolution (HR) and low-resolution (LR) X-PSI settings (the specific parameters, that define such settings, and their values are presented in Section 2.3.1 of V23a).The latter allows us to run complex models, whilst saving significant computational resources.According to the results presented in V23a, no significant changes are expected in the results adopting the low-resolution settings.Below we test whether this hypothesis holds also in the case of the real NICER data.
2.1.4.MultiNest X-PSI needs to be coupled with a sampling program.In this work we use PyMultiNest (Buchner et al. 2014), a library that allows us to interface easily with Multi-Nest.MultiNest is a Bayesian inference technique and a program that targets the estimation of the evidence.Doing so, it explores the parameter space, and so allows for parameter estimation.More details are given in Section 2.4 of V23a; in Table 3 we summarize the most relevant settings for the analyses performed in this work.Our default starting MultiNest settings are: sampling efficiency 0.3; evidence tolerance 0.1; live points 1000 and multi-mode/mode-separation off (SE 0.3, ET 0.1, LP 1000, MM off).The sampling efficiency initially set is then modified within X-PSI to account for the effective unit hyper-cube volume of the prior space considered in the analysis.

Test Cases
The main goal of this paper is to establish a baseline for the analysis of the upcoming new PSR J0030+0451 data sets and their interpretation.It is supported by the simulations carried out in V23a.To achieve this goal we first set up a few exploratory runs, aiming to test the robustness of the solutions found by R19.We first try to reproduce those results, with the new set-up described in this Section.Then we investigate the effect of background constraints, by including the XMM-Newton data set in our inference runs.We look at the possible presence of multi-modal structures in the posterior surface, as well as conceivable shortcomings of the analysis (in light of what was found in V23a).
Given the scope of this paper, we decided to focus most of our tests on the two simplest models ST-U and ST+PST.The latter was identified as the preferred model in R19 (and associated with the headline mass-radius result); the former was found to be the simplest model that could represent the PSR J0030+0451 NICER data without showing clear structures in the residuals.With these two models we test robustness of the obtained inferred parameter values and their dependencies on random sampling processes and analysis settings.We also begin to explore the parameter space of the more and most complex two hot spot models: ST+PDT and PDT-U.Since V23a uncovered the presence of prominent multimodal structures in the posterior surface, for NICERonly analyses, we here perform one inference adopting  (Feroz & Hobson 2008b;Buchner 2021).No formal number of live points have been suggested, likely because an adequate number would depend on the problem at hand.Given the same settings, adopting the mode-separation modality slightly worsens the precision of the final results, as live points no longer move as efficiently.Lower numbers for sampling efficiency and evidence tolerance, and higher numbers of live points, imply higher accuracy in the calculation; however they also significantly increase the computational resources required.
the mode-separation variant with 10 4 live points for each model; these are our (NICER-only) reference runs.These settings are proven (see Section 4.1.2) to generate stable posteriors when the ST-U model is used; however this is not necessarily the case for the more complex models, for which a more detailed testing would have been computationally too expensive.For all four models, we also perform a preliminary, production run including XMM-Newton data in the analysis.We summarize all of the runs carried out, and their settings, in Table 4.

DATA SETS 3.1. NICER B19v006 Data Set
For this work, we use the same dataset as in the initial NICER analyses of PSR J0030+0451 (Miller et al. 2019;Riley et al. 2019), with data processing as described in Bogdanov et al. (2019a) (B19).The dataset contains 1.936 Ms of exposure time collected over the period 2017 July 24 to 2018 December 9. However we have recalibrated the gain using the nicerpi tool, updating the calibration to use the gain calibration file 20170601v006.The response matrices matched to this gain solution are in NICER CALDB file xti20200722.Compared to the inferences reported in R19 (which also included energy channels 25 − 30), analyses of this recalibrated data set need to be limited to energy channels ≥ 30, as the procedure adopted to create the updated B19v006 data set from the original B19 one allows for possible calibration errors at these excluded lowest channels.In Section 5.1 we demonstrate that the effects of removing these channels are minor.This is in agreement with the analysis of Miller et al. (2019), which consid-ered only channels ≥ 40 and found that this choice did not affect their results.

XMM Data Set
The XMM-Newton data of PSR J0030+0451 used in this analysis (first presented in Bogdanov & Grindlay 2009) are from two archival observations obtained on 2001 June 19 (ObsID 0112320101) and 2007 December 12 (ObsID 0502290101).We only consider the EPIC MOS1 and MOS2 'Full Frame' mode imaging observations; although they do not possess adequate sampling time to resolve pulsations from PSR J0030+0451 they provide reliable low background phase-averaged source spectra.The EPIC pn data acquired in 'Timing' mode, which provides imaging only in one direction of the detector, is not used due to the considerably larger uncertainties in the calibration of the instrument in this observing mode.The data reduction and extraction of the EPIC MOS data were carried out using the Science Analysis Software (SAS6 ) version xmmsas 20211130 0941.The event data were screened for instances of high background count rates and the recommended PATTERN (≤12 for MOS1/2) and FLAG (0) filters were applied.This resulted in 121.2 ks and 120.9 ks of total effective exposure for MOS1 and MOS2, respectively, from combining both observations.Collection of inference runs performed for this work and the corresponding properties and settings.The first column represents the name of the hot spot model adopted for the run (see Section 2.1.2and Section 2.5 of R19).When the conventional model name is followed by T else , the run also includes the modeling of the temperature of the rest of the MSP's surface, exterior to the two hot spots.The second column describes whether the run was performed with high (HR) or low (LR) resolution for some of the main X-PSI parameters.The four columns that follow represent the MultiNest settings as described in Table 3.The seventh column is used to flag inference runs where NICER and XMM-Newton data have been fitted at the same time; and the last column shows how many identical repetitions of the same run were performed to test the impact of random sampling processes.Reference runs for this study are highlighted in bold.* : Two of these three runs were resumed with sampling efficiency 0.8, as was done in R19; * * : These runs were performed adopting the updated CoH prior.

Model
In the following, we present the results of the inference runs described in Section 2.2 and summarized in Table 4.We consider each hot spot model presented in Section 2.1.2,going from the simplest to the most complex.For the simplest, ST-U and ST+PST, we start with parameter estimation analyses that allow an easier comparison with the findings of R19 and explore our sensitivity to settings and random sampling processes, as in V23a.The main data and post-processing routines necessary to reproduce our main results are available at Vinciguerra et al. (2023, the Zenodo link will be made public, and the files available, once the publication is accepted).
As estimating mass and radius of MSPs is the main science goal of NICER, in this section we focus in particular on the inferred posterior distributions of these two parameters.We also show posteriors for compactness, since our analysis is expected to be most sensitive to this particular combination of mass and radius7 .The corner plots reported in most of the figures that follow show the 1D and 2D posterior distributions of these three parameters.As in V23a, R19, Riley et al. (2021); Salmi et al. (2022), the colored band present in the 1D poste-rior plots shows the area enclosed within the ∼ 16% and ∼ 84% quantiles of the 1D marginalized posterior distributions (medians and corresponding lower and upper limits are written on top of each 1D posterior); while the contours in the 2D marginalized distributions represent the ∼ 68.3%, 95.5% and 99.7% credible regions.As explained in Section 4 of V23a, the X-PSI post-processing tools adopt the GetDist8 KDE to smooth the distribution of the posterior samples.In 2D posterior plots, this may introduce artificial gradients in the density, when sharp cut offs (more complex than a threshold on a single parameter) are present in the prior.This is for example the case for the 2D posterior density plot of radius and compactness (see also footnote 12 of V23a).
In this section, as well as in the discussion (Section 5), for simplicity in guiding our thoughts, we sometimes refer to a single sample, the maximum likelihood of the particular run or mode that we are considering.We use it as a reference point to describe some of the properties of such portion of the posterior.However, generalizations need to be carefully evaluated since a single sample cannot fully represent the whole range of possibilities spanned by a posterior mode9 .

Reproducing R19 Results
R19 identified the ST-U as the simplest model that could represent the original NICER data set of PSR J0030+0451 analyzed in that paper.The derived 68% credible interval for mass and radius, assuming ST-U, were respectively 1.09 +0.11  −0.07 M ⊙ and 10.44 +1.10 −0.86 km.For that inference, the MultiNest settings adopted to analyse PSR J0030+0451 were: SE 0.3, ET 0.001, LP 1000, MM off (hereafter we refer to these as the R19 MultiNest settings for ST-U) 10 , with the posteriors of the 3 tested runs, overlapping (see e.g. the posteriors of run 1 in the left panel of Figure 1).Within this new framework (which, compared to the analyses of R19, incorporates an updated NICER data set -adapted to the most recent NICER instrument response-as well as upgraded software), we therefore perform three identical and independent analyses using the same Multi-Nest settings.The obtained posterior distributions for mass, radius and compactness are reported in the left corner plot of Figure 1.The variability of the results inferred from these three runs suggests that, in contrast to what was reported in R19, in our current analysis framework (including all the changes previously described) these settings are insufficient to exhaustively explore the parameter space.For this reason, in the same plot, we also report posterior distributions for the SE 0.3, ET 0.1, LP 10 4 , MM on, HR run, which we consider effectively representative of the ST-U results in this new framework (see Figure 2).Although we notice a slight increase in the inferred median value of mass and radius as well as in the widths of their posteriors, they are in good agreement with the findings of R19.If instead of focusing on inference runs with 10 4 live points, we focus on the results derived here with MultiNest settings similar to those adopted in R19, the newly inferred radius and mass posterior distributions are narrower and display considerably larger medians (differences in radius can be ≳ 1 km).We consider the LP=10 4 run to be more robust since it explores the parameter space better and is more stable (see indeed the overlap between posteriors with LP ≥ 6 × 10 3 reported in Figure 2 and discussed in Section 4.1.2).
The right corner plot of Figure 1 displays the posterior distributions obtained with our default MultiNest settings (SE 0.3, ET 0.1, LP 10 3 , MM off).Not surprisingly, also in this case, where the accuracy requirement over the evidence estimate is lower, the results exhibit some variability.In addition, in the same plot, we report the findings obtained adopting the ST-U+T else , which includes the possibility that the whole surface of the NS could emit in the NICER sensitivity band.We find that adding this further element in the model produces results consistent with the other inference runs performed with the same settings.

Impact of Analysis Settings
The findings of V23a highlight the importance of MultiNest settings.For ST-U (computationally the cheapest of our models) we therefore tested the impact of adopting different values of sampling efficiency or evidence tolerance, starting from our default settings (based on the settings adopted for models more complex than ST-U in R19).The inferred posterior distributions are reported in the left corner plot of Figure 2. We report: in pink, the results for all the three runs with our default settings, and in brown those for the three inference runs, performed with ET 0.001, shown individually in the right and left plot respectively of Figure 1.
On the right corner plot, we demonstrate the importance of setting an adequate number of live points to exhaustively cover the model parameter space and derive a stable solution.Again, in pink, we represent parameter estimates for three inference runs adopting our default MultiNest settings.We then show posterior distributions obtained respectively with runs adopting 3k, 6k, and 10k live points, demonstrating the gradual convergence to slightly lower values of radii and larger uncertainties.These results demonstrate that in this setup (and having fixed the sampling efficiency to 0.3 and the evidence tolerance to 0.1) we need a minimum number ST-U posterior distributions (smoothed by GetDist KDEs) from 9 inference runs, for radius, compactness and mass.The corner plot on the left shows results for three inference runs with MultiNest settings identical to those adopted by R19, but obtained in the new X-PSI and data framework described in Sections 2, 3 (differences listed in Section 5.1).For all these runs we adopted the high-resolution X-PSI settings.For comparison, in solid yellow lines we also show the distributions found for run 1, ST-U model, in R19.In dashed black we present posterior distributions for the SE 0.3, ET 0.1, LP 10 4 , MM on run, which we consider to be the reference runs for ST-U model in this work.The corner plot on the right shows the results for three inference runs with our default MultiNest settings.We also report posterior distributions for the run allowing the whole surface of the MSP to emit (labeled as ST-U+T else ).Each of the two corner plots shows new posteriors from four inference runs which use different MultiNest settings, as reported in the legend (for definitions see Section 2.1.4).Curves in the representation of the 2-dimensional posteriors trace the 68% credible area.On top of the 1D posteriors, we report the 68% credible intervals (representing the area within the 16% and 84% quantiles in the 1D marginalized posterior) starting from the median of the distributions.These values, as well as the colored areas, refer to the two dashed black lines: the run enabling the mode-separation modality, in the left corner plot, and the run adopting the ST-U+T else model, in the right corner plot.
of live points somewhere in the range (3 − 6) × 10 3 to guarantee an adequate exploration of the model parameter space.
In both the corner plots, as a reference, we also plot the posterior distributions obtained by the inference run SE 0.3, ET 0.1, LP 10 4 , MM on, HR.Adopting the latter as a reference, caveat the low number statistics, these results hint at the following conclusions: • In this new framework, (3×) 10 3 live points are not enough to exhaustively explore the model parameter space; • About 1 in every 3 runs seems to give a significantly different median value for the radius posterior, both with our default MultiNest settings and with the variation on the evidence tolerance (ET 0.001); • Compared to when we adopt ET 0.001, our default MultiNest settings (with ET 0.1) seem to lead to a greater variation in the radius and mass medians (but smaller in the compactness); these medians are also in general further away from the values identified with the reference SE 0.3, ET 0.1, LP 10 4 , MM on, HR inference run; • However, our default settings seem to also recover wider posteriors compared to their ET 0.001 counterparts; wider posteriors, again compared to the ET 0.001 runs, are also recovered with the ST-U reference run; • Changing the value of sampling efficiency does not seem to significantly affect the median value of the posteriors; however, when lowering it to 0.1, the widths of the posterior distributions increase, approaching the values obtained if one adopts a higher number of live points; • Given the appearance of a bias towards higher radius values when 10 3 live points are used, multiple runs with this setting seem unlikely to deliver the same result as one would obtain from a single run with a higher number of live points.

Model Exploration
The different mass and radius values obtained for the sequence of nested surface pattern models analyzed in R19 already suggested the presence of multiple modes in the likelihood surface.In this work, we highlight and expand on these findings.We focus here on the posterior distributions obtained with the SE 0.3, ET 0.1, LP 10 4 , MM on, HR ST-U inference run.Here we find three modes in the posterior, with considerably different inclinations and hot spot locations (configuration corresponding to the main one reported in panel A of Figure 3).Interestingly, they show the same hot spot configuration modes as found V23a when analysing with the ST-U model a synthetic data set produced with the ST+PST model (see Figure 7 of V23a for a visual representation).As in that case, the two secondary solutions have similar likelihood and local evidence values, but they perform considerably worse than the main identified mode (the maximum likelihood geometry, for the equivalent run but with MM off, is reported in panel A of Figure 3).They also differ considerably from it in inferred mass and radius (means and standard deviations associated with them are reported in Table 5).
From the right corner plot of Figure 2, we can compare the mass, radius, and compactness posterior distributions of this inference run to those obtained adopting the same MultiNest settings but disabling the mode-separation.When we enable the mode-separation the posteriors are slightly narrower but otherwise they are very similar to each other: the 68% credible intervals for the PSR J0030+0451 radius, with and without the activation of the mode-separation, are respectively 10.53 +1.15  −0.89 km and 10.60 +1.28 −0.96 km; while for the mass they are 1.12 +0.13 −0.08 M ⊙ and 1.12 +0.15 −0.09 M ⊙ .While the latter lets the live points freely and more optimally explore the parameter space, the former, our reference run for ST-U NICER-only analyses, allows us to identify modes in the inferred posterior, even when not visible by eye.

Joint Analysis of NICER and XMM-Newton Data
In the left corner plot of Figure 4, we report the main findings from the joint NICER and XMM-Newton inference analysis.They refer to the SE 0.3, ET 0.1, LP 10 4 , MM on, HR run, performed with the setup described in Section 2. The configuration associated with the maximum likelihood sample from its posterior distribution is shown in panel C of Figure 3.
The 1D posteriors for mass, radius, and compactness all move to larger values.In particular, the distribution clearly reaches the radius upper boundary of 16 km.Because our analysis is particularly sensitive to the compactness, this radius upper limit is expected to indirectly constrain the mass posterior as well.The plot suggests that the peak of the radius posterior would correspond to even larger radii than allowed by our (physically motivated) prior.There is, however, an elongated tail of the posterior involving considerably lower radii.The posteriors belonging to this tail are also characterised by considerably lower masses (see the 2D posterior distribution of mass and radius), in closer agreement with what was found as the main mode when only the NICER data was analyzed.This tail in the radius posterior of the joint NICER and XMM-Newton inference corresponds to the secondary mode found by the sampler, whose configuration is reported in panel D of Figure 3.The average and standard deviation of this secondary peak in the posterior, as well as of the main peak, are reported in the last two columns of Table 5.Both modes show a compactness posterior distribution which peaks at considerably higher values compared to the results obtained when only NICER data are analyzed.Including the XMM-Newton data in the inference process limits the contribution of the background (see Figure 5), compared to what is inferred from NICER data alone.To offset the reduction in the unpulsed background, the inferred compactness increases, creating a larger unpulsed signal arising from the star (although with the opposite effect, the same logic was applied to explain the findings from joint NICER and XMM-Newton analyses also in Riley et al. 2021;Salmi et al. 2022 for PSR J0740+6620).The same trend emerges for both ST-U and ST+PST models.
The higher radius and compactness values, yield lower angular radii describing the two hot spot sizes and allow the hot spot at higher colatitude (closest to the pole) to move slightly closer to the equator.The colatitudes of the hot spots are however not tightly constrained and this leads to a visible bi-modal structure in the posterior of the parameters describing the hot spot properties (see Figure 6).This is due to the ambiguity of the primary and secondary labels associated with the hot spots 11 .Since the MSP radius inferred through the secondary mode peaks at values only marginally larger than the one found with NICER-only data, the hot spots' angular radii still decrease, but not as much as for the main posterior mode (see Figure 6).The configurations arising from the main mode of the joint NICER and XMM-Newton inference show temperatures and incli-11 For models which adopt the same complexity to describe both hot spots, we label as primary the hot spot with lower colatitude.This implies that ambiguity on the hot spot associated with a specific label arises if there is a significant posterior mass (note that this is not the same as the posterior of the pulsar mass parameter, or the mass posterior).for similar colatitude values for both the hot spots (this chance increases with broader posteriors).Req [km] 10.7 ± 1.0 15.5 ± 0.4 15.5 ± 0.4 14.9 ± 1.0 11.4 ± 1.1 Configuration panel A, Fig. 3   nation that are very similar to those found in the main mode when analysing NICER-only data.This is not the case for the secondary mode, which points to a more edge on configuration, with inclination compatible with zero.In this new setting, hot spots are located close to the equator and characterised by significantly lower temperatures.

Analysis of XMM-Newton Data Only
The XMM-Newton data of PSR J0030+0451 was first analyzed in Bogdanov & Grindlay (2009) in an attempt to extract information about the NS radius.In this analysis, which employed a frequentist approach, a fixed mass of M = 1.4 M ⊙ was assumed and just R was allowed to vary.Only two circular hot spots were considered with a configuration equivalent to the ST-U model.In addition, the pulse profile obtained from the XMM-Newton EPIC pn (used in Bogdanov & Grindlay 2009, but not in this work, see Section 3.2) instrument was Figure 3. Representation of the main geometrical modes found with the analyses presented in this study.The surface patterns are shown from the observer point of view (i.e. according to the inferred inclination of the specific sample considered), at phase zero of the data.Hot spots on the same (opposite) hemisphere compared to the observer are plotted with full (dimmed) colors and and a cross (circular) marker at the centre.The color of the hot spot components changes from blue to red from the highest to the lowest temperature of the considered surface pattern (with higher precision compared to the temperature reported in the legend).Black is used for masking components.In all cases, we show, as a reference point, the hot spot geometry associated with the maximum likelihood sample of that specific mode within the considered inference run.Although this single sample cannot capture the whole possible variations present in a mode, we hope in this way to provide a simplified but more concrete indication of the configurations belonging to the mode.fit in only two broad energy bands (0.3-0.7 keV and 0.7-2 keV).Due to the limited photon statistics of the XMM-Newton data, this analysis resulted in only an upper limit on the neutron star radius of R > 10.7 km (95% confidence) and very broad constraints on the spot locations and geometry, which are generally consistent with configurations C and D in Figure 3.
Using the pipeline and procedures outlined in this paper, analysis of only the XMM-Newton data yields only very weakly constrained hot spot properties and 68% credible intervals of R eq = 11.3 ± 2.5 km and mass M = 1.6 +0.4 −0.3 M ⊙ .These are broader than the values quoted in Bogdanov & Grindlay (2009), likely because some of the assumptions have been relaxed and no timing information has been used.They are also much wider than the posterior distributions derived from NICER data, as expected due to the much smaller XMM-Newton exposure time, effective area, time and energy resolution.For ST+PST, R19 reported credible intervals for mass and radius respectively of 1.34 +0.15  −0.16 M ⊙ and 12.71 +1.14  −1.19 km.These results were obtained from a run adopting the following MultiNest settings: SE 0.3 (0.8), ET 0.1, LP 10 3 , MM off, where the sampling efficiency was increased to 0.8 when resuming the analysis.In Figure 7, we report posterior distributions derived analysing B19v006 with the ST+PST model in the new inference framework (see Section 2 for more details).The green lines on the left corner plot show results for the runs that match as much as possible the analysis settings adopted in R19.We notice that there is some scatter among these posterior distributions.In particular the major outlier only finds the secondary mode of the posterior surface (see Sections 4.2.2 and 5 for context and discussion over its multi-modality); this resembles the mode found for ST-U and is therefore characterized by smaller radii and masses.The other two runs still present some scatter, suggesting that these MultiNest settings are now (given the various changes to channel choices, instrument response etc. compared to R19) inadequate to explore the model parameter space.Compared to the credible intervals presented in R19, the new posteriors show slightly higher values (likely due to the difference in channels, as the same trend was found in preliminary analyses and attributed for the ST-U model to the removal of data in channels 25 to 30) and similar uncertainties.The only HR run with SE 0.3-only is the inference run which identifies as its main mode, the secondary (ST-U-like) mode (1D and 2D posteriors for mass, radius and compactness are represented in Figure 7 by the green outlier); the computational cost of this analysis was 3-4 times the cost of the two other HR runs.

ST+PST
The same corner plot also shows the effect of lowering the resolution of X-PSI settings; the green curves show the only posteriors obtained with high resolution for models more complex than ST-U.The yellow lines represent distributions, derived with low resolution but with the other settings directly comparable to those obtained with high resolution.Note that the two outliers are produced with the same fixed MultiNest and Python seeds.Excluding the outlier (which is again connected to the identification of the secondary mode) the lowresolution runs seem to converge to slightly more stable solutions located at slightly smaller radii and masses and are characterised by marginally larger posteriors, when the mode favoured by the evidence12 is identified (see Spectrum of NICER data and its inferred composition, according to the maximum likelihood sample of our X-PSI analysis.Data are plotted with a dashed magenta line; the total signal expected is shown with a solid pink line; the background (BKG) that maximises the likelihood for the (background-maginalised) maximum likelihood posterior sample (MaxL, in the legend) is plotted with a purple solid line.The contribution of the primary hot spot, the secondary hot spot, and the background, again associated with the maximum likelihood sample, are reported on top of each other with pink-shaded regions, increasingly more intensely colored.The purple shades show from darker to lighter color the average background ± 1, 2 or 3 standard deviations, calculated over 200 randomly selected posterior samples from the equal weight MultiNest output file.The inference runs for this figure concern the joint NICER and XMM-Newton analysis, performed with the X-PSI PDT-U model.The complete figure set (8 images) is available in the online journal, for the ST-U, ST+PST, ST+PDT, PDT-U models for both NICER-only analyses (reference runs highlighted in bold in Table 4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table 4, corresponding to "yes" entries in the XMM-Newton column).
Section 4.2.2).The behaviour is different when only the secondary mode is explored by the sampler; comparing the two outliers in the 1D radius plot, indeed, the high-resolution run produces much narrower posteriors, peaking at significantly lower values.We use low resolution also for our new reference ST+PST run, obtained increasing the number of live points to 10 4 and enabling the mode-separation option of MultiNest.We report the corresponding posterior with a black dashed line in the same left corner plot of Figure 7. Compared to the results found with the other low-resolution runs (where the number of MultiNest live points was 10 3 ), these distributions slightly widen and move toward lower values of mass and radius.Similar widening of the posteriors, with increasing number of MultiNest live points, was also found in Riley et al. (2021) for PSR J0740+6620 and could indicate a broader true posterior distribution.The 68% credible intervals of mass and radius derived from our new reference inference are: 1.37 ± 0.17 M ⊙ and 13.11 ± 1.30 km (as reported on top of the 1D posteriors in the left corner plot of Figure 7), largely in agreement with the findings of R19.The hot spot configuration corresponding to the maximum likelihood sample of this analysis is reported in panel B of Figure 3.Our current tests do not guarantee that these analysis settings are sufficient to adequately explore the model parameter space.
In the right corner plot of Figure 7, we show posterior distributions derived from each of the low-resolution runs, reported with yellow lines in the left panel.We also report in ocher the posterior identified by Riley et al. (2019) for the ST+PST model.The outlier, depicted in the right corner plot with blue lines, is found by a replica of an inference run with our default MultiNest settings.Hence the different results can be attributed solely to the randomness present in the sampling process.The occurrence of such outlier, however, suggests that the number of live points chosen for these inferences is not sufficient to effectively explore the model parameter space.The black dashed curves represent the results obtained when introducing the elsewhere temperature to the ST+PST model and are presented in more detail in Section 4.2.2.4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table 4, corresponding to "yes" entries in the XMM-Newton column).Tables 1 and 2 show the meaning of the adopted parameter labels.The multi-modal structure of the posterior, described in Section 4.1.4,is here clearly visible.In particular the bi-modality in the 2D posteriors including the radius, shows the main mode (configuration corresponding to its maximum likelihood sample shown in Panel C of Figure 3) and a secondary one (configuration corresponding to its maximum likelihood sample shown in Panel D of Figure 3).For more general details about the plots in this figure set, see Figure 1.
The other three inference runs show posterior distributions almost perfectly overlapping, despite the differences in MultiNest settings (the yellow lines show the results obtained enabling the mode-separation option) and in the prior (the green curves outline distributions derived from runs adopting the updated ST+PST CoH prior).With this plot and the associated inference runs, we demonstrate that, in analysing the NICER data B19v006, introducing the updated ST+PST CoH prior leads to no significant difference in the inferred posterior distributions and evidence estimation.We can draw the same conclusion for the effect of adopting the modeseparation MultiNest option, even though limited to the specific case here considered.

Model Exploration
Our mode-separation inference runs highlight the multi-modal structure present in the posterior surface of this specific model, given the chosen B19v006 data set (although we expect all PSR J0030+0451 NICER data sets to exhibit similar behaviour).As briefly mentioned in the previous section, there are two prominent modes in the posterior, which differ in local log-evidence by a factor of ∼ 5.5.The maximum likelihood samples belonging to the two modes actually differ by a much larger value (>10 in log-likelihood), from which we can deduce that the prior space supporting the secondary mode is considerably larger than that supporting the main mode.This also explains why a relatively low number of live points can result in the identification of the secondary mode as the main one.An illustrative hot spot geometry associated with the main mode is rendered, taking as example the maximum likelihood sample, in panel B of Figure 3.The secondary mode resembles, also in the inferred background, the main mode identified with the ST-U model and reported in panel A of the same Figure.The omitting component is always associated with the smaller, closer-to-the-equator, hot spot (labeled as primary in panel A of Figure 3).The location and size of the masking element can vary significantly within the identified mode.Analysing the SE 0.3, ET 0.1, LP 10 3 , MM off, LR run that missed the main mode, but found this secondary one (whose mass, compactness and radius posterior are represented with blue lines in the right corner plot of Figure 7), we notice that, while the maximum likelihood sample of this run depicts the primary as a ring, the maximum posterior sample represents it as an almost circular hot spot, where the omitting component barely touches the emitting one.This spread in the PST hot spot characteristics suggests, as found in V23a, that we are only very weakly sensitive to the small details of the hot spot shapes.In this case, it seems that we are mostly sensitive to the location and the overall emitting area of the PST hot spot: there is indeed a clear correlation between the size of the masking spherical cap and its distance from the center of the emitting one.
Both primary and secondary modes deliver a very similar inferred background (see Figure 11 of V23a, for a comparison between the background associated with ST+PST and ST-U main solutions) and compactness values, despite the differences in the inferred mass and radius (comparable compactness values are also deduced with our reference ST-U analysis).In other aspects these two solutions are quite different from each other.In particular the difference in the inferred observer inclination, estimated by the two modes, yields changes in the hot spot location and sizes, such that they are still observable at the right phases (a more detailed discussion on the effect of different inclinations can be found in Section 5.2.1 of V23a).These considerations highlight the presence of significant, and sometimes non-trivial, degeneracies in our model parameter spaces.
In the right plot of Figure 7, we show the effect of allowing the rest of the surface of the star (the part not covered by the two hot spots) to emit black-body radiation characterised by a finite, homogeneous temperature (ST+PST+T else ).This inference also uses the updated CoH prior, however, as mentioned above, we do not expect this choice to have a significant impact on the final results.The credible intervals on top of the 1D plots refer to the results of this inference.The configuration found for this model, with these MultiNest and X-PSI settings, resembles that obtained with the ST-U model and represented, for the maximum likelihood sample of that inference run, in panel A of Figure 3.The mask is again associated with the hot spot being closer to the equator (labeled as primary in panel A) and changes its shape to rings and crescents (both maximum likelihood and posterior exhibit ring-like shapes for this secondary hot spot).
Allowing all of the surface of the MSP to emit introduces more uncertainty on the background and increases uncertainty on the inferred compactness.We find an almost flat posterior support for elsewhere temperatures below ∼ 10 5.6 K, likely due to the negligible contribution of black-body radiation of such low temperatures within NICER band (for comparison the temperature of the two emitting components are T p ∼ T s ∼ 10 6.1 K).This additional component can explain part of the unpulsed emission that is otherwise mostly attributed to the background and to higher compactness values (Riley et al. 2021;Salmi et al. 2022).This degeneracy between these three modeled components (T else , background and compactness), and between T else and the hydrogen column density (whose 68% credible interval has indeed increased and widen, compared to the runs without T else that identified as main mode the ST-U-like mode), generates the aforementioned uncertainties, which are e.g.visible comparing the blue with the black dashed lines in the right corner plot of Figure 7.The slight cut in high compactness values, in combination with the increased uncertainties, yields a posterior that peaks at larger radii (compared to the other runs which identi- The green curves represent the posterior distributions inferred from the analyses that adopt high resolution (HR in the legend); the yellow ones are derived from low X-PSI resolution setting (our default for models more complex than ST-U).These latter have slightly different MultiNest settings, as marked with settings in parenthesis, but show very similar behaviours and are individually plotted on the right panel.The two outliers (one at HR, green, and one at LR, yellow) have been obtained fixing MultiNest and Python random seeds to the same values.With black dash lines, we also report the posterior distributions obtained with the SE 0.3, ET 0.1, LP 10 4 , MM on run; numbers on top of the 1D posteriors, as well as shaded areas refer to this inference.On the right panel, instead, we show new results for 5 ST+PST inference runs, with low X-PSI resolution, adopting our default MultiNest settings, the only exception being the second run from the top of the list, for which we enabled the mode-separation.Similar MultiNest settings have also been adopted in R19 (although higher X-PSI settings were in that case used).The ST+PST posteriors found in R19 are here shown in brown for comparison.This plot tests: the variability due to the randomness of the sampling process, given the specific analysis settings; the effect of updating the model prior, to allow the primary hot spot to overlap with the masked part of the secondary (CoH, runs marked in the legend with * ); and the effect of introducing emission from all the MSP surface (see run including T else in the legend).The black dashed lines represent the posterior distributions obtained when allowing the entire surface of the MSP to emit.Shaded areas and numbers on top of the 1D posterior distribution refer to the credible regions and intervals of this inference.See caption of Figure 1 for further details.* : runs adopting the updated CoH ST+PST prior, as explained in Section 2.3.4 of Vinciguerra et al. (2023).
fied the ST-U like mode as the main mode).Overall, the most prominent change generated by the elsewhere temperature is the broadening of the posteriors (in addition to doubling the computational time).

Joint Analysis of NICER and XMM-Newton Data
We report the mass, compactness, and radius posterior distributions obtained adopting the ST+PST model in a joint NICER and XMM-Newton inference analysis in the right corner plot of Figure 4.These results were obtained by a SE 0.8, ET 0.1, LP 10 3 , MM off, LR run in terms of X-PSI settings.
As is immediately clear, comparing the two corner plots of Figure 4, this ST+PST inference run leads to radius, compactness, and mass posterior distributions very similar to those derived with the ST-U model.The radius posterior is again truncated close to its peak by our prior cut at 16 km.This impacts the inferred mass distribution which appears to be then driven by constraints on the compactness.The similarities are not confined to these parameters, indeed the inferred hot spot geometries are also almost identical (as seen in the online set of Figure 6), and therefore resemble the configuration depicted in Panel C of Figure 3.As shown in Panel C, both hot spots have small sizes and sim-ilar temperatures.Small sizes lead to (type I, as explained in R19) degeneracies, when we assume complex hot spot geometries (i.e. when the hot spot is described by two components, as it is the case for our PST hot spot).In practice, the inferred small sizes of the hot spots imply a small sensitivity to their shape.Therefore this yields weak constraints on the omitting component, which, given the similarities between the two hot spots, can be assigned to either of them.Since in this case we define the PST hot spot as secondary, this ambiguity gives rise once again to visible bimodality in the posterior distributions of the parameters describing the properties of the two hot spots (see again the online set of Figure 6).
As described in Section 4.1.4,to jointly explain both the phase-resolved NICER data and the XMM-Newton data, it is necessary to increase the contribution of PSR J0030+0451 to the total counts.An increase of the unpulsed emission from the MSP can be produced by higher compactness values, as inferred here.
The similarities in the mass and radius posteriors with those obtained with the ST-U model, extend to the tails of these distributions.They thus hint at the presence of a secondary mode, again characterised by lower radius and mass values (respectively ∼ 11 km, and ∼ 1.4 M ⊙ ).The samples belonging to these tails display again (see panel D of Figure 3, for a qualitative representation) higher inclination, with both hot spots increasing slightly in size and migrating toward the equator, compared to the main mode.Also in this case, we find configurations in which the location of the ST and PST hot spots alternate.Depending on the size of the omitting component (which is not well constrained), the emitting region masked by it may slightly increase its size, to compensate for the otherwise decreased emission.This spread over the possible properties of the omitting region highlights again our weak sensitivity to the details of hot spot shapes.

Model Exploration
In Figure 8, we show the posterior distributions inferred for mass, radius, and compactness, adopting the ST+PDT model (a more complex model than the ST+PST model used for the headline result in R19, that was not explored in that paper).The left corner plot refers to the results derived with the SE 0.8, ET 0.1, LP 10 4 , MM on, LR run, analysing only the B19v006 NICER data set.The inferred 68% credible intervals for mass and radius are 1.20 +0.14 −0.11 M ⊙ and 11.16 +0.90 −0.80 km.This model therefore identifies quite a narrow peak in radius.The associated hot spot configuration and temperatures resemble the one reported in panel E of Figure 3.The bulk of the inferred posteriors describe hot spot patterns, mass, and radii similar to those identified as secondary mode in the joint analysis of NICER and XMM-Newton data sets with the ST-U model13 .We find considerable differences, however, in the two temperatures associated with the PDT hot spot, compared to the temperatures associated with ST emitting regions derived with the combined NICER and XMM-Newton analyses.Similar emission to the log 10 (T s /K) ∼ 6 ST hot spot in the ST-U model, is often modeled here, as a PDT hot spot, with one component slightly colder (∼ 10 5.9−6.0K) and larger and one tiny but very hot (∼ 10 6.3−6.4K).The peak of the posterior identifies the ceding region with the small hot spherical cap, however, there is some posterior mass that also associates it with the cold and more extended component.This bimodality is also visible, in the 95% credible area contours, in the 2D posterior distributions of the parameters describing the properties of the secondary hot spot (see Figure set 6 in the online journal).Unlike the results derived from ST-U and ST+PST models in joint NICER and XMM-Newton analyses, the posterior distributions identified with ST+PDT show no bimodality generated by label ambiguity.This means that there is a significant preference for the location of the PDT hot spot; i.e. the data are better represented if one specific emitting region (the first visible, in our representation of the data, see Figure 10) is described by two components with different temperatures.
With the SE 0.8, ET 0.1, LP 10 4 , MM on, LR run, we also find a secondary mode.This mode shows a maximum likelihood that differs from that of the main mode by ∼ 4 in log.The differences in local evidences are slightly larger, ∼ 6 in log.The secondary mode identifies configurations resembling (with some variation) the main mode identified with the ST-U model when analysing NICER-only data.In this case the dual temperature hot spot may be associated with either of the two emitting regions.The temperature of the superseding component of the PDT hot spot remains of the same order as found with the ST-U model log 10 (T s /K) ∼ 6.1 and is similar to the temperature associated with the ST hot spot.The ceding component is considerably colder log 10 (T c,s /K) ∼ 5.2 − 5.6.The resemblance to the ST-U NICER-only case also includes the mass and radius posteriors, which cluster around ∼ 1.1 M ⊙ and ∼ 10.5 km, slightly lower than the values for the ST+PDT main mode.
In Section 5.4, we will elaborate on the adequacy of the coverage of the model parameter space during the inference analyses presented in this Section.

Joint Analysis of NICER and XMM-Newton Data
The effect of introducing XMM-Newton data into the inference process, when adopting the ST+PDT model, is shown in the right corner plot of Figure 8.The reported posterior distributions were derived from a SE 0.8, ET 0.1, LP 10 3 , MM off, LR run.In panel E of Figure 3, we show the hot spot geometry of the identified maximum likelihood sample.
For the first time, coherently adding the XMM-Newton data set in the inference process does not yield solutions that are visibly different from those obtained analysing only NICER data.As with other models, the introduction of the XMM-Newton data constrains the background to slightly lower values than otherwise inferred.In this case, we also see an increase in the compactness, although the difference is significantly smaller when compared to the other models.The properties of the pulsed emission are then recovered by increasing both mass (whose 68% credible interval is now 1.40 +0.13  −0.12 M ⊙ ) and radius (whose 68% credible interval is now 11.71 +0.88  −0.83 km).Indeed for both of these parameters the posterior mass at higher values has considerably increased, while the posterior volume for the lowest values has significantly reduced, compared to analysis of the NICER-only data.
Looking at the posterior distributions of the hot spot parameters (see Figure set 6 in the online journal ), only the configuration assigning very high temperature to the superseding component of the PDT hot spot, has been identified by this joint NICER and XMM-Newton analysis (instead of the predominant alternative found with NICER-only data, where the tiny hot spherical cap was the ceding component and hence partially hidden by a cooler and larger component).There is also a significant anti-correlation between the sizes of the hot spots (particularly of the ST and the superseding component of the PDT hot spot) and inferred mass and radius of PSR J0030+0451.Most likely, this correlation can be explained by the necessity of a similar emitting area, which requires a lower angular radius of the hot spot if the MSP has a larger radius.It is not clear if this is a consequence of the introduction of the XMM-Newton data, or if it is caused by the low number of live points adopted to explore the large ST+PDT parameter space.Indeed, as also reported in Table 4, the ST+PDT joint NICER and XMM-Newton analysis was performed with a considerably lower number of live points (10 3 ), compared to the corresponding NICER-only inference.Similarly to the NICER-only inference, the uncertainty over the colatitude of the ST hot spot, allows it to be located on either the same or the opposite hemisphere as the observer.In this joint analysis, two distinct peaks are present in the 1D colatitude posterior; however this could again be due to the relatively low number of live points enabled to explore the parameter space.

Model Exploration
We analyse the revised NICER PSR J0030+0451 data set with a SE 0.8, ET 0.1, LP 10 4 , MM on, LR inference run.In the left corner plot of Figure 9, we report the radius, compactness, and mass posterior distributions derived adopting the X-PSI PDT-U model.We find relatively wide posteriors and hence large 68% cred-ible intervals; the estimated mass and radius are respectively 1.41 +0.20  −0.19 M ⊙ and 13.12 +1.35 −1.21 km.Uncertainties and values resemble those inferred with the ST+PST model, when XMM-Newton data are not included.
From the posterior distributions of the parameters describing the geometry and hot spot properties of the system (see corresponding corner plot, Figure set 6 in the online journal), we note the presence of three distinct modes.These were also identified through the modeseparation MultiNest option.The main mode is characterised by a hot spot configuration similar to that reported in Panel F of Figure 3.The solutions with the highest likelihood values from this inference run belong to this mode and have relatively low mass 1−1.3 M ⊙ ) and quite high radius (≳ 15 km), populating most of the lower tail of the observed 1D compactness posterior distribution.However, this mode extends to and dominates also other parts of the parameter space, spanning a relatively wide range in compactness.This spread includes considerably smaller radii and significantly higher masses.
Within this mode, both hot spots are characterised by a cold and large ceding component and a hot, small superseding component.The secondary hot spot (as for all the X-PSI models ending with -U, this refers to the hot spot with higher colatitude) typically coincides with the surface area generating the first emission peak recorded in the NICER data, according to our representation (see Figure 10).The secondary hot spot is characterised by both the hottest ceding (with average and standard deviation log 10 (T c,s /K) ∼ 5.86 ± 0.04 vs log 10 (T c,p /K) ∼ 5.7 ± 0.1 of the primary) and superseding component (with average and standard deviation log 10 (T s /K) ∼ 6.23 ± 0.03 vs log 10 (T p /K) ∼ 6.08 ± 0.03 of the primary).In contrast to what was found for the models analyzed in R19, both hot spots are typically located in the hemisphere directly facing the observer (as shown panel F of Figure 3).There is however some spread on their exact location and in the inferred inclination (average and standard deviation of cos(i) ∼ 0.3 ± 0.2); in particular if the observer inclination compared to the rotation axis approaches ∼ 0 deg, the hot spots tend to move towards the equator, to maintain the observed pulsed emission.Slight variations within this main mode also dominate the tails seen in the 2D posterior distributions reported in the left corner plot of Figure 9.
Given the relatively similar colatitude values inferred for the two hot spots and their uncertainties, it is natural to expect a bimodality due to the ambiguity of the primary and secondary labels.The posterior distributions of the parameters describing the hot spot properties clearly show this feature.The secondary mode shows again, for both hot spots, a colder and larger ceding component and a hot, very small superseding component.The temperature ranges characterising the hot spot components are similar to the main mode, but the warmer hot spot is now labeled as the primary.Still, the warmer hot spot is responsible for the first emission peak, given the definition of phase zero in the data set that we use.However this secondary mode, formally identified also thanks to the mode-separation MultiNest option, additionally displays slightly different overall features.Both hot spots are now more typically located in the hemisphere opposite to the observer, whose inclination compared to the rotation axis is now favoured at slightly higher angles (with average and standard deviation of cos(i) ∼ 0.2 ± 0.1).
The average and standard deviation of mass and radius posterior samples associated with this secondary mode are 1.46 ± 0.16 M ⊙ and 12.6 ± 0.9 km, and therefore populate the main posterior peak of the compactness.Although these values do not significantly differ compared to the one inferred from the main mode, in this case these are also representative of the highest likelihood posterior samples.
The maximum likelihood associated with samples belonging to the secondary and the main mode respectively differ by only 2 in units of log-likelihood, while the difference in local evidence amounts to ∼ 3 units in log.The background associated with these main two modes displays some variability, but it is considerably smaller than that inferred for the ST-U and ST+PST solutions.
Our inference run also identified a third mode, which is visible in some of the 2D posterior distributions corresponding to the parameters describing the hot spot properties.Its contribution is however marginal, as confirmed by the best likelihood values associated with this mode.They are worse by about ∼ 9 units in log compared to those associated with the main mode (the difference in local evidence is about ∼ 8 units in log).This third mode is the PDT-U representation of the main ST-U mode that was found when analysing only NICER data.Similarly to that case, the inferred posterior samples cluster around mass and radius values of 1.15 ± 0.10 M ⊙ and 10.7±0.9 km, expressed as average values with standard deviation uncertainties.As expected given this resemblance, the background associated with samples belonging to this mode is considerably higher than that inferred from the other two modes.
The overall posterior distributions of this third mode feature hot spot configurations similar to the main ST-U solutions and their corresponding solution within the ST+PST model parameter space (in that context, describing the secondary mode).As in the latter case, the primary, and now also the secondary, exhibits some variation in terms of the relation between the two components, in particular in the characterisation of the cold ceding component (average and standard deviation of log 10 (T c,p/s /K) ∼ 5.5 ± 0.3 for both hot spots).Focusing instead on the maximum likelihood solutions belonging to this mode, the complexity introduced to describe the hot spot leads to the presence of a very small (ζ s ∼ 0.04 rad) and hot (log 10 (T c,s /K) ∼ 6.6) ceding component, almost completely masked by the superseding one.

Joint Analysis of NICER and XMM-Newton Data
In the right corner plot of Figure 9, we show the posterior distributions of radius, compactness, and mass inferred by the joint analysis of NICER and XMM-Newton data sets.These results were obtained through the SE 0.8, ET 0.1, LP 10 3 , MM off, LR run.
Again the main effect of introducing XMM-Newton data in the analysis consists in increasing the fraction of photons recorded by NICER, assigned to the emission from PSR J0030+0451.Consequently, posterior samples that are associated with high background are here considerably downplayed, and the compactness distribution shifts towards slightly higher values.In this case, however, the identified solutions that can reproduce both data sets are qualitatively similar to those obtained analyzing only NICER data; more specifically belonging to the main and secondary modes of the NICER-only run.The same two modes originate from the bimodality structure clearly visible in some of the posterior distributions describing the properties of the two hot spots (see Figure set 6 in the online journal).Samples associated with the least prominent mode of the NICER-only inference, on the other hand, are unable to represent the data given the additional information provided by XMM-Newton.
In Panel F of Figure 3, we show, as example, the hot spot configuration associated with the maximum likelihood sample of the joint NICER and XMM-Newton inference run.The geometry of the system is qualitatively representative of the main mode of the posterior, which resembles the main mode inferred with only NICER data.The maximum likelihood samples reach quite high values for the radius, typically ≳ 15 km.However, now the associated masses are considerably larger, and cluster around 1.2 − 1.5 M ⊙ .The secondary mode (whose qualitative characteristics are described in Section 4.4.1) also migrates towards higher masses, once XMM-Newton data are included in the analysis.The maximum likelihood samples belonging to this mode show masses around ∼ (1.7 − 2.0) M ⊙ ; the corresponding radius also shifts toward higher values, in the range of ∼ (13 − 16) km.
Given the relatively high values of radius associated with both modes, we decided to inspect the solutions that form the posterior mass for equatorial radii < 12.5 km.The samples populating this part of the parameter space exhibit characteristics belonging to both modes.At least one of the hot spots (usually the primary) is often located in the hemisphere facing the observer, behaviour associated with the main mode.However, the observer inclination with respect to the spin axis is quite large, a characteristic more often associated with the secondary mode.
Medians and 68% credible intervals estimated for mass and radius according to the joint analysis of NICER and XMM-Newton data sets are 1.70 +0.18  −0.19 M ⊙ and 14.44 +0.88  −1.05 km respectively.The constraint coming from XMM-Newton data shifts the compactness, mass, and radius values toward slightly higher distribution.

DISCUSSION
In this Section we discuss and contextualize the results presented in Section 4.

Comparison with R19 Results
The results outlined in Sections 4.1.1 and 4.2.1 are in good agreement with the analyses presented in R19, when using the same ST-U and ST+PST models.A few small changes arise due to the differences in set up compared to that adopted in R19: • Instrument response: for this study we adopt the latest general NICER instrument response (the one used for the analysis of PSR J0740+6620 in Riley et al. 2021;Salmi et al. 2022Salmi et al. , 2023)); • Data sets: as presented in Section 3, the data set analyzed here, B19V006, has been derived from the same event list defined in Bogdanov et al. (2019a), but accounting for the new instrument response; • NICER channel range used: in this work [30:300), in R19 [25:300); • Priors on cos(i) and colatitudes: here isotropic and thus sampled from a uniform prior in cosine, in R19 uniformly sampled in angle; • Different modeling of instrument uncertainties: as outlined in Section 2.1.2,compared to this study, R19 use two additional parameters to model uncertainties in the instrument response, dealing with the Crab calibration (Ludlam et al. 2018); • Updates to the X-PSI pipeline: in this analysis we use X-PSI v0.7 and v1.0, whereas R19 used v0.1.
For full details of changes see the CHANGELOG file in the X-PSI GitHub repository14 ; • Settings: both in relation to the X-PSI pipeline and MultiNest (see in particular differences in live points), as discussed in Sections 2.1.3and 2.1.4.
One particularly relevant change, observed in the results, concerns the widening of the credible intervals associated with some of the model parameters, including the MSP mass and radius.We see this increase for both ST-U and ST+PST models.For the least computationally expensive model, ST-U, we investigated separately the effects of the different elements listed above.In Table 6, we show the 68% credible intervals for mass and radius starting from what was reported in R19 and ending with our reference results.In between we show the results of preliminary analyses that consequently implement the following changes15 , starting with the analysis of the old B19 data set and original response matrix, as used in R19, and finishing with the inference presented in the work for ST-U NICER-only reference cases.
test 1: new version of X-PSI, new modeling of uncertainties on instrument response, new priors; test 2: the effect of different MultiNest settings and in particular of increasing the number of live points; test 3: the effect of changing the lowest considered channel from 25 to 30.
Only the columns marked 'reference' show results for the new data sets, analyzed with the corresponding updated NICER response matrix.
Adopting the new X-PSI framework, including the updated modeling of instrument uncertainties and priors, has the effect of shrinking the credible intervals when the same MultiNest settings are used.This can be explained by the fact that there are fewer parameters associated with the model.This reduces the space of possible solutions and hence the range of radii associated with them.
Increasing the live points has the effect of increasing (asymmetrically) the credible interval.Although this was already pointed out for PSR J0740+6620 in Riley et al. (2021); Salmi et al. (2022Salmi et al. ( , 2023)), for PSR J0030+0451 this behaviour seems at odds with the findings of R19.There the authors performed three inference runs with a thousand live points.All of them generated well-overlapping contours, leading to the conclusion that a higher number of live points would not have influenced the results (unless other modes, with much smaller width but much higher likelihood were also present).This seems not to be the case in the new framework for the ST-U model.Similarly, Afle et al. (2023) tested the dependence of the ST+PST model on the number of live points.When they increased the number of live points from 1000 to 4000, they found only a very small increase in the size of the credible regions (for instance the 1 σ radius interval increased by 0.1 km).
Reducing the range of channels at low nominal energies also slightly widens the credible intervals.This highlights the important role played by events in the low energy tail of NICER's sensitivity range, where the thermal emission from the hot spots is expected to to peak.Since with test 3, we are losing part of the information used previously, the widening is not surprising.
Moving from the B19 to the updated B19v006 data set and the new instrument response slightly reduces the credible intervals once all of the changes mentioned above have been incorporated in the analysis framework.They are however overall still marginally wider than the original values published in R19.
Comparing the R19 results for the ST+PST model and the credible intervals of our new reference inference analysis, we find similar results, for likely similar reasons (see also Afle et al. 2023).
In general, to obtain robust results in the new analysis framework and with the updated B19v006 data set, we require more computationally expensive MultiNest settings (not all combinations have been tested, but e.g.adopting a larger number of live points seems necessary to obtain stable posteriors).In particular our various ST-U and ST+PST analyses (including the ones presented in Section 4.1.2and 4.2.1)demonstrate that, in presence of a multi-modal structure in the posterior, adopting an insufficient number of live points has not only the effect of underestimating the posterior widths, but it can also lead to significant biases.

Comparison with Findings from Simulations
The simulation tests performed in V23a have highlighted expected and unexpected dependencies and behaviours of the results of our analyses.In this Section we comment on the most relevant ones, relating them to our findings based on the analysis of the B19v006 updated NICER data set for PSR J0030+0451.

The Multi-modal Structure of our Posteriors
In V23a, the authors revealed the presence of multimodal structures in the posterior surface16 ; this is common to both real data and simulations.The presence of such structure is often clear only when the multimode/mode-separation setting of MultiNest is activated or after inspecting tails of the posterior distributions.Due to the differences in posterior mass between the various modes, only for some cases the multi-modal structure is prominent enough to be perceivable by eye in the marginalized posterior distributions.In both data and simulations, we clearly see very similar pulses and background generated by considerably different hot spot configurations, MSP and interstellar properties.Multiple models and corresponding parameter vectors can reproduce NICER data without visibly impacting the quality of the residuals.This seems to differ from the case of PSR J0740+6620, where, with the tight mass, distance and inclination priors, a single and stable mode was identified, for both the ST-U and the ST+PST model, that was well-constrained in both the MSP radius and in its hot spot geometry.
More and higher quality data, in addition to external constraints, could mitigate the challenges in the analysis of PSR J0030+0451's NICER data set posed by this multi-modal structure of the posterior surface.For example, as we see when we introduce XMM-Newton data, if we had a stringent and reliable background estimate for NICER, that may lead solutions to a different portion of the parameter space and be less susceptible to multi-modality.
The various modes characterising the posterior surfaces often display different widths of the credible intervals.Moreover, as shown in V23a, posterior widths are quite sensitive to the specific noise realisation.It is therefore complicated to predict in advance the evolution of credible intervals once further data and constraints are included in the analysis.If however, the main identified mode is unchanged, we expect, with some scatter, that uncertainties in the 2D posterior distribution of mass and radius will decrease with exposure time or the square root of it, depending on how strong the correlation between the two is in that specific analysis (see Lo et al. 2013;Psaltis et al. 2014;Miller & Lamb 2015).

Data versus Simulations
In addition to the multi-modal structure, and just as in some of the analyses based on the synthetic data and reported in V23a, analysis of actual NICER data also shows some variability in the results depending on the analysis settings.However, comparing the results outlined in this paper with the findings from the simulations of V23a, we also notice some differences.
• Here, as in R19, the inferred mass and radius change depending on the adopted hot spot model, while for the simulations, presented in V23a, it did not matter whether the ST-U or ST+PST model was used: the resulting posterior distributions found with these two models for mass, radius and compactness were in any case very similar to each other.
• Adopting the HR or LR X-PSI settings has a visible, although not significant, effect on the posterior distributions of mass and radius (as shown by comparing the green, HR, and the yellow, LR, lines in the left panel of Figure 7), difference that was not observed in the analyses of synthetic data presented in V23a.
• Comparing the evidences obtained for NICERonly analyses when adopting different models, we see a significant preference for more complex hot spot shapes.For the two cases tested in V23a, Table 6.68% credible intervals for mass and radius for different inference analyses adopting ST-U and ST+PST models, as expressed in the labels of the first row.In the labels R19 refers to the results reported in R19, reference to the reference results presented in this paper and test 1-3, to findings of preliminary analyses of the original data set used in R19.Test 1-3 check for, in order: the effect of the new X-PSI framework and priors (for more details, see Section 2.1.3);the effect of different MultiNest settings (for more details, see Section 2.1.4);and finally, the effect of excluding channels [25,30).Note that each change is implemented in addition to the previous one(s).
instead, this evidence difference reduced to a factor of a few in log-evidence, not enough to draw a decisive conclusion.
The origin of this different behaviour for real data versus simulations could be due to a combination of several factors.The analyzed synthetic data may not cover a representative part of the parameter space; indeed in V23a, only one parameter vector per model was considered.If significantly more computational resources were available, it would be possible to produce and analyse multiple synthetic data sets, which would better sample the posterior inferred from real data (instead of limiting injected parameters to maximum likelihood samples).The different nature of the considered data may also generate some discrepancies in our findings.The simulations are based on synthetic data generated assuming the same physical processes and simplifications later adopted in the inference analyses.However, real data may contain physics that is not accounted for in our models: for example, hot spots are unlikely to have such simple shapes and will very likely display some temperature gradients.Normally we assume that our data are not sufficiently good for these pieces of missing physics to affect the outcome of the analysis; but this is something that should be tested in more detail.

Model Comparison and the Effect of Including Constraints from XMM-Newton
We report the main results for each model considered in this study in Table 7.The first four rows of the table refer to NICER-only reference inferences (settings of reference runs are highlighted in bold in Table 3); while the last five show results for joint NICER and XMM-Newton analyses (see specific settings in Table 3, for entries with XMM-Newton data included).From left to right, we report in each column: (i) adopted model, 68% credible interval for (ii) mass and (iii) radius, (iv) natural logarithm of evidence, (v) natural logarithm of the maximum likelihood, (vi) only for joint NICER and XMM-Newton analyses -natural logarithm of the maximum likelihood within the NICER-only framework, (vii) only for NICER-only inferences -natural logarithm of the maximum likelihood within the joint NICER and XMM-Newton framework (assuming α XTI = α MOS1 = α MOS2 = 1, with β defining the distance) and (viii) core hours necessary to compute the run.The second row in the joint NICER and XMM-Newton block refers to the secondary ST-U mode, identified in the ST-U inference run, whose main results are summarized in the row just above.This mode is particularly interesting since a similar configuration is also found adopting the ST+PDT model and it resembles the main findings for PSR J0740+6620.In this case only, in the fourth column, we report the local evidence; for all other rows, we refer to the global evidence.
We highlight the implications of our findings, summarized in Table 7, in the following subsections.

NICER-only Analyses
The first four rows of Table 7 summarize the results of the NICER-only inference runs.As in R19, the evidence clearly disfavours the ST-U model, but does not display a clear preference for any of the more complex models.However, while the analysis settings for all NICER-only runs whose results are displayed in the table are the same, this does not necessarily imply the same accuracy in the evidence evaluation, since the dimension of the parameter space increases with the complexity of the model.Hence, how well each model explored the parameter space would ideally need to be checked in more detail to corroborate our findings.Note for example that the maximum likelihood value found for the ST+PST model is higher than the one identified in our reference ST+PDT inference run.Since the ST+PST model is approximately contained in the more complex ST+PDT model, this likely implies that our reference ST+PDT run has not adequately explored the parameter space 17 .This has implications for the estimated evidence, whose value is therefore likely underestimated.
Exclusively in terms of mass and radius posteriors, two main classes of solutions are present: M ∼ 1.1 − 1.2 M ⊙ and R eq ∼ 10.5 − 11.1 km; or M ∼ 1.4 M ⊙ and R eq ∼ 17 Another possibility could be that the prior mass (note that this is not the same as the posterior of the pulsar mass parameter, or the mass posterior) of this mode is too small for MultiNest to consider it relevant for the evidence estimation.This latter hypothesis is however unconvincing, since a detailed look into the posterior samples showed no solution similar to the one found in ST+PST and hence the total contribution of this mode has at least not been accurately estimated by MultiNest.

NICER-only
Model 13 km 18 .The former seems to be slightly more asymmetric and narrower than the latter (given the same adopted settings, which however again do not guarantee the same quality of the uncovered posterior).
The newly introduced ST+PDT and PDT-U models, which were not studied in R19, show significant posterior mass for configurations with at least one of the hot spots on the same hemisphere as the observer.These geometric configurations were not found in R19.

Joint NICER and XMM-Newton Analyses
We focus now on the entries of Table 7 corresponding to joint NICER and XMM-Newton analyses.
Evidence: unlike for the NICER-only analyses, for the joint inferences, the evidence indicates a significant preference for the PDT-U model.Both ST-U and ST+PST models are strongly disfavoured, with similar evidence values differing from the best one by factors of ∼ 40 in log units.The difference between PDT-U and ST+PDT is milder (about 7 − 8 18 These two groups of mass and radius values, do not however reflect the hot spot configurations associated with them, which show a greater variety. in log units), but still decisive.If we assume that including XMM-Newton data introduces unbiased constraints, the substantial shift in solution(s) driven by the additional XMM-Newton data, compared to those reported in R19, shows that the PSR J0030+0451 NICER results need to be interpreted carefully.This implies that, for this source, whose data are clearly characterised by a prominent multi-modal structure, further constraints could once again significantly modify the results of our inference.
Radii inferred with ST-U and ST+PST models: for the NICER and XMM-Newton joint analyses, the new main solutions identified with the ST-U and ST+PST models are very similar to each other, in mass and radius as well as in hot spot configurations.Some of the similarities, especially in terms of mass and radius, could be caused by the radius posteriors' tendency of reaching the upper limit of the prior and consequently constraining the mass through inferences on the compactness.We also highlight that such high radius values are difficult to reconcile with constraints posed by other observations, such as the inferred tidal  4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table 4, corresponding to "yes" entries in the XMM-Newton column).
Hot spot complexity: introducing the masking component that distinguishes the ST+PST model from the simplest ST-U one does not improve our ability to represent both NICER and XMM-Newton data.
Looking at the corresponding geometrical parameter posterior distributions (see Figure set 6 in the online journal) there is a clear degeneracy due to the small size of both hot spots (degeneracy of type I, as described in R19).This generates ambiguity in the association of the ST and PST labels with the hot spots (i.e.either hot spot could in principle be represented with or without the omitting component).On the other hand, when we introduce an emitting component via the ST+PDT model, its association with the hot spot which we see first in the data is strongly favoured, and the associated evidence and maximum likelihood values improve drastically.Introducing a second emitting component also for the other hot spots, as present in the PDT-U model, further improves the quality of our results.
Similarities in the tails of the posteriors: further examinations of the posterior distributions derived with the ST+PST model reveal that the configurations populating the tails at low radii are likely forming a secondary mode, as was the case for the joint NICER and XMM-Newton ST-U analysis.These secondary modes display similar geometries (being viewed almost from the equator and with almost antipodal hot spots), masses, and radii as the main ST+PDT mode.As we will discuss further in Section 5.3.5, this configuration is of particular interest because it could potentially better explain the gamma-ray emission associated with PSR J0030+0451; moreover, it would be similar to the geometry found for PSR J0740+6620 (which we may expect if the underlying magnetic field and its history are similar).

Transitioning from NICER-only to Joint NICER and XMM-Newton Analyses
In this Section we briefly outline the main effects that we observe in our findings, transitioning from analysing only the NICER data to jointly considering both NICER and XMM-Newton data sets.We however remind the reader that, as more extensively discussed in Section 5.4, inferences of NICER-only data were performed with better (and more computationally expensive) MultiNest settings, compared to the joint instrument analyses (the joint analyses generally required a larger number of core hours, given the same settings, to reach the termination condition set by the evidence tolerance).
We start by noting that possible differences in the results were already anticipated in R19 (see the discussion in Section 4.1.2 of that paper).Introducing XMM-Newton data has the general effect of decreasing the inferred background, leading to higher compactness values (following the same logic presented in Riley et al. 2021;Salmi et al. 2022).This is in general true for all models, which, although in different ways, always reach higher compactness by shifting the posteriors of both masses and radii toward higher values.However, the findings obtained adopting the ST-U and ST+PST models are the ones mostly affected by the change.Indeed the ST-U and ST+PST NICER-only analyses identify, as main modes, solutions that are only supported by higher background levels, compared to the broader background support for the ST+PDT and PDT-U main modes.
For the ST+PDT and PDT-U models, introducing XMM-Newton data into the analyses has the effect of reducing the background support for the main modes identified with NICER-only inferences, whilst still maintaining them as the most probable configurations (now characterised by slightly narrower posteriors, looking e.g. at radius and mass, as a consequence of the restricted background support of the same mode) 19 .
Instead, solutions found with ST-U and ST+PST in the NICER-only inferences are downplayed by the introduction of the constraints posed by XMM-Newton data, which jointly with NICER data can now be better represented by a completely different mode.As mentioned in Section 5.3.2, both ST-U and ST+PST models favour solutions with high radius values (R eq ∼ 15 − 16 km), and hot spot configurations resembling panel C of Figure 3, for the main bulk of the posteriors.
Introducing XMM-Newton data and constraints does not significantly affect the ratio between α XTI and the distance squared, or the β parameter in NICER-only analyses (representing the only distance and α XTI combination to which the analysis is sensitive), which remains between ∼ 8.4 − 8.9.
Transitioning to joint analyses has different effects on the inferred column density, depending on the assumed model.This is most likely an indirect consequence of changes (for ST-U and ST+PST), or the absence of changes (for ST+PDT and PDT-U), in the identified main mode.For both ST-U and ST+PST the inferred N H posterior shifts from peaking at specific values (∼ 1.48 × 1020 cm −2 and 0.9 × 10 20 cm −2 ) to be completely consistent with zero (i.e.clearly limited at ∼ 0.1 × 10 20 cm −2 by the lower limit of our prior).For the ST+PDT analyses, introducing XMM-Newton data does not significantly affect the N H posterior, which clusters around 0.7 × 10 20 cm −2 .Similarly for PDT-U, the joint analysis points towards an inferred N H posterior only marginally different from what was found in NICER-only inferences, with median values changing from 1.58 × 10 20 cm −2 to 1.48 × 10 20 cm −2 .Given the different estimates of N H that the various models provide, it is clear that strong constraints would considerably help our analyses and improve our interpretation of their results.In future analyses, we plan to include a more realistic prior on the N H , based on literature.We hope to find up-to-date estimates in particular of its lower limit, as this would greatly benefit our analyses.
Finally, introducing XMM-Newton does not seem to significantly affect the quality of the NICER residuals (which are still good, see Figure set 10 in the online journal), while the XMM-Newton data are also well represented (for an example, see panels in Figure 11).From Table 7, we can see that the maximum likelihood for NICER-only drops considerably if we use the maximum likelihood solution identified in the joint analysis for the ST-U and ST+PST models (with a drop of about 30 in log likelihood).However, this is not the case for ST+PDT and PDT-U.The former shows a similar maximum likelihood value to that found in the corresponding ST+PDT NICER-only analysis; while for the latter the best fitting solution has a NICER likelihood comparable to the overall best solution from the NICER-only inferences (and therefore very close to the one identified with the ST+PDT NICER-only analysis) 20 .

New versus Old Favoured Configurations
As mentioned in Section 5.3.2,overall we find that the NICER and XMM-Newton data sets can be jointly well described by the solutions identified with both ST+PDT and PDT-U models.Interestingly, compared to what was found for the models and data analyzed in R19, in neither case there is a need for elongated hot spots or for both the hot spots being located in the same hemisphere (see panels E and F of Figure 3).For both models, the PDT hot spot(s), i.e. the ones described by two emitting elements, are characterised by a larger cooler component and a much hotter and smaller one.This hints at the presence of temperature gradients, to which our analysis is apparently sensitive.We believe this should motivate investigations aimed at characterising the sensitivity and robustness of our findings to more realistic physical properties of the MSP surface hot spots.We also note that for both models, the overall dimensions of the two hot spots are comparable and that they are almost antipodal (at least in phase).These characteristics resemble the configuration found by analysing NICER-only observations and jointly the NICER and XMM-Newton data for PSR J0740+6620.In particular the solutions identified with ST+PDT show even more similarities with the findings for PSR J0740+6620, including the high inclination (cos(i) ∼ 0.1) and the relatively low radius value of R eq ∼ 12 km.As mentioned previously, very similar solutions are also found in the tail of the joint NICER and XMM-Newton posteriors derived with the ST-U and ST+PST models, as secondary modes.

Gamma-Ray constraints
The hot spot configurations found with ST+PDT and as a secondary mode of the ST-U (and ST+PST) joint NICER and XMM-Newton inferences are, interestingly, more consistent with the gamma-ray emission associated with PSR J0030+0451.Kalapotharakos et al. (2021) considered multipolar field geometries consistent with the mass, radius, and viewing geometry (inclination ∼ 50 degrees) results of R19 and Miller et al. (2019), under the assumption the hot spots correspond to open field line regions.Open field lines were identified by integration from the surface to the light cylinder in either static vacuum or force-free models, with corresponding identified surface regions assumed to be uniform in temperature.Only bolometric light curves were considered in Kalapotharakos et al. (2021).The offset dipoles and quadrupoles resulted in a variety of surface patterns qualitatively different in shape but consistent in position with R19 and Miller et al. (2019)'s highest likelihood shapes.
However, Kalapotharakos et al. (2021) also found degeneracies, i.e. a multimodal likelihood surface, when only the X-rays were considered.A key aspect of the force-free models of Kalapotharakos et al. (2021) was that they allowed for the computation of gamma-ray emitting regions, beyond the light cylinder in the current sheet.This high-altitude field region's morphology is most sensitive to the global dipolar field geometry and orientation relative to the spin axis.A key novel aspect in Kalapotharakos et al. (2021) was the computation of synchronous (in absolute phase, relative to the radio) X-ray and gamma-ray pulse profiles with multipolar fields -this involved correcting for light travel times and different emitting regions in curved space-time.Of the force-free models considered in Kalapotharakos et al. (2021), only one (model RF4 11) seems to describe both the X-rays and gamma-rays well in phase alignment.This model's X-ray hot spots are circular and are reminiscent of models and modes D and E of Figure 3 in this work.The study by Kalapotharakos et al. (2021) implies that a self-consistent treatment of both X-rays and gamma-rays will reduce parameter degeneracies and identify the correct hot spot modes and viewing geometry for PSR J0030+0451.Such an investigation is planned.Note that seeing both radio and gamma-rays from a pulsar generally limits the viewing geometry (the radio is from magnetic polar regions, while the gammarays are from equatorial current sheets).These visibility regions cut across almost orthogonally in the space of dipole inclination angle and spin axis viewing direction (see for example Corongiu et al. 2021).

Caveats
In this Section we list the major caveats and assumptions that affect the analyses presented here.
Atmosphere state and composition: for the analyses in this paper we have assumed a fully ionized hydrogen atmosphere.While this assumption is quite standard, and well-motivated, it can have a substantial impact on the parameter posteriors that we recover, as discussed and described in detail in Salmi et al. (2023).
Simplification of the hot spot properties: here, as in all NICER studies so far, we adopt models that greatly simplify the physics characterising the hot spots.The underlying assumption is that our data and relative uncertainties are not sensitive enough to such details for us to need to account for them.Degeneracies, in the analyses presented both here and in V23a, show that we are not sensitive e.g. to the direction of arcs, if thin and elongated, or even to the presence of a structure more complex than a circle, if the hot spot is small enough (see e.g.label ambiguity in the ST+PST joint NICER and XMM-Newton analysis).However, the jump in evidence and maximum likelihood values obtained when introducing a second emitting component per hot spot suggests that our analyses are in fact sensitive to temperatures and, likely, temperature gradients (considering indeed the usually very small size of the hotter component).
Priors: priors are particularly important in our inference analyses, as they drive the MultiNest exploration of the parameter space and sometimes they clearly impact the range of our credible intervals.That is for example the case for the radius posteriors inferred in the ST-U and ST+PST joint NICER and XMM-Newton analyses, where our hard upper limit clearly modifies the natural 1D posterior distribution of that mode, truncating it at 16 km and thus skewing the higher radius end of the credible interval.Priors also play a crucial role in the evidence and yet their definition always, at least partially, relies on arbitrary choices: firstly through the chosen parametrization to describe the problem and secondly through their practical implementation, which often includes approximations, simplifications and arbitrary cut-off values.Thankfully, when the posterior is data-dominated, the impact of these choices is heavily suppressed.However, any time that we use the evidence, we should keep in mind that priors are crucial to evaluate it.As noted, our reference ST+PDT in NICER-only analysis missed exploring an important part of the parameter space, since the solution for ST+PST is also in the ST+PDT parameter space and has a better likelihood.However, it is also possible that MultiNest did not pick such a solution because its supporting prior volume is too small (if for example it required temperatures very close to the prior lower bounds) when put in the higher dimension parameter space describing the ST+PDT model.It should be stressed that Multi-Nest explores the parameter space with the aim of computing the evidence, through definitions of isolikelihood contours (Feroz & Hobson 2008a;Feroz et al. 2009Feroz et al. , 2019)).
Use and interpretation of XMM-Newton data: introducing XMM-Newton data could lead to biases if the underlying assumptions are not correct, e.g. if the assumption on cross calibrations are incorrect, or other astrophysical contamination is present at the pulsar sky location without being accounted for in our models.
However, while we should keep this in mind and test for alternatives when possible, we think it is a crucial procedure to cross-check the compatibility of our findings with other observations, given the large amount of data increasingly available.Hence, independently of the outcome, we believe including XMM-Newton data and constraints promotes correct scientific practice.Cross checks are especially important for NICER sources like PSR J0030+0451, which lack independent information to coherently incorporate in priors (unlike for PSR J0740+6620, and other NICER sources like PSR J0437−4715, where radio constraints can considerably inform our analyses and reduce the prior parameter space associated with masses and inclination).
Adequacy of the analysis settings: The data quality for PSR J0030+0451 is very high, which increases the computational cost of this type of analysis21 .Given the computational resources available to us, we could not fully test how well the parameter space was explored for the most complex models; analyses adopting ST-U and focusing on NICER-only data were the only exception.With the various tested settings, it was possible to prove that in this case our reference run had reached a converged state (although a significantly higher number of live points could potentially still reveal unexplored and relevant portions of the parameter space).The exploratory runs, which adopt an increasingly higher number of live points, also proved that merging the results of multiple inference runs with lower live points is not always equivalent to a single run with higher live points (e.g.merging all our ST-U runs with LP 10 3 would have converged to a radius about ∼ 1 km higher than what we find with a more adequate amount of live points (6 − 10) × 10 3 ).Indeed, contrary to expectations, we find, as also shown to a lesser extent for PSR J0740+6620 in Riley et al. (2021); Salmi et al. (2022), that a lower number of live points does not only underestimate the width of a mode, but also slightly biases it.In our ST-U NICER-only cases, for example, we see a systematically higher median values when only 10 3 live points are used.However, further checks are still needed; e.g. it is yet unclear if decreasing the evidence tolerance to 0.001 would have an effect when a high number of live points is employed.
For the more complex models, we cannot guarantee an exhaustive exploration of the parameter space given its increased dimension and the hence increased computational cost 22 .This is particu- 22 This implies that we cannot guarantee that significant posterior shifts and changes would not appear in case of repeated inference runs with the same analysis settings.
larly true for the joint NICER and XMM-Newton inference, whose requirements in terms of analysis settings had to be lowered to render them computationally tractable.As suggested in V23a, future analyses of upgraded data sets, targeting a new headline inference of mass and radius for PSR J0030+0451, will need adequate studies to address whether settings are appropriate, given the data and the model framework.More tailored techniques may also be employed to assess robustness in the obtained posterior and the adequacy of the adopted models.The robustness of the findings for the ST+PST, ST+PDT and PDT-U models, both for NICER-only inferences and joint NICER and XMM-Newton analyses, needs more extensive testing and this will be considered in our computational budget requests for future analysis.
Correct interpretation of evidences: in the presence of a highly multi-modal structure of the posterior surfaces, it is important to reflect on what should be inferred from evidence values.In particular, if one model is preferred by the evidence, it does not necessarily imply that the radius posterior associated with its main mode is also favored.We also point out that, in this context, preferences in terms of posteriors seem to be naturally driven by their correspondent local evidence.However this is also an arbitrary choice.Moreover the evidence depends on the priors which, as we mention above, are defined with some arbitrary components.Therefore, in certain contexts, it may be preferable to focus instead on the best fitting solu-tion (i.e. the solution given the highest likelihood or posterior value), independently from its width, instead of the more "robust" solution, found with evidence and hence backed up by a larger volume in the prior space.
6. CONCLUSIONS In this paper we analyzed the reprocessed NICER data set for PSR J0030+0451, B19v006.We adopted four different models to describe increasing complexity in the hot spot shapes: ST-U, ST+PST, ST+PDT and PDT-U.The former two were both found in R19 to well represent the data (i.e. they had no suspicious structure in the residuals); however the more complex model ST+PST was favoured by the evidence.According to this model, PSR J0030+0451 is a MSP characterised by a mass of about ∼ 1.3 M ⊙ and a radius of ∼ 13 km.Both hot spots were located in the same hemisphere, opposite to the observer, and one of them had an elongated shape.ST+PDT and PDT-U are two additional models tested for the first time for PSR J0030+0451.They allow for an additional emitting component to describe one, for ST+PDT, or both of the hot spots, for PDT-U.
In this work we upgraded the data set release that was described in 2019 using an updated instrument response, and also coherently included XMM-Newton data in our analyses.Compared to R19, we here also adopted more extensive analysis settings in our main inferences.However, we were able to check the robustness of the obtained posterior distributions only for the simplest ST-U model.The computational resources necessary to test the adequacy of the analysis setting for inferences adopting more complex models were instead prohibitive.With this caveat in mind (see also Section 5.4), below we summarize the main takeaway points of our inferences.
• Analysing the updated NICER data set, with new settings and an upgraded pipeline, we produce results consistent with what was found and reported in R19.
• Based on ST-U tests, compared to the analyses of R19, the reprocessed data and new framework seem to require a larger number of MultiNest live points to ensure the robustness of the inferred posterior distributions.
• Joint analyses can require a considerable amount of computing resources while reducing Multi-Nest and X-PSI settings to make the analysis computationally tractable can have significant effects on the final findings.To produce robust results, analyses of future data sets will require the allocation of more computational resources and/or a considerable speed-up of the parameter estimation procedure.
• In joint NICER and XMM-Newton inferences, ST-U and ST+PST give rise to very similar solutions (but very different from the corresponding ones identified with NICER-only data), which lead to very large masses and very high radii (which would be in strong tension with the results from GW170817), the latter hitting the edge of our hard-cut prior at 16 km.
• Both ST+PDT and PDT-U models, with the latter being significantly preferred by the evidence, are able to reproduce both NICER and XMM-Newton data, without invoking elongated shapes for hot spots or requiring them both to be located in the same hemisphere.The hot spot configurations found with these two models are likely in better agreement with the PSR J0030+0451 gamma-ray emission and resemble the findings of PSR J0740+6620, especially for the solutions associated with the ST+PDT model.Radii and masses estimated during the preliminary joint NICER and XMM-Newton runs, presented in this work, for ST+PDT and PDT-U are respectively: R eq ∼ 11.5 km, M ∼ 1.4 M ⊙ and R eq ∼ 14.5 km, M ∼ 1.7 M ⊙ .However, more targeted and detailed studies are needed to assess the robustness of these findings.
• The configurations identified by the ST+PDT and PDT-U models, and how well they seem to describe the data, suggest the presence of gradients of temperature in the hot spot.It is therefore crucial to check what would be the impact of such gradients on our parameter recovery, through more realistic simulations.
• Our findings also suggest that our analysis is only weakly sensitive to small details describing the hot spots (e.g. for the secondary mode found for the ST+PST NICER-only analyses, the omitting component can assume different sizes and location, as long as the area of the effectively emitting regions maintains a similar size).
• Different models and parameter vectors are currently in good agreement, given the quality of the residuals, with the B19v006 data and constraints.If this situation does not change in the future, it could be useful to provide a more informative (constrained) prior for PSR J0030+0451's radius, based on other robust observations, such as estimates of the tidal deformability from gravitational waves and/or constraints from other NICER sources.One could also develop more adequate procedures to deal with multi-modal solutions for follow-on equation of state studies.
• It appears that more factors complicate the analysis of NICER PSR J0030+0451 data compared to PSR J0740+6620.One reason could be PSR J0740+6620's high inclination and general intrinsic hot spot configuration.It could also be due to the quality of the data, since for PSR J0740+6620 (which is faint in X-ray) we know that most NICER photons actually come from the background.The tight constraints on mass and inclination coming from radio observations, thanks to PSR J0740+6620's parameters and geometry, are also extremely helpful for the analysis of that source.
• Constraints from other studies and observables would be very useful, for example from PSR J0030+0451's gamma-ray pulse profile or NICER background estimates.

Summary
In summary, we analyzed the PSR J0030+0451 data set, described in Bogdanov et al. (2019a), with the most updated instrument response and found, with an updated analysis and modeling framework, results compatible with the findings reported in R19.Our in-depth analysis, however, reveals a complicated structure of the likelihood surface that is difficult to explore extensively, given our current computational resources.We introduce constraints posed by the XMM-Newton X-ray observations of PSR J0030+0451.This suggests that there may be no need for both hot spots to be located on the same hemisphere, nor for elongated shapes, as long as multiple temperatures can be assigned to a single hot spot.
Given the inference analyses conducted for this work (and noting the caveats outlined in Section 5.4), there is a significant preference, in terms of evidence values, for the most complex model tested (PDT-U), which can seemingly well represent both NICER-only and joint NICER and XMM-Newton data.The radius and mass associated with this solution are: 14.44 +0.88  −1.05 km and 1.70 +0.18  −0.19 M ⊙ .However, we find that ST+PDT also well reproduces both NICER-only and joint NICER and XMM-Newton data.It does so with a hot spot con-figuration that resembles PSR J0740+6620, and radius estimation is more in line with the value currently inferred for that source.The ST+PDT model yields a radius and mass of 11.71 +0.88  −0.83 km and 1.40 +0.13 −0.12 M ⊙ for PSR J0030+0451.
This in-depth re-analysis of the 2017-2018 NICER data set indicates that PSR J0030+0451 is a complex source to model.Due to the multi-modal nature of the solution space, it is a source for which additional constraints -on the background, or on the system geometry -will be important in determining a robust inference of mass and radius.This study provides a more comprehensive baseline to support in-progress analysis of new, larger NICER data sets for this source, including benchmarking of the computational resources that need to be allocated to ensure robust results.
between the spin axis and line of sight NH [cm −2 ] Hydrogen column density log 10 (Tp/K) log of the primary hot spot temperature log 10 (Ts/K) log of the secondary hot spot temperature ζp [rad] Angular radius of the primary hot spot ζs [rad] Angular radius of the secondary hot spot θp [rad] Colatitude of the primary hot spot θs [rad] Colatitude of the secondary hot spot ϕp [cycles] Phase shift of the center of the primary hot spot compared to the reference phase set by the data ϕs [cycles] Phase shift of the center of the secondary hot spot compared to the reference phase set by the data plus half a cycle α (NICER, αXTI, or XMM-Newton, αMOS) effective area energy-independent scaling factor β [kpc −2 ] distance scaled effective area parameter α/D 2 (used as an alternative to α and D) log 10 (T else /K) log of the temperature of the remaining surface of the MSP (only adopted in few analyses) Figure 1.ST-U posterior distributions (smoothed by GetDist KDEs) from 9 inference runs, for radius, compactness and mass.The corner plot on the left shows results for three inference runs with MultiNest settings identical to those adopted by R19, but obtained in the new X-PSI and data framework described in Sections 2, 3 (differences listed in Section 5.1).For all these runs we adopted the high-resolution X-PSI settings.For comparison, in solid yellow lines we also show the distributions found for run 1, ST-U model, in R19.In dashed black we present posterior distributions for the SE 0.3, ET 0.1, LP 10 4 , MM on run, which we consider to be the reference runs for ST-U model in this work.The corner plot on the right shows the results for three inference runs with our default MultiNest settings.We also report posterior distributions for the run allowing the whole surface of the MSP to emit (labeled as ST-U+T else ).Each of the two corner plots shows new posteriors from four inference runs which use different MultiNest settings, as reported in the legend (for definitions see Section 2.1.4).Curves in the representation of the 2-dimensional posteriors trace the 68% credible area.On top of the 1D posteriors, we report the 68% credible intervals (representing the area within the 16% and 84% quantiles in the 1D marginalized posterior) starting from the median of the distributions.These values, as well as the colored areas, refer to the two dashed black lines: the run enabling the mode-separation modality, in the left corner plot, and the run adopting the ST-U+T else model, in the right corner plot.

Figure 2 .
Figure2.ST-U posterior distributions (smoothed by GetDist KDEs) from 16 runs, for radius, compactness, and mass.This figure highlights the effects of adopting different MultiNest settings (for all these runs we adopted the high-resolution X-PSI settings).On the left plot we show results for 9 ST-U inference runs: they illustrate, for the specific problem at hand, what is the effect of changing evidence tolerance or sampling efficiency, compared to our default settings.The corner plot on the right instead shows the effect of increasing the number of live points.As a reference, in both corner plots, we also report with dashed black lines the posterior distributions obtained with the SE 0.3, ET 0.1, LP 10 4 , MM on run; numbers on top of the 1D posteriors, as well as shaded areas, refer to this inference run.See caption of Figure1for further details.
middle panel rightmost panel panel C, Fig. 3 Panel D, Fig. 3 of Fig. 7, V23a of Fig. 7, V23a Figure3.Representation of the main geometrical modes found with the analyses presented in this study.The surface patterns are shown from the observer point of view (i.e. according to the inferred inclination of the specific sample considered), at phase zero of the data.Hot spots on the same (opposite) hemisphere compared to the observer are plotted with full (dimmed) colors and and a cross (circular) marker at the centre.The color of the hot spot components changes from blue to red from the highest to the lowest temperature of the considered surface pattern (with higher precision compared to the temperature reported in the legend).Black is used for masking components.In all cases, we show, as a reference point, the hot spot geometry associated with the maximum likelihood sample of that specific mode within the considered inference run.Although this single sample cannot capture the whole possible variations present in a mode, we hope in this way to provide a simplified but more concrete indication of the configurations belonging to the mode.Panels A and B refer to the configurations associated with the main mode of ST-U (SE 0.3, ET 0.1, LP 10 4 , MM off, HR) and ST+PST (SE 0.3, ER 0.1, LP 10 4 , MM on, LR) models, NICER-only analyses.Panels C, D, E and F refer to the configurations found in joint NICER and XMM-Newton inferences.Panels C and D are representative of the main (C) and secondary (D) modes derived adopting the ST-U model (SE 0.3, ER 0.1, LP 10 4 , MM on), while panels E and F refers to the main mode found with ST+PDT and PDT-U models (SE 0.8, ER 0.1, LP 10 3 , MM off) (for these last two, low-resolution, LR, X-PSI settings have been adopted).

Figure 4 .
Figure 4. Posterior distributions (smoothed by GetDist KDEs) of radius, compactness, and mass from joint NICER and XMM-Newton inference runs: left and right corner plots respectively obtained adopting the ST-U and the ST+PST model.With dash lines here we show the prior distributions.In the 2D posteriors we plot 3 curves, defining the 68%, 95%, and 99% credible area.Analyses settings for the depicted runs are displayed in Table 4 (see ST-U and ST+PST entries with "yes" in the XMM-Newton column).See caption of Figure 1 for further details.
4.2.1.Reproducing R19 Results and Analysing the Impact of Settings Figure 5.Spectrum of NICER data and its inferred composition, according to the maximum likelihood sample of our X-PSI analysis.Data are plotted with a dashed magenta line; the total signal expected is shown with a solid pink line; the background (BKG) that maximises the likelihood for the (background-maginalised) maximum likelihood posterior sample (MaxL, in the legend) is plotted with a purple solid line.The contribution of the primary hot spot, the secondary hot spot, and the background, again associated with the maximum likelihood sample, are reported on top of each other with pink-shaded regions, increasingly more intensely colored.The purple shades show from darker to lighter color the average background ± 1, 2 or 3 standard deviations, calculated over 200 randomly selected posterior samples from the equal weight MultiNest output file.The inference runs for this figure concern the joint NICER and XMM-Newton analysis, performed with the X-PSI PDT-U model.The complete figure set (8 images) is available in the online journal, for the ST-U, ST+PST, ST+PDT, PDT-U models for both NICER-only analyses (reference runs highlighted in bold in Table4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table4, corresponding to "yes" entries in the XMM-Newton column).

Figure 6 .
Figure 6.Corner plot based on the joint analysis of NICER and XMM-Newton data sets, carried out with the ST-U model.This image includes the posterior distributions (smoothed by GetDist KDEs) for the parameters describing mass, radius, and the hot spot properties inferred for PSR J0030+0451's system.In the marginalized 1D posteriors we show with dash lines our corresponding priors.The complete figure set (8 images) is available in the online journal, for the ST-U, ST+PST, ST+PDT, PDT-U models for both NICER-only analyses (reference runs highlighted in bold in Table4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table4, corresponding to "yes" entries in the XMM-Newton column).Tables1 and 2show the meaning of the adopted parameter labels.The multi-modal structure of the posterior, described in Section 4.1.4,is here clearly visible.In particular the bi-modality in the 2D posteriors including the radius, shows the main mode (configuration corresponding to its maximum likelihood sample shown in Panel C of Figure3) and a secondary one (configuration corresponding to its maximum likelihood sample shown in Panel D of Figure3).For more general details about the plots in this figure set, see Figure1.

Figure 7 .
Figure7.ST+PST posterior distributions (smoothed by GetDist KDEs) from 14 runs, for radius, compactness, and mass.This figure highlights the effects of adopting different MultiNest settings.The corner plot on the left shows the effect of adopting different (low or high) X-PSI resolution settings.The green curves represent the posterior distributions inferred from the analyses that adopt high resolution (HR in the legend); the yellow ones are derived from low X-PSI resolution setting (our default for models more complex than ST-U).These latter have slightly different MultiNest settings, as marked with settings in parenthesis, but show very similar behaviours and are individually plotted on the right panel.The two outliers (one at HR, green, and one at LR, yellow) have been obtained fixing MultiNest and Python random seeds to the same values.With black dash lines, we also report the posterior distributions obtained with the SE 0.3, ET 0.1, LP 10 4 , MM on run; numbers on top of the 1D posteriors, as well as shaded areas refer to this inference.On the right panel, instead, we show new results for 5 ST+PST inference runs, with low X-PSI resolution, adopting our default MultiNest settings, the only exception being the second run from the top of the list, for which we enabled the mode-separation.Similar MultiNest settings have also been adopted in R19 (although higher X-PSI settings were in that case used).The ST+PST posteriors found in R19 are here shown in brown for comparison.This plot tests: the variability due to the randomness of the sampling process, given the specific analysis settings; the effect of updating the model prior, to allow the primary hot spot to overlap with the masked part of the secondary (CoH, runs marked in the legend with * ); and the effect of introducing emission from all the MSP surface (see run including T else in the legend).The black dashed lines represent the posterior distributions obtained when allowing the entire surface of the MSP to emit.Shaded areas and numbers on top of the 1D posterior distribution refer to the credible regions and intervals of this inference.See caption of Figure1for further details.* : runs adopting the updated CoH ST+PST prior, as explained in Section 2.3.4 ofVinciguerra et al. (2023).

Figure 8 .
Figure 8. ST+PDT posterior distributions (smoothed by GetDist KDEs) of radius, compactness and mass for two inference runs.We show results obtained with the analysis of NICER-only data (left), and derived by the joint analysis of NICER and XMM-Newton data (right).For more details about the plot see Figure 4.

Figure 9 .
Figure 9. PDT-U posterior distributions (smoothed by GetDist KDEs) of radius, compactness, and mass for two inference runs.We show results obtained with analysis of NICER-only data (left) and derived by the joint analysis of NICER and XMM-Newton data (right).For more details about the plot see Figure 4.

Figure 10 .
Figure 10.The three panels represent from top to bottom: data counts, posterior counts expected for the PDT-U model and residuals as a function of cycle phase (on the x-axis) and NICER instrument channel (on the y-axis).Residuals are defined as the difference in counts between data and expectations, weighted by the corresponding Poisson standard deviation (as expressed on the associated color axis): dij are count data for the i th phase bin and j th channel, cij the corresponding posterior-expected count numbers.Expected counts are evaluated considering 200 posterior samples, inferred jointly analysing NICER and XMM-Newton data.More details about this Figure can be found in the caption of Figure 13 and in Appendix A.2.1 of R19.The complete figure set (8 images) is available in the online journal, for the ST-U, ST+PST, ST+PDT, PDT-U models for both NICER-only analyses (reference runs highlighted in bold in Table4) and joint inferences for NICER and XMM-Newton data sets (analysis settings reported in Table4, corresponding to "yes" entries in the XMM-Newton column).

Figure 11 .
Figure 11.Example showing how our PDT-U maximum likelihood (MaxL, in the legend) solution, obtained in the joint NICER and XMM-Newton analysis, represents XMM-Newton PSR J0030+0451 and background (BKG) data, for MOS1 (left panel) and MOS2 (right panel).The plot shows similar quantities as Figure5, but for the XMM-Newton MOS1 and MOS2 total and background data sets.The dash-dotted line (magenta, for MOS1 and light blue, for MOS2) shows the background data used in our analysis; the inferred background is plotted with a solid line (in magenta for MOS1 and dark blue for MOS2).Given its low value, the uncertainties (shown with shaded areas, purple for MOS1 and dark blue for MOS2) are almost imperceptible.The dashed more scattered line shows the total on-source data, for MOS1 on the left and MOS2 on the right; the solid line (purple for MOS1 and blue for MOS2) is the inferred expected signal.We display the contribution of the primary, the secondary, and the background associated with the maximum likelihood sample, adding them (in this order) on top of each other with areas of different intensities of blue, for MOS1, green for MOS2.
The authors thank Evert Rol and Martin Heemskerk for technical assistance.This work was supported in part by NASA through the NICER mission and the Astrophysics Explorers Program.S.V., T.S., A.L.W., D.C., Y.K. and T.E.R. acknowledge support from ERC Consolidator Grant No. 865768 AEONS (PI: Watts).This work was sponsored by NWO Domain Science for the use of the national computer facilities.NICER work at NRL is supported by NASA.S.B. acknowledges support through NASA grant 80NSSC22K0728.SG acknowledges the support of the CNES.W.C.G.H. acknowledges support through grant 80NSSC23K0078 from NASA.S.M.M. acknowledges support from NSERC Discovery Grant RGPIN-2019-06077.A portion of the results presented was based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by ESA Member States and NASA.This research has made use of data and software provided by the High Energy Astrophysics Science Archive Research Center (HEASARC), which is a service of the Astrophysics Science Division at NASA/GSFC and the High Energy Astrophysics Division of the Smithsonian Astrophysical Observatory.We acknowledge extensive use of NASA's Astrophysics Data System (ADS) Bibliographic Services and the ArXiv.

Table 1 .
Shared Model Parameters

Table 2 .
Additional Model Parameters for PST and PDT Hot Spots Note-Definition of the parameters necessary to describe PST and PDT hot spots.A PST hot spot is characterised by an omitting and an emitting spherical cap, respectively labeled above with the subscript o and e.A PDT hot spot is characterised by a ceding and a superseding spherical cap, respectively labeled with the subscript c and s.The parameters mentioned in the table with subscripts o/c hence refer to properties of the omitting component, for a PST hot spot, or to the ceding one, for a PDT hot spot.* χ/2π = ϕ o/c − ϕ e/s , where ϕ marks phases and the subscripts clarify the corresponding component.

Table 3 .
Most Relevant Parameters Describing MultiNest Settings

Table 5 .
Table describing the main properties of the three modes identified adopting the ST-U model.The first two rows show means and standard deviations of mass M and equatorial radius Req posterior distributions; the last two the corresponding maximum log-likelihood and local log-evidence values.

Table 7 .
Mass [M⊙]Radius [km] log (E) max (log (L)) max (log (LN)) max (log (LN&XMM)) Core hours Summary of the results for the reference NICER-only inferences and the joint NICER and XMM-Newton runs.Uncertainties represent the 68% credible intervals starting from the median of the marginalized 1D posterior distribution.Results above the double horizontal lines correspond to NICER-only analyses; the ones reported below to joint NICER and XMM-Newton inferences.For NICER-only analyses, we report medians, credible intervals, evidences E, maximum likelihood values log max LN (in a joint NICER and XMM-Newton framework log max L N&XMM ) and core hours associated with the run with MultiNest settings: SE 0.3, ET 0.1, LP 10 4 , MM on.The log max LN marks the maximum likelihood values of the run identified by the line.The joint NICER and XMM-Newton inference runs are performed with MultiNest settings: SE 0.8, ET 0.1, LP 10 3 , MM off, except for the ST-U case, the only model for which we could perform a SE 0.3, ET 0.1, LP 10 4 , MM on run.All the inference analyses, apart from those using the ST-U model, assume low resolution in terms of X-PSI settings (see Section 2.3.1 of V23a).For ST-U and ST+PST models, R19 report respectively a mass of 1.09+0.11−0.07M⊙ and 1.34 +0.15 −0.16 M⊙ and a radius of 10.44 +1.10 −0.86 km and radius of 12.71 +1.14 −1.19 km.Secondary mode of the NICER and XMM-Newton run with SE 0.3, ET 0.1, LP 10 4 , MM on (in this specific case, to describe mass and radius posterior, we use means and standard deviations and we report the value of the local evidence, as calculated by MultiNest).