Independent Evidence for earlier formation epochs of fossil groups of galaxies through the intracluster light: the case for RX J100742.53+380046.6

Fossil groups (FG) of galaxies still present a puzzle to theories of structure formation. Despite the low number of bright galaxies, they have relatively high velocity dispersions and ICM temperatures often corresponding to cluster-like potential wells. Their measured concentrations are typically high, indicating early formation epochs as expected from the originally proposed scenario for their origin as being older undisturbed systems. This is, however, in contradiction with the typical lack of expected well developed cool cores. Here, we apply a cluster dynamical indicator recently discovered in the intracluster light fraction (ICLf) to a classic FG, RX J100742.53+380046.6, to assess its dynamical state. We also refine that indicator to use as an independent age estimator. We find negative radial temperature and metal abundance gradients, the abundance achieving supersolar values at the hot core. The X-ray flux concentration is consistent with that of cool core systems. The ICLf analysis provides an independent probe of the system's dynamical state and shows that the system is very relaxed, more so than all clusters where the same analysis has been performed. The specific ICLf is $\sim$5 times higher than that of any of the clusters previously analyzed, which is consistent with an older non-interactive galaxy system that had its last merging event within the last $\sim$5 Gyr. The specific ICLf is predicted to be an important new tool to identify fossil systems and to constrain the relative age of clusters.


INTRODUCTION
Fossil groups (FGs) are usually characterized as systems dominated by a single giant elliptical galaxy at least two magnitudes brighter than the second ranked galaxy ($\Delta m_{1,2} \geq 2$) within half the virial radius $r_{200}$ (Jones et al. 2003), with extended X-ray emission corresponding to luminosities of more than $10^{42}\,h_{50}^{-2}$ erg/s. Even though the first FG was discovered more than two decades ago (Ponman et al. 1994), their origin and evolution are still debated. FGs were originally thought to be the cannibalistic remains of galaxy groups that lost energy through dynamical friction (e.g., Mulchaey & Zabludoff 1999). Given the expected large times involved in dynamical friction and the lack of evidence of clear X-ray substructures, the original explanation for their nature was that FGs formed early and have been undisturbed for a very long time (e.g., Ponman et al. 1994; Jones et al. 2003; Vikhlinin et al. 1999). Hereafter, we refer to this scenario as the "standard model".
Further X-ray and optical measurements have shown an increasing number of unusual characteristics in many FGs, which challenged the standard model. The intracluster gas temperatures ($T_X$) of FGs were often found to be similar to those of galaxy clusters, sometimes in excess of 4 keV (e.g., Khosroshahi et al. 2006a, 2007). Measurements of galaxy velocity dispersion ($\sigma_{los}$) in FGs (e.g., Cypriano et al. 2006; Proctor et al. 2011a) were found to be consistent with the measured $T_X$ (e.g., Khosroshahi et al. 2007; Miller et al. 2012), indicating a mass range that encompasses that of medium sized clusters. The lack of bright galaxies near the central brightest cluster galaxy (BCG) in these cluster-sized systems makes them stand out.
The small number of combined detailed multi-wavelength X-ray and optical studies of FGs observed so far makes it difficult to provide an unambiguous answer about their formation epochs. On one hand, they seem to have high values of the concentration parameter, $c_{200}$, defined as the ratio of $r_{200}$ to $r_s$, the NFW scale radius (Khosroshahi et al. 2007). Given the correlation found between $c_{200}$ and formation epoch in Cold Dark Matter (ΛCDM) simulations, the typical FG $c_{200}$ would be associated with earlier formation epochs (e.g., Wechsler et al. 2002a; Deason et al. 2013)$^{1}$. On the other hand, the cooling time of FGs is observed to be usually significantly less than the Hubble time (e.g., Sun et al. 2004; Khosroshahi et al. 2004, 2006a), but they often lack the large cool cores expected for a very old and undisturbed system and show a paucity of central AGN activity (Miraghaei et al. 2014; Khosroshahi et al. 2017). Therefore, with currently available data, it is difficult to resolve the above mentioned peculiarities of FGs and to determine whether they represent a "physical" class on their own. Not all systems classified as FGs show the above mentioned paradoxical characteristics (e.g., Sun et al. 2009; Bharadwaj et al. 2016; Voevodkin et al. 2010). This can in part be due to the "purity" of the samples selected for analysis. Even though there is theoretical support for a relation between high magnitude gap and early accretion of a significant fraction of the system's mass (e.g., D'Onghia et al. 2005), $\Delta m_{1,2}$ alone is not a necessary condition for the system to be identified as an FG (e.g., Dariush et al. 2010), since the majority of them will lose the magnitude gap in a few Gyr, i.e., the magnitude gap of an individual system is highly transitory (Kundert et al. 2017; von Benda-Beckmann et al. 2008; Gozaliasl et al. 2014b). Therefore, this selection criterion is prone to be affected by many physical and observational systematics.
The infall of new galaxies into the group may reduce the magnitude gap, while the opposite may happen due to mergers of bright galaxies with the BCG. The simple orbital motion of member galaxies will produce a variance in the gap that can be as high as 2 mags (Kundert et al. 2017; Raouf et al. 2014). Furthermore, this type of selection is susceptible to galaxy membership mis-identification and/or poor completeness (Voevodkin et al. 2010). Li & Cen (2020) studied the evolution of the most stellar deficient groups in N-Body + semi-analytical cosmological simulations and found that low mass FGs ($10^{13.4}\,M_\odot < M < 10^{13.6}\,M_\odot$) form earlier than normal groups and have lower stellar mass fractions. They also found that selections of FGs based on magnitude gaps show relatively low levels of purity. Nevertheless, magnitude gaps are the "cheapest" markers. Other, more physically supported, markers are expensive observation-wise. For example, as mentioned previously, concentration measurements such as $c_{200}$ are a good physical indicator of early formation epoch, but to measure it, one needs good X-ray or lensing observations as well. Using careful criteria to maximize purity, we have selected a sample of FGs with good multi-wavelength data that can help to clarify the nature of FGs. The original selection started with the maxBCG cluster catalog (Koester et al. 2007), where, on top of a magnitude gap criterion, low richness-high X-ray luminosity systems were included. FGs tend to have relatively low richness with respect to their mass, as pointed out by Proctor et al. (2011b). Follow-up with Chandra snapshots allowed us to restrict the sample further based on the absence of cool cores and AGN activity, which were further confirmed by deeper XMM-Newton observations.
$^{1}$ Note that the definition of formation time is often the epoch at which 50% of the system's final virial mass is assembled, while Wechsler et al. (2002b) has a more ample definition independent of the specific parametrization.
We also obtained large-scale spectroscopy of the targets to verify membership. Additionally, we used multiwavelength HST observations, which, when coupled with the other high quality observations, can provide invaluable information about the intracluster light (ICL) distribution that can act as both a dynamical probe and age indicator, as explained below. Here we present the results for the first FG where this combined analysis has been applied.
Recent advances in measuring the ICL, i.e., stars that are not bound to individual galaxies in a cluster, provided more robust techniques that allowed us to place constraints on the nature of FGs. For the sake of argument, we can broadly separate the ICL growth processes into two types: those that increase together with the system's mass (a) and those that do not (b). The first type includes cluster merging, where galaxy-galaxy interaction is significantly enhanced, increasing the disruption of dwarf galaxies and mergers with the BCGs. This increases the ICL production over the sum of the pre-merging individual sub-clusters' ICL (e.g., Krick & Bernstein 2007; Rudick et al. 2011; Jiménez-Teja et al. 2021a)$^{2}$. Type (b) would include regular tidal stripping due to internal dynamical friction. Other minor processes, such as star formation in stripped gas (e.g., Sun et al. 2010), can be included in (a) or (b).
$^{2}$ Notice that CICLE is not subject to the systematics involved in measuring ICL through surface brightness cut-off methods. So in Figure 2 of Rudick et al. (2011) one would not expect to see the ICLf "dips" during mergers.
Jiménez-Teja et al. (2018) applied a sophisticated ICL measuring technique to a sub-sample of 10 galaxy clusters from the Cluster Lensing And Supernova survey with Hubble (CLASH) and the Bullet cluster. Their results have been further expanded to include other HST Frontier Field clusters (de Oliveira et al. 2022), RELICS clusters (Jiménez-Teja et al. 2021b) and even ground data using the Javalambre Photometric Local Universe Survey (J-PLUS) data (Jiménez-Teja et al. 2019). We hereafter refer to the HST-observed sample mentioned above as ICLHST. Their results showed that merging clusters had higher intracluster light fractions (ICLf) than the relaxed ones and, more interestingly, that merging (dynamically active) clusters had an excess flux in the 4000-5000 Å rest-frame wavelength range, which corresponds roughly with the peak emission of main sequence stars of late-A to early-F spectral types. The ICLf is defined as the ratio of ICL to total (i.e., ICL+galaxies) fluxes. They hypothesized that mergers violently amplify tidal stripping, rapidly removing stars from the outer part of galaxies. Furthermore, these merger-induced stripped and relatively short-lived bright stars would temporarily increase the 4000-5000 Å ICL flux before evolving out of the main sequence, reducing their flux contribution and returning the ICLf wavelength distribution to that characteristic of relaxed clusters, i.e., flat. So, even though the full ICLf at any single time may not be a robust indicator of the cluster's dynamical state, as mentioned in Rudick et al. (2011), the ICLf in particular wavelength ranges can provide information about the cluster's dynamical state at a particular time.
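The definition of the ICLf given above can be written as a short numerical sketch. The function name, the flux values and the assumption of independent flux errors are ours, introduced only for illustration; this is not the error budget used by CICLE.

```python
import math

def icl_fraction(f_icl, f_gal, err_icl=0.0, err_gal=0.0):
    """Return (ICLf, sigma) with ICLf = F_ICL / (F_ICL + F_gal).

    Errors are propagated to first order, assuming the two flux
    errors are independent (an assumption of this sketch).
    """
    total = f_icl + f_gal
    frac = f_icl / total
    # d(frac)/dF_ICL = F_gal / total^2 and d(frac)/dF_gal = -F_ICL / total^2
    sigma = math.hypot(f_gal * err_icl, f_icl * err_gal) / total ** 2
    return frac, sigma

# Hypothetical fluxes in arbitrary units, chosen only for illustration.
frac, sigma = icl_fraction(f_icl=12.0, f_gal=88.0, err_icl=1.0, err_gal=2.0)
print(f"ICLf = {100 * frac:.2f} +/- {100 * sigma:.2f} %")
```

Evaluating the same function in each filter would give the ICLf wavelength distribution discussed in the text.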
It is expected that the ICL continuously increases with cluster mass (e.g., Lin & Mohr 2004; Rudick et al. 2011; Zhang et al. 2011). The heuristic classification of ICL growth types described in the previous paragraphs suggests that the ICL production occurs in two different regimes: a steady regime, where the ICL is produced by tidal stripping enhanced by internal dynamical friction (in between merging events), and a violent regime during mergers, with a fast injection of stars into the ICL. That would imply that the growth over time of the system's mass and that of the ICL are different. In this very simplified view, when two clusters merge, the final mass is the sum of their pre-merging masses, but the ICL of the merged system would be greater than the sum of the ICL of each pre-merging system, given that the merging process itself would produce new ICL through the violent regime. In between mergers, the ICL would still continue to grow even though the cluster's mass would not. So, if two clusters formed at the same time, following different merging trees and keeping similar average merging histories, i.e., similar overall mass distributions of constituent halos and subhalos, one would expect them to have roughly similar masses and ICL-to-mass ratios at any particular epoch. The latter would steadily grow even after all surrounding bound halos collapse into the final system. On the other hand, if one of these systems (S1) formed earlier than the other (S2), it would achieve its maximal final mass prior to S2 and start growing its ICL-to-mass ratio under the steady regime only. So, at the moment S2 had its last merger, S1 would have the same mass but a higher ICL-to-mass ratio than S2. FGs, within the standard model, would correspond to S1 and one would thus expect them to have an enhanced ICL-to-mass ratio. The same reasoning can be applied to the ICLf-to-mass ratio.
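The argument above can be illustrated with a deliberately crude toy model. The linear steady growth, the fixed boost per merger and all numerical values are hypothetical choices made for this sketch; they are not measurements or results from simulations.

```python
def icl_to_mass_ratio(t, t_form, merger_times, steady_rate=0.01, merger_boost=0.03):
    """Toy ICL-to-mass ratio at time t (Gyr) for a system formed at t_form.

    Steady regime: the ratio grows linearly at steady_rate per Gyr.
    Violent regime: each merger that has occurred by time t adds merger_boost.
    Both parameters are made-up numbers used only for illustration.
    """
    if t < t_form:
        return 0.0
    n_mergers = sum(1 for tm in merger_times if t_form <= tm <= t)
    return steady_rate * (t - t_form) + merger_boost * n_mergers

# S1 forms early and stops merging early; S2 forms later and has the
# same number of mergers, the last one at t = 9 Gyr.
s1 = icl_to_mass_ratio(t=9.0, t_form=1.0, merger_times=[2.0, 3.0, 4.0])
s2 = icl_to_mass_ratio(t=9.0, t_form=4.0, merger_times=[5.0, 7.0, 9.0])
print(f"S1: {s1:.2f}  S2: {s2:.2f}")
```

In this toy picture both systems receive the same total merger contribution, so the higher ratio of S1 at the time of the last merger of S2 comes entirely from its longer time spent in the steady regime.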

DATA: OBSERVATIONS, CALIBRATION AND METHODOLOGY

HST
Three orbits of Hubble Space Telescope (HST) time were allocated to image the fossil group RX J1007+3800 in Cycle 25 (PI: Dupke). This proposal was part of a larger program, which also granted a 53 ksec XMM-Newton (XMM) observation. HST images were taken with the Advanced Camera for Surveys (ACS) on May 11th 2018, using two different filters: F435W (one orbit) and F606W (two orbits). Each orbit was divided into four dithered exposures in order to cover the gap between the two ACS detectors, mitigate bad detector pixels, enable cosmic ray rejection, and provide sub-pixel sampling to improve the final resolution of the stacked images. Individual raw exposures were first processed by the default HST pipeline in MAST with CALACS$^{4}$, which applies several detector-level calibrations including bias subtraction, flat-fielding, and correction for charge transfer efficiency losses, as well as geometric distortion corrections using drizzlepac$^{5}$. We then applied additional processing to these calibrated exposures, in particular improving the rejection of cosmic rays and bad pixels and the astrometric alignment, following procedures first developed by Koekemoer et al. (2011), achieving better astrometric precision than provided by the default pipeline. We thereby produced combined mosaics with parameters optimized to our specific observational design, obtaining cleaner stacked images and a better sampling of the PSF, to take advantage of our sub-pixel dithering observing strategy. Our final mosaics are at a pixel scale of 0.03″ per pixel for optimal PSF sampling, as well as 0.06″ per pixel for computationally intensive analysis of the ICL. Final exposure depths were 2255 and 4493 seconds for the F435W and F606W filters, respectively.
We measured the ICL of both images using CICLE (CHEFs ICL Estimator, Jiménez-Teja & Dupke 2016). CICLE is an algorithm specially designed to estimate the ICLf in galaxy clusters. In order to disentangle the galactic contribution from that of the ICL, foreground stars are usually masked out and galaxies are fitted using mathematical orthonormal bases called CHEFs (Chebyshev-Fourier functions, Jiménez-Teja & Benítez 2012). The efficiency and flexibility of the CHEFs in modeling the surface distribution of galaxies are directly inherited from the mathematical properties of the Chebyshev rational functions and Fourier modes that compose them. Moreover, the CHEFs have proven to be able to fit very different galaxy morphologies (Jiménez-Teja & Benítez 2012), including BCGs (Zitrin et al. 2010). This flexibility allows the CHEFs to fit any regular galaxy of a cluster, since they appear as reasonably well defined clumps of luminosity over the smooth, extended ICL background. However, the BCG extended halo can easily be misidentified as ICL, since the transition from the BCG-dominated to the ICL-dominated area is gradual. For this reason, CICLE outlines the limits of the BCG applying a more sophisticated approach before modeling it out with the CHEFs. Basically, CICLE calculates the curvature at each point of the BCG+ICL composite surface, intuitively understood as the intensity surface profile. The curvature is defined as the difference in the slope between a point and its surroundings. Once the region where the change in slope of the composite surface is maximized has been found, the transition from the BCG to the ambient ICL is determined. Then, CICLE removes the BCG component and uses the surrounding region to interpolate the underlying ICL. Using the maximum slope change as the transition from the BCG to the "true" ICL is physically justified because the stars from the BCG are supposed to have different kinematics from those of the ICL, since they originated through different processes.
CICLE is based on the assumption that the two profiles, the BCG and the ICL, have different inclinations, so the points where the curvature is maximized indicate the transition from the BCG halo to the ICL. This algorithm has been successfully tested with mock data (Jiménez-Teja & Dupke 2016) and with space and ground-based observations of both nearby (Jiménez-Teja et al. 2019) and intermediate-redshift clusters (Jiménez-Teja et al. 2018, 2021a; de Oliveira et al. 2022). Once the ICL maps are computed, we calculate the ICLf.
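The idea of locating the BCG-to-ICL transition at the point of maximum slope change can be illustrated with a one-dimensional toy profile. This is only a sketch of the principle: CICLE itself works on the two-dimensional composite surface, and the exponential components, their amplitudes and scale lengths, and the function name below are hypothetical.

```python
import math

def transition_radius(radii, intensity):
    """Radius where the log-intensity profile changes slope most strongly.

    Assumes a uniform radial grid and strictly positive intensities.
    """
    log_i = [math.log(v) for v in intensity]
    best_r, best_curv = None, -math.inf
    for k in range(1, len(radii) - 1):
        # Second difference: slope after the point minus slope before it.
        curv = log_i[k + 1] - 2.0 * log_i[k] + log_i[k - 1]
        if curv > best_curv:
            best_r, best_curv = radii[k], curv
    return best_r

# Toy composite profile: a steep "BCG" plus a shallow "ICL" exponential
# (amplitudes in arbitrary units, scale lengths of 10 and 100 kpc).
radii = [float(r) for r in range(0, 201)]
profile = [100.0 * math.exp(-r / 10.0) + 1.0 * math.exp(-r / 100.0) for r in radii]
r_t = transition_radius(radii, profile)
print(f"transition radius ~ {r_t:.0f} kpc")
```

For two exponentials the slope change of the log profile is largest where both components contribute equally, which for these toy parameters is at ln(100)/0.09 ≈ 51 kpc.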

Gemini
In previous works, we identified the member candidates using spectroscopic redshifts (Jiménez-Teja & Dupke 2016; Jiménez-Teja et al. 2018, 2019), which have the advantage of being much more precise than photometric redshifts. On the other hand, spectroscopic samples usually are not complete, and FGs have a paucity of bright galaxies in the central regions, so that mis-identification errors can bias the results significantly. So, we prioritized purity over completeness. In the case of RX J1007+3800, only 9 galaxies with spectroscopic redshifts associated with the peak of the cluster at z ∼ 0.112 (within ±1500 km s$^{-1}$) and inside an area of ∼ 0.61 × 0.61 Mpc$^2$ (∼ 6 × 6 arcmin$^2$) are publicly available (Aguado et al. 2019). The number of galaxies associated with the cluster increases to 16 when the search area is extended to ∼ 1.4 × 1.4 Mpc$^2$ (∼ 12 × 12 arcmin$^2$). To overcome the lack of spectral information in the FG field, in particular in the central regions, where there are only a few galaxies with spectroscopic redshift information, new spectroscopic redshifts were obtained using the Gemini Multi-Object Spectrograph mounted at the Gemini North telescope (GMOS, Hook et al. 2004). We used the SDSS DR15 to construct the color-magnitude diagram (CMD) and select the galaxies for spectroscopic follow-up inside the GMOS field of view (5.5 × 5.5 arcmin$^2$). Figure 1 shows the CMD of all galaxies with SDSS photometry and r ≤ 23 mag (gray points). One hundred and eight galaxies brighter than r = 21.5 were selected as member candidates for spectroscopy (green dots in Fig. 1).
Figure 1. Color-magnitude diagram of all objects classified as galaxies inside the GMOS field of view. Gray points represent all galaxies with SDSS DR15 magnitudes brighter than r = 23 mag and the green dots represent the selected FG galaxy candidates. About 61% of the selected sample (66 galaxies) were included in two GMOS masks (see text).
A total of 66 galaxies (∼ 61% of the selected sample) were included in two GMOS masks, prioritizing those galaxies located within ±1.5σ around the best fit for the galaxies lying in the red sequence (red solid and dashed lines in Fig. 1). It is worth noting that the apparent magnitude cutoff (r = 21.5 mag) is one magnitude deeper than the second ranked galaxy (∼ $M^*_r$ + 3.5 mag). The two GMOS masks were observed on different nights under the Program ID GN-2019A-FT-206 (PI: Dupke). Mask 1 was observed on 2019 May 25 UT during dark time, under clear sky and poor seeing (∼ 1″) conditions. The second mask was observed a month later, on 2019 June 20 UT, during dark time, with some cirrus (patchy clouds) and under good seeing (∼ 0.7″) conditions. The spectra were acquired using the R400 grating centered at 6250 Å, using 1″ slitlets and 2 × 2 binning. Wavelength offsets of 100 Å toward the blue and the red were applied between exposures to cover the gaps between CCDs. At each wavelength setting, spectroscopic flats and CuAr comparison lamp spectra were taken before or after each science exposure. The science spectra were flux-calibrated using the spectrophotometric standard star Feige 34, observed with the same instrument setup as the science images, but on a different night (April 24, 2019 UT) and under different observing conditions. Therefore, only a relative flux calibration of the science spectra is provided.
The science images were reduced using the Gemini GMOS package version 1.14$^{6}$ following the standard procedures for multi-object spectroscopic (MOS) observations. All science and calibration exposures were overscan-corrected, bias-subtracted and trimmed. The two-dimensional science exposures were then flat-fielded, wavelength-calibrated, distortion-corrected, and extracted to one-dimensional format. The final wavelength solution has an average rms of ∼ 0.25 Å. The resolution of the one-dimensional extracted spectra is ∼ 7.1 Å (measured from the FWHM of the sky lines), with a dispersion of ∼ 1.5 Å pixel$^{-1}$, and covering a wavelength interval be-
The redshifts of the selected galaxies in the FG field were determined using the RV package inside IRAF. All spectra were cross-correlated with four high signal-to-noise (S/N) templates using the program FXCOR. The redshift errors were estimated based on the R statistic value of Tonry & Davis (1979). For galaxies with obvious emission lines, a line-by-line Gaussian fit was employed using the routine RVIDLINES. The errors of the measurements were estimated using the residual of the average redshift shifts of all measurements provided by the program. We were able to determine the redshifts for all 66 galaxies included in the two GMOS masks (100% success rate), plus one additional galaxy found by chance in one of the slits.
The GMOS field of view covers a physical area of 0.68 × 0.68 Mpc$^2$, corresponding roughly to the core of RXJ1007. To obtain a robust estimation of the dynamical parameters, we have to increase the number of member galaxies beyond the GMOS field of view. We used the SDSS DR15 database to retrieve the spectroscopic redshift information of all galaxies within a field of view of 24′ × 24′ (∼ 2.95 × 2.95 Mpc$^2$). The search was limited to galaxies with 0.0 < z < 0.3. Our GMOS sample is then supplemented with an additional 53 galaxies inside the above area and redshift interval. Of the 53 galaxy redshifts retrieved from SDSS DR15, 10 galaxies have redshift estimates in common with GMOS. The redshifts obtained with GMOS agree well with those in the SDSS DR15 database, with a mean difference of 42 ± 20 km s$^{-1}$ (rms of 63 km s$^{-1}$) between both data sets.
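The conversion between angular and physical sizes quoted above can be sketched as follows. The cosmological parameters (flat ΛCDM with H0 = 70 km s$^{-1}$ Mpc$^{-1}$ and Ω$_m$ = 0.3) are an assumption of this sketch and are not necessarily those adopted in this work, so small differences with respect to the quoted values are expected.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def kpc_per_arcsec(z, h0=70.0, omega_m=0.3, steps=1000):
    """Proper transverse scale (kpc per arcsec) in a flat LambdaCDM model."""
    omega_l = 1.0 - omega_m

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    # Trapezoidal integration of dz / E(z) from 0 to z.
    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for k in range(1, steps):
        integral += inv_e(k * dz)
    integral *= dz

    d_comoving_mpc = (C_KM_S / h0) * integral
    d_angular_mpc = d_comoving_mpc / (1.0 + z)
    # One arcsecond in radians times the angular diameter distance in kpc.
    return d_angular_mpc * 1000.0 * math.radians(1.0 / 3600.0)

scale = kpc_per_arcsec(0.112)
gmos_side_mpc = 5.5 * 60.0 * scale / 1000.0   # 5.5 arcmin GMOS field
sdss_side_mpc = 24.0 * 60.0 * scale / 1000.0  # 24 arcmin search box
print(f"{scale:.2f} kpc/arcsec, {gmos_side_mpc:.2f} Mpc, {sdss_side_mpc:.2f} Mpc")
```

Under these assumed parameters the scale at z = 0.112 is roughly 2 kpc per arcsecond, close to the field sizes quoted in the text.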
The final galaxy catalog contains 98 galaxies with secure redshift determinations. The average redshift, the one-dimensional line-of-sight velocity dispersion and the number of member galaxies of the cluster were estimated using the robust bi-weight estimators of central location ($C_{BI}$) and scale ($S_{BI}$) of Beers et al. (1990), using an iterative procedure and applying a 3σ clipping algorithm to remove outliers. The best estimates of the location (Z) and scale ($\sigma_{los}$), as well as the number of member galaxies, the $r_{200}$ and the $M_{200}$ (see below), are shown in Table 1. The list of member galaxies of RXJ1007, with magnitudes, colors and redshifts, is shown in Table 6 in Appendix B. For completeness, Table 7 in Appendix B shows the catalog of the foreground and background galaxies observed with GMOS.
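A minimal sketch of bi-weight location and scale estimators with iterative 3σ clipping, in the form given by Beers et al. (1990) with tuning constants c = 6 and c = 9, is shown below. The velocity sample is synthetic and the function names are ours; this is an illustration of the technique rather than the exact procedure applied to the catalog.

```python
import math
import random
import statistics

def _median_and_mad(values):
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, mad

def biweight_location(values, c=6.0):
    med, mad = _median_and_mad(values)
    if mad == 0.0:
        return med
    num = den = 0.0
    for v in values:
        u = (v - med) / (c * mad)
        if abs(u) < 1.0:  # points beyond c*MAD get zero weight
            w = (1.0 - u * u) ** 2
            num += (v - med) * w
            den += w
    return med + num / den

def biweight_scale(values, c=9.0):
    med, mad = _median_and_mad(values)
    if mad == 0.0:
        return 0.0
    num = den = 0.0
    for v in values:
        u = (v - med) / (c * mad)
        if abs(u) < 1.0:
            num += (v - med) ** 2 * (1.0 - u * u) ** 4
            den += (1.0 - u * u) * (1.0 - 5.0 * u * u)
    return math.sqrt(len(values)) * math.sqrt(num) / abs(den)

def clipped_biweight(values, nsigma=3.0, max_iter=20):
    """Iterate location/scale estimates, discarding points beyond nsigma*scale."""
    sample = list(values)
    for _ in range(max_iter):
        loc = biweight_location(sample)
        scale = biweight_scale(sample)
        kept = [v for v in sample if abs(v - loc) <= nsigma * scale]
        if len(kept) == len(sample):
            break
        sample = kept
    return biweight_location(sample), biweight_scale(sample), len(sample)

# Synthetic line-of-sight velocities (km/s): 40 "members" plus 3 interlopers.
random.seed(1)
velocities = [random.gauss(0.0, 500.0) for _ in range(40)] + [6000.0, -7000.0, 9000.0]
loc, sigma_los, n_members = clipped_biweight(velocities)
print(f"location = {loc:.0f} km/s, scale = {sigma_los:.0f} km/s, N = {n_members}")
```

The same functions could be applied directly to redshifts instead of velocities, with the scale converted to km s$^{-1}$ afterwards.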
The upper panel of Figure 3 shows the redshift distribution of all galaxies with z ≤ 0.3 within the 24′ × 24′ field of view. The red histogram shows the distribution of the 46 spectroscopically confirmed member galaxies of RXJ1007. The inset in the upper panel shows the projected phase-space diagram for all spectroscopically confirmed member galaxies, given by the peculiar line-of-sight velocity of each member galaxy with respect to the mean cluster velocity, normalized by the velocity dispersion of the cluster (equation 1), as a function of the projected distance from the center of the cluster, normalized by $r_{200}$.
The dashed lines in the inset represent the escape velocity for an NFW halo with the corresponding $M_{200}$, projected along the line of sight (Jaffé et al. 2015). The galaxies located inside the dashed lines are expected to be gravitationally bound to the cluster, i.e., located in the virialized region of RXJ1007. The spatial distribution of the spectroscopically confirmed members is shown in Figure 2. M1 and M2 are the first and second rank members, with $\Delta m_{1,2}$ = 2.45 and a projected separation of 0.66 Mpc.
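The coordinates of the projected phase-space diagram described above can be sketched with the commonly used expression for the peculiar velocity, v = c (z − z_cl)/(1 + z_cl). The velocity dispersion and $r_{200}$ used below are hypothetical placeholders (the measured values are those listed in Table 1), and the function names are ours.

```python
C_KM_S = 299792.458  # speed of light in km/s

def peculiar_velocity(z, z_cluster):
    """Line-of-sight velocity (km/s) relative to the cluster rest frame."""
    return C_KM_S * (z - z_cluster) / (1.0 + z_cluster)

def phase_space_point(z, r_proj_mpc, z_cluster, sigma_los, r200_mpc):
    """Return (r / r200, v_pec / sigma_los) for one galaxy."""
    return r_proj_mpc / r200_mpc, peculiar_velocity(z, z_cluster) / sigma_los

# Hypothetical galaxy at z = 0.114 and 0.66 Mpc from the center, in a cluster
# at z = 0.112 with placeholder sigma_los = 600 km/s and r200 = 1.5 Mpc.
v_pec = peculiar_velocity(0.114, 0.112)
x, y = phase_space_point(0.114, 0.66, 0.112, sigma_los=600.0, r200_mpc=1.5)
print(f"v_pec = {v_pec:.1f} km/s, r/r200 = {x:.2f}, v/sigma = {y:.2f}")
```

The same peculiar velocity expression can be used to apply a velocity window, such as ±1500 km s$^{-1}$ around the cluster redshift, when preselecting member candidates.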
The $M_{200}$ in Table 1 was computed using the σ-$M_{200}$ scaling relation of Munari et al. (2013), obtained from zoomed-in hydrodynamical simulations of Dark Matter (DM) halos calibrated using Dark Matter particles and taking into account prescriptions for cooling, star formation, and Active Galactic Nuclei (AGN) feedback:
Bottom Panel: Colour-magnitude diagram of all galaxies brighter than r = 23 mag inside 24′ × 24′ in the RXJ1007 field (gray points). The squares and triangles represent the members and foreground/background galaxies, respectively. The galaxies observed with GMOS are represented by red squares (member galaxies) and blue triangles (foreground/background galaxies). The solid red line fits the red sequence of the FG, while the dashed lines represent ±1.5σ from the best fit of the red sequence.
where $\sigma_{1D}$ is the one-dimensional (1D) velocity dispersion,
Table 2. Only data from the European Photon Imaging Camera (EPIC) (MOS and PN detectors) are processed and reported in this paper. The standard Science Analysis System (SAS 18.0.0) pipeline tools were used throughout this analysis. The SAS tools emchain and epchain, for regular and also Out Of Time (OOT) events, were used to generate calibrated event files from the raw data. mos-filter and pn-filter were subsequently used to remove the soft proton flares. Point sources resolved with the SAS tool cheese were removed, and additional sources were subsequently removed manually using the xmmselect GUI after verification by eye, using regions large enough to encompass at least one XMM PSF (20″).
The background is composed of several components, including soft proton flares (SP), cosmic ray induced instrumental fluorescent lines, solar wind charge exchange (SWCX) and several astrophysical components, including the Local Bubble, the Galaxy Halo in the soft bands and a power-law component for the cosmic X-ray background (CXB) (for details see www.cosmos.esa.int/web/xmm-newton/epic-background-components). We determined the main parameters of these components directly through spectral fittings following the recommendations of Snowden & Kuntz (2014)$^{8}$.
MOS and PN instrumental backgrounds were calibrated with Filter Wheel Closed (FWC) data (as described below). MOS 1 CCD#6 in Observation 0824910201, CCDs #3, #4 & #6 were not considered in Observations 0824910201 and 0824910101, due to known primary or secondary damage from micrometeorite hits. The effective exposure times for the source data are shown in Table 2.
Production of intermediate files necessary to create model background spectra for the specific regions analyzed here, including spectra and responses, was made with the tools mos-spectra and pn-spectra, using the most recent Filter Wheel Closed (FWC) calibration data. Quiescent particle background (QPB) spectral models were generated with the tools mos_back and pn_back with OOT events subtracted, and the solid angle of the individual regions was derived from the task proton_scale.

Background Treatment
Given the complexity of the XMM detector background, we determined the remaining non-quiescent and cosmic background through spectral fitting, modeling all components together with an absorbed CIE model for the intracluster gas residual emission, for a large external annular region from 3′-12′ (∼360-1,450 kpc, which starts at ≥ 0.5 $r_{500}$) centered at RXJ1007, hereafter called the "outer region". We chose the outer region to be thick enough to have better statistics to constrain the other modelled background components, even at the cost of having to insert a source component (the FG emission), without overwhelming its contribution. Given that the system is at an intermediate redshift, we believe this is a good option and allows us to have slightly better statistics overall, given that we do not refit the slopes of the soft proton background for each annulus. Spectra from all detectors were individually grouped to have a minimum of 20 counts per channel with the ftool grppha.
The FG contamination in this external region was modeled with an absorbed apec model. The nH column (1.35 × $10^{20}$ cm$^{-2}$) was chosen from the HI4PI Map (HI4PI Collaboration et al. 2016) through the HEASARC nH tool$^{9}$. Abundances listed here are with respect to the photospheric value (Anders & Grevesse 1989). Following the prescription laid out by Snowden et al. (2004), we introduced Gaussian components in the spectral model that were not included in the QPB spectra, with energies of 1.486 keV (Al Kα) for MOS & PN and 1.74 keV (Si Kα) for MOS, to account for instrumental fluorescence lines, and also five more Gaussians for the PN for the main contribution from Cu, Ni & Zn at 7.49 keV, 7.11 keV, 8.05 keV, 8.62 keV, and 8.90 keV. Two extra Gaussians are included to account for the potentially strongest SWCX contribution to the O VII & O VIII lines (e.g., Kuntz 2019) at 0.56 and 0.65 keV. Two more apec models were used, one for the Local Bubble emission with a fixed temperature of 0.1 keV and an absorbed one to account for the hot Galactic Halo with a fixed temperature of 0.26 keV. Their metal abundance and redshift are fixed at 1 and 0, respectively, and their model normalizations were allowed to vary independently. The redshift of the FG emission was fixed at the nominal value. An absorbed power law with a slope of 1.46 was also incorporated to account for the CXB.
$^{9}$ heasarc.gsfc.nasa.gov/cgi-bin/Tools/w3nh/w3nh.pl
To constrain the cosmic background component, we added a fourth data group with the ROSAT All-Sky Survey (RASS) spectrum of the off-source (1°-2°) background as a good approximation for the cosmic background in the direction of the target, obtained from the HEASARC X-ray Background Tool$^{10}$ with the respective responses.
We then fit the background spectra with XSPEC 12.11.0 using the following model: gauss + gauss + gauss + gauss + gauss + gauss + cons*cons *(gauss + gauss + apec + (apec + apec + pow)*wabs + apec*wabs). The constant components are for the relative normalization between the detectors and the solid angle scaling.
To determine the contamination level by soft protons (SP), a broken power law with a break at 3.0 keV was added as a separate model using the diagonal responses included in the updated Current Calibration Files (CCF). The best fit values for the broken power-law slopes and the FG residual contamination in the outer region are shown in Table 3. These slopes are fixed in the spectral fits for all internal regions.
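For reference, the functional form of a broken power law with a break at 3.0 keV can be sketched as below, following the standard definition in which the two segments join continuously at the break energy. The slopes and normalization are hypothetical placeholders; the fitted values are those reported in Table 3.

```python
def broken_power_law(energy_kev, norm, gamma1, gamma2, e_break=3.0):
    """Broken power law, continuous at e_break.

    norm is the value at 1 keV when e_break >= 1 keV; gamma1 and gamma2
    are the slopes below and above the break.
    """
    if energy_kev <= e_break:
        return norm * energy_kev ** (-gamma1)
    return norm * e_break ** (gamma2 - gamma1) * energy_kev ** (-gamma2)

# Placeholder slopes, chosen only to show the change of slope at 3 keV.
energies = [0.5, 1.0, 3.0, 6.0, 10.0]
sp_model = [broken_power_law(e, norm=1.0, gamma1=0.5, gamma2=1.0) for e in energies]
for e, f in zip(energies, sp_model):
    print(f"{e:5.1f} keV  {f:.4f}")
```

Keeping the two slopes fixed, as done for the internal regions, leaves only the normalization of this component free in each fit.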
For the sake of comparison, we also carried out an alternative background strategy, in which we reduced the size of the outer region to a very external annulus, where no source contribution was expected, and added a carefully chosen external pointing to fix some of the CXB parameters. In this strategy we re-modelled the full background (including the QPB) in every cluster region. The results of this alternative method are mostly consistent with the previous ones and are shown in Appendix A.

RESULTS
ICL
CICLE was applied to the fully calibrated images of RXJ1007, both in the F435W and the F606W bands. Results show a relatively compact ICL (especially in the F606W filter), highly concentrated around the BCG (Figure 4). Its distribution is smooth, without signs of substructure, bumps or irregularities of significance. The obtained ICLfs are 7.24 ± 3.48% and 12.39 ± 0.5% for the F435W and F606W bands, with detection limits of 137.7 kpc and 178.0 kpc, respectively. The error bars include both the photometric error associated with the measurement of the flux of the ICL and the cluster galaxies, and the theoretical error associated with the CICLE algorithm. This latter error is calculated by simulating images with the same observational and geometrical characteristics as the original images (i.e., simulating a composite BCG+ICL surface using exponential profiles with the same effective radii and magnitudes as the original objects, and adding noise with the same signal-to-noise as our HST data). CICLE is then applied to these simulations to calculate its intrinsic error, basically associated with the accuracy in the calculation of the points where the BCG-to-ICL transition occurs. The ICL maps in both filters are displayed in Figure 4 along with the original RXJ1007 images.
As mentioned previously, the ICLf can be used as an indicator of the dynamical stage of a cluster of galaxies in the redshift range 0.18 ≤ z ≤ 0.55 (Jiménez-Teja et al. 2018, 2019). Merging (or dynamically active) clusters show a clear signature in the ICLf measured at different wavelengths: an excess in the ICLf measured in the filters that correspond to the peak emission of late-A/early-F type stars, hereafter called the blue ICL excess (BIE). Relaxed clusters display a constant ICLf (within the error bars) independently of the optical band used to measure it. The BIE is consistent with being produced by the stripping of relatively younger and shorter-lived stellar populations from the outskirts of galaxies into the ICL during mergers. In this scenario, the brighter A-F population responsible for the BIE would be expected to leave the main sequence towards the giant locus, effectively vanishing from the ICLf at the frequencies corresponding to the BIE within a relatively short timescale after being stripped. Roughly, given a half-life of about 2 Gyr for this stellar population (for an F2 star), it is reasonable to assume that this is also the duration of the BIE after stripping, after which the overall ICLf wavelength distribution would resemble that of relaxed clusters, i.e., a flat distribution profile (Figure 5).
In Figure 5, in addition to all ICLHST clusters, we also included a recently measured merging system, WHL J013719.8-08284 at z=0.566 (Jiménez-Teja et al. 2021a), two Frontier Fields clusters with ICL analyzed very recently, A370 (z=0.375) and S1063 (z=0.348) (de Oliveira et al. 2022), and a potential "fossil system" where a somewhat similar ICL analysis has been performed (Yoo et al. 2021), RX J105453.3+552102 at z∼0.47. Compared to the systems analyzed in the above-mentioned previous works, Figure 5 shows that RXJ1007 behaves differently from all clusters: it shows neither a constant ICLf nor the previously described BIE. Even though the value obtained for the F435W band is consistent with the relaxed clusters in ICLHST, the ICLf computed for the F606W band is higher than the typical values found for the passive/relaxed systems. In addition, the measured ICLf for the F606W band is significantly higher than that of F435W.

Figure 5. ICLf rest-frame color distribution for merging (red) and relaxed (blue) clusters studied in Jiménez-Teja et al. (2018). Black lines indicate the error-weighted mean of each main-sequence spectral-type subsample, and colored shaded areas the mean of the errors. We also plot in green the values for a candidate fossil cluster, RX J105452.03+552112.5 (Yoo et al. 2021). Gray vertical lines at the top of the figure split the wavelength range expected for the peak emission of an average main-sequence star, labeled with gray letters. Although the distribution for relaxed clusters is mostly flat, that of merging clusters shows an excess in the region corresponding to the emission peaks of late A- and early F-type stars. The RXJ1007 distribution is shown by the black squares and follows the typical behavior of neither a merging nor a normal relaxed cluster. For the sake of visibility, we used a symbol size larger than the small error bars for the F606W filter.
That particularity could, in principle, be expected if the system has been without mergers for a very long time. If the system continued to be undisturbed for a very long time one would expect that tidally stripped stars will be of earlier types, and since all the new star formation would happen inside galaxies, the ICLf in the bluer band would be reduced and in the redder band enhanced, as we observe for this system.

Spectral analysis
We originally extracted spectra from five annular regions within the central 600 kpc: 0-50 kpc, 50-100 kpc, 100-150 kpc, 150-300 kpc and 300-600 kpc. At the redshift of RXJ1007, 1″ ∼ 2.05 kpc, so the central bin was chosen to cover a significant region of XMM's PSF, with 80% of the encircled energy fraction (see the XMM-Newton Users Handbook). Miller et al. (2012) also analyzed a Chandra snapshot observation of RXJ1007, which we use for comparison. They found a gas temperature of $T_X = 2.60^{+0.63}_{-0.53}$ keV including all emission within 250 kpc of the center. This previous Chandra observation showed that the AGN contamination is small, at ≈ 4% of the central bin (0-50 kpc), and is limited to r ≤ 2″. Spectral fits of the central region with and without the central 8″ did not show significant differences. The second and third bins cover the predicted cooling radius, originally estimated using $T_X \sim 2.6$ keV. A modelling similar to that described in section 2.3.1 for the outer region was used for each of the regions, but with the SP slopes frozen at the best-fit values found in the outer region. The normalizations were free to vary. Energy bands were restricted to 0.3-8.0 keV. Overall, 24 parameters were free to vary, as opposed to the fits for the outer region designed to measure the background, where 28 parameters were allowed to vary.
We fit MOS 1, 2 and PN spectra simultaneously for each observation, since the results were consistent across all 3 observations individually. The best-fit values of temperatures and abundances are shown in Table 3 and plotted in Fig. 6. Both temperature and abundance profiles show negative radial gradients. The intracluster gas temperature reaches 3.0±0.1 keV in the central 50 kpc, dropping to 1.8±0.1 keV at ≥ 150 kpc. The metal abundance is very high in the most central bin (within 30 kpc), reaching 1.3±0.2 solar and dropping steeply by a factor of three at 75 kpc, still within the hot core. The abundance then flattens outwards at ∼ 0.25 solar. Central abundance gradients are fairly typical of cool core clusters (e.g., De Grandi et al. 2004) but very rare in non-cool core clusters, which are typically dynamically active. We are unaware of any non-cool-core cluster having such a steep abundance gradient as that observed in RXJ1007.

Image analysis and surface brightness
Having followed the standard procedure for background determination and determined the main characteristics (normalizations and slopes) of the SWCX and SP contamination, together with the 0.4-1.25 keV QPB image, we can produce a "clean" image for analysis. We used the tool proton with the parameters for the broken power law and corresponding normalizations to obtain the SP images. We used the tool swcx with the previously determined normalizations for the lines at 0.56 keV and 0.65 keV to obtain SWCX images. After aligning them to the sky coordinates using the tool rot-im-det-sky, we created a full-background-subtracted and exposure-corrected image combining all instruments using the tool comb (with thresholdmasking parameters set to 0.02 and binning to 2). We show its smoothed version in Fig. 7. We extracted the surface brightness from 100 equally spaced annuli centered at the X-ray center. We fit these surface brightness profiles to a single β model of the form

$$S = S_0 \left[1 + \left(\frac{r}{r_c}\right)^2\right]^{-3\beta+0.5},$$

where $S_0$ is the normalization, $r$ is the projected radius, $r_c$ is the core radius and $\beta$ is the slope. The best-fitting values are $r_c = 39.7 \pm 1.5$ kpc and $\beta = 0.42 \pm 0.003$ for a $\chi^2$ of 180 and a correlation coefficient of -0.69. The plot is shown in Fig. 8. It can be seen that the central bins present a significant excess, which is characteristic of cool core clusters. If we limit the inner fitting region so as to encompass > 50% of the encircled energy fraction of the XMM PSF, or 15″, we obtain $r_c = 53.3 \pm 1.5$ kpc and $\beta = 0.44 \pm 0.003$ for a $\chi^2$ of 175 and a correlation coefficient of -0.77. This is fully consistent with the values obtained from the Chandra snapshot by Miller et al. (2012), who found $r_c = 50^{+19}_{-15}$ kpc and $\beta = 0.5^{+0.09}_{-0.07}$ in a single β-model surface brightness fit.

Mass estimates
The masses can be derived using Euler's momentum equation which, in the absence of bulk fluid velocities ($v$), allows the gravitational field ($g$) to be fully characterized by the gas density ($\rho$) and pressure ($P$) profiles. As implied by the single β models used to fit the surface brightness, the particle density profile, as a function of the central density and core radius ($n_0$, $r_c$), is expressed as $n = n_0\left[1 + (r/r_c)^2\right]^{-3\beta/2}$. Following Arnaud & Evrard (1999), we call this the β model (BM) approach. For an isothermal case, the mass enclosed within a particular radius can be expressed as

$$M(\le r) = 1.11\times10^{14}\,\beta \left(\frac{\mu}{0.6}\right)^{-1} \frac{kT}{\mathrm{keV}}\,\frac{r}{\mathrm{Mpc}}\,\frac{(r/r_c)^2}{1+(r/r_c)^2}\,M_\odot \quad (4)$$

where $\mu$ is the mean molecular weight, and where $\delta_c$ is the density contrast with respect to the critical density. Assuming that the hot-core-excised temperature is the value at and over 300 kpc, i.e. 1.78 keV, and the best-fit values for $r_c$ and $\beta$ from the surface brightness fitting (previous section), we obtain $M_{500}^{BM} = (0.39 \pm 0.05)\times10^{14}\,M_\odot$, where $r_{500}^{BM} = 0.53 \pm 0.02$ Mpc. To correct for the central negative temperature gradient, we can assume that the temperature distribution can be approximated by two isothermals, the first one with the internal temperature set at the value corresponding to the error-weighted average of the regions at 200 kpc, i.e., the hot core, (2.78±0.05) keV, and a second isothermal with our most external region value of (1.78±0.1) keV.

Figure 7 (right panel). X-ray contours overlaid on the HST image (F606W). The 90% encircled total energy scale based on the XMM PSF is ∼45″, 47″ and 49″ at 0′, 1.5′ and 3′ from the center (xmm-tools.cosmos.esa.int/external/xmm_user_support/documentation/uhb/offaxisxraypsf.html). N is up, E is left.

Figure 8. Surface brightness profile for the full-background-subtracted and exposure-corrected X-ray image combining all detectors for RXJ1007 observation 0653450201. We show here a single beta-model profile fit.
We then add the mass difference between the two isothermals, measured in the inner hot core, to the total mass derived with the second isothermal and find $M_{500} = (0.47 \pm 0.06)\times10^{14}\,M_\odot$ (a single isothermal with $T_X = 2.78\pm0.05$ keV would give $M_{500} = (0.61 \pm 0.08)\times10^{14}\,M_\odot$). A more straightforward way to estimate the mass throughout the non-isothermal region would be to directly fit P(r) within the "hot" region ($r \le r_{hot}$), so that the mass follows from the pressure gradient, where $n_{hot}$ is the number density at $r_{hot}$, and $G$ and $m_H$ are the gravitational constant and the hydrogen mass. Using a linear fit for P(r), we obtain $M_{500} = (0.52 \pm 0.06)\times10^{14}\,M_\odot$. On the other hand, one can also estimate the system mass using scaling relations based on virialization (VT), either through the X-ray temperature (Evrard et al. 1996) or through galaxy velocity dispersion measurements (Carlberg et al. 1997). We chose to use this VT-based value for the mass henceforth throughout the paper because (1) it is the most consistent with that derived from the velocity dispersion (section 2.2), (2) it is less prone to the systematics involved in the estimation of the surface brightness parameters with XMM and (3) it provides the most conservative value in the interpretation of the specific ICLf peak at 5400 Å, which will become clear in the next section.
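The numerical prefactor of the isothermal β-model mass expression (Equation 4) can be checked from first principles. The sketch below is a minimal illustration, not the authors' pipeline: it assumes the prefactor is $3\,\mathrm{keV\,Mpc}/(G \mu m_p)$ expressed in solar masses, and the function names and the choice of fit parameters in the example call are hypothetical.

```python
# Physical constants in SI units.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_PROTON = 1.6726e-27  # proton mass, kg
KEV = 1.602177e-16     # 1 keV in joules
MPC = 3.0857e22        # 1 Mpc in metres
M_SUN = 1.98847e30     # solar mass, kg


def beta_model_coefficient(mu=0.6):
    """Return 3 keV Mpc / (G mu m_p) in solar masses."""
    return 3.0 * KEV * MPC / (G * mu * M_PROTON * M_SUN)


def isothermal_beta_mass(r_mpc, kt_kev, beta, rc_mpc, mu=0.6):
    """Mass (solar masses) enclosed within r for an isothermal beta model."""
    x2 = (r_mpc / rc_mpc) ** 2
    return beta_model_coefficient(mu) * beta * kt_kev * r_mpc * x2 / (1.0 + x2)


coefficient = beta_model_coefficient()

# Illustrative call only: outer temperature (1.78 keV) with the second
# surface-brightness fit (beta = 0.44, r_c = 53.3 kpc), evaluated at 0.53 Mpc.
# Which fit the text actually adopted is an assumption made here.
example_mass = isothermal_beta_mass(r_mpc=0.53, kt_kev=1.78, beta=0.44, rc_mpc=0.0533)

print(f"coefficient  = {coefficient:.3e} M_sun")
print(f"M(<0.53 Mpc) = {example_mass:.3e} M_sun")
```

The coefficient comes out close to the $1.11\times10^{14}$ quoted in Equation 4. The example mass is not expected to reproduce the quoted $M_{500}$ exactly, since the determination of $r_{500}$ and the adopted fit parameters are not fully specified here.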
Integrating the density up to $r_{500}$ we obtain a gas mass of $M_{gas500} \sim (7.59 \pm 0.24)\times10^{12}\,M_\odot$ for $\mu = 0.6$. This implies a gas fraction $f_{gas500} = 0.119 \pm 0.041$ ($f_{gas200} = 0.169 \pm 0.06$). Assuming a galaxy $(M/L_r)$ ratio of 6 in the r-band, obtained by converting from the i-band value in Cappellari et al. (2006) for $(r - i) = 0.5$, we obtain the mass in galaxies, $M_{gal500} = (4.36 \pm 0.02)\times10^{12}\,M_\odot$, and in the ICL, $M_{icl} \sim (0.47 \pm 0.003)\times10^{12}\,M_\odot$, where the latter is a lower limit, since we conservatively integrated only up to the radius at which the ICL could be measured, that is, r ≤ 178 kpc, and did not extrapolate the ICL beyond that. This brings the total baryon fraction $f_{b500}$ to 0.19±0.03. This is higher than all groups of similar masses analyzed in Laganá et al. (2013) and more compatible with higher mass clusters, $\sim(1.5-4)\times10^{14}\,M_\odot$. The ratio of stellar mass to total mass within $r_{500}$, $f_{*500} = M_*/M_{500}$, when the ICL is included, is found to be ∼ 0.075±0.008, again very high and compatible with much more massive clusters (Laganá et al. 2013). The mass-to-light ratio within $r_{500}$ is consequently low and is found to be $(M_{500}/L_r) = 82 \pm 10\,(M/L_r)_\odot$, lower by a factor of ∼2 than the more massive REXCESS clusters, but more consistent with their "REGULAR" systems (Holland et al. 2015). If the mass-to-light ratio does not vary significantly from $r_{500}$ to $r_{200}$, the system would not stand out as a "dark" system, as suggested by the FG sample studied by Proctor et al. (2011b).

The entropy ($S_X$), calculated as $S(r) = kT(r)\,n_e(r)^{-2/3}$, is found to be 204±71 keV cm$^2$ at $0.1\,r_{200}$. Assuming the error-weighted average of all independent regions (2.59±0.04 keV) as representative for this system, the value is consistent with that found for other FGs, lower than that for non-FG groups, and closer to the extrapolation of self-similarity defined by massive clusters (Khosroshahi et al. 2007).
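As a consistency check on the baryon budget quoted above, the fractions can be recomputed directly from the component masses. This is a minimal sketch assuming the adopted total mass is the $\sim0.64\times10^{14}\,M_\odot$ within $r_{500}$ quoted in the Discussion; the variable and function names are illustrative, and uncertainties are not propagated.

```python
# Component masses quoted in the text, in solar masses.
M500 = 0.64e14   # adopted total mass within r500 (assumption: value quoted in the Discussion)
M_GAS = 7.59e12  # gas mass within r500
M_GAL = 4.36e12  # stellar mass in galaxies within r500
M_ICL = 0.47e12  # ICL mass within 178 kpc (a lower limit)


def fraction(component_mass, total_mass=M500):
    """Return the mass fraction of a component relative to the total mass."""
    return component_mass / total_mass


f_gas = fraction(M_GAS)
f_star = fraction(M_GAL + M_ICL)
f_baryon = fraction(M_GAS + M_GAL + M_ICL)

print(f"f_gas    = {f_gas:.3f}")
print(f"f_star   = {f_star:.3f}")
print(f"f_baryon = {f_baryon:.3f}")
```

Under this assumption the three fractions land close to the quoted 0.119, 0.075 and 0.19, which suggests that the quoted fractions were computed with the same adopted mass.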
In the most central region the entropy is $S(\sim0.02\,r_{200}) = 96 \pm 34$ keV cm$^2$, lower and with a shallower profile than that of the hot FG RX J1416.4+2315 (Khosroshahi et al. 2006a), but consistent with the expected entropy floor for its overall temperature (Lloyd-Davies et al. 2000) and marginally more consistent with the class of systems with entropy floor ≤ 50 keV cm$^2$ as defined by Cavagnolo et al. (2009).

DISCUSSION
The standard model of FGs, as relaxed and undisturbed systems, necessarily leads to the production of well developed cool cores. In this work we confirmed that RXJ1007 does not have a cool core. The radial intracluster gas temperature gradient is actually negative, the opposite of what one would expect in a cool core system. On the other hand, the ICM shows a steep negative radial abundance gradient typical (perhaps exclusive) of cool core clusters and groups, where the abundance may reach supersolar values in the center (Ettori et al. 2015). Furthermore, its surface brightness also hints at a departure from a single β model fit in the central 15″, again typical of cool core clusters.
Another cool core characteristic of RXJ1007 can be inferred from concentration parameters, in particular the so-called "surface brightness concentration". We denote it here as $c_{ISB}$, since it is in reality an Integrated Surface Brightness (or flux) over some "core" region divided by that over some larger region. Santos et al. (2008) proposed this parameter when analyzing 26 clusters at low and high redshifts. They found that cool-core clusters had $c_{ISB} \ge 0.075$ using (inner/outer) radii of (40 kpc/400 kpc), respectively. Using the same radii for RXJ1007, direct background-subtracted counts in these regions give $c_{ISB} = 0.1 \pm 0.03$. Even if we use the best-fit β model, which tends to underestimate the central flux, we obtain $c_{ISB} \ge 0.078 \pm 0.013$, still consistent with the limit for cool core systems. In any case, this suggests that the core gas concentration was not as disrupted by the last merging event as the core temperatures.
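The β-model value of the concentration can be illustrated analytically. For a single β model the flux inside a projected radius $R$ is proportional to $\{[1+(R/r_c)^2]^{1.5-3\beta} - 1\}/(1.5-3\beta)$, so the ratio of fluxes within 40 kpc and 400 kpc depends only on $r_c$ and $\beta$. The sketch below assumes the second fit quoted in the surface brightness section ($r_c = 53.3$ kpc, $\beta = 0.44$); the function names are hypothetical and no PSF or background effects are modelled.

```python
import math


def beta_model_flux(radius, rc, beta):
    """Flux of a single beta model inside a projected radius.

    The result is in units of pi * S0 * rc**2, which cancel in the ratio.
    """
    p = 1.5 - 3.0 * beta
    u = 1.0 + (radius / rc) ** 2
    if abs(p) < 1e-12:
        # beta = 0.5 is the logarithmic limit of the integral.
        return math.log(u)
    return (u ** p - 1.0) / p


def concentration(rc, beta, r_in=40.0, r_out=400.0):
    """Integrated surface brightness concentration, flux(<r_in) / flux(<r_out)."""
    return beta_model_flux(r_in, rc, beta) / beta_model_flux(r_out, rc, beta)


# Radii and core radius in kpc.
c_isb = concentration(rc=53.3, beta=0.44)
print(f"c_ISB (beta model) = {c_isb:.3f}")
```

With these inputs the ratio comes out near 0.078, in line with the β-model value quoted above and just above the cool-core threshold of 0.075.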
The strongest indication of an old age for RXJ1007 comes from the ICLf analysis (Figure 5), which shows a color distribution that is not consistent with merging (dynamically active) clusters but is compatible with that observed in "relaxed clusters". It shows indications of being even older than the relaxed clusters analyzed so far, based on the high ICLf found in the F606W filter. As mentioned in section 1, the specific ICLf, or ICLf/Mass ratio, hereafter denoted by $\mathrm{ICLf}_M$, may be an indicator of the relative age of the system. With a mass of $\sim 0.64\times10^{14}\,M_\odot$ within $r_{500}$ we measure significantly higher $\mathrm{ICLf}_{M500}$ ratios in comparison to all clusters, both relaxed and merging, especially in the redder filters. We found for RXJ1007 $\mathrm{ICLf}^{F435W}_{M500} = (114\pm56)\,(10^{15}\,M_\odot)^{-1}$ and $\mathrm{ICLf}^{F606W}_{M500} = (193\pm24)\,(10^{15}\,M_\odot)^{-1}$ for the F435W and F606W filters, respectively, as can be seen in Figure 9, where we plot the rest-frame $\mathrm{ICLf}_{M500}$ for all systems shown in Figure 5. It can be seen in that figure that the $\mathrm{ICLf}_{M500}$ of RXJ1007 stands out from all clusters, especially at its reddest measured band at λ ∼ 5400 Å, where it is more than five times higher than that of perhaps the most relaxed cluster in the ICLHST sample and the closest in redshift and mass, A383 (z=0.187), which in turn has an $\mathrm{ICLf}_{M500}$ more than twice as high as those measured for the intermediate-z merging clusters MACS J0717.5+3745 (z=0.548), MACS J1149.5+2223 (z=0.544) and WHL J013719.8-08284 (z=0.566) at that band.
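The specific ICLf values can be recovered from the measured ICL fractions and the adopted mass. The sketch below assumes that the specific ICLf is simply the ICLf in percent divided by $M_{500}$ in units of $10^{15}\,M_\odot$; this reading of the units, and the function name, are assumptions made for illustration, and uncertainties are not propagated.

```python
M500 = 0.64e14  # adopted mass within r500 in solar masses, as quoted in the text


def specific_iclf(iclf_percent, m500_msun):
    """ICL fraction (percent) per 1e15 solar masses of total mass within r500."""
    return iclf_percent / (m500_msun / 1e15)


# Measured ICL fractions (percent) in the two HST bands.
iclf_m_f435w = specific_iclf(7.24, M500)
iclf_m_f606w = specific_iclf(12.39, M500)

print(f"specific ICLf (F435W) = {iclf_m_f435w:.1f} per 1e15 M_sun")
print(f"specific ICLf (F606W) = {iclf_m_f606w:.1f} per 1e15 M_sun")
```

Under this assumption the two bands give roughly 113 and 194, consistent with the quoted central values of 114 and 193.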
The unusually enhanced $\mathrm{ICLf}_M$ at the longer wavelengths suggests that the FG has been injecting ICL in the steady regime, without increasing its mass, for a long period of time. This is what would be expected if the system reached the end of its merging tree and has been dynamically undisturbed since then. We can estimate the maximum time since the last merger from cooling time constraints. The central cooling time can be estimated from Equations 7 (Voigt & Fabian 2004). For that estimation, we used a smaller (r = 15″) central region, chosen as the minimum region where we could still have enough counts to determine $T_X$ with reasonable precision, given the relatively large XMM PSF. The results are also shown in Table 3 and plotted in Figure 6. The temperature there is found to be $T_X \sim 3$ keV. In that region, we derive a central density of $(6.7 \pm 0.9)\times10^{-3}$ cm$^{-3}$, corresponding to $t_{cool} = (4.8 \pm 0.66)$ Gyr, if we use the average value of Equations 7 for temperatures over and under 3 keV. This value also satisfies one of the criteria for a cool core system (e.g. Hudson et al. 2010; Bîrzan et al. 2004).

… Gyr for $T_X \le 3$ keV (7)

The lack of a cool core in that very central region also sets the cooling time as the upper limit for the time since the last merger event in this system. It should be noted that this value is more than twice the time needed for the late-A/early-F stars injected into the ICL during a merger to evolve away and erase the BIE feature, given the average half-life of ∼ 2 Gyr for a main-sequence F2 star. In principle, the BIE lifespan could give a rough time scale corresponding to the lower limit for the time since the last merging event. One caveat with this lower limit is that we are assuming that there would be a BIE resulting from fast stripping of the "younger" stars from the outskirts of galaxies, particularly late-types, in any merger. The real situation is likely to be much more complex, perhaps involving full tidal disruption of galaxies.
We did include one "fossil cluster", the only one where a similar ICL analysis has been performed, using Gemini data in i and r bands, RX J105453.3+552102 at z∼ 0.47, in Figures 5 and 9 (Yoo et al. 2021). It has a dynamically estimated mass of M 200 ∼ 8 × 10 14 M and it presents a somewhat high ICLf at the BIE peak, but has a flat profile within the (large) errors. This system is, however, unusual, showing significant departure from Gaussianity of group member velocities (Aguerri et al. 2011) and a large mismatch of the X-ray centroid and the BCG, which would be consistent with a recent merger.
The $\mathrm{ICLf}_M$ of RXJ1007 peaks roughly (given the F606W width) at ∼ 5400 Å (4300 Å-6500 Å), encompassing the peak emission of G-type main-sequence stars. Once the late type stars in the outskirts of the member galaxies are removed, it is plausible that tidal stripping would start removing earlier type populations closer to the central regions of the galaxies which, unlike the stars responsible for the BIE, would be redder and have significantly longer lifespans, assuming that processes of galaxy rejuvenation (e.g., Fang et al. 2013; Mancini et al. 2019; Dimauro 2022) are not significant.
This $\mathrm{ICLf}_M$ enhancement allows us to have a rough idea of the lower limit for the ICL injection rate during the steady regime of ICL generation. To be conservative, we compare here the $\mathrm{ICLf}_{M500}$ of RXJ1007 to that of A383 at the $\mathrm{ICLf}_M$ peak wavelength, since it has the closest $\mathrm{ICLf}_M$ value to that of RXJ1007 at that wavelength. If we denote this ratio by $\eta$ and assume that $\eta \sim 1$ at the time of the last merger event in RXJ1007 ($t_{merge}$) and that the ICL continued to be injected in a passive steady regime up to its current redshift, without increasing the system's mass, we can derive its ICL injection rate, where the subindex obs indicates the observation epoch, $t_{merge} + \Delta t$ yr, and $L$ stands for luminosity. We also assume that the ICL injection and the galaxy luminosity loss happen in the same proportion, i.e., $\Delta L_{ICL} = -\Delta L_{gal}$, which ignores significant star formation during that time. However, the discrepancy with this assumption can be alleviated keeping in mind the outside-in quenching in cluster galaxies, backed up by observations (e.g., Johnston et al. 2014) and numerical simulations (Pfeffer et al. 2022). Using the observed (Fig. 9) $\eta_0$ of 5.8±2.4 and the same (M/L) ratio as in the previous section, we obtain a $(dM_{ICL}/dt)_{r\le178\,\mathrm{kpc}}$ of (98±43) $M_\odot$ yr$^{-1}$, for $t_{merge} = t_{cool}$. Again, conservatively, we carry out this estimation only up to ≤ 178 kpc, the radius out to which the ICL could be measured confidently. Tollet et al. (2017) predicted the loss of stellar mass of satellite galaxies entering groups and clusters due to tidal stripping, using cosmological N-body simulations and the abundance matching technique (Conroy et al. 2007), for two different modes: shutdown and starvation. In the former, the gas available to form stars is fully removed upon entry into the host halo. In the latter, star formation continues after entry. In the case of massive systems ($M_{500} \ge 10^{14}\,M_\odot$) from Gonzalez et al. (2013), their work favored the starvation plus tidal stripping mode.
In RXJ1007 the ratio of the BCG+ICL masses to the total stellar mass, $(M_{BCG}+M_{ICL})/M_*$, within $r_{500}$ is found to be 0.53±0.003, which significantly disagrees with the prediction of the shutdown model and, instead, is in very good agreement with the starvation + tidal stripping model found for the more massive systems (Figure 14 of Tollet et al. 2017). Specific simulations of these combined processes, i.e., changes in radial age gradients in cluster galaxies and ICL injection, will be very helpful to assess the differences in the ICL flux distribution in FGs.
The very high central abundance enhancement of the intracluster gas in RXJ1007 is also puzzling. It would likely require a pre-existing strong abundance gradient in at least one of the pre-merging systems, a situation similar to the case of Abell 1142 (Su et al. 2016), where the merging BCGs would subsequently settle in the high-metal core. Even though metal transport through BCG sloshing has been observed (e.g., Dupke et al. 2007; Simionescu et al. 2010), it is not physically implausible that central abundance gradients could survive mergers, even if the cool cores were destroyed, given the different nature of the mechanisms of thermalization and chemical mixing and the inefficiency with which post-merger sloshing mixes the gas (Ghizzardi et al. 2014).
Numerical simulations of cluster mergers can provide a few helpful hints with respect to the nature of the merger. To keep the central temperature high for such a long time and avoid mixing of the metal abundance, small impact parameter mergers seem more likely. For example, the work of Ricker & Sarazin (2001) shows a core reheating prior to the core merging, about 4 Gyr after the first core-crossing, for an impact parameter $b \lesssim 2\,r_s$, where $r_s \sim 140$ kpc is the dark matter scaling radius.
This time scale is also consistent with the merging of the BCGs. Kitzbichler & White (2008) use a semi-analytic model to estimate the merging rate and timescale of galaxy pairs, based on the Millennium N-body simulation. They find that the merging timescale is $T \sim 3.2\,r_p\,M_*^{-0.3}$ Gyr for pairs in which the line-of-sight velocity difference between the two galaxies is less than 3000 km s$^{-1}$, where $r_p$ is the maximum projected separation of the two galaxies in units of 50 $h^{-1}_{70}$ kpc and $M_*$ is the stellar mass of the galaxies in units of $4\times10^{10}\,h^{-1}_{70}\,M_\odot$. Using the luminosity of the RXJ1007 BCG and an $(M/L_i)$ ratio in the i-band of 3.8 (Cappellari et al. 2006) we get a BCG merging time of ∼ 3.4 Gyr for $r_p \le 2\,r_s$, assuming that each pre-merging BCG would share the mass of the post-merging BCG equally.
Independent of the selection uncertainties of the magnitude gap criterion, there is some consensus that systems that assembled a significant fraction of their final mass earlier ("old") will tend to develop large magnitude gaps. It has frequently been argued that the gap is unlikely (10%) to survive for longer periods of time (> 4 Gyr) (e.g., von Benda-Beckmann et al. 2008; Dariush et al. 2010), even though this fraction can be substantially higher for systems over $10^{14}\,M_\odot$ (Gozaliasl et al. 2014a). This work shows that RXJ1007 has survived as an FG for ∼ 5 Gyr, and it will survive significantly longer given its wide $\Delta m_{1,2} \sim 2.5$ (Figure 2). So, the question of how an FG would be able to maintain itself as an FG (under the magnitude gap $\Delta m_{1,2}$ criterion) over its evolutionary history is pertinent. Given the typical presence of several bright ($\Delta m_{1,2} \le 2$) galaxies within $0.5\,r_{200}$ in galaxy clusters, it is easy to imagine that the merger of an FG with a galaxy cluster would result in a cluster and not in an FG. One could in general expect that FG+Cluster=Cluster and Cluster+Cluster=Cluster.
One interesting possibility for the long-term survival of FGs is through FG-FG mergers, something shown in detail only once so far, for the Cheshire Cat (Irwin et al. 2015). The redshift difference between the two BCGs, which represent the "eyes" of the Cat, corresponds to 1350 km s$^{-1}$ in the group's rest-frame. In a combined Chandra-HST-Gemini analysis of the system, Irwin et al. (2015) showed compelling evidence that the system is actually the merging of two separate galaxy groups, both qualifying (by the $\Delta m_{1,2}$ criterion) as FGs, which are on a collision course and will merge in ∼0.9 Gyr, becoming again a larger FG. Analogously to RXJ1007, that system also has a negative temperature gradient, going from ∼ 5.4 keV in the central 115 kpc to ∼3 keV in the outer parts. The high X-ray temperatures measured near the center of the Cheshire Cat are the result of shock heating from the merger of two FGs.
The existence of the Cheshire Cat highlights the potential importance of fossil group progenitors to explain the permanence of FGs over time, even in the presence of merging. A search for these so-called fossil progenitors in the CASSOWARY catalog of strong gravitational arcs found that ∼13% of the lensing groups were FGs and that ∼23% of the lensing systems were fossil progenitors, 6% higher than in the control sample. The CASSOWARY systems are a good place to look for them because strong gravitational lensing preferentially selects systems with a high mass concentration, such as fossil systems (Johnson et al. 2018a).
In fact, many CASSOWARY fossil progenitors show highly asymmetric BCGs along with higher X-ray luminosities and ICM temperatures than group scaling relations predict (Johnson et al. 2018b), implying a deeper gravitational potential than expected from group richness. Fossil progenitors appear to be in mid-assembly of their dominant BCG (akin to the Cheshire Cat), making them excellent candidates for follow-up observations of their ICLf to check for the presence of a BIE and to measure their $\mathrm{ICLf}_M$. This would allow us to see the evolution of these properties prior to a "final" merger and to compare them to those of FGs.
Near-future spectrophotometric surveys with high-precision (0.3%) photo-z, such as J-PAS, which has 54 contiguous narrow-band filters (Benitez et al. 2014; Bonoli et al. 2021), will allow us to measure the ICL SED for a very large number of systems up to relatively high redshifts, $z \lesssim 0.6$-0.8, which will provide invaluable information on the age and dynamical state of FGs.

SUMMARY
We have analyzed X-ray, optical (HST) and spectroscopic (Gemini) data of a classic FG, RX J100742.53+380046.6, which presents many of the contradictory physical properties often found in these systems. We performed a combined multi-filter ICLf analysis of a bona fide FG to test its dynamical state and age, and compared it to other systems, merging and relaxed, where similar analyses have been carried out. We found that:
• the absence of a cool core in RXJ1007 was corroborated. We instead found a hot core, where the intragroup gas temperature rises to ∼ 3 keV in the central 30 kpc, and drops to ∼ 1.8 keV for r ≥ 300 kpc. The central metal abundance is very high, reaching supersolar values (1.3 solar) in the central 30 kpc and dropping very steeply at 75 kpc, still within the hot core.
• the gas fraction and baryon fraction of RXJ1007 are found to be $f_{gas500} = 0.12\pm0.04$ and $f_{b500} = 0.19\pm0.03$. The latter value is quite high and, in general, more consistent with higher mass systems, with $M_{500} \sim (1.5-4)\times10^{14}\,M_\odot$.
• the stellar mass fraction, including the ICL, is found to be ∼ 0.075±0.008. It is compatible with much more massive clusters. The mass-to-light ratio is $(M_{500}/L_r) = 82 \pm 10\,(M/L_r)_\odot$, lower by a factor of ∼2 than that of the more massive REXCESS clusters.
• the gas entropy at $0.1\,r_{200}$ is found to be 204±71 keV cm$^2$, consistent with that found for other FGs, lower than that of non-FG groups and closer to the extrapolation of self-similarity defined by massive clusters. In the most central region the entropy is ∼ 96±34 keV cm$^2$, consistent with the expected entropy floor for its overall temperature.
• the ICL fraction wavelength distribution analysis of the FG shows the absence of the blue ICL excess (BIE) in disagreement with what was found for merging (dynamically active) clusters and closer to that of relaxed clusters. In fact, when comparing the ICLf distribution of RXJ1007 to other relaxed clusters analyzed previously, one can see that the unique combination of a reduction in "blue" ICLf with the enhancement in "green/red" ICLf suggests that the system has been relaxed for a very long time.
• the very old age of RXJ1007 is particularly visible when looking at the distribution of the specific ICLf, or ICLf/Mass ratio. We find for RXJ1007 $\mathrm{ICLf}^{F435W}_{M500} = (114\pm56)\,(10^{15}\,M_\odot)^{-1}$ and $\mathrm{ICLf}^{F606W}_{M500} = (193\pm24)\,(10^{15}\,M_\odot)^{-1}$ for the F435W and F606W filters. This is significantly higher than for all clusters measured so far, both relaxed and merging, especially in the redder filters. This is what would be expected from a system that has not had any merger for a very long time and is producing ICL only in the steady regime.
• if we assume that the specific ICLf of RXJ1007 was the same as that of the other clusters prior to the last merger, an ICL injection equivalent to 98±43 $M_\odot$ yr$^{-1}$ within the central 178 kpc would be needed to raise the specific ICLf to its current values.
• the cooling time of the system estimated from the central (r ≤ 30 kpc) electron density is 4.8±0.66 Gyr. This value can be considered an upper limit for the time since the last merging event, at z ∼ 0.45, the lower limit of which can be roughly estimated from the complete absence of a BIE in the ICLf as ∼ 2 Gyr, or about the half-life of an F0-F4 star.
• despite the absence of a cool core, RXJ1007 has characteristics of cool core systems, including a high X-ray flux concentration and, in particular, a steep radial abundance gradient, reaching supersolar values in the central 30 kpc and dropping outwards to 0.2 solar at r ≥ 150 kpc. This indicates that at least one of the pre-merging systems that formed RXJ1007 had a central abundance enhancement, possibly both.
• the ratio of the BCG+ICL masses to the total stellar mass within r 500 is found to be 0.53±0.003, which significantly disagrees with the prediction for the shutdown model for galaxy evolution in clusters. Instead, the results strongly favor the starvation + tidal stripping model found for the more massive systems.
Overall, this work puts forward a potential age indicator for galaxy systems based on the ICL fraction, which, in this case, provides further evidence for an older age for FGs, even for those without cool cores. Assuming that the FG reached the "end" of its merging tree and injected ICL only through internal dynamical friction and tidal stripping, without increasing its mass through mergers, one would expect its specific ICLf to be higher than that of clusters of similar mass that are still undergoing a history of mergers, and this is consistent with what we observe. Furthermore, the fact that the magnitude gap of this system has lasted (and will last) significantly longer than simulations predict justifies the search for alternative mechanisms for magnitude gap longevity. Considering that mergers between FGs seem more likely to form an FG than cluster-cluster or FG-cluster mergers, it is possible that regions with an enhanced preference for FG formation early on would increase the chances of FG survival to z ∼ 0. Further multi-wavelength analysis of bona fide FGs and FG progenitors will shed light on the evolution of these systems.

ACKNOWLEDGEMENTS

R.A.D. acknowledges partial support from NASA grants 80NSSC20P0540 and 80NSSC20P0597 and the CNPq grant 308105/2018-4. R.A.D. also thanks Dr. Marc Kessler for very insightful discussions, and Drs. Francois Mernier and Zack Li for helpful suggestions. Y. J-T. has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 898633. Y. J-T. also acknowledges financial support from the State Agency for Research of the Spanish MCIU through the "Center of Excellence Severo Ochoa" award to the Instituto de Astrofísica de Andalucía (SEV-2017-0709). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior -Brasil (CAPES) -Finance Code 001.
Based on observations obtained at the international Gemini Observatory, a program of NSF's NOIRLab, which is managed by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the National Science Foundation, on behalf of the Gemini Observatory partnership: the National Science Foundation (United States), National Research Council (Canada), Agencia Nacional de Investigación y Desarrollo (Chile), Ministerio de Ciencia, Tecnología e Innovación (Argentina), Ministério da Ciência, Tecnologia, Inovações e Comunicações (Brazil), and Korea Astronomy and Space Science Institute (Republic of Korea). This work was enabled by observations made from the Gemini North telescope, located within the Maunakea Science Reserve and adjacent to the summit of Maunakea. We are grateful for the privilege of observing the Universe from a place that is unique in both its astronomical quality and its cultural significance.

APPENDIX

A. EPIC BACKGROUND MODELING WITH EXTERNAL POINTING CONSTRAINTS

Figure 10. Temperature (in keV) and metal abundance (in solar photospheric units) radial profiles for RXJ1007 using a differential background modelling and an offset pointing. For comparison, we also show the best-fit values from Figure 5 and, in green, the values derived from a previous short Chandra observation (Miller et al. 2012).
As a check on the background treatment detailed in Section 2.3.1, we applied an alternative protocol. Instead of subtracting the FWC events from each observed spectrum, we modeled them simultaneously with the science observations (Mernier et al. 2015; Su et al. 2017, 2019). We include an additional offset pointing (PI: Koss, obsID 0821730401) found within 60′ of the RXJ1007 center. The EPIC observations were processed and filtered with the same tools described in Section 2.3. Two types of background components are considered: the astrophysical X-ray background (AXB) and the non-X-ray background (NXB). The AXB model describes the emission of astrophysical sources, including the Local Bubble ($\mathrm{apec}_{\rm LB}$), the hot Galactic Halo ($\mathrm{apec}_{\rm MW}$), and the Cosmic X-ray background ($\mathrm{powerlaw}_{\rm CXB}$), as detailed in Section 2.3.1. Abundance and redshift were fixed at 1 and 0, respectively, for both the $\mathrm{apec}_{\rm LB}$ and $\mathrm{apec}_{\rm MW}$ models. We determine the AXB parameters by simultaneously fitting the offset pointing spectra from a circle of radius 12′, the corresponding FWC spectra, and a RASS spectrum from a 0.3°-0.9° ($\approx 2R_{200}-4R_{200}$) annulus, obtained using the X-Ray Background Tool. The best-fit parameters for these components are presented in Table 4.
The non-X-ray background (NXB) component describes the high energy particles that hit the CCD detectors during the observation. In the NXB model, each MOS and PN detector is represented by a set of fluorescent instrumental lines and a continuum spectrum. Each instrumental line is modeled with a Gaussian whose line width is limited to ≤ 0.3 keV; the set of fluorescent lines considered is listed in Table 2 of Su et al. (2017). We also include the SWCX lines mentioned in Section 2.3.1. Besides these lines, we model the continuum particle background with a broken power law, where the energy break is fixed at 3 keV. For each detector and observation, quiescent particle background data were generated using the evqpb task. The event lists were filtered and cleaned with the same good time intervals, PATTERN, and FLAG as the observed data. To determine the NXB parameters, we first simultaneously fit the spectra of the unexposed corner data of both RXJ1007 and its FWC (from the evqpb task) for each EPIC observation to determine the ratios of their broken power laws, which are then fixed for each region of interest. A quiescent continuum of soft protons may persist even after filtering solar flare events. For each detector and observation, we compare the count rate in the 6-12 keV energy range from an inner region (removing the central 10′) to that from the unexposed corner. To characterize the residual soft proton contamination, we add a power law if that ratio is above 1.15 (De Luca & Molendi 2004). The photon index of this power law can vary between 0.1 and 1.4 (Snowden & Kuntz 2014).
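The soft proton criterion described above reduces to a simple threshold test per detector and observation. The sketch below illustrates that decision only. The function name and the example count rates are hypothetical, and the rates are assumed to be already normalized by detector area so that the inner and corner regions are directly comparable.

```python
def needs_soft_proton_powerlaw(inner_rate, corner_rate, threshold=1.15):
    """Return (ratio, flag) for the residual soft proton check.

    inner_rate  : 6-12 keV count rate of the inner (exposed) region
    corner_rate : 6-12 keV count rate of the unexposed corner
    threshold   : ratio above which an extra power law is added
                  (1.15, following the criterion quoted in the text)

    Assumption of this sketch: both rates are area-normalized.
    """
    if corner_rate <= 0:
        raise ValueError("corner_rate must be positive")
    ratio = inner_rate / corner_rate
    return ratio, ratio > threshold


if __name__ == "__main__":
    # Hypothetical rates (counts/s per unit area), for illustration only.
    example_rates = {
        "MOS1": (0.21, 0.20),
        "MOS2": (0.26, 0.20),
        "PN": (0.90, 0.60),
    }
    for detector, (inner, corner) in example_rates.items():
        ratio, flag = needs_soft_proton_powerlaw(inner, corner)
        action = "add power law" if flag else "no extra component"
        print(f"{detector}: ratio = {ratio:.2f} -> {action}")
```

When the flag is set, the added power law would have its photon index restricted to the 0.1-1.4 range given above.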
The spectral fitting was performed in the 0.5-10.0 keV and 0.7-10.0 keV bands for the MOS and PN detectors, respectively. For each region of interest, we jointly fit the RXJ1007 and FWC spectra for each observation. We consider the following set of models: $\mathrm{wabs} \times (\mathrm{apec}_{\rm RXJ1007} + \mathrm{powerlaw}_{\rm CXB} + \mathrm{apec}_{\rm MW}) + \mathrm{apec}_{\rm LB}$. All FWC data were set to zero for this model. The AXB parameters were fixed at the best-fit values listed in Table 4. The Galactic extinction is modeled through the wabs model, whose column density was fixed to the reported average nH value of $1.36\times10^{20}$ cm$^{-2}$ (HI4PI Collaboration et al. 2016). The emission of the RXJ1007 hot gas, $\mathrm{apec}_{\rm RXJ1007}$, is modeled for each region of interest with redshift fixed at 0.112; the plasma temperature, abundance, and emission intensity (normalization) were free to vary. The NXB model contains the NXB components, including the broken power law, the set of fluorescent instrumental lines, and, where required, the power law characterizing the quiescent soft proton continuum. We link the NXB parameters of the observed spectra and their FWC data for each region of interest. We performed all spectral fittings using XSPEC v12.12 and $\chi^2$ statistics. In Figure 10, we report the error-weighted average abundance and temperature, with 1σ confidence levels, from a joint fit of the MOS1, MOS2, and PN detectors for each observation.

B. CATALOG OF GALAXIES WITH CONFIRMED SPECTROSCOPIC REDSHIFTS

Table 6 lists the member galaxies of RX J100742.53+380046.6 inside a radius of ∼ 12′ (∼ 1.4 Mpc) obtained with GMOS and with SDSS DR15. Table 7 lists the galaxies observed with GMOS that are located in the foreground and background of RX J100742.53+380046.6.

Table 6. Catalog of member galaxies with confirmed spectroscopic redshifts