Stellar Chromospheric Activity Database of Solar-like Stars Based on the LAMOST Low-Resolution Spectroscopic Survey

$\require{mediawiki-texvc}$A stellar chromospheric activity database of solar-like stars is constructed based on the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) Low-Resolution Spectroscopic Survey (LRS). The database contains spectral bandpass fluxes and indexes of Ca II H&K lines derived from 1,330,654 high-quality LRS spectra of solar-like stars. We measure the mean fluxes at line cores of the Ca II H&K lines using a 1 ${\AA}$ rectangular bandpass as well as a 1.09 ${\AA}$ full width at half maximum (FWHM) triangular bandpass, and the mean fluxes of two 20 ${\AA}$ pseudo-continuum bands on the two sides of the lines. Three activity indexes, $S_{\rm rec}$ based on the 1 ${\AA}$ rectangular bandpass, and $S_{\rm tri}$ and $S_L$ based on the 1.09 ${\AA}$ FWHM triangular bandpass, are evaluated from the measured fluxes to quantitatively indicate the chromospheric activity level. The uncertainties of all the obtained parameters are estimated. We also produce spectrum diagrams of Ca II H&K lines for all the spectra in the database. The entity of the database is composed of a catalog of spectral sample and activity parameters, and a library of spectrum diagrams. Statistics reveal that the solar-like stars with high level of chromospheric activity ($S_{\rm rec}>0.6$) tend to appear in the parameter range of $T_{\rm eff}\text{ (effective temperature)}<5500\,{\rm K}$, $4.3<\log\,g\text{ (surface gravity)}<4.6$, and $-0.2<[{\rm Fe/H}]\text{ (metallicity)}<0.3$. This database with more than one million high-quality LAMOST LRS spectra of Ca II H&K lines and basal chromospheric activity parameters can be further used for investigating activity characteristics of solar-like stars and solar-stellar connection.


INTRODUCTION
With detailed observations of solar activity for several centuries, many features and phenomena of the Sun, such as sunspots, plages, flares, etc., have been discovered and thoroughly studied. These features and phenomena are the manifestations of magnetic field activity on the Sun (Hale 1908). Observations for solar-like stars (Cayrel de Strobel 1996) revealed that magnetic activity is also common on other stars, and the connection between stellar activity and solar activity (i.e., solar-stellar connection) has become a topic of wide interest (Noyes 1996). Choudhuri (2017) collected various stellar activity data and explored the extrapolation of solar dynamo models for explaining magnetic activity of solar-like stars. The knowledge of activity of solar-like stars in turn is very helpful for understanding the activity status of the Sun (Güdel 2007). According to the classification by Gomes da Silva et al. (2021), the Sun is located in the high-variability region of the inactive main sequence star zone. Reinhold et al. (2020) illustrated that the Sun is less active compared with other solar-like stars.
Stellar activity is closely related to the rotation period (e.g., Noyes et al. 1984b; Wright & Drake 2016; Zhang et al. 2020a) and age (e.g., Mamajek & Hillenbrand 2008; Lorenzo-Oliveira et al. 2018; Zhang et al. 2019) of stars. In general, stellar activity level will decrease with increase in stellar rotation period or age. On the other hand, Maehara et al. (2012) found that the maximum energy of stellar flares is not correlated with stellar rotation period. Investigation on the relation between stellar activity cycle and rotation by Reinhold et al. (2017) reveals that the activity cycle period slightly increases for longer rotation period.
As an important aspect of stellar activity, the chromospheric activity of solar-like stars has always been a popular research subject (Hall 2008). A detailed review of stellar chromosphere modelling and spectroscopic diagnostics has been given by Linsky (2017). Stellar chromospheric activity of solar-like stars can be indicated by line core emissions of the Ca II H&K lines in violet band of the visible spectrum (e.g., Baliunas et al. 1995; Hall et al. 2007), the Hα line in red band (e.g., Delfosse et al. 1998; Newton et al. 2017), and the Ca II infrared triplet (Ca II IRT) lines (e.g., Soderblom et al. 1993; Notsu et al. 2015), etc. The emission of Ca II H&K lines of the Sun has long been known to have a strong correlation with the solar chromospheric activity (see a comprehensive review by Linsky & Avrett 1970). With the discovery of the emissions of Ca II H&K lines from other stars (e.g., Eberhard & Schwarzschild 1913), people began to explore whether it comes from the same mechanism as the solar activity and whether it has a long-term cyclic variation as the solar cycle. Wilson (1963) at the Mount Wilson Observatory (MWO) investigated the relationship between intensity of stellar Ca II H&K emission and stellar physical nature, and concluded that the chromospheric activity of main sequence stars decreases with age. Wilson (1978) found the long-term cyclic variations of stellar Ca II H&K fluxes similar to the solar cycle. Baliunas et al. (1995) presented continuous Ca II H&K emission records of stellar chromospheric activity for 111 stars, which came from a long-term observing program at MWO. At that time, the MWO S index was first introduced as a quantitative chromospheric activity indicator based on the Ca II H&K lines (Wilson 1968; Vaughan et al. 1978; Duncan et al. 1991; Baliunas et al. 1995), defined as the ratio between the flux of Ca II H&K line cores and the flux of the reference bands on the violet and red sides of the lines multiplied by a scaling factor for calibration between different instruments. Because of the high correlation between Ca II H&K emission and stellar magnetic activity (e.g., Saar & Schrijver 1987), researchers prefer to characterize stellar magnetic activity through the indicators derived from Ca II H&K lines, which often involve the S index.
The S index has been established as a fundamental parameter of stellar chromospheric activity. Based on the S index, by subtracting the photospheric contribution to Ca II H&K flux, the true chromospheric emission of Ca II H&K lines can be extracted as an index $R'_{\rm HK}$ (Linsky et al. 1979; Noyes et al. 1984a). Considering the existence of the basal (lower-limit) flux of chromosphere (Schrijver 1987), which is thought to be unrelated to magnetic activity, Mittag et al. (2013) proposed a new index $R^{+}_{\rm HK}$ to reflect pure stellar chromospheric activity.
The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST, also named Guoshoujing Telescope) is a telescope dedicated to spectroscopic sky survey. There are 4000 fibers within a diameter of 1.75 meters (corresponding to 5° in the sky) at the focal surface (Cui et al. 2012). The Low-Resolution Spectroscopic Survey (LRS) of LAMOST began in October 2011, with a spectral resolving power ($R = \lambda/\Delta\lambda$) of about 1800 and a wavelength coverage of 3700-9100 Å (Zhao et al. 2012). The first year of observation was for the pilot survey (Luo et al. 2012; Zhao et al. 2012), and the regular survey began in September 2012. LAMOST has also conducted a regular Medium-Resolution Spectroscopic Survey (MRS; $R \sim 7500$) since 2018 for wavelength bands of 4950-5350 Å and 6300-6800 Å (Liu et al. 2020). Upon completing seven years of sky survey (one year of pilot survey and six years of regular survey) in June 2018, LAMOST became the first spectral sky survey project in the world to accumulate more than 10 million spectra.
The majority of the data released by LAMOST are LRS spectra. The huge data set is very beneficial for big-data analyses of stellar properties. There are millions of LRS spectra of solar-like stars that can be used for studying stellar chromospheric activity and solar-stellar connection. Figure 1 gives an example of LRS spectra of solar-like stars observed by LAMOST. As illustrated in Figure 1, the LRS spectrum contains several optical spectroscopic features that can indicate stellar chromospheric activity, in which the Ca II H&K lines (highlighted in red) are the most commonly employed spectral lines. The other lines (Hα and Ca II IRT) can also be used (e.g., Frasca et al. 2016), but those lines are relatively narrow and hence are not well-resolved in LRS spectra compared with Ca II H&K lines.
The massive amount of LAMOST LRS spectral data provides a great opportunity for investigating overall chromospheric activity properties of solar-like stars based on the Ca II H&K lines. Karoff et al. (2016) selected 5,648 solar-like stars (including 48 superflare stars) from the LAMOST LRS catalog and studied the relation between stellar chromospheric activity level (indicated by the Ca II H&K S index) and occurrence of superflares. Zhang et al. (2020a) calculated the chromospheric S index and $R^{+}_{\rm HK}$ index of 59,816 stars from the LRS spectra of the LAMOST-Kepler observation project (De Cat et al. 2015; Zong et al. 2018; Fu et al. 2020) and examined the chromospheric activities of F-, G-, and K-type stars and the dependence of the activities on stellar rotation. Zhao et al. (2015) presented measurements of chromospheric S index for 119,995 F, G and K stars by using the LRS spectra from the first data release of LAMOST. Tu et al. (2021) found 7,454 solar-like stars with both light-curve observation by the Transiting Exoplanet Survey Satellite (TESS; Ricker et al. 2015) and LRS spectral observation by LAMOST and investigated the relations between the stellar chromospheric activity (measured by S index), photometric variability, and flare activity of the stellar objects.
The previous works discussed above used subsets of the released LAMOST spectra. We believe that exploiting the full data set of LAMOST will further facilitate the relevant research (He et al. 2021). By utilizing the large volume of LRS spectra in LAMOST Data Release 7, we constructed a stellar chromospheric activity database of solar-like stars based on the Ca II H&K lines, which is elaborated in this paper. We introduce the LAMOST Data Release 7 in Section 2 and explain the criteria for selecting high-quality LRS spectra of solar-like stars in Section 3. The data processing workflow for the selected LRS spectra and the derivation of stellar chromospheric activity measures and indexes are described in detail in Section 4. The components of the stellar chromospheric activity database of solar-like stars are elucidated in Section 5. We discuss the results obtained from the database and perform a statistical analysis of the stellar chromospheric activity indexes in Section 6. In Section 7, we summarize this work and discuss prospects for further research based on the database.

DATA RELEASE OF LAMOST
The annual observation of the LAMOST sky survey begins in September of each year and ends in June of the next year. The summer season (from July to August) is for instrument maintenance. The data acquired by LAMOST are also released in yearly increments. Each data release (DR) consists of the data files of one-dimensional spectra (in FITS format) and the catalog files (in both FITS and CSV formats) of spectroscopic parameters (Luo et al. 2015). A new DR contains all the data collected in the corresponding observing year as well as in the previous years.
The LAMOST Data Release 7 (DR7) was opened to the public in September 2021 and contains the spectral data observed from October 2011 to June 2019. In this work, we utilize the LRS spectra in LAMOST DR7 v2.0 to construct the stellar chromospheric activity database of solar-like stars. As demonstrated in Figure 1, the flux of LRS spectra is calibrated, and the vacuum wavelength is adopted in LAMOST data. Most telluric lines in the red band of LRS spectra have been removed (Luo et al. 2015).
Basic information of the LRS spectra in LAMOST DR7 v2.0 is shown in Table 1. The spectra of solar-like stars investigated in this work are taken from the LAMOST LRS Stellar Parameter Catalog of A, F, G and K Stars (hereafter referred to as LAMOST LRS AFGK Catalog, for short), which consists of 48 spectroscopic parameters, such as observation identifier (obsid), sky coordinates, signal-to-noise ratio (SNR), magnitude, and so on. Several important stellar parameters, including effective temperature ($T_{\rm eff}$), surface gravity ($\log\,g$), metallicity ([Fe/H]), and radial velocity ($V_r$), are also provided by the catalog. The four stellar parameters are obtained by the LAMOST Stellar Parameter Pipeline (LASP), in which all the parameters are determined simultaneously by minimizing the squared difference between the observed spectra and the model spectra (Wu et al. 2011; Luo et al. 2015). The targets in LAMOST DR7 have been cross-matched with the Gaia DR2 catalog (Gaia Collaboration et al. 2018), and their Gaia source identifiers and G magnitudes are included in the LAMOST DR7 catalogs.

SELECTION OF HIGH-QUALITY LRS SPECTRA OF SOLAR-LIKE STARS
High-quality LAMOST LRS spectra of solar-like stars (Cayrel de Strobel 1996) are required in this work. The Sun is a main-sequence star with spectral class of G2, $T_{\rm eff}$ of about 5800 K, and $\log\,g$ of about 4.4 ($g$ in units of cm s$^{-2}$). There are various definitions for solar-like stars in the literature with different ranges of stellar parameters around the values of the Sun (e.g., Schaefer et al. 2000; Maehara et al. 2012; Shibayama et al. 2013; Zhao et al. 2015; Reinhold et al. 2020; Zhang et al. 2020a,b). In this work, we consider the concept of solar-like stars from the viewpoint of stellar chromospheric activity and adopt a broader range of spectral class than G-type, since Ca II H&K lines are also prominent in spectra of late F- and early K-type stars (e.g., Kesseli et al. 2017).
We select high-quality LRS spectra of solar-like stars from the LAMOST LRS AFGK Catalog of LAMOST DR7 v2.0, which contains 6,179,327 LRS spectra (see Table 1) with stellar parameters ($T_{\rm eff}$, $\log\,g$, [Fe/H], and $V_r$) determined by the LASP. The criteria for the spectral data selection involve five aspects: SNR condition, $T_{\rm eff}$ range, [Fe/H] range, main-sequence star condition, and data completeness in Ca II H&K band. The SNR and data completeness criteria are for the high-quality data, and the $T_{\rm eff}$, [Fe/H], and main-sequence star criteria are for the sample of solar-like stars. These criteria are described as follows.
1. One major indicator of the quality of a spectrum is the SNR. LAMOST catalogs provide SNR parameters for LRS spectra in five color bands, that is, ultraviolet, green, red, near infrared, and infrared bands, which are abbreviated as u, g, r, i, and z, respectively. In LAMOST catalogs, the value of SNR is in the range from 0 to 1000. Higher SNR generally means higher quality of the spectral data, and hence smaller uncertainties of the spectral fluxes, determined stellar parameters, and derived stellar chromospheric activity parameters. In this work, we utilize the SNR parameters of LRS spectra in the g band and r band (denoted by ${\rm SNR}_g$ and ${\rm SNR}_r$, respectively), and adopt the SNR condition for high-quality LRS spectra as ${\rm SNR}_g \geq 50.00$ and ${\rm SNR}_r \geq 71.43$. This criterion is a compromise between a smaller uncertainty of spectral fluxes/stellar parameters/activity parameters and a larger volume of spectral sample. The g-band threshold (${\rm SNR}_g \geq 50.00$) is the primary condition; the r-band threshold (${\rm SNR}_r \geq 71.43$) is determined from the g-band condition in consideration that ${\rm SNR}_r/{\rm SNR}_g \sim 10/7$ for the spectra with $T_{\rm eff}$ in the range of solar-like stars (see criterion 2). Figure 2 shows the scatter plot of ${\rm SNR}_g$ vs. ${\rm SNR}_r$, in which the ratio of ${\rm SNR}_r/{\rm SNR}_g = 10/7$ and the SNR thresholds for high-quality spectra are illustrated. It should be noted that although the Ca II H&K lines employed in this work are in the g band, the whole LRS spectrum is used by the LASP to determine the stellar parameters (Luo et al. 2015). In addition to the g-band SNR condition, the r-band SNR condition is included to ensure a smaller uncertainty of stellar parameters.
The separation line between the giant and main-sequence samples defined by Equation (1) is shown as a black solid line ($\log\,g = 5.98 - 0.00035 \times T_{\rm eff}$) in Figure 3. The two endpoints of the line, ($T_{\rm eff} = 6800$ K, $\log\,g = 3.6$) and ($T_{\rm eff} = 4800$ K, $\log\,g = 4.3$), are determined empirically by visually inspecting the distribution of the LRS samples in the $T_{\rm eff}$-$\log\,g$ diagram. In Figure 3, the sample of main-sequence stars (beneath the black solid line, according to Equation (1)) with $T_{\rm eff}$ in the range of 4800-6800 K (criterion 2) and SNR above the thresholds (criterion 1) is shown in green; the sample of giant stars (above the black solid line) with SNR above the thresholds is shown in orange; the sample of other main-sequence stars ($T_{\rm eff} < 4800$ K or $T_{\rm eff} > 6800$ K) with SNR above the thresholds is shown in blue; and the sample below the SNR thresholds is shown in gray.
5. In this work, we utilize the Ca II H&K band to analyze stellar chromospheric activity. Some LRS spectral data in this band contain data points with zero or negative fluxes, which are not reliable according to the caveat of LAMOST and should be removed from the spectral data. Those LRS spectra with incomplete data points in the Ca II H&K band are not used in the analysis.
By applying the first four criteria (SNR condition, $T_{\rm eff}$ range, [Fe/H] range, and main-sequence star condition) to the spectra in the LAMOST LRS AFGK Catalog, we get a sample of 1,352,910 LRS spectra. By applying the fifth criterion (data completeness in Ca II H&K band), 22,256 spectra in the sample are discarded. We ultimately obtain 1,330,654 high-quality LAMOST LRS spectra of solar-like stars that are suitable for studying stellar chromospheric activity through the Ca II H&K lines. The number density distribution of these selected LRS spectra in the $T_{\rm eff}$-$\log\,g$ parameter space is shown in Figure 4. The sky coordinates of these selected spectra are illustrated in Figure 5.
In Figure 6, we show the distributions of the uncertainty values of the four stellar parameters determined by the LASP (denoted by $\delta T_{\rm eff}$, $\delta \log\,g$, $\delta$[Fe/H], and $\delta V_r$, respectively) for all the selected high-quality LRS spectra of solar-like stars. It can be seen from Figure 6 that the peak positions of the uncertainty distributions of the four stellar parameters are $\delta T_{\rm eff} \sim 25$ K, $\delta \log\,g \sim 0.035$ dex, $\delta$[Fe/H] $\sim 0.025$ dex, and $\delta V_r \sim 3.5$ km/s, respectively. The highly accurate stellar parameters of the selected LRS spectra of solar-like stars lay a good foundation for the subsequent data processing and analysis.
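The selection logic can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for the database: the function names and the `min_flux_hk` input (a hypothetical per-spectrum minimum flux in the Ca II H&K band) are assumptions, the main-sequence test assumes that main-sequence stars have $\log\,g$ at or above the separation line, and the [Fe/H] cut is left out because its range is not quoted in this section.

```python
import numpy as np

# Thresholds quoted in the text; the r-band value follows from SNR_r / SNR_g ~ 10/7.
SNR_G_MIN = 50.00
SNR_R_MIN = round(SNR_G_MIN * 10 / 7, 2)  # 71.43
TEFF_MIN, TEFF_MAX = 4800.0, 6800.0       # K, range quoted for criterion 2


def logg_separation(teff):
    """log g of the giant/main-sequence separation line (Equation 1)."""
    return 5.98 - 0.00035 * np.asarray(teff, dtype=float)


def select_spectra(snr_g, snr_r, teff, logg, min_flux_hk):
    """Boolean mask of spectra passing the SNR, T_eff, main-sequence,
    and completeness conditions (the [Fe/H] cut is omitted here)."""
    snr_g = np.asarray(snr_g, dtype=float)
    snr_r = np.asarray(snr_r, dtype=float)
    teff = np.asarray(teff, dtype=float)
    logg = np.asarray(logg, dtype=float)
    min_flux_hk = np.asarray(min_flux_hk, dtype=float)

    high_snr = (snr_g >= SNR_G_MIN) & (snr_r >= SNR_R_MIN)
    in_teff = (teff >= TEFF_MIN) & (teff <= TEFF_MAX)
    # Assumption: main-sequence stars lie at higher log g than the line.
    main_sequence = logg >= logg_separation(teff)
    # Spectra with zero or negative fluxes in the H&K band are rejected.
    complete = min_flux_hk > 0
    return high_snr & in_teff & main_sequence & complete
```

For a Sun-like entry ($T_{\rm eff} = 5800$ K, $\log\,g = 4.4$) the separation line sits at $\log\,g = 3.95$, so the star is kept as long as both SNR thresholds are met.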

DATA PROCESSING WORKFLOW FOR THE SELECTED LRS SPECTRA OF SOLAR-LIKE STARS
The purpose of data processing is to obtain emission fluxes of Ca II H&K lines and chromospheric activity indexes for each of the selected LRS spectra of solar-like stars. The workflow for data processing has four steps (He et al. 2021): (1) correction for wavelength shift introduced by radial velocity; (2) measurement of emission fluxes of Ca II H&K lines; (3) evaluation of chromospheric activity indexes; and (4) estimation of uncertainties of emission flux measures and activity indexes. These steps are described in detail in the following subsections.

Correction for Wavelength Shift Introduced by Radial Velocity
The radial velocity of a stellar object causes wavelength shift in the observed spectrum. This wavelength shift should be corrected to obtain the spectrum in rest frame before measuring the emission fluxes of Ca II H&K lines. The radial velocity values of the selected LRS spectra of solar-like stars have been determined by the LASP and are provided by the LAMOST LRS AFGK Catalog.
The relation between the radial velocity $V_r$ and the wavelength shift of a spectrum can be expressed as
$$\frac{V_r}{c} = \frac{\lambda - \lambda_0}{\lambda_0}, \tag{2}$$
where $c$ is the speed of light, $\lambda$ is the wavelength value of the observed spectrum, $\lambda_0$ is the wavelength value in rest frame, and $\lambda - \lambda_0$ is the wavelength shift introduced by radial velocity. Then, the desired wavelength value in rest frame, $\lambda_0$, can be calculated from the observed wavelength value $\lambda$ and radial velocity $V_r$ by the following equation:
$$\lambda_0 = \frac{\lambda}{1 + V_r/c}. \tag{3}$$
An example of wavelength shift correction for Ca II H&K lines in LAMOST LRS spectra can be seen in Figure 9.
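As a minimal sketch of this correction, assuming the non-relativistic Doppler relation $V_r/c = (\lambda - \lambda_0)/\lambda_0$ implied by the definitions above, the rest-frame wavelength array can be obtained as follows. The function name `to_rest_frame` is illustrative and not taken from the paper.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def to_rest_frame(wavelength_obs, v_r_km_s):
    """Shift observed wavelengths to the rest frame.

    Uses lambda_0 = lambda / (1 + V_r / c), with V_r in km/s and the
    wavelengths in any consistent unit (e.g., Angstrom).
    """
    wavelength_obs = np.asarray(wavelength_obs, dtype=float)
    return wavelength_obs / (1.0 + v_r_km_s / C_KM_S)
```

For a radial velocity of 30 km/s the shift near the Ca II K line is about 0.39 Å, which is comparable to the half width of the 1 Å line-core bandpass, so the correction matters even at low spectral resolution.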

Measurement of Emission Fluxes of Ca II H&K lines
The first long-term program of measuring emission fluxes of stellar Ca II H&K lines for a large sample of stars started in 1966 at MWO (Wilson 1968) using a photoelectric scanner called HKP-1. When a new photoelectric spectrometer referred to as HKP-2 was constructed at MWO, Vaughan et al. (1978) formally defined the H, K, R and V channels of Ca II H&K lines and began to routinely measure stellar emission fluxes in the four bands. The H and K bands are two 1.09 Å full width at half maximum (FWHM) triangular bandpasses at line cores of Ca II H and K lines (center wavelengths in air being 3968.47 Å and 3933.66 Å, respectively), and the R and V bands are two 20 Å rectangular bandpasses on the red and violet sides of the Ca II H&K lines (wavelength ranges in air being 3991.07-4011.07 Å and 3891.07-3911.07 Å, respectively). The R and V bands provide reference fluxes of pseudo-continuum for evaluation of the S index (Vaughan et al. 1978; Duncan et al. 1991).
The aforementioned H, K, R, and V bands have become the standard for characterizing emissions of Ca II H&K lines and have been widely used for assessing stellar chromospheric activity in the literature (e.g., Hall et al. 2009; Isaacson & Fischer 2010; Mittag et al. 2013; Beck et al. 2016; Salabert et al. 2016; Boro Saikia et al. 2018; Karoff et al. 2019; Melbourne et al. 2020; Zhang et al. 2020a; Gomes da Silva et al. 2021; Sowmya et al. 2021). Meanwhile, researchers also look for alternative definitions of the bands to quantify the emissions of Ca II H&K lines for specific facilities and data sets. For example, Zhao et al. (2013) defined their H and K bands as 2 Å rectangular bandpasses for the spectral data of SDSS; Mittag et al. (2016) adopted 1 Å rectangular H and K bands for the spectral data of the TIGRE telescope. Both Zhao et al. (2013) and Mittag et al. (2016) kept the definitions of R and V bands by MWO.
In this work, we adopt the classical definitions of R and V bands with 20 Å rectangular bandpasses, the classical definitions of H and K bands with 1.09 Å FWHM triangular bandpasses, and the alternative definitions of H and K bands with 1 Å rectangular bandpasses for the LAMOST LRS spectra. Six emission flux measures are introduced to evaluate the mean fluxes in the six bandpasses, which are tabulated in Table 2. These measures are $R$ and $V$ for the mean fluxes in the 20 Å wide R and V reference bands, $H_{\rm tri}$ and $K_{\rm tri}$ for the mean fluxes in the H and K bands with 1.09 Å FWHM triangular bandpasses, and $H_{\rm rec}$ and $K_{\rm rec}$ for the mean fluxes in the H and K bands with 1 Å rectangular bandpasses. $H_{\rm rec}$ and $K_{\rm rec}$ are measured for Ca II H&K lines in addition to $H_{\rm tri}$ and $K_{\rm tri}$ because the physical meaning of the fluxes in a rectangular bandpass is more straightforward than that of the fluxes in a triangular bandpass (see further discussion on the two types of bandpasses in Section 6.1).
The center wavelength values of the R, V, H, and K bands are given in the rightmost two columns of Table 2. The wavelength values in air are taken from Vaughan et al. (1978), and the wavelength values in vacuum are calculated from the air wavelengths by using the conversion formula given by Ciddor (1996). In practical computation, the vacuum wavelengths are employed to derive the six emission flux measures from the LAMOST LRS spectra. A diagram illustration of the vacuum wavelength ranges of the 20 Å wide R and V bands, the 1.09 Å FWHM triangular H and K bands, and the 1 Å rectangular H and K bands can be found in Figure 9.
We derive the values of the six emission flux measures ($R$, $V$, $H_{\rm tri}$, $K_{\rm tri}$, $H_{\rm rec}$, and $K_{\rm rec}$) of Ca II H&K lines for all the selected LRS spectra of solar-like stars. To measure the mean flux in a bandpass for a LAMOST LRS spectrum, we first integrate the spectral fluxes in the bandpass, and then divide the integrated flux value by the wavelength width of the bandpass. Since the LRS spectrum consists of discrete data points which are a bit sparse for the bandpass integration, we obtain a denser distribution of data points in the bandpass via linear interpolation. The wavelength steps after interpolation are 0.01 Å for R and V bands and 0.001 Å for H and K bands. Then, the mean flux value in a bandpass is calculated based on the interpolated spectral data.
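The air-to-vacuum conversion can be illustrated with a short sketch. It uses the refractive-index expression for standard air that is widely quoted from Ciddor (1996); whether the authors applied exactly this form is an assumption, and the function names are illustrative. Because the expression is written in terms of the vacuum wavelength, the inverse (air to vacuum) is obtained here by a few fixed-point iterations.

```python
def refractive_index_air(wavelength_vac_angstrom):
    """Refractive index of standard air at a given vacuum wavelength (Angstrom)."""
    s2 = (1.0e4 / wavelength_vac_angstrom) ** 2  # squared wavenumber, micron^-2
    return 1.0 + 0.05792105 / (238.0185 - s2) + 0.00167917 / (57.362 - s2)


def air_to_vacuum(wavelength_air_angstrom, n_iter=5):
    """Convert an air wavelength to vacuum by iterating lambda_vac = n * lambda_air."""
    wl_vac = wavelength_air_angstrom
    for _ in range(n_iter):
        wl_vac = wavelength_air_angstrom * refractive_index_air(wl_vac)
    return wl_vac
```

With this expression the air wavelengths 3933.66 Å (K) and 3968.47 Å (H) map to vacuum values roughly 1.1 Å larger; the adopted values are those listed in Table 2.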
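The measurement procedure just described can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the function names are hypothetical, the triangular bandpass is taken to have unit peak weight and a base width of twice the FWHM, and its mean flux is taken as the weighted integral divided by the integral of the weight (which equals the FWHM).

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def mean_flux_rectangular(wavelength, flux, center, width, step):
    """Mean flux in a rectangular bandpass after linear interpolation."""
    n = int(round(width / step)) + 1
    grid = np.linspace(center - width / 2.0, center + width / 2.0, n)
    f = np.interp(grid, wavelength, flux)  # linear interpolation onto the dense grid
    return _trapezoid(f, grid) / width


def mean_flux_triangular(wavelength, flux, center, fwhm, step):
    """Weighted mean flux in a triangular bandpass with the given FWHM."""
    n = int(round(2.0 * fwhm / step)) + 1
    grid = np.linspace(center - fwhm, center + fwhm, n)
    weight = 1.0 - np.abs(grid - center) / fwhm  # 1 at the center, 0 at the edges
    f = np.interp(grid, wavelength, flux)
    return _trapezoid(f * weight, grid) / _trapezoid(weight, grid)
```

A call such as `mean_flux_rectangular(wl, fl, center, 1.0, 0.001)` would give a line-core measure like $K_{\rm rec}$ when `center` is the vacuum center wavelength of the K band, and `mean_flux_rectangular(wl, fl, center, 20.0, 0.01)` a reference-band measure like $R$ or $V$; `wl` and `fl` stand for the rest-frame wavelength and flux arrays of one spectrum.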

Evaluation of Chromospheric Activity Indexes
From the emission flux measures of Ca II H&K lines obtained in Section 4.2, stellar chromospheric activity indexes can be evaluated. The widely used chromospheric activity indicator, the classical MWO S index (denoted by $S_{\rm MWO}$), was originally defined at MWO as (Wilson 1968; Vaughan et al. 1978; Duncan et al. 1991; Baliunas et al. 1995)
$$S_{\rm MWO} = \alpha \cdot \frac{N_H + N_K}{N_R + N_V}, \tag{4}$$
where $N_H$ and $N_K$ are the number of counts in the 1.09 Å FWHM triangular bandpasses of Ca II H and K lines, $N_R$ and $N_V$ are the number of counts in the 20 Å R and V reference bands (Vaughan et al. 1978), and $\alpha$ is a scaling factor for adjusting the HKP-2 measurements to be on a similar scale to the HKP-1 results (Duncan et al. 1991).
For the LAMOST LRS spectra, by referring to the definition of $S_{\rm MWO}$, we can introduce the LAMOST S index (denoted by $S_L$), which is expressed as (Lovis et al. 2011; Karoff et al. 2016)
$$S_L = \alpha_L \cdot \frac{8 \times 1.09\,{\AA}}{20\,{\AA}} \cdot \frac{H_{\rm tri} + K_{\rm tri}}{R + V}, \tag{5}$$
where $\alpha_L$ is the scaling factor for LAMOST, and $H_{\rm tri}$, $K_{\rm tri}$, $R$, and $V$ are the emission flux measures of Ca II H&K lines defined in Section 4.2 (see Table 2). The reason for multiplying the 1.09 Å FWHM by 8 is that the integration time spent on the H and K bands by the spectrometer used at MWO is eight times longer than that on the 20 Å wide R and V bands (Duncan et al. 1991; Lovis et al. 2011). The value of $\alpha_L$ is adopted as 1.8 for LAMOST LRS data as suggested by Karoff et al. (2016), which is determined based on the S-index distributions of solar-like stars.
The relationship between the values of $S_L$ derived from LAMOST LRS spectra and the values of $S_{\rm MWO}$ measured by MWO can be calibrated based on the common stars of the two data sets. In Figure 7, we show the scatter plot of $S_L$ vs. $S_{\rm MWO}$ for 65 common stars between the selected LRS spectra of solar-like stars in this work and the $S_{\rm MWO}$ catalog given by Duncan et al. (1991). As exhibited in Figure 7, the relationship between $S_L$ and $S_{\rm MWO}$ can be fitted with an exponential function, which is displayed as a black line in Figure 7 (see Appendix A for details of the common stars and fitting procedure). The nonlinear relation is expected for larger S-index values since they are obtained by instruments with distinct spectral resolutions (see, e.g., Vaughan et al. 1978; Henry et al. 1996, for more examples). For smaller $S_{\rm MWO}$ values (less than 0.3), the fitted line approaches $S_{\rm MWO}/S_L = 1$ as exhibited in Figure 7, illustrating the suitability of the scaling factor value of $\alpha_L = 1.8$ adopted for LAMOST LRS data.
The median of the relative deviations between the $S_L$ values of the common stars and the fitted line in Figure 7 is about 8.8%. The large residuals could be due to the long-term activity variation in these stars (e.g., Wilson 1978; Baliunas & Jastrow 1990; Baliunas et al. 1995; Radick et al. 2018), and LAMOST only captures a snapshot observation of these stars' activity at all S-index values.
If the activity index values do not need to be on a similar scale to the measurements at MWO, the factors in Equation (5) are unnecessary. Therefore, we can define the $S_{\rm tri}$ index based on the 1.09 Å FWHM triangular bandpass for Ca II H&K lines as
$$S_{\rm tri} = \frac{H_{\rm tri} + K_{\rm tri}}{R + V}. \tag{7}$$
The $S_L$ index is connected with the $S_{\rm tri}$ index by equation
$$S_L = \alpha_L \cdot \frac{8 \times 1.09\,{\AA}}{20\,{\AA}} \cdot S_{\rm tri}. \tag{8}$$
If it is not necessary to keep the bandpass shape adopted by MWO, we can also use the 1 Å rectangular bandpasses to measure the line core emissions of Ca II H&K lines (see Section 4.2) and define the $S_{\rm rec}$ index as
$$S_{\rm rec} = \frac{H_{\rm rec} + K_{\rm rec}}{R + V}. \tag{9}$$
Jenkins et al. (2011) analyzed the influences of change in width of bandpasses at line cores of Ca II H&K lines on the result of S index. Their result showed that the correlation between the derived S-index values and the original values of MWO decreases with increasing bandpass width (such as 2 Å, 3 Å, 4 Å, etc.) even for low-resolution spectral data. Therefore, we adopt the current bandpass widths for defining the activity indexes $S_{\rm tri}$, $S_L$, and $S_{\rm rec}$ to keep best compatibility with the MWO measurements.
We calculated the values of $S_{\rm tri}$, $S_L$, and $S_{\rm rec}$ indexes by using Equations (7), (8), and (9) for all the selected LRS spectra of solar-like stars. The correlations between the values of the three indexes are illustrated and discussed in Section 6.1.
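The three indexes can be computed from the six flux measures with a few lines of Python. The sketch below assumes that $S_{\rm tri}$ and $S_{\rm rec}$ are the plain ratios of the summed line-core mean fluxes to the summed reference-band mean fluxes, and that $S_L$ rescales $S_{\rm tri}$ by $\alpha_L \times 8 \times 1.09/20$, as the text describes; the function names are illustrative.

```python
ALPHA_L = 1.8                    # scaling factor adopted for LAMOST LRS data
MWO_FACTOR = 8.0 * 1.09 / 20.0   # = 0.436, from the MWO integration-time ratio


def s_tri(h_tri, k_tri, r, v):
    """Index based on the 1.09 A FWHM triangular bandpasses."""
    return (h_tri + k_tri) / (r + v)


def s_rec(h_rec, k_rec, r, v):
    """Index based on the 1 A rectangular bandpasses."""
    return (h_rec + k_rec) / (r + v)


def s_l(h_tri, k_tri, r, v, alpha_l=ALPHA_L):
    """LAMOST S index: S_tri rescaled toward the MWO scale."""
    return alpha_l * MWO_FACTOR * s_tri(h_tri, k_tri, r, v)
```

With $\alpha_L = 1.8$ the combined factor is $1.8 \times 0.436 = 0.7848$, so under these definitions $S_L$ and $S_{\rm tri}$ differ only by a constant multiplier.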

Estimation of Uncertainties of Emission Flux Measures and Activity Indexes
In Sections 4.2 and 4.3, six emission flux measures of Ca II H&K lines and three stellar chromospheric activity indexes are derived from the LRS spectra. In this subsection, we estimate uncertainties of these activity parameters. Three sources of uncertainty are taken into account: the uncertainty of spectral flux, the discretization in spectral data, and the uncertainty of radial velocity, which can finally propagate into the composite uncertainty values of the derived activity parameters.
The FITS file of an LRS spectrum provides the value of inverse variance ($1/\delta_0^2$, where $\delta_0$ denotes the uncertainty of flux) to indicate photon noise for each data point in the spectrum. As described in Sections 4.2 and 4.3, the emission flux measures as well as the activity indexes are calculated based on the interpolated spectral data. The flux uncertainty value of an interpolated data point (denoted by $\delta_i$) can be derived from $\delta_0$ by equation
$$\delta_i = \delta_0 \sqrt{\frac{n_i}{n_0}}, \tag{10}$$
where $n_i$ is the number density of the spectral data after interpolation and $n_0$ is the number density in the original LRS spectrum. Then, for a given activity parameter $P$ (one of the six emission flux measures and three activity indexes), the uncertainty of $P$ caused by the uncertainty of spectral flux (denoted by $\delta P_{\rm flux}$) can be estimated from $\delta_i$ based on the definitions or formulas in Sections 4.2 and 4.3 using the error propagation rules.
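The role of the density ratio $n_i/n_0$ can be illustrated numerically. The sketch below assumes the scaling $\delta_i = \delta_0 \sqrt{n_i/n_0}$ (an assumption based on the definitions of $n_i$ and $n_0$ above) and treats the interpolated points as independent when propagating to a mean flux. Under that assumption the inflated per-point uncertainty compensates for the larger number of points, so the uncertainty of the mean matches what the original data points alone would give. All function names are hypothetical.

```python
import numpy as np


def sigma_from_ivar(ivar):
    """Flux uncertainty from inverse variance; NaN where ivar is not positive."""
    ivar = np.asarray(ivar, dtype=float)
    sigma = np.full(ivar.shape, np.nan)
    good = ivar > 0
    sigma[good] = 1.0 / np.sqrt(ivar[good])
    return sigma


def sigma_interpolated(sigma_0, n_i, n_0):
    """Per-point uncertainty after interpolation (assumed scaling)."""
    return sigma_0 * np.sqrt(n_i / n_0)


def sigma_of_mean(sigma_points):
    """Uncertainty of an unweighted mean of independent points."""
    sigma_points = np.asarray(sigma_points, dtype=float)
    return float(np.sqrt(np.sum(sigma_points ** 2)) / sigma_points.size)
```

For example, 4 original points with $\delta_0 = 0.1$ give a mean-flux uncertainty of $0.1/\sqrt{4} = 0.05$; 400 interpolated points with $\delta_i = 0.1 \times \sqrt{100} = 1.0$ give the same value.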
A spectrum is stored as discrete data points, and the flux values between data points have to be obtained via an interpolation algorithm as described in Section 4.2, which introduces uncertainties into the derived emission flux measures and activity indexes. To estimate the uncertainties of the activity parameters caused by the discretization in spectral data, we utilize two interpolation algorithms to obtain the spectral flux values between data points. One is the linear interpolation algorithm used in Section 4.2, and the other is the cubic interpolation algorithm. For a given activity parameter $P$, we get two parameter values, $P_{\rm linear}$ and $P_{\rm cubic}$, corresponding to the two interpolation algorithms, respectively. Then, the uncertainty of $P$ caused by the discretization in spectral data (denoted by $\delta P_{\rm discrete}$) can be estimated as the difference between $P_{\rm linear}$ and $P_{\rm cubic}$, i.e., $\delta P_{\rm discrete} = |P_{\rm linear} - P_{\rm cubic}|$.
To estimate the uncertainties of the emission flux measures and activity indexes caused by the uncertainty of radial velocity, we perform the wavelength correction for an LRS spectrum (as described in Section 4.1) using three deliberately set radial velocity values, $V_r - \delta V_r$, $V_r$, and $V_r + \delta V_r$, where $V_r$ is the formal radial velocity of the spectrum determined by LASP and $\delta V_r$ is the uncertainty of the radial velocity. For a given activity parameter $P$, we get three parameter values, $P_-$, $P$, and $P_+$, corresponding to the three radial velocity values, respectively. The uncertainty of $P$ caused by the uncertainty of radial velocity (denoted by $\delta P_{V_r}$) is then estimated from $P_-$, $P$, and $P_+$.
The composite uncertainty of $P$ (denoted by $\delta P$) caused by all three uncertainty sources is calculated by combining the three individual uncertainties. We calculate the uncertainty values ($\delta P_{\rm flux}$, $\delta P_{V_r}$, and $\delta P$) of the six emission flux measures and three activity indexes for all the selected LRS spectra of solar-like stars. In Figure 8, we show the relative magnitudes of the uncertainties originating from different sources, using the uncertainty of the activity index $S_{\rm rec}$ as an example. As illustrated in Figure 8, the uncertainty of $S_{\rm rec}$ originating from the uncertainty of spectral flux (about $10^{-2}$) is roughly an order of magnitude higher than that from the discretization in spectral data (about $10^{-3}$), which in turn is roughly an order of magnitude higher than that from the uncertainty of radial velocity (about $10^{-4}$).
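The discretization estimate described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used for the database: the helper names (`interp_linear`, `interp_cubic`, `band_mean`), the four-point Lagrange form chosen for the cubic interpolation, the fine sampling grid, and the toy absorption-line spectrum are all assumptions made for the example.

```python
import bisect
import math

def interp_linear(xs, ys, x):
    """Linear interpolation between the two samples bracketing x."""
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])

def interp_cubic(xs, ys, x):
    """Cubic interpolation through the four samples nearest x (Lagrange form, assumed)."""
    i = min(max(bisect.bisect_right(xs, x) - 2, 0), len(xs) - 4)
    px, py = xs[i:i + 4], ys[i:i + 4]
    total = 0.0
    for j in range(4):
        weight = 1.0
        for k in range(4):
            if k != j:
                weight *= (x - px[k]) / (px[j] - px[k])
        total += weight * py[j]
    return total

def band_mean(xs, ys, center, width, interp, n=201):
    """Mean flux in a rectangular band, evaluated on a fine uniform grid."""
    start = center - width / 2.0
    step = width / (n - 1)
    return sum(interp(xs, ys, start + m * step) for m in range(n)) / n

# Toy spectrum: a Gaussian absorption line sampled every 0.9 Angstrom (illustrative only).
xs = [3960.0 + 0.9 * i for i in range(23)]
ys = [1.0 - 0.8 * math.exp(-((x - 3969.59) / 3.0) ** 2) for x in xs]

p_linear = band_mean(xs, ys, 3969.59, 1.0, interp_linear)
p_cubic = band_mean(xs, ys, 3969.59, 1.0, interp_cubic)
delta_p_discrete = abs(p_linear - p_cubic)  # difference between the two estimates
print(p_linear, p_cubic, delta_p_discrete)
```

The same pattern would apply to the radial velocity term: recompute the parameter after shifting the wavelengths with $V_r - \delta V_r$ and $V_r + \delta V_r$ and compare the results with the nominal value.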
All the obtained emission flux measures and activity indexes, together with the estimated composite uncertainties of the activity parameters, are integrated into the catalog of the stellar chromospheric activity database of solar-like stars (see Section 5). In a few LRS spectra, the inverse variance values are not available at some data points in the H, K, R, and V bands of the Ca II H&K lines. For those spectra, the uncertainty values of the emission flux measures and activity indexes are filled with '-9999' in the catalog of the database.
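When the catalog is read programmatically, the '-9999' fill value needs to be treated as missing rather than as a number. The sketch below shows one way to do this with the Python standard library. The three-row CSV text is synthetic, and the column name `S_rec_err` is a hypothetical placeholder; the actual uncertainty column names are those listed in Table 3.

```python
import csv
import io

FILL_VALUE = -9999.0  # marks an unavailable uncertainty in the catalog

# Tiny synthetic stand-in for the catalog; names and values are illustrative only.
SAMPLE_CSV = """obsid,S_rec,S_rec_err
1001,0.231,0.012
1002,0.187,-9999
1003,0.642,0.020
"""

def read_uncertainties(handle, column):
    """Return {obsid: uncertainty or None}, mapping the fill value to None."""
    result = {}
    for row in csv.DictReader(handle):
        value = float(row[column])
        result[row["obsid"]] = None if value == FILL_VALUE else value
    return result

errors = read_uncertainties(io.StringIO(SAMPLE_CSV), "S_rec_err")
usable = [obsid for obsid, err in errors.items() if err is not None]
print(usable)
```

For the real file, the `io.StringIO` object would be replaced by an open file handle.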

STELLAR CHROMOSPHERIC ACTIVITY DATABASE OF SOLAR-LIKE STARS
In Section 3, we select 1,330,654 high-quality LRS spectra of solar-like stars from LAMOST DR7 v2.0. In Section 4, we derive six emission flux measures ($R$, $V$, $H_{\rm tri}$, $K_{\rm tri}$, $H_{\rm rec}$, and $K_{\rm rec}$) and three stellar chromospheric activity indexes ($S_{\rm tri}$, $S_L$, and $S_{\rm rec}$) of the Ca II H&K lines, as well as their uncertainties, for the selected LAMOST LRS spectra. These emission flux measures and activity indexes can be used to investigate the overall distribution of chromospheric activity of solar-like stars as well as the activity characteristics of individual spectra. We also produce spectrum diagrams of the Ca II H&K lines for all the selected LRS spectra. A stellar chromospheric activity database of solar-like stars is constructed from the selected LAMOST LRS spectra, the derived emission flux measures and activity indexes, and the produced spectrum diagrams of the Ca II H&K lines. The database is composed of a catalog of the spectral sample and activity parameters and a library of the spectrum diagrams, which are described in detail in the following subsections. An online version of the database is available.$^{4}$ The original FITS data files of the associated LRS spectra can be queried and downloaded from the LAMOST website (see Section 2) through the obsid or fitsname information (see Table 3) included in the catalog of the database.

Catalog of Spectral Sample and Activity Parameters
The catalog of the 1,330,654 high-quality LAMOST LRS spectra of solar-like stars, together with the derived emission flux measures and activity indexes, is stored in a CSV format file (filename: CaIIHK_Sindex_LAMOST_DR7_LRS.csv). Each row of the catalog corresponds to an LRS spectrum. All the columns in the catalog are tabulated in Table 3 with brief descriptions. As shown in Table 3, the six emission flux measures (columns: R_mean, V_mean, H_mean_tri, K_mean_tri, H_mean_rec, and K_mean_rec), the three activity indexes (columns: S_tri, S_L, and S_rec), and their uncertainties derived in Section 4 are included in the catalog (18 columns in total). The figure file name of the Ca II H&K spectrum diagram (see Section 5.2) is also included in the catalog (column: figname). These 19 columns provided by this work are labeled with a '*' symbol in Table 3. The other columns in Table 3 are taken from the data release of LAMOST; they are used in this work and hence are kept in the catalog for reference.
In Section 4.3, the $S_{\rm tri}$ and $S_{\rm rec}$ indexes are introduced based on the triangular bandpass and the rectangular bandpass at the Ca II H&K line cores, respectively. The $S_L$ index is further introduced by multiplying $S_{\rm tri}$ by a scaling factor. Figure 10 depicts the correlations of $S_{\rm rec}$ vs. $S_{\rm tri}$, $S_{\rm tri}$ vs. $S_L$, and $S_{\rm rec}$ vs. $S_L$ based on the derived values of the activity indexes in the database.
As shown in Figure 10, the values of $S_{\rm rec}$, $S_{\rm tri}$, and $S_L$ for the LRS spectra are in good agreement. A linear fitting for $S_{\rm rec}$ vs. $S_{\rm tri}$ gives
$$S_{\rm tri} = 0.983\,S_{\rm rec} + 0.0075 \tag{14}$$
(see Figure 10a), which means that for larger values ($> 0.456$) of the activity indexes, $S_{\rm rec}$ is generally slightly greater than $S_{\rm tri}$. The relation between $S_{\rm tri}$ and $S_L$ has been given by Equation (8) (also see Figure 10b). A linear fitting for $S_{\rm rec}$ vs. $S_L$ gives
$$S_L = 0.771\,S_{\rm rec} + 0.0059 \tag{15}$$
(see Figure 10c). Note that Equation (15) can be deduced from Equations (8) and (14).
In comparison to the $S_{\rm tri}$ and $S_L$ indexes, which are defined based on the triangular bandpass at the line cores of the Ca II H&K lines, the $S_{\rm rec}$ index defined based on the rectangular bandpass has a more straightforward physical meaning and is also a reliable and suitable choice for stellar activity studies with the LRS spectra. The mean $S_{\rm MWO}$ index value of the Sun is about 0.169, as determined by Egeland et al. (2017) based on the MWO HKP-2 measurements. By substituting this value into Equation (6) and then using Equation (15), we find that the mean $S_{\rm rec}$ value of the Sun (denoted by $S_{\rm rec,\odot}$) is about 0.223.
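The two linear fits can be checked against each other numerically. The sketch below encodes Equation (14) ($S_{\rm tri} = 0.983\,S_{\rm rec} + 0.0075$) and Equation (15) ($S_L = 0.771\,S_{\rm rec} + 0.0059$) with the rounded coefficients quoted in the text; the function and variable names are illustrative. It computes the $S_{\rm rec}$ value at which Equation (14) gives $S_{\rm tri} = S_{\rm rec}$, and the $S_L/S_{\rm tri}$ scaling factor implied separately by the slopes and by the intercepts, which should agree if Equation (15) follows from Equations (8) and (14).

```python
def s_tri_from_s_rec(s_rec):
    """Equation (14): linear fit of S_tri against S_rec."""
    return 0.983 * s_rec + 0.0075

def s_l_from_s_rec(s_rec):
    """Equation (15): linear fit of S_L against S_rec."""
    return 0.771 * s_rec + 0.0059

# S_rec value at which Equation (14) yields S_tri == S_rec.
crossover = 0.0075 / (1.0 - 0.983)

# Scaling factor S_L / S_tri implied by the slopes and by the intercepts.
scale_from_slopes = 0.771 / 0.983
scale_from_intercepts = 0.0059 / 0.0075

s_rec_sun = 0.223  # mean solar value quoted in the text
print(round(crossover, 3), round(scale_from_slopes, 3), round(scale_from_intercepts, 3))
print(round(s_tri_from_s_rec(s_rec_sun), 3), round(s_l_from_s_rec(s_rec_sun), 3))
```

With these rounded coefficients the crossover evaluates to roughly 0.44, close to the value of 0.456 quoted in the text. The result is sensitive to the third decimal of the slope, so the small difference is presumably a rounding effect. The two scaling-factor estimates agree to within about 0.3%.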

Uncertainty of Activity Index versus SNR of Spectra
The uncertainty of the activity indexes is related to the SNR of the spectra. A larger SNR usually corresponds to a smaller uncertainty of the activity index. We utilize all the uncertainty values of $S_{\rm rec}$ derived in Section 4.4 and the SNR$_g$ values of the LAMOST LRS spectra to quantitatively analyze this relation. (The analysis can also be performed for $S_{\rm tri}$ and $S_L$, and the results are similar.) In Section 4.4, we have obtained the uncertainties of $S_{\rm rec}$ originating from the uncertainty of spectral flux, the discretization in spectral data, and the uncertainty of radial velocity (denoted by $\delta S_{\rm rec,\,flux}$, $\delta S_{\rm rec,\,discrete}$, and $\delta S_{{\rm rec},\,V_r}$, respectively), as well as the composite uncertainty of $S_{\rm rec}$ caused by all three uncertainty sources (denoted by $\delta S_{\rm rec}$). The scatter plots illustrating the distributions of $\log{\rm SNR}_g$ vs. $\log\delta S_{\rm rec,\,flux}$, $\log{\rm SNR}_g$ vs. $\log\delta S_{\rm rec,\,discrete}$, $\log{\rm SNR}_g$ vs. $\log\delta S_{{\rm rec},\,V_r}$, and $\log{\rm SNR}_g$ vs. $\log\delta S_{\rm rec}$ are displayed in the left panels of Figure 11. We also calculate the corresponding relative uncertainties of $S_{\rm rec}$ (i.e., $\delta S_{\rm rec,\,flux}/S_{\rm rec}$, $\delta S_{\rm rec,\,discrete}/S_{\rm rec}$, $\delta S_{{\rm rec},\,V_r}/S_{\rm rec}$, and $\delta S_{\rm rec}/S_{\rm rec}$), and their distributions against $\log{\rm SNR}_g$ are displayed in the right panels of Figure 11.
It can be seen from Figure 11 that a power-law relation is roughly satisfied between SNR$_g$ and $\delta S_{\rm rec,\,flux}$ as well as between SNR$_g$ and $\delta S_{\rm rec,\,flux}/S_{\rm rec}$ (Figures 11a and b); however, the values of $\delta S_{\rm rec}$ and $\delta S_{\rm rec}/S_{\rm rec}$ are distributed over a relatively wide strip. The order of magnitude of $\delta S_{\rm rec,\,discrete}$ (Figures 11c and d) is generally smaller than that of $\delta S_{\rm rec,\,flux}$, and the order of magnitude of $\delta S_{{\rm rec},\,V_r}$ (Figures 11e and f) is in turn generally smaller than that of $\delta S_{\rm rec,\,discrete}$, as already demonstrated in Figure 8.
Figures 11g and h show that the composite uncertainty $\delta S_{\rm rec}$ is mainly affected by $\delta S_{\rm rec,\,flux}$ for smaller SNR$_g$ values and by $\delta S_{\rm rec,\,discrete}$ for larger SNR$_g$ values. We perform cubic polynomial fitting for the upper envelope, mean value, and lower envelope of the distributions of $\log\delta S_{\rm rec}$ and $\log\delta S_{\rm rec}/S_{\rm rec}$ in Figures 11g and h. We divide the range of $\log{\rm SNR}_g$ (from $\log 50 = 1.7$ to $\log 1000 = 3.0$) into 300 equal-width bins and use the upper envelope, mean, and lower envelope values of $\log\delta S_{\rm rec}$ and $\log\delta S_{\rm rec}/S_{\rm rec}$ in each bin for the fitting. The upper and lower envelopes in each bin are delimited by the region where the number density in Figures 11g and h is no less than 5. If all the number density values in a bin are less than 5, the bin does not participate in the fitting. The fitting results are illustrated in Figures 11g and h.
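The binning step can be sketched as follows. This is a simplified stand-in rather than the procedure used for Figure 11: it takes the minimum and maximum of the points in each bin as the envelopes and skips bins with fewer than `min_count` points, whereas the text defines the envelopes through a number-density threshold in the two-dimensional diagram. The function name and its parameters are assumptions made for the example, and the cubic fitting itself is not shown.

```python
def bin_statistics(log_snr, log_err, lo=1.7, hi=3.0, n_bins=300, min_count=5):
    """Per-bin (center, lower, mean, upper) of log_err over equal-width log_snr bins."""
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(log_snr, log_err):
        if lo <= x < hi:
            k = min(int((x - lo) / width), n_bins - 1)  # guard against rounding at the edge
            bins[k].append(y)
    stats = []
    for k, values in enumerate(bins):
        if len(values) < min_count:  # simplified stand-in for the density threshold
            continue
        center = lo + (k + 0.5) * width
        stats.append((center, min(values), sum(values) / len(values), max(values)))
    return stats

# Synthetic input: six points in one bin, two in another (the sparse bin is skipped).
log_snr = [2.0] * 6 + [2.5] * 2
log_err = [-2.0, -2.2, -1.8, -2.1, -1.9, -2.0, -2.5, -2.6]
stats = bin_statistics(log_snr, log_err)
print(stats)
```

The resulting per-bin centers and envelope or mean values are what a cubic polynomial would then be fitted to.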
The formulas for the fitted upper envelope, mean value, and lower envelope of the $\log\delta S_{\rm rec}$ distribution are given in Equations (16), (17), and (18), respectively, in which $x$ represents $\log{\rm SNR}_g$ and $y$ represents $\log\delta S_{\rm rec}$:
$$y = -0.041x^3 + 1.304x^2 - 6.389x + 6.288, \tag{16}$$
$$y = -0.157x^3 + 1.880x^2 - 6.806x + 5.415, \tag{17}$$
The formulas for the fitted upper envelope, mean value, and lower envelope of the $\log\delta S_{\rm rec}/S_{\rm rec}$ distribution are given in Equations (19), (20), and (21), respectively, in which $x$ represents $\log{\rm SNR}_g$ and $y$ represents $\log\delta S_{\rm rec}/S_{\rm rec}$:
$$y = -0.152x^3 + 1.984x^2 - 7.063x + 5.738. \tag{21}$$
Equations (16)-(21) can be used to make a preliminary estimation of the values of $\delta S_{\rm rec}$ and $\delta S_{\rm rec}/S_{\rm rec}$ from the value of SNR$_g$. For example, by using Equation (20) (illustrated by the black dashed line in Figure 11h), it can be deduced that the mean value of $\log\delta S_{\rm rec}/S_{\rm rec}$ for SNR$_g = 50$ is about $-0.87$, and the corresponding $\delta S_{\rm rec}/S_{\rm rec}$ value is about $10^{-0.87} \approx 0.13$. By using Equation (16) (illustrated by the red line in Figure 11g), it can be deduced that the upper envelope value of $\log\delta S_{\rm rec}$ for SNR$_g = 50$ is about $-1.0$, which means that the whole upper envelope line of $\log\delta S_{\rm rec}$ lies below about $-1.0$ and the corresponding $\delta S_{\rm rec}$ value is below $10^{-1.0} = 0.1$. Considering the distribution range of the $S_{\rm rec}$ values (about $10^{0} = 1.0$; see Figure 12), it is appropriate to set the upper limit of the uncertainty values of $S_{\rm rec}$ to about $10^{-1} = 0.1$. This requirement leads to the SNR$_g$ condition (SNR$_g \geq 50.00$) for selecting LRS spectra in Section 3.
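As a quick check of the worked example, the fitted cubics can be evaluated directly. The sketch below encodes only the three equations whose coefficients appear in the text, namely Equations (16), (17), and (21); the constant and function names are illustrative.

```python
import math

# Coefficients (a3, a2, a1, a0) of y = a3*x^3 + a2*x^2 + a1*x + a0, with x = log10(SNR_g).
EQ16_UPPER_LOG_ERR = (-0.041, 1.304, -6.389, 6.288)  # upper envelope of log(dS_rec)
EQ17_MEAN_LOG_ERR = (-0.157, 1.880, -6.806, 5.415)   # mean value of log(dS_rec)
EQ21_LOWER_LOG_REL = (-0.152, 1.984, -7.063, 5.738)  # lower envelope of log(dS_rec/S_rec)

def eval_fit(coeffs, snr_g):
    """Evaluate a fitted cubic at a given SNR_g using Horner's scheme."""
    x = math.log10(snr_g)
    a3, a2, a1, a0 = coeffs
    return ((a3 * x + a2) * x + a1) * x + a0

upper_at_50 = eval_fit(EQ16_UPPER_LOG_ERR, 50.0)
mean_at_50 = eval_fit(EQ17_MEAN_LOG_ERR, 50.0)
print(round(upper_at_50, 2), round(mean_at_50, 2))
print(round(10 ** upper_at_50, 3))  # corresponding upper-envelope dS_rec at SNR_g = 50
```

At SNR$_g = 50$ the upper envelope evaluates to about $-1.0$, in line with the value quoted above, and the envelope decreases toward larger SNR$_g$ over the fitted range.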

Overall Distribution of Chromospheric Activity of Solar-like Stars
We use the full data set of the derived $S_{\rm rec}$ values in the database to show the overall distribution of chromospheric activity of solar-like stars. The distributions of $S_{\rm tri}$ and $S_L$ are similar, since both have approximately linear relations with $S_{\rm rec}$ (see Figures 10a and c).
The histogram of all the $S_{\rm rec}$ values in the database is displayed in Figure 12. Figure 12a uses a linear scale for the vertical axis and Figure 12b a logarithmic scale. As shown in Figure 12a, most of the spectra in the database are associated with stars that are not very active, with the values of $S_{\rm rec}$ distributed around the mean $S_{\rm rec}$ value of the Sun ($S_{\rm rec,\odot} = 0.223$; see Section 6.1). Figure 12b demonstrates that a certain number of spectra have higher values of $S_{\rm rec}$, which might be associated with active stars, and about a dozen spectra have isolated $S_{\rm rec}$ values greater than 1.1. The features of the spectra with higher values of $S_{\rm rec}$ will be examined in detail in future work.
The relationship between stellar activity and the stellar parameters ($T_{\rm eff}$, $\log g$, and [Fe/H]) has attracted wide interest in the literature (e.g., Wilson 1968; Gray et al. 2006; Mittag et al. 2013; Zhao et al. 2013, 2015; Lorenzo-Oliveira et al. 2016; Boro Saikia et al. 2018; Fang et al. 2018; Karoff et al. 2018; Zhang et al. 2019; Gomes da Silva et al. 2021). In Figure 13, we display the scatter diagrams of $T_{\rm eff}$ vs. $S_{\rm rec}$, $\log g$ vs. $S_{\rm rec}$, and [Fe/H] vs. $S_{\rm rec}$ using the values of the stellar parameters and activity indexes in the database, with the color scale indicating number density.
In Figure 14, we display the distribution of $S_{\rm rec}$ values in scatter diagrams of $T_{\rm eff}$ vs. $\log g$, $T_{\rm eff}$ vs. [Fe/H], and [Fe/H] vs. $\log g$, with the color scale indicating the value of $S_{\rm rec}$ (see the color bar in Figure 14). The smaller $S_{\rm rec}$ values ($< 0.4$) are displayed in blue, the medium $S_{\rm rec}$ values (0.4-0.6) in green, and the larger $S_{\rm rec}$ values ($> 0.6$) in red. The data points in Figure 14 are drawn in order from the smallest $S_{\rm rec}$ at the bottom to the largest at the top, and hence the data points with larger $S_{\rm rec}$ values are overlaid on the data points with smaller $S_{\rm rec}$ values. As shown in Figure 14, most of the spectra have lower $S_{\rm rec}$ values (blue color). The higher the $S_{\rm rec}$ value, the smaller the distribution range.
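The color classes and the drawing order used for Figure 14 amount to a simple classification followed by a sort. A minimal sketch is given below. The function name, the dictionary keys, and the three data points are hypothetical, and the handling of values exactly equal to 0.4 or 0.6 is an assumption, since the text does not specify it.

```python
def activity_color(s_rec):
    """Color class from the text: < 0.4 blue, 0.4-0.6 green, > 0.6 red."""
    if s_rec < 0.4:
        return "blue"
    if s_rec <= 0.6:
        return "green"
    return "red"

# Synthetic points for illustration only; the keys are hypothetical.
points = [
    {"teff": 5200.0, "logg": 4.50, "S_rec": 0.72},
    {"teff": 5800.0, "logg": 4.40, "S_rec": 0.21},
    {"teff": 5500.0, "logg": 4.45, "S_rec": 0.45},
]

# Draw from smallest to largest S_rec so that the most active points end up on top.
draw_order = sorted(points, key=lambda p: p["S_rec"])
colors = [activity_color(p["S_rec"]) for p in draw_order]
print(colors)
```

Passing the points to a plotting routine in this order reproduces the overlay behavior described above, because later points are painted over earlier ones.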
From the distribution diagrams of $S_{\rm rec}$ with respect to $\log g$ and [Fe/H] (Figures 13b and c), it can be seen that the distribution morphology of the sample with $S_{\rm rec} > 0.6$ is different from that of the sample with $S_{\rm rec} < 0.6$. That is, the

Figure 1. An example of a LAMOST LRS spectrum of a solar-like star. The spectral lines commonly used for analyzing stellar chromospheric activity (Ca II H&K, Hα, and Ca II IRT) are labeled. The Ca II H&K lines employed in this work are highlighted in red. The obsid (observation identifier) and FITS file name of the spectrum are shown in the plot for reference.

Figure 2. Scatter plot of SNR$_g$ vs. SNR$_r$ for the spectra in the LAMOST LRS AFGK Catalog with $T_{\rm eff}$ in the range of 4800-6800 K. The black dots represent the high-quality spectra with SNR$_g \geq 50.00$ and SNR$_r \geq 71.43$. The gray dots represent the spectra with SNR$_g$ or SNR$_r$ below the thresholds. The dashed line indicates the ratio SNR$_r$/SNR$_g$ = 10/7.
Figure 3. $T_{\rm eff}$-$\log g$ diagram of the spectra contained in the LAMOST LRS AFGK Catalog. The black solid line is defined by Equation (1) and separates the samples of main-sequence stars (beneath the line) and giant stars (above the line). The sample with SNR below the thresholds (see criterion 1 in Section 3) is shown in gray. The sample with SNR above the thresholds is shown in color: the main-sequence stars with 4800 K ≤ $T_{\rm eff}$ ≤ 6800 K (solar-like candidates) are shown in green, the other main-sequence stars ($T_{\rm eff}$ < 4800 K or $T_{\rm eff}$ > 6800 K) in blue, and the giant stars in orange. The position of the solar $T_{\rm eff}$ and $\log g$ is indicated with a '⊙' symbol.

4.1. Correction for Wavelength Shift Introduced by Radial Velocity

Figure 4. Number density distribution of all the selected high-quality LAMOST LRS spectra of solar-like stars in the $T_{\rm eff}$-$\log g$ parameter space.

Figure 6. Histograms of the uncertainty values of (a) $T_{\rm eff}$, (b) $\log g$, (c) [Fe/H], and (d) $V_r$ for all the selected high-quality LAMOST LRS spectra of solar-like stars. The vertical dashed line in each plot indicates the peak position of the uncertainty distribution, and the value of the peak position is labeled.

Figure 7. Scatter plot of $S_L$ vs. $S_{\rm MWO}$ for the 65 stars in common between the selected LRS spectra of solar-like stars in this work and the $S_{\rm MWO}$ catalog given by Duncan et al. (1991). Error bars are displayed for the data points with known uncertainty values. The relationship between $S_L$ and $S_{\rm MWO}$ is fitted with an exponential function (black line; see Appendix A for details of the common stars and the fitting procedure). The gray line indicates the ratio $S_{\rm MWO}/S_L = 1$. Note that the fitted line approaches $S_{\rm MWO}/S_L = 1$ for smaller S-index values.

Figure 8. Distribution histograms of the uncertainties of the $S_{\rm rec}$ index originating from the $V_r$ uncertainty (green line), the discretization in spectral data (blue line), the spectral flux uncertainty (red line), and all uncertainty sources (composite uncertainty; black line).

Figure 9. An example of the spectrum diagrams of Ca II H&K lines in the database, using a LAMOST LRS spectrum with obsid = 54904030. The wavelength range of the diagram is 3892.17-4012.20 Å. The original flux of the spectrum released by LAMOST is shown on the right-hand vertical axis. The left-hand vertical axis is the relative flux normalized by the maximum value of the original flux in the plot. The blue dash-dot line is the original LAMOST spectrum, and the red solid line is the wavelength-shifted spectrum after radial velocity correction. The radial velocity value of the spectrum is given in the upper left area of the diagram. The two yellow rectangular regions on the two sides of the diagram indicate the 20 Å wide R and V pseudo-continuum bands. The two green triangular regions centered at 3969.59 Å and 3934.78 Å indicate the 1.09 Å FWHM triangular bandpasses for the Ca II H line and K line, respectively. Within the triangular regions are the 1 Å rectangular bandpasses, which are shown in yellow. The stellar chromospheric activity parameters derived in this work are displayed in the area just above the spectrum plot. The obsid, FITS file name, data release number, and several stellar and spectroscopic parameters of the LAMOST spectrum are given in the title area of the diagram.

Figure 10. Scatter plots of (a) $S_{\rm rec}$ vs. $S_{\rm tri}$, (b) $S_{\rm tri}$ vs. $S_L$, and (c) $S_{\rm rec}$ vs. $S_L$ based on the values of the activity indexes in the database. The red lines are linear fittings to the correlations between the activity indexes. The formulas for the fitting results are given in the plots. The residuals of the fittings are also given for panels (a) and (c). There are no residuals in panel (b) since the relation between $S_{\rm tri}$ and $S_L$ is directly defined by Equation (8).

Equation (15), from the linear fitting for $S_{\rm rec}$ vs. $S_L$ (see Figure 10c): $S_L = 0.771\,S_{\rm rec} + 0.0059$.

Figure 15. Linear fitting for the values of $S_L$ and $\ln S_{\rm MWO}$ of the common stars. Black dots are the original data as listed in Table 4. The values of $\ln S_{\rm MWO}$ of the common stars are distributed in the range from −2 to 0, which is divided into 20 equal bins (bin step = 0.1) distinguished by horizontal lines. The red dots indicate the median of the $S_L$ values in each bin. The linear fitting is performed for the red dots in the plot, and the fitting result is displayed as a black line. The formula for the fitted line is displayed, with $x$ representing $S_L$ and $y$ representing $\ln S_{\rm MWO}$.

Table 2. Emission flux measures of Ca II H&K lines.
Note: 'Column in Database' is used in Section 5.1. The wavelengths in air are taken from

Table 3. Columns in the catalog of the database.

Note: Columns labeled with a '*' symbol are provided by this work. Other columns are used in this work and are taken from the data release of LAMOST.

Table 4.
Note: The $S_{\rm MWO}$ values of MWO are taken from Duncan et al. (1991). The star names are taken from the online catalog of the MWO data (https://cdsarc.cds.unistra.fr/viz-bin/cat/III/159A). The Gaia DR2 Source Identifiers are taken from the LAMOST LRS AFGK Catalog. The $S_L$ values are taken from the catalog obtained by this work. If a stellar object has more than one S-index record in a catalog, the median of the S-index records is adopted. Some of the uncertainty values are blank because they are not available in the source catalog.