FIRST, SECOND, AND THIRD MASSIVE STARS IN OPEN CLUSTERS

Published 2011 February 7 © 2011. The American Astronomical Society. All rights reserved.
Citation: Alexey Mints 2011 ApJ 729 24. DOI: 10.1088/0004-637X/729/1/24

ABSTRACT

The goal of this paper is to study the possibility of using the first, second, and third most massive stars in open clusters to estimate the total cluster mass and membership. We built estimator functions using numerical simulations and analytical approximations, and studied the precision and error distribution of the obtained estimators. We found that the distributions of the masses of the first, second, and third most massive stars show strong power-law tails at the high-mass end; thus, it is better to use median or mode values instead of averages. We show that the third most massive star is a much better estimator than the first, as it is more precise and less dependent on parameters such as the maximum allowed stellar mass.

1. INTRODUCTION

Unfortunately, in most cases there is no large body of statistics covering star cluster membership, so estimating the cluster mass and the full number of members is not an easy task. In some cases, dynamical mass estimates can be applied when spectroscopic data are available (making use of the virial theorem; see Kouwenhoven & de Grijs 2008), although this method can be imprecise (see Fleck et al. 2006). Another method measures the total brightness of identified members and extrapolates it using some luminosity function (see Bonatto & Bica 2005). However, it is often the case that only a few of the brightest stars are reliably identified as members (Kharchenko et al. 2003), which provides only a tiny amount of information about the cluster.

It is natural to expect more massive stars in more massive clusters. However, assuming a Salpeter (1955) initial mass function (hereafter IMF) with random sampling, one would expect several stars with masses M* ∼ 300 M⊙ in our Galaxy, which is not observed. This discrepancy was discussed by Elmegreen (2000), who presented a relation between the cluster mass and the mass of its most massive star (assuming random sampling from the Salpeter IMF):

Equation (1)

Elmegreen tried to introduce an exponential cut-off function to explain the absence of very massive stars while maintaining random sampling from the IMF. However, this led to a contradiction with the observed mass functions of massive clusters, as a Salpeter-like power law can be traced up to at least 100 M⊙. He proposed several explanations, including a dependence of the IMF on the initial cluster mass.

Kroupa's IMF (Kroupa 2001) has become popular over the past decade as a standard cluster IMF. It is built from several power-law parts:

$\xi(m) = k_i\, m^{-\alpha_i}, \qquad m_i \le m < m_{i+1},$    (2)

with parameters (Kroupa 2001)

$\alpha_0 = 0.3, \quad 0.01 \le m/M_\odot < 0.08,$
$\alpha_1 = 1.3, \quad 0.08 \le m/M_\odot < 0.5,$
$\alpha_2 = 2.35, \quad 0.5 \le m/M_\odot < m_{\mathrm{max}}.$    (3)

From Equation (3) we see that a sharp cut-off was introduced at the high-mass end, although the exact value of mmax is left as a free parameter. Using this version of the IMF, Weidner & Kroupa (2006) reviewed the correlation between the mass of the most massive star in a cluster and the total cluster mass. They concluded that random sampling contradicts observations (for a description of the various samplings, see Section 2). Maschberger & Clarke (2008) discussed the correlation between the number of stars in a cluster and the mass of its most massive member, and found it compatible with random sampling; however, they only examined a small range of cluster masses. These (and many more) papers were recently reviewed in Weidner et al. (2010).

Faustini et al. (2009) recently studied stellar clusters around young high-mass stars. They used Monte Carlo simulations to try to reproduce the properties of observed clusters. Among other results, they found that the distributions for the total cluster mass and for the mass of the most massive star were skewed and suggested using the mode of the distribution instead of its mean or median values.

In this paper, we try to produce cluster mass (Mcl) and membership (N) estimators using the masses of the three most massive members, and we analyze the precision of these estimators. We concentrate on two questions.

  • 1.  
    What data provide the most reliable information on the cluster properties?
  • 2.  
    What is the best method to extract cluster properties from that data?

Having these two goals in mind means that we will neglect at least three very important factors: stellar binarity, stellar evolution, and cluster dynamics. Stellar binarity can show itself directly by altering the IMF (if the process of binary star formation differs from that of single stars) or indirectly through unresolved binaries that alter the luminosity function and our assumptions about the IMF. This problem was studied by Weidner et al. (2009). Stellar evolution is important for old clusters, as their massive stars can evolve into stellar remnants or lose a fraction of their initial mass through stellar winds. Modeling this process requires a set of assumptions and is beyond the scope of this work. Note that Weidner et al. (2010) studied young clusters, which allowed them to neglect stellar evolution. Cluster dynamics leads to mass segregation and to evaporation of the cluster with time. Although the lighter stars are believed to be the most probable candidates for ejection from the cluster, this can also happen to the heavier stars; for more details, see Pflamm-Altenburg & Kroupa (2006) and Fleck et al. (2006). This effect, too, is less important for young clusters.

Of course, it is not possible to weigh stars directly; their masses must be estimated, mostly by measuring their brightness, which is a very uncertain process. The other way to estimate stellar masses is to weigh binary members when the orbit is known, but this raises the problem of the influence of binarity on the IMF mentioned above. Again, as we are interested in the statistical side of the problem, we postpone these astrophysical difficulties to later research.

If we assume all stars in the cluster to be single, then the (initial) mass of each star depends only on the IMF. Here, we consider only one IMF, from Kroupa (2001; see Equation (3)). The value of mmax is still a matter of debate; therefore, several values are considered in this paper (50, 150, and 300 M⊙), with the main focus on mmax = 150 M⊙. Although mmax = 50 M⊙ is not a realistic value, it is included to study how mmax influences the precision of the mass estimators.

The following notation is used throughout the paper: $\bar{m}_1$ and $\tilde{m}_1$ for the average and the median value of m1, respectively (and similarly for m2 and m3). Another useful value is the position of the peak of the m1,2,3 distribution for a given Mcl or N (the mode of the distribution), which we designate $\hat{m}_{1,2,3}$. The Kroupa IMF (see Equation (3)) has an average stellar mass $\bar{m} = 0.36 \,M_\odot$ for mmax = 150 M⊙.
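
As a consistency check, the quoted mean stellar mass can be reproduced by integrating the IMF numerically. The sketch below assumes segment boundaries at 0.01, 0.08, and 0.5 M⊙ with slopes 0.3, 1.3, and 2.35; these values are assumptions consistent with Kroupa (2001) and with the α2 − 1 = 1.35 scaling used later in this paper:

```python
import numpy as np

def kroupa_mean_mass(m_max=150.0):
    """Mean stellar mass of a broken power-law IMF, xi(m) ∝ m**-alpha_i.
    Segment edges and slopes follow Kroupa (2001); treat them as assumptions."""
    edges = np.array([0.01, 0.08, 0.5, m_max])   # segment boundaries [M_sun]
    alphas = np.array([0.3, 1.3, 2.35])          # slope in each segment
    # Enforce piecewise continuity: coefficient of each segment
    # relative to the first one.
    coeff = np.ones(3)
    for i in range(1, 3):
        coeff[i] = coeff[i-1] * edges[i] ** (alphas[i] - alphas[i-1])

    def seg_integral(p):  # integral of m**p * m**-alpha over each segment
        lo, hi = edges[:-1], edges[1:]
        q = p - alphas + 1.0
        return coeff * (hi**q - lo**q) / q

    number = seg_integral(0.0).sum()   # total number of stars (unnormalized)
    mass = seg_integral(1.0).sum()     # total mass (unnormalized)
    return mass / number

print(round(kroupa_mean_mass(), 3))  # → 0.36
```

With these assumed slopes the result is $\bar{m} \approx 0.36\,M_\odot$ for mmax = 150 M⊙, in agreement with the value quoted above.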

2. MODEL

Following Weidner & Kroupa (2006), we used three different methods for generating cluster members.

  • 1.  
    Random sampling. N stars are taken randomly from the IMF, with N ranging from 300 to 10,000.
  • 2.  
    Constrained sampling. Mcl is fixed; then stars are taken from the IMF until their total mass surpasses Mcl. Thus, some spread in N is expected in this sample.
  • 3.  
    Sorted sampling. Mcl is also fixed; then $N^{\prime } = M_{\mathrm{cl}}/\bar{m}$ stars are taken from the IMF. If $M^{\prime } = \sum _{N^{\prime }} m_i$ is smaller than Mcl, then $\Delta N = (M_{\mathrm{cl}}- M^{\prime })/\bar{m}$ stars are added to the cluster, giving a new N' and M', and the procedure is repeated until the cluster mass M' surpasses Mcl. After this, the stellar masses are sorted. If |M' − Mcl| is larger than |Mcl − (M' − m1)|, then the heaviest star is removed from the set.
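
For concreteness, the three schemes can be sketched as follows. This is a minimal illustration, not the exact production code: the IMF slopes and boundaries are assumed (cf. Equation (3)), and the rounding of N' and ΔN is simplified.

```python
import numpy as np

rng = np.random.default_rng(1)

# Broken power-law IMF; slopes and edges are assumed (cf. Kroupa 2001).
EDGES = np.array([0.01, 0.08, 0.5, 150.0])
ALPHAS = np.array([0.3, 1.3, 2.35])

def _segment_weights():
    """Relative probability mass of each IMF segment (continuous pieces)."""
    coeff = np.ones(3)
    for i in range(1, 3):
        coeff[i] = coeff[i-1] * EDGES[i] ** (ALPHAS[i] - ALPHAS[i-1])
    q = 1.0 - ALPHAS
    w = coeff * (EDGES[1:]**q - EDGES[:-1]**q) / q
    return w / w.sum()

W = _segment_weights()

def sample_imf(n):
    """Draw n stellar masses by inverse-CDF sampling within a random segment."""
    seg = rng.choice(3, size=n, p=W)
    u = rng.random(n)
    q = 1.0 - ALPHAS[seg]
    lo, hi = EDGES[seg], EDGES[seg + 1]
    return (lo**q + u * (hi**q - lo**q)) ** (1.0 / q)

def random_sampling(n_stars):
    return sample_imf(n_stars)

def constrained_sampling(m_cl):
    stars = []
    total = 0.0
    while total < m_cl:                 # add stars until the total surpasses M_cl
        m = sample_imf(1)[0]
        stars.append(m)
        total += m
    return np.array(stars)

def sorted_sampling(m_cl, mean_mass=0.36):
    stars = sample_imf(max(int(m_cl / mean_mass), 1))
    while stars.sum() < m_cl:           # top up with Delta_N = (M_cl - M')/mbar stars
        dn = max(int((m_cl - stars.sum()) / mean_mass), 1)
        stars = np.append(stars, sample_imf(dn))
    stars = np.sort(stars)
    # drop the heaviest star if that brings the total closer to M_cl
    if abs(stars.sum() - m_cl) > abs(m_cl - (stars.sum() - stars[-1])):
        stars = stars[:-1]
    return stars
```

Sorted sampling keeps the realized total mass close to Mcl, at the price of preferentially trimming the heaviest star.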

According to Weidner & Kroupa (2006), random sampling is the least realistic model, but it is the easiest to model and to describe analytically.

For each set of parameters (Mcl or N, sampling, mmax), 30,000 clusters were simulated, and for each one five values were saved: cluster mass Mcl, number of stars in the cluster N, and masses of the three most massive stars of the cluster—m1, m2, m3.

The goal is to build a method to find Mcl and/or N, when m1, m2, and m3 are known. It seems natural to find the functions $M_{\mathrm{cl}}(\bar{m}_{1,2,3})$, $M_{\mathrm{cl}}(\tilde{m}_{1,2,3}),$ and $M_{\mathrm{cl}}(\hat{m}_{1,2,3})$ (as well as $N(\bar{m}_{1,2,3})$, $N(\tilde{m}_{1,2,3}),$ and $N(\hat{m}_{1,2,3})$). From here on, they will be called mass estimators (ME): average ME, median ME, and mode ME.

3. ANALYTICS

The probability for the most massive star to have mass m1 ∈ (m, m + dm) can be written (Arnold et al. 1992) as the probability for a given star to have mass in (m, m + dm) multiplied by the probability that all other stars have masses below m and by the number of stars N (because any star can be the most massive one):

$P(m_1 = m)\,dm = N\,\xi(m)\,dm \left[\int_{m_{\mathrm{low}}}^{m} \xi(m^{\prime})\,dm^{\prime}\right]^{N-1},$    (4)

where $m_{\mathrm{low}}$ is the lower mass limit of the IMF.

Of course, m1 should be smaller than mmax; otherwise P ≡ 0.

We can confidently use the part of the Kroupa IMF (see Equation (3)) with $m > 0.5\,M_\odot$, as the most massive stars are usually much heavier than 0.5 M⊙. Substituting Equation (2) into Equation (4) and integrating, we obtain (for a ≠ 1)

$P(m_1 = m) = C N m^{-a} \left[1 - \frac{C}{a-1}\left(m^{1-a} - m_{\mathrm{max}}^{1-a}\right)\right]^{N-1},$    (5)

where C is a normalization constant.

If N is large, we can replace the bracketed term with an exponential (and replace N − 1 by N for simplicity):

$P(m_1 = m) \approx C N m^{-a} \exp\left[-\frac{C N}{a-1}\left(m^{1-a} - m_{\mathrm{max}}^{1-a}\right)\right].$    (6)

The maximum of this distribution (or the mode of distribution) is located at the point

$\hat{m}_1 = \left(\frac{C N}{a}\right)^{1/(a-1)}.$    (7)

For $\hat{m}_1 \ge m_{\mathrm{max}}$, the maximum is obviously at $\hat{m}_1 = m_{\mathrm{max}}$. This puts an upper limit on the cluster mass that can be estimated with this formula: N ≈ 26,000, and thus $M_{\mathrm{cl}}= \bar{m} N \approx 9500 \,M_\odot$. By inverting this equation, we can obtain estimates for N and Mcl from $\hat{m}_1$:

$N = \frac{a}{C}\,\hat{m}_1^{\,a-1}, \qquad M_{\mathrm{cl}} = \bar{m} N = \frac{a \bar{m}}{C}\,\hat{m}_1^{\,a-1}.$    (8)

Note that mmax is hidden within the constant C in these equations, although the dependence is weak.
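
The mode relation $\hat{m}_1 = (C N/a)^{1/(a-1)}$ and its inverse are one-liners in code. In the sketch below the slope a = 2.35 and the normalization C ≈ 0.077 of the $m > 0.5\,M_\odot$ segment are assumptions (C follows from normalizing the full IMF to unity under the slopes assumed earlier in this paper):

```python
# Mode of the m1 distribution and its inverse; a = 2.35 is the high-mass
# IMF slope and C ~ 0.077 the (assumed) normalization of the m > 0.5 M_sun
# segment of the IMF.
A = 2.35
C = 0.077
MEAN_MASS = 0.36  # mean stellar mass of the IMF [M_sun]

def mode_m1(n_stars, m_max=150.0):
    """Most probable mass of the heaviest star in an N-star cluster."""
    return min((n_stars * C / A) ** (1.0 / (A - 1.0)), m_max)

def n_from_mode(m1_hat):
    """Invert: estimate N (and M_cl) from the peak of the m1 distribution."""
    n = (A / C) * m1_hat ** (A - 1.0)
    return n, n * MEAN_MASS
```

With these assumed constants the cap $\hat{m}_1 = m_{\mathrm{max}}$ is reached near N ≈ 26,000, consistent with the limit quoted above.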

For the nth most massive star, if $n \ll N$, we can use the expression

$P(m_n = m) \approx \frac{N^n}{(n-1)!}\,\xi(m)\,\left[Q(m)\right]^{n-1} \exp\left[-N Q(m)\right], \qquad Q(m) = \int_m^{m_{\mathrm{max}}} \xi(m^{\prime})\,dm^{\prime}.$    (9)

Finding average and median values for Equation (6) is not that easy.
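
They are, however, straightforward to obtain numerically. The sketch below assumes the exponential form $P(m_1) \propto N C m^{-a} \exp[-N Q(m)]$, with $Q(m)$ the probability of a star exceeding m; the constants a = 2.35, C ≈ 0.077, and mmax = 150 M⊙ are assumptions carried over from the IMF normalization:

```python
import numpy as np

A, C, M_MAX = 2.35, 0.077, 150.0   # assumed slope, normalization, and cut-off

def m1_pdf(m, n_stars):
    """Approximate density of the most massive star (exponential form)."""
    q_tail = C / (A - 1.0) * (m ** (1.0 - A) - M_MAX ** (1.0 - A))
    return n_stars * C * m ** (-A) * np.exp(-n_stars * q_tail)

n = 1000
m = np.linspace(0.5, M_MAX, 200_000)
dm = m[1] - m[0]
p = m1_pdf(m, n)
p /= p.sum() * dm                          # normalize numerically
mean = (m * p).sum() * dm
median = m[np.searchsorted(np.cumsum(p) * dm, 0.5)]
mode = m[np.argmax(p)]
print(mode < median < mean)                # the heavy tail pushes the mean up
```

For N = 1000 this gives mode < median < mean, the right-skewed ordering discussed in Section 4.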

Building analytical expressions for the other sampling methods is a much more complicated task and is not discussed here.

4. RESULTS

4.1. Random Sampling

The random sampling model has, as a natural parameter, the number of stars in the cluster N. Here, N ranges from 300 to 10,000, with 30,000 clusters being simulated for each value of N.

For each value of N, the distributions of m1, m2, and m3 were calculated. An example of these distributions is shown in Figure 1. From the figure, it can be seen that the theoretical estimates given by Equations (6) and (9) match the data well. Note the long power-law tails of the distributions, especially for m1. This tail leads to significant differences between the average and the median values, making the average much higher. Thus, averages are not well suited for building cluster mass estimators.

Figure 1.

Figure 1. Distribution of m1 (plus signs), m2 (crosses), and m3 (stars) with theoretical estimates from Equations (6) and (9) (long-dashed, short-dashed and dot-dashed, respectively) for N = 1000.

Now the task is to build a method to find Mcl and/or N from the known m1, m2, and m3. We will try to find the functions $M_{\mathrm{cl}}(\bar{m}_{1,2,3}), M_{\mathrm{cl}}(\tilde{m}_{1,2,3}),$ and $M_{\mathrm{cl}}(\hat{m}_{1,2,3})$. These functions for Mcl are shown in Figure 2. They can be approximated with functions of the form:

$f(m) = a\,m^{b}\,(m_{\mathrm{max}} - m)^{c},$    (10)

so these functions rise as power laws for small m and then saturate as m approaches mmax = 150 M⊙.

Figure 2.

Figure 2. Estimators data: dependences of Mcl on m1 (left), m2 (middle), and m3 (right) for random (solid lines), constrained (long-dashed), and sorted (short-dashed) samplings. Top row is for average values, middle is for medians and bottom for the mode.

The parameters of the fits are shown in Table 1. The first column names the function from Equation (10); the remaining columns give the parameters a, b, and c for the average, median, and mode estimators in turn. For $N(\hat{m}_{1,2,3})$, Equation (10) can be used with c = 0 and b = 1.35. For the other estimators, b is always close to α2 − 1 = 1.35 and c is close to −1, although the fixed form f(m) = am1.35(150 − m)−1 is a bad fit. The value of c decreases from f(m1) to f(m3); this is because the values of m3 are much smaller than those of m1 and are well separated from mmax, so they are less affected by saturation. Thus, f(m3) is less sensitive to the value of mmax. It should also be mentioned that a, b, and c are highly correlated, so the values given in Table 1 might not be the only ones giving good fits.

Table 1. Parameters of Fits Based on Average, Median, and Mode Values (see Equation (10))

Function Average Values Median Values Mode Values
  a b c a b c a b c
Mcl(m1) 504.90 1.55 −1.23 512.24 1.30 −0.94 30.09 1.35 0.00
Mcl(m2) 2492.00 1.35 −1.17 623.04 1.34 −0.82 42.32 1.40 0.00
Mcl(m3) 4106.74 1.31 −1.13 865.90 1.34 −0.79 60.48 1.38 0.00
N(m1) 1431.05 1.55 −1.23 1894.15 1.26 −0.97 83.58 1.35 0.00
N(m2) 7091.91 1.35 −1.17 2659.81 1.30 −0.88 117.55 1.40 0.00
N(m3) 11770.94 1.31 −1.14 4241.68 1.31 −0.89 168.00 1.38 0.00

Given these approximations, we return to the initially simulated data to test how good they are. Namely, we substitute m1,2,3 for each cluster into the mass estimators (see Equation (10)) to obtain Mcl(mi) and N(mi), which can then be compared to the real values. This produces distributions of the estimated Mcl(mi) and N(mi); estimation errors can be calculated as |Mcl(mi) − Mcl| and |N(mi) − N|. A sample result for N(mi) is shown in Figure 3 for a cluster with a pre-defined number of stars (N = 1000). Note the large power-law tail on the high-mass side, where N reaches 10^6; it is largest for N(m1) and smallest for N(m3) in all cases. Generally, N(m3) shows a smaller spread than the other estimators, and its distribution also peaks closer to the real value N = 1000.
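
As an end-to-end illustration, the sketch below simulates random-sampling clusters with a simplified IMF sampler (slopes assumed as before), takes the third most massive star of each, and applies the median-based N(m3) fit with the Table 1 parameters. The sampler and cluster count are simplifications, so the numbers are only indicative:

```python
import numpy as np

rng = np.random.default_rng(2)
EDGES = np.array([0.01, 0.08, 0.5, 150.0])
ALPHAS = np.array([0.3, 1.3, 2.35])          # assumed IMF slopes

# Segment weights for a continuous broken power law.
coeff = np.ones(3)
for i in range(1, 3):
    coeff[i] = coeff[i-1] * EDGES[i] ** (ALPHAS[i] - ALPHAS[i-1])
q = 1.0 - ALPHAS
w = coeff * (EDGES[1:]**q - EDGES[:-1]**q) / q
w /= w.sum()

def sample_imf(n):
    """Inverse-CDF draw of stellar masses; n may be an int or a shape tuple."""
    seg = rng.choice(3, size=n, p=w)
    u = rng.random(n)
    qq = q[seg]
    lo, hi = EDGES[seg], EDGES[seg + 1]
    return (lo**qq + u * (hi**qq - lo**qq)) ** (1.0 / qq)

def n_from_m3(m3, a=4241.68, b=1.31, c=-0.89, m_max=150.0):
    """Median-based estimator N(m3) with the Table 1 parameters."""
    return a * m3**b * (m_max - m3) ** c

true_n = 1000
m3 = np.sort(sample_imf((2000, true_n)), axis=1)[:, -3]   # 3rd largest per cluster
estimates = n_from_m3(m3)
print(int(np.median(estimates)))   # close to the true N = 1000
```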

Figure 3.

Figure 3. Distribution of estimates for a number of stars in the cluster N(m1,2,3) (solid, long-dashed and short-dashed lines, respectively) for a cluster with pre-defined 1000 stars. Estimators are based on average (upper left), median (upper right), and mode (bottom) values.

Tables 2 and 3 summarize the relative errors of the mean and the relative dispersions for the various estimators and samplings. The average estimator is the worst one, giving the highest error of the mean value in almost all cases (sometimes up to 23%), with high dispersion. The best one appears to be the median estimator, with errors below 2%. The mode estimator is even worse than the average estimator for random sampling (75% error), but it is better for the sorted one, which is a more realistic sampling. As expected, this behavior is caused by the power-law tails of the distributions, to which the median (and mode) values are less sensitive. There is a high probability for m1 to be close to mmax, where the estimator functions (see Equation (10)) are very sensitive to mi, thus producing higher errors and extremely large dispersions. The mode estimator is free from this effect by definition, as it has no $(m_{\mathrm{max}} - m)^c$ factor.

Table 2. Relative Error (in Percent) of the Mean Value of the Estimated Masses

Value Random Sampling Constrained Sampling Sorted Sampling
  f(m1) f(m2) f(m3) f(m1) f(m2) f(m3) f(m1) f(m2) f(m3)
Median Estimator
N 0.88 0.28 0.41 0.48 0.32 0.27 1.77 1.89 1.90
M 1.37 0.99 1.05 0.36 0.20 0.12 0.36 0.23 0.13
Average Estimator
N 23.23 20.13 15.29 12.44 13.66 11.43 18.09 10.41 6.43
M 23.24 20.14 15.29 12.48 13.69 11.45 19.08 11.99 8.44
Mode Estimator
N 74.97 43.85 25.87 19.44 15.66 12.99 7.41 6.35 4.17
M 74.82 43.73 25.76 21.27 9.29 7.79 7.92 7.63 4.67

Table 3. Relative Dispersion (in Percent) for Mass Estimations

Value Random Sampling Constrained Sampling Sorted Sampling
  f(m1) f(m2) f(m3) f(m1) f(m2) f(m3) f(m1) f(m2) f(m3)
Median Estimator
N 46.02 1.80 0.74 13.56 1.84 0.66 122.50 1.17 0.43
M 39.00 1.62 0.70 12.97 1.77 0.65 111.41 1.13 0.43
Average Estimator
N 264.68 3.38 0.82 35.47 3.11 0.73 1296.57 1.99 0.46
M 259.82 3.34 0.81 34.34 3.04 0.73 547.78 1.00 0.36
Mode Estimator
N 1.08 0.84 0.58 0.41 0.60 0.42 0.77 0.44 0.32
M 1.08 0.84 0.58 0.38 0.56 0.39 0.79 0.46 0.33

The power-law tails of the distributions also cause extremely high dispersions for the estimates based on m1. Dispersions (and errors of the mean) are much smaller for estimators based on m2 and m3, as the slope of the power-law tail is significantly steeper. Relative dispersions are smallest for the mode estimator, but it has a higher relative error of the mean than the median estimator. Note that in most cases estimators based on m3 show the best results in terms of both the error of the mean and the dispersion.

Here we emphasize once again that errors are distributed in a significantly non-Gaussian way in this problem. Using median values minimizes the error for a high proportion of the data, while for a smaller proportion the errors remain large.
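
A toy illustration of this point, with a bulk of well-behaved estimates contaminated by a power-law tail (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
true_value = 1000.0
# Bulk: 10,000 estimates with 5% Gaussian scatter around the truth.
bulk = true_value * (1.0 + 0.05 * rng.standard_normal(10_000))
# Tail: 500 estimates drawn from a heavy power-law (Pareto) distribution.
tail = true_value * (2.0 + rng.pareto(1.1, 500))
sample = np.concatenate([bulk, tail])

print(f"mean:   {sample.mean():.0f}")      # pulled upward by the tail
print(f"median: {np.median(sample):.0f}")  # stays near the true value
```

The median tracks the bulk of the estimates, while the mean is dominated by the few tail members, which is exactly the behavior seen in Tables 2 and 3.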

4.2. Constrained Sampling

We applied almost the same algorithm to the constrained sampling case as to the random sampling case. The only change was that we did not have an analytical formula for $\hat{m}_{1,2,3}$ and therefore had to use fits to the simulated data of the form $f(\hat{m}) = a \hat{m}^b$.

In Figure 4, one can see that the difference between random and constrained samplings is not very large in most cases. The distribution for constrained sampling rises and falls faster than the one for random sampling. The faster decrease at the high end of the distribution for the small cluster (Figure 4, top panel) arises because, as the total mass approaches the desired Mcl during the simulation, massive stars are preferentially rejected from the sample: adding them would make the cluster too massive. Obviously, this effect vanishes for higher Mcl, as can be seen in the bottom panel of Figure 4.

Figure 4.

Figure 4. Normalized distribution of m1 (solid), m2 (long-dashed), and m3 (short-dashed) for constrained sampling (thick lines) and for random sampling (thin lines, same as in Figure 1). Top panel: N = 1000, Mcl ≈ 300 M⊙. Bottom panel: N = 4500, Mcl ≈ 1500 M⊙.

4.3. Sorted Sampling

Sorted sampling should suppress the formation of high-mass stars even more than constrained sampling. This can be seen in Figure 5: at the high-mass end, the distribution of m1 for sorted sampling is almost like that of m2 for random sampling.

Figure 5.

Figure 5. Normalized distribution of m1 (solid), m2 (long-dashed), and m3 (short-dashed) for sorted sampling (thick lines) and for random sampling (thin lines, same as in Figure 1). Top panel: N = 1000, Mcl ≈ 300 M⊙. Bottom panel: N = 4500, Mcl ≈ 1500 M⊙.

4.4. Estimator Reliability

We first compare the various estimators through the data on which they are constructed. Let us return to Figure 2. It is obvious that the curves are very close to each other, which can also be seen from the similarity of the fit parameters a, b, and c (see Equation (10) and Table 1). One might therefore expect that predictions made with different estimators for the same value of m will not differ significantly. This difference can even be smaller than the difference between the estimated and real values, as predictions can deviate from the real value in the same direction. We calculate the average difference as

$\Delta f(m_i) = \left\langle \frac{\left|f_{\mathrm{s}}(m_i) - f_{\mathrm{r}}(m_i)\right|}{f_{\mathrm{r}}(m_i)} \right\rangle,$    (11)

where $f_{\mathrm{r}}$ is the estimator for random sampling and $f_{\mathrm{s}}$ the estimator for the sampling being compared,

where i = 1, 2, 3 and mi runs from 3 to 140 M⊙. This is a measure of how far apart the estimators are on average. The result is shown in Table 4. Note that in most cases Δf(m3) < Δf(m1). Estimators for constrained sampling are much closer to those for random sampling than estimators for sorted sampling are (14%–71% vs. 21%–92%). The reason is, of course, that different samplings give different mi distributions, which are used to produce the estimators; this can be seen in Figure 2 as the distance between the lines. The differences remain large, on the order of 20%, which is much larger than the relative errors of the mean value (see Table 2) and the relative dispersions (see Table 3). The difference is less important for estimators based on m1, as there the relative errors of the mean value and the relative dispersions are comparable to the differences between samplings. For this reason, statistics of the most massive stars are not an efficient tool for distinguishing between the various samplings.
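
A sketch of this comparison for two fitted estimator curves; the first parameter triple is the median Mcl(m3) fit from Table 1, while the second is a hypothetical stand-in for another sampling:

```python
import numpy as np

M_MAX = 150.0

def estimator(m, a, b, c):
    return a * m**b * (M_MAX - m) ** c

def relative_difference(params_1, params_2, m_lo=3.0, m_hi=140.0, n=1000):
    """Mean relative difference (in percent) between two estimator curves,
    averaged over m in [m_lo, m_hi]."""
    m = np.linspace(m_lo, m_hi, n)
    f1, f2 = estimator(m, *params_1), estimator(m, *params_2)
    return 100.0 * np.mean(np.abs(f1 - f2) / f1)

random_fit = (865.90, 1.34, -0.79)   # median Mcl(m3), Table 1
other_fit = (700.00, 1.36, -0.80)    # hypothetical alternative-sampling fit
print(round(relative_difference(random_fit, other_fit), 1))
```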

Table 4. Relative Difference (in Percent; see Equation (11)) between Estimators for Random and Other Samplings

Sampling Δf(m1) Δf(m2) Δf(m3)
Average Estimator
Constrained 24 17 18
Sorted 92 38 21
Median Estimator
Constrained 20 14 14
Sorted 89 55 45
Mode Estimator
Constrained 71 66 67
Sorted 34 45 47

Thus, it is crucial to know which sampling method is more realistic, although this is still under discussion (see Section 1). It is also important to note that current estimates of both mmax and Mcl can have errors as high as 50%.

Another check of the reliability of the obtained estimators is to apply them to the "wrong" data set: for example, using the median estimator from sorted sampling (see Section 4.3) to estimate the masses for the random one (see Section 4.1), or using an estimator from a data set with mmax = 150 M⊙ to estimate the masses for the mmax = 300 M⊙ data set, etc. An example is shown in Table 5. Here, mmax was varied: a data set with one mmax was used to build an estimator function (the source data set), which was then applied to a data set with another mmax (the target data set). Note that mmax in the target data set cannot exceed that of the source data set, as the function from Equation (10) would be undefined. The diagonals of this table (i.e., values with equal source and target data sets) are the same as in Columns 8 and 10 of Table 2. As expected, errors increase with the difference between the source and target data sets' mmax. Estimators based on m3 remain better in almost all cases, but the median estimator is as sensitive to mmax variations as the average one, showing errors of up to 75%. On the other hand, relative dispersions of the estimates decrease rapidly with the mmax difference, as the probability of having m1 close to mmax, where large errors occur, becomes smaller. Recall that values of mmax smaller than 150 M⊙ are not realistic; mmax = 50 M⊙ was introduced only to study the effect over a large range of parameter values.

Table 5. Relative Error for Estimators Applied to a Data Set Different from the One the Functions Were Built on (Sorted Sampling Only)

mmax (Source Data Set) mmax (Target Data Set)
  Average ME Median ME
  300 150 50 300 150 50
  Estimators Based on m1
300 26.37 51.03 81.12 2.10 34.47 74.66
150 ... 18.09 77.91 ... 1.77 70.67
50 ... ... 16.53 ... ... 2.36
  Estimators Based on m3
300 7.28 21.97 56.21 2.12 17.44 53.82
150 ... 6.43 51.76 ... 1.90 49.36
50 ... ... 5.43 ... ... 2.30

5. CONCLUSIONS

Several estimators of the cluster mass based on the first, second, and third most massive stars were defined in this paper, and their precision was estimated. Estimators based on the mass of the third most massive member, m3, gave the best results (approximately three to five times better than those based on m1), and are less dependent on the maximum allowed stellar mass mmax and on the assumed mode of star formation (the algorithm for picking masses from the IMF). We also found that it is better to build estimators on the median or mode values of mi instead of the average values, because the strong power-law tails of the mi distributions make the average a less representative parameter.

The most important parameter is the assumed algorithm describing how the cluster mass is distributed among stars.

However, as several astrophysical effects were not taken into account, these results cannot yet be applied to most real clusters; including stellar evolution in this model is a subject for further work. Here, it was shown that m3 is a good candidate for building mass estimators. Error analysis revealed a power-law tail in the error distribution, and we showed that median (or mode) values are much better bases for mass estimators than average values.
