Paper The following article is Open access

OCT4 expression in human embryonic stem cells: spatio-temporal dynamics and fate transitions

, , , , , , and

Published 16 February 2021 © 2021 The Author(s). Published by IOP Publishing Ltd
, , Citation L E Wadkin et al 2021 Phys. Biol. 18 026003 DOI 10.1088/1478-3975/abd22b

1478-3975/18/2/026003

Abstract

The improved in vitro regulation of human embryonic stem cell (hESC) pluripotency and differentiation trajectories is required for their promising clinical applications. The temporal and spatial quantification of the molecular interactions controlling pluripotency is also necessary for the development of successful mathematical and computational models. Here we use time-lapse experimental data of OCT4-mCherry fluorescence intensity to quantify the temporal and spatial dynamics of the pluripotency transcription factor OCT4 in a growing hESC colony in the presence and absence of BMP4. We characterise the internal self-regulation of OCT4 using the Hurst exponent and autocorrelation analysis, quantify the intra-cellular fluctuations and consider the diffusive nature of OCT4 evolution for individual cells and pairs of their descendants. We find that OCT4 abundance in the daughter cells fluctuates sub-diffusively, showing anti-persistent self-regulation. We obtain the stationary probability distributions governing hESC transitions amongst the different cell states and establish the times at which pro-fate cells (which later give rise to pluripotent or differentiated cells) cluster in the colony. By quantifying the similarities between the OCT4 expression amongst neighbouring cells, we show that hESCs express similar OCT4 to cells within their local neighbourhood within the first two days of the experiment and before BMP4 treatment. Our framework allows us to quantify the relevant properties of proliferating hESC colonies and the procedure is widely applicable to other transcription factors and cell populations.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Human pluripotent stem cells (hPSCs), encompassing human embryonic stem cells (hESCs) and human-induced pluripotent stem cells (hiPSCs) self-renew indefinitely while maintaining the property to give rise, under differentiation conditions, to almost any cell type in the human body [13]. The maintenance and control of the pluripotency and differentiation trajectories of hPSCs is central to their touted applications in drug discovery, and regenerative and personalised medicine [410].

The control and optimisation of pluripotency across colonies is difficult due to its complex inter-regulatory dynamics. This regulatory network consists of a core set of pluripotency transciption factors (TFs) expressed to maintain self-renewal and suppress differentiation [11, 12]. Amongst the most important TFs that preserve the undifferentiated state in hESCs are NANOG, OCT4 and SOX2 [2, 11, 12]. During development, these TFs become expressed at different levels, initiating differentiation towards specific cell lineages following signalling cues [13]. Pluripotency is also affected by external factors: the local environment [14, 15], interactions with neighbours [16, 17], the cell cycle [13] and the substrate [18]. On the colony scale, complex collective effects of pluripotency emerge. In the presence of restrictive geometries, differentiated cells form bands occurring around colony edges [17, 19].

Several experiments have been performed to quantify the behaviour and joint influence of each TF in the pluripotent cell [2023]. Their results indicate that the expression of the TF proteins are highly variable both at the single-cell (time) and colony-level (space) due to intrinsic noise in gene expression, interactions at the molecular level and the randomness present in the extracellular environment [2426]. Thus, heterogeneity and stochasticity are inherent properties of pluripotent stem cell populations [25, 27, 28] that hinder their clonal expansion in culture [2932].

OCT4, in conjunction with other core members of the pluripotent regulatory network, activates both protein-coding genes and non-coding RNAs necessary to maintain pluripotency [22]. In mouse embryonic stem cells (mESCs), OCT4 expression is relatively uniform with a high correlation between its levels and pluripotency [33, 34]. In hESCs OCT4 also interacts with the BMP4 (bone-morphogenetic protein) pathway. Under standard culture conditions BMP4 acts as a morphogen [16] and defines several cell fates: in the presence of BMP4, high levels of OCT4 promote mesendoderm differentiation, while low levels result in extra-embryonic ectoderm and primitive endoderm differentiation [22, 35, 36].

Fluorescently-labelled hESCs are useful tools for the in vitro tracking, visualisation and real-time monitoring of the hESCs without the need for cellular fixation. These single-cell measurements can be used for accurate quantification of the protein changes in time and space. Recent studies of the expression of OCT4 in hESCs bearing the OCT4-mCherry reporter [37] indicate inheritance of similar protein levels from mother to daughter cells, with the OCT4 levels established in new daughter cells being predictive of long-term cellular states [38]. Although the daughter cells continue to be very similar to their predecessors, in the long term, further variations get amplified with consecutive cell divisions and thus the heterogeneous hESC population is established by incremental divergences [39]. These divergences, caused by regulatory mechanisms, noise in the protein expression, etc, create paths through all possible cell states (fates) which result in distinctive patterns in hESC colonies under differentiating conditions [17, 40].

These observations raise questions about the temporal behaviour of the OCT4 signal and how and when the cell fates get established within a hESC colony. In this paper we build upon the previously published work of reference [38] which considers time-lapse fluorescent measurements of the OCT4-mCherry reporter levels in cells in a growing hESC colony. Although the dynamics of OCT4 are complex, affected by many genetic factors and closely regulated by the other TFs [2, 35, 41], here we isolate autonomous properties of OCT4 to facilitate the development of descriptive mathematical models.

We describe quantitatively the fluctuations in OCT4 in relation to cell fate and the addition of the differentiation agent BMP4. We quantify the self-regulation of OCT4 through anti-persistence and characterise it within the diffusion framework. Using custom-designed software, we reconstruct the hESC colony spatio-temporally, examine the establishment of the hESCs pro-fates (pluripotent, differentiated and unknown) and report the transition probability matrices of the hESCs between the different pro-fates at mitosis. These matrices result in the stationary distributions of cell fates that get established in the hESC colony in the presence and absence of BMP4.

From our spatial analyses of the hESCs positions within the colony, we calculate the time at which the cells segregate in terms of their pro-fates. This gives a time-frame for the emergence of pre-patterning in a hESC colony. Finally, we quantify the 'cooperation' between nearest hESCs, defined in terms of a dissimilarity metric between their OCT4 values. Our quantitative analyses, along with reference [38] provides a transferable basis for the comparison to other TFs and for developments in mathematical and statistical models of pluripotency.

2. Materials and methods

2.1. Experiment

The experiment was carried out by Purvis Lab (University of North Carolina, School of Medicine), and published in reference [38]. The experiment details are described thoroughly in reference [38]. In the following, we give a brief description to facilitate the reading of our paper. We show a workflow diagram to illustrate our steps in analysing the experimental data set in figure 1.

Figure 1.

Figure 1. Workflow diagram showing the steps taken to analyse the experimental dataset from reference [38].

Standard image High-resolution image

The OCT4 levels (time-lapse mean OCT4-mCherry fluorescence intensity) in a hESC (H9) colony were determined over multiple generations until their differentiation to extra-embryonic mesoderm. Cells were live-imaged for 68 h. The experiment begins at texp = 0 h and ends at texp = 68 h. At texp = 43 h, the hESCs are treated with (100 ng ml−1) bone-morphogenetic protein 4 (BMP4) to induce their differentiation towards distinct cell fates. Therefore, time texp < 43 h indicates the absence, and texp > 43 h the presence of BMP4.

The data is provided as time series and includes the position (x(t), y(t)), radial position within the colony (r(t)) and OCT4 immunofluorescence intensity values (Ω(t)) of each cell at intervals of Δt = 5 min. The initial and final times in the time series denote the cell birth and division, respectively. The OCT4 values are reported in arbitrary fluorescence units (a.f.u.), the intensity values in terms of the number of photons detected by the microscope from the specimen.

To classify the cells as either self-renewing (pluripotent) or differentiated, the expression levels of CDX2 were quantified at texp = 68 h. The CDX2 levels, along with the final OCT4 expressions were used to classify the cells according to their pro-fates using a two-component mixed Gaussian distribution, that is, those cells belonging exclusively to a pluripotent (self-renewing) or differentiated state. A remaining group of uncatalogued cells were classified in an unknown category. Using these pro-fates, the cell population was traced back in time, spanning multiple cell divisions, with each earlier cell labelled according to this pro-fate. We use the letters P, D and U to denote the pluripotent, differentiated and unknown pro-fates, respectively.

2.2. Quantitative analysis

The quantitative analysis in both reference [38] and this manuscript were performed using MATLAB®. We give full details of the quantitative methods in the supplementary information (https://stacks.iop.org/PB/18/026003/mmedia).

3. Results

3.1. Colony summary

The colony begins from 30 cells and grows over 68 h (817 time frames) to 463 cells, with 1274 cell cycles elapsing within this time. The number of cells, N, considered in each cell pro-fate category, pre- (texp < 43 h) and post-BMP4 (texp > 43 h) addition is given in table 1. An analysis of the number of cells in the colony over time, N(texp), is given in the supplementary information (figure S1). The whole colony follows exponential growth, with a doubling time of 16 ± 0.01 h, as noted in reference [38] and consistent with other reports [42, 43]. When split by fate, the pluripotent cells proliferate significantly faster than the differentiated cells.

Table 1. The number of cells, N, in each of the cell fate (pluripotent P, differentiated D and unknown U) and pre- and post-BMP4 categories. A post-BMP4 is any cell present at texp = 43 h or later. There are 1274 cell cycles in total.

N Pre-BMP4Post-BMP4All times
P96422518
D22111133
U112511623
All fates23010441274

Spatial reconstruction allows us to visualise the evolution of the colony as in video S1 (cells labelled by pro-fate) and video S2 (cells labelled by OCT4). Corresponding snapshot images at the beginning of treatment with BMP4 and at the end of the experiment are shown in the supplementary information, figure S2. Snapshots of the colony colour-coded by OCT4 intensity are shown in figure 2(a). Spatial clustering of pro-fate and OCT4 can be seen.

Figure 2.

Figure 2. (a) Snapshots of the colony at texp = 0 h, 20 h, 43 h (addition of BMP4) and 68 h (final time). The cells are coloured according to their OCT4 intensity levels. Note that the circles are not indicative of cell or nucleus size. (b) The distributions of OCT4 for pluripotent (filled blue circles), differentiated (open red squares) and unknown cells (open turquoise diamonds), pre- and post-BMP4. (c) OCT4 values for all sister pairs (564 pairs) at the start (Ωi (t0)) and end (Ωi (tn )) of their cell cycles. The lines of best fit (orange solid lines) with standard errors in predicting a future observation (dashed lines) are Ω1(t0) = (1 ± 0.003)Ω2(t0) with R2 = 0.98 and Ω1(tn ) = (0.97 ± 0.2)Ω2(tn ) with R2 = 0.78 for initial and final OCT4, respectively.

Standard image High-resolution image

The measurement of the OCT4 signal at 5 min intervals, results in a set of evenly sampled discrete observations for each cell, Ω(t0), Ω(t1), ..., Ω(tn ), where t0 and tn denote the times of cell birth and division, respectively. The values of tn range from 15 min to 30 h across the population. The OCT4 time series for a cell at the beginning of the experiment and its descendants is shown in figure 1.

The distributions of all measured OCT4 values are shown in figure 2(b). The mean, median and kurtosis of each distribution are given in table 2. Pro-differentiated cells pre-BMP4 show a visibly skewed distribution, with a higher preference of lower OCT4 expressions than the pluripotent cells, fitting with the pre-determined fate choice identified in reference [38]. Post-BMP4, all fates also show a reduction in their OCT4 expression, with the effect seen most strongly in the differentiated cells. This reduction in OCT4 in the differentiated cells is expected, but it is interesting that the same effect (to a lesser extent) is also present in the other fate groups.

Table 2. The mean ± standard deviation, median (lower quartile upper quartile) and kurtosis of the OCT4 distributions for each pro-fate (pluripotent P, differentiated D and unknown U) shown in figure 2(b).

  Pre-BMP4Post-BMP4
PMean1510 ± 2301120 ± 270
Median1500 [1280 1730]1090 [930 1260]
Kurtosis3.35.2
DMean1130 ± 240730 ± 320
Median1100 [960 1290]720 [450 990]
Kurtosis3.32.1
UMean1280 ± 300910 ± 350
Median1280 [1090 1490]940 [670 1150]
Kurtosis3.03.1

The corresponding temporal probability density functions for OCT4 over time are given in the supplementary information, figure S3. We also present the average OCT4 expression with time at the colony level in figure S4. Both show the reduction of OCT4 expression in all pro-fates with time.

The analysis in reference [38] shows that upon cell division the OCT4 ratio between sister cells is centred around 1:1, meaning that although asymmetric inheritance does occur, on average sister cells start with similar levels of OCT4. We consider the correlation in the OCT4 time series for sister cells over their lifetimes by calculating the correlation coefficient, ρ. Note that for sister pairs with unequal cell-cycle times, the correlation is calculated for the time series of the length up to the minimum cell cycle time. Each OCT4 time series was de-trended to account for any confounding similarities in sister cells that may be present due to their shared environment. The distributions of ρ are shown in the supplementary information, figure S5. The mean correlations, $\bar{\rho }$, are given in table 3 and show moderate positive correlations across all categories.

Table 3. The mean correlation, $\bar{\rho }\enspace {\pm}$ the standard deviation (standard error) between pairs of sister cells.

$\bar{\rho }$ Pre-BMP4Post-BMP4All times
P0.5 ± 0.3(0.02)
D0.5 ± 0.2(0.04)
U0.5 ± 0.3(0.02)
All0.3 ± 0.2(0.03)0.5 ± 0.3(0.01)0.5 ± 0.3(0.01)

All pro-fates show $\bar{\rho }=0.5$ (±0.2, ±0.3 and ±0.3 for pluripotent, differentiated and unknown cells, respectively). Sister cells pre-BMP4 show a weaker correlation than those post-BMP4, with $\bar{\rho }=0.3{\pm}0.2$ and $\bar{\rho }=0.5{\pm}0.3$, respectively. This suggests that BMP4 treatment exacerbates the similarities in sister cell OCT4. These results quantify the regulation between sister cells and further illustrate that this regulation is systematic and importantly, still present when confounding external trends are removed.

We can also quantify how this correlation between sister cells varies through their lifetimes. The initial (Ωi (t0)) and final (Ωi (tn )) OCT4 values for all sister cells are shown in figure 2(c). The initial values follow a close relationship (consistent with the OCT4 ratio splitting distribution in reference [38]), with correlation ρ = 0.99 and the trend line Ω1(t0) = (1 ± 0.003)Ω2(t0). Note that the labelling of the sister cells as cell 1 and cell 2 is arbitrary. By the end of their respective lifetimes, the distribution spreads, with ρ = 0.78 and a line of best fit Ω1(tn ) = (0.97 ± 0.2)Ω2(tn ).

Next, we consider the behaviour of OCT4 from the initial point of possible asymmetric inheritance to the final time before mitosis and characterise how this drift of similarity shown in figure 2(c) occurs.

3.2. Temporal OCT4 dynamics

In this section we quantify the temporal behaviour of OCT4 dynamics on the cellular level over the course of a cell lifetime. We consider the variability between discrete time-steps and quantify the self-regulatory behaviour of OCT4 using several methods.

3.2.1. Variability at short time scales

Even small fluctuations in TF abundance impact cell fate [44], with both high and low TF values resulting in differentiation [45, 46]. The mathematical quantification of TF fluctuation will facilitate the description of pluripotency over discrete time-steps, fitting for time-lapse experiments such as the one considered here [38]. We denote the change in the intra-cellular OCT4 abundance between the 5 min intervals as ΔΩ = Ω(ti+1) − Ω(ti ). We consider the scaled distributions, ΔΩ/Ω(ti ), shown in figure 3(a), to account for the reduction in expression post-BMP4 addition. The scaled fluctuations are centred around zero, however, the distributions are not normal (confirmed by the Kolmogorov–Smirnov and Shapiro–Wilk tests at the 95% confidence level) due to a narrower and steeper peak. A Laplace distribution, Laplace (μ, b), better fits the experimental data in all cases, with the parameters given in table 4. The distributions of ΔΩ are shown in the supplementary information, figure S6 with fittings in table S1.

Figure 3.

Figure 3. (a) Distributions of the change in OCT4 between time frames (ΔΩ/Ω(ti )). Solid lines show the Laplace distribution fittings, given in table 4 and dashed lines show the Normal distribution fittings. (b) The Poincaré maps for the OCT4 signal. The colour bar shows the normalised relative frequency of the points.

Standard image High-resolution image

Table 4. The parameters from the Laplace (μ, b) fittings to the ΔΩ/Ω(ti ) distributions shown in figure 3(a).

Laplace (μ, b)Pre-BMP4Post-BMP4
P(−4.6 × 10−4, 0.035)(−2.4 × 10−4, 0.030)
D(−7.1 × 10−4, 0.041)(−4.6 × 10−3, 0.032)
U(−7.8 × 10−4, 0.035)(−2.5 × 10−3, 0.030)

Post-BMP4 addition, the distributions for all pro-fates become narrower, with the parameter b showing reductions of 14%, 22% and 15% for pluripotent, differentiated and unknown cells, respectively. This narrowing is due to a preference of smaller changes in OCT4 provoked by the differentiation agent. This could be driven by induced selectivity caused by the BMP4 addition (i.e. the BMP4 causes a systematic change, producing a preference for smaller ΔΩ/Ω(ti ) values), or it could suggest some collective self-regulation. It is expected, since the differentiated cells are most affected by the BMP4, that this group would show the biggest reduction in variation and therefore the strongest regulation in their OCT4 values. Averaged over all cells, $\bar{{\Delta}{\Omega}/{\Omega}\left({t}_{i}\right)}$, (supplementary information, figure S6) shows that although the trend is strongest post-BMP4, there are periods of sustained negative fluctuations from as early as 5 h.

We can also consider the self-similarity of the OCT4 series using the Poincaré map [47, 48]. For each cell, its OCT4 time series can be plotted against itself with one time-step delay, i.e. Ω(ti ) against Ω(ti+1), shown in figure 3(b). By assessing qualitatively the shape formed by the return map, we observe changes in the distribution of points pre- and post-BMP4. Even pre-BMP4 addition, the differentiated cells show less variation compared to the pluripotent cells, with the addition of BMP4 exacerbating this effect. Quantitatively this can be described by fitting an ellipse to the shape formed by the data plots, given in the supplementary information table S2.

This information quantifies step changes in OCT4, suggesting Laplace distributions to simulate variation and showing that the addition of BMP4 provokes tighter self-regulation across all cell fates. It also highlights that even between small time increments such as these, the fluctuations post BMP4 should be considered separately for cells of different fates, not only in terms of their average, as expected, but also their variability.

3.2.2. OCT4 self-regulation

To investigate the self-regulation and internal memory of OCT4 during a cell cycle, we consider three related approaches, the Hurst exponent, the autocorrelation function and diffusion analysis.

The Hurst exponent. The Hurst exponent, 0 < H < 1 is a measure of the long term memory of a time series [49, 50]. If a series is Brownian, H = 0.5, with mutually statistically independent fluctuations. If the series is persistent, H > 0.5, and at each time-step the series is more likely to fluctuate in the same direction as in the previous step, i.e. if in the last time-step there was an increase, it is more likely there will be another increase during the next time-step. For anti-persistence, H < 0.5, the series is less likely to fluctuate in the same direction as the previous step.

We calculate the Hurst exponent for all cells which live longer than 50 time frames (4.16 h). The distributions of all H values are shown in figure 4(a) with the average Hurst exponents, $\bar{H}$, given in table 5. In all cases, the Hurst exponents are significantly less than 0.5, showing moderate anti-persistence indicative of intra-cellular OCT4 self-regulation. Note that this describes the behaviour only on the time scale of a single cell lifetime and that there are other longer-term behaviours also influencing expression [38]. There is no significant difference in H before and after the BMP4 addition in all cell fates (confirmed by the Kolmogorov–Smirnov test at the 95% level) suggesting this aspect of the self-regulatory behaviour is inherent to the cells and unchanged by the differentiation stimulus.

Figure 4.

Figure 4. (a) The Hurst exponent, H, for pluripotent cells (filled blue circles), differentiated cells (open red squares) and unknown cells (open turquoise diamonds). The black lines show H = 0.5 for Brownian fluctuations. (b) Typical autocorrelations showing (i) a period of anti-correlation before settling at zero (seen in 51% of cells, blue solid), (ii) two periods of anti-correlation followed by correlation (seen in 28% of cells) and (iii) a period of anti-correlation followed by a period of correlation (seen in 14% of cells).

Standard image High-resolution image

Table 5. The mean Hurst exponent $\bar{H}$ with (standard deviation, standard error) for all cell categories.

$\bar{H}$ Pre-BMP4Post-BMP4
P0.37 (0.08 0.008)0.37 (0.09 0.004)
D0.42 (0.08 0.02)0.39 (0.09 0.009)
U0.38 (0.08 0.007)0.39 (0.09 0.004)

Autocorrelation. The anti-persistence can be further explored by considering the autocorrelation of the OCT4 time series. The autocorrelation is the correlation of a time series with itself at increasing time lags, hence −1 ⩽ C ⩽ 1 where C = 0 signifies no correlation, C < 0 a negative correlation and C > 0 a positive correlation. The decay of the autocorrelation to zero (scaled to cell lifetimes) is presented in the appendix in reference [38] and here we extend this to quantify the periods of anti-correlation and consider the periodic nature of the autocorrelation.

Typical autocorrelation functions are shown in figure 4(b). The majority of the cells follow an autocorrelation similar to the one shown in figure 4(b)(i) (Cell ID 46), with initial correlation declining to zero, followed by a period of anti-correlation before the autocorrelation settles at zero. There are, however, other behaviours evident. Some cells show several lag intervals of anti-correlation, as in figure 4(b)(ii) (Cell ID 14), with others showing a positive correlation at a longer time lag before settling at zero, as in figure 4(b)(iii) (Cell ID 43).

Anti-correlation for a time lag of at least 1 h duration is seen in 99% (1255/1274) of cells, and for at least 5 h in 86% (1090/1274) of cells. Of the cells with at least 1 h anti-correlation visible, 44% show a second period of correlation near the end of their lifetimes (as in figure 4(b)(ii) and (iii)). Out of these, 65% show one period of anti-correlation (as in figure 4(b)(iii), 31% two periods (as in figure 4(b)(ii), and the remaining 4% three or more. For the 57% of cells with no second period of correlation, 90% of cells show one period, 8% two periods and 2% three or more periods of anti-correlation. There is no correlation between the number of periods of anti-correlation and cell fate or the cell's average position in the colony.

The earliest time anti-correlation occurs, tAC, can be calculated for each individual cell. The distribution of tAC for cells with at least 1 h anti-correlation is shown in the supplementary information, figure S7, showing that in all cells anti-correlation has begun by 8 h into the cell cycle. This could be due to the memory effects or the down-regulation of the PTF which occurs prior to mitosis [51, 52]. The distributions of the percentage lifetime a cell spends in an anti-correlated state are given in figure S7. Across all groups cells spend 40%–80% (with a mean of 60%) of their cell cycle expressing anti-correlated OCT4.

The oscillatory nature and decay of the autocorrelation can be captured by the function C = cos(2πt/a)et/b [53], where a and b are constants and t is time (note that this periodicity in the autocorrelation does not necessarily imply periodicity in the time series). These fittings are shown in figure S8 for 25 randomly selected cells. This quantifies the temporal, periodic decay in the autocorrelation, with the parameter a representing the time-scale of the periodicity, and b the time-scale of the decay (the correlation decay time). Histograms of a and b for all 1274 cells are shown in figures S9 and S10. For all fates, the medians are 11.7 h and 3.0 h, with 90th percentiles of 30 h and 7 h for a and b respectively. This quantifies the characteristic time-scale of the periodicity and the correlation decay time as less than 7 h in 90% of cases.

The correlation time is defined as $\tau ={\int }_{-\infty }^{\infty }C\left(t\right)\mathrm{d}t$, with a mean across all cells of $\bar{\tau }\approx 0{\pm}0.002\enspace $ h. The distribution of all τ is shown in figure S7. The mean autocorrelation (figure S11) decreases to zero at around 3 h, followed by a period of negative autocorrelations between approximately three and 12 h. By 13 h, the mean autocorrelation settles at zero, showing no internal memory past this time. These observations are robust to cell fate and the equivalent autocorrelations for pluripotent and differentiated cells are shown in figure S12. This shows that during a cell cycle, there is long-term memory in the OCT4 expression up to around 12 h, but the nature of the effect differs over this time with initial correlation being replaced by anti-correlation. Notably, the mean autocorrelation is not fully described by cos(2πt/a)et/b , as the full scale of the anti-correlation is not captured.

Diffusion analysis. The theory of diffusion and random walks is widely used across many biological applications, including stem cells and so it is useful to quantify the OCT4 behaviour within this framework [5459]. After the cell division with asymmetric inheritance of OCT4 there is a short period of increased fluctuations [38]. Therefore here we consider each OCT4 time series from half an hour after cell division. The mean square difference of OCT4 over time, MSD(t), can be calculated as ⟨|Ω(ti ) − Ω(t0)|2⟩, where the angular brackets denote the average across all cells in the group considered.

The MSD for each pro-fate, pre- and post-BMP4 between 0 and 12 h is shown in figure 5. In all cases, there is visible sub-diffusive behaviour. The power law fits MSD = βtα are shown in figure 5 and the parameters α and β are given in tables 6 and 7. This sub-diffusivity is consistent with the anti-persistence. The effect of this on sister cell's OCT4 is presented in the supplementary information, figures S13 and S14. The differentiated and unknown (pre-BMP4) cells have α ≈ 1 showing diffusion at early times.

Figure 5.

Figure 5. The mean-square difference up to 12 h for each pro-fate, pre- and post-BMP4. The black lines show the power law fits, MSD = βtα , with the parameters α and β specified in tables 6 and 7.

Standard image High-resolution image

Table 6. The parameter α in the fitting to the mean square displacement with time, MSD = βtα for all cell pro-fates.

α Pre-BMP4Post-BMP4
P0.59 ± 0.030.58 ± 0.03
D0.88 ± 0.131.00 ± 0.004
U1.01 ± 0.050.80 ± 0.01

Table 7. The parameter β in the fitting to the mean square displacement with time, MSD = βtα for all cell pro-fates.

β Pre-BMP4Post-BMP4
P42000 ± 270033000 ± 2000
D54000 ± 300013000 ± 900
U32000 ± 100023000 ± 400

This further quantifies the self-regulatory behaviour of OCT4 within the diffusion framework, a fundamental starting point for many mathematical models. In the next sections we analyse the fate transitions and spatial patterning within the colony.

3.3. Cell fates and spatial properties

3.3.1. Fate transitions

How a stem cell divides to give rise to two daughters is critical for the maintenance and expansion of the culture. Stem cells may undergo both symmetric and asymmetric cell divisions, guided by several molecular, cellular, and environmental cues [60, 61]. During symmetric cell division, a stem cell generates two stem cells or two differentiated cells. The former is highly desired as it leads to the maintenance of the pluripotent state in a hESCs colony. Asymmetric divisions result in only one daughter inheriting the fate of the mother cell [62]. This is the main process driving the homeostatic growth of tissues in an organism [63]. The quantification of the transition dynamics between the cell fates is of utmost importance to emulate this behaviour with experiments in-silico.

The classification of the cells in terms of their pro-fates at the end of the experiment allows us to study the transitions between the different pro-fates at mitosis [38]. We define our notation as follows: let m be the fate of the mother cell and d1, d2 the fates of its two daughters. We then indicate with $\left[m,{d}_{1},{d}_{2}\right]$ the six possible outcomes for the two daughter cells, with m, d1 and d2 taking any of the three pro-fates, that is, P (pluripotent), U (unknown) and D (differentiated). Using the family trees, we calculate the transition probabilities between these pro-fates, shown in the supplementary information, figure S15, tables S3 and S4. In the absence of BMP4, the most important event driving the fate dynamics in the colony are symmetric cell divisions with both daughters having the same state as their mother cell (denoted as [P, P, P], [D, D, D] and [U, U, U]).

A quantitative way of visualising these results is using one-step transition probability matrices that govern the changes between the pro-fates. We use a state-transition Markov model as a simple tool to simulate the pro-fates (cohorts) long-term states in the colony. Note that we are not simulating the behaviour of single-cells with well-defined (correlated) transitions over their life-times. But instead focusing on a group of cells belonging to a cohort (that is P, U, D) for which the transitions between each other follow a Markov process [64]. Thus, the following calculations are not in contradiction the anti-persistent behaviour or the results presented by [38] with the cell history being predictive of the cell fate.

Since the presence of BMP4 significantly alters the underlying dynamics of the OCT4 signal, we hypothesise that a similar effect might influence the transition between the hESCs pro-fates. Thus we obtain two right stochastic matrices, the first showing the transitions for mother cells born before (absence) BMP4 treatment (303 events),

Equation (1)

and the second showing the transitions for mother cells born after (presence) of BMP4 treatment (194 events),

Equation (2)

In equations (1) and (2), the rows and columns correspond to transitions from the starting (mother cell) to ending (daughter cell) states, respectively. The events with the maximum probabilities are highlighted in bold. For W, equation (1), we get higher probabilities for WPP and WUU. That is, if the mother is pluripotent, it has 60% of probability of dividing into a pluripotent daughter, with a remaining 35% and 5% of giving rise to an unknown and differentiated daughter, respectively. This last event, although small, is detrimental to the maintenance of a highly pluripotent colony.

The one-step transition probabilities under BMP4 treatment show a strong tendency for all cell pro-fates (${W}_{\text{PP}}^{{\ast}}$, ${W}_{\text{UU}}^{{\ast}}$ and ${W}_{\text{DD}}^{{\ast}}$) to give rise to a daughter of the same pro-fate. The differentiation conditions changed from WDD ∼ 34% to ${W}_{\text{DD}}^{{\ast}}\sim 70\%$. That is, the pro-differentiated post-BMP4 mothers have twice the chance of producing a differentiated daughter cell than those pre-BMP4. Most importantly, our results reflect the fact that BMP4 treatment inhibits the return of a differentiated cell towards the pluripotent pro-fate, with ${W}_{\text{DP}}^{{\ast}}\sim 0\%$. A scheme of these transition probabilities is shown in figure 6.

Figure 6.

Figure 6. Transition probabilities between the pro-fates in hESCs in the absence/presence of BMP4. The pluripotent, differentiated and unknown pro-fates are represented with a blue, red, and turquoise spheres.

Standard image High-resolution image

Next we consider the steady-state probability distributions, which quantify the averaged behaviour of the system towards a stationary state and how this state changes under a perturbation, such as the treatment of BMP4. Similar calculations have been applied to gene regulatory networks [32]. We obtain the convergence of equations (1) and (2) to a steady state using eigendecomposition. The following right-hand eigenvectors correspond to these stationary distributions,

Equation (3)

for transitions in the absence of BMP4 and

Equation (4)

in the presence of BMP4. These eigenvectors are also known as the stationary probability vectors of W and W*, respectively. After a sufficiently long time, the states dictated by equations (1) and (2) will evolve towards a stationary probability distribution given by equations (3) and (4). That is, given an initial distribution of cells across the pro-fate states, equations (3) and (4) give the equilibrium distribution generated by the life trajectory of a cohort of identical cells by repeated multiplications of the vector of population counts by the transition probability matrix. It is important to note that these equations give the approximate behaviour of a strongly idealised system, thus we are not considering all the biological events affecting each cell in the colony. These type of models have proven successful in clinical decision making as a prognostic tool to guide decision making [64].

Equation (3) indicates that in the stationary state, a hESC colony maintained under self-renewal conditions (absence of BMP4), 46% of the cell state transitions result into highly pluripotent hESCs with the remaining 42% and 12% giving hESCs in the unknown and differentiated pro-fate. This last quantity indicates that in the long-term, over one-tenth of the cells will be in a differentiated state.

The treatment with BMP4 changes the equilibrium state. The population fractions ending in the pluripotent and differentiated pro-fates increase to 52% and 18% respectively. This at the expense of a decrease in the fraction of cells in the unknown pro-fate. Results presented in reference [17] show that hESC colonies under BMP4 treatment show bands of differentiation with constant width independent of the colonies' radii. A straightforward calculation shows that the number of cells in the band surrounding a colony increases linearly with its radius, while the number of cells in the bulk increases following a power law. Thus, if a similar process is affecting the hESCs differentiation, this also leads only to a slight increase in transitions towards the differentiated pro-fate.

A potential factor affecting these fate transitions is the interaction between neighbouring cells. This phenomenon is achieved through a variety of signalling pathways and can impact cell state changes [6567]. Next, we compute the spatio-temporal fate segregation in the colony, which serves to explain the high likelihood of certain transitions in equations (1) and (2).

3.3.2. Fate segregation

The segregation of cells in the early mammalian embryo occurs during the early phases of embryonic development and ends with the formation of the three germ layers [68]. The continuous rearrangement of cells occurs due to changes in the environment (surface cues) that induce differences in adhesion properties and changes in the cytoskeleton [69]. These differences in adhesion properties between neighbouring cells maintain a physical separation between different cell types and are one of the basic mechanisms for the pattern formation during development [70].

We use computational tools previously introduced in reference [71] to quantify the segregation of the hESCs in terms of their pro-fates. We identify the set of nearest neighbours of each cell within the colony by applying the Voronoi tessellation diagram (VD) of the space to each snapshot of the colony. The state of the hESC colony at texp = 18 h is shown in the supplementary information, figure S16. Using the VD and its dual, the Delaunay triangulation, we can obtain a cell's set of nearest neighbours. The segregation of two types of cells, A and B, can be measured using the segregation order parameter, δ, that depends explicitly on the number of nearest neighbours [71]. For a perfectly mixed system, with six Delaunay neighbours, δ ∼ 0. If the system is completely segregated (e.g. one cluster of A particles surrounded by other of B particles) δ ∼ 1.

For this calculation, the segregation of the pluripotent (or differentiated) cells is obtained by merging the unknown cells with the differentiated (or pluripotent) cells to generate the type B cells. The results are shown in figure 7. We discard the data at the initial stages of the experiment to avoid spurious results due to the low cell numbers and poor statistics. The hESCs with pluripotent pro-fate are effectively segregated (δP > 0.65) from the other pro-fates after texp ≈ 48 h (two days after the beginning of the experiment), as seen in figure 7(a) and video S1. The differentiated pro-fate, figure 7(b), segregates (δD > 0.7) at earlier stages, texp ≈ 20 h (one day after the beginning of the experiment), and remain in that state. The result for the unknown pro-fate (with type B cells defined as both the differentiated and pluripotent hESCs) is inconclusive. This is expected since these cells are located between the pluripotent and differentiated pro-fates and thus, are 'mixed' with their type B cells.

Figure 7.

Figure 7. Segregation of hESCs according to their pro-fates: (a) pluripotent (δP) (b) differentiated (δD) and (c) unknown (δU) cells as a function of time. The results of the bootstrap method for each sample are shown in grey which correspond to the calculations performed by re-sampling the datasets with replacement. Each data point shows the average value with standard deviation error bars obtained by averaging over all cells in 12 snapshots (1 h of experiment).

Standard image High-resolution image

3.3.3. OCT4 dissimilarity

The measurement of the clustering according to the hESCs pro-fates using the segregation order parameter allows us to calculate the OCT4 variability between a specific hESC and its closest neighbours. We use the set of neighbours obtained for each hESCs in the previous section and characterise the OCT4 levels between the cells located in a local neighbourhood. Since the OCT4 levels take any real positive value, we define the 'cooperation' between the cells as the tendency of a specific cell to express a similar OCT4 value to that of its nearest neighbours.

We use the dissimilarity metric, Si , to quantify the similarity in OCT4 in neighbouring cells [72]. If the OCT4 values in the colony are quantitatively more similar amongst adjacent cells, Si ≈ 0. If the opposite occurs, that is, the OCT4 values are different between the cells in a local neighbourhood, Si ≠ 0.

To avoid inaccurate results due to poor statistics, we obtain the probability density function (PDF) of Si over specific time intervals, figure 8. We also perform a qualitative comparison with simulated datasets, by randomising the positions of the cells and drawing their OCT4 levels from a uniform distribution over the same range as the experimental distribution, assuming that no cell-to-cell interactions occur. To assist in the visualisation of these plots, we use a non-linear binning scheme and plot the x-axis on a logarithmic scale.

These results indicate a similar behaviour in the PDFs of Si in the absence of BMP4, that is, for ${t}_{\mathrm{exp}}=\left[5,35\right]\enspace $ h, figures 8(a)–(c), with an average mean of 0.081 ± 0.009 and skewness of 3.07 ± 0.56. For ${t}_{\mathrm{exp}}=\left[35,45\right]\enspace $ h and $\left[45,55\right]\enspace $ h, panels (d) and (e), the mean of these two distributions is 0.078 ± 0.005, similar to those values observed for (a)–(c). However, their tails become larger, with a skewness of 6.90 and 9.84, respectively. These latter results indicate that some (few) cells in the colony are expressing highly dissimilar OCT4 values compared with those of their nearest neighbours. These differences become larger for ${t}_{\mathrm{exp}}=\left[45,55\right]\enspace $ h, after the treatment with BMP4. Finally, for ${t}_{\mathrm{exp}}=\left[56,65\right]\enspace $ h, figure 8(f), the distribution displaces towards the right (with a mean of 0.2268 and skewness of 6.74). This corresponds to an overall increment in the dissimilarity for a large proportion of the cells in the colony. In all cases, the PDFs obtained with the simulated data have a larger mean and smaller skewness than the experimental datasets.

Figure 8.

Figure 8. Probability density functions for the averaged dissimilarity metric ⟨Si ⟩ of the OCT4 expression of hESCs and their nearest neighbours (filled blue circles) at (a) 5–15, (b) 15–25, (c) 25–35, (d) 35–45, (e) 45–55 and (f) 55–65 hours. The distribution displaces towards the right with time which indicates that the dissimilarity in OCT4 values between nearest-neighbours increases with time and after BMP4 treatment. Simulated datasets with the OCT4 values following a uniform distribution are shown with open orange diamonds.

Standard image High-resolution image

4. Discussion

We set out to demonstrate transferable approaches for quantifying the dynamics of pluripotency transcription factors in hESC colonies, using the published OCT4 data set from reference [38].

4.1. Temporal regulation

Sister cells show more closely related OCT4 values than pairs of random cells [38]. Here we have further quantified the temporal dynamics of sister cells in relation to one another. Taking into account any common trends affecting both cells due to their shared environment, the sister cells pre-BMP4 show a moderate correlation with each other (0.5). This is reduced to a slight correlation for pairs that both exist post-BMP4 (0.3). The fact that these correlations still occur after de-trending further highlights the inherent similarities between sister cells.

We found that pre-BMP4, within the first 24 h of the colony growth, all hESCs with pluripotent pro-fate express high levels of OCT4. The averaged expression of these signals evolves in time towards a minimum coinciding with the treatment with BMP4, and with strong correlations between the pluripotent and unknown pro-fates (0.84). This correlation may indicate the presence of an underlying process affecting the behaviour of the cells at the colony level, that is, changes in the environmental conditions, cell-to-cell signalling, paracrine signals, etc, that may have influenced the colony at the early stages. The OCT4 distribution at the colony level is not uniform, in contrast with the results observed for other ESCs types [33, 34], showing a highly dynamical behaviour and evolving from higher to lower values of OCT4 expression as the colony grows.

The time-step changes in OCT4 are symmetric, with an average of zero, best described by a Laplace distribution. Laplace distributions have previously been applied to gene expression data, with the suggestion that the distribution can represent mixtures of other distributions (e.g. Normals, Pareto) also related to gene expression [73]. The parameters can be used for experimental comparisons, as direct inputs into computational models and for model verification. Further experimental data (e.g. different cell lines, culture conditions, restricted geometries, cell densities) are needed confirm the robustness, estimate the parameters for other experimental conditions and investigate how this is affected by cell–cell interactions. Such stochastic fluctuations in OCT4 have been shown to bias cell fate [44] with evidence of asymmetric noise leading to noise-mediated cell plasticity [74].

Although this shows that overall, positive changes in OCT4 are just as likely to occur as negative ones, it does not reveal anything about the temporal nature of these fluctuations and hence any temporal correlation properties (for example, all the positive changes in OCT4 could come one after the other, followed by all the negative changes, it does not mean that a positive change is necessarily followed by a negative change). There is also a difference in these fluctuations after the differentiation agent, with the addition of BMP4 provoking tighter self-regulation across all cell fates. Further experiments with increasing colony size are needed to investigate whether this self-regulation is a collective behaviour effect.

A significant finding is the quantification of the self-regulatory properties of OCT4. We calculate the average Hurst exponent (0.38, indicative of anti-persistence) which has been previously used to characterise gene expression [75], DNA sequencing [76], stem cell division times [77] and self-renewal capacities [78]. Broadly, the identification of a Hurst exponent which is not Brownian suggests the use of specific equations for describing temporal OCT4. Furthermore, it can be a direct input into some stochastic modelling equations in which the parameter, H, is required, e.g. fractional Brownian motion [49, 50, 79, 80]. To visualise the OCT4 values in relation to the previous time frame we use Poincaré plots, but note that other methods are also applicable here, such as diffusion maps [81].

An autocorrelation analysis shows periods of anti-correlation, in keeping with the regulation of pluripotency TFs [2, 35, 82]. Throughout the colony growth, anti-correlation of at least 5 h is seen in 86% of cells (with no significant difference between the cell fates) and on average occurs between 3 and 12 h into a cell's lifetime. This suggests that the anti-correlation is an inherent property of the cells, across all cell fates. Further experiments are needed to clarify that this is the case under different experimental conditions (i.e. different sized colonies, other cell lines, in different geometries) but this provides a further quantitative statistic for comparisons. The identification of this systematic property has implications for the underlying stochastic chemistry of the OCT4 regulation and can be used to inform chemical models of TF regulation, often based on the Hill equations [83, 84].

The sub-diffusive nature of the time series allows for another characterisation of the OCT4 self-regulation using only two summary parameters, α and β. Note that this is not contradictory to the conclusion of pre-determined cell fate in reference [38], as here we consider the behaviour over individual cell lifetimes, (i.e. shorter time scales) and do not take into account other behaviours in the colony (i.e. over multiple cell divisions, longer time scales). The sub-diffusivity highlights the presence of a universal behaviour in the cells, and can be used as a direct parameter input into Brownian (and the fractional and geometric extensions) computational models allowing the future behaviour to be predicted statistically.

4.2. Fate transitions

The one-step transition probability matrices depict the possible paths for cellular pro-fate transitions and give insight into the (a) symmetric nature of the mitotic events. Consistent with the results reported in reference [38], the pro-pluripotent hESCs have a higher probability (∼60%) of giving rise to a daughter cell of the same fate. The remaining probability is associated with a pro-pluripotent cell giving rise to an unknown or differentiated daughter cell at mitosis. Interestingly, the pro-differentiated cells had the highest chance of giving rise to an unknown pro-fate cell. The transitions away from the pluripotent pro-fate are detrimental for the maintenance of a pluripotent colony if we assume that both pro-fates (unknown and differentiated) are undesirable for highly pluripotent colonies.

Under the framework of Markov chains, we obtained the stationary probability functions of hESCs fate transitions both in the absence and presence of BMP4. The stationary probability distributions predict 46% of the cell state transitions towards a cell in the pluripotent pro-fate for a hESC colony under self-renewal conditions and in the absence of BMP4. The remaining 42% and 12% result in hESCs in the unknown and differentiated pro-fates, respectively. Thus, after a sufficiently long time, the proportion of hESCs moving to the pluripotent and unknown pro-fates becomes similar. In the presence of BMP4, we obtain a stationary probability distribution with 52%, 30% and 18% of cell state transitions towards the pluripotent, unknown and differentiated pro-fates. As expected, this last probability indicates that BMP4 induces an increase in the number of hESCs transitioning to a differentiated state.

4.3. Spatial segregation

The segregation parameter indicates that the ancestors of the hESCs with a differentiated pro-fate position themselves at the outer (top) regions of the colony as early as one day after the beginning of the experiment and they remain clustered (segregated) throughout the experiment. The segregation process culminates with the separation of the pluripotent and unknown pro-fates one day later. Coupling these results with the higher probabilities of division towards a daughter with the same pro-fate, a hESCs within a differentiated pro-fate gives rise to a differentiated daughter in its local neighbourhood. These transitions are consistent with the patterning observed in hESCs under confinement inside microfluidic devices reported elsewhere [19, 40].

The dissimilarity metric indicates that the OCT4 values amongst the nearest cells remain comparable in the absence of BMP4. However, we observe changes in the tail of these distributions 10 h after treatment with BMP4. This indicates the presence of cells expressing highly dissimilar OCT4 values with those of their nearest neighbours. Since by this time the pluripotent and differentiated pro-fates are segregated, we hypothesise that these large differences in OCT4 levels may happen at the interface between the differentiated and unknown pro-fates since their OCT4 distributions are highly dissimilar by the end of the experiment.

4.4. The unknown fate

The behaviour of the unknown cells (unable to be classified as either pluripotent or differentiated based on their OCT4 and CDX2 levels) lies between that of the pluripotent and differentiated cell fates. Spatially, they are not clearly segregated from the pluripotent and differentiated cells, a possible indication that the fate decision has not yet occurred. The unknown pro-fate cells could be the result of a mixture of both populations with the ability to express high and low OCT4 expressions, or cells undergoing a transition phase between pluripotent and differentiated. The distinct differences between cell fates could provide non-invasive diagnostic tools to identify cell fates.

4.5. Future

The experiment in reference [38] has led to a rich analysis, allowing us to establish the language through which to quantitatively compare this experiment to others and to guide mathematical modelling choices. These tools can be easily modified and adapted to study the dynamics of other relevant transcription factors, such as NANOG and SOX2 in H9 hESCs. Furthermore, for experiments in which several transcription factors are measured simultaneously in real-time, our custom-developed algorithms can be easily modified to quantify the in vivo segregation in tissues.

The limitation of this research is the single hESC colony analysed. Although several mitotic events were recorded during the experiment, the results need to be generalised to other hESC colonies of different cell lines and under different experimental conditions. Currently, methods such as single-cell RNA sequencing and real-time reverse transcription PCR allow more accurate quantification of the cellular states at the transcriptional level and gene expression, respectively. These datasets can potentially drive the development of mathematical models at the single-cell level using stochastic processes (e.g. the non-Markovian Langevin or non-Markovian Fokker–Planck equations). These models can account for transitions between cell states (genotype) and/or fates (phenotype) at two different time scales: shorter and longer than the cell cycle. A combined measurement of TF expression in hESCs with single-cell RNA sequencing may give a deeper understanding of the spatio-temporal changes of TF expression in hESC colonies.

In general, our results highlight the need for further temporal experimental data on OCT4 and other transcription factors. We expect that studies with computational tools complementing experiments will become more commonplace, furthering our knowledge in stem cell biology and accelerating the development of stem cell-based technologies.

Author contributions

Methodology: LEW, SO-F, RAB, AS, NGP; software: SO-F; data analyses: LEW, SO-F; writing: LEW, SO-F; editing: LEW, SO-F, IN, ML, RAB, AWB, NGP; supervision: RAB, AS, NGP.

Acknowledgments

LEW acknowledges support from the London Mathematical Society Early Career Fellowship. SOF acknowledges financial support from the Consejo Nacional de Ciencia y Tecnología (CONACyT, Mexico) for the Grant CVU-174695. IN acknowledges the grant from the Russian Government 641 Program for the recruitment of the leading scientists into 641 Russian Institution of Higher Education 14.w03.31.0029 and RFFI project Grant Number 20-015-00060. ML acknowledges BBSRC UK (BB/I020209/1) for providing financial support for this work. RAB wants to acknowledge financial support from CONACyT (Mexico) through the project 283279.

Please wait… references are loading.