Novel techniques for characterising graphene nanoplatelets using Raman spectroscopy and machine learning

A significant challenge for graphene nanoplatelet (GNP) suppliers is the characterisation of platelet morphology in industrial environments. This challenge extends to platelet surface chemistry when scalable functionalisation processes, such as plasma treatment, are used to modify the GNPs to improve the filler-matrix interphase in nanocomposites. The costly and complex suite of analytical equipment necessary for a complete material description makes quality control and process optimisation difficult. Raman spectroscopy is a facile and accessible characterisation technique, with recent advancements unlocking fast mapping for rapid data collection. In this study, we develop novel techniques to better characterise GNP morphology and changes in surface chemistry using Raman maps of bulk powders, providing a bespoke algorithmic framework for the analysis of these advanced materials. An unsupervised peak fitting and processing algorithm was used to rapidly and accurately extract crystallinity data and correlate it with laser-diffraction-derived lateral size values for a commercial set of GNPs. Classical machine learning was used to identify the most informative Raman features for classifying the plasma-functionalised GNPs. The initial material properties were found to affect which peak features were the most useful for classification: in GNPs with low defect density and low specific surface area, the D peak full width at half maximum is the most useful, whereas the I2D/IG ratio is the most useful in the opposite case. Finally, a convolutional neural network was trained to discern between different GNP grades with 86% accuracy. This work demonstrates how computer vision could be deployed for rapid and accurate quality control on the factory floor.


Introduction
Graphene is a two-dimensional allotrope of carbon consisting of a single atomic layer of graphite. It was among the first materials to be isolated in what has become a large and growing family of 2D crystals [1]. Graphene boasts a thin, paper-like platelet shape made from a plane of sp2-bonded carbon atoms, which gives rise to an array of interesting and superlative nanoscale properties. However, the production and isolation of graphene monolayers in significant quantities remains a challenge. Few-layer (<10 layers) and multilayer (10+ layers) flake materials are being produced at an industrial (tonnage) scale using top-down techniques such as electrochemical and shear exfoliation [2]. This 'bulk' nanomaterial inevitably falls short of some of the superlative properties of its single-layer counterpart, although it can retain the high aspect ratio and graphitic nature that underlie many of graphene's properties. It is generally sold under the name 'graphene nanoplatelets' (GNPs), with manufacturers offering product lines with different surface area and lateral size distributions [3].
Characterising GNPs is challenging because of the agglomeration of nanoscale flakes within the bulk powder, all of which are subject to a degree of heterogeneity in their lateral size, thickness, and surface chemistry. A recent study systematically characterised commercially available GNPs using a variety of techniques [4]. However, the range, cost, and complexity of the suite of analytical equipment used to provide such a holistic description of GNPs inhibits the ability of industrial manufacturers to optimise their processes and carry out quality control [5]. Thus, a single technique capable of extracting multiple key metrics and being deployed on the factory floor is required.
Raman spectroscopy is a fast, facile, and accessible characterisation technique that can provide deep insight into the electrical, chemical, and physical properties of materials. Because graphene is a zero-bandgap semiconductor, photons of any energy excite resonant transitions between bands, meaning that all phonon modes are resonant, making it very amenable to Raman analysis [6]. Therefore, it has become an indispensable and widely used technique for characterising carbon nanomaterials. Common applications include discerning graphene from other nanocarbons using the G and 2D peaks [7], quantifying defects and crystallinity using the D and G peaks [8,9], approximating the layer number using the 2D/G peak ratios [10], and evaluating changes in flake strain using the 2D peak shift [11]. However, these applications have been developed on few-layer graphene, typically single flakes isolated using micromechanical cleavage or grown using chemical vapour deposition. Although extending their use to GNPs is a natural step, there are three key issues to overcome: (1) the large number and heterogeneity of flakes present leads to more spectral variance; (2) the term GNP encompasses a range of materials, making it challenging to develop generic approaches; and (3) an increased number of graphene layers (i.e. graphitic volume relative to surface area) decreases the sensitivity to surface changes, making Raman spectroscopy less useful for identifying surface functionality.
To compensate for (1), the variance inherent in GNPs, recent studies have relied on collecting large datasets consisting of hundreds of spectra, facilitated by advancements in fast surface mapping spectrometers, and then carrying out statistical analysis to extract trends that correlate with the behaviour seen in individual few-layer graphene flakes [12,13]. This 'big data' approach can enable Raman spectrometers to become a more complete quality control tool for GNPs. Recently, principles and techniques from the field of machine learning (ML) have been incorporated to extract further value from these datasets: principal component analysis has been used to decompose the 2D peak shape onto a phase space consisting of two components, which, when calibrated with scans from a flake with known layer number, provided a means of rapidly sizing flake thickness [14]. Subsequently, deep neural networks were trained to denoise Raman spectra and increase the quality of this analysis without compromising the Raman mapping efficiency [15]. Challenges with collecting and analysing these datasets remain, in particular the accurate extraction of the peak fit parameters, which becomes more difficult as the number of spectra increases and automated processes are required. Furthermore, issues (2), the disparate nature of commercial GNPs, and (3), the dulled sensitivity to surface modifications, remain and are only becoming more significant as the number of GNPs on the market increases and more of them are functionalised by manufacturers for improved compatibility in composite matrices [2,4]. There is an opportunity to draw further from advancements in ML to tackle these problems.
In this work, the big data approach is extended in three ways. First, we detail a novel peak fitting algorithm capable of characterising a wide range of GNPs and extracting crystallinity and defect density values from the peak fit parameters. We use this technique to demonstrate how crystallinity can be closely correlated with laser-diffraction-derived lateral size values, enabling Raman spectroscopy to be used for rapid and accurate measurements of average flake size. Second, we propose the use of classical ML classifiers trained on structured peak fit parameter datasets to identify which spectral features are best placed to classify plasma-functionalised GNPs. This revealed how functionalisation-induced modifications to the flakes are observed in the bulk material. Third, we showcase a computer vision model for batch-to-batch GNP quality control, using deep neural networks trained on raw spectral data. This was designed as a proof-of-concept to propose how Raman could work as an in-line characterisation technique in industry. The Python code and Raman spectra used to carry out this work are available for use and modification on the Materials Cloud [16].

Results and discussion
2.1. Characterising GNPs using a big data approach with Raman spectroscopy

2.1.1. Peak fitting algorithm

A large spectral dataset requires automated methodologies for extracting quantitative data from peak features, for example, intensity (I), centre (ω), and amplitude (A). There are many software packages capable of performing batch peak fittings; however, the number of component curves is usually fixed at the start, which can result in overfitting if too many are defined or underfitting if too few. Goldie et al developed an automated algorithm for peak fitting of GNP spectra with D, G, 2D, D', and G + D peaks [13]. It tackled underfitting by iteratively fitting an increasing number of peaks, and overfitting by requiring a 2.5% improvement in fitting quality (measured using chi-squared) to retain a higher number of peaks. Here, we build on this approach by rewriting the algorithm to allow any user-defined peaks to be considered, extending the program's capabilities to fit not only more complex GNP spectra more accurately, but also any spectra where the possible peaks present are known.
The program is outlined in schematic 1. A configuration file contains the constraints for all the peaks that may be present in the spectra, one of which must always be present in the desired spectra; without it, the scan is considered noise. In the case of the GNPs, this was defined as the G peak. Once the dataset is loaded, the spectra are normalised (using standard normal variate normalisation), any cosmic rays are removed, and spectral noise is quantified. The background (by default, a fourth-order polynomial) is fitted to compare against the subsequent background + G-peak fit. Any scans in which the addition of the G peak does not result in a significant improvement are considered noise. All other peaks defined in the configuration file are fitted alongside the background + G peak. Subsequently, peaks are only included in more complex permutations (three peaks, four peaks, etc) when they previously showed improvement. These iterations are repeated until all defined peaks are included in the fit or until adding peaks no longer improves the fit quality sufficiently. If there are multiple acceptable permutations, the best-quality fit is retained. Typically, peaks are defined using a Lorentzian shape, f(x; A, ω, σ) = (A/π)·σ/((x − ω)² + σ²), with three fitting variables, ω, A, and sigma (σ) [17]. Once a peak is fitted, two additional variables can be calculated: the full width at half maximum (Γ), where Γ = 2σ, and I = A/(σπ). Figure 1 shows the automated spectral fitting for three carbon nanomaterials with different spectral shapes: GNP (C750), graphene oxide (GO), and multi-walled carbon nanotubes (MWCNTs). The fitting closely follows the raw data for all three cases. The peaks used for all fittings were the same, demonstrating the capability of the program to find appropriate deconvolutions for a broad range of graphitic materials. The fitting parameters are listed in table S1.
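The core fit-and-compare loop can be sketched in Python. This is an illustrative reimplementation, not the released Materials Cloud code: it uses scipy.optimize.curve_fit, omits the polynomial background and noise screening, and the `fit_peaks` function, candidate list, and 2.5% default threshold are assumptions for demonstration only.

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(x, amp, centre, sigma):
    # Area-normalised Lorentzian: height I = amp / (sigma * pi), FWHM = 2 * sigma.
    return (amp / np.pi) * sigma / ((x - centre) ** 2 + sigma ** 2)

def fit_peaks(x, y, candidates, threshold=0.025):
    """Greedily add candidate peaks (name, centre guess), keeping each
    only if it improves the squared-residual fit quality by > threshold."""
    kept, chi2 = [], float(np.sum(y ** 2))  # baseline: no peaks fitted
    for name, centre in candidates:
        trial = kept + [(name, centre)]

        def model(x, *p):
            return sum(lorentzian(x, p[3 * i], p[3 * i + 1], p[3 * i + 2])
                       for i in range(len(trial)))

        p0 = [v for _, c in trial for v in (np.ptp(y), c, 25.0)]
        try:
            popt, _ = curve_fit(model, x, y, p0=p0, maxfev=20000)
        except RuntimeError:
            continue  # did not converge; drop this candidate
        new_chi2 = float(np.sum((y - model(x, *popt)) ** 2))
        if new_chi2 < chi2 * (1 - threshold):  # require >2.5% improvement
            kept, chi2 = trial, new_chi2
    return kept, chi2

# Demo on a synthetic noisy spectrum containing only D and G peaks.
rng = np.random.default_rng(0)
x = np.linspace(1000, 2000, 600)
y = (lorentzian(x, 100, 1350, 20) + lorentzian(x, 200, 1580, 15)
     + rng.normal(0, 0.05, x.size))
kept, chi2 = fit_peaks(x, y, [("D", 1350), ("G", 1580), ("D'", 1620)])
```

On the synthetic spectrum above, the D and G candidates survive the improvement test, while spurious candidates tend to be rejected once the residual is dominated by noise.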

Automated extraction of crystallinity and defect density values
The peak-fitting process extracts peak features (e.g. ω, A) from which additional features can be calculated (e.g. Γ, I, and peak ratios). These features can be used in combination with numerical simulations to quantify the density of 0D (point defects, vacancies, functional groups) and 1D (grain boundaries, dislocations, edges) defects [18]. These were disentangled using two peak features: A_D/A_G, corrected for the excitation laser energy (E_L), and Γ_G. This results in the planes shown in figure 2, where spectra that fall within the envelope can have a mix of 0D and 1D defects, quantified by linearly interpolating results from numerical simulations, which provide values for the crystallite size (L_a) and length between defects (L_D). In figure 2, the dashed line delimits spectra consisting of only 0D defects, and the solid line delimits spectra consisting of only 1D defects. Although simulation results have been available since 2017, cross-referencing measurements against the simulation grids is too time consuming to carry out for hundreds of spectra, as is necessary for GNPs with high spectral variance. As such, this process was automated to enable the facile translation of Raman peak features to crystallite size and defect length.
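A minimal sketch of the automated lookup, assuming the published simulation results have been digitised into a table of (laser-corrected A_D/A_G, Γ_G) → (L_a, L_D) points. The grid values below are hypothetical placeholders, not Cançado et al's data, and the `raman_to_defects` helper and the E_L⁴ correction applied here are illustrative assumptions:

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical simulation grid; each row is (AD/AG * EL^4, Gamma_G, L_a, L_D).
# Real values would be digitised from the published simulation results [18].
grid = np.array([
    # corrected AD/AG, Gamma_G, L_a (nm), L_D (nm)
    [ 10.0,  20.0, 50.0, 20.0],
    [ 50.0,  30.0, 30.0, 10.0],
    [100.0,  40.0, 15.0,  5.0],
    [200.0,  60.0,  8.0,  3.0],
    [ 30.0,  50.0, 20.0, 15.0],
])

def raman_to_defects(ad_ag, gamma_g, laser_ev=2.33):
    """Interpolate crystallite size L_a and inter-defect length L_D from a
    point on the (laser-corrected A_D/A_G, Gamma_G) plane."""
    point = (ad_ag * laser_ev ** 4, gamma_g)
    la = griddata(grid[:, :2], grid[:, 2], point, method="linear")
    ld = griddata(grid[:, :2], grid[:, 3], point, method="linear")
    return float(la), float(ld)
```

Usefully, griddata returns NaN for points outside the convex hull of the grid, mirroring the restriction that only spectra inside the figure 2 envelope can be quantified.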

Characterising GNPs of increasing lateral size
The above methodology was applied to three commercially available GNPs from First Graphene (UK) Ltd, with lateral sizes averaging 5, 10, and 20 µm (PG5, PG10, and PG20, respectively). Each GNP was dispersed at 1 mg ml−1 in a 1:1 IPA:DI-water solution, and 6 ml was vacuum filtered onto a hydrophobic polyvinylidene difluoride (PVDF) membrane support, resulting in a film of agglomerated flakes for Raman analysis. This approach to sample preparation ensures the spectra are representative of the bulk powder, maximises the number of graphene scans collected over the mapped area, and is easily reproduced. Raman spectroscopy was carried out using a Renishaw InVia with a 532 nm laser and a 50× lens. The StreamHR mapping function was used to collect 1500 spectra for each material [19].
The average Raman spectra shown in figure 3(a) exhibit three characteristic peaks: a first-order G mode (∼1560 cm−1), a second-order 2D mode (∼2700 cm−1), and a defect-induced D mode (∼1350 cm−1). The G peak shows a shoulder pertaining to the defect-induced D' (∼1600 cm−1) mode, the overtone of which is 2D' (∼3200 cm−1). Two weak peaks were observed on either side of the 2D peak, pertaining to the D + D'' (∼2450 cm−1) and D + D' (∼2950 cm−1) combination modes, respectively [20]. The D* (∼1175 cm−1) and D** (∼1475 cm−1) peaks seen in figure 1 are induced by areas of high disorder and were not observed for this set of GNPs [21]. The high-intensity characteristic peaks (D, G, 2D) were prioritised for analysis, as they are less susceptible to noise. A typical peak fitting is shown in figure 3(b). The collected spectra were passed through the peak fitting algorithm, after which a data cleaning process was applied: points lying more than 1.5× the inter-quartile range beyond the quartiles for key mode (D, G, 2D) parameters were considered anomalous and dropped. Finally, 1000 spectra for each material were randomly sampled from the remaining dataset for analysis.
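The anomaly-removal step can be sketched as follows (a minimal NumPy version; the `iqr_clean` name and the spectra-as-rows, features-as-columns layout are assumptions for illustration):

```python
import numpy as np

def iqr_clean(features, k=1.5):
    """Keep only rows (spectra) whose every key-mode parameter lies within
    k * IQR of the quartiles computed for that parameter (column)."""
    q1, q3 = np.percentile(features, [25, 75], axis=0)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    mask = np.all((features >= lo) & (features <= hi), axis=1)
    return features[mask], mask
```

A spectrum flagged by any single parameter (e.g. an implausible Γ_G from a poor fit) is dropped before the 1000-spectrum subsample is drawn.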
A Malvern Mastersizer with a HydroEV module was used to determine the particle size distribution via laser diffraction. Samples were dispersed in a 1:1 mixture of water and IPA until appropriate (5%-10%) obscuration levels were reached, then sonicated for two cycles of two minutes. The results are shown in figure 3(c), where the D50 values (and span, calculated as (D90 − D10)/D50 [22]) are 6.22 (1.617), 10.20 (1.854) and 18.30 (2.079) µm for PG5, PG10 and PG20, respectively. The distribution broadened as the particle size increased, resulting in an increased span. X-ray photoelectron spectroscopy (XPS) revealed that the atomic O/C ratios (see figure S1 and table S2) were 0.055, 0.051, and 0.040 for PG5, PG10, and PG20, respectively. This consistency in O/C indicates a similar manufacturing process for all GNPs.
The distribution of the key mode features for 1000 spectra per material is shown in the grid of box plots in figure 4. As expected, there are clear positive or negative trends in the mean of various parameters; for example, the I_D/I_G ratio decreases with increasing lateral size. However, there is significant overlap in the data distributions across most parameters, demonstrating the importance of a big data approach to Raman spectroscopy for GNPs. This variance makes it difficult to assess any differences in the overall homogeneity of the materials, whereas the laser diffraction data show a broader spread of lateral sizes with increasing GNP grade average size.
In figure 5(a), the measured spectra are located on the A_D/A_G as a function of Γ_G plane, and the values from within the envelope are translated into the defect density (σ = 1/L_D²) and crystallinity (L_a²) plane in figure 5(b). The data in figure 5(a) show decreasing A_D/A_G and Γ_G values with increasing average lateral sizes. While there is some overlap between the datasets, the materials appear to be clearly defined from one another. Translating this data into the crystallite area vs. defect density plot in figure 5(b) shows that the average crystallite area and its standard deviation increase with the particle size. These two trends correlate with the increasing D50 and larger lateral-size spans observed in the particle-sizing data in figure 3(c). The defect density decreased with increasing lateral size, as the I_D/I_G ratio does in figure 4. The consistent standard deviation in defect density implies that all the materials are manufactured in a similar manner, a claim supported by the O/C analysis from the XPS. A linear regression was carried out on the data in figure 5(c), showing a strong positive relationship between the Raman-spectroscopy-derived crystallite area and the laser-diffraction-derived D50, enabling Raman spectroscopy to be used for particle-sizing GNPs that sit within the envelope in figure 5(a).

The chemical inertness of the graphitic basal plane leads to weak interactions between graphene and its surrounding matrix or dispersant. In nanocomposites, this can result in a weaker filler-matrix interphase, inhomogeneous distribution, and agglomeration [23]. In inks, it can lead to unstable aqueous dispersions that are prone to crashing and require the use of surfactant additives [24]. The heavily oxidised graphene analogue, GO, is a common alternative to GNPs because the abundance of oxygen-containing groups on its surface and edges enables improved dispersibility in water [25] and covalent interactions with polymer matrices and binders [26].
However, GO is more costly and chemically intensive to synthesise than most GNPs [27], and the extent of functionalisation severely reduces its thermal and electrical conductivities, limiting its application. Thus, the focus turns to tuning the chemistry of GNPs for improved dispersibility and better matrix compatibility through surface treatment techniques that can operate at scale, for example, plasma functionalisation.

Characterising plasma-functionalised GNPs using classical machine learning
The addition of functional groups complicates analysis, as functionalisation can have multiple simultaneous effects on the Raman response; for example, adherence of functional groups to the basal plane can both charge dope the material and strain the lattice, shifting ω_G and ω_2D [28]. The simultaneous covalent adherence of plasma-induced functional groups and amorphisation of the sp2 lattice activates the D peak alongside existing physical defects in the carbon [29], as well as broadening the 2D peak and reducing the I_2D/I_G ratio [30]. The nature of the defect, whether vacancy, sp3 carbon, or grain boundary, can affect the intensity ratio of the D and D' peaks [31]. Disentangling these, at times competing, effects from one another becomes even more difficult when considering the heterogeneity of the flakes that make up a GNP. Isolating the peak features that are most heavily influenced by the functional groups would help with discerning between the baseline and functionalised material, while advancing understanding of how the GNP morphology and chemistry are modified by surface treatment.
ML models are algorithms that can be tuned to classify and cluster data using discrete or continuous labels and a variety of mathematical approaches, such as K-nearest neighbours, random forests, and deep neural networks [32]. The decision-making logic behind each model is subject to different levels of interpretability. Given sufficient labelled data, supervised ML can be used to train a model to accurately discern the labels of new cases. Training is carried out by splitting a dataset into two sets: the first is used to train the model, and the second to test its accuracy. A random forest consists of many decision trees, each trained on a random subset of features. Each decision tree identifies the features from its subset that best distinguish between different labels, thus maximising information gain. The random forest's prediction of the label for a new example is the average over all its trees. Random feature selection and averaging over many trees discourage overreliance on a few features, thereby improving predictions. The most informative features used by the decision trees in a random forest with high accuracy on the test set can elucidate which spectral characteristics have the most potential for further examination [33]. This approach has been used previously to establish that the G-mode intensity is the most informative feature for predicting the twist angle of bilayer graphene using simulated data [34].
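A sketch of such a baseline-vs-fluorinated classifier using scikit-learn. The feature values and class separations here are synthetic stand-ins (e.g. a broader Γ_D after fluorination), not the measured peak-fit dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for three peak-fit features (e.g. Gamma_D, I2D/IG,
# omega_G): fluorinated spectra get a broader D peak on average.
n = 1000
baseline = rng.normal([40.0, 0.6, 1580.0], [8.0, 0.1, 5.0], (n, 3))
fluorinated = rng.normal([55.0, 0.5, 1580.0], [8.0, 0.1, 5.0], (n, 3))
X = np.vstack([baseline, fluorinated])
y = np.array([0] * n + [1] * n)  # 0 = baseline, 1 = fluorinated

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

In the real pipeline, X would be the table of 18 peak-fit features per spectrum described below, rather than these invented columns.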
This process was applied to plasma-fluorinated GNPs from multiple manufacturers: PG10 from First Graphene (a different production batch than that used in section 2.1.3), Elicarb GNP F7P1518 (EC-GNP) from Thomas Swan, HP-GNP from Versarien, and xGNP C750 from XG-Sciences. Plasma treatment was performed using a Haydale HT60 Plasma Reactor. This enables the functionalisation of lab-scale quantities of nanomaterial powder, with larger reactors capable of treating commercial quantities. Plasma treatment was carried out on 50 g batches of GNP at low pressures (<10 mbar) and temperatures (<100 °C) using CF4 gas. Raman spectroscopy and peak fitting were carried out as described in section 2.1, and XPS analysis was carried out as detailed in the SI.
The average Raman spectra are shown in figure 6(a). There is a clear difference between the spectra of the C750 and the other GNPs: the former has much broader D (Γ_D = 112.9 ± 9.7 cm−1) and G (Γ_G = 60.7 ± 5.7 cm−1) peaks and a larger I_D/I_G ratio (0.85 ± 0.09). The EC-GNP showed a very weak D peak in the average scan; however, this was too weak to pick up in individual spectra, resulting in an I_D/I_G ratio of 0. The HP-GNP and PG10 have I_D/I_G ratios of 0.15 ± 0.03 and 0.27 ± 0.10, respectively. In figure 6(b), these three GNPs cluster towards the left of the plot (lower A_D/A_G ratios and sharper G mode), whereas C750 sits towards the right (higher A_D/A_G ratio and broader G mode). The EC-GNP is outside the outline, where the line and point defect contributions can be disentangled. This is a known limitation of the defect disentangling technique, as GNPs with a low defect density require the parameterisation of multiple other variables beyond the scope of the original simulation work [18]. For EC-GNP, HP-GNP, and PG10, fluorination resulted in larger defect densities and broader G peaks, whereas only the latter was true for C750. XPS analysis (see figure S2 and table S3) showed that the F/C ratio increased to 0.059 for PG10, 0.190 for C750, 0.220 for EC-GNP, and 0.044 for HP-GNP. The extent of functionalisation can be correlated to the bulk density of the material (see figure S3); the same mass of lower-density material fluorinates more, as there is more exposed surface area for functionalisation.
For each GNP, data from the baseline and fluorinated materials were passed through a random forest model for binary classification. The features available for use include I, Γ, A, and ω from the three main GNP modes (G, D, and 2D) and the corresponding A and I ratios for D/G and 2D/G (e.g. A_D/A_G, I_2D/I_G), as well as the crystallinity and defect density, for a total of 18 features. Permutation feature importance is the decrease in model accuracy when the target feature's values are randomly shuffled. As such, a high value is indicative of an informative feature, a value near zero indicates an uninformative feature, and a negative value indicates a model that has been overfitted or generalised poorly [33]. This is complicated by the fact that Raman data can be significantly correlated; that is, information from one feature (e.g. the A_D/A_G ratio) can be deduced from other features (e.g. A_D and A_G). This can impact the feature importance, as correlated features have lower overall importance. To overcome this, Spearman's rank-order coefficient (r_s), a measure of the strength of association between two variables, can be calculated for every feature pair over the n paired observations of the first, u, and second, v, feature [35]: r_s = 1 − 6Σd_i²/(n(n² − 1)), where d_i is the difference between the ranks of the ith pair of values. Hierarchical cluster analysis can subsequently be applied to the r_s coefficients to visualise the correlated features in the dendrogram diagrams shown in figure 7 [32]. The lower the level at which lines link, the more similar the features are. A cluster cut-off value can be chosen (algorithmically or through visual inspection) to determine the number of groups, and a representative feature from each group is then used in the random forest model. The four different GNPs displayed similar dendrograms; however, closely correlated features for one system (e.g. A_D and A_G in C750) were not necessarily the same in another system (A_D and A_D/A_G in HP-GNP), resulting in different cluster compositions. A cut-off of 0.5 was chosen, as this roughly halved the number of features kept without impacting accuracy.
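The clustering-based feature filtering can be sketched with SciPy. The `select_uncorrelated` helper, the 1 − |r_s| distance, and average linkage are plausible illustrative choices rather than a description of the exact implementation:

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster

def select_uncorrelated(X, names, cutoff=0.5):
    """Group features (columns of X) by Spearman correlation distance
    (1 - |r_s|) and keep one representative per cluster below the cutoff."""
    rho, _ = spearmanr(X)           # pairwise rank-order correlation matrix
    dist = 1.0 - np.abs(rho)        # strongly correlated pairs -> small distance
    iu = np.triu_indices_from(dist, k=1)
    Z = linkage(dist[iu], method="average")   # condensed distances -> dendrogram
    labels = fcluster(Z, t=cutoff, criterion="distance")
    keep = [names[np.flatnonzero(labels == c)[0]] for c in np.unique(labels)]
    return keep, labels
```

With the 0.5 cut-off used in this work, feature pairs whose rank correlation exceeds roughly 0.5 tend to land in the same cluster, and only one representative from each cluster reaches the random forest.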
The results of the binary classification using random forests for each GNP are presented in table 1. Each random forest classifier was fed 1000 spectra per label (e.g. baseline, fluorinated) with a 70:30 training-to-test split, and the test set consisted of scans from a Raman map taken at a different spot to the training set. Two performance metrics, accuracy and the F1 score, were evaluated. The former measures the proportion of correct predictions, whereas the latter is the harmonic mean of the precision (share of true positives among all predicted positives) and recall (share of true positives among all actual positives). A 5-fold cross-validation (CV) was carried out, whereby the training set was split into five bins and a model was trained on all but one, then tested on this hold-out. This is repeated five times, and the average accuracy should give a value indicative of the test set performance, as well as the model variance (using the standard deviation). The 5-fold CV scores closely matched the test-set accuracy in all cases, indicating that the training data were representative of the test data. The performance metrics using the entire feature set (18 features) can be compared to those of the filtered features extracted from hierarchical clustering. These were closely matched in all cases, suggesting that hierarchical grouping was effective in filtering out closely correlated features. The accuracy and F1 scores for all GNPs were >76%, suggesting the models were effective in discerning between baseline and fluorinated spectra. Further improvements in accuracy could be achieved by using larger datasets and hyperparameter tuning.
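The evaluation protocol (5-fold CV plus permutation feature importance) can be sketched as follows, again on synthetic data in which only the first feature carries information; the feature layout is an assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)

# Synthetic features: only column 0 (a stand-in for Gamma_D) is informative.
n = 600
X = rng.normal(size=(n, 4))
y = (X[:, 0] + 0.3 * rng.normal(size=n) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv_scores = cross_val_score(clf, X, y, cv=5)   # 5-fold cross-validation

clf.fit(X, y)
imp = permutation_importance(clf, X, y, n_repeats=10, random_state=0)
top = int(np.argmax(imp.importances_mean))     # most informative feature index
```

Shuffling the informative column collapses the accuracy, so it receives by far the largest importance; shuffling the noise columns barely matters.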
The top permutation feature importances are listed for each GNP; these indicate which Raman features are most useful to the model for distinguishing between baseline and fluorinated materials. The EC-GNP and HP-GNP materials both lie at the bottom left of the outline in figure 6(b), as they have a low A_D/A_G ratio. For these, any increase in defect density due to plasma fluorination is best observed through a broadening of the weak D peak, meaning Γ_D is the most important feature for classification in both cases. At the other end of the outline is C750, with a high A_D/A_G ratio and a broad G peak. Notably, it has a much higher specific surface area of 685 m2 g−1, compared to 5.21 and 26.4 m2 g−1 for EC-GNP and HP-GNP, respectively (BET methodology detailed in the SI). As such, the I_2D/I_G ratio (which typically correlates with flake thickness [10]) becomes the dominant feature, picking up changes in the aggregation of flake stacks due to the fluorinated surfaces. The PG10 sits between these GNPs in the outline of figure 6(b) and has a specific surface area of 10.7 m2 g−1. These properties result in the lowest classification accuracy, as the larger initial D peak and lower specific surface area make Raman less sensitive to changes in the defect density and aggregate morphology, respectively. Thus, the initial nature of the GNP affects how functionalisation is observed in its Raman response, and Cançado's 0D/1D disentangling outline is a useful means of predicting which peak features are most affected.

Accurate spectra classification using computer vision
Computer vision is a branch of ML that performs object detection and classification on unstructured data, typically images. Convolutional neural networks (CNNs) are deep-learning models that power a range of computer vision applications [36]. CNNs consist of a series of convolutional layers that extract both localised and high-level features from input data. Features are filtered and combined across a multitude of layers and used to classify the data. The training process encourages the network to identify informative features and learn how these features should be compared and combined [37]. This is a powerful tool for spectral classification, as it automates the process of feature extraction, a computer-generated alternative to the peak-fitting algorithm. Using spectral data directly, rather than a database of peak fits, is also advantageous, as producing the latter can be resource-intensive and inflexible, and may lead to the loss of potentially valuable information, which reduces the accuracy of classification [38]. However, unlike the classical ML approach used in section 2.2.1, the use of a deep neural network makes it infeasible to reverse engineer the abstract feature combinations that the model uses to classify the data. As such, this technique is well suited for rapid factory floor quality control, quickly validating whether a new material matches a previous baseline (on which the model has been trained).
As a proof-of-concept for this technology, a CNN model was trained on the Raman spectra collected for the PG5, PG10, and PG20 GNPs analysed in section 2.1.3. It was then tasked with classifying the different production batches for each material. The CNN used, summarised in schematic S1, consists of two convolution layers, in which multiple concurrent n × 1 filters move along the 1D spectrum. The dot product of each spectral segment and filter represents a 'convolution'. This convolved (2·n × 1) input is activated by a rectified linear activation function, which in turn is downsampled to half its size to reduce its spatial dimension. During training, the outputs of the convolutions pass through a dropout layer, whereby a fraction of the values is randomly ignored to minimise overfitting. Finally, they are fed into densely connected layers using a softmax activation function to convert the vector of values from the network into a probability score for each classification label.
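The forward pass described above can be illustrated in plain NumPy. Random weights stand in for trained parameters, and the filter count, spectrum length, and three-class output are illustrative choices, not the exact architecture of schematic S1:

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1-D convolution of spectrum x with each n-length filter."""
    n = kernels.shape[1]
    windows = np.lib.stride_tricks.sliding_window_view(x, n)
    return windows @ kernels.T            # shape: (len(x) - n + 1, n_filters)

def relu(z):
    return np.maximum(z, 0.0)             # rectified linear activation

def maxpool(z, size=2):
    trimmed = z[: (len(z) // size) * size]
    return trimmed.reshape(-1, size, z.shape[1]).max(axis=1)  # downsample by 2

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()                    # probability per classification label

# Toy forward pass: conv -> ReLU -> maxpool -> dense -> softmax.
rng = np.random.default_rng(0)
spectrum = rng.normal(size=256)           # stand-in for a 1D Raman spectrum
feat = maxpool(relu(conv1d(spectrum, rng.normal(size=(8, 5)))))
logits = feat.ravel() @ rng.normal(size=(feat.size, 3))  # dense layer, 3 grades
probs = softmax(logits)
```

A trained network would learn the filter and dense-layer weights by backpropagation; dropout, which applies only during training, is omitted from this inference-only sketch.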
The model was able to classify between the three GNPs with a training set accuracy of 81%, a test set accuracy of 86%, and an F1 score of 86%. The confusion matrix in figure 8(a) shows the classification accuracy for each material in the test set. Each GNP was classified with an accuracy of at least 81%, with most of the error arising from incorrect predictions of PG10, which is expected as it sits between the other two GNPs in terms of lateral size. This performance is indicative of intra-batch variation. However, in practice, this approach is likely to be used for inter-batch quality control. The performance of the model was therefore tested against GNPs from a different production batch, which was not used in model training. Inference was performed on 250 new Raman spectra from each GNP, and the resultant confusion matrix is shown in figure 8(b). All three GNPs were correctly classified, albeit with a lower overall accuracy of 69%; classifying PG5 resulted in the highest accuracy, at 78%. Both PG5 and PG20 were most often confused for PG10, but never for each other. These results are promising considering that the model was trained on a single batch of each material. Better performance could be achieved by increasing the size and scope of the training set to include multiple production runs to account for normal batch-to-batch variance.

Conclusion
The rapid growth in the industrial production and application of GNPs has led to new possibilities and challenges. GNPs share the high particulate aspect ratio and graphitic nature of few-layer graphene, enabling much of what has been learnt from the latter to be carried over. However, the heterogeneity of flakes in GNPs, their larger thicknesses, and the broad range of carbon nanomaterials labelled 'GNP' have thus far limited the application of Raman spectroscopy for characterising these 'bulk' nanomaterials. In this study, we build on previous work to extend the capabilities of Raman spectroscopy using a big data approach.
We present a peak-fitting algorithm capable of accurately and automatically deconvoluting GNP Raman spectra using user-defined constraints. This was further extended by automating the work of Cançado et al [18] to extract crystallinity and defect density values from the peak fittings. We demonstrate that this combination of algorithms enables the use of Raman spectroscopy for particle sizing in compatible GNP systems. Next, we outline a technique for using random forests to isolate the most informative Raman features for discerning between baseline and functionalised flakes. We find that the initial material properties affect which Raman features change the most post-functionalisation: in GNPs with low defect density and low specific surface area, Γ_D is the most important, whereas in GNPs with high defect density and high specific surface area, it is the I_2D/I_G ratio. Finally, we develop a methodology for factory-floor GNP quality control powered by CNNs. This proof-of-concept technique can discern between three GNP grades with 86% accuracy and correctly classify new material batches with up to 78% accuracy, demonstrating how computer vision and Raman spectroscopy, with its sensitivity to physical, electrical, and chemical changes, could be used by industrial GNP manufacturers to validate batch-to-batch consistency.

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://archive.materialscloud.org/record/2022.139.