Analysis of electrode arrangements for brain stroke diagnosis via electrical impedance tomography through numerical computational models

Objective. Rapid stroke-type classification is crucial for improved prognosis. However, current methods for classification are time-consuming, require expensive equipment, and can only be used in the hospital. One method that has shown promise as a rapid, low-cost, non-invasive approach to stroke diagnosis is electrical impedance tomography (EIT). While EIT for stroke diagnosis has been the topic of several studies in recent years, to date, the impact of electrode placements and arrangements has rarely been analyzed or tested, and then only in limited scenarios. Optimizing the location and choice of electrodes has the potential to improve performance and reduce hardware cost and complexity and, most importantly, diagnosis time. Approach. In this study, we analyzed the impact of electrodes in realistic numerical models by (1) investigating the effect of individual electrodes on the resulting simulated EIT boundary measurements and (2) testing the performance of different electrode arrangements using a machine learning classification model. Main results. We found that, as expected, the electrodes deemed most significant in detecting stroke depend on the location of the electrode relative to the stroke lesion, as well as the role of the electrode. Despite this dependence, there are notable electrodes used in the models that are consistently considered to be the most significant across the various stroke lesion locations and various head models. Moreover, we demonstrate that a reduction in the number of electrodes used for the EIT measurements is possible, given that the electrodes are approximately evenly distributed. Significance. Electrode arrangement and location are therefore important variables to consider when improving stroke diagnosis methods using EIT.


Introduction
Stroke is one of the most significant causes of disability and death worldwide (Johnson et al 2016). In the United States in 2019 alone, stroke was the fifth most common cause of death resulting in >150 000 deaths (Centers for Disease Control and Prevention (CDC) 2023), and each year, >795 000 people suffer from a stroke. Essential to an improved chance of survival and prognosis is early detection and appropriate treatment (Centers for Disease Control and Prevention (CDC) 2023). However, treatment is highly dependent on the type of stroke. The two primary types of stroke are ischemic stroke, which is caused by an obstruction of a blood vessel due to a thrombosis or embolism (Murphy and Werring 2020), and hemorrhagic stroke, which is caused by a rupturing of a blood vessel, resulting in blood pooling in the brain (Unnithan et al 2020). Standard treatment protocol for ischemic stroke is the use of thrombolytics, specifically recombinant tissue plasminogen activator (tPA) (Hacke et al 2008, Pan and Shi 2021). However, treatment must be done within a 4.5 h window from symptom onset (Messé et al 2016). Due to this short time span, up to 69% of ischemic stroke patients miss this time window (Eissa et al 2013). Further complicating this is that tPA can drastically worsen outcomes and potentially be deadly for hemorrhagic stroke patients (Demaerschalk et al 2016). As such, timeliness of treatment is only possible with rapid stroke-type classification.
Current standard practice for detecting and differentiating between stroke types is through imaging (Wardlaw et al 2004, Murphy and Werring 2020). However, existing imaging methods, such as magnetic resonance imaging (MRI) and computed tomography (CT), are both slow and expensive (Fred 2004). Therefore, there is a significant clinical need for a rapid stroke diagnostic system.
Electrical impedance tomography (EIT) has emerged as a promising imaging modality capable of potentially detecting and differentiating stroke type rapidly and at a low cost, particularly when coupled with machine learning (ML) (Malone et al 2015, Samorè et al 2017, McDermott et al 2018, McDermott et al 2019a, Agnelli et al 2020, McDermott et al 2020, Culpepper et al 2021, Candiani and Santacesaria 2022, Culpepper et al 2023). These studies have investigated the feasibility of EIT for this application, including exploring approaches for improved modeling and advanced ML classification models. However, there has been limited investigation on the actual electrode placements and arrangements, with most studies opting to use a ring of electrodes equidistantly placed in a single plane or an array based on electroencephalography (EEG) caps. Additionally, the impact of these electrode positions and arrangements on stroke diagnostic performance is rarely scrutinized or tested.
Research on electrode arrangement and placement with EIT algorithms in general demonstrates that the location of the electrode can impact the performance and quality of EIT measurements (Demidenko et al 2005, Adler et al 2011, Hyvonen et al 2014, Silva et al 2017, Smyl and Liu 2020). Evidence of this is found in other medical applications, such as in cardiac imaging (Noordegraaf et al 1996), lung ventilation (Karsten et al 2015), peripheral nerve modulation (Hope et al 2018, Ravagli et al 2019), and bladder monitoring (Schlebusch and Leonhardt 2013). Therefore, the lack of research into optimal electrode arrangements for stroke application is notable. This may be of more particular importance for stroke due to the need for rapid selection of the correct treatment approach, and thus the need for very high accuracy in classification. An informed electrode measurement protocol with fewer electrodes not only reduces the hardware system cost and complexity but also reduces the diagnosis time. However, to date, only two studies have been identified as exploring the effect of differing electrode placements with stroke detection. Specifically, in Tian et al (2023), the authors investigate EIT for post-operative stroke monitoring, a scenario in which it may not be possible to place electrodes evenly around the entirety of the head. As a result, the authors focus on partial coverage of only the front, back, or left/right side, each using the same number of electrodes (16). They, however, do not compare the performance of each individual electrode arrangement. In another study, Lee et al (2021) analyze the impact of individual electrodes with respect to specific lesion locations, but the use of only a sixteen-electrode ring limits the information that can be gathered from this study. Furthermore, this study does not consider ML classification.
Therefore, in this study, we seek to investigate the effect of electrode arrangement and placement using an array of 32 electrodes that encompass the full head, as was used in the most advanced clinical study on stroke and EIT to date (Goren et al 2018), and then seek to test these results using machine learning classification. We specifically examine the clinically relevant performance of classifying bleed versus non-bleed cases, studying uniquely the impact of: (i) specific electrodes, current-injection pairs, and voltage-measurement pairs; and (ii) different electrode arrangements that are 16-element subsets of the full 32 electrodes in an EEG configuration. Together, these studies provide valuable insight on the impact and importance of the choice of electrode positions and overall electrode arrangements on bleed classification accuracy for stroke disambiguation.
This paper is organized as follows. Section 2 reviews the overall methodology, including the models utilized in this study and the classification architecture. Section 3 analyzes the importance of individual electrodes on the EIT measurements, while section 4 investigates the importance of electrode arrangements by taking subsets of the full electrode array and using an ML classification model. In section 5, we discuss the results of the paper, noting its significance and limitations. Lastly, we conclude the study in section 6.

Methods
This section gives an overview of the general methods used in this paper. We first review the numerical models used and the methods to simulate the EIT data. We then review the overall classification model architecture. We build and expand on the work of Culpepper et al (2023), which provides a detailed justification of the construction of the numerical models and classification architecture, and we note any significant deviations from that work.

Review of terminology
Electrical impedance tomography (EIT) is an imaging modality capable of reconstructing images indicative of the internal electrical properties of a medium. This is of particular interest in applications in which a deviation from a prior state causes a change in the internal electrical properties, such as impedance and conductivity. With the differences in conductivity between healthy brain tissue, ischemic tissue, and blood (i.e. a hemorrhage), as shown in table 1, EIT has the potential to exploit these impedance variations to successfully detect and diagnose a stroke.
In a typical application of EIT, an array of electrodes is placed at the boundary of the medium, and a low-frequency sinusoidal current is injected between a selected pair of electrodes. Utilizing the remaining electrode pairs, voltage measurements are made, each of which is indicative of the impedance between those electrode pairs. By utilizing various combinations of current injection and voltage measurement pairs, it becomes possible to gain information on the internal impedance within the medium of interest. Measurements obtained from these electrode pairs are known as channels. All the selected combinations of current injection and voltage measurement electrode pairs make up a protocol, and together, all the measurements obtained utilizing a specified protocol constitute a frame of data. While it is possible to reconstruct an image using this frame of data, recent literature involving EIT for stroke diagnosis is increasingly moving towards machine-learning-driven methods, in which voltage measurements are directly fed as input features into an ML classification model, as proposed in McDermott et al (2018). This strategy avoids the low-resolution images typical of EIT, while still taking advantage of the large number of collected measurements.
Much of the current EIT literature focuses on demonstrating the feasibility of the technology through computational simulations. This allows for the creation of complex models that achieve a higher degree of realism compared to physical phantom models. Packages such as EIT and Diffuse Optical Tomography Reconstruction Software (EIDORS) (Polydorides and Lionheart 2002, Adler and Lionheart 2006) take these models and other user-defined parameters to first solve the forward problem, predicting the likely voltage measurements that will be obtained from the specific model, and then solve the corresponding inverse problem, estimating the underlying impedance based on the measurements. Details about the mathematical derivation and implementation of EIT in EIDORS can be found in Polydorides and Lionheart (2002).

Numerical models and data production
Anatomically and electrically realistic yet simplified numerical models were utilized in this study. A detailed and comprehensive description and justification for each of the model layers can be found in Culpepper et al (2023). We briefly review these numerical models.
The baseline head model was extracted from a stereolithography (STL) file, which was previously adapted from a polygon mesh (Grozney 2013). Finite element models (FEMs) composed of ∼270 000-298 000 voxels were generated from these STL files, using EIDORS (Polydorides and Lionheart 2002, Adler and Lionheart 2006) aided by Gmsh (Geuzaine and Remacle 2009) and Netgen (Schöberl 1997). A total of four layers were inserted into this baseline head model: an outer aggregate layer of scalp and skull, cerebrospinal fluid (CSF), the ventricles, and the brain. As done in McDermott et al (2018) and Culpepper et al (2023), a 3D mesh based on a structural MRI from Dilmen (2016) served as the basis for the baseline STL file of the brain, in addition to the baseline ventricle models. The CSF layer was created by scaling up the baseline brain model. In addition to the brain layer, these specific head layers were selected for inclusion due to their high resistivity or high conductivity, both of which are expected to significantly impact the resulting measurements. Moreover, the position of the ventricles directly within the brain layer may result in larger impacts than the layers external to the brain, particularly for deep lesions, making their inclusion important.
Variants of each model layer were considered in order to account for the vast amount of anatomical variation within the human population. Nine variations of the head model were constructed by scaling the baseline model up and down in each coordinate direction (resulting in six variants) and in all directions (resulting in two additional variants beyond the baseline). The baseline brain model was scaled ±5% in all dimensions, producing three total brain variants and thus three total CSF variants. Lastly, five variants of the ventricles were created by (1) scaling the baseline ±20% in all dimensions and (2) rotating the baseline in the clockwise and counterclockwise directions with respect to the coronal plane. The approximate sizes are included in table 1. In total, 135 sets of healthy head models (i.e. permutations) were generated by pairing a unique head model (nine in total) with a unique brain-CSF pair (three in total) and ventricle variant (five in total). Each model was scaled down by 40.40% in volume in order to reduce the number of voxels and thus reduce the computational complexity and simulation time, as was previously done in Lee et al (2021) and Culpepper et al (2023), among others. An example of one healthy head permutation is illustrated in figure 1(a). These scaling and rotation variations were derived from an intensive literature review, further described in Culpepper et al (2023).
Head models representing both an ischemic and a hemorrhagic stroke were then generated. To produce these stroke head model permutations, one of sixteen discrete lesions was inserted into each of the 135 healthy permutations, thus resulting in a total of 2160 stroke lesion permutations for each stroke lesion type (i.e. ischemic, hemorrhagic). These sixteen discrete lesions, as shown in figure 1(b), vary in size and location, thereby representing a diverse arrangement of potentially realistic stroke lesions. The lesions are approximately evenly distributed throughout the brain and are thus intended to be representative of the variation in lesion location within the brain. Further rationale for these specific lesions can be found in Culpepper et al (2023). The lesion sizes are included in table 1.
Conductivities of the various head layers and lesions vary across frequency. Table 1 shows the conductivity assigned to each head layer at 5 Hz, which was the frequency of the injected current applied in this study. A frequency of 5 Hz was selected because the conductivity difference between blood (i.e. the result of a hemorrhagic stroke) and brain tissue, as well as between blood and an ischemia, is largest at this frequency. Additionally, 5 Hz measurements were also obtained clinically in Goren et al (2018), enabling potential explorations of this framework and comparisons with the data collected in that study. The conductivities of CSF and the ventricles are derived from Latikka and Eskola (2019) and Baumann et al (1997), while the conductivity of blood is from Hasgall et al (2018). Using values from Hasgall et al (2018), the conductivity of brain is estimated as an aggregation of gray and white matter, and the conductivity of the scalp/skull aggregate is estimated from a combination of cancellous bone, cortical bone, and scalp conductivities. Lastly, since ischemia is the result of restricted blood supply to the brain, we estimated its conductivity as a ∼5% reduction of the brain conductivity. Past studies in rats (Shu et al 2022) and canines (Kim et al 2008) found a 30%-40% and 10%-14% reduction, respectively, in ischemic tissue conductivity compared to that of the brain. However, using similar reasoning to Culpepper et al (2023), a worst-case scenario of only a 5% reduction is assumed, particularly due to the time it takes for an ischemia to develop. Though not entirely realistic, constant isotropic conductivities for the model layers have been typically assumed in previous literature. While studies such as Samorè et al (2017), Agnelli et al (2020), and Candiani and Santacesaria (2022) attempt to account for this model error by adding random variation to the assigned conductivities, McDermott et al (2019b) found that assumed conductivity errors have little impact on the resulting machine learning performance. As such, as done in McDermott et al (2020) and Culpepper et al (2023), we do not specifically add conductivity variation to account for this error.
A total of 32 circular electrodes with 6 mm radii were meshed onto each head permutation with additional local refinement. The electrode arrangement was modeled after that used in Goren et al (2018), one of the most comprehensive experimental works in a clinical setting to date. To further enhance the realism of the models from Culpepper et al (2023), a 1 kΩ electrode-skin contact impedance was incorporated into the models, a contact impedance used in the simulations from Malone et al (2014) and shown to be realistically achievable by Goren et al (2018).
Boundary voltage measurements were simulated for all of the various head permutations using EIDORS (Adler and Lionheart 2006). In order to model the effect of noise, random Gaussian white noise was added to each data frame independently to account for measurement noise, potential noise or variation in the individual layer conductivities, and other potential sources of noise and error. Zero-mean additive white Gaussian noise has been commonly used in past EIT stroke simulation studies, such as McDermott et al (2018), Candiani and Santacesaria (2022), and Culpepper et al (2023). Additionally, past analyses of the signal-to-noise ratio in EIT hardware systems have found that the electronic noise of the system can be modeled as zero-mean additive white Gaussian noise (Murphy et al 2016), and probabilistic interpretations for studies investigating and optimizing EIT algorithms, such as Adler and Lionheart (2006) and Dardé et al (2013), have found this noise type to be suitable for their analyses. For this study, two signal-to-noise ratios (SNRs) were considered: (1) 55 dB to simulate a realistic scenario, as shown in Goren et al (2018), and (2) 120 dB to simulate an idealistic scenario. The 120 dB SNR was used for the analysis of the individual electrodes (see section 3), while the 55 dB SNR was used for classification purposes (see section 4).
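As an illustration of the noise model described above, the following minimal sketch adds zero-mean white Gaussian noise to a data frame at a prescribed SNR. It assumes that the SNR is defined as the ratio of the mean signal power of the frame to the noise power, which is one common convention and is not stated in the text; the function name `add_awgn` and the placeholder voltages are hypothetical and are not part of EIDORS or of the study's code.

```python
import numpy as np

def add_awgn(frame, snr_db, rng):
    """Return a noisy copy of `frame` with zero-mean white Gaussian noise.

    Assumption: SNR (dB) = 10*log10(mean signal power / noise power).
    """
    frame = np.asarray(frame, dtype=float)
    signal_power = np.mean(frame ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=frame.shape)
    return frame + noise

rng = np.random.default_rng(0)
# Placeholder voltages only; a real frame would come from the forward solver.
clean = rng.uniform(-1e-3, 1e-3, size=29760)
# 100 frames that differ only in the added noise vector (realistic 55 dB case).
noisy_frames = [add_awgn(clean, 55.0, rng) for _ in range(100)]
```

Replacing `55.0` with `120.0` would give the idealistic case used for the individual-electrode analysis.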
A 5 Hz sinusoidal current with an amplitude of 45 μA was utilized for this study, adhering to the current parameter for a 5 Hz frequency utilized in Goren et al (2018) and considered to be within the safety limits specified in IEC 60601-1 (Commission et al 2002). Voltage measurements were obtained with respect to a common electrode located just above the nasion fiducial point, as done in Goren et al (2018). Therefore, current injection channels were composed of two of the 32 electrodes, while voltage measurement channels were composed of one of the 32 electrodes and this common electrode. In later sections, only the unique voltage measurement electrode will be mentioned to describe the measurement channels. In a departure from Goren et al (2018) and Culpepper et al (2023), both of which used a reduced number of current injection and voltage measurement channel combinations for the protocols, we uniquely simulate measurements at all possible channel combinations, resulting in a total of 29 760 measurements in a single data frame (i.e. a full data frame). Note that, as typically done in EIT studies, voltage measurements were not obtained from the current injection electrodes. Thus, for each of the (32 × 31 = 992) current injection pairs, 30 voltage measurements were collected. A total of 100 data frames, differentiated by the added noise vector, were generated using this protocol for each healthy and stroke lesion head permutation. Figure 2 illustrates an average of these 100 data frames and demonstrates the structure of the data.
In order to create the electrode subset data frames used in section 4, we extracted from the full data frames only those measurements that were possible using the electrodes in that subset (i.e. both current injection electrodes and the voltage measurement electrode were contained within the electrode subset). In doing so, a reduced data frame equivalent to taking measurements using the electrodes in the subset directly was created. This process was repeated for each set of measurements and electrode subset analyzed.
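The size of the full data frame follows directly from the protocol described above and can be checked with a short enumeration. This is a sketch only: the electrode indices 0 to 31 and the tuple layout `(inj_a, inj_b, meas)` are illustrative assumptions, and the common reference electrode is treated as separate from the 32 indexed electrodes.

```python
from itertools import permutations

N_ELECTRODES = 32  # electrodes available for injection and measurement

def full_protocol(electrodes):
    """List (inj_a, inj_b, meas) for every ordered injection pair and every
    measurement electrode that is not one of the two injecting electrodes."""
    electrodes = list(electrodes)
    channels = []
    for inj_a, inj_b in permutations(electrodes, 2):  # 32 x 31 ordered pairs
        for meas in electrodes:
            if meas != inj_a and meas != inj_b:       # 30 remaining electrodes
                channels.append((inj_a, inj_b, meas))
    return channels

protocol = full_protocol(range(N_ELECTRODES))
print(len(protocol))  # 29760
```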
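The subset extraction can be sketched as a boolean mask over the channels of the full protocol. The channel ordering, the helper names, and the example subset (every other electrode index) are assumptions made for illustration; the study's actual subsets and data layout may differ.

```python
from itertools import permutations
import numpy as np

def protocol_channels(n_electrodes):
    """All (inj_a, inj_b, meas) combinations with meas not an injecting electrode."""
    return [(a, b, m)
            for a, b in permutations(range(n_electrodes), 2)
            for m in range(n_electrodes) if m not in (a, b)]

def subset_mask(channels, subset):
    """True where both injection electrodes and the measurement electrode
    all belong to the chosen electrode subset."""
    keep = set(subset)
    return np.array([a in keep and b in keep and m in keep
                     for a, b, m in channels])

channels = protocol_channels(32)
subset = range(0, 32, 2)              # hypothetical 16-electrode subset
mask = subset_mask(channels, subset)

full_frame = np.zeros(len(channels))  # placeholder for a simulated full frame
reduced_frame = full_frame[mask]      # 16 x 15 x 14 = 3360 measurements
```

Because the mask depends only on the protocol, it can be computed once per subset and reused for every frame.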

Classification model architecture
The classification architecture, which is based on that in Culpepper et al (2023), relies on a nested cross-validation (CV) structure, as shown in the flow chart in figure 3. In a typical application of cross-validation, the ML model goes through multiple training and testing runs, with a new subset of the input observations withheld for testing on each iteration. In doing so, it reduces the risk of overfitting and better generalizes the model. The model that achieves the best average performance is then selected as the final model. A single iteration of the flow chart in figure 3 constitutes a single run. By testing the ML classifier on combinations of previously unseen head models and lesions while still training the classifier on a diverse array of head models and lesions, this architecture closely simulates a realistic scenario.
In this specific study, the ML classifier utilized is a support vector machine (SVM) with a linear kernel. A linear SVM generates an optimal hyperplane in the same space as the input features. As a result, the squares of the feature weights are presumed to correspond to the importance of each feature in classification, as done in Mladenić et al (2004). This enables us to evaluate the importance of different electrode combinations on classification.
Here, the two classes that the classifier is tasked to separate are the hemorrhagic lesion head models (i.e. bleed) and the other head models, including the ischemic lesion head models (i.e. clot) and the healthy head models. This binary classification problem is considered to be the most clinically relevant, as ruling out a hemorrhagic stroke is crucial before the administration of tPA. As detailed in section 1, the administration of tPA to a hemorrhagic stroke patient can be potentially deadly. Therefore, we choose to focus on this specific classification problem.
For a given run, a random subset of the 100 noisy EIT data frames for each head model permutation and type (i.e. normal/healthy, bleed/hemorrhagic, clot/ischemic) is selected to serve as input into the classification architecture. The number of measurements (i.e. features) in each data frame depends on the number of electrodes used in the protocol. The number of observations totals 8640, with 4320 for the hemorrhagic class and 4320 for the other class, which is divided evenly between ischemic and healthy data frames. To achieve this, two noisy frames out of 100 for all the 2160 hemorrhagic stroke head permutations, one frame out of 100 for all the 2160 ischemic stroke head permutations, and sixteen frames out of 100 for all the 135 healthy head permutations were randomly selected. As such, care was taken to ensure a balanced dataset, which is important for reducing any bias. While the distribution of the different classes is not entirely realistic, particularly in emergency room settings, it is important to train the classification model using a balanced dataset to avoid the model biasing towards the most prevalent class. If an imbalanced but clinically accurate dataset were used, techniques such as resampling would be utilized to combat the effects of the imbalance and would thus achieve similar results.
The nested CV structure is composed of an outer and an inner CV loop. The outer CV loop utilizes a leave-p-out CV (LpOCV), with eight total folds. This loop simulates a realistic test scenario, as each isolated test dataset contains all the frames from eight of the 135 head model permutations (including both healthy and lesion variants), all the frames from two of the sixteen lesions, and an equal number of healthy frames from the same model permutation. In doing so, the final classifier is tested using previously unseen head models and lesions. The head model permutations and lesions for the test dataset are randomly selected without repeats between the folds for each run. To select the optimal hyperparameters for the final classifier, k-fold CV with nonstratified folds is implemented in the inner CV loop, with k = 10.
In order to fairly compare the performance of the full dataset with datasets extracted from electrode subsets, which contain fewer features, two methods are employed. The first is by randomly sampling the features such that the number of features sampled from the full electrode set is equal to the total number of features possible with the electrode subset. The second is by randomly sampling the electrodes such that the number of electrodes sampled from the full electrode set is equal to the total number of electrodes in the subset. This ensures that the number of features as input into the ML classifier is consistent and thus does not influence the classification results. Note that this random sampling is unique to each outer fold but is consistent within the same inner fold, thus ensuring that a given classifier is trained and tested with the same chosen features/electrodes. To account for potential skewing of the data due to the specific head model permutations and lesions selected for the test dataset, multiple runs are completed. In total, three runs are completed, thus resulting in 24 isolated test datasets. The overall results are then extracted from statistics calculated across each of these 24 isolated test datasets.
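The class balance described above can be verified with a small sampling sketch. Only the counts (2160, 2160, 135 models; 2, 1, 16 frames per model) come from the text; the integer model identifiers, the dictionary layout, and the function name `draw_frames` are hypothetical.

```python
import random

FRAMES_PER_MODEL = 100

# label: (number of head permutations, frames drawn per permutation)
PLAN = {
    "hemorrhagic": (2160, 2),
    "ischemic": (2160, 1),
    "healthy": (135, 16),
}

def draw_frames(plan, rng):
    """Randomly pick frame indices (without replacement) for every model."""
    selection = {}
    for label, (n_models, n_frames) in plan.items():
        selection[label] = [
            (model, frame)
            for model in range(n_models)
            for frame in rng.sample(range(FRAMES_PER_MODEL), n_frames)
        ]
    return selection

selection = draw_frames(PLAN, random.Random(0))
n_bleed = len(selection["hemorrhagic"])                           # 2160 * 2
n_other = len(selection["ischemic"]) + len(selection["healthy"])  # 2160 + 135 * 16
print(n_bleed, n_other)  # 4320 4320
```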
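The two feature-matching strategies can be sketched as follows, assuming the all-combination protocol of section 2.2, in which n electrodes yield n(n - 1)(n - 2) features. The function names and the use of a seeded NumPy generator are illustrative choices, not the study's implementation.

```python
import numpy as np

N_FULL, N_SUBSET = 32, 16

def n_features(n_electrodes):
    """Features in an all-combination protocol: ordered injection pairs
    times the remaining measurement electrodes."""
    return n_electrodes * (n_electrodes - 1) * (n_electrodes - 2)

def sample_feature_indices(rng):
    """Method 1: draw as many random features from the full frame as the
    electrode subset would provide."""
    return np.sort(rng.choice(n_features(N_FULL), size=n_features(N_SUBSET),
                              replace=False))

def sample_electrodes(rng):
    """Method 2: draw a random electrode set of the same size as the subset."""
    return np.sort(rng.choice(N_FULL, size=N_SUBSET, replace=False))

rng = np.random.default_rng(0)   # in practice, one draw per outer fold
feature_idx = sample_feature_indices(rng)
electrode_idx = sample_electrodes(rng)
```

Both draws yield a classifier input of 3360 features for a 16-electrode comparison, so any difference in performance is not attributable to the feature count.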

Classification performance metrics
Three different metrics are used to evaluate the classification performance: the accuracy, the positive predictive value (PPV), and the negative predictive value (NPV). These metrics were calculated from the number of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The equations for each metric are included in equations (1)-(3).
Accuracy = (TP + TN)/(TP + FP + TN + FN) (1)
PPV = TP/(TP + FP) (2)
NPV = TN/(TN + FN) (3)
In applications such as stroke diagnosis, it is crucial to evaluate metrics other than the accuracy, as a misdiagnosis in one class may not be as clinically damaging as a misdiagnosis in the other class. In this application, a TP indicates that a hemorrhagic stroke was correctly identified as a hemorrhagic stroke, a FP indicates that either an ischemic stroke or a healthy model was misclassified as a hemorrhagic stroke, a TN indicates that an ischemic stroke or a healthy model was correctly classified as not a hemorrhagic stroke, and a FN indicates that a hemorrhagic stroke was misclassified as not one. While both misclassifications are undesirable, a FN is considered to be worse. In practice, a FN would indicate to clinicians that it is acceptable to administer tPA to the patient when in reality, delivering tPA to a hemorrhagic stroke patient is potentially deadly. While a FP may result in an ischemic patient not receiving beneficial medication, there are other treatment alternatives. As such, even though a high accuracy, PPV, and NPV are desirable, a high NPV is clinically most important, as it indicates that there is a small number of FNs.
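The three metrics, in their standard definitions, can be computed from the confusion counts as in the sketch below. The function name and the example counts are hypothetical and do not correspond to results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, PPV, and NPV from confusion counts (bleed = positive class)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),   # fraction of predicted bleeds that are bleeds
        "npv": tn / (tn + fn),   # fraction of predicted non-bleeds that are non-bleeds
    }

# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
print(metrics)  # accuracy 0.85, ppv 0.9, npv 0.8
```

In this made-up example, the NPV is the lowest of the three values because of the 20 false negatives, which is the clinically most costly error type in this application.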

Analysis of individual electrodes and results
In this section, we analyze the impact of individual electrodes by evaluating the electrodes that are most sensitive to the changes in impedance caused by a stroke lesion. Since we desire to determine the electrodes that are truly the most sensitive, as opposed to those affected by noise, the idealistic SNR of 120 dB is applied to the data frames used in this section. We first describe the methods by which we determine these most sensitive electrodes, which are termed 'dominant contributors'. Following this, we present and discuss these dominant contributors.

Determination of dominant contributors
The concept of dominant contributing electrodes has been proposed previously in Lee et al (2021).Depending on the location of the lesion, different current injection and voltage measurement electrode combinations in the protocol resulted in a larger absolute voltage measurement change between the lesion and the healthy head model, given the same baseline model.We suspected that electrode combinations that detect larger differences may be more influential in correctly diagnosing and classifying stroke and thus be more important to include in the protocol, as these electrode combinations may correspond to higher weighted and thus more important features in the ML model.Therefore, we sought to investigate the location of these dominant contributors for each lesion location and if there were dominant contributors that were common across all the lesion locations.
The method for determining the dominant contributors is shown in figure 4 and closely matches that in Lee et al (2021).The boundary voltage measurements were first obtained for each healthy and lesion head model through the use of EIDORS (Polydorides and Lionheart 2002) and (Adler and Lionheart 2006) to calculate the forward solution.For each of the 135 healthy/baseline head models (described in section 2.2), the absolute difference between its data frame and that of each of its sixteen associated lesion models was calculated, thus the insertion of the lesion would be the only the cause for any voltage measurement change.The top five injection and measurement channels that detected the largest absolute difference were then obtained.This was completed for all 100 data frames generated for each model.Then, across all the 100 frames, the number of appearances of each electrode among these top five channels was summed.These electrodes were the overall dominant contributors.We restricted the count to only appearances as a current injection or voltage measurement electrode to obtain the current injection dominant contributors and the voltage measurement dominant contributors, respectively.In doing so, we were able to analyze the specific role that some of these dominant contributors assume.
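The counting procedure above can be sketched as follows. The sketch assumes that frame i of the healthy model is compared with frame i of the lesion model, which the text does not specify, and it uses a toy six-channel protocol with k = 2 in place of the full protocol with the top five channels. The function names are hypothetical.

```python
from collections import Counter
import numpy as np

def top_channels(healthy_frame, lesion_frame, k=5):
    """Indices of the k channels with the largest absolute voltage change."""
    diff = np.abs(np.asarray(lesion_frame) - np.asarray(healthy_frame))
    return np.argsort(diff)[::-1][:k]

def count_appearances(channels, healthy_frames, lesion_frames, k=5):
    """Sum electrode appearances among the top-k channels over all frames,
    overall and split by role (current injection vs voltage measurement)."""
    overall, injection, measurement = Counter(), Counter(), Counter()
    for healthy, lesion in zip(healthy_frames, lesion_frames):
        for idx in top_channels(healthy, lesion, k):
            inj_a, inj_b, meas = channels[idx]
            injection.update([inj_a, inj_b])
            measurement.update([meas])
            overall.update([inj_a, inj_b, meas])
    return overall, injection, measurement

# Toy example: 4 electrodes, 6 channels given as (inj_a, inj_b, meas).
channels = [(0, 1, 2), (0, 1, 3), (2, 3, 0), (2, 3, 1), (0, 2, 1), (0, 2, 3)]
healthy = [np.zeros(6)]
lesion = [np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7])]
overall, injection, measurement = count_appearances(channels, healthy, lesion, k=2)
print(overall.most_common())
```

In the toy example, the two largest changes occur on channels (0, 1, 3) and (2, 3, 1), so electrodes 1 and 3 each appear twice overall and are the only measurement dominant contributors.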
In order to better extract patterns across all the head variations, we summed the total number of appearances of each electrode as an overall, current injection, and voltage measurement dominant contributor for each of the sixteen lesion locations across all 135 baseline head models.Understanding the location of dominant contributors with respect to the lesion location may be informative in the potential of this technology in localizing the lesion.One method by which we visualized this data is through the generation of a 3D head plot, which was first employed in Lee et al (2021).On the 3D head plot, all 32 electrodes (barring the ground electrode) are plotted.For each dominant contributing electrode combination, a solid line was drawn between the current injection channels, with a dashed line drawn from the midpoint of the solid line to the voltage measurement channel.This was done for each unique dominant contributing electrode combination for a given baseline head model and lesion location.The width of the lines was directly related to the number of appearances of that electrode combination in the top five channels across the 100 data frames.A simplified example is shown at the bottom of figure 4, and an example using the data is included in figure 5.
To obtain patterns across the head variations and the lesion locations, we summed the electrode appearances across all the lesion locations and baseline head models. Statistical significance tests were performed to determine the overall, current injection, and voltage measurement dominant contributors across all the head and lesion models. To accomplish this, one-tailed paired t-tests were completed between the electrode appearance counts. Electrodes with a statistically significant number of appearances greater than 24 other electrodes constituted the general dominant contributors. If some electrodes are consistently dominant contributors, this may indicate that they are more influential in detecting the lesion than the others. It may therefore be more important to include these electrodes in the full protocol, which in turn informs the importance of electrode placement for this application. Electrodes with a statistically significant number of appearances less than 24 other electrodes were also noted, as these electrodes are likely the least informative and thus may not be necessary in the full protocol.
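The selection rule above can be illustrated with a short sketch. Several details are assumptions made only for this example: that the appearance counts are paired per head/lesion model, that "greater than 24 other electrodes" means at least 24 significant pairwise wins, and that significance is decided by comparing the paired t statistic with a caller-supplied one-tailed critical value `t_crit`. The function names are hypothetical. In practice a statistics library (for example `scipy.stats.ttest_rel` with `alternative="greater"`) would return p-values directly.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t statistic for the mean of (a - b)."""
    d = [x - y for x, y in zip(a, b)]
    sd = stdev(d)
    if sd == 0:
        # All paired differences are identical; the sign of the mean decides.
        m = mean(d)
        return math.inf if m > 0 else (-math.inf if m < 0 else 0.0)
    return mean(d) / (sd / math.sqrt(len(d)))

def general_dominant(counts, t_crit, min_wins=24):
    """Electrodes whose counts are significantly greater than those of
    at least `min_wins` other electrodes (one-tailed paired comparison).

    counts: dict mapping electrode -> list of appearance counts, one
        entry per model (assumed pairing, see lead-in).
    """
    selected = []
    for electrode, a in counts.items():
        wins = sum(
            1
            for other, b in counts.items()
            if other != electrode and paired_t(a, b) > t_crit
        )
        if wins >= min_wins:
            selected.append(electrode)
    return selected
```

The least informative electrodes could be found the same way by swapping the roles of `a` and `b` in the comparison.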

Results
In this section, we will first describe the results of the dominant contributors with respect to each lesion before detailing the dominant contributors across all the head models and lesion locations. We will distinguish between the overall, current injection, and voltage measurement dominant contributors.
Across the sixteen lesion locations and sizes, clear patterns began to emerge. One example for a single lesion that is consistent with the other lesions is depicted in figure 5. For a given lesion location, as shown in figures 5(a) and (b), only a few electrodes clearly emerge as the dominant contributors. These electrodes are highly dependent on the location of the lesion. Generally, the current injection dominant contributors are located on opposite sides of the head such that the solid line in the 3D head plot (i.e. the straight line drawn between the two electrodes) intersects the lesion. Voltage measurement dominant contributors tend to differ from the current injection ones, as they tend to align closely with the electrodes that are located closest to the lesion. These patterns are demonstrated by figures 5(c) and (d), and they hold for both the ischemic and hemorrhagic lesion models. The observation with the current injection dominant contributors is consistent with prior research that shows that diametrically opposed electrode stimulation patterns can achieve improved performance over adjacent ones (Adler et al 2011 and Culpepper et al 2021).
The depth of the lesion also impacts the consistency of the dominant contributors across the head variations. For shallow locations, the dominant contributors across the 135 baseline models are consistent. However, for deeper locations, the variance is higher, even at an SNR of 120 dB. This likely indicates that deeper locations may be harder to correctly diagnose, which is expected.
When analyzing the patterns across all the baseline head model permutations and lesions, distinct patterns are also visible in the electrodes that have the most and fewest appearances as dominant contributors, as shown in figure 6 for a hemorrhagic stroke model. In terms of the overall and current injection dominant contributors, the outer electrodes tended to appear most as dominant contributors, while the inner, more central electrodes tended to appear the least. This aligns with our expectations based on the lesion-specific results, as we would expect the lines between outer electrodes to be more likely to intersect lesions of all depths. The voltage measurement electrodes with the most appearances as dominant contributors tended to be towards the back, while those with the fewest tended to be towards the front of the head. This, along with the slight skewing of the overall and current injection electrodes with the fewest appearances towards the left part of the head, may be a result of the choice of lesion locations. Since there are only sixteen lesions, though they are fairly comprehensive in location and size, any slight biases may be reflected in the results. We suspect that any bias may come from the distribution of lesions relative to the actual electrode protocol. However, the fact that there is a clear pattern in all the dominant contributors is promising. Similar results are obtained for the ischemic stroke models. By relying on a ranking system instead of the extent of the difference to obtain the dominant contributors, we expect the results to be more robust to the specific distribution of this lesion set. The value of the difference will likely be more influenced by both the specific location and the size of the lesion, as well as the specific locations of the electrodes.
These results demonstrate that not all electrodes are equally impacted by the insertion of a given stroke lesion. As expected, the magnitude of the effect is dependent on both the location of the electrode and that of the stroke lesion. Despite this dependence, there are electrodes in the protocol that appear more consistently as dominant contributors than others, illustrating the importance of considering the electrode positions. This could help inform the electrode layout designs in future arrays, including for specific applications like the postsurgical stroke monitoring case described in Tian et al (2023).

Analysis of electrode arrangements and results
In this section, we use the knowledge of the general dominant contributors across all the head models and lesions to inform our selection of electrode subsets. We describe each of our chosen subsets before describing the results of each when used as input into an ML classifier described in section 2.3. We then expand on our analysis of electrode importance by analyzing the weights assigned to each electrode combination. Note that since we desire the classification results to accurately reflect realistic situations, the data frames with a realistic SNR of 55 dB are used in this section.

Selection of electrode subsets
Three unique electrode subsets were created for comparison. To ensure fairness when used as input into the ML classifier, each subset comprised sixteen electrodes. Sixteen electrodes were chosen because many commercial EIT systems are available with either 16 or 32 electrodes. Since 32 electrodes are already used in the full electrode protocol, reducing the number of electrodes to any number greater than sixteen would still require the purchase of the 32-electrode system, which is more costly. Further, a 16-electrode system would enable faster data collection and analysis.
The electrode subsets we selected are termed the ring subset, central subset, and adapted 10-20 subset, with each illustrated in figure 7. The ring subset is intended to correspond to the electrodes with the most appearances as dominant contributors, while the central subset is intended to reflect the electrodes with the fewest appearances. We included both as a point of comparison, since we expected the ring subset to perform better than the central subset. The adapted 10-20 subset is a relatively evenly distributed subsampled version of the full 32-electrode set. Moreover, these electrodes were selected because each one is included in the International 10-20 System, which is commonly used for electroencephalograms (EEG). Hardware systems with these electrode placements are readily available due to the prevalence of EEG. Note that electrode 14 was not included in either the central or the adapted 10-20 subset, even though it fits the definition for both. This specific electrode was excluded to restrict the number of electrodes to sixteen while still maintaining symmetry.
As described in section 2.3, to fairly compare the performance of these electrode subsets to the full electrode set, we randomly sampled the features and the electrodes from the full electrode set to maintain the same number of features used as input into the ML classification model. Sixteen electrodes correspond to 3360 unique electrode combinations and thus measurements. Therefore, either 3360 measurements or 16 electrodes were randomly sampled. We term the former the random feature set and the latter the random electrode set.
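The feature counts quoted in the text agree with n(n-1)(n-2) ordered combinations of two injection electrodes and one measurement electrode: 16 x 15 x 14 = 3360 and 32 x 31 x 30 = 29 760. The sketch below assumes this enumeration, and assumes electrodes are simply labeled 1 to 32, in order to illustrate the two random baselines. The helper name `electrode_combinations` and the fixed seed are hypothetical, and the actual measurement protocol may define combinations differently.

```python
import random
from itertools import permutations

def electrode_combinations(electrodes):
    """All ordered (injection_a, injection_b, measurement) triples of
    distinct electrodes (assumed enumeration, see lead-in)."""
    return list(permutations(electrodes, 3))

full_set = list(range(1, 33))                      # 32 electrodes, labels assumed
all_features = electrode_combinations(full_set)    # 32 * 31 * 30 combinations
n_subset_features = len(electrode_combinations(full_set[:16]))  # 16 * 15 * 14

rng = random.Random(0)  # fixed seed only so the example is repeatable

# Random feature set: sample combinations directly from the full protocol.
random_feature_set = rng.sample(all_features, n_subset_features)

# Random electrode set: sample 16 electrodes, then keep all their combinations.
random_electrodes = sorted(rng.sample(full_set, 16))
random_electrode_set = electrode_combinations(random_electrodes)
```

Both baselines contain the same number of features as any fixed 16-electrode subset, but only the random feature set can draw on all 32 electrodes.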

Classification results
The data frames from each of the various electrode subsets served as input into the ML classification model described in section 2.3. The results of this across 24 runs are included in table 2 and figure 8. Paired t-tests were also conducted to test for statistical significance.
As expected, the full random feature subset performed the best in accuracy, NPV, and PPV. Since each random feature sample is likely a reflection of the entire 32-electrode set, the input data would likely contain more information than any 16-electrode set. More information should result in better performance. Its large standard deviation and range are indicative that the specific selection of features can heavily influence the performance.
Notably, the adapted 10-20 subset performs the best out of all the subsets and achieves a performance very similar to the full random electrode subset across all metrics. Additionally, there is no statistically significant difference between the adapted 10-20 subset and the full random feature subset with respect to the accuracy and the NPV, the metric that is arguably the most important. This illustrates that a reduction from 32 to 16 electrodes does not significantly impact the classification performance, particularly if the sixteen electrodes are evenly distributed.
Surprisingly, the difference in performance between the ring and central subsets is not statistically significant for any of the metrics. Despite the lack of significance, across the 24 runs, the ring subset achieves a better accuracy and NPV but a poorer PPV. Note that there is no statistically significant difference between the adapted 10-20 and ring subsets with respect to the accuracy and the NPV and between the adapted 10-20 and central subsets with respect to the PPV. This appears to indicate that the ring subset may result in fewer misclassifications of hemorrhagic stroke as ischemic/healthy, while the central subset may result in fewer misclassifications of ischemic/healthy as hemorrhagic. In this way, the ring subset is better at detecting hemorrhagic stroke, though it may result in more false positives. This would align with our expectation, as the ring subset aligns with the electrodes that have the most appearances as dominant contributors. In future work, it may be interesting to explore taking measurements from the full 32-electrode set and making a classification decision based on running the classifier separately on multiple subsets, as is done with ensemble ML methods. In this way, it may be possible to find the optimal combination of subsets (e.g. the ring and central subsets). We note that we attempted to study arrangements of 32 electrodes in this work, but our computational capabilities (some of the most advanced in the world thanks to our advanced computing center) were exceeded when trying to perform machine learning with such a large number of features. Across all the subsets, the PPV is consistently higher than the NPV, thus indicating that correctly diagnosing hemorrhagic stroke is a more difficult problem than correctly diagnosing the other class. This poses an issue, as it is clinically better to achieve a higher NPV. Additional testing and optimizing of hyperparameters to prioritize this metric may be necessary to achieve this. Overall, given that the accuracies are within 70%-80%, the problem of stroke diagnosis and classification is a challenging one, and optimizing electrode arrangements alone may not be sufficient to make it a clinically optimal system prototype.
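For reference, the three metrics discussed above can be computed from predicted and true labels as in the sketch below. It assumes that hemorrhagic stroke is the positive class and ischemic/healthy the negative class, which is inferred from the interpretation of false positives in the text rather than stated explicitly in this section. The function name and label strings are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive="hemorrhagic"):
    """Return (accuracy, ppv, npv) for a two-class problem.

    Any label other than `positive` is treated as the negative class
    (assumed here to be the combined ischemic/healthy class).
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1   # hemorrhagic correctly detected
            else:
                fp += 1   # ischemic/healthy called hemorrhagic
        else:
            if truth == positive:
                fn += 1   # hemorrhagic missed
            else:
                tn += 1   # ischemic/healthy correctly identified
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return accuracy, ppv, npv
```

Under this convention a low NPV means that a prediction of ischemic/healthy is often wrong, i.e. hemorrhagic cases are being missed.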

Analysis of significant electrodes
We then chose to analyze the significant electrodes specifically for the adapted 10-20 subset, as this subset included an even distribution of electrode locations and achieved the best performance among the subsets (excluding those created from the full electrode set). This was achieved by extracting the weights assigned to each feature (i.e. voltage measurement by a specific electrode combination) by each of the final classifiers in the 24 runs. We squared the weights, thus removing the influence of the sign, and extracted the top 10 electrode combinations with the largest squared weights. We summed up the number of appearances of each electrode across all 24 runs. We also counted the number of appearances as the current injection channels, as well as the voltage measurement channel.
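The weight analysis described above can be sketched as follows, assuming one weight vector per run (for a linear classifier, one coefficient per feature) and a list of `(injection_a, injection_b, measurement)` tuples aligned with the features. The function name and data layout are hypothetical and serve only to illustrate the counting.

```python
from collections import Counter

def influential_electrodes(weights_per_run, channels, top_k=10):
    """Count electrode appearances among the top-k squared weights per run.

    weights_per_run: list of weight vectors, one per classifier run.
    channels: list of (injection_a, injection_b, measurement) tuples,
        aligned with the entries of each weight vector (assumed layout).
    """
    overall, injection, measurement = Counter(), Counter(), Counter()
    for weights in weights_per_run:
        squared = [w * w for w in weights]  # squaring removes the sign
        top = sorted(range(len(squared)), key=squared.__getitem__, reverse=True)[:top_k]
        for idx in top:
            inj_a, inj_b, meas = channels[idx]
            injection.update((inj_a, inj_b))
            measurement.update((meas,))
            overall.update((inj_a, inj_b, meas))
    return overall, injection, measurement
```

Because the weights are squared first, a strongly negative weight ranks as highly as a strongly positive one of the same magnitude.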
The results of this are included in figure 9. Only the electrodes that accounted for at least 15% of the total possible number of appearances across the 24 runs are noted. From this, it is clear that only a few electrodes are highly influential in classification. In general, it appears that electrodes toward the back are more important. With a few exceptions, these electrodes also align with the electrodes that would be considered the outer electrodes of the 10-20 subset. This may indicate that these results match those of the dominant contributors, suggesting that outer electrodes are more influential. However, additional studies would be required to fully understand these effects.
While the electrodes in figures 9(b)-(d) account for >60% or >75% of the total number of appearances as a highly weighted electrode, depending on whether four or five electrodes are marked respectively, it would likely not be sufficient to rely on only these electrodes for classification. The electrodes plotted may be used with other current injection electrodes and/or voltage measurement electrodes not included as one of these influential electrodes. However, the plots demonstrate which electrodes are consistently within the current injection-voltage measurement electrode combinations that are assigned the highest squared weights by the classifier. As such, these are expected to be more influential in classification and thus more important to include in the protocol.

Discussion
Based on the results, it is evident that the selection of electrode placement can influence the voltage measurements and classification performance. As such, the actual electrode placement should be selected with careful consideration. While the location of the electrodes most sensitive to changes in voltage due to the lesion (i.e. dominant contributors) is highly dependent on the location of the lesion, the outer electrodes consistently emerged as the dominant contributors when analyzing across all the lesions. However, only retaining these electrodes for classification does not achieve the best performance. In fact, an even distribution of electrodes is preferred, though the electrodes considered to be most significant for classification aligned with the outer electrodes as well. We suspect that an even distribution performs better despite this because it is able to best account for and generalize to all lesion locations. Outer electrodes may have difficulty with deeper and more central lesions, particularly if they are small.
The important electrodes for each role (i.e. current injection, voltage measurement) also vary in terms of location. Generally, for each specific lesion, dominant contributing current injection electrodes tended to be ones that are diametrically opposed to each other such that the lesion is between the two electrodes of the pair. Dominant contributing voltage measurement electrodes tended to be those closest to the lesion. This indicates that the role of the electrode is important to consider as well.
The results also demonstrate promise that a reduction in the number of electrodes while still maintaining a similar performance is achievable, particularly if the electrodes are evenly distributed across the head (e.g. the adapted 10-20 subset). This illustrates that a full 32-electrode set may bear a decent amount of redundancy. This is significant, as a reduction in the number of electrodes decreases the cost of the system and, more importantly, reduces the time of setup and measurement, thus delivering more rapid results and making faster treatment possible.

Limitations
While the results are promising, there are several limitations to this study. The first is the limited number of lesion locations and sizes. Since there are only sixteen lesions, the results of this study may be biased toward this specific set, particularly in the location of the lesions. However, the lesions span a variety of sizes and locations and include more lesions than the majority of studies done in this field. The lesion locations were deliberately placed such that they were approximately evenly distributed throughout the brain, and a post hoc analysis of the lesion coordinates found that they are not statistically significantly different from a uniform distribution. Since the location of the lesion is likely the most influential factor for a specific lesion, the results obtained from a set of lesions that are approximately uniformly distributed throughout the brain would likely be able to generalize. Additionally, in the results, we sought to extract general patterns instead of fixating on specific electrodes in order to account for any bias presented by the use of this lesion set. Lastly, we used a ranking system when determining the dominant contributors rather than a system dependent on the extent of the measurement difference. In doing so, the extracted general patterns are likely more robust to any slight variations in the lesion size and location distribution. In the future, accounting for the actual statistical distribution of strokes clinically in the selection of lesion locations may further improve these results.
The second is that we chose to focus on the electrode placement and number. While the number of electrodes is likely to most impact the setup time and system cost, another variable that can also be analyzed is the specific selection of electrode combinations used for current injection and measurement. Reducing the number of measurements can also reduce measurement time; thus, this is another area that should be further analyzed.
The third is the use of the linear SVM. A linear kernel may perform worse than other kernel types that we have not considered here. However, this study does not seek to achieve the best possible performance compared to other studies; instead, the focus is a comparison in performance between the different electrode subsets selected in this study. Additional studies should be completed using these subsets and other SVM kernels to investigate if the results from this study hold for other SVM kernels and ML models.

Conclusion
The overall electrode placement and arrangement significantly impact both the EIT boundary voltage measurements and the performance of ML-based classification. In terms of voltage measurements, outer electrodes tend to be most sensitive to changes in impedance caused by a stroke lesion for both hemorrhagic and ischemic cases. Despite this, relying on only outer electrodes for classification does not achieve the best performance. Instead, an even distribution of electrodes not only outperforms the other electrode subsets but is capable of achieving performance similar to that of an electrode set double its size. Based on this study, we conclude that optimizing the electrode arrangement can improve the state-of-the-art EIT stroke classification algorithms and is an area worth further investigation.

Figure 1. Numerical models used for this study. (a) An example of a healthy head model permutation with four layers: the scalp/skull aggregate (transparent), CSF (gray), the brain (pink), and the ventricles (purple). (b) An illustration of the size and location of all sixteen lesions, with the 5 ml lesions in yellow, the 10 ml lesions in green, and the 30 ml lesions in red. The baseline image for both the head model and the lesion locations is adapted from Culpepper et al (2023).

Figure 2. Example average data frame from one of the 135 baseline models with one of the 16 lesions inserted at a simulated SNR of 120 dB. (a) The difference between the healthy and hemorrhagic and healthy and ischemic head models across the 29 760 measurements obtained from all possible electrode combinations averaged across the 100 trials in red and blue respectively. (b) A partial frame of the first 120 measurements. (c) The four current-injection pairs that correspond to the first 120 measurements. Using each current-injection pair, voltages are measured using the remaining 30 electrodes. The baseline image of the electrode positions is drawn from figure 2(a) in Goren et al (2018).

Figure 3. Machine learning pipeline used in this study. (*) Note that for the full dataset, the same electrodes/features are sampled from the training and test data within each outer fold (m). The selection of these electrodes/features is random and differs per outer fold. The pipeline is adapted from Culpepper et al (2023).

Figure 4. Pipeline to determine the dominant contributors. Note that the 3D head plot schematized in this figure is a simplified version with only nine electrodes, for ease in visualization. The baseline image for the head models is adapted from Culpepper et al (2023).

Figure 5. Example of the dominant contributors for a single lesion location and size. The lesion is a 30 ml hemorrhagic lesion located towards the left hemisphere. (a) 3D head plot for the specific lesion (red sphere) and baseline head model with an overlay of a 3D head model. The endpoints of the solid line indicate the current injection dominant contributors, while the endpoint of the dashed line indicates the voltage measurement dominant contributor. The width of the line directly correlates to the number of appearances of that electrode as a dominant contributor across all the data frames. (b) Percentage of each electrode as an overall dominant contributor for this specific lesion across all baseline head models. (c) and (d) Top five closest electrodes and electrodes with the most appearances as a current injection and voltage measurement electrode respectively for this specific lesion across all the baseline head models. The approximate size and location of the lesion are overlaid on these plots as a red circle. The baseline image of the electrode positions is drawn from figure 2(a) in Goren et al (2018).

Figure 6. Electrodes with the most and least appearances as an (a) overall, (b) current injection, and (c) voltage measurement dominant contributor across all hemorrhagic lesions and head models. Similar patterns are observed across all ischemic lesion models. One-tailed paired t-tests were completed, and electrodes with appearances that were statistically significant with more than that of 24 other electrodes were plotted. The baseline image of the electrode positions is drawn from figure 2(a) in Goren et al (2018).

Figure 7. Electrode subsets selected for this study: (a) ring subset, (b) central subset, and (c) adapted 10-20 subset. Each electrode subset contains sixteen electrodes. The baseline image of the electrode positions is drawn from figure 2(a) in Goren et al (2018).

Figure 9. Electrodes from the adapted 10-20 subset with the highest squared weights across the 24 runs. These should correlate with the electrodes that are most influential for classification. (a) Adapted 10-20 subset for comparison. (b)-(d) Influential electrodes in terms of overall, current injection, and voltage measurement respectively. The baseline image of the electrode positions is drawn from figure 2(a) in Goren et al (2018).

Table 1. Conductivity (at 5 Hz) and the size of head layers.