Algorithmic detection of sleep-disordered breathing using respiratory signals: a systematic review

Background and Objective. Sleep-disordered breathing (SDB) poses health risks linked to hypertension, cardiovascular disease, and diabetes. However, the time-consuming and costly standard diagnostic method, polysomnography (PSG), limits its wide adoption and leads to underdiagnosis. To tackle this, cost-effective algorithms using single-lead signals (like respiratory, blood oxygen, and electrocardiogram) have emerged. Despite respiratory signals being preferred for SDB assessment, a lack of comprehensive reviews addressing their algorithmic scope and performance persists. This paper systematically reviews 2012–2022 literature, covering signal sources, processing, feature extraction, classification, and application, aiming to bridge this gap and provide future research references. Methods. This systematic review followed the registered PROSPERO protocol (CRD42022385130), initially screening 342 papers, with 32 studies meeting data extraction criteria. Results. Respiratory signal sources include nasal airflow (NAF), oronasal airflow (OAF), and respiratory movement-related signals such as thoracic respiratory effort (TRE) and abdominal respiratory effort (ARE). Classification techniques include threshold rule-based methods (8), machine learning models (13), and deep learning models (11). The NAF-based algorithm achieved the highest average accuracy at 94.11%, surpassing 78.19% for other signals. Hypopnea detection sensitivity with single-source respiratory signals remained modest, peaking at 73.34%. The TRE and ARE signals proved to be reliable in identifying different types of SDB because distinct respiratory disorders exhibited different patterns of chest and abdominal motion. Conclusions. Multiple detection algorithms have been widely applied for SDB detection, and their accuracy is closely related to factors such as signal source, signal processing, feature selection, and model selection.


Introduction
Sleep-disordered breathing (SDB) is a group of disorders that disrupt normal breathing patterns and quality during sleep, manifesting as intermittent occurrences of respiratory apnea or hypopnea. According to the definition provided by the American Academy of Sleep Medicine (AASM) (Iber C et al 2017), apnea is defined as a drop of more than 90% from the baseline airflow lasting at least 10 s. Hypopnea is defined as a drop of more than 30% from the baseline airflow lasting for at least 10 s, accompanied by either a desaturation of more than 3% from the pre-event baseline or an arousal from sleep. SDB can be classified into three types based on the underlying causes: obstructive sleep apnea (OSA), central sleep apnea (CSA), and complex sleep apnea (MIX), which is a combination of both OSA and CSA, each with different respiratory symptoms. Overall, SDB can have severe impacts on physical and mental function, resulting in endothelial dysfunction, oxidative stress, inflammation, glucose dysregulation, and brain and white matter pathological changes (Daulatzai 2015, Polsek et al 2018, Liguori et al 2021). Furthermore, it has been associated with various disorders, including hypertension, diabetes, metabolic syndrome, osteoporosis, and cardiovascular diseases (Floras 2015, Liguori C et al 2016, Sharma and Culebras 2016, Reutrakul and Mokhlesi 2017, Ryan 2017). Additionally, cognitive impairment and Alzheimer's disease (AD) have also been linked to SDB (Ju et al 2013, Polsek et al 2018, Shi et al 2018). Moreover, SDB can increase mortality in patients with heart failure, stroke, or coronary artery disease.
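The event definitions quoted above can be written as a small decision rule. The following is a minimal sketch, not an implementation of the AASM scoring manual: the function name `classify_event` and its parameters are hypothetical, and the desaturation and arousal inputs are assumed to come from other channels, since a respiratory signal alone does not provide them.

```python
def classify_event(drop_fraction, duration_s, desaturation_pct=0.0, arousal=False):
    """Label one candidate event with the thresholds quoted in the text.

    drop_fraction    -- airflow reduction relative to baseline (0.0 to 1.0)
    duration_s       -- event duration in seconds
    desaturation_pct -- oxygen desaturation from the pre-event baseline (assumed input)
    arousal          -- whether an arousal accompanied the event (assumed input)
    """
    if duration_s < 10.0:
        return "normal"  # both definitions require at least 10 s
    if drop_fraction > 0.90:
        return "apnea"
    if drop_fraction > 0.30 and (desaturation_pct > 3.0 or arousal):
        return "hypopnea"
    return "normal"


print(classify_event(0.95, 12.0))                        # apnea
print(classify_event(0.50, 15.0, desaturation_pct=4.0))  # hypopnea
```

The rule also illustrates why hypopnea is harder to detect from a single respiratory channel: the second branch depends on information that such a channel does not carry.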
The prevalence of SDB overall remains uncertain. A recent review of 17 studies from 16 countries focusing on OSA estimated that approximately 936 million adults aged 30-69 years worldwide have mild to severe OSA, with around 425 million adults identified as having moderate to severe OSA (Benjafield et al 2019). Despite the high prevalence, patients and their partners often overlook the symptoms of SDB, such as snoring, gasping, or choking, without recognizing their association with the condition. Consequently, patients typically do not seek medical attention for these symptoms. Moreover, the diagnosis of SDB presents challenges as it primarily occurs during sleep, making traditional clinical evaluation methods less effective. The gold standard method for SDB diagnosis is polysomnography (PSG) (Graco et al 2018), which involves utilizing specialized equipment in a dedicated laboratory with a minimum of 22 electrodes and 11 channels to continuously collect, measure, and analyze signals such as electroencephalography (EEG), electrocardiography (ECG), respiration, blood oxygen levels, and other relevant physiological parameters. Therefore, PSG is time-consuming, labor-intensive, and expensive with limited patient adherence. Overall, the underrecognized symptoms, limited diagnostic methods, and the challenges of PSG collectively contribute to underdiagnosis of SDB, with an estimated over 80% of SDB cases lacking accurate diagnosis and timely treatment (Jin and Sanchez-Sinencio 2015, Jaiswal et al 2017).
To overcome these challenges, researchers have proposed alternative methods for detecting SDB using various single-lead signals, such as respiratory signals (Van Steenkiste et al 2019), blood oxygen (Deviaene et al 2020), snoring (Hu et al 2022), and ECG (Shen et al 2021). Among these, respiratory signals, including nasal airflow (NAF), oronasal airflow (OAF), and respiratory movement-related signals such as thoracic respiratory effort (TRE) and abdominal respiratory effort (ARE), are the preferred signal source for SDB detection according to the recommendations of the AASM. Over the past ten years, multiple studies (Nakano et al 2007, Makarie Rofail L et al 2010, Masa et al 2011, Crowley et al 2013, Morgenstern et al 2013, Masa et al 2014) have consistently shown the effectiveness and accuracy of single-channel respiratory signal detection in identifying and diagnosing SDB, and algorithms based on different methods, such as threshold rules, machine learning (ML) and deep learning (DL), have been developed. While several reviews have been conducted on SDB algorithms, including ECG-based algorithms (Faust et al 2016), respiratory and blood oxygen fusion-based algorithms (Alvarez-Estevez and Moret-Bonillo 2015, Uddin et al 2018), and multiple signals-based algorithms (Mendonca et al 2019, Serrano Alarcon et al 2021), there is currently a gap in the literature regarding a systematic review specifically focusing on algorithms for SDB identification using respiratory signals and/or respiratory movement-related signals. Thus, our review aims to fill this gap by conducting a systematic literature review of respiratory signal-based algorithms for SDB detection during the recent decade (2012–2022). The objectives are to compare the advantages and disadvantages of various algorithms based on respiratory signals and methods, and to summarize and analyze current research trends in algorithm development, thus providing a comprehensive reference guide for new researchers developing SDB recognition algorithms.

Methods
This systematic review was conducted according to the protocol registered with PROSPERO (International Prospective Register of Systematic Reviews, CRD42022385130, https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42022385130). The search strategy protocol followed guidelines from the PRISMA statement (Liberati et al 2009) and its extension PRISMA-S (Rethlefsen et al 2021), and the following eligibility criteria were applied to define the structure of the current systematic review.

Search strategy
We conducted a comprehensive literature review of papers published between 2012 and 2022 by searching several databases, including Web of Science, IEEE Xplore, and PubMed, as well as the literature cited in the included articles and related journals. The keywords employed in the search were 'SDB detection algorithm', 'algorithm AND sleep Apnea', 'Respiration analysis AND apnea', and 'Apnea AND deep'. Articles were selected by two reviewers independently (blinded to each other's assessment), who applied the criteria to each title and abstract and then assessed the full text. Divergent opinions were resolved through group discussion until consensus was reached. All processes were performed in EndNote, a bibliographic software package used for managing references.

Selection criteria
The following criteria were used to select eligible studies in this review: (1) the signal must be a single-channel respiratory signal or respiratory movement-related signal, (2) a complete computer-based detection and/or prediction system for SDB must have been proposed, (3) detailed system evaluation data with comparisons to the gold standard (PSG) must have been included, which can verify system validity, and (4) preliminary or definitive results must have been formed. These criteria also applied to studies obtained from reference tracking. Algorithms based on multi-source combined signals (e.g. respiration combined with blood oxygen) and indirect signals (e.g. respiratory variability extracted from ECG signals) were excluded. In addition, adults and children are two distinct entities in SDB detection. Sleep architecture, respiratory physiology, apnea definition, and apnea severity in adults differ from those in pediatric subjects (Alsubie and BaHammam 2017, Kljajić et al 2017). Therefore, the algorithms for SDB detection in pediatric subjects are quite different and usually need special consideration or criteria to obtain better detection results. Accordingly, this review focuses only on the adult population. Figure 1 presents the flowchart of the systematic review design and study selection. Overall, a total of 342 studies were screened, of which 43 studies met the inclusion criteria and were eligible for data extraction.

Data extraction
We conducted detailed data abstraction from a total of 43 studies. The article information (title, author, publication date, journal), method (database, main decision, classification method) and evaluation metrics (accuracy, specificity, etc) were abstracted and recorded in an Excel table. Python packages (Pandas, NumPy, and Matplotlib) were used for data visualization. We classified and sorted the results in ascending order according to the type of algorithm (rule threshold-based, machine learning, deep learning), and further eliminated 11 papers with duplicated content or the same approach from the same author, including preprints or early-stage conference outcomes. A total of 32 papers were included in the final analysis and discussion.
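For readers unfamiliar with the extracted evaluation metrics, they follow from the confusion matrix of algorithm output against PSG scoring. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from any reviewed study.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (algorithm vs. reference scoring)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all epochs labeled correctly
        "sensitivity": tp / (tp + fn),     # share of true events that were detected
        "specificity": tn / (tn + fp),     # share of normal epochs labeled normal
    }


# Hypothetical counts, for illustration only.
metrics = evaluation_metrics(tp=80, fp=10, tn=90, fn=20)
print(metrics)
```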

Results and discussion
Based on the findings of the literature, the algorithms for SDB detection vary by signal source, signal processing technique, feature extraction method, and choice of model. Therefore, we present and discuss the results from these four aspects.

Respiratory signal acquisition sensors
According to the AASM guidelines (Iber C et al 2017), the primary sensors for measuring respiratory signals include oronasal thermal airflow sensors (mainly for detecting apnea) and nasal pressure sensors (mainly for detecting hypopnea). Another alternative sensor option is the thermal sensor. Thermal airflow sensors measure the temperature of the air flowing through the nasal and/or oral passages, while thermal sensors measure the temperature of the skin around the nose and mouth. These sensors are utilized to acquire respiratory signals by detecting changes in airflow and temperature associated with episodes of apnea and hypopnea. For measuring respiratory effort signals, commonly employed sensors include respiratory inductive plethysmography belts (RIP belts) and polyvinylidene fluoride belts (PVDF belts). RIP belts are devices worn around the chest and abdomen to measure respiratory movement-related signals, producing RIPsum (the sum of the chest and abdomen RIP belt signals) and RIPflow (the rate of airflow into and out of the lungs). Additionally, RIPsum can replace primary sensors (e.g. oronasal thermal airflow or nasal pressure sensors) when airflow signals are not available or unreliable (Farre et al 2004, Kaniusas and Kaniusas 2012). In addition, chest and abdominal signals related to respiration can also be acquired by PVDF belts, called PVDFsum.
Recent developments in sensor technology have introduced new methods for measuring respiratory signals during sleep, such as radar and imaging techniques. The radar method uses either Doppler radar or impulse radio ultra-wideband (IR-UWB) radar. Doppler radar employs the Doppler effect, which is the frequency shift of a signal due to the relative motion of the transmitter and receiver, to detect movements of the chest and abdomen (Lee et al 2014). On the other hand, IR-UWB radar employs wide bandwidth and high-frequency carrier waves to detect minute movements in the chest and abdomen (Kang et al 2020). Both methods have demonstrated promising results in the studies. However, Doppler radar is susceptible to interference from random body movements and encounters challenges with null-point detection (Sun and Matsui 2015, Park et al 2006). Similarly, IR-UWB radar indirectly determines the breathing state from chest movements, which may lead to discrepancies between the recorded data and the actual breathing state. Moreover, even small patient movements can significantly disrupt the measured signals, such as respiration rate and heart rate (Park et al 2019). To overcome these limitations, researchers have proposed the use of imaging techniques, such as Infrared Optical Gas Imaging (IR-OGI) (An et al 2022), which has been demonstrated to have a sensitivity of 96.0%. Additionally, the three-dimensional (3D) time-of-flight (TOF) camera (Coronel et al 2019, Coronel et al 2020) has exhibited similar functionality to RIPsum.
The signals obtained from these sensors can be roughly classified into NAF, OAF, and respiratory movement-related signals such as TRE and ARE. Among these signals, the NAF signal has been extensively studied in the literature, with a total of 18 studies (Selvaraj and Narasimhan 2013, Guijarro-Berdiñas et al 2012, Avci and Akbas 2015, Ciolek et al 2015, Gutierrez-Tobal et al 2012a, 2012b, 2013, 2016, Lee et (Koley and Dey 2013a, 2013b, Kim et al 2019, ElMoaqet et al 2020b). The detailed list of signals and their corresponding literature sources can be found in table 1. The NAF signal has received widespread attention in research, potentially for two primary reasons. Firstly, compared to the OAF signal, the NAF signal can be obtained using common and easy-to-use sensors such as oronasal thermal airflow sensors or nasal pressure sensors. In contrast, acquiring the OAF signal requires simultaneous measurement of airflow from both the nasal and oral cavities, making sensor selection and placement more complex. Additionally, most PSG devices routinely capture the NAF signal. Secondly, TRE and ARE signals are susceptible to interference from body movements and other artifacts. This is likely another reason why the NAF signal has received relatively more attention in research.

Signal preprocessing
Respiratory signals are typically measured from airflow or by monitoring chest and abdomen movement. However, they can be affected by various types of noise, such as baseline wander, power line interference, muscle artifacts, and electrode motion artifacts (Sankar et al 2010). This can mask the fine features of the signal and lead to false diagnosis. To reduce power line interference, a notch filter at the mains frequency (50 or 60 Hz) was used in several studies (Ziarani and Konrad 2002, Łęski and Henzel 2005, Nassi et al 2022). For other types of noise, various filters such as Butterworth, finite impulse response (FIR) and infinite impulse response (IIR) filters have been applied over different frequency ranges within 0.01-30 Hz, as summarized in table 2. In general, there is no standardized frequency range for these signals in the studies. The potential reason, as stated by Várady et al (2002), could be that airflow signals are specific to the applied sensor and can change during measurements because of sensor or patient movements.
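As an illustration of power line suppression, the sketch below implements a generic second-order IIR notch filter (the widely used audio-EQ "cookbook" biquad form) in plain Python. It is not the filter of any cited study: the sampling rate of 250 Hz, the quality factor of 30, the synthetic 0.25 Hz breathing wave, and all function names are assumptions made for the example.

```python
import math


def notch_coefficients(f0_hz, fs_hz, q=30.0):
    """Normalized biquad notch coefficients centered at f0_hz (assumed Q = 30)."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a


def biquad_filter(x, b, a):
    """Direct-form I filtering of sequence x with second-order coefficients."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


fs = 250.0  # assumed sampling rate in Hz
t = [n / fs for n in range(int(20 * fs))]
breathing = [math.sin(2 * math.pi * 0.25 * ti) for ti in t]  # synthetic respiration
noisy = [s + 0.5 * math.sin(2 * math.pi * 50.0 * ti) for s, ti in zip(breathing, t)]

b, a = notch_coefficients(50.0, fs)
cleaned = biquad_filter(noisy, b, a)

half = len(t) // 2  # skip the filter's start-up transient
residual_before = rms([n - s for n, s in zip(noisy[half:], breathing[half:])])
residual_after = rms([c - s for c, s in zip(cleaned[half:], breathing[half:])])
print(residual_before, residual_after)
```

A narrow notch leaves the low-frequency respiratory band essentially untouched, which is why it is typically combined with a separate band-pass stage for baseline wander and high-frequency noise.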
In addition to these common noises, motion artifacts are the most challenging to deal with in respiratory signals. These noises are both inevitable and unpredictable, arising from the subject's movements during sleep, such as head movement and body shifting (Liu et al 2013). Therefore, relying solely on simple filtering techniques with a fixed cutoff frequency is inadequate for addressing this problem. Within the reviewed literature, four articles provided comprehensive explanations of motion artifact processing. Among them, two articles employed adaptive filtering methods that dynamically adjust the cutoff frequency based on specific movement conditions. Keenan and Wilhelm utilized a least mean square (LMS) adaptive filter with a triaxial accelerometer (ACC) signal as the reference signal (Keenan and Wilhelm 2005). Similarly, Fedotov et al introduced ACC as a reference signal and implemented adaptive filtering based on the Wiener-Hopf RLS algorithm (Fedotov et al 2018). However, these methods introduce new reference signals that interfere with respiratory signals. Subsequently, Liu et al proposed an algorithm based on mutual information and power criteria for automatically selecting appropriate intrinsic mode functions (IMFs) to remove tissue artifacts and reconstruct the respiratory signal (Liu et al 2013). However, complex algorithms increase computing power costs and hardware consumption. To address this challenge, da Rosa et al proposed an energy-efficient Haar level 5 (Haar-5) wavelet transform architecture, which can save 38.19% of circuit area and reduce power dissipation and energy by 38.26% compared to other architectures (da Rosa et al 2021).
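The adaptive filtering idea can be illustrated with a textbook LMS noise canceller. This is a minimal sketch under simplifying assumptions, not the method of Keenan and Wilhelm or Fedotov et al: a single synthetic reference channel stands in for the accelerometer, the artifact is a scaled copy of that reference, and the tap count, step size, and function names are hypothetical.

```python
import math


def lms_cancel(primary, reference, n_taps=4, mu=0.02):
    """Textbook LMS noise canceller.

    primary   -- respiration contaminated by a motion artifact
    reference -- signal correlated with the artifact (e.g. one accelerometer axis)
    Returns the error signal, which serves as the cleaned respiration estimate.
    """
    weights = [0.0] * n_taps
    history = [0.0] * n_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        artifact_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - artifact_estimate
        weights = [w + mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


fs = 25.0  # assumed sampling rate in Hz
t = [n / fs for n in range(int(60 * fs))]
respiration = [math.sin(2 * math.pi * 0.25 * ti) for ti in t]  # synthetic breathing
reference = [math.sin(2 * math.pi * 1.7 * ti) for ti in t]     # synthetic movement
primary = [r + 0.8 * m for r, m in zip(respiration, reference)]

cleaned = lms_cancel(primary, reference)
half = len(t) // 2  # evaluate after the weights have converged
error_before = rms([p - r for p, r in zip(primary[half:], respiration[half:])])
error_after = rms([c - r for c, r in zip(cleaned[half:], respiration[half:])])
print(error_before, error_after)
```

The step size `mu` trades convergence speed against distortion of the respiration itself, which is one reason a fixed configuration rarely suits all recordings.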
In conclusion, the different noise sources originating from various sensors and experimental environments make it difficult to scientifically evaluate the effectiveness of specific preprocessing methods. Furthermore, limited literature exists in this domain. In practical applications, researchers should analyze and select appropriate signal preprocessing methods based on the measurement technique, noise sources, and energy considerations to achieve optimal results.

Feature extraction
Through the literature review, respiratory signal features can be broadly classified into three categories: time domain features, frequency domain features, and nonlinear features. Commonly used methods for feature extraction include statistical analysis, Fourier transform, and wavelet transform. Among the reviewed literature, 12 articles presented comprehensive explanations of features and extraction methods, which are outlined in table 3. In the subsequent paragraphs, we will provide a detailed exploration of each category, discussing their respective advantages and disadvantages.
(1) Time-domain feature Time-domain analysis typically involves analyzing statistical properties and morphological features from a set of N discrete-time samples within a specified time window (Lutus 2008, Prahallad 2011). Time-domain analysis of respiratory signals and respiratory effort signals can yield significant statistical and morphological features, which can be obtained by observing the waveform and performing simple calculations, such as the mean, median, standard deviation, skewness, and kurtosis of the signal. Additionally, morphological features, such as zero-crossings, peaks, valleys, signal slope, and signal amplitude, can also be derived from the waveform. However, due to the nonlinear and non-stationary nature of the respiratory signal and respiratory effort signal, their values vary with time, resulting in differences in their statistical properties at different time points (Fong et al 2013). Consequently, it becomes challenging for time-domain features to fully capture the characteristics of the signal, even when utilizing a random time window. To overcome this limitation, other signal analysis techniques like frequency domain analysis can be employed to acquire more detailed and accurate features.
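The statistical and morphological features listed above can be computed with a few lines of code. The sketch below is illustrative only: the function name, the choice of population (rather than sample) moments, and the synthetic 40 s window sampled at 10 Hz are assumptions, and the reviewed studies may define individual features differently.

```python
import math
import statistics


def time_domain_features(window):
    """Basic statistical and morphological features of one signal window."""
    n = len(window)
    mean = statistics.fmean(window)
    std = statistics.pstdev(window)
    centered = [v - mean for v in window]
    return {
        "mean": mean,
        "median": statistics.median(window),
        "std": std,
        "skewness": sum(c ** 3 for c in centered) / (n * std ** 3),
        "kurtosis": sum(c ** 4 for c in centered) / (n * std ** 4),
        # sign changes of the mean-removed signal
        "zero_crossings": sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0),
        # samples larger than the previous one and not smaller than the next one
        "peaks": sum(1 for p, c, q in zip(window, window[1:], window[2:])
                     if c > p and c >= q),
        "amplitude": max(window) - min(window),
    }


fs = 10.0  # assumed sampling rate in Hz
window = [math.sin(2 * math.pi * 0.25 * n / fs + 0.3) for n in range(400)]
features = time_domain_features(window)
print(features)
```

For this synthetic window of ten regular breaths, the peak count recovers the number of breaths; during an apnea the amplitude and peak count within a window would drop.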
(2) Frequency-domain feature Frequency-domain analysis offers a comprehensive view into the respiratory system. By decomposing the signal into its frequency components, various features such as the frequency spectrum, spectral density, power spectral density (PSD), and spectral peak can be analyzed to identify the presence and characteristics of respiratory abnormalities, including apnea and hypopnea. The primary methods for frequency transformation include Fourier Transform and Wavelet Transform. and FFT, respectively, along with the nonparametric estimation Welch's method, which is very suitable for nonstationary signals, to obtain the PSD. Wavelet transformation is commonly performed using Haar, Symmlet, and Daubechies wavelets to obtain coefficient features for classification purposes. For instance, Fontenla-Romero et al used the Symmlet wavelet family (Symmlet of order: O = 7) to obtain coefficient features and achieved favorable apnea classification results (Fontenla-Romero et al 2005). Maali et al decomposed the data using three levels of Haar wavelet transformation to extract wavelet coefficients, which were then used as inputs for an SVM classifier (Maali and Al-Jumaily 2012, Maali and Al-Jumaily 2011). Guijarro-Berdiñas et al applied discrete wavelet transformation and computed the average value of 16 wavelet coefficients for SDB detection (Guijarro-Berdiñas et al 2012). Avci et al employed the Daubechies wavelet to decompose the airflow signal and extract statistical features for classification purposes (Avci and Akbas 2015).
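To make the wavelet-coefficient features concrete, the sketch below performs a three-level Haar decomposition in plain Python. It illustrates the general technique only and is not the pipeline of Maali and Al-Jumaily: the function names and the toy input are hypothetical, and the input length is assumed to be divisible by 2 raised to the number of levels.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def haar_decompose(signal, levels=3):
    """Multi-level Haar decomposition; details[0] is the finest scale."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def energy(values):
    return sum(v * v for v in values)


toy_signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical samples
approx, details = haar_decompose(toy_signal, levels=3)

# Coefficient energies per level are one simple feature vector for a classifier.
feature_vector = [energy(d) for d in details] + [energy(approx)]
print(feature_vector)
```

Because the transform is orthonormal, the coefficient energies sum to the energy of the input, so the feature vector describes how signal energy is distributed across scales.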
These feature extraction methods effectively reveal the frequency domain features of respiratory signals or respiratory movements. However, SDB is caused by a combination of anatomical upper airway predisposition and changes in neural activation mechanisms (Salisbury and Sun 2007), and the resulting nighttime airflow signals contain rich nonlinear dynamical features which can provide important information for diagnosis and treatment. Therefore, nonlinear features are also commonly considered in feature extraction.
(3) Nonlinear feature In the realm of nonlinear features, several measures are commonly employed to extract the nonlinear characteristics of respiratory signals, including Central Tendency Measure (CTM), Lempel-Ziv complexity, and Approximate Entropy (ApEn) (Alvarez et al 2010, Marcos et al 2012, Gutierrez-Tobal et al 2012a, 2013, 2016). CTM quantifies the variability degree within a time series (Cohen et al 1996), Lempel-Ziv complexity measures complexity in finite sequences (Lempel and Ziv 1976), and ApEn assesses the irregularity of a time series by assigning higher values to higher irregularity (Pincus 1991). The three nonlinear features can essentially serve as representatives of the nonlinear characteristics of respiratory signals.
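Two of these measures are sketched below using their commonly stated definitions: ApEn as the difference of the mean log match frequencies for templates of length m and m + 1, and CTM as the fraction of points of the second-order difference plot lying within a radius of the origin. The default tolerance of 0.2 times the standard deviation, the radius values, and the synthetic test series are assumptions for illustration, not parameters reported by the reviewed studies.

```python
import math


def approximate_entropy(series, m=2, r=None):
    """ApEn(m, r) with self-matches counted; r defaults to 0.2 * std (assumed)."""
    n = len(series)
    if r is None:
        mean = sum(series) / n
        r = 0.2 * math.sqrt(sum((v - mean) ** 2 for v in series) / n)

    def phi(dim):
        vectors = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in vectors:
            matches = sum(1 for b in vectors
                          if max(abs(p - q) for p, q in zip(a, b)) <= r)
            total += math.log(matches / len(vectors))
        return total / len(vectors)

    return phi(m) - phi(m + 1)


def central_tendency_measure(series, rho):
    """Fraction of second-order difference plot points within radius rho."""
    inside = 0
    for x0, x1, x2 in zip(series, series[1:], series[2:]):
        if math.hypot(x2 - x1, x1 - x0) < rho:
            inside += 1
    return inside / (len(series) - 2)


regular = [math.sin(2 * math.pi * n / 20) for n in range(200)]

chaotic, x = [], 0.4
for _ in range(200):  # logistic map as a stand-in for an irregular series
    x = 3.99 * x * (1.0 - x)
    chaotic.append(x)

apen_regular = approximate_entropy(regular)
apen_chaotic = approximate_entropy(chaotic)
print(apen_regular, apen_chaotic)
print(central_tendency_measure(regular, rho=0.5))
```

The regular series yields the lower ApEn, in line with the description of ApEn assigning higher values to more irregular signals.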
Furthermore, researchers have also explored diverse combinations of methods to extract complex nonlinear features from respiration signals and respiration movement signals. For example, Kaimakamis et al used the monofractal scaling features obtained from detrended fluctuation analysis (DFA), combined with the largest Lyapunov exponent (LLE), and incorporated them into ApEn to construct a comprehensive complexity index (Kaimakamis et al 2016). However, as Vaquerizo-Villar et al point out (Vaquerizo-Villar et al 2018), apnea and hypopnea events in patients with SDB can generate random spikes and irregular fluctuations in physiological signals. These variations and fluctuations align with a multifractal structure and cannot be entirely characterized by a single fractal provided by conventional DFA. Consequently, Gogus et al employed multifractal detrended fluctuation analysis (MDFA) as a feature extraction technique on single-channel nasal cannula airflow signals and evaluated its performance using a random forest (RF) classifier (Gogus et al 2020).
In summary, feature extraction methods for respiratory signals can be categorized into three main types: time-domain, frequency-domain, and nonlinear features. Time-domain features involve analyzing statistical properties and morphological characteristics within a specific time window using discrete-time samples. Frequency-domain features are obtained by decomposing the signal into its frequency components. Nonlinear features are derived through various nonlinear analysis methods, such as CTM, approximate entropy, LLE, and Lempel-Ziv complexity. Feature extraction plays a vital role in SDB detection: the extracted features should accurately capture the signal's characteristics so that the classification algorithm performs well. Researchers need to determine which features are more effective by pairing them with different threshold rules or classifiers.

Classification algorithm
In the systematic review, we found that SDB classification algorithms were primarily based on threshold rules (8), machine learning (13), and deep learning (11). As shown in figure 2(a), early algorithms were predominantly based on threshold rules and machine learning, while after 2017, deep learning became the predominant approach. However, there is no significant difference in the number of points of interest for the recognition targets among the different algorithms, as shown in figure 2(b). In this section, we will provide a detailed overview of the literature survey results focusing on the three algorithms.
(1) Threshold rule-based algorithms The commonly employed methods for detecting sleep apnea include peak detection and envelope detection. The decision threshold is determined according to the rules provided by the AASM (Iber C et al 2017). For instance, Selvaraj and Narasimhan utilized piecewise cubic Hermite interpolation to interpolate local maxima and local minima points, allowing for the acquisition of upper and lower envelopes (Selvaraj and Narasimhan 2013). The threshold was then set from the envelope width (E) and the instantaneous amplitude base frequency variability feature value. Bianchi et al developed an adaptive envelope-tracking function that tracks the amplitude excursions of thoracic signals (Bianchi et al 2014), which adapts to changes in breath size or belt amplitude by defining a new threshold for each breath based on the height of the previous peak. Azimi et al proposed an algorithm based on an adaptive threshold power transformation of the RIPsum (Azimi et al 2018). This algorithm dynamically calculates the threshold as 20% of the highest power within each power segment, considering the 120 s preceding the current power segment. ElMoaqet et al proposed a method for continuous monitoring of airflow respiration, which consists of a 600 s baseline window (Wb) and 100 s detection window (Wm) (Kim et al 2019). In Wb, the detected amplitudes and time intervals are sorted in descending order to calculate the average value, establishing the baseline. In Wm, the detected amplitudes and time intervals are sorted in ascending order, and the average value is compared with the baseline to detect the occurrence of sleep apnea events. However, the presence of artifacts can lead to sudden changes in the respiratory rhythm, making it challenging to accurately characterize the baseline morphologically (Redline et al 2007, Otero et al 2011). Thus, Ciołek et al proposed an improved approach by replacing linear low-pass FIR filters L(ω)r used in Hilbert transform-based and square-law envelope-based detectors with a cascade of standard median (SM) and recursive median (RM) filters (Ciolek et al 2015). This improved approach addresses the issue of envelope distortion and phase shift caused by artifacts, resulting in a more robust envelope detection method.
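An adaptive power threshold of the kind described for the RIPsum signal can be sketched as follows. This is a loose illustration of the idea (threshold equal to 20% of the highest power in the preceding 120 s), not a reproduction of the algorithm of Azimi et al: the 10 s segment length, the function names, and the example powers are assumptions.

```python
def segment_powers(signal, fs_hz, segment_s):
    """Mean-square power of consecutive, non-overlapping segments."""
    size = int(fs_hz * segment_s)
    return [sum(v * v for v in signal[i:i + size]) / size
            for i in range(0, len(signal) - size + 1, size)]


def flag_low_power(powers, segment_s, lookback_s=120.0, fraction=0.2):
    """Flag segments whose power falls below a fraction of the recent maximum.

    The threshold for each segment is `fraction` times the highest power among
    the segments covering the preceding `lookback_s` seconds.
    """
    n_back = int(lookback_s // segment_s)
    flags = []
    for i, power in enumerate(powers):
        previous = powers[max(0, i - n_back):i]
        if not previous:
            flags.append(False)  # no history yet, so no decision
            continue
        flags.append(power < fraction * max(previous))
    return flags


# Hypothetical segment powers: normal breathing, two weak segments, recovery.
powers = [1.0] * 12 + [0.1, 0.1, 1.0]
flags = flag_low_power(powers, segment_s=10.0)
print(flags)
```

Because the threshold follows the recent maximum, it adapts to slow changes in belt amplitude, but a long run of reduced breathing eventually lowers the reference itself.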
Previous threshold rule-based literature mainly focuses on the identification of apnea; even where hypopnea is studied, apnea and hypopnea are still combined in the final identification process. Moreover, there has been limited research on specifically classifying the types of SDB. To address this issue, Lee et al
Threshold rule-based recognition algorithms provide clear explanations for labeling specific signal periods as containing sleep apnea events or not (Sannino et al 2014). These algorithms have demonstrated good performance, as shown in table 4. This white-box approach holds great value in the medical field. However, threshold rule-based algorithms, which rely on comparing simple features and experiment-derived thresholds, overlook the statistical distribution of input features and output categories (ElMoaqet et al 2020b). It becomes challenging to establish a universal threshold that can be applied to diverse individuals. To address this limitation, there is a need for more complex algorithms capable of learning from the data and adapting to individual differences, such as machine learning and deep learning.
(2) Machine learning algorithms Machine learning algorithms employed in SDB detection discern patterns and relationships within the data for precise classifications. Initiating this process requires the expertise of medical professionals to annotate respiratory signals with categories such as apnea, hypopnea, normal, OSA, CSA, and MIX. Subsequently, algorithm researchers undertake a crucial step known as feature engineering, which involves manually extracting pertinent features from the labeled data, as detailed in section 3.3. In the final stage, feature selection algorithms are applied, and the selected features are combined with machine learning algorithms to optimize results. Diverse machine learning algorithms play a pivotal role in recognizing SDB, including Support Vector Machines (SVM) (
Additionally, between 2012 and 2015, Gutierrez-Tobal et al conducted extensive research on machine learning algorithms for sleep apnea-hypopnea syndrome (SAHS) recognition and apnea-hypopnea index (AHI) estimation based on AF and RRV (respiratory rate variability derived from AF) signals. To achieve optimal results, the researchers selected various feature combinations, including statistical features, spectral features, and nonlinear features, with particular emphasis on the 'spectral bands of interest' feature. These feature combinations were chosen using techniques such as
The studies mentioned above have demonstrated promising results for machine learning-based SDB recognition, as presented in table 5. However, machine learning techniques require manual engineering of features and can potentially miss interesting sleep apnea markers in the biometric signals due to human misinterpretations. To address this issue, researchers have increasingly turned to deep learning techniques, which are capable of relying on patterns and inferences to automatically learn the complex relationships between features and labels from data (LeCun et al 2015). Such techniques have been applied widely in the analysis of complex medical data (Ravi et al 2017).
(  Feature extraction, involving the automatic extraction of pertinent features in the convolutional and max-pooling layers of the network. (3) Event classification, wherein SDB events are classified by the fully connected layers. Researchers typically improve results by refining the signal preprocessing method and adjusting critical parameters, including the number of convolutional layers, kernel size, number of filters, stride, activation function, and optimizer. For instance, Haidar et al utilized 30 filters with a [5 × 1] kernel size, 5 strides, and a Rectified Linear Unit (ReLU) activation function, concurrently employing the Adam optimizer. This configuration resulted in an accuracy of 74.7% with a 1D CNN on a 30 s non-overlapping NAF signal, surpassing the SVM model's accuracy of 72.0% (Haidar et al 2017). In contrast, McCloskey et al transformed original respiratory waveform images into wavelet spectrograms, implementing a 2D CNN architecture. This approach yielded an average precision of 79.8% (McCloskey et al 2018), marking a 5.3% improvement over the 1D CNN's accuracy of 74.5% (Haidar et al 2017). Choi et al utilized three convolutional layers and introduced overlapping sliding windows, enhancing the accuracy of SAHS recognition to 94.9% and enabling the estimation of AHI (Choi et al 2018).
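The convolution, activation, and pooling operations that perform feature extraction in a 1D CNN can be sketched without a deep learning framework. The code below shows a single filter only; the averaging kernel and toy input are hypothetical, and the kernel size of 5 with stride 5 merely echoes the configuration quoted above rather than reproducing any trained model.

```python
def conv1d(signal, kernel, stride=1, bias=0.0):
    """Valid (unpadded) 1D convolution as used in CNNs (cross-correlation)."""
    k = len(kernel)
    return [sum(w * signal[start + j] for j, w in enumerate(kernel)) + bias
            for start in range(0, len(signal) - k + 1, stride)]


def relu(values):
    """Rectified Linear Unit activation."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def conv_output_length(n_samples, kernel_size, stride):
    """Number of outputs of a valid convolution."""
    return (n_samples - kernel_size) // stride + 1


toy_window = [float(v) for v in range(1, 11)]   # hypothetical 10-sample window
averaging_kernel = [0.2] * 5                    # hypothetical filter weights

feature_map = max_pool(relu(conv1d(toy_window, averaging_kernel, stride=5)), size=2)
print(feature_map)
print(conv_output_length(len(toy_window), 5, 5))
```

In a trained network the kernel weights are learned, and many such filters run in parallel before the fully connected layers classify the event.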
Differing from conventional CNNs, WaveNet is a fully convolutional neural network that lacks fully connected layers (Oord et al 2016). Originally designed for generating audio waveforms, WaveNet utilizes a deep convolutional structure to capture long-term dependencies within audio signals. Given the resemblance between respiratory signal waveforms and audio waveforms, some researchers have leveraged WaveNet for SDB recognition. For example, Nassi et al enhanced WaveNet's feature extraction capabilities by utilizing its residual blocks to enlarge the receptive field. The architecture of WaveNet was further modified by incorporating noncausal convolutions, allowing the output nodes to depend on both past and future time steps, thereby improving recognition accuracy. Their work demonstrated the effectiveness of WaveNet in recognizing respiratory Apnea-Hypopnea events using a single effort belt and in multi-classification tasks, including OSA, CSA, and arousal-hypopnea (Nassi et al 2022).
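The difference between causal and noncausal convolutions, and the way stacked dilated layers enlarge the receptive field, can be illustrated with a short sketch. The kernel sizes and dilation rates below are illustrative assumptions and do not reproduce the configuration of Nassi et al.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def dilated_conv(signal, kernel, dilation, causal=True):
    """Zero-padded dilated 1D convolution with same-length output.

    causal=True:  each output depends on the current and past samples only.
    causal=False: taps are centered (for odd kernel sizes), so each output
                  depends on both past and future samples.
    """
    k = len(kernel)
    span = (k - 1) * dilation
    offset = span if causal else span // 2
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i in range(k):
            idx = t - offset + i * dilation
            if 0 <= idx < len(signal):
                acc += kernel[i] * signal[idx]
        out.append(acc)
    return out

# Doubling the dilation at each layer grows the receptive field exponentially.
print(receptive_field(2, [1, 2, 4, 8]))

# A unit impulse shows which outputs "see" a given input sample.
impulse = [0.0] * 11
impulse[5] = 1.0
causal_out = dilated_conv(impulse, [1.0, 1.0, 1.0], dilation=2, causal=True)
noncausal_out = dilated_conv(impulse, [1.0, 1.0, 1.0], dilation=2, causal=False)
```

In the causal case the impulse at index 5 influences only outputs at or after index 5, whereas in the noncausal case it also influences an earlier output, which is the property that lets the network use future context when labeling a respiratory event.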
Another architecture based on a deep convolutional structure is ResNeXt, which adopts the concept of residual learning by introducing residual blocks to tackle issues such as vanishing and exploding gradients during the training of deep neural networks. Researchers have also modified the ResNeXt structure to enhance image feature extraction and thereby improve SDB recognition rates. For example, Chen et al proposed an improved ResNeXt network called Multi-resolution ResNeXt (Mr-ResNeXt), which decomposes images into low-frequency and high-frequency features using octave convolution. Additionally, they replaced the 3 × 3 filter in ResNeXt with a new block of multi-level group convolution. This enhancement increased the recognition accuracy of OSA by about 3.2 percentage points (from 91.02% to 94.23%) (Chen et al 2020). Wu et al introduced octave convolution and attention mechanisms based on the residual network to detect Apnea-Hypopnea, achieving an accuracy of 91.23% (Wu et al 2021). Subsequently, Yue et al verified the effectiveness of the multiresolution residual network (Mr-ResNet) on two databases, obtaining satisfactory results (Yue et al 2021).
However, most of the convolution-based neural network architectures mentioned above are commonly employed for image recognition and require substantial computational power (Urtnasan et al 2018). Recent developments in deep learning have introduced an alternative model: long short-term memory (LSTM) neural networks, known for their proficiency in capturing both long-term and short-term dependencies in temporal data. This capability makes LSTM effective in detecting patterns without relying on handcrafted features. In contrast to image-centric architectures, LSTM can significantly reduce the volume of data processing. The typical structure of an LSTM includes a cell (cell state), forget gate, input gate, and an output layer (commonly employing the Softmax or sigmoid function, depending on the functional requirement) (LeCun et al 2015). In the SDB recognition process, respiratory signals and movement-related signals are first normalized and segmented to obtain the input sequence. Subsequently, the cell, input gate, and forget gate play crucial roles in capturing and storing relevant features from the input sequence. The final step involves the output layer, which computes the probability distribution for each SDB event. To enhance recognition rates, researchers often explore adjustments such as varying the number of LSTM layers, activation functions, or loss functions, or employing different combinations of LSTMs. For example, Van Steenkiste et al pioneered the application of LSTM to identify apnea from single-channel respiratory signals, an important step towards a fully automated sleep apnea detection method (Van Steenkiste et al 2019), employing the Adadelta optimizer. Drzazga et al proposed a structure consisting of two LSTM neural networks connected in series, combined with signal preprocessing and filtering, to demonstrate the ability of LSTM to

It is worth noting that using DL techniques for imbalanced data classification poses significant challenges (Buda et al 2018, Johnson and Khoshgoftaar 2019). Imbalanced datasets may bias the model towards the majority class, resulting in skewed outcomes. To address this issue, researchers have employed various algorithms to handle imbalanced SDB data. The most straightforward solutions include downsampling (Vluymans 2019, Singh and Mishra 2022), oversampling (ElMoaqet et al 2020a), and subsampling (Choi et al 2018, Lakhan et al 2018). However, these methods may result in the loss of valuable information and carry a risk of overfitting, thereby impairing the model's ability to generalize to new data. To overcome these disadvantages, a procedure called balanced bootstrapping has been proposed (Wallace et al 2011). This method has been adopted in subsequent recognition algorithms to address the imbalance problem in SDB data (Van Steenkiste et al 2019, Nassi et al 2022).
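The idea behind balanced bootstrapping can be sketched as follows, under the assumption that each bootstrap sample draws, with replacement, the same number of segments from every class. This is one common reading of the technique; the exact procedures used by Wallace et al and by the reviewed studies may differ, and the function name and label encoding here are hypothetical.

```python
import random

def balanced_bootstrap(labels, rng):
    """Return indices of one bootstrap sample with equal counts per class.

    Each class contributes as many draws (with replacement) as the size of
    the smallest class, so no class dominates the resampled training set.
    """
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    n_per_class = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choices(indices, k=n_per_class))
    rng.shuffle(sample)
    return sample

# Hypothetical night of 100 segments: 95 normal (0) and 5 apneic (1).
labels = [0] * 95 + [1] * 5
rng = random.Random(0)  # fixed seed for reproducibility
sample = balanced_bootstrap(labels, rng)
sampled_labels = [labels[i] for i in sample]
print(sampled_labels.count(0), sampled_labels.count(1))
```

Training one model per bootstrap sample and averaging their outputs lets every majority-class segment contribute to some model, which is how this family of methods aims to limit the information loss of plain downsampling.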

Perspective
Due to variations in data collection methods, data volume, database sources, and evaluation criteria across the 32 papers identified in our systematic review, accurately determining the superiority of one algorithm over another is not feasible. However, it is possible to discuss the overall trends of various algorithms and their respective areas of expertise, analyze the common issues present in current algorithms, and explore potential research directions that may emerge in the future.

Trend of algorithm development
Early research predominantly relied on threshold rule-based algorithms. These algorithms can clearly explain why signals at certain moments are marked as containing sleep apnea events or not (Sannino et al 2014), which makes them user-friendly and readily accepted by medical personnel. However, classical threshold rule-based detectors relied on experimentally derived thresholds and features, which made it difficult to establish a generalized standard given individual variations in physiological signals. Furthermore, these algorithms were applicable mainly to small datasets and struggled to extract input features and statistical distributions effectively (ElMoaqet et al 2020b). The reviewed literature primarily focused on datasets of fewer than 120 people (Bianchi et al 2014).
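To make the transparency of threshold rule-based detection concrete, the sketch below flags any interval in which the airflow amplitude stays below a fixed fraction of a baseline for at least a minimum duration. The default values (a 90% drop lasting 10 s, in the style of the AASM apnea criterion), the fixed baseline, and the function name are assumptions for illustration; the reviewed detectors derive their thresholds and baselines experimentally.

```python
def detect_events(amplitude, baseline, fs_hz, drop=0.9, min_duration_s=10.0):
    """Return (start_s, end_s) intervals where the amplitude stays at or below
    (1 - drop) * baseline for at least min_duration_s seconds."""
    limit = (1.0 - drop) * baseline
    events = []
    start = None
    # A trailing sentinel above the limit closes any run that reaches the end.
    for i, value in enumerate(list(amplitude) + [float("inf")]):
        if value <= limit and start is None:
            start = i                      # a reduced-airflow run begins
        elif value > limit and start is not None:
            duration_s = (i - start) / fs_hz
            if duration_s >= min_duration_s:
                events.append((start / fs_hz, i / fs_hz))
            start = None                   # the run ends
    return events

# Hypothetical breath-amplitude envelope sampled at 1 Hz:
# a 12 s pause (flagged) and a 5 s pause (too short to count).
envelope = [1.0] * 20 + [0.05] * 12 + [1.0] * 10 + [0.05] * 5 + [1.0] * 5
print(detect_events(envelope, baseline=1.0, fs_hz=1))
```

Every flagged interval can be traced back to two numbers, the amplitude limit and the duration, which is the interpretability advantage noted above; the same two numbers are also what fails to generalize across subjects.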
In contrast, machine learning has emerged as a solution to the challenges mentioned above. Its strong learning ability, capacity to automatically detect complex patterns, higher model accuracy, and greater flexibility have made it indispensable at all stages of SDB recognition algorithms. Machine learning algorithms constitute a substantial proportion of the reviewed literature, accounting for 40.6%. However, the core of machine learning lies in feature selection and model selection to achieve optimal classification performance (Jordan and Mitchell 2015, Mahesh 2020), which also leads to some limitations. Firstly, researchers need strong prior knowledge and a comprehensive understanding of the signals to extract effective feature sets, which can be challenging for non-medical personnel without expertise in SDB pathology. Secondly, manual feature extraction is costly and subject to subjective influences. Additionally, machine learning algorithms have limited ability to capture temporal information, posing challenges when analyzing complex long-term vital sign signals like respiratory signals. Thus, there are inherent limitations to using machine learning for extracting hidden pathologic information from respiratory signals.
Since 2017, deep learning-based SDB recognition algorithms have gradually emerged as the dominant trend, primarily due to their ability to automatically extract features and recognize hidden information from time series data. In practical applications, researchers often employ CNN models, renowned for their prowess in image processing, and LSTM models, which excel at processing time series data. CNN technology primarily involves transforming respiratory signals into images through techniques such as the wavelet transform and the fast Fourier transform, followed by enhancing features using various image processing techniques to improve SDB recognition accuracy. On the other hand, LSTM models focus on improving the model's ability to learn temporal features, thereby enhancing the accuracy of SDB recognition. To further improve recognition accuracy, researchers have also adopted a combined approach utilizing both CNN and LSTM models, leveraging the advantages offered by each (Hafezi et al 2020).
In summary, although our analysis did not reveal clear differences or trends among the various methods of signal acquisition, signal processing, and feature extraction, a significant trend emerged when examining classification models. This trend underscores the growing importance of deep learning in the field of SDB recognition. Its impact on SDB recognition is expected to persist until more advanced models emerge.

Techniques to enhance model performance
Although conclusive statistical evidence regarding the superiority of specific algorithms is lacking, there are effective techniques available to enhance the performance of SDB detection models. In terms of signal source, algorithms can be divided into three categories: NAF, OAF, and respiratory movement-related signals (ARE, TRE). A comparison of algorithms based on different signal sources reveals that the overall recognition algorithm utilizing NAF signals achieves the highest accuracy at 94.11%, whereas the average accuracy for the other two signal types is only 78.19% (it is important to note that this comparison considers only the recognition results of the apnea algorithm, as different papers employ distinct classification targets and evaluation indicators). This statistical finding is consistent with the research result in Avci and Akbas (2015), which further emphasizes the significant impact of the signal source selection on model accuracy.
From a functional perspective, a simple respiratory signal alone can only provide recognition of apnea, hypopnea, or AHI, which offers limited diagnostic capability. For a comprehensive SDB diagnosis, however, it is necessary not only to recognize apnea or hypopnea but also to diagnose the type of SDB, such as OSA, CSA and MIX, to formulate effective treatment plans. In OSA events, the direction of thoracic motion is opposite to the direction of abdominal motion when the subject tries to breathe, presenting a unique paradoxical motion pattern (Tobin et al 1983, Staats et al 1984, Farre et al 2004). In CSA, the brain fails to produce or transmit signals to control the respiratory muscles, leading to a complete cessation of breathing for a brief period (Ramachandran and Karuppiah 2021). Both situations occur in the MIX pattern. Therefore, TRE and ARE can be added as two additional signals to increase the ability to identify SDB types (Lin et al 2016), thus expanding the model's function.
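The paradoxical motion described above can be quantified, for example, with the Pearson correlation between the thoracic and abdominal effort signals: a strongly negative coefficient indicates that the two move in opposite directions. The following sketch uses synthetic sine waves, and the ±0.5 decision thresholds and labels are illustrative assumptions rather than values from the reviewed studies.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a flat signal."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0  # no movement in one belt: correlation is undefined
    return cov / (sd_x * sd_y)

def motion_pattern(thorax, abdomen, threshold=0.5):
    """Label a segment by the sign and strength of the TRE/ARE correlation."""
    r = pearson(thorax, abdomen)
    if r <= -threshold:
        return "paradoxical"    # chest and abdomen move in opposite directions
    if r >= threshold:
        return "in-phase"       # chest and abdomen move together
    return "indeterminate"

FS_HZ = 10  # assumed sampling rate (hypothetical)
t = [i / FS_HZ for i in range(200)]                       # 20 s segment
thorax = [math.sin(2 * math.pi * 0.25 * s) for s in t]
abdomen_normal = [0.8 * v for v in thorax]                # same phase
abdomen_paradox = [-0.8 * v for v in thorax]              # opposite phase

print(motion_pattern(thorax, abdomen_normal))
print(motion_pattern(thorax, abdomen_paradox))
```

A flat segment in both belts, as expected when respiratory effort ceases, yields an undefined correlation and is reported here as indeterminate, so amplitude information would still be needed to separate central events from normal breathing.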
When dealing with SDB data, it is crucial to address the issue of data imbalance. This is primarily due to the limited number of positive instances of respiratory arrest for each patient, leading to an imbalanced dataset. When machine learning or deep learning models are trained using such datasets, they tend to exhibit a strong bias towards the majority class, resulting in skewed outcomes. Common methods for dealing with imbalanced data include subsampling, bootstrapping, and oversampling. In practical applications, researchers can use multiple techniques and evaluation metrics such as accuracy and specificity to compare the performance of models in multiple dimensions and identify the best-performing model. It is important to note that when dealing with imbalanced data, precision and recall (Hafezi et al 2020) should be added to evaluate the performance of the classifier.
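A small worked example shows why accuracy and specificity alone can mislead on imbalanced data. The confusion-matrix counts below are hypothetical: 1000 segments of which 50 contain apnea, scored by a degenerate classifier that labels every segment as normal.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts.

    Ratios with an empty denominator are reported as 0.0 by convention.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical degenerate classifier: predicts "normal" for all 1000 segments.
always_normal = classification_metrics(tp=0, fp=0, tn=950, fn=50)
print(always_normal)
```

The classifier reaches 95% accuracy and 100% specificity while detecting no apnea at all, which only the recall (and precision) of 0 reveals.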

Problems to be improved and solved
In general, algorithms relying on a single respiratory signal source are preferred due to their simpler hardware implementation and lower costs. However, the standard definition of a hypopnea event by the AASM requires a decrease in respiratory airflow of over 30% for at least 10 s, accompanied by a decrease in oxygen saturation of over 3% or an arousal (Iber C et al 2017). This definition limits the detection of hypopnea using a single respiratory signal source. Most algorithms reviewed in the literature combined apnea and hypopnea into one event, and even those that recognized hypopnea separately have shown unsatisfactory overall performance. For instance, the highest reported recognition rate was 73.34%, obtained on only 25 subjects (Adha and Igasaki 2021), while the lowest was a mere 49.3% (Drzazga and Cyganek 2021). Currently, a common practice to improve hypopnea recognition rates is to incorporate blood oxygen signal sources, but this necessitates additional hardware resources. Hence, algorithms that improve hypopnea recognition accuracy using only a single respiratory signal source are needed and represent a new research direction.
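The limitation imposed by this definition can be expressed as a simple predicate. In the sketch below, the thresholds follow the definition as quoted above and are treated as inclusive, which is an assumption; the function name, argument names, and the use of None for "signal not recorded" are hypothetical.

```python
def is_hypopnea(airflow_drop, duration_s, desaturation=None, arousal=None):
    """Apply the hypopnea definition quoted in the text.

    airflow_drop, desaturation: fractions (0.30 means 30%).
    desaturation / arousal: None when the corresponding signal is unavailable.
    Returns True, False, or None when the available signals cannot decide.
    """
    if airflow_drop < 0.30 or duration_s < 10.0:
        return False                       # airflow criterion not met
    if (desaturation is not None and desaturation >= 0.03) or arousal is True:
        return True                        # confirmed by SpO2 or arousal
    if desaturation is None or arousal is None:
        return None                        # a confirming signal is missing
    return False                           # both recorded, neither confirms

# Respiratory signal only: the airflow criterion is met, but the event
# cannot be confirmed without oximetry or arousal information.
print(is_hypopnea(airflow_drop=0.4, duration_s=12))
# With oximetry available, the same event can be confirmed.
print(is_hypopnea(airflow_drop=0.4, duration_s=12, desaturation=0.04))
```

A respiratory-only algorithm therefore has to infer the missing confirmation indirectly from the airflow or effort waveform, which is consistent with the modest hypopnea recognition rates reported above.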
Nowadays, deep learning algorithms are widely utilized in the medical and healthcare fields due to their ability to automatically extract features, capture long-term temporal dependencies, and handle large and unstructured datasets (Ravi et al 2017). However, these algorithms are often treated as a highly complex 'black box' system, making it challenging to explain their results, access the underlying mechanics, or make modifications in case of misclassifications. This lack of interpretability is not user-friendly for medical personnel who are not well-versed in algorithmic fields. Consequently, developing SDB classification models that are interpretable and support human-machine collaboration will be a new research focus.
Over the years, SDB recognition technology has made significant advancements, transitioning from the initial gold standard of multi-lead PSG to various single-lead detection technologies, accompanied by cost reductions. However, precise measurement of respiratory signals still necessitates the installation of sensors at the mouth and nose, which must record continuously throughout the night. This approach is inconvenient for patients and affects their compliance. With the emergence of new materials and the development of the Internet of Things, several contactless respiratory signal acquisition methods have been explored, such as infrared optical gas imaging (An et al 2022), radar (Javaid et al 2015, Kagawa et al 2016), and pressure-sensor mattresses (Yizraeli Davidovich et al 2016). However, they are still in the early stages of research and are not yet applied clinically. Thus, developing a reliable contactless respiratory signal recognition algorithm for clinical applications is a promising research direction for the future.

Conclusion
This article presents a comprehensive investigation into the literature on SDB detection algorithms based on respiratory and respiratory movement-related signals from 2012 to 2022. From the selected 32 articles, a detailed analysis and discussion were conducted, focusing on four main aspects: respiratory signal acquisition, signal processing, feature extraction, and classification algorithm.
The research findings indicate that NAF obtained from airflow thermal or thermal sensors demonstrates superior accuracy, making it the preferred signal for studying respiratory signal algorithms for the majority of researchers. Additionally, due to the presence of paradoxical respiratory movement, ARE and TRE obtained through RIP belts or PVDF belts can serve as signals for recognizing different types of SDB. Regarding signal processing, the literature commonly employs techniques such as band-pass, median, notch, IIR, and FIR filters. However, given the substantial inter-individual variations in respiratory signals and the presence of artifacts, researchers should consider exploring adaptive filtering or more sophisticated algorithms to improve signal quality. Feature extraction is of paramount importance in rule-based and machine learning algorithms, as accuracy can be enhanced through suitable feature selection methods matched to the algorithm. Researchers are advised to conduct comparative experiments and carefully select suitable features and matching algorithms. Within the domain of classification models, deep learning has emerged as the predominant approach owing to its automatic feature extraction capabilities and superior learning performance. However, researchers must be aware of the 'black box' effect inherent in deep learning and strive to mitigate it through algorithmic enhancements or novel models.
In summary, this article provides a comprehensive overview of SDB algorithms based on respiratory and respiratory movement-related signals. It aspires to serve as a quick reference guide for novice researchers and to provide valuable insights that help experienced researchers adjust their algorithms. Moreover, the article identifies and analyzes the current unsolved problems of SDB algorithms, with the hope of helping researchers formulate future research topics and develop new algorithms.

Figure 1. Flowchart of the systematic review design and study selection. N denotes the number of articles.

Gogus et al (2020), Haidar et al (2017), McCloskey et al (2018), Choi et al (2018), ElMoaqet et al (2020a), Chen et al (2020), Wu et al (2021), Yue et al (2021)
2 OAF 4 ElMoaqet (Kim et al 2019, ElMoaqet et al 2020b), Koley and Dey (2013a, 2013b)
3 TRE, ARE 10 Bianchi et al (2014), Kagawa et al (2016), Azimi et al (2018), Adha and Igasaki (2021), Thommandram et al (2013), Lin et al (2016), Van Steenkiste et al (2019), Hafezi et al (2020), Drzazga and Cyganek (2021), Nassi et al (2022)
NAF: nasal airflow; OAF: oronasal airflow; TRE: thoracic respiratory effort; ARE: abdominal respiratory effort.

Fourier transforms commonly used in extracting frequency domain features from respiratory signals encompass the discrete Fourier transform (DFT) (Gutierrez-Tobal et al 2012a, 2012b, 2013, Javaid et al 2015) and the fast Fourier transform (FFT) (Koley and Dey 2013a, Diaz et al 2014, Ciolek et al 2015), an efficient algorithm for computing the DFT. However, DFT assumes that the analyzed signal is stationary, that is, that its frequency content does not vary over time (Sundararajan 2001). To capture the changing frequency characteristics of respiratory signals over time, researchers have started utilizing the short-time Fourier transform (STFT) for time-frequency conversion of respiratory signals (Wu et al 2021, Yue et al 2021). Additionally, to obtain the PSD feature of the frequency domain, which reflects the recurrent changes in airflow during night respiration (Krishnan and Athavale 2018), researchers such as Tobal et al (Gutierrez-Tobal et al 2016) and Koley et al (Koley and Dey 2013a) utilized DFT

introduced a median filter to obtain the AF amplitude and then distinguished A and H based on six rules, and the PPV of Hypopnea reached 65.7% (Lee et al 2016). Adha et al utilized the obstructive reciprocal divergence (ORD) algorithm, based on the RIPsum signal, to identify Hypopnea, and the accuracy reached 73.34% (Adha and Igasaki 2021). To classify SDB types, Kagawa et al proposed a method based on the theory of paradoxical breathing motion (Lin et al 2016). This method used the Pearson correlation to estimate the phase difference between chest and abdominal movement signals. Subsequently, amplitude drop and phase difference thresholds were utilized to classify OSA, CSA, and MIX (Kagawa et al 2016).

Figure 2. Statistical visualization of the results after data analysis. (a) Comparison of the number of papers published utilizing three different algorithms over the years. (b) Number statistics of identified SDB types by three different algorithms. A: Apnea, H: Hypopnea. AH: the combination of apnea and hypopnea. AHI: Apnea hypopnea index.
Koley and Dey 2013a, 2013b, Lin et al 2016), artificial neural networks (ANN) (Guijarro-Berdiñas et al 2012), Random Forest (RF) (Avci and Akbas 2015, Gogus and Tezel 2019, Gogus et al 2020), Gaussian processes (GP) (ElMoaqet et al 2020b), K-Nearest-Neighbors (KNN) (Thommandram et al 2013), and Logistic Regression (LR) (Gutierrez-Tobal et al 2012a). Researchers customize feature combinations for different classifiers to attain optimal classification results. For instance, Berdiñas et al (Guijarro-Berdiñas et al 2012) utilized the SVM recursive feature elimination (SVM-RFE) method to select the wavelet coefficient features of the DWT-transformed TRE signal and input them into the error-correction output code (ECOC) (Dietterich and Bakiri 1994) model of the ANN expert to detect SDB types. Koley et al employed SVM-RFE to select the optimal feature subset from the 36 power features extracted from the OAF signal and applied it to three binary SVM classifiers to achieve the offline recognition of Apnea, Hypopnea, and Normal (Koley and Dey 2013a). The results of the three classifiers were evaluated using the 'one-against-all strategy' (Kim et al 2003), and the 'winner-takes-all rule' (Kim et al 2003) was used to determine the final result. Subsequently, online verification was carried out on 8 subjects (Koley and Dey 2013b). In another study, Lin et al utilized an SVM classifier based on three features of the joint TRE and ARE signals, namely amplitude ratio (AR), frequency ratio (FR), and covariance between thoracic and abdominal movements, to classify OSA and CSA (Lin et al 2016). Other researchers have also employed different feature selection techniques and classifiers to improve the performance of machine learning algorithms in SDB detection. Avci et al and Gogus et al utilized feature subset selection methods such as Correlation-based Feature Subset Selection (CfsSubsetEval) (Avci and Akbas 2015), OneR Attribute Eval Feature Selection (OneRAttributeEval) (Gogus and Tezel 2019), and wrapper subset evaluation (WSE) (Gogus et al 2020) to select optimal feature sets. These selected features were then utilized in an RF classification model for the detection of Apnea. Thommandram et al developed a KNN model for Apnea identification (Thommandram et al 2013), based on four clinically observable features: peak-to-peak time stability, peak height stability, presence of long pauses, and flat-lining indication. The best performance was achieved with a k value of 44. Furthermore, ElMoaqet et al proposed a novel feature, the relative change between the baseline window and the detection window, and used a Gaussian model to recognize apnea (ElMoaqet et al 2020b). This model outperformed the threshold breath pause recognition algorithm proposed by the author in 2018 (Kim et al 2019).
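The combination of a one-against-all strategy with a winner-takes-all rule, as described above for the three binary SVM classifiers, can be sketched as follows. The decision values are hypothetical signed scores standing in for the outputs of trained classifiers; the class names follow the text, and everything else is an illustrative assumption.

```python
CLASSES = ("apnea", "hypopnea", "normal")

def winner_takes_all(decision_values):
    """Pick the class whose one-against-all classifier is most confident.

    decision_values maps each class name to the signed score of the binary
    classifier trained to separate that class from all the others.
    """
    missing = set(CLASSES) - set(decision_values)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return max(CLASSES, key=lambda name: decision_values[name])

# Hypothetical scores for one segment: only the hypopnea classifier is
# clearly positive, so the segment is labeled as hypopnea.
scores = {"apnea": -0.4, "hypopnea": 0.7, "normal": 0.1}
print(winner_takes_all(scores))
```

The rule also resolves the ambiguous cases in which several binary classifiers, or none of them, claim the segment, since the largest score wins regardless of sign.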

Deep learning algorithms
The commonly used deep learning technologies in SDB recognition algorithms include convolutional neural networks (CNN) (Haidar et al 2017, Choi et al 2018, McCloskey et al 2018, Wu et al 2021, Yue et al 2021),

Table 2. List of cutoff frequencies.

Table 3. List of extraction methods and features.
M : mean IRA; LTP: Logarithm of total power; LEN: Length; P2P stability: Stability of the peak-topeak time; Peak stability: Stability of the heights of the peaks; LP presence: Presence of long pauses; FL: Flat-lining; BW M : spectral band of interest mean; BW MA : spectral band of interest maximum amplitude; BW mA : spectral band of interest minimum amplitude; BW SD : spectral band of interest standard deviation; BW MF : spectral band of interest median frequency; BW k : spectral band of interest kurtosis; SampEn: Sample entropy; Mini: minimum; Max: maximum; Var: variance; Ave: average; AR: Amplitude ratio; FR: Frequency ratio; Cov: covariance; LLE: Largest Lyapunov Exponent; DFA: Detrended Fluctuation Analysis; SF set: Statistics feature set; AF set: Amplitude feature set; DMF set: Descriptive model feature set; ATM: Amount of tracheal movement; BDC: Breathing duty cycle; IDC: Inspiratory duty cycle; AUIC: Area under inspiration curve; SS: Signal slope; AUB: Area under breath; GHE: Generalized Hurst exponent; min hq: Minimum singularity exponent value; VDH: Vertical distance between hqmin and hqmax; KMS: Kurtosis of multifractal spectrum; AIdx: Asymmetric index; MSmin: Multifractal spectrum corresponding to minimum;

Table 4. List of threshold rule-based algorithms and results.

Table 6. List of deep learning algorithms and results.

recognize Apnea and Hypopnea based on the joint signal of OAF, TRE, and ARE (Drzazga and Cyganek 2021), utilizing the Adam optimizer. ElMoaqet et al utilized variants of LSTM (Bi-LSTM) to detect Apnea based on NAF, OAF and ARE signals respectively, and the results showed that the detection accuracy based on NAF was higher (ElMoaqet et al 2020a), employing the Adam optimizer. Hafezi et al combined the capability of CNN in extracting robust features (Krizhevsky et al 2017) with the proficiency of LSTM in encoding relative information across temporal measurements (Sepp Hochreiter 1997) to form a CNN+LSTM model. The RMSProp optimizer was employed for the recognition of Apnea (Hafezi et al 2020), yielding satisfactory results. Table 6 provides an overview of algorithms and results based on deep learning for recognizing SDB.