Radar antenna scanning type identification based on the fusion of multiple temporal features

Identifying the antenna scanning type (AST) of a radar signal aims to analyze the parameters of the radar waveform in depth and accurately determine the AST of the radar signal. Traditional methods rely heavily on domain priors and expert experience to set classification thresholds, which introduces human error. Moreover, their generalization and noise robustness are relatively weak, making it difficult to cope with complex battlefield environments. Models based on the convolutional neural network (CNN) are limited by the receptive field of the convolution kernel, which prevents them from making full use of the global information of the signal. To fully exploit the global temporal characteristics of radar signals, we propose a radar AST recognition method based on the fusion of multiple temporal features. Specifically, considering the time-series nature of the radar antenna scanning signal, a 1D CNN branch is first built to extract its short-term temporal features. Given the limited receptive field of the convolutional network and its inability to fully capture the overall characteristics of the radar signal, we introduce a transformer network branch to fully extract global temporal features. Radar AST recognition based on multi-temporal-feature fusion can fully model both the local and global sequence attributes, thereby improving the cognitive recognition ability for radar signals. Experimental results show that our method attains state-of-the-art performance.


Introduction
The attributes of the antenna have a direct impact on radar performance, since the antenna is the apparatus responsible for emitting the radar's electromagnetic energy [1]. For target detection, the radar must search the designated airspace in a specific manner, which requires the antenna to perform beam scanning [2]. Using the antenna's scanning characteristics, a fixed-pulse radar can be located [3], [4]. More importantly, accurately identifying the AST of an enemy radar is crucial for judging the threat level we face, and it is also key to identifying the radar's type and working status.
Traditional radar signal processing faces challenges in meeting the modern requirements of diverse, automated, refined, and intelligent radio signal analysis and identification. The advent of artificial intelligence (AI) technology has proven transformative. Machine learning models, with their outstanding fitting and representation capabilities, have delivered remarkable outcomes across various applications [5].
Deep learning methods have gradually been integrated into radar AST recognition in recent years. The convolutional neural network (CNN) approach has proven effective at extracting radar signal characteristics and enabling end-to-end, data-driven AST recognition. However, the restricted receptive field of the CNN's convolution kernel hinders its ability to comprehensively leverage global information and extract the global temporal characteristics of radar signals. This limitation leads to diminished recognition accuracy, particularly for intercepted radar signals of low quality in complex reconnaissance environments.
This paper proposes a novel radar AST recognition model based on the fusion of multiple temporal features to address these challenges. Specifically, we establish a 1D CNN branch to extract short-term temporal features from the radar antenna scanning signal. Recognizing that the CNN's limited receptive field cannot fully model the radar signal's global sequence properties, we introduce a transformer network branch. The transformer network effectively captures longer-range temporal features, so the model comprehensively extracts and exploits both the long-term and short-term characteristics of radar signals. This significantly enhances the cognitive recognition ability for radar signals and enables accurate recognition of radar ASTs. By integrating the strengths of the CNN and the transformer, our approach offers a more robust and efficient solution for radar AST recognition in complex environments.

Related work
Traditional AST identification has historically relied on manual judgment by operators using a headset and stopwatch. As technology has advanced, new automatic identification methods have been developed. These methods focus on classifying and identifying the signal characteristic parameters captured by reconnaissance receivers. Gong et al. [6] introduced a dual-stage learning approach in their radar AST recognition system. This technique integrates the immune algorithm and the least-squares method to formulate a radial basis function (RBF) network. The principal goal of this strategy is to improve the optimization efficiency of the network training algorithm and, to some extent, mitigate premature convergence. However, the generalization and noise robustness of such automatic identification methods are relatively weak, and they struggle to cope with complex battlefield environments.
Deep learning methodologies exhibit robust feature extraction and learning capabilities, surpassing traditional methods and delivering outstanding performance across a wide spectrum of tasks. Numerous CNN-based models have been introduced in radar signal processing. However, the modern battlefield presents novel challenges, particularly as radar systems extensively employ agile waveforms, which pose significant obstacles to signal identification in electronic warfare systems. Conventional recognition models often struggle to cope with these dynamic scenarios.
Matuszewski et al. [7] proposed an innovative CNN-based method for recognizing the emission signals of agile-waveform radars to address this critical issue. Their approach measures and processes these signals with an electronic identification receiver, converts them into digital data, and performs identification analysis. Furthermore, in [8], a CNN-based method was introduced to identify the radar antenna scan period (ASP) in cases where radar and electronic warfare systems use cyclic antenna scanning.
While CNNs have shown success in diverse radar signal processing tasks, their effectiveness is constrained by the limited receptive field of the convolution kernel, which prevents comprehensive modelling of the signal's global temporal properties. In contrast, transformer-based deep models have emerged as formidable performers in various domains. Initially introduced in [9] for sequence-to-sequence machine translation, transformers have gained prominence and demonstrated strength in diverse fields, including natural language processing (NLP), computer vision, and audio processing.
Within computer vision, transformers have showcased formidable representation capabilities across a wide array of applications, including but not limited to image classification, video classification and recognition, image segmentation, image captioning, and object detection. These results demonstrate the superiority of transformers in modelling and processing complex data and open new possibilities for their application in radar signal processing. Therefore, with the powerful feature extraction and global sequence modelling capabilities of transformers, we expect to address the challenge of recognizing agile-waveform radar signals in electronic warfare systems more effectively.

Transformer branch
Inspired by the transformer's powerful medium- and long-term modelling capabilities, the transformer branch is designed to fully capture the global sequence properties of radar signals. This branch consists of three transformer encoders. Each transformer encoder comprises a single-layer multi-head self-attention (MSA) module and a multi-layer perceptron (MLP) module. Each MSA mechanism consists of multiple "heads"; each head can learn different attention weights so that the model can attend to different locations or features simultaneously. However, as the MSA is insensitive to position information, the challenge lies in associating the self-attention output with specific positions. We therefore introduce position encoding into the embedding of the input token $S \in \mathbb{R}^{m \times C}$ to ensure that the model can understand the temporal position relationships of the signal. Such a position encoding strategy enables the transformer branch to better capture the signal sequence's global dependencies and essential features. The position encoding is applied as

$$T = S + T_{pos},$$

where $T_{pos} \in \mathbb{R}^{m \times C}$ denotes the positional embedding.
In the transformer encoder, the MSA takes a triplet as input, comprising the query $Q$, key $K$, and value $V$. The computation process unfolds as follows:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V,$$

where $d_k$ denotes the dimension of the key. The transformer encoder uses MSA, which helps the network capture richer features and information. The MSA can be defined as:

$$\mathrm{MSA}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\,W^{O}, \qquad \mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V}).$$
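To make the structure concrete, the following PyTorch sketch assembles a branch of this form; the head count, MLP width, and post-norm placement are our assumptions, as the text does not specify them:

import torch
import torch.nn as nn

class TransformerBranch(nn.Module):
    """Sketch of the transformer branch: a learnable positional embedding
    (T = S + T_pos) followed by three encoder blocks, each combining MSA
    and an MLP with residual connections and layer normalization."""
    def __init__(self, seq_len, dim, num_heads=4, mlp_ratio=4, depth=3):
        super().__init__()
        self.pos_embedding = nn.Parameter(torch.zeros(1, seq_len, dim))
        self.blocks = nn.ModuleList([
            nn.ModuleDict({
                # nn.MultiheadAttention applies the per-head Q/K/V
                # projections and output projection W^O internally.
                "msa": nn.MultiheadAttention(dim, num_heads, batch_first=True),
                "norm1": nn.LayerNorm(dim),
                "mlp": nn.Sequential(
                    nn.Linear(dim, mlp_ratio * dim),
                    nn.GELU(),
                    nn.Linear(mlp_ratio * dim, dim),
                ),
                "norm2": nn.LayerNorm(dim),
            })
            for _ in range(depth)
        ])

    def forward(self, s):                   # s: (batch, seq_len, dim)
        t = s + self.pos_embedding          # inject temporal position info
        for blk in self.blocks:
            a, _ = blk["msa"](t, t, t)      # self-attention: Q = K = V = t
            t = blk["norm1"](t + a)         # residual + layer norm
            t = blk["norm2"](t + blk["mlp"](t))
        return t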

CNN branch
CNNs are renowned for their excellent local context modelling ability, and we design the CNN branch to fully mine and exploit the local sequence properties of radar signals. This branch consists of 1D convolutional layers, batch normalization (BN) layers, and rectified linear unit (ReLU) layers. In our design, we incorporate residual connections to address network degradation and gradient vanishing. These connections add the output of a convolutional layer to its input, preserving the original feature information and letting the layer learn an adjustment through the residual. This helps avoid the vanishing gradient problem while increasing the depth and learning capacity of the network.
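A minimal sketch of one such residual block is shown below; the kernel size and channel width are assumptions, since the text does not state them:

import torch
import torch.nn as nn

class ResidualConvBlock(nn.Module):
    """Sketch of one CNN-branch block: Conv1d -> BN -> ReLU with a
    residual connection adding the block input back to its output."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=kernel_size // 2)  # preserve length
        self.bn = nn.BatchNorm1d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                 # x: (batch, channels, seq_len)
        # The skip path lets gradients bypass the convolution, mitigating
        # degradation and vanishing gradients as the branch deepens.
        return self.relu(x + self.bn(self.conv(x)))

# Example: stacking three blocks at an assumed width of 64 channels.
cnn_branch = nn.Sequential(*[ResidualConvBlock(64) for _ in range(3)])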

Squeeze-and-excitation module
The squeeze-and-excitation module first compresses the length and width of the feature layer through adaptive global average pooling, retaining only the channel dimension. Doing so helps model the global importance of each channel in the feature map. The compressed channel descriptor is then passed through two fully connected layers that act as a self-attention mechanism and produce channel weights. Through the squeeze-and-excitation module, the relative significance of the channels is learned automatically, and the feature maps are weighted according to their importance. This enhances the network's focus on pivotal features, leading to improved overall performance and a more optimized feature representation.
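The following sketch illustrates this squeeze-and-excitation computation for 1D feature maps; the reduction ratio r is an assumption (16 is the common default from the original squeeze-and-excitation design):

import torch
import torch.nn as nn

class SqueezeExcitation(nn.Module):
    """Sketch of the squeeze-and-excitation module for 1D features:
    global average pooling, two fully connected layers, channel reweighting."""
    def __init__(self, channels, r=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool1d(1)   # (B, C, L) -> (B, C, 1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // r),
            nn.ReLU(inplace=True),
            nn.Linear(channels // r, channels),
            nn.Sigmoid(),                        # per-channel weights in (0, 1)
        )

    def forward(self, x):                        # x: (B, C, L)
        w = self.squeeze(x).flatten(1)           # channel descriptor (B, C)
        w = self.excite(w).unsqueeze(-1)         # learned importance (B, C, 1)
        return x * w                             # reweight the feature map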

Antenna scan type dataset
This paper explores two simulated radar ASTs: the mechanical scan type (MST) and the electronic scan type (EST). The dataset contains eight attributes: pulse repetition interval (PRI), time of arrival (TOA), radio frequency (RF), high-gain electronically scanned multi-beam (HGESM) beam elevation, pulse amplitude (PA), angle of arrival (AOA), pulse width (PW), and signal-to-noise ratio (SNR). For AST identification, this study uses data from four dimensions: RF, PRI, PW, and PA. The training dataset contains 883 samples in total, comprising 423 mechanical scanning and 460 electronic scanning samples. The test dataset includes 54 mechanical and 57 electronic scanning samples.
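As an illustration of this feature selection, a hypothetical dataset wrapper might keep only the four used attributes; the field names and sample layout below are assumptions, since the text does not describe a storage format:

import torch
from torch.utils.data import Dataset

class ASTDataset(Dataset):
    """Hypothetical wrapper: of the eight recorded attributes, only RF,
    PRI, PW, and PA are stacked as model input channels."""
    USED_FEATURES = ["RF", "PRI", "PW", "PA"]

    def __init__(self, samples):
        # samples: list of (dict of attribute name -> 1D sequence, label),
        # label: 0 = mechanical scan (MST), 1 = electronic scan (EST)
        self.samples = samples

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        attrs, label = self.samples[idx]
        x = torch.stack([torch.as_tensor(attrs[k], dtype=torch.float32)
                         for k in self.USED_FEATURES])   # (4, seq_len)
        return x, label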

Experimental setting
We train the proposed model on a CPU using the PyTorch framework. For optimization, we employ the Adam optimizer over 50 training epochs. The initial learning rate is set to 2e-5, with a weight decay of 5e-4. A batch size of 128 is used, and the model's quantitative performance is evaluated with the overall accuracy (OA). OA, the proportion of correctly classified samples among all samples, serves as a robust metric; a higher OA value indicates better performance. To verify that our model performs well, we choose the traditional SVM method and two pure CNN models (CNN-T and CNN-B) as comparison algorithms. In Table 1, the best results are shown in bold. CNN-T denotes a 1D model with small convolution kernels, while CNN-B is a 1D model with large convolution kernels. In contrast to conventional machine learning approaches, the CNN models capture features directly from radar signals, effectively modelling the nonlinear relationships among these features. The CNN-B variant extends the receptive field through enlarged convolution kernels, yielding a notable 1.8% improvement over CNN-T. Table 1 shows that our proposed CFormer model, with 6.5 million parameters, achieves the best OA of 97.29%.
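A minimal sketch of this training and evaluation setting follows; model, train_set, and test_set are placeholder arguments for the CFormer and the AST datasets described above, while the hyperparameters follow the stated configuration:

import torch
from torch import nn
from torch.utils.data import DataLoader

def train_and_evaluate(model, train_set, test_set, epochs=50):
    """Train with Adam (lr 2e-5, weight decay 5e-4, batch size 128)
    and report the overall accuracy (OA) on the test set."""
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, weight_decay=5e-4)
    criterion = nn.CrossEntropyLoss()
    train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=128)

    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()

    # OA: correctly classified samples / total samples.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total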

Figure 1. The network architecture of the proposed CFormer. The CFormer model consists of three parts: the transformer branch, the CNN branch, and the squeeze-and-excitation module.
$S$ denotes the output of the MSA. To overcome network degradation and gradient disappearance, we incorporate residual connections into the MSA output, thereby enhancing the model's capacity. The residual connection and layer normalization are computed as

$$T' = \mathrm{LN}(T + S),$$

where $\mathrm{LN}(\cdot)$ denotes layer normalization.

Table 1. Quantitative evaluation of different models on the simulated datasets.