Table of contents

Volume 1518

2020


2020 4th International Conference on Machine Vision and Information Technology (CMVIT 2020) 20-22 February 2020, Sanya, China

Accepted papers received: 23 March 2020
Published online: 20 May 2020

Preface

011001
The following article is Open access

The 2020 4th International Conference on Machine Vision and Information Technology (CMVIT 2020) was held in Sanya on 20-22 February 2020 and was organized by the Asia Pacific Institute of Science and Engineering. The conference provides a broad platform both for presenting the latest research and for exchanging research results and ideas in machine vision, information technology, and related topics. Participants came from almost every part of the world, with backgrounds in academia or industry, including well-known enterprises. The success of the conference is reflected in the high quality of the papers received.

The proceedings are a compilation of the accepted papers and represent an interesting outcome of the conference. This volume includes 83 papers in total, organized into 5 chapters: 1. Systems, Models, and Algorithms; 2. Multimedia Information Systems with Human Centered Computing; 3. Computer Vision: Theory and Applications; 4. Signal, Image, and Video Process; 5. Ubiquitous Communications, 5G Technologies and Internet of Things.

We would like to acknowledge all of those who supported CMVIT 2020. The help of every individual and institution was very important for the success of this conference. We would especially like to thank the organizing committee for their valuable advice on the organization and their helpful peer review of the papers.

We sincerely hope that CMVIT 2020 will be a forum for excellent discussions that will put forward new ideas and promote collaborative research. We are sure that the proceedings will serve as an important source of references and knowledge, which will lead not only to scientific and engineering progress but also to new products and processes.

Prof. Irwin King

The Chinese University of Hong Kong, Hong Kong S. A. R., China

Conference Committee Chair

011002

The lists of the Conference Committee Co-Chair, Program Committee, Organizing Committee, and International Technical Committee are available in the PDF.

011003

All papers published in this volume of Journal of Physics: Conference Series have been peer reviewed through processes administered by the proceedings Editors. Reviews were conducted by expert referees to the professional and scientific standards expected of a proceedings journal published by IOP Publishing.

Systems, Models, and Algorithms

012001

In this paper, the concept of C4ISR system interoperability is analyzed and its differences from traditional system interoperability are presented. The factors influencing C4ISR system interoperability are analyzed from the overall framework of the US military Global Information Grid. Six attributes are selected: structure, application, facility, security, operation and maintenance, and data. The selection process of the evaluation attributes is described. Based on the enhanced interoperability maturity model, the C4ISR system level attribute model is given by combining it with the development trend of the C4ISR technology system. The maturity grades and an evaluation index system for C4ISR system interoperability are built, and the index level reference model is designed. Combining qualitative assessment with quantitative assessment, an index synthesis criterion based on the mapping model and the corresponding interoperability level evaluation method for C4ISR systems are proposed, providing a specific method model for measuring the interoperability level of a C4ISR system.

012002

A derivative of the constant modulus algorithm (CMA), termed hierarchical CMA (HCMA), is investigated for parallel implementation with 8 lanes. Retaining the essence of CMA, HCMA is in theory also a steepest-descent method under the least mean-square (LMS) criterion. To update the tap coefficients of the equalizer, HCMA uses an adjustable modulus that follows the dynamic amplitude of the baseband signals. This mechanism makes the expectation of the error estimate converge to 0. The 8-parallel HCMA adopts the framework of the pipelined LMS adaptive filter. On the basis of iterated short convolution and the relaxed look-ahead approximation, the 8-parallel HCMA implementation substantially reduces the number of multiplications. Numerical simulation indicates that HCMA outperforms CMA with evident gain in the equalization of 64 quadrature amplitude modulation (QAM).
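As a point of reference for the abstract above, the following is a minimal sketch of one tap update of the plain, textbook CMA, not the paper's 8-parallel HCMA. The function names, the step size, and the single-tap demo are illustrative assumptions; HCMA would vary the `modulus` argument with the signal amplitude, which is not modelled here.

```python
import cmath


def cma_update(taps, window, modulus, step):
    """One stochastic-gradient step of the textbook CMA (illustrative sketch).

    taps    : current equalizer coefficients (complex)
    window  : the most recent input samples, same length as taps
    modulus : target squared modulus (HCMA would adapt this; here it is fixed)
    step    : step size of the gradient descent (assumed value chosen by caller)
    """
    # Equalizer output for the current window.
    y = sum(w * x for w, x in zip(taps, window))
    # CMA error term: deviation of |y|^2 from the target modulus, times y.
    err = (abs(y) ** 2 - modulus) * y
    # Steepest-descent update of each tap.
    new_taps = [w - step * err * x.conjugate() for w, x in zip(taps, window)]
    return new_taps, y


def demo():
    """Single-tap toy run on unit-modulus symbols (hypothetical test signal)."""
    taps = [0.5 + 0j]
    for n in range(200):
        symbol = cmath.exp(1j * 0.7 * n)  # unit-modulus input sample
        taps, _ = cma_update(taps, [symbol], modulus=1.0, step=0.1)
    return taps
```

In this toy run the single tap grows from 0.5 toward magnitude 1, the point at which the output modulus matches the target and the error term vanishes.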

012003

This paper proposes a network selection algorithm based on Vehicle Trajectory Prediction and AHP (VMAP). A Markov chain is used to predict the running trajectory of the vehicle. The predicted trajectory is used to calculate the dwell time of the vehicle in each network. A network in which the dwell time is greater than a preset threshold is selected as a candidate network. Then, the Analytic Hierarchy Process (AHP) is used to determine the weight of each attribute of the candidate networks and to obtain a comprehensive quality index for each candidate network, from which the optimal network is selected. This algorithm can not only minimize the ping-pong effect, but also enable the vehicle to access the optimal network stably.
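The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the AHP weights are derived with the common row geometric-mean approximation, the dwell times are taken as given (the Markov-chain prediction is not modelled), and the attribute layout and all names are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights from a pairwise comparison matrix.

    Uses the row geometric-mean method: the weight of attribute i is the
    geometric mean of row i, normalized so that all weights sum to 1.
    """
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def select_network(candidates, weights, min_dwell):
    """Pick the network with the best weighted score among eligible ones.

    candidates : {name: {"dwell": predicted dwell time, "attrs": [scores]}}
    weights    : AHP weights, one per attribute score
    min_dwell  : preset dwell-time threshold; shorter dwell times are dropped
    """
    eligible = {
        name: info for name, info in candidates.items()
        if info["dwell"] > min_dwell
    }
    if not eligible:
        return None

    def quality(name):
        # Weighted sum of normalized attribute scores.
        return sum(a * w for a, w in zip(eligible[name]["attrs"], weights))

    return max(eligible, key=quality)
```

Filtering on dwell time before scoring is what suppresses the ping-pong effect in this sketch: a network with excellent attributes is still skipped if the vehicle is predicted to leave it almost immediately.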

012004

Localization is a basic problem for mobile robots. A new approach to information coding and decoding for robot localization is proposed in this paper. Information codes are used as landmarks to provide a global attitude reference. The label and location information is stored in the information codes, which are strategically placed in the operating environment. The mobile robot is equipped with a camera that can read the information codes for localization purposes. Firstly, the current state of the coding methods used for robot localization is analyzed, and the problems and defects exposed in application are summarized. In view of these problems, a new coding method that is more suitable for robot localization is designed. Then, the coding and decoding methods of the design are described in detail. Finally, experiments are designed around practical application scenarios. The results show that the proposed method performs better than the traditional QR code decoding method.

012005

Sequential recommendation uses the sequence of items that users have interacted with to predict the items they will interact with next. Existing methods usually consider the hidden sequential patterns in the user's action sequence but ignore the modeling of candidate item features. To solve this problem, we use a two-tower network structure to model the user's action sequence and the candidate item features separately, and then recommend items to users according to the matching score between user interest and item attributes. Specifically, we use a self-attention mechanism to mine the information in the user's action sequence, and use factorization machines to automatically construct the cross features of items. We name this method MatchRec. Experiments on public datasets demonstrate that MatchRec significantly improves recommendation accuracy compared with various baseline methods.

012006

With the rapid development of Mobile Internet and Big Data in recent years, the demand for data centers is increasing. The number of super large data centers (more than 1000 counters) has increased. This paper focuses on the power distribution system, which is responsible for providing a continuous, stable and uninterrupted power supply for the data center. In the event of a large-scale power outage, all business in the data center will be interrupted. Therefore, identifying which electrical equipment in the data center plays the key role in a large-scale power outage is extremely important. Considering the computational complexity of the algorithm, the validity of the vulnerable nodes, and the intelligence of the algorithm, this paper proposes a two-layer fragile node identification method based on the attention mechanism. The algorithm takes the topology and electrical parameters of the data center power distribution system, and considers the flow load of the nodes around each electrical device, to form an expression vector. After the candidate node expression vectors are ranked by vulnerability, the k most vulnerable nodes are selected. The two-layer node identification method is outlined as follows: the first layer calculates an alternative node sequence through a variety of graph-based fragile node identification methods, and the second layer re-calibrates the expression vector of each candidate node based on the attention mechanism and then sorts all candidate nodes. A simulation test on a real data center power distribution system shows that the proposed algorithm effectively identifies the vulnerable nodes in the data center power distribution system.

012007

Time-triggered Ethernet adds a time-triggered traffic type on top of traditional event-triggered best-effort traffic, which extends data traffic into three communication formats: time-triggered (TT) messages, rate-constrained (RC) messages, and best-effort (BE) messages. To address the time-triggered Ethernet message scheduling problem, this paper first introduces the principle of the SMT solver and the idea of the solver algorithm, and designs constraints for the message scheduling problem. Based on this, the paper introduces a load balancing strategy and a step-by-step iterative conflict backtracking method, designs and implements a time-triggered Ethernet message scheduling algorithm, and then completes the scheduling of time-triggered traffic and the optimization of the scheduling results. For the scheduling results, the paper proposes verification indicators for network load balancing and algorithm solution efficiency, designs a test set for algorithm verification, completes algorithm analysis and verification based on the test set results, and determines the optimization effect of the algorithm.

012008

Crowdsourced testing has developed in recent years alongside software testing; it can speed up the release cycle and improve the quality of testing. Using the latent information in crowdsourced test defect reports to study the priority classification and cause analysis of defects is of great practical value. This paper combines research on mobile application crowdsourced test defect reports with machine learning data analysis, studies the priority classification of these defect reports, and then carries out defect cause analysis on the basis of the priority classification. Defect classification is an intuitive reflection of defect research. This paper takes defect priority classification as the entry point of defect report research, uses the σ-AdaBoostSVM classification algorithm to classify defect reports, and then carries out cause analysis after classification, which helps defects to be located and repaired faster. The experimental results demonstrate the effectiveness of the proposed method.

012009

This paper analyses the influence of distributed generation on the failure severity index, introduces an index that represents the probability of islanding, and derives the failure severity index of distribution networks with distributed generation connected. It then analyses how wide-area distributed generation influences distribution networks when connected to them, and derives the failure severity index of distribution networks with wide-area distributed generation connected. By analysing the same IEEE 33-node network and calculating the reliability index and failure severity index of the traditional distribution network, we obtain the system's risk index. Comparing every index under 3 different conditions, we find that connecting distributed generation to distribution networks improves the networks' risk tolerance, and that connecting wide-area distributed generation significantly improves the distribution networks' risk index.

012010

In this paper, the calculation method of reliability indices for power equipment, load, and system dispatching is studied. Faults are partitioned according to the locations of circuit breakers and disconnecting switches; because the load reliability index is the same within each minimum distribution area, this improves the efficiency of the reliability evaluation of the distribution network. On this basis, a partition coding table is established, so that the calculation for a minimum distribution area can be directly transformed into the calculation for the corresponding coding area. Combined with the Monte Carlo method, the correctness of the method is verified by comparing an example analysis with results in the literature.

012011

This paper introduces the basic principle of high voltage switchgear discharge, describes several common discharge types, and studies the precise localization of partial discharge in high voltage switchgear based on the hyperboloid positioning method and the space grid search method. A simulation of the diagnostic process is used to test the feasibility of the above method. On the basis of precise localization of the discharge source, the partial discharge source is simplified to a Gaussian current source, and the pulse parameters are calculated by combining the simulated annealing algorithm with the FDTD method. Simulation results show that the method, using measurements from multiple sensors, can accurately evaluate the equivalent pulse width τ and the equivalent current amplitude i0 of the simulated Gaussian pulse.

012012

This paper establishes a model for the ability of the distribution network to adopt distributed power sources, and proposes adoption constraints that consider the safe operation of the distribution network. Analysis of simulation examples shows that the proposed optimized assessment method can help distribution network planners to ascertain the main limiting factors affecting the ability of the distribution network to adopt distributed power sources. According to the identified main limiting factors, planners can take appropriate measures to improve this ability.

012013

A traffic signal control system based on parallel simulation technology is designed in this article. Based on the second-level acquisition of intersection data, this system builds a rule base for matching control algorithms through offline analysis of algorithm adaptability. In addition, the system can select online the signal control algorithm that matches the real-time traffic status, evaluate the control algorithm online through the online parallel simulation and control algorithm evaluation platform, and provide a decision basis for further implementation of the algorithm. This system remedies some existing deficiencies in signal control systems, and also provides decision support for the application of complex control algorithms. The system is applied to the control of actual signalized intersections. Practice shows that the system can better adapt to the actual situation of signalized intersections in District, and improves the capacity of intersections and the efficiency of the entire road network.

012014

To address the problem that the ground equipment system of a space launch range is complex, which makes real fault diagnosis training difficult to implement, this paper analyses the requirements of fault diagnosis training, puts forward a design scheme for a fault diagnosis system based on a virtual range, and describes the basic workflow of the virtual fault diagnosis training system. The fault tree and expert system are combined to complete the design of the fault diagnosis training process. A real-time scheduling method for virtual scenes based on multi-threaded entrance is used to load and render the virtual scene in real time according to the movement trend of the virtual avatar, which ensures the fluency of interoperability in terms of function and performance and thereby improves the training effect.

012015

To address the lack of maintainability design methods for EMU engines, a method of using virtual simulation to verify maintainability design is proposed. This paper mainly studies quantitative evaluation methods for maintainability indices, and provides a quantitative evaluation method for accessibility and operation space based on virtual maintenance, a quantitative evaluation method for maintenance human factors, and a calculation method for maintenance time. Based on the characteristics of the EMU engine system, the scheme of a virtual maintainability design software system for EMU engines is designed, and the specific process of maintainability verification is given. The efficiency and quality of engine maintainability design and evaluation are effectively improved, and the development cycle is shortened. It is an effective method for realizing the parallel optimization design of engine maintainability for EMUs.

012016

[Purpose / Meaning] The rich tools and media forms of mobile reading improve the convenience of reading, but they also make it difficult to maintain the user's concentration and tend to produce superficial reading experiences. Using interaction design to improve the user's reading experience, and thereby cultivate deep reading ability, is of great significance for improving the quality of mobile reading services. [Methods / Processes] Guided by the goal of promoting users' continuous use behaviours, this paper adopts an empirical research method in the form of questionnaires and conducts regression analysis on the collected questionnaire data to explore the factors and paths through which mobile reading applications influence user cognitive experience and behavioural results. [Results / Conclusions] The factors extracted by factor analysis are classified into three categories: independent variables (content, function, interaction, vision), intermediate variables (perceived usefulness, flow experience, reading social experience, perceived value), and outcome variables (adoption and continuous using behaviour). Furthermore, via path analysis, this paper builds a path model of the factors influencing mobile reading behaviour. Based on the final empirical research results, this paper proposes specific interactive optimization strategies for mobile reading application design around Internet product design.

Multimedia Information Systems with Human Centered Computing

012017

Manipulating human facial images between two domains is an important and interesting problem in computer vision. Most existing methods address this issue by applying two generators, or one generator with extra conditional inputs, to generate face images with a manipulated attribute. In this paper, we propose a novel self-perception method based on Generative Adversarial Networks (GANs) for automatic face attribute inversion: given a face image with an arbitrary facial attribute, the model generates a new face image with the reversed facial attribute. The proposed method takes face images as inputs and employs a single generator without conditioning on other inputs. Benefiting from the multi-loss strategy and a modified U-net structure, our model is quite stable in training and capable of preserving finer details of the original face images. Extensive experimental results demonstrate the effectiveness of our method in generating high-quality and realistic attribute-reversed face images.

012018

In computer vision, there is growing interest in the recognition of abnormal pedestrian behaviors. The abnormal behavior of a person could be a sign of dangerous activity. However, it is still challenging to extract discriminative spatial and temporal features effectively from video data. In this paper, we propose skeleton-based models for detecting abnormal pedestrian behavior. The base model consists of ResNet as a spatial feature extractor, LSTM as a global temporal feature extractor, and a ResNet network that uses a dual-stream network to extract local temporal features. The proposed model replaces every ResNet with Conv1x1_ResNet and adds a layer of Conv1x1_ResNet after the dual-stream Conv1x1_ResNet to extract more accurate global spatial features. The proposed model achieved the highest accuracy of 89.29%, with an average get-batch time of 0.3399 ms. The base model achieved 88.12% accuracy, with an average get-batch time of 0.3174 ms, less than the time taken by other models. Both models reach a speed of 80 frames/sec. Compared with the models from previous work, the base model has the shortest training time, and the proposed model provides the highest accuracy in the field of pedestrian detection.

012019

To better realize the follow-up control of an exoskeleton robot, the gait phase of the human body should be accurately identified and the human motion intention should be matched in real time. In this paper, a gait data measurement system is used to collect the gait information of the human body during movement. Then, the gait recognition of the six models is carried out by a support vector machine using the plantar pressure information. Next, the human movement intention is divided into five kinds, and an improved DTW algorithm is used to match human motion intentions. Finally, a BP neural network model is designed to accurately predict the gait data. The experimental results show that the exoskeleton robot can accurately realize the three functions of recognition, matching and prediction.
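For readers unfamiliar with the matching step, the following is a minimal sketch of classic dynamic time warping (DTW) used for nearest-template matching. It is the standard algorithm, not the improved variant the abstract refers to, and the one-dimensional signals and template names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def match_intention(sample, templates):
    """Return the name of the template with the smallest DTW distance to sample."""
    return min(templates, key=lambda name: dtw_distance(sample, templates[name]))
```

Because DTW allows samples to be stretched or compressed in time, a gait signal recorded at a slower or faster pace can still match its template, which is the usual reason for choosing it over a point-by-point distance.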

012020

To address the problems that the hardware configuration of multi-camera, multi-light-source eye gaze tracking systems is complicated and that eye images captured with few auxiliary light sources are too dark, a new eye gaze tracking system based on a two-dimensional mapping model is proposed. Combined with the attitude angle of a virtual reality headset, it transforms two-dimensional coordinates into three-dimensional coordinates. The mapping relationship between visual features and eye gaze points is established by improving the pupil image recognition method and the pupil-corneal based eye gaze estimation method, which further improves the accuracy of eye gaze estimation. The experimental results show that the error of the eye gaze tracking system is less than 1.1°, indicating good eye gaze tracking performance.

012021

With fixed computer performance, face tracking must balance accuracy against real-time performance. To meet this requirement, skin color segmentation is incorporated into the AdaBoost face detection algorithm to locate the face accurately and quickly. However, in the early stage of face detection, real-time detection sometimes does not respond fast enough and the detection fails. For this case, a particle filter tracking algorithm based on multi-feature adaptive fusion is proposed. It uses the face area detected in the first frame as the tracking target and adaptively adjusts the weights with which the CS-LBP and skin color features influence tracking, which improves the computational efficiency of face detection between frames while maintaining detection accuracy. Experiments show that the method is also robust to various complex external factors.

012022

With the development of computer technology, human-computer interaction has progressed from the "command line interface" through the "graphical user interface" to the "natural user interface". This paper explores the use of the low-cost Kinect body sensor to let users interact with a virtual scene roaming system in a natural way that stays close to their habits. Using Kinect skeleton tracking, human posture is recognized from the relative positions of joint points, and four self-defined postures are used to control changes in the camera's view angle to realize virtual scene roaming. The algorithm is simple, realizes natural interaction between human and computer, and provides a reference for the development of related systems in the future.

012023

The majority of existing person re-identification (re-ID) approaches adopt a supervised learning pattern, which requires a large amount of labeled data to train models. However, due to the high cost of labeling by hand, their wide use in practice is limited. In addition, because of differences in camera angle, there are many variations in pedestrian posture and illumination. Extracting discriminative features is known to be effective for solving the person re-ID problem. Therefore, we propose to fuse exemplar-level features and patch-level features to obtain more distinguishing pedestrian image features for unsupervised person re-ID. Firstly, we carefully design an exemplar-level and patch-level feature learning framework (EPFL). The skeleton framework adopts two branches: one learns the global features of pedestrian images, and the other learns local features. Then, the global features at the exemplar level and the local features at the patch level are fused to obtain discriminative pedestrian image features. Furthermore, a feature memory bank (FMB) is introduced to facilitate the calculation of the similarity between pedestrian images on unlabeled datasets. We evaluate the proposed method on two frequently used datasets, Market-1501 and DukeMTMC-reID. Experimental results clearly demonstrate the advantage of the proposed approach for unsupervised person re-ID.

012024

Recent studies have shown that exploring features of skeleton data is vital for human action recognition. Nevertheless, effectively extracting discriminative features is still challenging. In this paper, we propose a novel method that extracts a sine feature for human action recognition from skeleton data. First, Kinect is used to extract human skeleton information (the 3D coordinates of each joint point). Then, two joint points are connected to define a skeleton vector according to the structure of the human body, and the sine feature of each skeleton vector is calculated as a new pose description feature. Finally, an SVM is used to classify the obtained features for human action classification. Experimental results on the Cornell Activity Database demonstrate the effectiveness of our approach.
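The abstract does not define the sine feature precisely, so the following is only one plausible reading, offered as a sketch: a skeleton vector is the difference of two joint coordinates, and the feature is the sine of the angle between that vector and a reference vector (for example another skeleton vector or a coordinate axis). The function names and the choice of reference are assumptions.

```python
import math


def skeleton_vector(joint_a, joint_b):
    """Vector from joint_a to joint_b, each given as (x, y, z) coordinates."""
    return tuple(b - a for a, b in zip(joint_a, joint_b))


def sine_between(u, v):
    """Sine of the angle between 3-D vectors u and v, in the range [0, 1].

    Uses |u x v| / (|u| |v|); the reference vector v is an assumed choice.
    """
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("zero-length vector has no defined angle")
    return math.sqrt(sum(c * c for c in cross)) / (norm_u * norm_v)
```

A feature of this kind depends only on the direction of the bones, not on their length or on where the person stands, which is one reason angle-based descriptors are popular for pose description.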

012025

Person re-identification (re-ID) refers to finding a specific pedestrian across disjoint camera views. Recent person re-ID methods rely on supervised learning to train networks with labeled information, which results in poor generalization in real-world environments because pedestrian labels are lacking. At the same time, person images are easily affected by background, illumination and pose variations, which makes it difficult to extract discriminative features to distinguish different pedestrians. To address this problem, we propose an unsupervised learning alignment method called Region Alignment of Spatial-Temporal Fusion (RASTF), which joins global features with locally aligned features to obtain more discriminative features. Local features are aligned by calculating the shortest distances between regions. Our proposed framework integrates a novel region alignment method into an unsupervised network, and the experimental results indicate that it can outperform state-of-the-art unsupervised methods.

012026

In recent years, person re-identification has advanced significantly, but people in real scenes are often occluded by various objects. Previous person re-ID methods have either ignored this problem or solved it under an incomplete assumption. This paper proposes a new training mechanism, the Deep-Shallow Occlusion Parallelism network, which responds to multi-scale occlusion in a more general setting and effectively alleviates the negative impact of occlusion. Specifically, the proposed network consists of a shallow occlusion branch and a deep occlusion branch, to which shielding simulation is applied to increase the difficulty of the training samples. In addition, channel-spatial attention is applied at various positions in each branch to concentrate on discriminative features after occlusion. Finally, the weighted fusion of the two branches not only informs the network of the changes before and after occlusion, but also effectively combines the deep and shallow information, which makes the network more robust. This method performs excellently on the three basic re-ID datasets and the largest partial re-ID dataset, matching or even surpassing the state of the art.

012027

Automatic facial expression coding of infants plays an important role in infant-related applications, including computer-aided ASD diagnosis, automatic intervention for ASD children and diagnosis of ADHD. However, most existing facial expression research has focused on adult facial expression analysis, and infant facial expression recognition has been less investigated. Due to the age gap between the facial expression datasets of adults and infants, a facial expression recognition model trained on adult datasets usually generalizes poorly to infant datasets. A labeled infant facial expression dataset can mitigate this problem, and hence we first collect a facial expression dataset of 30 infants under 24 months of age by recording videos of the infants' facial expressions during face-to-face mother-infant interaction. Because infants' facial behaviors are spontaneous, the dataset covers multiple challenges, such as large head poses, occlusion, and varying facial expression intensities. To develop an automatic facial expression coding system, we propose a framework consisting of adaptive region learning and island loss, i.e., ARL-IL, to self-adaptively discover facial regions with higher discriminability between different emotion classes. The framework was verified on our collected dataset and attained a classification accuracy of 86.86%, outperforming conventional methods based on hand-crafted features and some basic CNN architectures. To interpret the effectiveness of ARL-IL, we also visualize the learned features and find that the proposed framework focuses on facial regions with more emotion information than hand-crafted features or features learned by basic CNN architectures. The experimental results show that our proposed framework is robust to large head poses and occlusion.

012028
The following article is Open access

Computer presentations are an integral part of our lives, yet contemporary presentation tools have not changed for about three decades and still emulate old overhead slide projectors. Combined with the rapid evolution of computer graphics technologies, this leaves contemporary tools ill-suited to many modern needs, for which linear sequences of fixed-size 2D slides consisting of text and images are not nearly sufficient. In this work, we propose a presentation system aimed at overcoming the limitations of contemporary presentation software. To achieve this goal, we abandon the usual concept of a presentation as a 2D slide sequence and instead treat presentations as continuous, automatically created 3D scenes of non-linear structure with multi-modal content, all without requiring any professional skills from the end user.

012029
The following article is Open access

Prototyping is a vital process of artistic creation. Easy-to-use, animated prototyping helps designers express their ideas and inspires the team to be more productive. An immersive environment brings a more engaging experience and more intuitive ways of interaction. In this paper, an immersive prototype animation production method is proposed, which adopts the design theory of natural interaction interfaces to let users quickly produce the expected rigid-body animation through intuitive interactions in VR environments. We designed a 3D user interface that allows users to interact quickly and easily in an immersive environment. Users can directly control the position, rotation, scaling and other attributes of objects in a virtual scene with their own hands. We have integrated and simplified the animation production process, and users can directly control the animation running speed with body movements. A quantifiable interactive efficiency evaluation method is proposed and used to compare our method with traditional methods. The results show that our method can improve the efficiency of prototype animation production.

012030
The following article is Open access

As the last step of data journalism creation, visual design plays a crucial role in the transmission of data information. To explore ways of improving the quality of data journalism in terms of visual language and interaction design, this paper analyzes works that won data journalism awards or were produced by well-known media from 2012 to 2019, and discusses the misunderstandings and shortcomings in current data journalism visualization practice. It then presents routes for improvement in data mapping, symbolic expression, interaction level, and interactive narration.

012031
The following article is Open access

Since Generative Adversarial Networks (GANs) were proposed, research on image generation has attracted wide attention from scholars. Traditional GANs generate a sample by playing a minimax game between a generator and a discriminator. In this paper, we propose a new method called EmotionGAN for generating facial expressions. Specifically, the inverse of the generator is first used to establish the mapping between the input and the feature vector. Then a Generalized Linear Model (GLM) is used to fit the direction of change of different expressions in the feature space, which provides linear guidance for the feature vector along the expression axis and thus ensures that its spatial distribution is consistent with that of the target feature vector. Finally, the generator is applied to reconstruct the facial image with the target expression. By controlling the intensity of the feature vector, the generated image can be changed smoothly along a specific expression. Experiments show that EmotionGAN can quickly generate face images with arbitrary expressions while preserving identity information, with more accurate image attributes and higher resolution.

012032
The following article is Open access

The further development of 5G technology opens new possibilities for the application of VR technology in various fields. The application of VR in the social field has given rise to the concept of "social VR". This paper focuses on what is new about social VR and where its future lies. By comparing social VR with traditional ways of socializing, this paper proposes three new features of social VR: high immersion, diverse interactive modes, and contextualized social content. We classify social VR applications and take two popular social VR games, VRChat and Facebook Horizon, as examples to illustrate the current development of social VR applications and the possible forms social VR may take. Moreover, a series of problems and countermeasures are explained and further considered. The paper indicates that social VR has good prospects.

Computer Vision: Theory and Applications

012033
The following article is Open access

An improved multi-level set C-V model for non-destructive grading of Korean pine seeds is presented in this paper. Starting from the rough segmentation results of an improved Otsu method and a dilation operation, the improved C-V model is used to extract the target contours of Korean pine seeds. The characteristic parameters of fruit length and maximum transverse diameter are extracted by mathematical morphology, and polynomial fitting against the actual measured values is carried out to construct a more accurate mathematical model. According to the extracted characteristic parameters, a comprehensive evaluation and grading standard for Korean pine seeds is established. The experimental results show that this method can classify multiple Korean pine seeds simultaneously, with an average classification accuracy of up to 97.2%.
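For readers unfamiliar with the thresholding step named in the abstract above, the following is a minimal sketch of the standard Otsu method, which picks the gray level that maximizes the between-class variance. It is the textbook algorithm, not the improved variant the paper proposes; the function name `otsu_threshold` and the flat list of integer pixel values are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form one class and the rest form the other.
    `pixels` is a flat iterable of integers in the range [0, levels).
    """
    pixels = list(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Two well-separated gray levels: any threshold between them splits the classes.
threshold = otsu_threshold([10] * 50 + [200] * 50)
```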

012034
The following article is Open access

The recognition of coding characters is vital in automobile manufacturing and assembly, since it guarantees the circulation of essential information on the production line. However, in real-world industrial applications, parts are subject to complex working conditions such as reflection, smudging and abrasion, and recognition performance degrades severely. In this paper, a convolutional neural network (CNN) based method comprising localization, segmentation and recognition is proposed to address the problem. First, to cope with the complex working conditions and to improve the stability and accuracy of character region localization, the method uses an improved Maximally Stable Extremal Regions (MSER) algorithm that introduces a Gaussian distribution probability judgment and a gray threshold judgment. Next, a combination of statistics and a projection algorithm achieves single-character segmentation without any prior knowledge. Finally, a convolutional neural network is constructed to recognize the characters. The proposed method is evaluated through experiments on a platform. Experimental results and a comprehensive comparison with traditional recognition methods demonstrate the superiority of the proposed method.

012035
The following article is Open access

Conventional object detection algorithms based on deep learning use only deep features, which become less discriminative for small targets in optical remote sensing images as the network deepens. In this paper, a multi-layer feature fusion method based on residual learning is proposed, which combines shallow features with deep features to classify objects comprehensively. First, a ResNet50 backbone network is constructed to extract features from multiple layers. Then, the scales of the features at different layers are unified through an RoI Pooling layer to screen region proposals, which are classified by an SVM classifier. Comparative experiments are conducted on the UCAS_AOD dataset released by UCAS, and the results show that our model achieves relatively good performance, with F1 scores of 0.8802 and 0.9120 in the car and airplane categories respectively.

012036
The following article is Open access

Image super-resolution processing technology is used to reconstruct high-resolution images from low-resolution images. With the development of deep convolution neural network, image super-resolution processing methods based on deep learning have become the main technology. A novel image super-resolution processing method which is based on attention mechanism is proposed in this paper. The network framework consists of two modules: one is an improved position feature extraction network based on residual network and dense network, and another is a channel feature extraction based on channel attention mechanism. Meanwhile, this paper extracts features directly from low-resolution images, and then amplifies them to reduce the computational complexity. From the experimental results, we can find that the proposed method can enlarge the image better and increase the PSNR on Set5, Set14 and B100.
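The abstract above reports gains in PSNR. As a reminder of what that metric measures, here is a minimal sketch of the usual definition, PSNR = 10 log10(MAX^2 / MSE), assuming 8-bit images stored as lists of rows. The function name `psnr` and the image representation are illustrative assumptions; the paper's exact evaluation protocol (color space, border cropping) is not stated in the abstract.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref_flat = [v for row in reference for v in row]
    test_flat = [v for row in test for v in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


high_res = [[52, 55], [61, 59]]
reconstructed = [[50, 56], [60, 62]]
quality_db = psnr(high_res, reconstructed)
```

A higher value means the reconstruction is closer to the reference image.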

012037
The following article is Open access

Parking slot detection is a basic part of environment perception in automatic parking systems. How to detect parking slots accurately and effectively is a key problem that has not yet been solved in such systems. To make up for the defects of parking slot detection based on ultrasonic radar, and to solve the problems of low recognition rate, sensitivity to environmental changes and weak generalization ability of vision-based parking slot detection methods, this paper proposes a method based on a deep convolutional neural network to detect parking slots. The algorithm takes as input the fisheye images collected by the four-way fisheye camera mounted on the car body and adopts an improved YoloV3 network structure to detect parking slots directly in the fisheye images. The experimental results show that the recall of the method is 98.72% and the accuracy is 99.14% on the self-made parking dataset. The algorithm achieves good results in a real vehicle environment and can run in real time.

012038
The following article is Open access

We propose an end-to-end rotational motion deblurring method based on conditional generation adversarial networks. The proposed method calculates the blur path value of each pixel on the rotational motion blurred image to provide a priori information of its blur degree, and then connects it to the blurred image as the input of the network. In addition, a rotational motion blurred image dataset is produced, which contains different degrees of rotational motion blurred images, as an evaluation dataset for the method to the effect of rotational motion deblurring. Experiments show that the proposed method is superior to existing end-to-end deblurring methods in both qualitative and quantitative analysis when dealing with different degrees of rotational motion blur.

012039
The following article is Open access

Visual place recognition (VPR) is a challenging problem in the field of machine vision and robotics. Unlike in image classification and retrieval, machine vision lags far behind humans in place recognition. Generating image descriptors is the basic problem of VPR. The descriptors are supposed to be insensitive to changes in illumination and viewing angle, and to remain stable over long-running operation. This paper summarizes the solutions to visual place recognition from two directions: traditional methods and deep learning methods. It explains the key technologies and analyzes the advantages, disadvantages and implementation difficulties of the different methods. In particular, the latest research results are introduced: using image descriptors generated by semantic segmentation algorithms to solve VPR problems can achieve better performance. Finally, the possibility of combining semantic segmentation image descriptors with landmark topological relations to solve VPR problems is discussed.

012040
The following article is Open access

The Internet of Things (IoT) is a new and surging technological advancement in which the Internet is connected to everyday physical objects from various domains, such as machine-dependent processes, manufacturing processes and healthcare, making them "smart". It is a system of interrelated computing devices, machines and people. Internet-connected IoT devices bring several benefits to our daily life but are susceptible to security issues. These vulnerabilities create security threats for any smart environment built around IoT systems. Thus, there is a need for intrusion detection systems (IDSs) designed specifically for IoT systems in order to combat these threats and create a secure network for the smart environment. IoT devices tend to have limited computing and storage capabilities, so conventional IDSs might not be suitable for such networks. Because of the specifications and protocols of IoT devices, intrusion detection in IoT systems is a challenging task that requires research and considerable effort to address the key security issues and threats. A literature review of IDSs in IoT is presented, with emphasis on present-day research, challenges and future directions.

012041
The following article is Open access

To allow people to work and travel safely during the outbreak of COVID-19, a security detection method based on deep learning is proposed that uses machine vision instead of manual monitoring. To detect workers who are not wearing masks in workplaces and densely populated areas, an improved VGG-19 convolutional neural network is proposed under the TensorFlow framework, and more than 3000 images are collected for model training and testing. In the VGG-19 network model, the three FC layers are replaced by one flatten layer and two FC layers with fewer parameters. The softmax classification layer of the original model is replaced by a 2-label softmax classifier. The experimental results show that the precision of the model is 97.62% and the recall is 96.31%. For identifying workers without masks, the precision is 96.82% and the recall is 94.07%. The dataset provided can serve as useful test data for future public health and safety work.
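The precision and recall figures quoted in the abstract above follow the usual definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch is given below; the function name and the example counts are hypothetical and are not taken from the paper's experiments.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from confusion-matrix counts.

    A ratio with an empty denominator is reported as 0.0.
    """
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical counts for the "no mask" class.
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(precision, recall)  # 0.9 0.75
```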

012042
The following article is Open access

The Mask R-CNN model is an excellent image segmentation model, but it does not perform well on the task of mold ID recognition. To address this problem, this paper improves the classification branch of the Mask R-CNN model by applying DARTS, making the improved classification branch more suitable for the mold ID recognition task. First, DARTS is employed to obtain a network cell structure. Second, the network cell structure is used to improve the classification branch. Moreover, with transfer learning the model converges in under 2 GPU days. Finally, the experimental results show that the improved model achieves higher accuracy at IoU thresholds of 0.5 (increased by 1.02%) and 0.75 (increased by 2.42%).

012043
The following article is Open access

In recent years, with the rapid development of Internet of Things (IoT) technology, a large number of IoT devices such as network printers, webcams and routers have emerged in cyberspace. However, the network security situation is increasingly serious. Large-scale network attacks launched from terminal devices connected to the Internet occur frequently, causing adverse effects such as information leakage and property loss. Establishing a fingerprint generation system for IoT devices that accurately identifies the device type is of great significance for unified security control of the Internet of Things. We propose RAFM, a detection and identification system for IoT devices. RAFM consists of two major modules: automatic detection and fingerprinting. RAFM collects messages sent by different IoT devices by means of passive listening. Based on the differences in the header fields of different devices, it uses a series of multi-class classification algorithms to identify device types. Simulation experiments show that RAFM can achieve an average prediction accuracy of 93.75%.

012044
The following article is Open access

A three-stage deep neural network (DNN) based camera and lidar fusion framework for 3D object recognition is proposed in this paper. First, to leverage the high resolution of the camera and the 3D spatial information of the lidar, a region proposal network (RPN) is trained to generate proposals from RGB image feature maps and bird-view (BV) feature maps; these proposals are then lifted into 3D proposals. Then, a segmentation network is used to extract object points directly from the points inside these 3D proposals. Finally, 3D object bounding box instances are estimated from the object points of interest by an estimation network, after a translation by a light-weight TNet, which is a special supervised spatial transformer network (STN). Experimental results show that the proposed 3D object recognition framework produces results comparable to other leading methods on the KITTI 3D object detection datasets.

012045
The following article is Open access

Due to the influence of germs and viruses, plants often show various symptoms of diseases and insect pests during growth, which causes large economic losses for fruit farmers and for society as a whole. Early prevention and timely advice to growers about plant diseases and insect pests are therefore of considerable value. This paper proposes a detection method based on the combination of HOG, LBP and CSS features with a Support Vector Machine (SVM) classifier. The method extracts the histogram of oriented gradients, texture, and color self-similarity features of potato leaves, and then trains an SVM classifier on the samples to detect late blight as early as possible in the early stages of potato growth. In addition, this paper proposes a method for adding virtual samples, that is, generating symmetrical samples from the original samples. Because the number of collected samples is limited, adding symmetrical samples increases the diversity of the sample set. The results show that this method achieves a detection rate of 92.7% and has better detection and recognition performance in practical application.
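The "symmetrical samples" idea in the abstract above can be sketched as a left-right mirror of each training image. This is an assumption about what symmetry the authors use (the abstract does not say which axis); the function names and the list-of-rows image format are hypothetical.

```python
def mirror_sample(image):
    """Return the left-right mirror of an image given as a list of rows."""
    return [row[::-1] for row in image]


def augment_with_mirrors(samples):
    """Return the original (image, label) pairs plus one mirrored copy of each."""
    augmented = list(samples)
    for image, label in samples:
        augmented.append((mirror_sample(image), label))  # label is unchanged
    return augmented


leaf = [[1, 2, 3],
        [4, 5, 6]]
training_set = augment_with_mirrors([(leaf, "late_blight")])
```

Mirroring doubles the number of samples without changing any label, which is the sense in which the virtual samples expand the training set.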

012046
The following article is Open access

The purpose of semantic segmentation is to classify the pixels within the target contour. Edge detection is another major basic vision task in machine vision. Today's most effective semantic segmentation models and contour edge detection models are isolated networks. The edge of the output of the semantic segmentation model is coarse and cannot be directly used. And the output of the edge detection network cannot output the classification information of the pixels inside the contour. In view of the above shortcomings of the existing network, we propose a semantic segmentation model based on edge constraint optimization, so that the output of the semantic segmentation model has more delicate edge information, and the network directly outputs accurate contour edge graphs. The edge information output by the network can be directly used for tasks such as corner detection and center point detection. Experiments show that the mIOU statistics obtained by our model on the validation set of PASCAL VOC2012 can reach 83.9%. At the same time, more detailed edge details can be obtained. This algorithm has high engineering and theoretical research value.
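The mIOU figure in the abstract above is the mean over classes of intersection over union. A minimal sketch of that computation follows, assuming predictions and ground truth are flat lists of integer class labels and that classes absent from both are skipped; the function name `mean_iou` is hypothetical, and benchmark toolkits may handle ignored labels differently.

```python
def mean_iou(predicted, target, num_classes):
    """Mean intersection-over-union across the classes that appear at all."""
    if len(predicted) != len(target):
        raise ValueError("label lists must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target)
                           if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target)
                    if p == cls or t == cls)
        if union:  # skip classes absent from both prediction and target
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class from range(num_classes) is present")
    return sum(ious) / len(ious)


# Class 0: IoU 1/2, class 1: IoU 2/3, so the mean is 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```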

012047
The following article is Open access

The use of deep learning for image segmentation has proven to be an efficient and accurate method, but with the complexity of the network structure, it takes up a lot of computing resources. The consumption of computing resources may be unacceptable during tasks. Aiming at this problem, a fast and light segmentation network (FLSNet) is proposed, which uses the Encoder-Decoder method to extract features. All convolutional layers use depthwise separable convolutions and the channel attention module is linked between Encoder and Decoder. Experiments are performed on the autonomous driving dataset CamVid. The results show that with a slight increase in segmentation accuracy, the model size becomes 8.65% of SegNet, the required computing resources are reduced by a dozen times, and the segmentation speed is increased by about 12%, which show that our network is efficient.
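To see why depthwise separable convolutions shrink a model, as the abstract above relies on, it helps to count weights. The sketch below compares a standard k x k convolution with a depthwise k x k convolution followed by a 1 x 1 pointwise convolution, ignoring biases. The layer sizes are hypothetical and are not FLSNet's actual configuration.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise kernel x kernel conv plus a 1 x 1 pointwise conv."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv mixing the channels
    return depthwise + pointwise


standard = conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(standard, separable, round(separable / standard, 4))  # 73728 8768 0.1189
```

The ratio equals 1/c_out + 1/k^2, so for 3 x 3 kernels the separable layer needs roughly a ninth of the weights once the output channel count is large.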

012048
The following article is Open access

An ESVS (Enhanced Synthetic Vision System) employs multi-modal sensor fusion to provide pilots with an equivalent out-of-cabin view. With such a device, pilots can identify the runway and obstacles during approach and landing even in low-visibility weather. Previously, short-wave infrared video was used as the real-time image sensor in our ESVS. However, infrared images are harder to interpret than visible images, and their quality degrades considerably in hazy weather. To improve infrared image visibility in the ESVS, we propose a multi-modal sensor fusion strategy that combines real-time infrared video with visible video previously recorded in clear weather. During image fusion, considerable color detail from the visible image is injected into the infrared image to improve its quality. The proposed fusion method has two further advantages. First, important tagged information in the visible image, including the runway and obstacles, is transferred to the infrared video. Second, information missing from the infrared view can be filled in. To evaluate the proposed method, we use a Y-12F aircraft equipped with both visible and infrared video cameras to collect flight test data. After collecting visible and infrared video in clear and hazy weather, together with the necessary navigation data, we carry out the final image fusion processing. Experimental results show that the fused video frames have enhanced image quality and improved readability for the pilots, which will significantly improve the pilots' situational awareness during approach and landing. The proposed multi-sensor image fusion approach therefore has great potential for application to other cockpit display systems.

012049
The following article is Open access

Accurate and effective perception of the environment is important for autonomous vehicles and robots. The perception system needs to obtain the 3D information of objects, which includes their spatial location and pose. Cameras are widely used on autonomous vehicles because of their low price. However, a monocular camera cannot provide the depth information that is necessary for 3D object detection. Many monocular 3D object detection algorithms have been developed in recent years. Deep learning is popular in perception systems, where it transforms image data from the camera into semantic information. This paper presents an overview of monocular 3D object detection algorithms based on deep learning and summarizes the contributions and limitations of these algorithms. We also compare the performance of different algorithms on different datasets.

012050
The following article is Open access

At present, under the guidance of the new generation of information technology, the rapid accumulation of data, the continuous improvement of computing power, the continuous optimization of algorithm models, and the rapid rise of multi-scene applications have made profound changes in the development environment of artificial intelligence. In this paper, based on the demand of automobile insurance claims and intelligent transportation, combined with abundant basic data and advanced machine vision algorithm, an intelligent damage determination system of 'Artificial Intelligence + Vehicle Insurance' is constructed. This paper first introduces the functions of the intelligent damage assessment system. Secondly, it discusses the realization path of each functional module in detail, and finally puts forward the vision for the future.

012051
The following article is Open access

With the development of deep learning, various fields of computer vision have made huge progress. Among them, depth estimation is an important part of scene perception, therefore receives much interest and is widely used in daily life with the assistance of GPUs. Besides, the ways to obtain depth maps have also been improved, from using multiple images to a single image to obtain depth, which is called monocular depth estimation task. In this paper, we design a convolutional neural network called ERF-PSPNet to perform the task. We prove that by using unsupervised training, monocular depth estimation's result learned from large-scale dataset is close to the result of stereo matching. We also show that the monocular depth estimation model proposed in this paper can achieve a satisfying precision while maintaining a certain real-time frame rate for day-night driving scenes, which confirms the practical applicability of our design and result.

012052
The following article is Open access

In this paper, we address the challenging task of real-time semantic segmentation by proposing a lightweight encoder-decoder architecture. The architecture, called channel-communication factorization ConvNet (CCFCNet), takes both efficiency and accuracy into account. The core of our work is the channel-communication factorization convolution (CCFC) module and the dense dilation-rate block (DDB). In the CCFC module, split, channel communication and fuse operations are used to greatly reduce inference time and improve the quality of information refinement with only a few additional calculations. Meanwhile, four CCFC modules with different dilation rates form a dense dilation-rate block (DDB), which obtains denser feature refinement and enlarges the receptive field to improve the segmentation accuracy for objects of different sizes. The proposed architecture, which contains only 1.25M parameters, achieves a mean IoU of 71.4% on the Cityscapes dataset and runs at over 42 FPS on a single GTX 1080Ti GPU. It achieves a better trade-off between accuracy and efficiency than state-of-the-art methods with comparable performance.
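The receptive-field benefit of stacking layers with different dilation rates, as in the block described above, can be checked with simple arithmetic: for stride-1 layers with kernel size k and dilation d, each layer adds (k - 1) * d to the receptive field. The sketch below assumes 3 x 3 kernels and the dilation rates 1, 2, 4, 8, which are hypothetical since the abstract does not state the rates used.

```python
def receptive_field(kernel, dilations):
    """Receptive field (in pixels per side) of stacked stride-1 dilated convs."""
    field = 1
    for rate in dilations:
        field += (kernel - 1) * rate
    return field


print(receptive_field(3, [1, 1, 1, 1]))  # 9
print(receptive_field(3, [1, 2, 4, 8]))  # 31
```

With the same number of layers and weights, the dilated stack covers a 31 x 31 window instead of 9 x 9.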

Signal, Image, and Video Process

012053
The following article is Open access

Text detection and recognition in images of paper bank receipts has many applications in business, for example in power system marketing, where it is applied to the verification and cancellation of electricity charges and to automatic archiving. In this paper, we use a text detection method that effectively detects text areas in receipt images by exploring each character and the affinity between characters. The text sequence is then fed into a text recognition model that combines a DCNN and an RNN and integrates feature extraction, sequence modeling and transcription. The experiments demonstrate the superiority of the proposed algorithm over prior art; it performs well on the task of recognizing Chinese text in blurred bank receipt images.

012054
The following article is Open access

In super-resolution (SR) reconstruction, some recursive networks use shared parameters to stay lightweight, but these methods lack flexibility when the shared parameters are used to process different feature maps. We propose a non-global shared recursive network that consists of recursive units and channel attention modules. First, we propose a multi-scale channel attention module to learn the inter-channel interdependence and the spatial information within each channel. The module is composed of a multi-scale pooling layer, fully connected layers and activation functions. To process the feature maps of different layers flexibly, the attention module parameters are not shared, but only a small number of parameters are used. Second, a residual dense recursive unit is used to accelerate convergence and promote feature reuse. To further reduce the number of parameters, a bottleneck layer is added to the dense connection. Based on these mechanisms, we propose a lightweight and efficient network structure. Experiments show that the proposed method performs better than other lightweight networks on the standard datasets.

012055
The following article is Open access

The Interactive Multi-Model (IMM) algorithm uses multiple motion models to track the target simultaneously, which effectively solves the problem of model mismatch when a single model tracks a maneuvering target, and it is widely used in maneuvering target tracking tasks. However, the IMM algorithm does not identify the motion model accurately enough, and there is a certain delay in recognizing the target's maneuvers, which reduces tracking accuracy. To solve this problem, and considering that deep neural networks are very good at classification tasks, we introduce them into target tracking, combining the respective strengths of deep neural networks and traditional tracking filters for maneuvering target tracking. We use recurrent neural networks to identify the motion model of the target and propose an improved LSTM-IMM algorithm based on the interactive multi-model algorithm. Finally, we compare it with the traditional interactive multi-model algorithm and verify it using Monte Carlo simulation. The results show that the proposed algorithm improves the accuracy and speed of motion model recognition as well as the tracking accuracy.

012056
The following article is Open access

During lightning, a transmitted long-wave signal is easily weakened to a very low amplitude and is affected by strong, non-stationary, time-varying noise, which makes the signal extremely difficult to detect and useful information difficult to extract. Because lightning is the main noise source interfering with long-wave signals, this paper studies chaos detection under lightning noise and proposes a long-wave signal detection method based on a limit-threshold chaotic oscillator. The method combines the Euler equation, a main-frequency-ratio phase change discrimination algorithm and other techniques from non-linear signal processing, and achieves chaos detection of weak long-wave signals in a noise environment simulated in the laboratory. Its reliability is verified by theoretical experiments, and it can provide an important basis for the development of long-wave reception technology.

012057
The following article is Open access

The high frequency of X-ray high-voltage power supply (XHPS) leads to conspicuous parasitic effect of power components. And this will transform the equipment into a time-varying and nonlinear complex system. By applying the combination of convolutional neural network (CNN) and traditional methods, this paper proposes a fault detection method based on 2-D feature spectrum reconstruction and CNN. Firstly, the multi-wavelet transform is utilized to decompose the 1-D high-voltage power signal to obtain the coefficients of each frequency band. Secondly, the inverse Zigzag scan reconstructs the multi-wavelet coefficients into a feature spectrum that satisfies the input form of VGG-16, and then cascades the deep features obtained by VGG-16 with the multi-wavelet features. Finally, the final fault detection result is obtained by the support vector machine (SVM). The simulation results show that the proposed method has better fault detection performance and could provide a workable idea for fault prediction and avoidance.
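The inverse zigzag scan mentioned in the abstract above turns a 1-D coefficient sequence into a 2-D array by filling anti-diagonals in alternating directions. The sketch below assumes the JPEG-style zigzag order on a square n x n grid; the paper's exact scan convention and grid size are not given in the abstract, and the function names are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n grid in JPEG-style zigzag order."""
    order = []
    for s in range(2 * n - 1):            # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]
        if s % 2 == 0:
            diagonal.reverse()            # even diagonals run bottom-left to top-right
        order.extend(diagonal)
    return order


def inverse_zigzag(coefficients, n):
    """Place n*n coefficients into an n x n matrix along the zigzag path."""
    if len(coefficients) != n * n:
        raise ValueError("expected exactly n*n coefficients")
    matrix = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(coefficients, zigzag_order(n)):
        matrix[r][c] = value
    return matrix


feature_map = inverse_zigzag(list(range(9)), 3)
# feature_map == [[0, 1, 5], [2, 4, 6], [3, 7, 8]]
```

Under this ordering, the earliest coefficients in the sequence end up in the top-left corner of the 2-D map.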

012058
The following article is Open access

Installing the sleeves of a printing machine is a typical peg-in-hole assembly problem. The sleeves are usually heavy and the required fit precision is high, so they are hard to fit together and assembly efficiency is low. Moreover, the assembly task may seriously injure the worker who performs it. To solve these problems, a high-precision Gough-Stewart parallel peg-in-hole assembly robot is designed. The robot uses a stereoscopic laser measurement system to obtain position and speed information and realize closed-loop workspace control; the measurement system can also ease the impact during assembly. A passive flexible wrist addresses the jamming problem. The kinematics model of the stereoscopic laser measurement system and the dynamic model of the Gough-Stewart parallel robot are built. A sliding mode controller is designed for the robot and its parameters are optimized. Matlab simulation results confirm the validity of the system dynamic model and the stability of the sliding mode controller. The presented work could widen the application range of the Gough-Stewart parallel robot.

012059
The following article is Open access

To address the complex surface condition of the printing roller and the need for high-precision, high-efficiency surface defect detection, a detection algorithm based on visual saliency is proposed. Firstly, dodging processing is used to eliminate the uneven illumination of the printing roller image; secondly, the non-local means denoising algorithm is used to weaken the surface texture, exploiting the redundant information commonly present in the printing roller image; thirdly, the spectral residual saliency algorithm is used to calculate the saliency of defects in the image and obtain the saliency map; finally, the Sobel operator is used to detect defects in the saliency map, and the results are compared with manually labeled defect images. Experimental results show that the algorithm has a high recognition rate and accuracy, and can meet the needs of surface defect detection of printing rollers.

012060
The following article is Open access

In the big data era, studies of quantitative stock selection strategies based on machine learning are becoming more and more popular. Most existing studies focus on short-term strategies, and few on medium-term or long-term strategies. Moreover, many scholars tend to transform the problem of predicting stock price changes into a binary classification problem, which makes it difficult to earn steady abnormal returns. Therefore, it is extraordinarily meaningful to study effective quantitative investment strategies. In this article, we propose a modified BP neural network combined with the AdaBoost algorithm (the modified BP_AdaBoost) and apply it to quantitative stock selection. We carry out empirical studies of medium-term and long-term price changes in China's A-share market, construct the factor pool and check the performance of the modified BP_AdaBoost.

012061
The following article is Open access

This paper presents noise removal from medical images using a hybrid filtering technique. Over the last couple of decades, medical image processing and analysis techniques based on computing algorithms have gained prominence as an alternative skill set for medical experts in disease diagnosis and prevention. As the number of patients increases yearly, doctors do not have enough time to extract the actual information from medical images, most of which are affected by noise. Medical images contain different kinds of noise because several machines are involved in data acquisition and transmission, so, to reduce complexity from the radiologist's point of view, we set out to design an algorithm that is beneficial and convenient to use. Image processing has become a very prominent technique in medical image analysis. The proposed architecture is an amalgamation of morphological operations, a modified form of median filter, and Wiener filters. The boundary and shape of the image are extracted through morphological operations. For noise removal and enhancement, the modified median filter and Wiener filters are used. Parameters such as Signal to Noise Ratio (SNR), Peak Signal to Noise Ratio (PSNR), Mean Square Error (MSE) and Root Mean Square Error (RMSE) are determined for the proposed algorithm. Overall results indicate that the proposed technique achieves good enhancement quality.
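
The four quality measures named in this abstract can be sketched as follows for two equally sized, flattened grey-level images. This is a minimal illustration under stated assumptions: the function names, the 8-bit peak value of 255 and the particular SNR definition (reference signal power over error power, in dB) are choices made here, since the abstract does not give the formulas used by the authors.

```python
import math


def mse(reference, processed):
    """Mean square error between two equally sized pixel sequences."""
    if len(reference) != len(processed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, processed)) / len(reference)


def rmse(reference, processed):
    """Root mean square error."""
    return math.sqrt(mse(reference, processed))


def psnr(reference, processed, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit images."""
    err = mse(reference, processed)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def snr(reference, processed):
    """SNR in dB, taken here as reference power over error power (one common definition)."""
    err = mse(reference, processed)
    power = sum(a * a for a in reference) / len(reference)
    if err == 0:
        return math.inf
    if power == 0:
        return -math.inf
    return 10.0 * math.log10(power / err)
```

Lower MSE and RMSE, and higher SNR and PSNR, indicate that the filtered image is closer to the reference.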

012062
The following article is Open access

The development of physical layer key extraction technology provides a new type of encryption mechanism that uses the reciprocity and time variation of wireless channels to generate keys. This encryption mechanism can achieve information security. However, wireless channels in a real environment are not completely reciprocal, which results in low consistency of the generated keys. In this paper, we study the influence of the RF fingerprint on channel reciprocity and propose a calibration matrix algorithm to resist the RF fingerprint effect. Based on the previously established RF fingerprinting model, we propose two calibration methods from the perspectives of RF gain and I/Q gain. The superiority of the two calibration methods is demonstrated by the average system capacity and the linear quantification error rate. Next, we extend the calibration matrix algorithm from narrowband systems to wideband OFDM systems. Its applicability in OFDM systems is demonstrated by the linear quantification error rate of the uplink and downlink channels. To demonstrate the reliability of the algorithm in a real environment, we use a USRP to collect multiple sets of measured data and analyse the data with the calibration algorithm. The final result shows that the algorithm compensates the reciprocity of the uplink and downlink channels.

012063
The following article is Open access

Feature extraction of radar echo image is an important part of identification and recognition of air targets. In recent years, with the rapid development of deep learning, new solutions are provided for it. In this paper, the convolution neural network (CNN) is applied to feature extraction. Based on the classic CNN model, Adam is used to update the model parameters, dropout is used to prevent over fitting, and an improved CNN model is constructed. Then, the radar echo image data set is used to train the model, so as to extract target features and classify them. Simulation results show that the accuracy of the improved model is 99%, and the training speed is greatly improved. Different from the traditional extraction method which relies on manual experience, the improved CNN can improve the efficiency of feature extraction of radar echo image and lay a solid foundation for further research and identification of air targets.

012064
The following article is Open access

Dermatological clinical examination relies mainly on skin images, including dermoscopy. A residual neural network (ResNet) can predict diseases from dermoscopy images and provide effective suggestions for doctors. Based on the ResNet model, this article transfers a model pre-trained on ImageNet to the simulation experiment and addresses the problem of sample imbalance with operations including, but not limited to, flipping, rotation, scaling and replacing the loss function with the Focal Loss function, thereby improving network performance. The experimental results show that the model trained by our method classifies the classes with few samples completely correctly. Our model reaches an accuracy of 90.08%, a recall of 88.44% and an F1 score of 85.25%. Compared with a model of the same depth with the unmodified loss function, it improves these three metrics by 1.3%, 4.62% and 3.58% respectively, which indicates that our method is effective in predicting rare diseases while also achieving good accuracy on common diseases.
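
To show why Focal Loss helps with imbalanced samples, the sketch below evaluates the standard focal loss formula, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), for a single sample. The default values gamma = 2 and alpha = 1 are assumptions chosen for illustration; the abstract does not state the hyperparameters the authors used.

```python
import math


def focal_loss(prob_true_class, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class.

    gamma and alpha are illustrative defaults, not values taken from the paper.
    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    if not 0.0 < prob_true_class <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    # The (1 - p)^gamma factor shrinks the loss of well-classified samples.
    modulating = (1.0 - prob_true_class) ** gamma
    return -alpha * modulating * math.log(prob_true_class)
```

A confidently correct sample (probability 0.9) contributes far less than under cross-entropy, while a poorly classified one (probability 0.1) keeps most of its loss, so training is dominated less by the abundant easy samples.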

012065
The following article is Open access

Flame detection has important practical significance. Exploiting the distinctive color information of flames, a flame color model is used to initially extract suspected flame areas, which improves the accuracy of flame detection in complex environments. To reduce unnecessary computation, an improved feature extraction method combining color information with speeded-up robust features (SURF) is proposed. On this basis, morphological features of the flame, such as roundness and rectangularity, are added as auxiliary classification features, and the flame region features extracted from the image are input to a support vector machine (SVM) for training. The experimental results show that the proposed flame detection method is applicable to a wider range of scenes, offers high accuracy, high reliability, a small amount of computation and a short detection time, and remains robust in complex scenes with more interference.

012066
The following article is Open access

Portrait classification is a complex process that includes at least face detection, recognition and comparison, each of which contains multiple tasks and faces many challenges caused by skewed poses, illumination, occlusion, image blurring and small-scale faces in the pictures. Although deep learning methods, such as the Convolutional Neural Network (CNN) family and the You Only Look Once (YOLO) series, have advanced many areas of object detection and accelerated the solution of these image processing difficulties, they are not specially designed for image classification and may require substantial resources, expensive computation and laborious annotation. In 2016, an innovative face detection model named Multi-task Convolutional Neural Network (MTCNN) appeared and spread rapidly and widely. Its efficient and accurate performance on both face detection and face alignment, its real-time speed based on a lightweight CNN, and its effective online hard sample mining all contribute to significant progress on the challenges above. This paper introduces the MTCNN algorithm and applies it, together with the FaceNet model, to similarity judgement in two real industrial problems. In addition, some effective practical methods for increasing classification precision are proposed.

012067
The following article is Open access

Object-oriented change detection (CD) can make full use of the feature information of high-resolution optical images. However, because such images have multiple bands, the feature information in object-oriented CD is redundant, so feature optimization is necessary. To address this problem, a novel CD method based on feature optimization is proposed, which combines an improved locally linear embedding (ILLE) algorithm with object-oriented technology. Firstly, the two temporal images are converted into objects using a multi-scale segmentation algorithm. Secondly, the spectral and texture features of the objects are extracted to construct the novel feature change vector. Thirdly, the ILLE algorithm, which introduces the geodesic distance metric, is designed to optimize the feature change vector. Finally, the CD result is obtained by the FCM algorithm. Experiments are conducted on real GF-1 images, and the results confirm the effectiveness of the proposed method.

012068
The following article is Open access

To improve the robustness of digital image watermarking, the blind watermarking algorithm proposed in this paper embeds the watermark information in a reasonable position. First, the watermark information is scrambled and the resulting random signal is embedded into the low-frequency domain of the image. This effectively prevents attacks on the watermark system, enhances the security of the system, and disperses the watermark information as much as possible. The watermark embedding process uses a method based on the wavelet transform domain. The original image is not required during watermark extraction, so the blind watermark function is implemented. To verify the effectiveness of the algorithm, the watermarked image is subjected to filtering, scaling, noise, cropping and rotation attacks, and the peak signal-to-noise ratio and normalized correlation coefficient are used to quantitatively evaluate the watermarked image. Experimental results show that the algorithm can resist a variety of attacks and has good imperceptibility and robustness.

012069
The following article is Open access

To solve the problem of semantic information dilution during network propagation, a semantic information supplement mechanism (SIS) is proposed to improve the performance of dynamic scene deblurring algorithms. Based on the GAN structure, our generator recycles semantic information and features spanning multiple receptive scales to restore a sharp image from a given blurred image. Moreover, to better integrate semantic information with the latent features and to ease the difficulty of training very deep networks, we put forward a long and short skip-connection method. Extensive experiments show that our Semantic Information Supplement network (SIS-net) achieves both qualitative and quantitative improvements over state-of-the-art methods.

012070
The following article is Open access

This paper introduces a multiple photon sampling technique based on stochastic progressive photon mapping. We use the image space concept to divide the scene into continuous sub-blocks and then we calculate our proposed distance function and photon number function in each of the sub-blocks. The distance function is used to calculate the distance error of the hit point and to determine whether each sub-block is located at a boundary between different objects. The photon number function is used to calculate the photon number error and to determine whether the photon distribution in each sub-block is uniform. Based on the values of the distance error and the photon number error, the multiple photon sampling technique is used to acquire multiple samples of the hit point in each sub-block. Instead of using a single radius for the radiance estimate, we use three different radii and compute the final radiance estimate as a weighted average of the three values. When compared with the existing stochastic progressive photon mapping method, our method provides a better solution to the photon distribution problem and can also reduce bias and noise, especially in the scene with drastic changes in light and dark.
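
The idea of replacing a single-radius radiance estimate with a weighted average over three radii can be sketched as below. This is a heavily simplified illustration, not the authors' method: photons are hypothetical `(x, y, flux)` tuples on a flat surface, the BRDF term is omitted, and the radii and weights are placeholder inputs because the abstract does not say how they are chosen.

```python
import math


def radiance_estimate(photons, hit, radius):
    """Density estimate: total flux of photons within `radius` of `hit`, per disc area.

    `photons` is a list of hypothetical (x, y, flux) tuples; the BRDF is ignored.
    """
    flux = sum(
        p_flux
        for (x, y, p_flux) in photons
        if math.hypot(x - hit[0], y - hit[1]) <= radius
    )
    return flux / (math.pi * radius ** 2)


def multi_radius_estimate(photons, hit, radii, weights):
    """Weighted average of the estimates obtained with several radii (assumed weights)."""
    if len(radii) != len(weights) or not radii:
        raise ValueError("need one weight per radius")
    total = sum(weights)
    return sum(
        w * radiance_estimate(photons, hit, r) for r, w in zip(radii, weights)
    ) / total
```

A small radius gives a sharp but noisy estimate and a large radius a smooth but biased one, which is the trade-off that averaging over several radii is meant to balance.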

012071
The following article is Open access

With the continuous improvement of China's national economy and the unprecedented innovation and development of modern science and technology, artificial intelligence technology has been widely used in China's industrial production, power transportation, aerospace and other fields. As an extension of modern artificial intelligence technology, pattern recognition technology plays an important role in modern computer image processing by virtue of its intelligence and automation. It fundamentally guarantees the accuracy of image processing and greatly improves the overall work efficiency of computer image recognition and information processing. This paper analyses the main characteristics of pattern recognition technology and further studies its practical application in image processing, hoping to provide a reference for the future application and development of pattern recognition technology in image processing.

Ubiquitous Communications, 5G Technologies and Internet of Things

012072
The following article is Open access

In a D2D wireless cache network, a good cache strategy is very important, but current cache strategies suffer from a low cache hit rate and high base station cost. To address this problem, this paper proposes a smart base station cache strategy based on a recommendation mechanism. The Q-learning algorithm is used to learn user mobility and file popularity information, yielding a base station cache strategy whose goal is to minimize the long-term base station cost. The paper examines how to formulate the base station cache strategy and set the recommendation mechanism under different user movement scenarios in the cellular network. The simulation results show that the lower the user's mobility rate, the smaller the optimal recommendation strength. In the four user mobility scenarios simulated, the proposed cache strategy reduces the core network cost by up to 22.4% compared with the greedy cache, and by up to 18.6% compared with the recommended strategy.
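
The learning step behind such a strategy can be illustrated with the standard tabular Q-learning update. In this sketch the states, the caching actions and the reward (taken as the negative base station cost) are hypothetical placeholders, and the learning rate and discount factor are arbitrary; the abstract does not describe the authors' actual state and action design.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, lr=0.1, discount=0.9):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    Q(s, a) <- Q(s, a) + lr * (reward + discount * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + discount * best_next
    q[(state, action)] += lr * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: reward is the negative cost, so minimizing
# long-term cost corresponds to maximizing long-term reward.
q_table = defaultdict(float)
cache_actions = ["cache_file_A", "cache_file_B"]  # illustrative action names
q_update(q_table, "low_mobility", "cache_file_A", -1.0, "high_mobility", cache_actions)
```

Repeating this update over many observed transitions lets the table approximate the long-term cost of each caching decision in each state.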

012073
The following article is Open access

The traditional dynamic flow controller suffers from high conductivity, poor system stability and poor measurement accuracy. In this paper, the liquid filling time is controlled by the 0330 electromagnetic valve from Burkert (Germany); the electronic balance is verified with a 100 g F2-grade weight from the China Shuiling weight factory; the liquid mass after filling is measured by the bt4202s electronic balance from Sartorius (Germany); and a repeatability analysis of the liquid mass after filling is carried out experimentally. The experimental results show that the liquid filling mass of the device has high repeatability. The device solves the problem of low repeatability of small-flow liquid filling mass and has high application value.

012074
The following article is Open access

Filling technology has a large number of applications in the chemical, food, pharmaceutical and other industries. Filling repeatability is an important indicator in the field of flow measurement. Because of its powerful data analysis capabilities, virtual instrument technology is playing an increasingly important role in the field of automated detection. This paper designs host computer software based on LabWindows/CVI for a high-precision filling test system. LabWindows/CVI combines a powerful and flexible C language platform with professional measurement and control tools, so it can meet the requirements for designing filling system software. The main functions of the software are hardware communication, data acquisition and processing, and export of test results; a user-friendly interface is also designed. This article explains the basic program design process and data structure, and introduces the program functions. The software runs stably and can provide filling test solutions for different test devices.

012075
The following article is Open access

To maintain national security and social stability, ground mobile communication systems in some areas are monitored by the security sector, which captures information to prevent crime and terrorist attacks. Because the TianTong (TT) satellite mobile communication system is operated publicly by China Telecom, it must be equipped with a capture system to block the transmission of criminal information. After analyzing the advantages of the independent station, the L1 interface and the IU interface, this system adopts data distribution through fiber between the Access Network and the Core Network. The data obtained from the IU interface pass through signalling processing and service processing equipment, which forms the call list, analyses SMS and location, and obtains the user voice corresponding to each call. Finally, these service data are stored in the database. A state list and hierarchical state machine (HSM) method is used in this design to solve the difficulty of tracking the handset's many states. The design has advantages such as comprehensive data, good expandability and easy implementation, and can satisfy both current and future needs. Testing of the system is now complete and all devices are ready for deployment.

012076
The following article is Open access

UAV communication platform has become a reliable solution to meet the needs of users in some unexpected situations. In this paper, the statistical method of small area modeling is used to model the UAV near-ground communication channel. The channel is abstracted into three factors: time-delay power spectrum, Doppler power spectrum and Rice factor. As a result, a new method of channel delay generation is introduced, and the time-varying Doppler frequency gain factor is adopted. The simulation results show that the UAV near-ground communication channel is a Rice channel whose amplitude obeys Rayleigh distribution and phase obeys uniform distribution. The bit-error-rate of UAV near-ground communication channel is much higher than that of AWGN channel. At the same time, the constellation chart also shows that the signal passing through the UAV near-ground communication channel will cause strong inter-symbol crosstalk.

012077
The following article is Open access

This paper studies the key performance of the 5G mobile communication system and countermeasures for its development. It first introduces the challenges faced by the 5G mobile communication system, which mainly come from increasing mobile data volume, connecting massive numbers of terminals anytime and anywhere, the shortage of spectrum resources, and the demands of environmental protection and energy efficiency. To study the key performance of the 5G mobile communication system more comprehensively, the paper first analyzes the practical significance of establishing a key performance index system for mobile communication systems. Second, the performance requirements of mobile communication systems in six main application scenarios are discussed. Third, the key performance index system of the mobile communication system is constructed. The paper then introduces the purpose and steps of evaluating the key performance of mobile communication systems, and compares and evaluates the key performance of 5G and 4G in the wide-area continuous coverage scenario based on AHP. Finally, suggestions on the development of the 5G mobile communication system are put forward from four aspects: policy and system, demand and service, technology and talents, and sustainable development.
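
For readers unfamiliar with AHP (the analytic hierarchy process), its core step is turning a pairwise comparison matrix into priority weights. The sketch below uses the row geometric mean approximation, which is one common way to do this; the function name and any comparison matrix passed to it are illustrative assumptions, not data or procedure taken from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric mean method.

    `pairwise[i][j]` states how much more important criterion i is than
    criterion j, with pairwise[j][i] expected to be its reciprocal.
    """
    n = len(pairwise)
    if n == 0 or any(len(row) != n for row in pairwise):
        raise ValueError("pairwise comparison matrix must be square and non-empty")
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]
```

For a perfectly consistent matrix such as `[[1, 2, 4], [0.5, 1, 2], [0.25, 0.5, 1]]`, the weights come out in the ratio 4 : 2 : 1; real expert judgements are usually less consistent and are normally checked with a consistency ratio as well.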

012078
The following article is Open access

Fault location on T-type transmission lines is susceptible to the influence of wave speed and transmission line length. To address this, the paper proposes a T-type transmission line fault location method based on transient fault traveling wave information. Based on the reflected traveling wave generated at the fault point when a fault occurs, the method detects the current and voltage traveling waves of the transmission line before and after the fault through a current and voltage collector, and uses non-stationary signal analysis to obtain the initial time at which the fault-line traveling wave is reflected to the transmission line port. The modal maximum value at the initial time of the transient fault traveling wave is obtained by simplifying the formula and is substituted into the calculation formula of the fault branch location criterion. Finally, the fault distance is determined. Experimental results based on PSCAD show that the method has better accuracy and efficiency than traditional fault analysis methods.

012079
The following article is Open access

In this paper, we study robust secrecy rate optimization of multiple-input-single-output (MISO) security channels with multiple device-to-device (D2D) communication in an electronic physical system for health monitoring applications. We focus on two problems in this system: robust power minimization and robust secrecy maximization with transmit power. Both robust optimizations are subject to probability-based constraints on the secrecy and D2D transmission rates. The robust beamforming design can incorporate a statistical channel uncertainty model (CUM), but the resulting problem is non-convex and not easy to solve, so we propose two methods to solve it: one based on the Bernstein-type inequality and one based on a conservative approximation using the S-Procedure. The simulation results demonstrate that the S-Procedure method achieves a lower reachable secrecy rate than the method based on the Bernstein-type inequality.

012080
The following article is Open access

With the development of the Energy Internet, specific requirements are put forward for Distribution Internet of Things technology: how to control the panoramic equipment and data through horizontal integration of all links and a secondary level of operation on the entire business cloud, and how to provide external platforms and applications. This article first proposes the architecture of an IoT cloud platform for the distribution IoT, and discusses the open cloud platform, Internet of Things access and equipment management, data storage and management technology, cloud-edge collaboration, and other key technologies. Finally, it discusses application scenarios based on IoT technology, such as panoramic condition monitoring based on multi-dimensional information perception, hierarchical line loss management and low-voltage fault location. The aim is to provide technical support for the ubiquitous electric power Internet of Things, so as to realize the interconnection of all things and human-computer interaction in all aspects of the power system, and to create comprehensive status perception, efficient information processing, and convenient and flexible application.

012081
The following article is Open access

Low frequency power oscillation threatens the secure and stable operation of power systems. Since the appearance of low frequency oscillation there are lots of approaches to analyze this problem. The oscillation source location method based on oscillation energy is adopted to find the cause of the oscillation, and to locate the oscillation source. Forced power oscillation is a kind of low frequency oscillation based on resonance mechanism, and the key is to quickly locate and remove the disturbance source. Aiming at the shortcomings of the existing disturbance source location methods, an oscillation energy calculation method suitable for the generator control systems level location is proposed. With the rich acquisition of generator data by means of phasor measurement unit (PMU), this method has the advantage of not relying on generator parameters. Taking the forced power oscillation caused by the abnormality of the governor system as an example, the superiority of the method is verified.

012082
The following article is Open access

Building energy forecasting plays an important role in intelligent buildings. Because building energy consumption is non-stationary and uncertain, the prediction accuracy of existing methods needs to be further improved. To address this problem, we propose the EA-XGBoost model, which combines Empirical Mode Decomposition (EMD), ARIMA and XGBoost to predict building energy consumption. First, EMD is used to decompose the consumption data into multiple Intrinsic Mode Functions (IMFs). Next, an ARIMA model is applied to each IMF to obtain a regression result, and the results are summed to calculate the residual. The residual is then taken as an input feature of XGBoost, combined with other energy-related factors such as dry-bulb and wet-bulb temperature, and XGBoost tuned by grid search is used to predict building energy consumption. Compared with the ARIMA and XGBoost models, the EA-XGBoost hybrid model performs best on the building energy consumption dataset provided by the US National Renewable Energy Laboratory. The experiment shows the feasibility and effectiveness of the new model.

012083
The following article is Open access

Dual-channel sales provide consumers with different service experiences. Studies have shown that "cheap" and "safety" are the most important factors for consumers in the online direct selling experience. However, as consumers pursue a higher quality of life, the traditional in-store sales experience remains unique: a series of real shopping experiences such as "situation consumption" and "physical experience" can bring joy to consumers and stimulate their desire to purchase to a certain extent. Therefore, scientifically integrating the dual-channel supply chain, so that the channels compete in an orderly way and complement each other, is an effective strategy to further optimize the dual-channel supply chain and better serve consumers.