
Electrical Engineering and Systems Science

New submissions


New submissions for Wed, 19 Feb 20

[1]  arXiv:2002.07156 [pdf]
Title: Using anisotropic 3D Minkowski functionals for trabecular bone characterization and biomechanical strength prediction in proximal femur specimens
Comments: SPIE Medical Imaging Conference 2014
Subjects: Image and Video Processing (eess.IV); Medical Physics (physics.med-ph)

The ability of Anisotropic Minkowski Functionals (AMFs) to capture local anisotropy while evaluating topological properties of the underlying gray-level structures has been previously demonstrated. We evaluate the ability of this approach to characterize local structure properties of trabecular bone micro-architecture in ex vivo proximal femur specimens, as visualized on multi-detector CT, for purposes of biomechanical bone strength prediction. To this end, volumetric AMFs were computed locally for each voxel of volumes of interest (VOI) extracted from the femoral head of 146 specimens. The local anisotropy captured by such AMFs was quantified using a fractional anisotropy measure; the magnitude and direction of anisotropy at every pixel were stored in histograms that served as feature vectors characterizing the VOIs. A linear multi-regression analysis algorithm was used to predict the failure load (FL) from the feature sets; the predicted FL was compared to the true FL determined through biomechanical testing. The prediction performance was measured by the root mean square error (RMSE) for each feature set. The best prediction performance was obtained from the fractional anisotropy histogram of AMF Euler Characteristic (RMSE = 1.01 ± 0.13), which was significantly better than MDCT-derived mean BMD (RMSE = 1.12 ± 0.16, p<0.05). We conclude that such anisotropic Minkowski Functionals can capture valuable information regarding regional trabecular bone quality and contribute to improved bone strength prediction, which is important for improving the clinical assessment of osteoporotic fracture risk.
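
As a rough illustration of two quantities named in this abstract, the sketch below computes a fractional anisotropy value and an RMSE. It assumes the DTI-style fractional anisotropy definition over three directional values, which the abstract does not specify; the function names and sample inputs are hypothetical and are not taken from the paper.

```python
import math

def fractional_anisotropy(values):
    """DTI-style fractional anisotropy of three directional values.

    Assumption: the abstract only says "a fractional anisotropy measure";
    this is the common definition, not necessarily the authors' one.
    Returns 0.0 for isotropic input and 1.0 for a single dominant direction.
    """
    mean = sum(values) / 3.0
    spread = sum((v - mean) ** 2 for v in values)
    energy = sum(v ** 2 for v in values)
    if energy == 0:
        return 0.0
    return math.sqrt(1.5 * spread / energy)

def rmse(predicted, actual):
    """Root mean square error between predicted and measured values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical inputs, for illustration only.
fa_isotropic = fractional_anisotropy([1.0, 1.0, 1.0])
fa_single_axis = fractional_anisotropy([1.0, 0.0, 0.0])
error = rmse([1.0, 2.0], [1.0, 4.0])
```

An isotropic triple gives 0 and a single-axis triple gives 1, which is the property that makes the measure usable as a histogram feature.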

[2]  arXiv:2002.07231 [pdf, other]
Title: Orthogonal Frequency Division Multiplexing With Subcarrier Power Modulation for Doubling the Spectral Efficiency of 6G and Beyond Networks
Comments: 31 pages, 10 figures
Subjects: Signal Processing (eess.SP)

With the emergence of new applications (e.g., extended reality and haptics) that must be served simultaneously with low latency, sufficient reliability, and high spectral efficiency, future networks (i.e., 6G) have to be capable of meeting this demand by introducing new effective transmission designs. Motivated by this, a novel modulation technique termed orthogonal frequency division multiplexing with subcarrier power modulation (OFDM-SPM) is proposed for providing highly spectral-efficient data transmission with low latency and lower complexity for future 6G wireless communication systems. OFDM-SPM utilizes the power of subcarriers in OFDM blocks as a third dimension to convey extra information bits while reducing both complexity and latency compared to conventional schemes. In this paper, the concept of OFDM-SPM is introduced and its validity as a future adopted modulation technique is investigated over a wireless multipath Rayleigh fading channel. The proposed system structure is explained, an analytical expression of the bit error rate (BER) is derived, and numerical simulations of BER and throughput performances of OFDM-SPM are carried out. OFDM-SPM is found to greatly enhance the spectral efficiency, being capable of doubling it. Additionally, OFDM-SPM introduces negligible complexity to the system, does not exhibit error propagation, reduces the transmission delay, and decreases the transmission power by half.
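
The idea of carrying an extra bit in the subcarrier power can be sketched as follows. This is a minimal noise-free illustration that assumes BPSK on each subcarrier and two hypothetical power levels (`P_LOW`, `P_HIGH`); the levels, function names, and threshold detector are assumptions for illustration, not the scheme's actual parameters.

```python
import math

# Hypothetical power levels; the abstract does not give the actual values.
P_LOW, P_HIGH = 0.5, 1.5

def modulate(data_bits, power_bits):
    """Map one BPSK bit and one power bit onto each subcarrier."""
    symbols = []
    for d, p in zip(data_bits, power_bits):
        amplitude = math.sqrt(P_HIGH if p else P_LOW)
        symbols.append(amplitude if d else -amplitude)
    return symbols

def demodulate(symbols):
    """Recover the BPSK bit from the sign and the power bit from the magnitude."""
    threshold = (math.sqrt(P_LOW) + math.sqrt(P_HIGH)) / 2.0
    data_bits = [1 if s > 0 else 0 for s in symbols]
    power_bits = [1 if abs(s) > threshold else 0 for s in symbols]
    return data_bits, power_bits

data = [1, 0, 1, 1]
extra = [0, 0, 1, 0]
recovered_data, recovered_extra = demodulate(modulate(data, extra))
```

Each subcarrier here carries two bits instead of the single BPSK bit, which is the sense in which the spectral efficiency doubles.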

[3]  arXiv:2002.07237 [pdf]
Title: Prediction of Dementia-related Agitation Using Multivariate Ambient Environmental Time-series Data
Authors: Nutta Homdee
Comments: 4 pages, 3 figures, 3 tables
Subjects: Signal Processing (eess.SP)

Dementia-related agitation causes high stress for dementia caregivers (CG) and to persons with dementia (PWD). Current clinical research suggests that dementia agitation can be affected or triggered by the ambient environment and other contextual factors. In this study, we evaluate this hypothesis through an analysis of ambient environmental data collected with a remote sensing system deployed in the homes of PWDs and their CGs. Furthermore, we determine if the occurrence of dementia-related agitation can be predicted from ambient environmental data, creating the potential for agitation to be prevented via the environmental alteration. These collected data are used to learn the environmental patterns using a predictive model approach. The agitation labels, used in model training, are provided by the CGs living with the PWDs. The results of the agitation prediction model evaluation suggest that ambient environment can be used as predictors for upcoming dementia-related agitation. We also observed that environmental triggers for agitation are PWD-specific. Future opportunities and techniques used to understand triggers for dementia agitation are also discussed.

[4]  arXiv:2002.07257 [pdf]
Title: An Networked HIL Simulation System for Modeling Large-scale Power Systems
Subjects: Systems and Control (eess.SY)

This paper presents a networked hardware-in-the-loop (HIL) simulation system for modeling large-scale power systems. Researchers have developed many HIL test systems for power systems in recent years. Those test systems can model both microsecond-level dynamic responses of power electronic systems and millisecond-level transients of transmission and distribution grids. By integrating individual HIL test systems into a network of HIL test systems, we can create large-scale power grid digital twins with flexible structures at the required modeling resolution that fit a wide range of system operating conditions. This will not only significantly reduce the need for field tests when developing new technologies but also greatly shorten the model development cycle. In this paper, we present a networked OPAL-RT based HIL test system for developing transmission-distribution coordinative Volt-VAR regulation technologies as an example to illustrate system setups, communication requirements among different HIL simulation systems, and system connection mechanisms. Impacts of communication delays, information exchange cycles, and computing delays are illustrated. Simulation results show that the performance of a networked HIL test system is satisfactory.

[5]  arXiv:2002.07277 [pdf, other]
Title: 5G Simulation-Based Experimentation Framework for Vertical Performance Assessment
Comments: Conference: 2020 1st International Workshop on 5G Verticals and Experimentation
Subjects: Signal Processing (eess.SP)

5G is being designed as a common platform where multiple vertical applications will be able to co-exist and grow in a seamless manner. The diversity of the vertical requirements, as well as the particular features of the 5G network itself, makes it a real challenge to assess or predict application performance for those verticals. In this paper we argue that, because of the very nature of verticals and 5G infrastructure, measurements alone will not be sufficient for application performance monitoring and prediction. We then propose a comprehensive framework to integrate those measurements in a simulation setting, presenting the key features and current roadblocks for an end-to-end widespread implementation.

[6]  arXiv:2002.07315 [pdf]
Title: A New Approach in Optimal Control of Step-Down Converters Based on a Switch-State Controller
Subjects: Systems and Control (eess.SY)

In this paper, an optimal approach based on an on-off controller is used to control a DC-DC step-down converter. It is shown that the conventional controller techniques of DC-DC converters based on a linearized averaging model have several drawbacks, including different operating modes, linearization concerns, and constraint difficulties. A single-mode discretized state-space model is used, and a new optimal approach policy is implemented to control a step-down DC-DC converter. The simulation results confirm that the proposed DC-DC model associated with the optimal controller functions properly in controlling a DC-DC unfixed switch-mode step-down converter that is facing load changes, noisy inputs, and the start-up procedure.

[7]  arXiv:2002.07332 [pdf, other]
Title: Distributed Optimal Generation and Load-Side Control for Frequency Regulation in Power Systems
Comments: 11 pages, 3 figures, journal
Subjects: Systems and Control (eess.SY)

In order to deal with issues caused by the increasing penetration of renewable resources in power systems, this paper proposes a novel distributed frequency control algorithm for each generating unit and controllable load in a transmission network to replace the conventional automatic generation control (AGC). The targets of the proposed control algorithm are twofold. First, it is to restore the nominal frequency and scheduled net inter-area power exchanges after an active power mismatch between generation and demand. Second, it is to optimally coordinate the active powers of all controllable units in a distributed manner. The designed controller only relies on local information, computation, and peer-to-peer communication between cyber-connected buses, and it is also robust against uncertain system parameters. Asymptotic stability of the closed-loop system under the designed algorithm is analysed by using a nonlinear structure-preserving model including the first-order turbine-governor dynamics. Finally, case studies validate the effectiveness of the proposed method.

[8]  arXiv:2002.07353 [pdf]
Title: Temporal ghost Fourier compressive inference camera
Subjects: Image and Video Processing (eess.IV); Optics (physics.optics)

The need for real-time processing of fast-moving objects in machine vision requires the cooperation of a high frame rate camera and a large amount of computing resources. The cost, high detection bandwidth requirements, and data and computation burden limit the wide application of high frame rate machine vision. Compressive Video Sensing (CVS) allows capturing events at a much higher frame rate with a slow camera by reconstructing a frame sequence from a single coded image. At the same time, complex frame sequence reconstruction algorithms in CVS pose challenges for computing resources. Even when the reconstruction process has low computational complexity, image-dependent machine vision algorithms still consume a large amount of computing energy. Here we present a novel CVS camera termed Temporal Ghost Fourier Compressive Inference Camera (TGIC), which provides a framework to minimize the data and computational burden simultaneously. The core of TGIC is the co-design of CVS encoding and machine vision algorithms, both in the optical domain. TGIC acquires the pixel-wise temporal Fourier spectrum in one frame and applies a simple inverse fast Fourier transform algorithm to get the desired video. By implementing pre-designed optical Fourier sampling schemes, specific machine vision tasks can be accomplished in the optical domain. In fact, the data captured by TGIC are the results of traditional machine vision algorithms derived from the video, so computation resources are greatly saved. In the experiments, we recover 160 frames in a 0.1 s single exposure with 16x frame rate gain (periodic motion up to 2002 frames, 1000x frame rate gain), and typical machine vision applications such as moving object detection, tracking and extraction are also demonstrated.
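
The recovery step described here, going from a pixel-wise temporal Fourier spectrum back to a frame sequence, can be sketched for a single pixel. The example below uses a plain O(N^2) discrete Fourier transform in place of an FFT and a made-up intensity trace; it illustrates only the transform pair, not the optical encoding or the TGIC hardware.

```python
import cmath

def dft(samples):
    """Forward DFT of one pixel's intensity over time (naive O(N^2))."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse DFT: recovers the temporal intensity trace from its spectrum."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

# Hypothetical intensity of one pixel across 8 frames.
pixel_trace = [0.0, 1.0, 0.5, 0.25, 0.0, 0.75, 1.0, 0.5]
spectrum = dft(pixel_trace)
recovered = [value.real for value in idft(spectrum)]
max_error = max(abs(a - b) for a, b in zip(pixel_trace, recovered))
```

Applying the same inverse transform independently at every pixel would yield the full video, which is why the reconstruction cost stays low compared with iterative CVS solvers.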

[9]  arXiv:2002.07392 [pdf]
Title: Bit Error Rate Analysis of M-ARY PSK and M-ARY QAM Over Rician Fading Channel
Comments: 5 pages, 6 figures
Journal-ref: International Journal of Computer Engineering and Applications,ISSN 2321-3469 Volume XII, Special Issue, August 2018
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

This paper mainly illustrates the bit error rate performance of M-ary QAM and M-ary PSK for different values of SNR over a Rician fading channel. A signal in a wireless communication system experiences multipath propagation, which causes rapid fluctuations of the signal amplitude over time; this is defined as fading. Rician fading is a small signal fading. Rician fading is a hypothetical model for radio propagation inconsistency produced by partial cancellation of a radio signal by itself, as the signal reaches the receiver by several different paths. In this case, at least one of the destination paths is being lengthened or shortened. In this paper, it can be observed that the bit error rate decreases as the signal-to-noise ratio in decibels increases for M-ary QAM and M-ary PSK, such as 256 QAM, 64 PSK, etc. Constellation diagrams of M-QAM and M-PSK are also shown in this paper using MATLAB simulation. The fall of the bit error rate with increasing diversity order for a fixed value of SNR is also included in this paper. Diversity is an influential receiver technique which offers improvement in received signal strength.
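
The trend reported here, a bit error rate that falls as SNR rises over a Rician channel, can be reproduced with a small Monte Carlo sketch. For brevity it uses BPSK (the M = 2 case) with coherent detection and perfect channel knowledge; the K-factor, bit count, and function name are assumptions chosen for illustration and are not taken from the paper, which uses MATLAB.

```python
import math
import random

def rician_bpsk_ber(snr_db, k_factor=5.0, n_bits=20000, seed=0):
    """Estimate the BPSK bit error rate over a flat Rician fading channel.

    Assumptions: unit average channel power, perfect channel knowledge at
    the receiver, and snr_db interpreted as Eb/N0 in decibels.
    """
    rng = random.Random(seed)
    snr = 10 ** (snr_db / 10)
    sigma_n = math.sqrt(1 / (2 * snr))             # noise std per dimension
    los = math.sqrt(k_factor / (k_factor + 1))      # line-of-sight amplitude
    sigma_h = math.sqrt(1 / (2 * (k_factor + 1)))   # scattered part, per dimension
    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        symbol = 1.0 if bit else -1.0
        h = complex(los + rng.gauss(0, sigma_h), rng.gauss(0, sigma_h))
        noise = complex(rng.gauss(0, sigma_n), rng.gauss(0, sigma_n))
        received = h * symbol + noise
        decision = 1 if (received * h.conjugate()).real > 0 else 0
        errors += decision != bit
    return errors / n_bits

ber_low_snr = rician_bpsk_ber(0.0)
ber_high_snr = rician_bpsk_ber(10.0)
```

Sweeping `snr_db` over a range and plotting the result on a logarithmic axis gives the usual waterfall-shaped BER curve; higher-order PSK and QAM would need a different symbol mapper and detector.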

[10]  arXiv:2002.07450 [pdf, other]
Title: Multitask Learning with Capsule Networks for Speech-to-Intent Applications
Comments: To be published in ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS)

Voice controlled applications can be a great aid to society, especially for physically challenged people. However this requires robustness to all kinds of variations in speech. A spoken language understanding system that learns from interaction with and demonstrations from the user, allows the use of such a system in different settings and for different types of speech, even for deviant or impaired speech, while also allowing the user to choose a phrasing. The user gives a command and enters its intent through an interface, after which the model learns to map the speech directly to the right action. Since the effort of the user should be as low as possible, capsule networks have drawn interest due to potentially needing little training data compared to deeper neural networks. In this paper, we show how capsules can incorporate multitask learning, which often can improve the performance of a model when the task is difficult. The basic capsule network will be expanded with a regularisation to create more structure in its output: it learns to identify the speaker of the utterance by forcing the required information into the capsule vectors. To this end we move from a speaker dependent to a speaker independent setting.

[11]  arXiv:2002.07452 [pdf, other]
Title: Hierarchical User Clustering for mmWave-NOMA Systems
Comments: Accepted to 2nd 6G Summit in March 2020 in Levi, Finland
Subjects: Signal Processing (eess.SP)

Non-orthogonal multiple access (NOMA) and mmWave are two complementary technologies that can support the capacity demand that arises in 5G and beyond networks. The increasing number of users are served simultaneously while providing a solution for the scarcity of the bandwidth. In this paper we present a method for clustering the users in a mmWave-NOMA system with the objective of maximizing the sum-rate. An unsupervised machine learning technique, namely, hierarchical clustering is utilized which does the automatic identification of the optimal number of clusters. The simulations prove that the proposed method can maximize the sum-rate of the system while satisfying the minimum QoS for all users without the need of the number of clusters as a prerequisite when compared to other clustering methods such as k-means clustering.

[12]  arXiv:2002.07461 [pdf, other]
Title: Lightweight hardware implementation of VVC transform block for ASIC decoder
Journal-ref: International Conference on Acoustics, Speech, and Signal Processing, May 2020, Barcelona, Spain
Subjects: Image and Video Processing (eess.IV); Signal Processing (eess.SP)

Versatile Video Coding (VVC) is the next generation video coding standard expected by the end of 2020. Compared to its predecessor, VVC introduces new coding tools to make compression more efficient at the expense of higher computational complexity. This raises a need to design an efficient and optimised implementation, especially for embedded platforms with limited memory and logic resources. One of the newly introduced tools in VVC is the Multiple Transform Selection (MTS). MTS involves three Discrete Cosine Transform (DCT)/Discrete Sine Transform (DST) types with larger and rectangular transform blocks. In this paper, an efficient hardware implementation of all DCT/DST transform types and sizes is proposed. The proposed design uses 32 multipliers in a pipelined architecture which targets an ASIC platform. It consists of a multi-standard architecture that supports the transform block of recent MPEG standards including AVC, HEVC and VVC. The architecture is optimized and removes unnecessary complexities found in other proposed architectures by using regular multipliers instead of multiple constant multipliers. The synthesis results show that the proposed method, which sustains a constant throughput of two pixels/cycle and constant latency for all block sizes, can reach an operational frequency of 600 MHz, enabling real-time decoding of 4K videos at 48 fps.

[13]  arXiv:2002.07476 [pdf, other]
Title: Model based fractional order controller design for process plants satisfying desired robustness criteria
Authors: Pushkar Prakash Arya (1), Sohom Chakrabarty (2) ((1),(2) Indian Institute of Technology Roorkee, INDIA)
Comments: 16 pages, 6 figures
Subjects: Systems and Control (eess.SY)

This paper contributes to the design of a fractional order (FO) internal model controller (IMC) for a first order plus time delay (FOPTD) process model to satisfy a given set of desired robustness specifications in terms of gain margin (Am) and phase margin (Pm). The highlight of the design is the choice of an FO filter in the IMC structure, which has two parameters (lambda and beta) to tune, compared to only one tuning parameter (lambda) for the traditionally used integer order (IO) filter. These parameters are evaluated for the controller so that Am and Pm can be chosen independently. A new methodology is proposed to find a complete solution for the controller parameters; the methodology also gives the system gain cross-over frequency (wg) and phase cross-over frequency (wp). Moreover, the solution is found without any approximation of the delay term appearing in the controller.

[14]  arXiv:2002.07508 [pdf, other]
Title: ELM-based Superimposed CSI Feedback for FDD Massive MIMO System
Comments: 11 pages, 7 figures
Subjects: Signal Processing (eess.SP)

In frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO), deep learning (DL)-based superimposed channel state information (CSI) feedback has presented promising performance. However, it still faces many challenges, such as the high complexity of parameter tuning, a large number of training parameters, and long training time. To overcome these challenges, an extreme learning machine (ELM)-based superimposed CSI feedback is proposed in this paper, in which the downlink CSI is spread and then superimposed on the uplink user data sequence (UL-US) to be fed back to the base station (BS). At the BS, an ELM-based network is constructed to recover both downlink CSI and UL-US. In the constructed ELM-based network, we employ simplified versions of ELM-based subnets to replace the subnets of DL-based superimposed feedback, yielding fewer training parameters. Besides, the input weights and hidden biases of each ELM-based subnet are loaded from the same matrix by using its full or partial entries, which significantly reduces the memory requirement. With similar or better recovery performance for downlink CSI and UL-US, the proposed ELM-based method has fewer training parameters and requires less storage space, offline training time, and online running time than DL-based superimposed CSI feedback.

[15]  arXiv:2002.07590 [pdf]
Title: Speech Emotion Recognition using Support Vector Machine
Subjects: Audio and Speech Processing (eess.AS); Information Retrieval (cs.IR); Sound (cs.SD)

In this project, we aim to classify the speech taken as one of the four emotions namely, sadness, anger, fear and happiness. The samples that have been taken to complete this project are taken from Linguistic Data Consortium (LDC) and UGA database. The important characteristics determined from the samples are energy, pitch, MFCC coefficients, LPCC coefficients and speaker rate. The classifier used to classify these emotional states is Support Vector Machine (SVM) and this is done using two classification strategies: One against All (OAA) and Gender Dependent Classification. Furthermore, a comparative analysis has been conducted between the two and LPCC and MFCC algorithms as well.

[16]  arXiv:2002.07599 [pdf, other]
Title: ELM-based Frame Synchronization in Burst-Mode Communication Systems with Nonlinear Distortion
Comments: 5 pages, 8 figures
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)

In burst-mode communication systems, the quality of frame synchronization (FS) at receivers significantly impacts the overall system performance. To guarantee FS, an extreme learning machine (ELM)-based synchronization method is proposed to overcome the nonlinear distortion caused by nonlinear devices or blocks. In the proposed method, preprocessing is first performed to capture the coarse features of the synchronization metric (SM) by using empirical knowledge. Then, an ELM-based FS network is employed to reduce the system's nonlinear distortion and improve SMs. Experimental results indicate that, compared with existing methods, our approach could significantly reduce the error probability of FS while improving the performance in terms of robustness and generalization.

[17]  arXiv:2002.07603 [pdf]
Title: Decentralized Dynamic State Estimation with Bimodal Gaussian Mixture Measurement Noise
Comments: 5 pages, 8 figures
Subjects: Signal Processing (eess.SP)

This paper proposes a decentralized dynamic state estimation (DSE) algorithm with bimodal Gaussian mixture measurement noise. The decentralized DSE is formulated using the Ensemble Kalman Filter (EnKF) and then compared with the unscented Kalman filter (UKF). The performance of the proposed framework is verified using the WSCC 9-bus system simulated in the Real Time Digital Simulator (RTDS). The phasor measurement unit (PMU) measurements are streamed in real-time from the RTDS runtime environment to MATLAB for real-time visualization and estimation. To consider the data corruption scenario in the streaming process, a bimodal distribution containing two normal distributions with different weights and variances is added to the measurements as the noise component. The performances of both UKF and EnKF are then compared by calculating the mean-squared-errors (MSEs) between the actual and estimated states.
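
The noise model and error metric described here can be sketched as follows. This is a minimal illustration that draws samples from a two-component Gaussian mixture and computes a mean squared error; the component weights, means, and standard deviations in `COMPONENTS` are hypothetical placeholders, since the abstract does not state the values used.

```python
import random

# Hypothetical mixture: (weight, mean, standard deviation) per component.
COMPONENTS = [(0.7, 0.0, 0.05), (0.3, 0.5, 0.2)]

def bimodal_noise(n_samples, components=COMPONENTS, seed=0):
    """Draw noise samples from a two-component Gaussian mixture."""
    rng = random.Random(seed)
    first_weight = components[0][0]
    samples = []
    for _ in range(n_samples):
        # Pick a component according to its weight, then sample from it.
        _, mean, std = components[0] if rng.random() < first_weight else components[1]
        samples.append(rng.gauss(mean, std))
    return samples

def mse(actual, estimated):
    """Mean squared error between actual and estimated state values."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimated)) / len(actual)

noise = bimodal_noise(10000)
sample_mean = sum(noise) / len(noise)
# Corrupt a made-up constant measurement stream with the mixture noise.
true_values = [1.0] * len(noise)
noisy_values = [v + w for v, w in zip(true_values, noise)]
corruption_mse = mse(true_values, noisy_values)
```

With these placeholder parameters the sample mean should sit near the mixture mean of 0.15, so a filter that assumes zero-mean Gaussian noise sees a biased, heavy-tailed disturbance.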

[18]  arXiv:2002.07604 [pdf]
Title: Dataset after seven years simulating hybrid energy systems with Homer Legacy
Comments: 7 pages, 1 table, 23 references
Subjects: Signal Processing (eess.SP)

Homer Legacy is a well-known software package for the simulation of small hybrid systems that can be used for both design and research. This dataset is a set of files generated by Homer Legacy containing the simulation results of hybrid energy systems over the last seven years, as a consequence of the research work led by Dr. Alexandre Beluco, Federal University of Rio Grande do Sul, in southern Brazil. The data correspond to thirty papers published in the last seven years. Two of them describe hydro PV hybrid systems with photovoltaic panels operating on the water surface of reservoirs. One of these twelve papers suggests the modeling of hydropower plants with reservoirs and the other the modeling of pumped hydro storage, and a third still uses these models in a place that could receive both types of hydroelectric power plant. The other simulated hybrid systems include wind turbines, diesel generators, batteries, among other components. This data article describes the files that make up this dataset and the papers that have been published presenting the hybrid systems under study and discussing the results. The files that make up this dataset are available on the Mendeley Data repository at dx.doi.org/10.17632/ybxsttf2by.2.

[19]  arXiv:2002.07605 [pdf]
Title: A comprehensive review on convolutional neural network in machine fault diagnosis
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Machine Learning (stat.ML)

With the rapid development of the manufacturing industry, machine fault diagnosis has become increasingly significant to ensure safe equipment operation and production. Consequently, multifarious approaches have been explored and developed in the past years, of which intelligent algorithms have developed particularly rapidly. The convolutional neural network, as a typical representative of intelligent diagnostic models, has been extensively studied and applied in the past five years, and a large amount of literature has been published in academic journals and conference proceedings. However, there has not been a systematic review to cover these studies and outline prospects for further research. To fill this gap, this work attempts to review and summarize the development of Convolutional Network based Fault Diagnosis (CNFD) approaches comprehensively. Generally, a typical CNFD framework is composed of the following steps: data collection, model construction, and feature learning and decision making; this paper is organized accordingly. Firstly, the data collection process is described, in which several popular datasets are introduced. Then, the fundamental theory from the basic convolutional neural network to its variants is elaborated. After that, the applications of CNFD are reviewed in terms of three mainstream directions, i.e. classification, prediction and transfer diagnosis. Finally, conclusions and prospects are presented to point out the characteristics of current development, the challenges faced, and future trends. Last but not least, it is expected that this work will provide convenience and inspire further exploration for researchers in this field.

[20]  arXiv:2002.07609 [pdf]
Title: Explicit Circular Harmonic Inversions of Exponential Radon Transform
Subjects: Signal Processing (eess.SP)

Using Plemelj formula we obtain three circular harmonic inversion formulas of the exponential Radon transform with complex coefficients. We also derive two different range conditions and prove that Novikov's range condition does imply the traditional range condition for real coefficients.

[21]  arXiv:2002.07621 [pdf]
Title: Image Entropy for Classification and Analysis of Pathology Slides
Authors: Steven J. Frank
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Pathology slides of lung malignancies are classified using the "Salient Slices" technique described in Frank et al., 2020. A four-fold cross-validation study using a small image set (42 adenocarcinoma slides and 42 squamous cell carcinoma slides) produced fully correct classifications in each fold. Probability maps enable visualization of the underlying basis for a classification.

[22]  arXiv:2002.07626 [pdf]
Title: Transmit Power Optimization in Optical Coherent Transmission Systems: Analytical, Simulation, and Experimental Results
Subjects: Signal Processing (eess.SP); Optimization and Control (math.OC)

In this paper, we propose to use the discretized version of the so-called Enhanced Gaussian Noise (EGN) model to estimate the non-linearity effects of fiber on the performance of optical coherent and uncompensated transmission (CUT) systems. By computing the power of non-linear interference noise and considering optical amplifier noise, we obtain the signal-to-noise ratio (SNR) and achievable rate of CUT. To allocate the power of each CUT channel, we consider two optimization problems with the objectives of maximizing minimum SNR margin and achievable rate. We show that by using the discretized EGN model, the complexity of the introduced optimization problems is reduced compared with the existing optimization problems developed based on the so-called discretized Gaussian Noise (GN) model. In addition, the optimization based on the discretized EGN model leads to a better SNR and achievable rate. We validate our analytical results with simulations and experimental results. We simulate a five-channel coherent system in OptiSystem software, where a close agreement is observed between optimizations and simulations. Furthermore, we measure the SNR of a commercial 100 Gbps coherent transmitter over 300 km of single-mode fiber (SMF) and non-zero dispersion-shifted fiber (NZDSF), considering single-channel and three-channel coherent systems. We observe performance gaps between experimental and analytical results, which are mainly due to other sources of noise in experiments, such as transmitter imperfection noise, thermal noise, and shot noise. By including these sources of noise in the analytical model, the gaps between analytical and experimental results are reduced.

[23]  arXiv:2002.07629 [pdf, other]
Title: Multi-Task Siamese Neural Network for Improving Replay Attack Detection
Comments: Submitted to INTERSPEECH 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)

Automatic speaker verification systems are vulnerable to audio replay attacks which bypass security by replaying recordings of authorized speakers. Replay attack (RA) detection systems built upon Residual Neural Networks (ResNets) have yielded astonishing results on the public benchmark ASVspoof 2019 Physical Access challenge. With most teams using fine-tuned feature extraction pipelines and model architectures, however, the generalizability of such systems remains questionable. In this work, we analyse the effect that discriminative feature learning in a multi-task learning (MTL) setting can have on the generalizability and discriminability of RA detection systems. We use a popular ResNet architecture optimized by the cross-entropy criterion as our baseline and compare it to the same architecture optimized by MTL using Siamese Neural Networks (SNN). It can be shown that SNN outperform the baseline by a relative 26.8 % in Equal Error Rate (EER). We further enhance the model's architecture and demonstrate that SNN with an additional reconstruction loss yield another significant improvement of a relative 13.8 % in EER.
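
The Equal Error Rate metric and the notion of a relative improvement used in this abstract can be sketched as follows. The threshold sweep is a simple approximation of the EER, and all scores and error rates below are made-up values for illustration; none come from the paper.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted as genuine when its score is >= the threshold. The
    EER is taken at the threshold where the false acceptance rate (FAR) and
    false rejection rate (FRR) are closest.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        frr = sum(g < threshold for g in genuine_scores) / len(genuine_scores)
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction of a new system with respect to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer

# Hypothetical detector scores: higher means "more likely genuine".
eer = equal_error_rate([0.6, 0.4, 0.9, 0.8], [0.5, 0.1, 0.2, 0.3])
# Hypothetical EERs chosen so the relative reduction is 26.8 %.
reduction = relative_reduction(0.10, 0.0732)
```

A relative figure like 26.8 % says nothing about the absolute EER on its own, so it has to be read together with the baseline value reported in the paper.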

[24]  arXiv:2002.07631 [pdf, other]
Title: Wireless Power Control via Counterfactual Optimization of Graph Neural Networks
Comments: Submitted to the 21st IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2020)
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)

We consider the problem of downlink power control in wireless networks, consisting of multiple transmitter-receiver pairs communicating with each other over a single shared wireless medium. To mitigate the interference among concurrent transmissions, we leverage the network topology to create a graph neural network architecture, and we then use an unsupervised primal-dual counterfactual optimization approach to learn optimal power allocation decisions. We show how the counterfactual optimization technique allows us to guarantee a minimum rate constraint, which adapts to the network size, hence achieving the right balance between average and $5^{th}$ percentile user rates throughout a range of network configurations.

[25]  arXiv:2002.07703 [pdf]
Title: Deep Learning in Medical Ultrasound Image Segmentation: a Review
Authors: Ziyang Wang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)

Applying machine learning technologies, especially deep learning, to medical image segmentation is being widely studied because of its state-of-the-art performance and results. It can be a key step in providing a reliable basis for clinical diagnosis, such as 3D reconstruction of human tissues, image-guided interventions, and image analysis and visualization. In this review article, deep-learning-based methods for ultrasound image segmentation are first categorized into six main groups according to their architectures and training. Secondly, for each group, several current representative algorithms are selected, introduced, analyzed and summarized in detail. In addition, common evaluation methods for image segmentation and ultrasound image segmentation datasets are summarized. Further, the performance of the current methods and their evaluations are reviewed. Finally, the challenges and potential research directions for medical ultrasound image segmentation are discussed.

[26]  arXiv:2002.07711 [pdf]
Title: An Energy-Efficient Accelerator Architecture with Serial Accumulation Dataflow for Deep CNNs
Comments: 4 pages
Subjects: Signal Processing (eess.SP); Hardware Architecture (cs.AR)

Convolutional Neural Networks (CNNs) have shown outstanding accuracy for many vision tasks in recent years. When deploying CNNs on portable devices and embedded systems, however, the large number of parameters and computations results in long processing time and low battery life. An important factor in designing CNN hardware accelerators is to efficiently map the convolution computation onto hardware resources. In addition, to save battery life and reduce energy consumption, it is essential to reduce the number of DRAM accesses, since a DRAM access consumes orders of magnitude more energy than other operations in hardware. In this paper, we propose an energy-efficient architecture which maximally utilizes its computational units for convolution operations while requiring a low number of DRAM accesses. The implementation results show that the proposed architecture performs one image recognition task using the VGGNet model with a latency of 393 ms and only 251.5 MB of DRAM accesses.

[27]  arXiv:2002.07713 [pdf, other]
Title: Intelligent and Reconfigurable Architecture for KL Divergence Based Online Machine Learning Algorithm
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)

Online machine learning (OML) algorithms do not need any training phase and can be deployed directly in an unknown environment. OML includes multi-armed bandit (MAB) algorithms that can identify the best arm among several arms by achieving a balance between exploration of all arms and exploitation of the optimal arm. The Kullback-Leibler divergence based upper confidence bound (KLUCB) is the state-of-the-art MAB algorithm that optimizes the exploration-exploitation trade-off, but it is complex due to its underlying optimization routine. This limits its usefulness for robotics and radio applications, which demand integration of KLUCB with the PHY on the system on chip (SoC). In this paper, we efficiently map the KLUCB algorithm onto an SoC by realizing the optimization routine via an alternative synthesizable computation without compromising on performance. The proposed architecture is dynamically reconfigurable such that the number of arms, as well as the type of algorithm, can be changed on-the-fly. Specifically, after initial learning, an on-the-fly switch to light-weight UCB offers around a 10-fold improvement in latency and throughput. Since the learning duration depends on the unknown arm statistics, we embed intelligence in the architecture to decide the switching instant. We validate the functional correctness and usefulness of the proposed architecture via a realistic wireless application, and a detailed complexity analysis demonstrates its feasibility in realizing intelligent radios.
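
To illustrate why the KLUCB index is costlier than plain UCB, here is a minimal software sketch, assuming Bernoulli rewards and the common exploration budget log(t)/n. The bisection below is the textbook optimization routine, not the synthesizable alternative proposed in the paper, and all function names are hypothetical.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped for safety."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def klucb_index(mean, pulls, t, iters=50):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t), via bisection."""
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid  # mid is too optimistic
    return lo


def ucb1_index(mean, pulls, t):
    """Closed-form UCB1 index: no iterative search needed."""
    return mean + math.sqrt(2 * math.log(t) / pulls)


# An arm with empirical mean 0.5 after 10 pulls, at round t = 100.
print(klucb_index(0.5, 10, 100), ucb1_index(0.5, 10, 100))
```

The KLUCB index needs an iterative search per arm per round, while the UCB1 index is a single closed-form expression, which is consistent with the latency gain the abstract reports after switching to UCB.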

[28]  arXiv:2002.07777 [pdf, other]
Title: Deep Learning Approaches for Open Set Wireless Transmitter Authorization
Comments: Submitted to SPAWC 2020
Subjects: Signal Processing (eess.SP)

Wireless signals contain transmitter specific features, which can be used to verify the identity of transmitters and assist in implementing an authentication and authorization system. Most recently, there has been wide interest in using deep learning for transmitter identification. However, the existing deep learning work has posed the problem as closed set classification, where a neural network classifies among a finite set of known transmitters. No matter how large this set is, it will not include all transmitters that exist. Malicious transmitters outside this closed set, once within communications range, can jeopardize the system security. In this paper, we propose a deep learning approach for transmitter authorization based on open set recognition. Our proposed approach identifies a set of authorized transmitters, while rejecting any other unseen transmitters by recognizing their signals as outliers. We propose three approaches for this problem and show their ability to reject signals from unauthorized transmitters on a dataset of WiFi captures. We consider the structure of training data needed, and we show that the accuracy improves by having signals from known unauthorized transmitters in the training set.

[29]  arXiv:2002.07806 [pdf, other]
Title: Data-Driven Symbol Detection via Model-Based Machine Learning
Comments: arXiv admin note: text overlap with arXiv:1905.10750
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)

The design of symbol detectors in digital communication systems has traditionally relied on statistical channel models that describe the relation between the transmitted symbols and the observed signal at the receiver. Here we review a data-driven framework for symbol detection design which combines machine learning (ML) and model-based algorithms. In this hybrid approach, well-known channel-model-based algorithms such as the Viterbi method, BCJR detection, and multiple-input multiple-output (MIMO) soft interference cancellation (SIC) are augmented with ML-based algorithms to remove their channel-model-dependence, allowing the receiver to learn to implement these algorithms solely from data. The resulting data-driven receivers are most suitable for systems where the underlying channel models are poorly understood, highly complex, or do not capture the underlying physics well. Our approach is unique in that it only replaces the channel-model-based computations with dedicated neural networks that can be trained from a small amount of data, while keeping the general algorithm intact. Our results demonstrate that these techniques can yield near-optimal performance of model-based algorithms without knowing the exact channel input-output statistical relationship and in the presence of channel state information uncertainty.

Cross-lists for Wed, 19 Feb 20

[30]  arXiv:2001.10728 (cross-list from eess.SP) [pdf, other]
Title: Design of Non-orthogonal and Noncoherent Massive MIMO for Scalable URLLC Beyond 5G
Comments: arXiv admin note: text overlap with arXiv:1903.01642
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)

This paper designs and optimizes a non-orthogonal and noncoherent massive multiple-input multiple-output (MIMO) framework towards enabling scalable ultra-reliable low-latency communications (sURLLC) in wireless systems beyond 5G. In this framework, the huge diversity gain associated with the large-scale antenna array in massive MIMO systems is leveraged to ensure ultrahigh reliability. To reduce the overhead and latency induced by the channel estimation process, we advocate the noncoherent communication technique, which does not need knowledge of the instantaneous channel state information (CSI) but depends only on the large-scale fading coefficients for information decoding. To boost the scalability of the system considered, we enable non-orthogonal channel access of multiple users by devising a new differential modulation scheme to ensure that each transmitted signal matrix can be uniquely determined in the noise-free case and be reliably estimated in noisy cases when the antenna array size is scaled up. The key idea is to superimpose the transmitted signals from multiple users properly over the air such that, when the sum-signal is correctly detected, the signals sent by all users can be uniquely determined. To further improve the average error performance when the number of array antennas is large, we propose a max-min Kullback-Leibler (KL) divergence-based design by jointly optimizing the transmitted powers of all users and the sub-constellation assignment among them. Simulation results show that the proposed design significantly outperforms the existing max-min Euclidean distance-based counterpart in terms of error performance. Moreover, our proposed approach also has a better error performance than the conventional coherent zero-forcing (ZF) receiver with orthogonal channel training, particularly for cell-edge users.

[31]  arXiv:2002.00403 (cross-list from cs.IT) [pdf, other]
Title: Multiuser Scheduling for Minimizing Age of Information in Uplink MIMO Systems
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

This paper studies the user scheduling problem in a multiuser multiple-input multiple-output (MIMO) status update system, in which multiple single-antenna devices aim to send their latest statuses to a multiple-antenna information-fusion access point (AP) via a shared wireless channel. The information freshness in the considered system is quantified by a recently proposed metric, termed age of information (AoI). Thanks to the extra spatial degrees-of-freedom brought about by the multiple antennas at the AP, multiple devices can be granted permission to transmit simultaneously in each time slot. We seek the optimal scheduling policy that minimizes the network-wide AoI by optimally deciding which device or group of devices is scheduled for transmission in each slot, given the instantaneous AoI values of all devices at the beginning of the slot. To that end, we formulate the multiuser scheduling problem as a Markov decision process (MDP). We attain the optimal policy by solving the formulated MDP problem and also develop a low-complexity sub-optimal policy. Simulation results show that the proposed optimal and sub-optimal policies significantly outperform the state-of-the-art benchmark schemes.
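
The AoI dynamics that such a scheduler acts on can be sketched in a few lines. The following is a simplified illustration under strong assumptions that are not taken from the abstract: slots are discrete, every scheduled transmission succeeds, and a fresh update resets a device's age to 1. The max-age greedy rule shown here is a simple baseline, not the MDP-based policy of the paper, and all names are hypothetical.

```python
def step_aoi(ages, scheduled):
    """Advance one slot: scheduled devices reset to age 1, others grow by 1."""
    return [1 if i in scheduled else a + 1 for i, a in enumerate(ages)]


def greedy_schedule(ages, m):
    """Baseline rule: pick the m devices with the largest current AoI."""
    order = sorted(range(len(ages)), key=lambda i: ages[i], reverse=True)
    return set(order[:m])


def simulate(ages, m, slots):
    """Return the final ages and the time-averaged per-device AoI."""
    total = 0
    for _ in range(slots):
        ages = step_aoi(ages, greedy_schedule(ages, m))
        total += sum(ages)
    return ages, total / (slots * len(ages))


# Four devices, an AP that can serve two of them per slot, four slots.
final_ages, avg_aoi = simulate([1, 1, 1, 1], m=2, slots=4)
print(final_ages, avg_aoi)
```

In the paper's setting, which devices can share a slot depends on the MIMO channel, which is what makes the grouping decision non-trivial; the sketch ignores that coupling.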

[32]  arXiv:2002.07268 (cross-list from physics.soc-ph) [pdf]
Title: P2C2: Peer-to-Peer Car Charging
Subjects: Physics and Society (physics.soc-ph); Signal Processing (eess.SP)

With the rising concerns of fossil fuel depletion and the impact of Internal Combustion Engine (ICE) vehicles on our climate, the transportation industry is observing a rapid proliferation of electric vehicles (EVs). However, long-distance travel with an EV is not yet possible without making multiple halts at EV charging stations. Many remote regions do not have charging stations, and even if they are present, it can take several hours to recharge the battery. Conversely, ICE vehicle fueling stations are much more prevalent, and re-fueling takes a couple of minutes. These facts have deterred many from moving to EVs. Existing solutions to these problems, such as building more charging stations, increasing battery capacity, and road-charging, have not proven efficient so far. In this paper, we propose Peer-to-Peer Car Charging (P2C2), a highly scalable novel technique for charging EVs on the go with minimal cost overhead. We allow EVs to share charge with each other based on instructions from a cloud-based control system. The control system assigns and guides EVs for charge sharing. We also introduce Mobile Charging Stations (MoCS), which are high-battery-capacity vehicles used to replenish the overall charge in the vehicle network. We have implemented P2C2 and integrated it with the traffic simulator SUMO. We observe promising results, with up to 65% reduction in the number of EV halts and up to 24.4% reduction in required battery capacity without any extra halts.

[33]  arXiv:2002.07284 (cross-list from math.ST) [pdf, other]
Title: Sharp Asymptotics and Optimal Performance for Inference in Binary Models
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Signal Processing (eess.SP); Machine Learning (stat.ML)

We study convex empirical risk minimization for high-dimensional inference in binary models. Our first result sharply predicts the statistical performance of such estimators in the linear asymptotic regime under isotropic Gaussian features. Importantly, the predictions hold for a wide class of convex loss functions, which we exploit in order to prove a bound on the best achievable performance among them. Notably, we show that the proposed bound is tight for popular binary models (such as Signed, Logistic or Probit), by constructing appropriate loss functions that achieve it. More interestingly, for binary linear classification under the Logistic and Probit models, we prove that the performance of least-squares is no worse than 0.997 and 0.98 times the optimal one. Numerical simulations corroborate our theoretical findings and suggest they are accurate even for relatively small problem dimensions.

[34]  arXiv:2002.07340 (cross-list from cs.CR) [pdf, other]
Title: Secure Status Updates under Eavesdropping: Age of Information-based Physical Layer Security Metrics
Comments: Submitted for possible publication. The first two authors contributed equally to this work
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

This letter studies the problem of maintaining information freshness under passive eavesdropping attacks. The classical three-node wiretap channel model is considered, in which a source aims to send its latest status wirelessly to its intended destination, while protecting the message from being overheard by an eavesdropper. Considering that conventional channel capacity-based secrecy metrics are no longer adequate to measure the information timeliness in status update systems, we define two new age of information-based metrics to characterize the secrecy performance of the considered system. We further propose, analyze, and optimize a randomized stationary transmission policy implemented at the source for further enhancing the secrecy performance. Simulation results are provided to validate our analysis and optimization.

[35]  arXiv:2002.07374 (cross-list from cs.IT) [pdf]
Title: Impact of Fountain Codes on GPRS channels
Comments: conference
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)

The rateless and information-additive properties of fountain codes make them attractive for use in broadcast/multicast applications, especially in radio environments where channel characteristics vary with time and bandwidth is expensive. Conventional schemes using a combination of ARQ (Automatic Repeat reQuest) and FEC (Forward Error Correction) suffer from serious drawbacks, such as feedback implosion at the transmitter, the need to know the channel characteristics a priori so that the FEC scheme is designed to be effective, and the fact that a reverse channel is needed to request retransmissions if the FEC fails. This paper considers the assessment of fountain codes over radio channels. The performance of fountain codes, in terms of the associated overheads, over radio channels of the type experienced in GPRS (General Packet Radio Service) is presented. The work is then extended to assessing the performance of fountain codes in combination with the GPRS channel coding schemes in a radio environment.

[36]  arXiv:2002.07378 (cross-list from math.OC) [pdf, other]
Title: Distributed Adaptive Newton Methods with Globally Superlinear Convergence
Comments: 14 pages,4 figures
Subjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA); Signal Processing (eess.SP); Systems and Control (eess.SY)

This paper considers the distributed optimization problem over a network where the global objective is to optimize a sum of local functions using only local computation and communication. Since the existing algorithms either adopt a linear consensus mechanism, which converges at best linearly, or assume that each node starts sufficiently close to an optimal solution, they cannot achieve globally superlinear convergence. To break through the linear consensus rate, we propose a finite-time set-consensus method, and then incorporate it into Polyak's adaptive Newton method, leading to our distributed adaptive Newton algorithm (DAN). To avoid transmitting local Hessians, we adopt a low-rank approximation idea to compress the Hessian and design a communication-efficient DAN-LA. The size of transmitted messages in DAN-LA is thereby reduced to $O(p)$ per iteration, where $p$ is the dimension of the decision vectors, which is the same as for first-order methods. We show that DAN and DAN-LA can globally achieve quadratic and superlinear convergence rates, respectively. Numerical experiments on logistic regression problems are finally conducted to show the advantages over existing methods.

[37]  arXiv:2002.07380 (cross-list from cs.NI) [pdf, ps, other]
Title: Network Slicing for Service-Oriented Networks with Flexible Routing and Guaranteed E2E Latency
Comments: 5 pages, 3 figures, submitted for possible publication
Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Signal Processing (eess.SP); Optimization and Control (math.OC)

Network function virtualization is a promising technology to simultaneously support multiple services with diverse characteristics and requirements in fifth generation and beyond networks. In practice, each service consists of a predetermined sequence of functions, called a service function chain (SFC), running in a cloud environment. To make different service slices work properly in harmony, it is crucial to select the cloud nodes on which to deploy the functions in the SFC and to flexibly route the flow of the services such that these functions are processed in sequence, the end-to-end (E2E) latency constraints of all services are guaranteed, and all resource constraints are respected. In this paper, we propose a new (mixed binary linear program) formulation of the above network slicing problem that optimizes the system energy efficiency while jointly considering the resource budget, functional instantiation, flow routing, and E2E latency requirement. Numerical results show the advantage of the proposed formulation compared to the existing ones.

[38]  arXiv:2002.07575 (cross-list from cs.AI) [pdf]
Title: AdaEnsemble Learning Approach for Metro Passenger Flow Forecasting
Subjects: Artificial Intelligence (cs.AI); Signal Processing (eess.SP)

Accurate and timely metro passenger flow forecasting is critical for the successful deployment of intelligent transportation systems. However, it is quite challenging to propose an efficient and robust forecasting approach due to the inherent randomness and variations of metro passenger flow. In this study, we present a novel adaptive ensemble (AdaEnsemble) learning approach to accurately forecast the volume of metro passenger flows. It combines the complementary advantages of variational mode decomposition (VMD), the seasonal autoregressive integrated moving average (SARIMA) model, the multilayer perceptron (MLP) network and the long short-term memory (LSTM) network. The AdaEnsemble learning approach consists of three important stages. The first stage applies VMD to decompose the metro passenger flow data into a periodic component, a deterministic component and a volatility component. Then we employ a SARIMA model to forecast the periodic component, an LSTM network to learn and forecast the deterministic component and an MLP network to forecast the volatility component. In the last stage, the diverse forecasted components are reconstructed by another MLP network. The empirical results, based on historical passenger flow data from the Shenzhen subway system and several standard evaluation measures, show that our proposed AdaEnsemble learning approach not only has the best forecasting performance compared with the state-of-the-art models but also appears to be the most promising and robust.

[39]  arXiv:2002.07579 (cross-list from physics.ao-ph) [pdf, other]
Title: Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks
Subjects: Atmospheric and Oceanic Physics (physics.ao-ph); Image and Video Processing (eess.IV); Computational Physics (physics.comp-ph)

We introduce a conditional Generative Adversarial Network (cGAN) approach to generate cloud reflectance fields (CRFs) conditioned on large scale meteorological variables such as sea surface temperature and relative humidity. We show that our trained model can generate realistic CRFs from the corresponding meteorological observations, which represents a step towards a data-driven framework for stochastic cloud parameterization.

[40]  arXiv:2002.07583 (cross-list from cs.IT) [pdf, other]
Title: Optical Rate-Splitting Multiple Access for Visible Light Communications
Comments: 17 pages, 14 figures
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP); Systems and Control (eess.SY)

The proliferation of connected devices and emergence of internet-of-everything represent a major challenge for broadband wireless networks. This requires a paradigm shift towards the development of innovative technologies for next generation wireless systems. One of the key challenges is the scarcity of spectrum, owing to the unprecedented broadband penetration rate in recent years. A promising solution is the proposal of visible light communications (VLC), which explores the unregulated visible light spectrum to enable high-speed communications, in addition to efficient lighting. This solution offers a wider bandwidth that can accommodate ubiquitous broadband connectivity to indoor users and offload data traffic from cellular networks. Although VLC is secure and able to overcome the shortcomings of RF systems, it suffers from several limitations, e.g., limited modulation bandwidth. In this respect, solutions have been proposed recently to overcome this limitation. In particular, most common orthogonal and non-orthogonal multiple access techniques initially proposed for RF systems, e.g., space-division multiple access (SDMA) and NOMA, have been considered in the context of VLC. In spite of their promising gains, the performance of these techniques is somewhat limited. Consequently, in this article a new and generalized multiple access technique, called rate-splitting multiple access (RSMA), is introduced and investigated for the first time in VLC networks. We first provide an overview of the key multiple access technologies used in VLC systems. Then, we propose the first comprehensive approach to the integration of RSMA with VLC systems. In our proposed framework, SINR expressions are derived and used to evaluate the weighted sum rate (WSR) of a two-user scenario. Our results illustrate the flexibility of RSMA in generalizing NOMA and SDMA, and its WSR superiority in the VLC context.

[41]  arXiv:2002.07593 (cross-list from cs.NI) [pdf, other]
Title: Active Learning-based Classification in Automated Connected Vehicles
Journal-ref: PERSIST-IoT 2020: IEEE INFOCOM workshop on Pervasive Systems in the IoT era
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

Machine learning has emerged as a promising paradigm for enabling connected, automated vehicles to autonomously cruise the streets and react to unexpected situations. A key challenge, however, is to collect and select real-time and reliable information for the correct classification of unexpected, and often rare, situations that may happen on the road. Indeed, the data generated by vehicles, or received from neighboring vehicles, may be affected by errors or have different levels of resolution and freshness. To tackle this challenge, we propose an active learning framework that, leveraging the information collected through onboard sensors as well as received from other vehicles, effectively deals with scarce and noisy data. In particular, given the available information, our solution selects the data to add to the training set by trading off between two essential features, namely, quality and diversity. The results, obtained using real-world data sets, show that the proposed method significantly outperforms state-of-the-art solutions, providing high classification accuracy at the cost of a limited bandwidth requirement for the data exchange between vehicles.

[42]  arXiv:2002.07601 (cross-list from cs.IT) [pdf, other]
Title: ADMM-based Decoder for Binary Linear Codes Aided by Deep Learning
Comments: 5 pages, 4 figures, accepted for publication in IEEE communications letters
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)

Inspired by the recent advances in deep learning (DL), this work presents a deep neural network aided decoding algorithm for binary linear codes. Based on the concept of deep unfolding, we design a decoding network by unfolding the alternating direction method of multipliers (ADMM)-penalized decoder. In addition, we propose two improved versions of the proposed network. The first one transforms the penalty parameter into a set of iteration-dependent ones, and the second one adopts a specially designed penalty function, which is based on a piecewise linear function with adjustable slopes. Numerical results show that the resulting DL-aided decoders outperform the original ADMM-penalized decoder for various low density parity check (LDPC) codes with similar computational complexity.

[43]  arXiv:2002.07613 (cross-list from cs.CV) [pdf, other]
Title: An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)

Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we extend the globally-aware multiple instance classifier, a framework we proposed to address these unique properties of medical images. This model first uses a low-capacity, yet memory-efficient, network on the whole image to identify the most informative regions. It then applies another higher-capacity network to collect details from chosen regions. Finally, it employs a fusion module that aggregates global and local information to make a final prediction. While existing methods often require lesion segmentation during training, our model is trained with only image-level labels and can generate pixel-level saliency maps indicating possible malignant findings. We apply the model to screening mammography interpretation: predicting the presence or absence of benign and malignant lesions. On the NYU Breast Cancer Screening Dataset, consisting of more than one million images, our model achieves an AUC of 0.93 in classifying breasts with malignant findings, outperforming ResNet-34 and Faster R-CNN. Compared to ResNet-34, our model is 4.1x faster for inference while using 78.4% less GPU memory. Furthermore, we demonstrate, in a reader study, that our model surpasses radiologist-level AUC by a margin of 0.11. The proposed model is available online: https://github.com/nyukat/GMIC.

[44]  arXiv:2002.07630 (cross-list from math.OC) [pdf]
Title: Extending iLQR method with control delay
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Systems and Control (eess.SY)

Iterative linear quadratic regulator (iLQR) has become a benchmark method for nonlinear stochastic optimal control problems. However, it does not apply to systems with delay. In this paper, we extend the iLQR theory and prove a new theorem for the case of an input signal with fixed delay, which could be beneficial for machine learning or optimal control applications in real-time robots or human assistive devices.

[45]  arXiv:2002.07643 (cross-list from cs.CV) [pdf]
Title: Neural arbitrary style transfer for portrait images using the attention mechanism
Comments: in Russian
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Arbitrary style transfer is the task of synthesizing an image that has never been seen before, using two given images: a content image and a style image. The content image forms the structure, the basic geometric lines and shapes of the resulting image, while the style image sets the color and texture of the result. The word "arbitrary" in this context means the absence of any one pre-learned style. So, for example, convolutional neural networks capable of transferring a new style only after training or retraining on a new amount of data are not considered to solve such a problem, while networks based on the attention mechanism that are capable of performing such a transformation without retraining are. An original image can be, for example, a photograph, and a style image can be a painting by a famous artist. The resulting image in this case will be the scene depicted in the original photograph, rendered in the style of this painting. Recent arbitrary style transfer algorithms make it possible to achieve good results in this task; however, in processing portrait images of people, the result of such algorithms is either unacceptable due to excessive distortion of facial features, or weakly expressed, not bearing the characteristic features of the style image. In this paper, we consider an approach to solving this problem using a combined architecture of deep neural networks with an attention mechanism that transfers style based on the contents of a particular image segment: with a clear predominance of style over form for the background part of the image, and with the prevalence of content over form in the part of the image that directly contains the person.

[46]  arXiv:2002.07646 (cross-list from math.OC) [pdf]
Title: Multi-objective Optimal Reactive Power Dispatch of Power Systems by Combining Classification Based Multi-objective Evolutionary Algorithm and Integrated Decision Making
Authors: Meng Zhang, Yang Li
Comments: Accepted by IEEE Access
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)

For the purpose of addressing the multi-objective optimal reactive power dispatch (MORPD) problem, a two-step approach is proposed in this paper. First of all, to ensure the economy and security of the power system, the MORPD model aiming to minimize active power loss and voltage deviation is formulated. And then the two-step approach integrating decision-making into optimization is proposed to solve the model. Specifically speaking, the first step aims to seek the Pareto optimal solutions (POSs) with good distribution by using a multi-objective optimization (MOO) algorithm named classification and Pareto domination based multi-objective evolutionary algorithm (CPSMOEA). Furthermore, the reference Pareto-optimal front is generated to validate the Pareto front obtained using CPSMOEA; in the second step, integrated decision-making by combining fuzzy c-means algorithm (FCM) with grey relation projection method (GRP) aims to extract the best compromise solutions which reflect the preferences of decision-makers from the POSs. Based on the test results on the IEEE 30-bus and IEEE 118-bus test systems, it is demonstrated that the proposed approach not only manages to address the MORPD issue but also outperforms other commonly-used MOO algorithms including multi-objective particle swarm optimization (MOPSO), preference-inspired coevolutionary algorithm (PICEAg) and the third evolution step of generalized differential evolution (GDE3).
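
The notion of Pareto optimal solutions used in the first step can be illustrated with a small non-dominated filter for two minimization objectives (here standing for active power loss and voltage deviation). This is only a sketch of Pareto dominance itself, not of CPSMOEA or the decision-making step, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (power loss, voltage deviation) pairs, for illustration only.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(pareto_front(candidates))
```

Here (3.0, 4.0) is dropped because (2.0, 3.0) is better in both objectives; the remaining points are mutually non-dominated, which is why a separate decision-making step is needed to pick a compromise among them.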

[47]  arXiv:2002.07662 (cross-list from cs.CV) [pdf, other]
Title: FeatureNMS: Non-Maximum Suppression by Learning Feature Embeddings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Most state-of-the-art object detectors output multiple detections per object. The duplicates are removed in a post-processing step called Non-Maximum Suppression (NMS). Classical Non-Maximum Suppression has shortcomings in scenes that contain objects with high overlap: the idea of this heuristic is that a high bounding box overlap corresponds to a high probability of having a duplicate. We propose FeatureNMS to solve this problem. FeatureNMS recognizes duplicates not only based on the intersection over union between bounding boxes, but also based on the difference of feature vectors. These feature vectors can encode more information like visual appearance. Our approach outperforms classical NMS and derived approaches and achieves state-of-the-art performance.
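The difference between the two duplicate tests can be sketched as follows. This is a minimal illustration, not the authors' exact decision rule: the function name `nms`, the thresholds `iou_thr` and `feat_thr`, and the use of Euclidean distance between feature vectors are all assumptions. Without features the function behaves like classical greedy NMS; with features, a highly overlapping box is suppressed only if its feature vector is also close to that of an already kept box.

```python
from math import dist
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        features: Optional[Sequence[Sequence[float]]] = None,
        iou_thr: float = 0.5, feat_thr: float = 1.0) -> List[int]:
    """Greedy NMS; returns indices of kept boxes, highest score first.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        duplicate = False
        for j in keep:
            if iou(boxes[i], boxes[j]) <= iou_thr:
                continue  # low overlap: treated as a different object
            # High overlap: classical NMS suppresses here; the feature-aware
            # variant additionally requires the feature vectors to be close.
            if features is None or dist(features[i], features[j]) < feat_thr:
                duplicate = True
                break
        if not duplicate:
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
distinct = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]  # box 1 looks different from box 0
similar = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]   # box 1 looks like box 0

kept_classical = nms(boxes, scores)
kept_distinct = nms(boxes, scores, distinct)
kept_similar = nms(boxes, scores, similar)
print(kept_classical, kept_distinct, kept_similar)
```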

[48]  arXiv:2002.07673 (cross-list from math.OC) [pdf, other]
Title: Network Theoretic Analysis of Maximum a Posteriori Detectors for Sensor Analysis and Design
Comments: 14 pages; 6 figures; submitted to Automatica
Subjects: Optimization and Control (math.OC); Signal Processing (eess.SP); Statistics Theory (math.ST)

In this paper we characterize the performance of a class of maximum-a-posteriori (MAP) detectors for network systems driven by unknown stochastic inputs, as a function of the location of the sensors and the topology of the network. We consider two scenarios: one in which the change occurs in the mean of the input, and the other in which the change is allowed to happen in the covariance (or power) of the input. In both scenarios, to detect the changes, we associate suitable MAP detectors with a given set of sensors, and study their detection performance as a function of the network topology and the graphical distance between the input nodes and the sensor locations. When the input and measurement noise follow a Gaussian distribution, we show that, as the number of measurements goes to infinity, the detectors' performance can be studied using the input-to-output gain of the transfer function of the network system. Using this characterization, we derive conditions under which the detection performance obtained when the sensors are located on a network cut is not worse (resp. not better) than the performance obtained by measuring all nodes of the subnetwork induced by the cut and not containing the input nodes. Our results provide structural insights into sensor placement from a detection-theoretic viewpoint. Finally, we illustrate our findings via numerical examples.
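As a much simplified illustration of the MAP decision rule for a change in the mean, the sketch below chooses between hypotheses on the mean of scalar, independent Gaussian measurements with known standard deviation. It ignores the network dynamics and sensor placement studied in the paper; the function names, default means and priors are assumptions made for this example only.

```python
import math
from typing import List, Sequence


def gaussian_loglik(samples: Sequence[float], mean: float, std: float) -> float:
    """Log-likelihood of i.i.d. Gaussian samples with the given mean and std."""
    const = -0.5 * math.log(2.0 * math.pi * std ** 2)
    return sum(const - (y - mean) ** 2 / (2.0 * std ** 2) for y in samples)


def map_detect(samples: Sequence[float],
               means: Sequence[float] = (0.0, 1.0),
               std: float = 1.0,
               priors: Sequence[float] = (0.5, 0.5)) -> int:
    """Return the index of the hypothesis with the largest log-posterior."""
    scores: List[float] = [
        math.log(p) + gaussian_loglik(samples, m, std)
        for m, p in zip(means, priors)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


print(map_detect([0.9, 1.2, 0.7, 1.1]))        # samples near the shifted mean
print(map_detect([0.1, -0.2, 0.3, 0.0]))       # samples near the nominal mean
print(map_detect([0.6], priors=(0.9, 0.1)))    # a strong prior can override one sample
```

With equal priors and equal variances the rule reduces to comparing the sample mean with the midpoint of the two hypothesized means.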

[49]  arXiv:2002.07677 (cross-list from cs.SD) [pdf]
Title: Performance Analysis of Adaptive Noise Cancellation for Speech Signal
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)

This paper gives broader insight into the application of adaptive filters to noise cancellation in processes where a signal is transmitted. Adaptive filtering techniques such as RLS, LMS and normalized LMS filter the input signal using the concept of negative feedback to predict the nature of the noise and remove it effectively from the input. In this paper, a comparative study of the effectiveness of RLS, LMS and normalized LMS is carried out based on parameters such as SNR (signal-to-noise ratio), MSE (mean squared error) and cross-correlation. The filters are implemented and analyzed using different step sizes and different filter orders.
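A minimal sketch of adaptive noise cancellation with LMS and normalized LMS follows. It assumes the common two-input setup (a primary input carrying signal plus noise, and a reference input correlated with the noise), which the abstract does not spell out. The test signal, filter order, step size and the helper names `adaptive_noise_cancel` and `mse` are hypothetical, and RLS is omitted for brevity.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple


def adaptive_noise_cancel(primary: Sequence[float], reference: Sequence[float],
                          order: int = 4, mu: float = 0.05,
                          normalized: bool = False, eps: float = 1e-8) -> List[float]:
    """Estimate the noise in `primary` from `reference` and subtract it.

    Returns the error signal, which serves as the cleaned output.
    """
    weights = [0.0] * order
    cleaned: List[float] = []
    for n in range(len(primary)):
        # Latest `order` reference samples, newest first, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        noise_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = primary[n] - noise_estimate
        # NLMS divides the step size by the instantaneous input energy.
        step = mu / (eps + sum(xi * xi for xi in x)) if normalized else mu
        weights = [w + step * error * xi for w, xi in zip(weights, x)]
        cleaned.append(error)
    return cleaned


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
n_samples = 2000
signal = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(n_samples)]
reference = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
# Assumed noise path: the reference passes through a short FIR filter.
noise = [0.8 * reference[n] - 0.3 * (reference[n - 1] if n > 0 else 0.0)
         for n in range(n_samples)]
primary = [s + v for s, v in zip(signal, noise)]

tail = slice(n_samples - 500, n_samples)  # evaluate after the filter has adapted
results: Dict[str, Tuple[float, float]] = {}
for name, flag in (("LMS", False), ("NLMS", True)):
    cleaned = adaptive_noise_cancel(primary, reference, normalized=flag)
    results[name] = (mse(primary[tail], signal[tail]), mse(cleaned[tail], signal[tail]))
    print(name, "MSE before:", results[name][0], "MSE after:", results[name][1])
```

Varying `mu` and `order` in this sketch gives a rough feel for the trade-off between convergence speed and residual error that the comparison in the paper is concerned with.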

[50]  arXiv:2002.07681 (cross-list from cs.LG) [pdf, ps, other]
Title: Deep Neural Networks for the Correction of Mie Scattering in Fourier-Transformed Infrared Spectra of Biological Samples
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Tissues and Organs (q-bio.TO); Machine Learning (stat.ML)

Infrared spectra obtained from cell or tissue specimens have commonly been observed to involve a significant degree of (resonant) Mie scattering, which often overshadows biochemically relevant spectral information by a non-linear, non-additive spectral component in Fourier transformed infrared (FTIR) spectroscopic measurements. Correspondingly, many successful machine learning approaches for FTIR spectra have relied on preprocessing procedures that computationally remove the scattering components from an infrared spectrum. We propose an approach to approximate this complex preprocessing function using deep neural networks. As we demonstrate, the resulting model is not just several orders of magnitude faster, which is important for real-time clinical applications, but also generalizes strongly across different tissue types. Furthermore, our proposed method overcomes the trade-off between computation time and the corrected spectrum being biased towards an artificial reference spectrum.

[51]  arXiv:2002.07754 (cross-list from cs.CV) [pdf]
Title: Computational optimization of convolutional neural networks using separated filters architecture
Comments: 4 pages, 3 figures
Journal-ref: International Journal of Applied Engineering Research (ISSN 0973-4562), Volume 11, Number 11 (2016), pp 7491-7494
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

This paper considers a convolutional neural network transformation that reduces computational complexity and thus speeds up neural network processing. The use of convolutional neural networks (CNNs) is the standard approach to image recognition, despite the fact that they can be too computationally demanding, for example for recognition on mobile platforms or in embedded systems. In this paper we propose a CNN structure transformation that expresses 2D convolution filters as a linear combination of separable filters. This allows separated convolutional filters to be obtained by standard training algorithms. We study the computational efficiency of this structure transformation and suggest a fast implementation that is easily handled by a CPU or GPU. We demonstrate that CNNs of the proposed structure designed for letter and digit recognition show a 15% speedup without accuracy loss in an industrial image recognition system. In conclusion, we discuss the question of a possible accuracy decrease and the application of the proposed transformation to different recognition problems. Keywords: convolutional neural networks, computational optimization, separable filters, complexity reduction.
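The idea of expressing a 2D filter as a linear combination of separable filters can be checked numerically with a short sketch. It is not the authors' network transformation or training procedure; it only verifies that filtering with a sum of outer products of column and row vectors equals the sum of a row pass followed by a column pass. The function names, kernel terms and image values are made up for illustration, and the scalar weights of the linear combination are assumed to be absorbed into the column vectors.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]
Term = Tuple[Sequence[int], Sequence[int]]  # (column vector, row vector)


def correlate2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Direct 2D filtering (cross-correlation), 'valid' region only."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]


def correlate_rows(image: Matrix, row: Sequence[int]) -> Matrix:
    """1D horizontal pass with a row vector."""
    k = len(row)
    return [[sum(line[j + b] * row[b] for b in range(k))
             for j in range(len(line) - k + 1)] for line in image]


def correlate_cols(image: Matrix, col: Sequence[int]) -> Matrix:
    """1D vertical pass with a column vector."""
    k = len(col)
    return [[sum(image[i + a][j] * col[a] for a in range(k))
             for j in range(len(image[0]))]
            for i in range(len(image) - k + 1)]


def correlate_separable_sum(image: Matrix, terms: Sequence[Term]) -> Matrix:
    """Sum of separable filters: a row pass then a column pass per term."""
    total: Matrix = []
    for col, row in terms:
        part = correlate_cols(correlate_rows(image, row), col)
        total = part if not total else [[x + y for x, y in zip(r1, r2)]
                                        for r1, r2 in zip(total, part)]
    return total


def outer_sum(terms: Sequence[Term]) -> Matrix:
    """Equivalent full 2D kernel: sum of outer products col * row."""
    kh, kw = len(terms[0][0]), len(terms[0][1])
    return [[sum(col[a] * row[b] for col, row in terms) for b in range(kw)]
            for a in range(kh)]


image = [[(7 * i + 3 * j) % 11 for j in range(5)] for i in range(5)]
terms = [([1, 2, 1], [-1, 0, 1]), ([1, 1, 1], [1, 1, 1])]

direct = correlate2d(image, outer_sum(terms))
separable = correlate_separable_sum(image, terms)
print(direct == separable)
```

For a k x k kernel the direct form costs k*k multiplications per output pixel, while R separable terms cost about 2*k*R, so the saving depends on how many terms are needed.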

Replacements for Wed, 19 Feb 20

[52]  arXiv:1711.00493 (replaced) [pdf, other]
Title: Event-Triggered Diffusion Kalman Filters
Subjects: Systems and Control (eess.SY); Robotics (cs.RO); Signal Processing (eess.SP)
[53]  arXiv:1904.10915 (replaced) [pdf, ps, other]
Title: When Smoothness is Not Enough: Toward Exact Quantification and Optimization of the Price-of-Anarchy
Comments: 9 pages, double column, 1 figure, 1 table, to appear in the proceedings of the 2019 IEEE Conference on Decision and Control
Subjects: Systems and Control (eess.SY); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
[54]  arXiv:1905.03109 (replaced) [pdf, other]
Title: Human Gait Database for Normal Walk Collected by Smart Phone Accelerometer
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[55]  arXiv:1907.12802 (replaced) [pdf]
Title: Algorithms for Locating and Characterizing Cable Faults via Stepped-Frequency Waveform Reflectometry
Comments: 10 pages, 13 figures. Accepted for publication in IEEE Transactions on Instrumentation and Measurement
Subjects: Signal Processing (eess.SP)
[56]  arXiv:1908.04863 (replaced) [pdf, other]
Title: Intelligent Reflecting Surface Aided MIMO Broadcasting for Simultaneous Wireless Information and Power Transfer
Comments: Accepted in IEEE JSAC
Subjects: Signal Processing (eess.SP)
[57]  arXiv:1910.05266 (replaced) [pdf, other]
Title: Backpropagation Algorithms and Reservoir Computing in Recurrent Neural Networks for the Forecasting of Complex Spatiotemporal Dynamics
Comments: 41 pages, submitted to Elsevier Journal of Neural Networks (accepted)
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Fluid Dynamics (physics.flu-dyn)
[58]  arXiv:1910.06750 (replaced) [pdf, other]
Title: Full-Scale Continuous Synthetic Sonar Data Generation with Markov Conditional Generative Adversarial Networks
Comments: 6 pages, 6 figures. Accepted to ICRA2020
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[59]  arXiv:1910.09487 (replaced) [pdf, other]
Title: Robust Dynamic State Estimation of Synchronous Machines with Asymptotic State Estimation Error Performance Guarantees
Comments: IEEE Transactions on Power Systems, In Press. V2: Fixed some typos in the appendix
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
[60]  arXiv:1910.11030 (replaced) [pdf, other]
Title: Spatiotemporal Tile-based Attention-guided LSTMs for Traffic Video Prediction
Authors: Tu Nguyen
Comments: NeurIPS 2019 Traffic4Cast Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[61]  arXiv:1910.13724 (replaced) [pdf, other]
Title: Metric Learning with Background Noise Class for Few-shot Detection of Rare Sound Events
Comments: 5 pages, 5 figures, accepted for publication in IEEE ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
[62]  arXiv:1911.03929 (replaced) [pdf, other]
Title: Positioning of Multiple Unmanned Aerial Vehicle Base Stations in future Wireless Network
Subjects: Signal Processing (eess.SP); Optimization and Control (math.OC)
[63]  arXiv:1911.07360 (replaced) [pdf, other]
Title: A Simple and Efficient Tube-based Robust Output Feedback Model Predictive Control Scheme
Subjects: Systems and Control (eess.SY)
[64]  arXiv:1911.08641 (replaced) [pdf, other]
Title: Performance Monitoring for Live Systems with Soft FEC and Multilevel Modulation
Comments: 9 pages, 9 figures
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
[65]  arXiv:1911.10303 (replaced) [pdf, other]
Title: New Transceiver Designs for Interleaved Frequency Division Multiple Access
Comments: 32 pages, 12 figures
Subjects: Signal Processing (eess.SP)
[66]  arXiv:1912.01852 (replaced) [pdf, other]
Title: PitchNet: Unsupervised Singing Voice Conversion with Pitch Adversarial Network
Comments: Accepted by ICASSP 2020
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[67]  arXiv:1912.03363 (replaced) [pdf, other]
Title: Audio-attention discriminative language model for ASR rescoring
Comments: 4 pages, 1 figure, Accepted at ICASSP 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)
[68]  arXiv:1912.10920 (replaced) [pdf, other]
Title: RPGAN: GANs Interpretability via Random Routing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[69]  arXiv:2001.08094 (replaced) [pdf, other]
Title: Energy-efficient Runtime Resource Management for Adaptable Multi-application Mapping
Comments: Final version to appear in DATE 2020. 6 pages, 4 figures. Corrected Figure 1
Subjects: Systems and Control (eess.SY)