Maintained by Difan Deng and Marius Lindauer.
The following list collects papers related to neural architecture search. It is by no means complete. If a paper is missing from the list, please let us know.
Please note that although NAS methods steadily improve, the quality of empirical evaluations in this field is still lagging behind that of other areas in machine learning, AI, and optimization. We would therefore like to share some best practices for empirical evaluations of NAS methods, which we believe will facilitate sustained and measurable progress in the field. If you are interested in a teaser, please read our blog post or jump directly to our checklist.
Transformers have gained increasing popularity in different domains. For a comprehensive list of papers focusing on Neural Architecture Search for Transformer-based search spaces, the awesome-transformer-search repo is all you need.
2023
Ding, Li; Spector, Lee
Multi-Objective Evolutionary Architecture Search for Parameterized Quantum Circuits Journal Article
In: Entropy, vol. 25, no. 1, 2023, ISSN: 1099-4300.
@article{e25010093,
title = {Multi-Objective Evolutionary Architecture Search for Parameterized Quantum Circuits},
author = {Li Ding and Lee Spector},
url = {https://www.mdpi.com/1099-4300/25/1/93},
doi = {10.3390/e25010093},
issn = {1099-4300},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Entropy},
volume = {25},
number = {1},
abstract = {Recent work on hybrid quantum-classical machine learning systems has demonstrated success in utilizing parameterized quantum circuits (PQCs) to solve the challenging reinforcement learning (RL) tasks, with provable learning advantages over classical systems, e.g., deep neural networks. While existing work demonstrates and exploits the strength of PQC-based models, the design choices of PQC architectures and the interactions between different quantum circuits on learning tasks are generally underexplored. In this work, we introduce a Multi-objective Evolutionary Architecture Search framework for parameterized quantum circuits (MEAS-PQC), which uses a multi-objective genetic algorithm with quantum-specific configurations to perform efficient searching of optimal PQC architectures. Experimental results show that our method can find architectures that have superior learning performance on three benchmark RL tasks, and are also optimized for additional objectives including reductions in quantum noise and model size. Further analysis of patterns and probability distributions of quantum operations helps identify performance-critical design choices of hybrid quantum-classical learning systems.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Saeed, Fahman; Hussain, Muhammad; Aboalsamh, Hatim A.; Adel, Fadwa Al; Owaifeer, Adi Mohammed Al
Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis Journal Article
In: Mathematics, vol. 11, no. 2, 2023, ISSN: 2227-7390.
@article{math11020307,
title = {Designing the Architecture of a Convolutional Neural Network Automatically for Diabetic Retinopathy Diagnosis},
author = {Fahman Saeed and Muhammad Hussain and Hatim A. Aboalsamh and Fadwa Al Adel and Adi Mohammed Al Owaifeer},
url = {https://www.mdpi.com/2227-7390/11/2/307},
doi = {10.3390/math11020307},
issn = {2227-7390},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Mathematics},
volume = {11},
number = {2},
abstract = {Diabetic retinopathy (DR) is a leading cause of blindness in middle-aged diabetic patients. Regular screening for DR using fundus imaging aids in detecting complications and delays the progression of the disease. Because manual screening takes time and is subjective, deep learning has been used to help graders. Pre-trained or brute force CNN models are used in existing DR grading CNN-based approaches that are not suited to fundus image complexity. To solve this problem, we present a method for automatically customizing CNN models based on fundus image lesions. It uses k-medoid clustering, principal component analysis (PCA), and inter-class and intra-class variations to determine the CNN model’s depth and width. The designed models are lightweight, adapted to the internal structures of fundus images, and encode the discriminative patterns of DR lesions. The technique is validated on a local dataset from King Saud University Medical City, Saudi Arabia, and two challenging Kaggle datasets: EyePACS and APTOS2019. The auto-designed models outperform well-known pre-trained CNN models such as ResNet152, DenseNet121, and ResNeSt50, as well as Google’s AutoML and Auto-Keras models based on neural architecture search (NAS). The proposed method outperforms current CNN-based DR screening methods. The proposed method can be used in various clinical settings to screen for DR and refer patients to ophthalmologists for further evaluation and treatment.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Magalhães, Dimmy; Lima, Ricardo H. R.; Pozo, Aurora
Creating deep neural networks for text classification tasks using grammar genetic programming Journal Article
In: Applied Soft Computing, pp. 110009, 2023, ISSN: 1568-4946.
@article{MAGALHAES2023110009,
title = {Creating deep neural networks for text classification tasks using grammar genetic programming},
author = {Dimmy Magalhães and Ricardo H. R. Lima and Aurora Pozo},
url = {https://www.sciencedirect.com/science/article/pii/S1568494623000273},
doi = {10.1016/j.asoc.2023.110009},
issn = {1568-4946},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Applied Soft Computing},
pages = {110009},
abstract = {Text classification is one of the Natural Language Processing (NLP) tasks. Its objective is to label textual elements, such as phrases, queries, paragraphs, and documents. In NLP, several approaches have achieved promising results regarding this task. Deep Learning-based approaches have been widely used in this context, with deep neural networks (DNNs) adding the ability to generate a representation for the data and a learning model. The increasing scale and complexity of DNN architectures was expected, creating new challenges to design and configure the models. In this paper, we present a study on the application of a grammar-based evolutionary approach to the design of DNNs, using models based on Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM), and Graph Neural Networks (GNNs). We propose different grammars, which were defined to capture the features of each type of network, also proposing some combinations, verifying their impact on the produced designs and performance of the generated models. We create a grammar that is able to generate different networks specialized on text classification, by modification of Grammatical Evolution (GE), and it is composed of three main components: the grammar, mapping, and search engine. Our results offer promising future research directions as they show that the projected architectures have a performance comparable to that of their counterparts but can still be further improved. We were able to improve the results of a manually structured neural network by 8.18% in the best case.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Li, Xingzhuo; Hou, Sujuan; Zhang, Baisong; Wang, Jing; Jia, Weikuan; Zheng, Yuanjie
Long-Range Dependence Involutional Network for Logo Detection Journal Article
In: Entropy, vol. 25, no. 1, 2023, ISSN: 1099-4300.
@article{e25010174,
title = {Long-Range Dependence Involutional Network for Logo Detection},
author = {Xingzhuo Li and Sujuan Hou and Baisong Zhang and Jing Wang and Weikuan Jia and Yuanjie Zheng},
url = {https://www.mdpi.com/1099-4300/25/1/174},
doi = {10.3390/e25010174},
issn = {1099-4300},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Entropy},
volume = {25},
number = {1},
abstract = {Logo detection is one of the crucial branches in computer vision due to various real-world applications, such as automatic logo detection and recognition, intelligent transportation, and trademark infringement detection. Compared with traditional handcrafted-feature-based methods, deep learning-based convolutional neural networks (CNNs) can learn both low-level and high-level image features. Recent decades have witnessed the great feature representation capabilities of deep CNNs and their variants, which have been very good at discovering intricate structures in high-dimensional data and are thereby applicable to many domains including logo detection. However, logo detection remains challenging, as existing detection methods cannot solve well the problems of a multiscale and large aspect ratios. In this paper, we tackle these challenges by developing a novel long-range dependence involutional network (LDI-Net). Specifically, we designed a strategy that combines a new operator and a self-attention mechanism via rethinking the intrinsic principle of convolution called long-range dependence involution (LD involution) to alleviate the detection difficulties caused by large aspect ratios. We also introduce a multilevel representation neural architecture search (MRNAS) to detect multiscale logo objects by constructing a novel multipath topology. In addition, we implemented an adaptive RoI pooling module (ARM) to improve detection efficiency by addressing the problem of logo deformation. Comprehensive experiments on four benchmark logo datasets demonstrate the effectiveness and efficiency of the proposed approach.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Xie, Xiangning; Song, Xiaotian; Lv, Zeqiong; Yen, Gary G.; Ding, Weiping; Sun, Yanan
Efficient Evaluation Methods for Neural Architecture Search: A Survey Technical Report
2023.
@techreport{https://doi.org/10.48550/arxiv.2301.05919,
title = {Efficient Evaluation Methods for Neural Architecture Search: A Survey},
author = {Xiangning Xie and Xiaotian Song and Zeqiong Lv and Gary G. Yen and Weiping Ding and Yanan Sun},
url = {https://arxiv.org/abs/2301.05919},
doi = {10.48550/ARXIV.2301.05919},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {arXiv},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Lee, Seunghyun; Song, Byung Cheol
Fast Filter Pruning via Coarse-to-Fine Neural Architecture Search and Contrastive Knowledge Transfer Journal Article
In: IEEE Transactions on Neural Networks and Learning Systems, pp. 1-12, 2023.
@article{10018843,
title = {Fast Filter Pruning via Coarse-to-Fine Neural Architecture Search and Contrastive Knowledge Transfer},
author = {Seunghyun Lee and Byung Cheol Song},
url = {https://ieeexplore.ieee.org/abstract/document/10018843},
doi = {10.1109/TNNLS.2023.3236336},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
pages = {1-12},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Chauhan, Anshumaan; Bhattacharyya, Siddhartha; Vadivel, S.
DQNAS: Neural Architecture Search using Reinforcement Learning Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-06687,
title = {DQNAS: Neural Architecture Search using Reinforcement Learning},
author = {Anshumaan Chauhan and Siddhartha Bhattacharyya and S. Vadivel},
url = {https://doi.org/10.48550/arXiv.2301.06687},
doi = {10.48550/arXiv.2301.06687},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.06687},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Ye, Peng; He, Tong; Li, Baopu; Chen, Tao; Bai, Lei; Ouyang, Wanli
β-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-06393,
title = {β-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search},
author = {Peng Ye and Tong He and Baopu Li and Tao Chen and Lei Bai and Wanli Ouyang},
url = {https://doi.org/10.48550/arXiv.2301.06393},
doi = {10.48550/arXiv.2301.06393},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.06393},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Ford, Noah; Winder, John; McClellan, Josh
Adaptive Neural Networks Using Residual Fitting Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-05744,
title = {Adaptive Neural Networks Using Residual Fitting},
author = {Noah Ford and John Winder and Josh McClellan},
url = {https://doi.org/10.48550/arXiv.2301.05744},
doi = {10.48550/arXiv.2301.05744},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.05744},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Lakhmiri, Dounia; Zolnouri, Mahdi; Nia, Vahid Partovi; Tribes, Christophe; Digabel, Sébastien Le
Scaling Deep Networks with the Mesh Adaptive Direct Search algorithm Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-06641,
title = {Scaling Deep Networks with the Mesh Adaptive Direct Search algorithm},
author = {Dounia Lakhmiri and Mahdi Zolnouri and Vahid Partovi Nia and Christophe Tribes and Sébastien Le Digabel},
url = {https://doi.org/10.48550/arXiv.2301.06641},
doi = {10.48550/arXiv.2301.06641},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.06641},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
White, Colin; Safari, Mahmoud; Sukthanker, Rhea; Ru, Binxin; Elsken, Thomas; Zela, Arber; Dey, Debadeepta; Hutter, Frank
Neural Architecture Search: Insights from 1000 Papers Technical Report
2023.
@techreport{https://doi.org/10.48550/arxiv.2301.08727,
title = {Neural Architecture Search: Insights from 1000 Papers},
author = {Colin White and Mahmoud Safari and Rhea Sukthanker and Binxin Ru and Thomas Elsken and Arber Zela and Debadeepta Dey and Frank Hutter},
url = {https://arxiv.org/abs/2301.08727},
doi = {10.48550/ARXIV.2301.08727},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {arXiv},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Kong, Gangwei; Li, Chang; Peng, Hu; Han, Zhihui; Qiao, Heyuan
EEG-Based Sleep Stage Classification via Neural Architecture Search Journal Article
In: IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2023.
@article{articleb,
title = {EEG-Based Sleep Stage Classification via Neural Architecture Search},
author = {Gangwei Kong and Chang Li and Hu Peng and Zhihui Han and Heyuan Qiao},
url = {https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10024773&tag=1},
doi = {10.1109/TNSRE.2023.3238764},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {IEEE Transactions on Neural Systems and Rehabilitation Engineering},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Shi, Chaokun; Hao, Yuexing; Li, Gongyan; Xu, Shaoyun
EBNAS: Efficient binary network design for image classification via neural architecture search Journal Article
In: Engineering Applications of Artificial Intelligence, vol. 120, pp. 105845, 2023, ISSN: 0952-1976.
@article{SHI2023105845,
title = {EBNAS: Efficient binary network design for image classification via neural architecture search},
author = {Chaokun Shi and Yuexing Hao and Gongyan Li and Shaoyun Xu},
url = {https://www.sciencedirect.com/science/article/pii/S0952197623000295},
doi = {10.1016/j.engappai.2023.105845},
issn = {0952-1976},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Engineering Applications of Artificial Intelligence},
volume = {120},
pages = {105845},
abstract = {To deploy Convolutional Neural Networks (CNNs) on resource-limited devices, binary CNNs with 1-bit activations and weights prove to be a promising approach. Meanwhile, Neural Architecture Search (NAS), which can design lightweight networks beyond artificial ones, has achieved optimal performance in various tasks. To design high-performance binary networks, we propose an efficient binary neural architecture search algorithm, namely EBNAS. In this paper, we propose corresponding improvement strategies to deal with the information loss due to binarization, the discrete error between search and evaluation, and the imbalanced operation advantage in the search space. Specifically, we adopt a new search space consisting of operations suitable for the binary domain. An L2 path regularization and a variance-based edge regularization are introduced to guide the search process and drive architecture parameters toward discretization. In addition, we present a search space simplification strategy and adjust the channel sampling proportions to balance the advantages of different operations. We perform extensive experiments on CIFAR10, CIFAR100, and ImageNet datasets. The results demonstrate the effectiveness of our proposed methods. For example, with binary weights and activations, EBNAS achieves a Top-1 accuracy of 95.61% on CIFAR10, 78.10% on CIFAR100, and 67.8% on ImageNet. With a similar number of model parameters, our algorithm outperforms other binary NAS methods in terms of accuracy and efficiency. Compared with manually designed binary networks, our algorithm remains competitive. The code is available at https://github.com/sscckk/EBNAS.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Nath, Utkarsh; Wang, Yancheng; Yang, Yingzhen
RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation Technical Report
2023.
@techreport{https://doi.org/10.48550/arxiv.2301.08092,
title = {RNAS-CL: Robust Neural Architecture Search by Cross-Layer Knowledge Distillation},
author = {Utkarsh Nath and Yancheng Wang and Yingzhen Yang},
url = {https://arxiv.org/abs/2301.08092},
doi = {10.48550/ARXIV.2301.08092},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {arXiv},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Jin, Haifeng; Chollet, François; Song, Qingquan; Hu, Xia
AutoKeras: An AutoML Library for Deep Learning Journal Article
In: Journal of Machine Learning Research, vol. 24, no. 6, pp. 1–6, 2023.
@article{JMLR:v24:20-1355,
title = {AutoKeras: An AutoML Library for Deep Learning},
author = {Haifeng Jin and François Chollet and Qingquan Song and Xia Hu},
url = {http://jmlr.org/papers/v24/20-1355.html},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Journal of Machine Learning Research},
volume = {24},
number = {6},
pages = {1--6},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Carrasquilla, Juan; Hibat-Allah, Mohamed; Inack, Estelle; Makhzani, Alireza; Neklyudov, Kirill; Taylor, Graham W.; Torlai, Giacomo
Quantum HyperNetworks: Training Binary Neural Networks in Quantum Superposition Technical Report
2023.
@techreport{https://doi.org/10.48550/arxiv.2301.08292,
title = {Quantum HyperNetworks: Training Binary Neural Networks in Quantum Superposition},
author = {Juan Carrasquilla and Mohamed Hibat-Allah and Estelle Inack and Alireza Makhzani and Kirill Neklyudov and Graham W. Taylor and Giacomo Torlai},
url = {https://arxiv.org/abs/2301.08292},
doi = {10.48550/ARXIV.2301.08292},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {arXiv},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Eminaga, Okyaz; Abbas, Mahmoud; Shen, Jeanne; Laurie, Mark; Brooks, James D.; Liao, Joseph C.; Rubin, Daniel L.
PlexusNet: A neural network architectural concept for medical image classification Journal Article
In: Computers in Biology and Medicine, pp. 106594, 2023, ISSN: 0010-4825.
@article{EMINAGA2023106594,
title = {PlexusNet: A neural network architectural concept for medical image classification},
author = {Okyaz Eminaga and Mahmoud Abbas and Jeanne Shen and Mark Laurie and James D. Brooks and Joseph C. Liao and Daniel L. Rubin},
url = {https://www.sciencedirect.com/science/article/pii/S0010482523000598},
doi = {10.1016/j.compbiomed.2023.106594},
issn = {0010-4825},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Computers in Biology and Medicine},
pages = {106594},
abstract = {State-of-the-art (SOTA) convolutional neural network models have been widely adapted in medical imaging and applied to address different clinical problems. However, the complexity and scale of such models may not be justified in medical imaging and subject to the available resource budget. Further increasing the number of representative feature maps for the classification task decreases the model explainability. The current data normalization practice is fixed prior to model development and discounting the specification of the data domain. Acknowledging these issues, the current work proposed a new scalable model family called PlexusNet; the block architecture and model scaling by the network's depth, width, and branch regulate PlexusNet's architecture. The efficient computation costs outlined the dimensions of PlexusNet scaling and design. PlexusNet includes a new learnable data normalization algorithm for better data generalization. We applied a simple yet effective neural architecture search to design PlexusNet tailored to five clinical classification problems that achieve a performance noninferior to the SOTA models ResNet-18 and EfficientNet B0/1. It also does so with lower parameter capacity and representative feature maps in ten-fold ranges than the smallest SOTA models with comparable performance. The visualization of representative features revealed distinguishable clusters associated with categories based on latent features generated by PlexusNet. The package and source code are at https://github.com/oeminaga/PlexusNet.git.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Dong, Peijie; Niu, Xin; Li, Lujun; Tian, Zhiliang; Wang, Xiaodong; Wei, Zimian; Pan, Hengyue; Li, Dongsheng
RD-NAS: Enhancing One-shot Supernet Ranking Ability via Ranking Distillation from Zero-cost Proxies Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-09850,
title = {RD-NAS: Enhancing One-shot Supernet Ranking Ability via Ranking Distillation from Zero-cost Proxies},
author = {Peijie Dong and Xin Niu and Lujun Li and Zhiliang Tian and Xiaodong Wang and Zimian Wei and Hengyue Pan and Dongsheng Li},
url = {https://doi.org/10.48550/arXiv.2301.09850},
doi = {10.48550/arXiv.2301.09850},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.09850},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Zolnouri, Mahdi; Lakhmiri, Dounia; Tribes, Christophe; Sari, Eyyüb; Digabel, Sébastien Le
Efficient Training Under Limited Resources Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-09264,
title = {Efficient Training Under Limited Resources},
author = {Mahdi Zolnouri and Dounia Lakhmiri and Christophe Tribes and Eyyüb Sari and Sébastien Le Digabel},
url = {https://doi.org/10.48550/arXiv.2301.09264},
doi = {10.48550/arXiv.2301.09264},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.09264},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Thomas, Jibin B.; K.V., Shihabudheen
Neural architecture search algorithm to optimize deep Transformer model for fault detection in electrical power distribution systems Journal Article
In: Engineering Applications of Artificial Intelligence, vol. 120, pp. 105890, 2023, ISSN: 0952-1976.
@article{THOMAS2023105890,
title = {Neural architecture search algorithm to optimize deep Transformer model for fault detection in electrical power distribution systems},
author = {Jibin B. Thomas and Shihabudheen K.V.},
url = {https://www.sciencedirect.com/science/article/pii/S095219762300074X},
doi = {10.1016/j.engappai.2023.105890},
issn = {0952-1976},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Engineering Applications of Artificial Intelligence},
volume = {120},
pages = {105890},
abstract = {This paper proposes a neural architecture search algorithm for obtaining an optimum Transformer model to detect and localize different power system faults and uncertain conditions, such as symmetrical shunt faults, unsymmetrical shunt faults, high-impedance faults, switching conditions (capacitor switching, load switching, transformer switching, DG switching and feeder switching), insulator leakage and transformer inrush current in a distribution system. The Transformer model was proposed to tackle the high memory consumption of the deep CNN attention models and the long-term dependency problem of the RNN attention models. There exist different types of attention mechanisms and feedforward networks for designing a Transformer architecture. Hand engineering of these layers can be inefficient and time-consuming. Therefore, this paper makes use of the Differential Architecture Search (DARTS) algorithm to automatically generate optimal Transformer architectures with less search time cost. The algorithm achieves this by making the search process differentiable to architecture hyperparameters thus making the network search process an end-to-end problem. The proposed model attempts to automatically detect faults in a bus using current measurements from distant monitoring points. The proposed fault analysis was conducted on the standard IEEE 14 bus distribution system and the VSB power line fault detection database. The proposed model was found to produce better performance on the test database when evaluated using F1-Score (99.4% for fault type classification and 97.7% for fault location classification), Matthews Correlation Coefficient (MCC) (99.3% for fault type classification and 97.6% for fault location classification), accuracy and Area Under the Curve (AUC). The architecture transferability of the proposed method was also studied using real-world power line data for fault detection.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
An, Yang; Zhang, Changsheng; Zheng, Xuanyu
Knowledge reconstruction assisted evolutionary algorithm for neural network architecture search Journal Article
In: Knowledge-Based Systems, vol. 264, pp. 110341, 2023, ISSN: 0950-7051.
@article{AN2023110341,
title = {Knowledge reconstruction assisted evolutionary algorithm for neural network architecture search},
author = {Yang An and Changsheng Zhang and Xuanyu Zheng},
url = {https://www.sciencedirect.com/science/article/pii/S0950705123000916},
doi = {10.1016/j.knosys.2023.110341},
issn = {0950-7051},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Knowledge-Based Systems},
volume = {264},
pages = {110341},
abstract = {Neural architecture search (NAS) aims to provide a manual-free search method for obtaining robust and high-performance neural network structures. However, limited search space, weak empirical reusability, and low search efficiency limit the performance of NAS. This study proposes an evolutionary knowledge-reconstruction-assisted method for neural network architecture searches. First, a search space construction method based on network blocks with a-priori knowledge of the network morphism is proposed. This can reduce the computational burden and the time required for the search process while increasing the diversity of the search space. Next, a hierarchical variable-length coding strategy is designed for application to the complete evolutionary algorithm; this strategy divides the neural network into two layers for coding, satisfies the need for decoding with neural network weights, and achieves coding of neural network structures with different depths. Furthermore, the complete differential evolution algorithm is used as the search strategy, thus providing a new possibility of using the search space based on network morphism for applications related to evolutionary algorithms. In addition, the results of comparison experiments conducted on CIFAR10 and CIFAR100 indicate that the neural networks obtained using this method achieve similar or better classification accuracy compared with other neural network structure search algorithms and manually designed networks, while effectively reducing computational time and resource requirements.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Li, Guihong; Yang, Yuedong; Bhardwaj, Kartikeya; Marculescu, Radu
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients Inproceedings
In: ICLR 2023, 2023.
@inproceedings{DBLP:journals/corr/abs-2301-11300,
title = {ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients},
author = {Guihong Li and Yuedong Yang and Kartikeya Bhardwaj and Radu Marculescu},
url = {https://doi.org/10.48550/arXiv.2301.11300},
doi = {10.48550/arXiv.2301.11300},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {ICLR 2023},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Liu, Yukun; Li, Ta; Zhang, Pengyuan; Yan, Yonghong
LWMD: A Comprehensive Compression Platform for End-to-End Automatic Speech Recognition Models Journal Article
In: Applied Sciences, vol. 13, no. 3, 2023, ISSN: 2076-3417.
@article{app13031587,
title = {LWMD: A Comprehensive Compression Platform for End-to-End Automatic Speech Recognition Models},
author = {Yukun Liu and Ta Li and Pengyuan Zhang and Yonghong Yan},
url = {https://www.mdpi.com/2076-3417/13/3/1587},
doi = {10.3390/app13031587},
issn = {2076-3417},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Applied Sciences},
volume = {13},
number = {3},
abstract = {Recently end-to-end (E2E) automatic speech recognition (ASR) models have achieved promising performance. However, existing models tend to adopt increasing model sizes and suffer from expensive resource consumption for real-world applications. To compress E2E ASR models and obtain smaller model sizes, we propose a comprehensive compression platform named LWMD (light-weight model designing), which consists of two essential parts: a light-weight architecture search (LWAS) framework and a differentiable structured pruning (DSP) algorithm. On the one hand, the LWAS framework adopts the neural architecture search (NAS) technique to automatically search light-weight architectures for E2E ASR models. By integrating different architecture topologies of existing models together, LWAS designs a topology-fused search space. Furthermore, combined with the E2E ASR training criterion, LWAS develops a resource-aware search algorithm to select light-weight architectures from the search space. On the other hand, given the searched architectures, the DSP algorithm performs structured pruning to reduce parameter numbers further. With a Gumbel re-parameter trick, DSP builds a stronger correlation between the pruning criterion and the model performance than conventional pruning methods. And an attention-similarity loss function is further developed for better performance. On two mandarin datasets, Aishell-1 and HKUST, the compression results are well evaluated and analyzed to demonstrate the effectiveness of the LWMD platform.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Son, David; Putter, Floran; Vogel, Sebastian; Corporaal, Henk
BOMP-NAS: Bayesian Optimization Mixed Precision NAS Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2301-11810,
title = {BOMP-NAS: Bayesian Optimization Mixed Precision NAS},
author = {David Son and Floran Putter and Sebastian Vogel and Henk Corporaal},
url = {https://doi.org/10.48550/arXiv.2301.11810},
doi = {10.48550/arXiv.2301.11810},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2301.11810},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wang, Yawen; Zhang, Shihua
Prediction of Tumor Lymph Node Metastasis Using Wasserstein Distance-Based Generative Adversarial Networks Combing with Neural Architecture Search for Predicting Journal Article
In: Mathematics, vol. 11, no. 3, 2023, ISSN: 2227-7390.
@article{math11030729,
title = {Prediction of Tumor Lymph Node Metastasis Using Wasserstein Distance-Based Generative Adversarial Networks Combing with Neural Architecture Search for Predicting},
author = {Yawen Wang and Shihua Zhang},
url = {https://www.mdpi.com/2227-7390/11/3/729},
doi = {10.3390/math11030729},
issn = {2227-7390},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Mathematics},
volume = {11},
number = {3},
abstract = {Long non-coding RNAs (lncRNAs) play an important role in development and gene expression and can be used as genetic indicators for cancer prediction. Generally, lncRNA expression profiles tend to have small sample sizes with large feature sizes; therefore, insufficient data, especially the imbalance of positive and negative samples, often lead to inaccurate prediction results. In this study, we developed a predictor WGAN-psoNN, constructed with the Wasserstein distance-based generative adversarial network (WGAN) and particle swarm optimization neural network (psoNN) algorithms to predict lymph node metastasis events in tumors by using lncRNA expression profiles. To overcome the complicated manual parameter adjustment process, this is the first time the neural network architecture search (NAS) method has been used to automatically set network parameters and predict lymph node metastasis events via deep learning. In addition, the algorithm makes full use of the advantages of WGAN to generate samples to solve the problem of imbalance between positive and negative samples in the data set. On the other hand, by constructing multiple GAN networks, Wasserstein distance was used to select the optimal sample generation. Comparative experiments were conducted on eight representative cancer-related lncRNA expression profile datasets; the prediction results demonstrate the effectiveness and robustness of the newly proposed method. Thus, the model dramatically reduces the requirement for deep learning for data quantity and the difficulty of architecture selection and has the potential to be applied to other classification problems.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Vidnerová, Petra; Kalina, Jan
Multi-objective Bayesian Optimization for Neural Architecture Search Inproceedings
In: Rutkowski, Leszek; Scherer, Rafał; Korytkowski, Marcin; Pedrycz, Witold; Tadeusiewicz, Ryszard; Zurada, Jacek M. (Ed.): Artificial Intelligence and Soft Computing, pp. 144–153, Springer International Publishing, Cham, 2023, ISBN: 978-3-031-23492-7.
@inproceedings{10.1007/978-3-031-23492-7_13,
title = {Multi-objective Bayesian Optimization for Neural Architecture Search},
author = {Petra Vidnerová and Jan Kalina},
editor = {Leszek Rutkowski and Rafał Scherer and Marcin Korytkowski and Witold Pedrycz and Ryszard Tadeusiewicz and Jacek M. Zurada},
url = {https://link.springer.com/chapter/10.1007/978-3-031-23492-7_13},
isbn = {978-3-031-23492-7},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {Artificial Intelligence and Soft Computing},
pages = {144--153},
publisher = {Springer International Publishing},
address = {Cham},
abstract = {A novel multi-objective algorithm denoted as MO-BayONet is proposed for the Neural Architecture Search (NAS) in this paper. The method based on Bayesian optimization encodes the candidate architectures directly as lists of layers and constructs an extra feature vector for the corresponding surrogate model. The general method allows to accompany the search for the optimal network by additional criteria besides the network performance. The NAS method is applied to combine classification accuracy with network size on two benchmark datasets here. The results indicate that MO-BayONet is able to outperform an available genetic algorithm based approach.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Huang, Lan; Sun, Shiqi; Zeng, Jia; Wang, Wencong; Pang, Wei; Wang, Kangping
U-DARTS: Uniform-space differentiable architecture search Journal Article
In: Information Sciences, vol. 628, pp. 339-349, 2023, ISSN: 0020-0255.
@article{HUANG2023339,
title = {U-DARTS: Uniform-space differentiable architecture search},
author = {Lan Huang and Shiqi Sun and Jia Zeng and Wencong Wang and Wei Pang and Kangping Wang},
url = {https://www.sciencedirect.com/science/article/pii/S002002552300141X},
doi = {10.1016/j.ins.2023.01.129},
issn = {0020-0255},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Information Sciences},
volume = {628},
pages = {339-349},
abstract = {Differentiable architecture search (DARTS) is an effective neural architecture search algorithm based on gradient descent. However, there are two limitations in DARTS. First, a small proxy search space is exploited due to memory and computational resource constraints. Second, too many simple operations are preferred, which leads to the network deterioration. In this paper, we propose a uniform-space differentiable architecture search, named U-DARTS, to address the above problems. In one hand, the search space is redesigned to enable the search and evaluation of the architectures in the same space, and the new search space couples with a sampling and parameter sharing strategy to reduce resource overheads. This means that various cell structures are explored directly rather than cells with same structure are stacked to compose the network. In another hand, a regularization method, which takes the depth and the complexity of the operations into account, is proposed to prevent network deterioration. Our experiments show that U-DARTS is able to find excellent architectures. Specifically, we achieve an error rate of 2.59% with 3.3M parameters on CIFAR-10. The code is released in https://github.com/Sun-Shiqi/U-DARTS.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Ji, Zhanghexuan; Guo, Dazhou; Wang, Puyang; Yan, Ke; Lu, Le; Xu, Minfeng; Zhou, Jingren; Wang, Qifeng; Ge, Jia; Gao, Mingchen; Ye, Xianghua; Jin, Dakai
Continual Segment: Towards a Single, Unified and Accessible Continual Segmentation Model of 143 Whole-body Organs in CT Scans Technical Report
2023.
@techreport{https://doi.org/10.48550/arxiv.2302.00162,
title = {Continual Segment: Towards a Single, Unified and Accessible Continual Segmentation Model of 143 Whole-body Organs in CT Scans},
author = {Zhanghexuan Ji and Dazhou Guo and Puyang Wang and Ke Yan and Le Lu and Minfeng Xu and Jingren Zhou and Qifeng Wang and Jia Ge and Mingchen Gao and Xianghua Ye and Dakai Jin},
url = {https://arxiv.org/abs/2302.00162},
doi = {10.48550/ARXIV.2302.00162},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {arXiv},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Kang, Jeon-Seong; Kang, JinKyu; Kim, Jung-Jun; Jeon, Kwang-Woo; Chung, Hyun-Joon; Park, Byung-Hoon
Neural Architecture Search Survey: A Computer Vision Perspective Journal Article
In: Sensors, vol. 23, no. 3, 2023, ISSN: 1424-8220.
@article{s23031713,
title = {Neural Architecture Search Survey: A Computer Vision Perspective},
author = {Jeon-Seong Kang and JinKyu Kang and Jung-Jun Kim and Kwang-Woo Jeon and Hyun-Joon Chung and Byung-Hoon Park},
url = {https://www.mdpi.com/1424-8220/23/3/1713},
doi = {10.3390/s23031713},
issn = {1424-8220},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Sensors},
volume = {23},
number = {3},
abstract = {In recent years, deep learning (DL) has been widely studied using various methods across the globe, especially with respect to training methods and network structures, proving highly effective in a wide range of tasks and applications, including image, speech, and text recognition. One important aspect of this advancement is involved in the effort of designing and upgrading neural architectures, which has been consistently attempted thus far. However, designing such architectures requires the combined knowledge and know-how of experts from each relevant discipline and a series of trial-and-error steps. In this light, automated neural architecture search (NAS) methods are increasingly at the center of attention; this paper aimed at summarizing the basic concepts of NAS while providing an overview of recent studies on the applications of NAS. It is worth noting that most previous survey studies on NAS have been focused on perspectives of hardware or search strategies. To the best knowledge of the present authors, this study is the first to look at NAS from a computer vision perspective. In the present study, computer vision areas were categorized by task, and recent trends found in each study on NAS were analyzed in detail.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Rampavan, Medipelly; Ijjina, Earnest Paul
Genetic brake-net: Deep learning based brake light detection for collision avoidance using genetic algorithm Journal Article
In: Knowledge-Based Systems, pp. 110338, 2023, ISSN: 0950-7051.
@article{RAMPAVAN2023110338,
title = {Genetic brake-net: Deep learning based brake light detection for collision avoidance using genetic algorithm},
author = {Medipelly Rampavan and Earnest Paul Ijjina},
url = {https://www.sciencedirect.com/science/article/pii/S0950705123000886},
doi = {10.1016/j.knosys.2023.110338},
issn = {0950-7051},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Knowledge-Based Systems},
pages = {110338},
abstract = {Automobiles are the primary means of transportation and increased traffic leads to the emphasis on techniques for safe transportation. Vehicle brake light detection is essential to avoid collisions among vehicles. Even though motorcycles are a common mode of transportation in many developing countries, little research has been done on motorcycle brake light detection. The effectiveness of Deep Neural Network (DNN) models has led to their adoption in different domains. The efficiency of the manually designed DNN architecture is dependent on the expert’s insight on optimality, which may not lead to an optimal model. Recently, Neural Architecture Search (NAS) has emerged as a method for automatically generating a task-specific backbone for object detection and classification tasks. In this work, we propose a genetic algorithm based NAS approach to construct a Mask R-CNN based object detection model. We designed the search space to include the architecture of the backbone in Mask R-CNN along with attributes used in training the object detection model. Genetic algorithm is used to explore the search space to find the optimal backbone architecture and training attributes. We achieved a mean accuracy of 97.14% and 89.44% for detecting brake light status for two-wheelers (on NITW-MBS dataset) and four-wheelers (on CaltechGraz dataset) respectively. The experimental study suggests that the architecture obtained using the proposed approach exhibits superior performance compared to existing models.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Sarti, Simone; Lomurno, Eugenio; Falanti, Andrea; Matteucci, Matteo
Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-01888,
title = {Enhancing Once-For-All: A Study on Parallel Blocks, Skip Connections and Early Exits},
author = {Simone Sarti and Eugenio Lomurno and Andrea Falanti and Matteo Matteucci},
url = {https://doi.org/10.48550/arXiv.2302.01888},
doi = {10.48550/arXiv.2302.01888},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.01888},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Kulbach, Cedric Peter Charles
Adaptive Automated Machine Learning PhD Thesis
Karlsruher Institut für Technologie (KIT), 2023.
@phdthesis{Kulbach2023_1000155322,
title = {Adaptive Automated Machine Learning},
author = {Cedric Peter Charles Kulbach},
url = {https://publikationen.bibliothek.kit.edu/1000155322},
doi = {10.5445/IR/1000155322},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
publisher = {Karlsruher Institut für Technologie (KIT)},
school = {Karlsruher Institut für Technologie (KIT)},
keywords = {},
pubstate = {published},
tppubtype = {phdthesis}
}
Wang, Chao; Jiao, Licheng; Zhao, Jiaxuan; Li, Lingling; Liu, Xu; Liu, Fang; Yang, Shuyuan
Bi-level Multi-objective Evolutionary Learning: A Case Study on Multi-task Graph Neural Topology Search Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-02565,
title = {Bi-level Multi-objective Evolutionary Learning: A Case Study on Multi-task Graph Neural Topology Search},
author = {Chao Wang and Licheng Jiao and Jiaxuan Zhao and Lingling Li and Xu Liu and Fang Liu and Shuyuan Yang},
url = {https://doi.org/10.48550/arXiv.2302.02565},
doi = {10.48550/arXiv.2302.02565},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.02565},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Gao, Yang; Zhang, Peng; Zhou, Chuan; Yang, Hong; Li, Zhao; Hu, Yue; Yu, Philip S.
HGNAS++: Efficient Architecture Search for Heterogeneous Graph Neural Networks Journal Article
In: IEEE Transactions on Knowledge and Data Engineering, pp. 1-14, 2023.
@article{10040227,
title = {HGNAS++: Efficient Architecture Search for Heterogeneous Graph Neural Networks},
author = {Yang Gao and Peng Zhang and Chuan Zhou and Hong Yang and Zhao Li and Yue Hu and Philip S. Yu},
doi = {10.1109/TKDE.2023.3239842},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {IEEE Transactions on Knowledge and Data Engineering},
pages = {1-14},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Zhou, Yuan; Hao, Jieke; Huo, Shuwei; Wang, Boyu; Ge, Leijiao; Kung, Sun-Yuan
Automatic Metric Search for Few-Shot Learning Journal Article
In: IEEE Transactions on Neural Networks and Learning Systems, pp. 1-12, 2023.
@article{10040944,
title = {Automatic Metric Search for Few-Shot Learning},
author = {Yuan Zhou and Jieke Hao and Shuwei Huo and Boyu Wang and Leijiao Ge and Sun-Yuan Kung},
doi = {10.1109/TNNLS.2023.3238729},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
pages = {1-12},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Akiva-Hochman, Ruth; Finder, Shahaf E.; Turek, Javier S.; Treister, Eran
Searching for N:M Fine-grained Sparsity of Weights and Activations in Neural Networks Inproceedings
In: Karlinsky, Leonid; Michaeli, Tomer; Nishino, Ko (Ed.): Computer Vision -- ECCV 2022 Workshops, pp. 130–143, Springer Nature Switzerland, Cham, 2023, ISBN: 978-3-031-25082-8.
@inproceedings{10.1007/978-3-031-25082-8_9,
title = {Searching for N:M Fine-grained Sparsity of Weights and Activations in Neural Networks},
author = {Ruth Akiva-Hochman and Shahaf E. Finder and Javier S. Turek and Eran Treister},
editor = {Leonid Karlinsky and Tomer Michaeli and Ko Nishino},
isbn = {978-3-031-25082-8},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {Computer Vision -- ECCV 2022 Workshops},
pages = {130--143},
publisher = {Springer Nature Switzerland},
address = {Cham},
abstract = {Sparsity in deep neural networks has been extensively studied to compress and accelerate models for environments with limited resources. The general approach of pruning aims at enforcing sparsity on the obtained model, with minimal accuracy loss, but with a sparsity structure that enables acceleration on hardware. The sparsity can be enforced on either the weights or activations of the network, and existing works tend to focus on either one for the entire network. In this paper, we suggest a strategy based on Neural Architecture Search (NAS) to sparsify both activations and weights throughout the network, while utilizing the recent approach of N:M fine-grained structured sparsity that enables practical acceleration on dedicated GPUs. We show that a combination of weight and activation pruning is superior to each option separately. Furthermore, during the training, the choice between pruning the weights of activations can be motivated by practical inference costs (e.g., memory bandwidth). We demonstrate the efficiency of the approach on several image classification datasets.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
Ünal, Hamit Taner; Başçiftçi, Fatih
Neural Logic Circuits: An evolutionary neural architecture that can learn and generalize Journal Article
In: Knowledge-Based Systems, vol. 265, pp. 110379, 2023, ISSN: 0950-7051.
@article{UNAL2023110379,
title = {Neural Logic Circuits: An evolutionary neural architecture that can learn and generalize},
author = {Hamit Taner Ünal and Fatih Başçiftçi},
url = {https://www.sciencedirect.com/science/article/pii/S0950705123001296},
doi = {10.1016/j.knosys.2023.110379},
issn = {0950-7051},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Knowledge-Based Systems},
volume = {265},
pages = {110379},
abstract = {We introduce Neural Logic Circuits (NLC), an evolutionary, weightless, and learnable neural architecture loosely inspired by the neuroplasticity of the brain. This new paradigm achieves learning by evolution of its architecture through reorganization of augmenting synaptic connections and generation of artificial neurons functioning as logic gates. These neural units mimic biological nerve cells stimulated by binary input signals and emit excitatory or inhibitory pulses, thus executing the “all-or-none” character of their natural counterparts. Unlike Artificial Neural Networks (ANN), our model achieves generalization ability without intensive weight training and dedicates computational resources solely to building network architecture with optimal connectivity. We evaluated our model on well-known binary classification datasets using advanced performance metrics and compared results with modern and competitive machine learning algorithms. Extensive experimental data reveal remarkable superiority of our initial model, called NLCv1, on all test instances, achieving outstanding results for implementation of this new paradigm.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Jin, Guangyin; Sha, Hengyu; Xi, Zhexu; Huang, Jincai
Urban hotspot forecasting via automated spatio-temporal information fusion Journal Article
In: Applied Soft Computing, vol. 136, pp. 110087, 2023, ISSN: 1568-4946.
@article{JIN2023110087,
title = {Urban hotspot forecasting via automated spatio-temporal information fusion},
author = {Guangyin Jin and Hengyu Sha and Zhexu Xi and Jincai Huang},
url = {https://www.sciencedirect.com/science/article/pii/S1568494623001059},
doi = {10.1016/j.asoc.2023.110087},
issn = {1568-4946},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Applied Soft Computing},
volume = {136},
pages = {110087},
abstract = {Urban hotspot forecasting is one of the most important tasks for resource scheduling and security in future smart cities. Most previous works employed fixed neural architectures based on many complicated spatial and temporal learning modules. However, designing appropriate neural architectures is challenging for urban hotspot forecasting. One reason is that there is currently no adequate support system for how to fuse multi-scale spatio-temporal information rationally by integrating different spatial and temporal learning modules. Another one is that the empirical fixed neural architecture is difficult to adapt to different data scenarios from different domains or cities. To address the above problems, we propose a novel framework based on neural architecture search for urban hotspot forecasting, namely Automated Spatio-Temporal Information Fusion Neural Network (ASTIF-Net). In the search space of our ASTIF-Net, normal convolution and graph convolution operations are adopted to capture spatial geographic neighborhood dependencies and spatial semantic neighborhood dependencies, and different types of temporal convolution operations are adopted to capture short-term and long-term temporal dependencies. In addition to combining spatio-temporal learning operations from different scales, ASTIF-Net can also search appropriate fusion methods for aggregating multi-scale spatio-temporal hidden information. We conduct extensive experiments to evaluate ASTIF-Net on three real-world urban hotspot datasets from different domains to demonstrate that our proposed model can obtain effective neural architectures and achieve superior performance (about 5%∼10% improvements) compared with the existing state-of-art baselines.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Huang, Mingqiang; Liu, Yucen; Huang, Sixiao; Li, Kai; Wu, Qiuping; Yu, Hao
Multi-Bit-Width CNN Accelerator with Systolic-in-Systolic Dataflow and Single DSP Multiple Multiplication Scheme Inproceedings
In: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 229, Association for Computing Machinery, Monterey, CA, USA, 2023, ISBN: 9781450394178.
@inproceedings{10.1145/3543622.3573209,
title = {Multi-Bit-Width CNN Accelerator with Systolic-in-Systolic Dataflow and Single DSP Multiple Multiplication Scheme},
author = {Mingqiang Huang and Yucen Liu and Sixiao Huang and Kai Li and Qiuping Wu and Hao Yu},
url = {https://doi.org/10.1145/3543622.3573209},
doi = {10.1145/3543622.3573209},
isbn = {9781450394178},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
booktitle = {Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays},
pages = {229},
publisher = {Association for Computing Machinery},
address = {Monterey, CA, USA},
series = {FPGA '23},
abstract = {Multi-bit-width neural network enlightens a promising method for high performance yet energy efficient edge computing due to its balance between software algorithm accuracy and hardware efficiency. To date, FPGA has been one of the core hardware platforms for deploying various neural networks. However, it is still difficult to fully make use of the dedicated digital signal processing (DSP) blocks in FPGA for accelerating the multi-bit-width network. In this work, we develop state-of-the-art multi-bit-width convolutional neural network accelerator with novel systolic-in-systolic type of dataflow and single DSP multiple multiplication (SDMM) INT2/4/8 execution scheme. Multi-level optimizations have also been adopted to further improve the performance, including group-vector systolic array for maximizing the circuit efficiency as well as minimizing the systolic delay, and differential neural architecture search (NAS) method for the high accuracy multi-bit-width network generation. The proposed accelerator has been practically deployed on Xilinx ZCU102 with accelerating NAS optimized VGG16 and Resnet18 networks as case studies. Average performance on accelerating the convolutional layer in VGG16 and Resnet18 is 1289GOPs and 1155GOPs, respectively. Throughput for running the full multi-bit-width VGG16 network is 870.73 GOPS at 250MHz, which has exceeded all of previous CNN accelerators on the same platform.},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}
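The single DSP multiple multiplication (SDMM) idea in the abstract above relies on packing several low-bit-width operands into one wide multiplier word and recovering the partial products afterwards. The sketch below shows the general trick for two unsigned 4-bit multiplications; the paper's INT2/4/8 scheme on real DSP blocks additionally handles signed operands and correction terms, so treat this as an assumption-laden illustration rather than the authors' design.

# Two unsigned 4-bit multiplications computed with one wide multiply,
# the core idea behind single-DSP multiple multiplication (SDMM).

def sdmm_u4(a, b1, b2, shift=8):
    assert 0 <= a < 16 and 0 <= b1 < 16 and 0 <= b2 < 16
    packed = (b1 << shift) | b2          # pack both operands into one word
    p = a * packed                       # a single hardware multiply
    lo = p & ((1 << shift) - 1)          # a * b2 (fits in 8 bits: max 225)
    hi = p >> shift                      # a * b1
    return hi, lo

for a, b1, b2 in [(7, 9, 3), (15, 15, 15)]:
    hi, lo = sdmm_u4(a, b1, b2)
    assert (hi, lo) == (a * b1, a * b2)
print("both products recovered from a single multiply")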
Wang, Xueying; Li, Guangli; Ma, Xiu; Feng, Xiaobing
Facilitating hardware-aware neural architecture search with learning-based predictive models Journal Article
In: Journal of Systems Architecture, vol. 137, pp. 102838, 2023, ISSN: 1383-7621.
@article{WANG2023102838,
title = {Facilitating hardware-aware neural architecture search with learning-based predictive models},
author = {Xueying Wang and Guangli Li and Xiu Ma and Xiaobing Feng},
url = {https://www.sciencedirect.com/science/article/pii/S1383762123000176},
doi = {10.1016/j.sysarc.2023.102838},
issn = {1383-7621},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Journal of Systems Architecture},
volume = {137},
pages = {102838},
abstract = {Neural architecture search (NAS), which automatically explores efficient model designs, has achieved ground-breaking advances in recent years. To achieve optimal model latency on deployment platforms, a performance-tuning process is usually needed to select reasonable parameters and implementations for each neural network operator. As this tuning process is time-consuming, it is impractical to tune every candidate architecture generated during the search procedure. Recent NAS systems therefore usually rely on theoretical metrics or rule-based heuristics of on-device latency to approximate model performance. Nevertheless, we discovered that there is still a gap between the estimated latency and the optimal latency, potentially causing sub-optimal solutions in neural architecture search. This paper presents an accurate and efficient approach for estimating practical model latency on target platforms, which employs lightweight learning-based predictive models (LBPMs) to obtain realistic deployment-time model latency with acceptable run-time overhead, thereby facilitating hardware-aware neural architecture search. We propose an LBPM-based NAS framework, LBPM-NAS, and evaluate it by searching model architectures for ImageNet classification and facial landmark localization tasks on various hardware platforms. Experimental results show that LBPM-NAS achieves up to a 2.4× performance boost compared with the baselines at the same level of accuracy.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
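A learned latency predictor of the kind this abstract describes can be sketched in a few lines: encode each architecture as a feature vector, fit a lightweight regressor on latencies measured once on the target device, and then score candidates by prediction inside the search loop instead of re-measuring. The toy encoding, synthetic latencies, and regressor choice below are all assumptions for illustration, not the authors' LBPM implementation.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Toy encoding: [depth, width, kernel_size] per architecture.
X_measured = rng.integers(1, 9, size=(200, 3)).astype(float)
# Stand-in for latencies measured once on the target device.
y_measured = X_measured @ np.array([3.0, 1.5, 0.7]) + rng.normal(0, 0.5, 200)

predictor = RandomForestRegressor(n_estimators=100, random_state=0)
predictor.fit(X_measured, y_measured)

# Inside the search loop, candidates are scored by prediction rather
# than by running the expensive tuning/measurement pipeline.
candidates = rng.integers(1, 9, size=(5, 3)).astype(float)
print(predictor.predict(candidates))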
Chatzianastasis, Michail; Ilias, Loukas; Askounis, Dimitris; Vazirgiannis, Michalis
Neural Architecture Search with Multimodal Fusion Methods for Diagnosing Dementia Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-05894,
title = {Neural Architecture Search with Multimodal Fusion Methods for Diagnosing Dementia},
author = {Michail Chatzianastasis and Loukas Ilias and Dimitris Askounis and Michalis Vazirgiannis},
url = {https://doi.org/10.48550/arXiv.2302.05894},
doi = {10.48550/arXiv.2302.05894},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.05894},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Zhu, Xunyu; Li, Jian; Liu, Yong; Wang, Weiping
Improving Differentiable Architecture Search via Self-Distillation Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-05629,
title = {Improving Differentiable Architecture Search via Self-Distillation},
author = {Xunyu Zhu and Jian Li and Yong Liu and Weiping Wang},
url = {https://doi.org/10.48550/arXiv.2302.05629},
doi = {10.48550/arXiv.2302.05629},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.05629},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Wang, Zhe; Yang, Fangfang; Xu, Qiang; Wang, Yongjian; Yan, Hong; Xie, Min
Capacity estimation of lithium-ion batteries based on data aggregation and feature fusion via graph neural network Journal Article
In: Applied Energy, vol. 336, pp. 120808, 2023, ISSN: 0306-2619.
@article{WANG2023120808,
title = {Capacity estimation of lithium-ion batteries based on data aggregation and feature fusion via graph neural network},
author = {Zhe Wang and Fangfang Yang and Qiang Xu and Yongjian Wang and Hong Yan and Min Xie},
url = {https://www.sciencedirect.com/science/article/pii/S0306261923001721},
doi = {10.1016/j.apenergy.2023.120808},
issn = {0306-2619},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Applied Energy},
volume = {336},
pages = {120808},
abstract = {Lithium-ion batteries in electrical devices inevitably degrade with long-term usage, and the accompanying battery capacity estimation is crucial for battery health management. However, the hand-crafted feature engineering of traditional methods, and the complicated network design and laborious trials of data-driven methods, hinder efficient capacity estimation. In this work, battery measurements from different sensors are organized as a graph structure and comprehensively utilized via a graph neural network. Feature fusion is further designed to enhance the network's capacity. The specific data aggregation and feature fusion operations are selected by neural architecture search, which relieves the burden of network design and increases adaptability. Two public datasets are adopted to verify the effectiveness of the proposed scheme. Additional discussions emphasize the capability of the graph neural network and the necessity of architecture search. Comparison analysis and performance under noisy environments further demonstrate the superiority of the proposed scheme.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
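The sensor-graph aggregation this abstract mentions can be illustrated with a single graph-convolution step: sensor channels become nodes, and each node's features are updated from its neighbors. The adjacency, feature sizes, and normalization below are illustrative assumptions, not the paper's searched architecture.

import numpy as np

def gcn_step(H, A, W):
    # Symmetrically normalized aggregation with self-loops:
    # ReLU(D^-1/2 (A + I) D^-1/2 H W), a standard GCN layer.
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 8))          # 3 sensor nodes (e.g., voltage,
                                         # current, temperature), 8 features
A = np.array([[0, 1, 1],                 # fully connected sensor graph
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
W = rng.standard_normal((8, 8))
print(gcn_step(H, A, W).shape)           # (3, 8)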
Ismail, Walaa N.; Alsalamah, Hessah A.; Hassan, Mohammad Mehedi; Mohamed, Ebtesam
AUTO-HAR: An adaptive human activity recognition framework using an automated CNN architecture design Journal Article
In: Heliyon, vol. 9, no. 2, pp. e13636, 2023, ISSN: 2405-8440.
@article{ISMAIL2023e13636,
title = {AUTO-HAR: An adaptive human activity recognition framework using an automated CNN architecture design},
author = {Walaa N. Ismail and Hessah A. Alsalamah and Mohammad Mehedi Hassan and Ebtesam Mohamed},
url = {https://www.sciencedirect.com/science/article/pii/S2405844023008435},
doi = {10.1016/j.heliyon.2023.e13636},
issn = {2405-8440},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {Heliyon},
volume = {9},
number = {2},
pages = {e13636},
abstract = {Convolutional neural networks (CNNs) have demonstrated exceptional results in the analysis of time-series data when used for Human Activity Recognition (HAR). The manual design of such neural architectures is an error-prone and time-consuming process. The search for optimal CNN architectures is considered a revolution in the design of neural networks; by means of Neural Architecture Search (NAS), network architectures can be designed and optimized automatically, overcoming the limitations of human experience and thinking modes. Evolutionary algorithms, which are derived from evolutionary mechanisms such as natural selection and genetics, have been widely employed to develop and optimize NAS because they can handle black-box optimization processes for designing appropriate solution representations and search paradigms without explicit mathematical formulations or gradient information. The genetic algorithm (GA) is widely used to find optimal or near-optimal solutions for difficult problems. Considering these characteristics, this study presents an efficient human activity recognition architecture (AUTO-HAR). Using a GA to select the optimal CNN architecture, the study proposes a novel encoding schema structure and a novel search space with a much broader range of operations to effectively search for the best architectures for HAR tasks. In addition, the proposed search space provides a reasonable degree of depth, because it does not limit the maximum length of the devised task architecture. To test the effectiveness of the proposed framework for HAR tasks, three datasets were utilized: UCI-HAR, Opportunity, and DAPHNET. Based on the results of this study, the proposed method can efficiently recognize human activity, with average accuracies of 98.5% (±1.1), 98.3%, and 99.14% (±0.8) for UCI-HAR, Opportunity, and DAPHNET, respectively.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
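The GA loop sketched below illustrates the kind of search the AUTO-HAR abstract describes: variable-length genomes encode CNN layer configurations (so depth is not fixed) and evolve by selection, crossover, and mutation. The encoding, operators, and placeholder fitness are all assumptions for illustration; a real run would train and validate each decoded network on HAR data.

import random

random.seed(0)
FILTERS, KERNELS = [16, 32, 64, 128], [3, 5, 7]

def random_genome():
    # Variable-length encoding: each gene is one conv layer (filters, kernel).
    return [(random.choice(FILTERS), random.choice(KERNELS))
            for _ in range(random.randint(2, 8))]

def fitness(genome):
    # Placeholder score standing in for validation accuracy on HAR data.
    return -abs(len(genome) - 5) + sum(f for f, _ in genome) / 200.0

def crossover(a, b):
    # One-point crossover; offspring length may differ from both parents.
    cut_a, cut_b = random.randint(1, len(a)), random.randint(1, len(b))
    return a[:cut_a] + b[cut_b:]

def mutate(g):
    i = random.randrange(len(g))
    g[i] = (random.choice(FILTERS), random.choice(KERNELS))
    return g

pop = [random_genome() for _ in range(20)]
for gen in range(10):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]                   # truncation selection
    pop = parents + [mutate(crossover(random.choice(parents),
                                      random.choice(parents)))
                     for _ in range(10)]
print("best genome:", max(pop, key=fitness))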
Gillard, Ryan; Jonany, Stephen; Miao, Yingjie; Munn, Michael; Souza, Connal; Dungay, Jonathan; Liang, Chen; So, David R.; Le, Quoc V.; Real, Esteban
Unified Functional Hashing in Automatic Machine Learning Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-05433,
title = {Unified Functional Hashing in Automatic Machine Learning},
author = {Ryan Gillard and Stephen Jonany and Yingjie Miao and Michael Munn and Connal Souza and Jonathan Dungay and Chen Liang and David R. So and Quoc V. Le and Esteban Real},
url = {https://doi.org/10.48550/arXiv.2302.05433},
doi = {10.48550/arXiv.2302.05433},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.05433},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Romero, David W.; Zeghidour, Neil
DNArch: Learning Convolutional Neural Architectures by Backpropagation Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-05400,
title = {DNArch: Learning Convolutional Neural Architectures by Backpropagation},
author = {David W. Romero and Neil Zeghidour},
url = {https://doi.org/10.48550/arXiv.2302.05400},
doi = {10.48550/arXiv.2302.05400},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.05400},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Zhang, Jinxia; Chen, Xinyi; Wei, Haikun; Zhang, Kanjian
A lightweight network for photovoltaic cell defect detection in electroluminescence images based on neural architecture search and knowledge distillation Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-07455,
title = {A lightweight network for photovoltaic cell defect detection in electroluminescence images based on neural architecture search and knowledge distillation},
author = {Jinxia Zhang and Xinyi Chen and Haikun Wei and Kanjian Zhang},
url = {https://doi.org/10.48550/arXiv.2302.07455},
doi = {10.48550/arXiv.2302.07455},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.07455},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}
Yuan, Gonglin; Wang, Bin; Xue, Bing; Zhang, Mengjie
Particle Swarm Optimization for Efficiently Evolving Deep Convolutional Neural Networks Using an Autoencoder-based Encoding Strategy Journal Article
In: IEEE Transactions on Evolutionary Computation, pp. 1-1, 2023.
@article{10045029,
title = {Particle Swarm Optimization for Efficiently Evolving Deep Convolutional Neural Networks Using an Autoencoder-based Encoding Strategy},
author = {Gonglin Yuan and Bin Wang and Bing Xue and Mengjie Zhang},
doi = {10.1109/TEVC.2023.3245322},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {IEEE Transactions on Evolutionary Computation},
pages = {1-1},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
Bhattacharjee, Abhiroop; Moitra, Abhishek; Panda, Priyadarshini
XploreNAS: Explore Adversarially Robust & Hardware-efficient Neural Architectures for Non-ideal Xbars Technical Report
2023.
@techreport{DBLP:journals/corr/abs-2302-07769,
title = {XploreNAS: Explore Adversarially Robust & Hardware-efficient Neural Architectures for Non-ideal Xbars},
author = {Abhiroop Bhattacharjee and Abhishek Moitra and Priyadarshini Panda},
url = {https://doi.org/10.48550/arXiv.2302.07769},
doi = {10.48550/arXiv.2302.07769},
year = {2023},
date = {2023-01-01},
urldate = {2023-01-01},
journal = {CoRR},
volume = {abs/2302.07769},
keywords = {},
pubstate = {published},
tppubtype = {techreport}
}