Simulators play an indispensable role in research and development across many scientific fields. In architectural design in particular, simulators provide a secure and cost-effective virtual environment, enabling researchers to conduct rapid experimental analyses and evaluations. Simulators also accelerate chip design and verification, thereby saving time and reducing resource expenditure. As processor architectures continue to evolve, notably through the growing diversity of dedicated processors, the role of simulators in providing substantial feedback for architectural design exploration has gained prominence. This survey provides an overview of current developments and applications of architectural simulators, highlighting a few illustrative examples. Analyzing the techniques employed by simulators dedicated to various processors allows for a deeper understanding of the focal points and technical complexities of different architectures. Moreover, this survey offers assessments and critiques of vital aspects of future architectural simulator development, aiming to forecast their prospects in the field of processor design research.
The primary task of person Re-IDentification (ReID) is to identify and track a specific pedestrian across multiple non-overlapping cameras. With the development of deep neural networks and owing to the increasing demand for intelligent video surveillance, ReID has gradually attracted research attention. Most existing ReID methods primarily adopt labeled data for supervised training; however, the high annotation cost makes scaling supervised ReID to large unlabeled datasets challenging. The paradigm of unsupervised ReID can significantly alleviate such issues, improving its applicability to real-life scenarios and enhancing its research potential. Although several ReID surveys have been published, they have primarily focused on supervised methods and their applications. This survey systematically reviews, analyzes, and summarizes existing ReID studies to provide a reference for researchers in this field. First, the ReID methods are comprehensively reviewed in an unsupervised setting. Based on the availability of source domain labels, the unsupervised ReID methods are categorized into unsupervised domain adaptation methods and fully unsupervised methods. Additionally, their merits and drawbacks are discussed. Subsequently, the benchmark datasets widely evaluated in ReID research are summarized, and the performance of different ReID methods on these datasets is compared. Finally, the current challenges in this field are discussed and potential future directions are proposed.
Multi-table join order selection refers to the process of determining the optimal join sequence among the tables involved in a query during query optimization, to improve execution performance. In complex queries, different join orders can significantly affect query efficiency. In the era of big data, traditional join order selection algorithms, which are typically based on heuristic rules, are challenged by massive datasets, diverse application scenarios, and complex query workloads. Their inability to dynamically adapt to environmental changes or to self-improve through learning limits the generalizability of these models, often resulting in suboptimal join orders that can severely degrade query performance. With the rapid advancement of machine learning, Artificial Intelligence for Databases (AI4DB) has emerged as a transformative approach to query optimization. Machine learning-based techniques address the limitations of traditional methods by enabling self-learning and context-aware adaptations. This study first reviews classical join order selection algorithms and then analyzes their inherent limitations. Next, state-of-the-art machine learning models for multi-table join optimization are systematically summarized, detailing their core technical designs. A comparative analysis is provided in terms of effectiveness and applicable scenarios, offering valuable insights for future research in this field.
Industrial time-series forecasting is critical for optimizing production processes and enhancing decision-making. Existing deep learning-based methods often underperform in this context due to a lack of domain knowledge. Prior studies have proposed using mechanistic models to guide deep learning; however, these approaches typically consider only a single mechanistic model, ignoring scenarios with multiple time-series prediction mechanisms in industrial processes and the inherent complexity of industrial time-series (e.g., multiscale dynamics and nonlinearity). To address this issue, this study proposes a Multi-Mechanism-guided Deep Learning for Industrial Time-series Forecasting (M-MDLITF) framework based on attention mechanisms. This framework embeds multiple mechanistic models into a deep industrial time-series prediction network to guide training and integrate the strengths of different mechanisms by focusing on final predictions. As an instantiation of the M-MDLITF, the Multi-mechanism Deep Wiener (M-DeepWiener) method employs contextual sliding windows and a Transformer-encoder architecture to capture complex patterns in industrial time-series. Experimental results from a simulated dataset and two real-world datasets demonstrate that M-DeepWiener achieves high computational efficiency and robustness. It significantly outperforms the single-mechanism Deep Wiener (DeepWiener), classical Wiener mechanistic models, and purely data-driven methods, reducing the prediction error by 20% compared to DeepWiener-M1 on the simulated dataset.
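As a rough illustration of attention-weighted fusion of several mechanistic predictions into a single forecast (the general idea behind letting the network integrate mechanisms by focusing on final predictions), the following Python sketch uses a toy softmax over placeholder relevance scores; the numbers and the scoring scheme are assumptions for illustration, not the trained M-DeepWiener attention.

```python
# Toy attention-weighted fusion of three mechanistic predictions; the scores
# and predictions below are placeholder numbers, not model outputs.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

mech_preds = np.array([12.1, 11.4, 13.0])   # outputs of three mechanistic models
scores = np.array([0.8, 0.1, 0.4])          # relevance scores (would be learned)
weights = softmax(scores)
fused = float(weights @ mech_preds)
print("fusion weights:", np.round(weights, 3), "fused prediction:", round(fused, 3))
```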
High-performance computing architectures have equipped software and hardware with multi-layer parallel structures. These multi-layered system resources can be assigned to computational tasks distributed across different vertical tiers and horizontal groups through various schemes. The allocation schemes, typically determined at runtime by user-defined parallel parameters, significantly affect computational efficiency. As computational scale and complexity increase, the configurable space for these parallel parameters expands, making it increasingly difficult for users to identify the optimal settings. Although such runtime optimization problems are prevalent in scientific computing applications, related research and effective solutions remain scarce. Using the Vienna Ab initio Simulation Package (VASP) as a case study, this study analyzes its multi-layer parallel structure to demonstrate how different parallel parameter configurations can lead to significant variations in computational speed. It then proposes a fully automated optimization method based on a reduced parallel efficiency metric. This approach enables users to quickly determine optimal parallel parameters and identify the most efficient hardware resource allocation, facilitating effective scaling for large-scale parallel computing. Finally, this study integrates the optimization method with a cluster job scheduling system and applies it to actual VASP calculation jobs submitted by users. Statistical results demonstrate that the proposed method significantly improves job execution speed and enhances the utilization efficiency of supercomputing resources, showing great promise for practical engineering applications.
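The following sketch illustrates how a parallel-efficiency criterion can drive parameter selection, in the spirit of the reduced parallel efficiency metric described above; the timing values, the 50% efficiency cutoff, and the single tunable (core count) are illustrative assumptions rather than the paper's actual procedure.

```python
# Sketch: choosing a parallel parameter by a parallel-efficiency threshold.
# The timing numbers and the 0.5 cutoff are illustrative assumptions.

def parallel_efficiency(t_serial, t_parallel, n_cores):
    """Classical parallel efficiency E = S / n = (t_serial / t_parallel) / n."""
    return (t_serial / t_parallel) / n_cores

# Hypothetical wall-clock times for one VASP-like job at several core counts.
timings = {1: 3600.0, 16: 260.0, 64: 80.0, 256: 30.0}

t1 = timings[1]
for cores, t in sorted(timings.items()):
    eff = parallel_efficiency(t1, t, cores)
    print(f"{cores:4d} cores: speedup {t1 / t:6.1f}, efficiency {eff:.2f}")

# Pick the largest core count whose efficiency stays above a chosen threshold.
threshold = 0.5
best = max((c for c, t in timings.items()
            if parallel_efficiency(t1, t, c) >= threshold), default=1)
print("suggested core count:", best)
```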
Time series prediction is widely applicable in industrial production, financial decision-making, and early disaster warning. However, most existing methods primarily target stationary time series, failing to accurately capture the evolutionary characteristics of nonstationary sequences. Current approaches for nonstationary time series also inadequately extract multidimensional features and lack comprehensive dynamic perception, thereby compromising prediction accuracy. This study proposes a novel prediction mechanism for nonstationary time series to address these limitations. First, it models seasonality, local trends, and long-term trends that affect sequence stationarity to extract multidimensional hidden states. This study combines the forward-backward algorithm with Maximum Likelihood Estimation (MLE) to compute the maximum transition probabilities for state prediction. Because the mechanism incorporates multiple potential nonlinear factors influencing nonstationary sequences and calculates transition probabilities through global state perception, it significantly improves prediction accuracy. The effectiveness of the proposed mechanism is demonstrated through various case studies. Ablation experiments conducted on nine nonstationary time series datasets from diverse domains validate the contribution of each component to the overall prediction accuracy. Comparative results show that both the Mean Absolute Percentage Error (MAPE) and the Root Mean Square Error (RMSE) of the mechanism are consistently lower than those of baseline methods, with the Legates-McCabe index approaching 1 on financial datasets, thereby confirming its robustness and accuracy.
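To make the forward-backward step concrete, the sketch below runs the standard forward-backward recursions on a toy three-state hidden-state model and predicts the next state by propagating the last posterior through the transition matrix; the matrices, the discrete observations, and the three-state interpretation are illustrative assumptions, not the paper's actual model.

```python
# Minimal forward-backward sketch on a toy 3-state hidden-state model standing
# in for the hidden states (seasonality, local trend, long-term trend).
# All matrices and observations below are illustrative assumptions.
import numpy as np

A = np.array([[0.7, 0.2, 0.1],      # state transition probabilities
              [0.1, 0.8, 0.1],
              [0.2, 0.3, 0.5]])
B = np.array([[0.6, 0.3, 0.1],      # emission probabilities (3 discrete symbols)
              [0.2, 0.5, 0.3],
              [0.1, 0.2, 0.7]])
pi = np.array([0.5, 0.3, 0.2])      # initial state distribution
obs = [0, 1, 2, 1]                  # an observed (discretized) sequence

T, N = len(obs), len(pi)
alpha = np.zeros((T, N))
beta = np.zeros((T, N))

alpha[0] = pi * B[:, obs[0]]                      # forward pass
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

beta[-1] = 1.0                                    # backward pass
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

gamma = alpha * beta
gamma /= gamma.sum(axis=1, keepdims=True)         # posterior P(state_t | obs)

# Predict the next hidden state by propagating the last posterior through A.
next_state_dist = gamma[-1] @ A
print("predicted next state:", int(np.argmax(next_state_dist)))
```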
Automated generation of geometry proof problems is a prominent research area in the field of intelligent education. Current methods commonly employ automated geometric reasoning techniques to discover new geometric relationships based on the original geometric relations of an existing question, thereby synthesizing new relationships. However, the problems generated by these methods lack novelty because the geometric relationships they contain can be inferred from the original geometric relationships of the input problem. To address this limitation, this study proposes an approach for generating new problems by recombining the geometric relationships of two existing problems. By introducing the theory of vector identity equation, a theoretical foundation is established for representing geometric relationships and determining whether geometric relationships from different problems can be reorganized to form new problems. Consequently, algorithms are developed to automatically extract geometric relationships from existing problems and recombine them to generate novel problem instances. Experimental analysis and expert evaluation demonstrate the feasibility of the proposed method in terms of operational performance and educational application. Notably, under the same input conditions, the proposed method can generate more novel problems than existing methods.
In traffic sign detection, external environmental interference and the small size of traffic sign targets in driving scenarios hinder detection performance. This paper introduces a novel traffic sign detection algorithm that significantly improves model detection precision while ensuring real-time detection capabilities. This paper initially designs a new multi-scale feature extraction network, incorporating large-scale features to augment small target localization information, and simultaneously designs a multi-scale feature attention enhancement module to further enhance the model's feature extraction capability. Second, to reduce the computational load and complexity of the model, this paper improves the multi-scale detection heads of the original model by selecting two large-scale detection heads for detecting small targets. Finally, the algorithm modifies the Complete Intersection over Union (CIoU) loss function to enhance its perception of small targets and improve the network's training efficiency. On two open-source public datasets, namely the TT100K and CCTSDB 2021 traffic sign datasets, the improved model achieves enhanced detection precision for small traffic sign targets, with a mean Average Precision (mAP) of 84.8% and 83.6% on the test sets, respectively. These results show improvements of 3.0 and 3.6 percentage points over the baseline models, respectively, demonstrating the model's higher detection performance and feature extraction capabilities while meeting the requirements for real-time detection.
Recommendation algorithms are effective in addressing information overload, a common problem in the era of big data. Existing recommendation algorithms have different degrees of effectiveness but still face the challenge of learning higher-quality item and user features to enhance recommendation performance. Therefore, this paper proposes a Feature-enhanced Recommendation algorithm based on Bipartite Graph Contrastive Learning (FRBGCL). An item feature initialization module is designed that uses a Graph Convolutional Network (GCN) to learn representations of the bipartite graphs of all types of item relationships, and an attention mechanism-based feature fusion strategy is adopted to obtain the initial features of items. In addition, a graph Contrastive Learning (CL) module is designed based on the construction of user-item bipartite graphs, which can further enhance item and user features, leading to an improvement in recommendation performance. On three datasets, XuetangX, Last.fm, and Yelp2018, compared with the suboptimal algorithm, FRBGCL improves the Top20 recommendation results by 2.1%, 6.8%, and 11.6% for recall; 1.8%, 6.1%, and 13.1% for Normalized Discounted Cumulative Gain (NDCG); and 1.7%, 7.8%, and 8.4% for Hit Rate (HR), with optimal parameter selection.
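The graph contrastive learning module above is not specified in detail here, so the sketch below shows a generic InfoNCE-style contrastive loss between two views of node embeddings, the kind of objective commonly used in such modules; the temperature, the random embeddings, and the perturbed second view are placeholders rather than FRBGCL's exact formulation.

```python
# Generic InfoNCE-style contrastive loss between two views of node embeddings.
import numpy as np

def info_nce(z1, z2, tau=0.2):
    """z1, z2: (n_nodes, dim) embeddings of the same nodes under two views."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                         # pairwise cosine similarities
    pos = np.diag(sim)                            # positive pairs on the diagonal
    log_prob = pos - np.log(np.exp(sim).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(0)
z_view1 = rng.normal(size=(8, 16))
z_view2 = z_view1 + 0.1 * rng.normal(size=(8, 16))   # slightly perturbed second view
print("contrastive loss:", round(float(info_nce(z_view1, z_view2)), 4))
```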
Speaker verification in voiceprint recognition plays a key role in real-life applications such as human-computer interaction, medical diagnosis, and online meetings. Speaker embedding techniques based on Deep Neural Networks (DNNs) are being increasingly used in speaker verification tasks. Open-set speaker verification is a multi-classification task that essentially involves metric learning. The performance of existing metric learning techniques relies heavily on large batches of labeled high-resolution speech samples. To solve this problem, this paper proposes a same-class distance minimization algorithm based on metric learning. Building on the triplet loss, the algorithm first introduces the octuplet loss, which captures the relationship between high-resolution and low-resolution speech using four triplet loss terms. It then applies Hard Sample Mining techniques to select appropriate triplets to improve model classification accuracy. An online data augmentation strategy is introduced to address the problem of incorrect classification caused by low-resolution speech in noisy environments, using the Room Impulse Response (RIR) and Music, Speech, and Noise Corpus (MUSAN) datasets for data enhancement. Following data enhancement and after introducing the octuplet loss, the algorithm fine-tunes the pre-trained Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network (ECAPA-TDNN) speaker verification model. The fine-tuned network can process low-resolution speech information in a noisy environment and improve model performance. This method significantly improves cross-resolution speech recognition performance on multiple datasets without affecting the model's performance in processing high-resolution speech. The Equal Error Rate (EER) on the VoxCeleb1 and CN-Celeb1 datasets reaches optimal values of 1.20% and 1.61%, respectively.
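As a concrete reference point for the loss construction described above, the sketch below composes four triplet-loss terms over High-Resolution (HR) and Low-Resolution (LR) embeddings, mirroring the stated idea of an octuplet loss built from four triplets; the equal weighting, margin, and random embeddings are illustrative assumptions.

```python
# Sketch of a triplet loss and an octuplet-style sum of four triplet terms
# mixing HR and LR embeddings; equal weighting is an assumption.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

def octuplet_loss(hr_a, hr_p, hr_n, lr_a, lr_p, lr_n, margin=0.2):
    # Four triplet terms over HR-only, LR-only, and cross-resolution combinations.
    return (triplet_loss(hr_a, hr_p, hr_n, margin) +
            triplet_loss(lr_a, lr_p, lr_n, margin) +
            triplet_loss(hr_a, lr_p, lr_n, margin) +
            triplet_loss(lr_a, hr_p, hr_n, margin))

rng = np.random.default_rng(1)
emb = {k: rng.normal(size=32) for k in
       ("hr_a", "hr_p", "hr_n", "lr_a", "lr_p", "lr_n")}
print("octuplet loss:", round(octuplet_loss(**emb), 4))
```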
With the rapid development of deep learning in recent years, the reconstruction of sparse and noisy low-quality point cloud surfaces based on deep learning has attracted increasing attention from researchers. Existing point cloud surface reconstruction models present limitations such as difficulty in reconstructing complex scenes, incomplete local detail reconstruction, and low reconstruction efficiency. To address these limitations, this paper proposes a point cloud surface reconstruction model based on dual-branch convolution and deep interpolation combined with a convolution occupancy network model. First, a fusion convolution coding module constructed using a PointNet network and dual-branch convolution is used for feature extraction. The dual-branch convolution adaptively integrates point features extracted by the point branch into the volume feature of the voxel branch to provide more fine-grained local information for the volume feature. Subsequently, query point features in combination with the characteristics of neighbor points are generated through a multi-head attention network. Further, a deep interpolation feature module is constructed to improve the accuracy of the Fully Connected (FC) layer network for feature decoding in predicting the spatial location of query points. Finally, a high-quality mesh surface model is extracted based on the Marching Cubes (MC) algorithm. Results of experiments on the object-level dataset ShapeNet and the scene-level dataset Synthetic Rooms show that the proposed model achieves an Intersection over Union (IoU) metric of 0.931 and 0.910 respectively. It outperforms the comparative models such as POCONet, ConvONet, and DP-ConvONet. On the Synthetic Rooms dataset, the average reconstruction time of the proposed model is significantly reduced compared to that of the POCONet model, and it also demonstrates good visual performance. On the object-level dataset ABC, the proposed model exhibits superior generalization performance, which proves the effectiveness of the proposed model.
Existing object detection algorithms suffer from low detection accuracy and poor real-time performance when detecting fall events in indoor scenes, owing to changes in angle and light. In response to this challenge, this study proposes an improved fall detection algorithm based on YOLOv8, called OEF-YOLO. The C2f module in YOLOv8 is improved by using an Omni-dimensional Dynamic Convolution (ODConv) module, optimizing the four dimensions of the kernel space to enhance feature extraction capabilities and effectively reduce computational burden. Simultaneously, to capture finer-grained features, the Efficient Multi-scale Attention (EMA) module is introduced into the neck network to further aggregate pixel-level features and improve the network's processing ability in fall scenes. Integrating the Focal Loss idea into the Complete Intersection over Union (CIoU) loss function allows the model to pay more attention to difficult-to-classify samples and optimize overall model performance. Experimental results show that compared to YOLOv8n, OEF-YOLO achieves improvements of 1.5 and 1.4 percentage points in terms of mAP@0.5 and mAP@0.5:0.95, respectively, with a parameter count of 3.1×10^6 and a computational complexity of 6.5 GFLOPs. Frames Per Second (FPS) increases by 44 on a Graphics Processing Unit (GPU), achieving high-precision detection of fall events while also meeting deployment requirements in low computing scenarios.
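For readers unfamiliar with combining Focal-style weighting with CIoU, the sketch below computes a CIoU loss and reweights it by IoU^gamma, one common way to emphasize hard, low-overlap boxes; the exact weighting used by OEF-YOLO may differ, and gamma and the example boxes are assumed values.

```python
# CIoU loss with a focal-style IoU**gamma reweighting; an illustrative sketch,
# not necessarily OEF-YOLO's exact formulation.
import math

def focal_ciou_loss(box_p, box_g, gamma=0.5, eps=1e-9):
    """Boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    # Intersection over Union
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter + eps
    iou = inter / union
    # Normalized center-distance term
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4.0
    # Aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan((gx2 - gx1) / (gy2 - gy1 + eps)) -
                              math.atan((px2 - px1) / (py2 - py1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    ciou = iou - rho2 / c2 - alpha * v
    return (iou ** gamma) * (1 - ciou)      # focal-style reweighting

print(round(focal_ciou_loss((10, 10, 50, 60), (12, 8, 48, 62)), 4))
```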
As deep learning technology progresses rapidly, it is being increasingly applied in garbage classification, thereby significantly improving classification accuracy and efficiency. However, practical application is hindered by many challenges, such as high data acquisition and annotation costs, insufficient model generalizability, and difficulty in meeting real-time requirements. To address these issues, this paper proposes LSM-PPLCNet, a lightweight garbage classification algorithm combining a large visual model with PP-LCNet. LSM-PPLCNet combines the powerful feature extraction capabilities of large visual models with the design of lightweight models, ensuring that the model meets real-time requirements while achieving improved accuracy on a self-made garbage classification dataset. First, a semi-supervised training strategy based on the CLIP large model is used for data mining on unlabeled data to enrich the training samples and reduce the cost of manual annotation. Second, the knowledge distillation method is used, with the high-precision CLIP large model serving as the teacher model to guide the training of the lightweight network. Finally, the loss function is optimized, and a weighted loss based on the large model is proposed: by assigning different loss weights to different images, the model adjusts each image's contribution to the loss according to its quality. After rigorous training and testing on a self-made garbage classification dataset, experimental results show that compared with the original PP-LCNet classification model, LSM-PPLCNet improves the Top-1 Accuracy by 4.03 percentage points without affecting the inference speed and has significant advantages compared with other mainstream models. These results show that LSM-PPLCNet can achieve high-precision and high-speed classification performance in garbage classification tasks.
Steel surface defect detection technology in industrial scenarios is hindered by low detection accuracy and slow convergence speed. To address these issues, this study presents an improved YOLOv8 algorithm, namely YOLOv8n-MDC. First, a Multi-scale Cross-fusion Network (MCN) is added to the backbone network. Establishing closer connections between the feature layers promotes uniform information transmission and reduces semantic information loss during cross-layer feature fusion, thereby enhancing the ability of the model to perceive steel defects. Second, deformable convolution is introduced into the module to adaptively change the shape and position of the convolution kernel, enabling a more flexible capture of the edge features of irregular defects, reducing information loss, and improving detection accuracy. Finally, a Coordinate Attention (CA) mechanism is added to embed position information into channel attention, solving the problem of position information loss and enabling the model to perceive the position and morphological features of defects, thereby enhancing detection precision and stability. Experimental results on the NEU-DET dataset show that the YOLOv8n-MDC algorithm achieves an mAP@0.5 of 81.0%, which is 4.2 percentage points higher than that of the original baseline network. The algorithm has a faster convergence speed and higher accuracy; therefore, it meets the requirements of practical industrial production.
The analysis and mining of uncertain time series data has attracted attention in various industries. Top-k queries, a popular research topic in the database field, aim to retrieve the Top-k results that best match a user's query conditions from large-scale data. Although Top-k queries have been extensively explored and applied in various fields, research on Top-k queries specifically for uncertain time series is limited. Such queries can effectively help users extract important information from uncertain time series. This study proposes a new Top-k query problem, i.e., Top-k window aggregate queries over uncertain time series, and provides an efficient algorithm to address this problem. This query can serve as a fundamental tool to assist users in exploring and analyzing uncertain time series. Existing methods supporting this query suffer from low efficiency or require high storage space. To address these issues, this study proposes a novel two-level Top-k query method based on the sub-window stitching strategy and a method for efficiently computing the upper bound of thresholds to solve the complexity issues introduced by the sub-window stitching strategy. This method efficiently supports Top-k window aggregate queries over uncertain time series with less pre-computed storage space. The effectiveness and efficiency of the proposed method are evaluated on both real and synthetic datasets. The results demonstrate that the proposed method significantly reduces the storage space for pre-computed lists compared with Top-k query methods based on the Threshold Algorithm (TA), overcoming challenges that hinder practical application. The average query efficiency of the proposed method and its further optimization using the upper bound of the thresholds are 7.27 times and 20.04 times better than those of the traversal method FSEC-S, respectively.
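Since the baseline the paper compares against is TA-based, the sketch below shows a generic Threshold Algorithm for Top-k aggregation over descending score lists with a sum aggregate; the toy lists and the sum aggregate are placeholders for the paper's window-aggregate setting.

```python
# Generic sketch of the Threshold Algorithm (TA) for Top-k aggregation.
import heapq

def threshold_algorithm(sorted_lists, k):
    """sorted_lists: list of lists of (obj_id, score), each sorted by score desc.
    The aggregate is the sum of per-list scores."""
    lookup = [dict(lst) for lst in sorted_lists]          # random-access tables
    seen, topk = set(), []                                # topk: min-heap of (score, obj)
    depth = 0
    while depth < max(len(lst) for lst in sorted_lists):
        # Threshold: aggregate of the scores at the current sorted-access depth.
        threshold = sum(lst[depth][1] if depth < len(lst) else 0.0
                        for lst in sorted_lists)
        for lst in sorted_lists:
            if depth >= len(lst):
                continue
            obj = lst[depth][0]
            if obj in seen:
                continue
            seen.add(obj)
            total = sum(tbl.get(obj, 0.0) for tbl in lookup)
            if len(topk) < k:
                heapq.heappush(topk, (total, obj))
            else:
                heapq.heappushpop(topk, (total, obj))
        if len(topk) == k and topk[0][0] >= threshold:
            break                                         # kth score beats the bound
        depth += 1
    return sorted(topk, reverse=True)

lists = [[("a", 0.9), ("b", 0.8), ("c", 0.3)],
         [("b", 0.95), ("c", 0.7), ("a", 0.2)]]
print(threshold_algorithm(lists, k=2))
```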
The deployment of deep learning models at the edge is hindered by challenges such as domain shift and long-tail distribution in the training data, as well as limited computing resources. Therefore, domain adaptation methods must be applied for online retraining to alleviate the domain shift, and long-tail reduction techniques must be applied during retraining to alleviate the long-tail problem while considering computational costs. However, most existing long-tail reduction techniques have high computational costs or cannot be effectively combined with domain adaptation methods. To address these issues, this paper proposes EdgeTailor, a long-tail optimization method specifically designed for edge-side domain adaptation. EdgeTailor optimizes the continuous unsupervised adaptation process by using synthetic minority-class oversampling techniques and a class-balanced loss as strategies for mitigating the long tail. Additionally, a buffer is introduced to address the issue of insufficient data for tail classes during online learning, allowing EdgeTailor to mitigate the long-tail problem while conducting online continuous domain adaptation. Experimental results demonstrate the effectiveness of EdgeTailor in edge domain adaptation tasks involving two long-tail datasets with domain shift. Using five deep neural networks as the model backbone, EdgeTailor improves average Top-1 accuracy by approximately 8.10% compared with the baseline in the target domain. In terms of computational cost, EdgeTailor maintains a low level of Floating Point Operations (FLOPs) and parameter count, reducing FLOPs by approximately 29.84% compared with the data synthesis method, with better performance than the baseline. Overall, EdgeTailor achieves high performance and low cost in addressing both domain adaptation and long-tail visual recognition challenges in edge deployment.
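The class-balanced loss mentioned above is commonly implemented with effective-number-of-samples weighting; the sketch below shows that variant combined with a weighted cross-entropy, with beta, the class counts, and the random logits as illustrative assumptions rather than EdgeTailor's exact settings.

```python
# Class-balanced cross-entropy weighting via "effective number of samples";
# an illustrative sketch, with assumed beta and class counts.
import numpy as np

def class_balanced_weights(class_counts, beta=0.999):
    counts = np.asarray(class_counts, dtype=float)
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(counts) / weights.sum()   # normalize to mean 1

def weighted_cross_entropy(logits, labels, weights):
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    return float((weights[labels] * per_sample).mean())

counts = [5000, 500, 50, 5]                       # a long-tailed class histogram
w = class_balanced_weights(counts)
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 4))
labels = rng.integers(0, 4, size=8)
print("weights:", np.round(w, 3),
      "loss:", round(weighted_cross_entropy(logits, labels, w), 4))
```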
In the era of big data, multi-view data exist in large quantities, and most existing multi-view clustering methods aggregate the data of all views for learning. However, data from different views are stored on different devices in some practical applications, and some of the data are private and cannot be shared. If the data of each view are regarded as nodes in a distributed network, these problems can be solved by introducing federated learning into multi-view clustering. Federated learning utilizes a central server for coordination; however, it becomes invalid when the central server is missing or faulty. This paper proposes a Decentralized Federated Multi-view Clustering (DFMC) approach to address this issue. First, the low-dimensional representation of each view is learned using Non-negative Matrix Factorization (NMF). Next, a consistency constraint is applied to the low-dimensional representations of different views based on the consistency of the view information. This constraint implements information communication between neighboring views and constructs a decentralized federated learning environment. Finally, a unified low-dimensional representation matrix is obtained and applied for clustering. Privacy preservation is achieved using the Alternating Minimization (AM) algorithm for individual views separately. Experimental results on real datasets verify the effectiveness and convergence of the DFMC approach.
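The per-view factorization that DFMC builds on is standard NMF; the sketch below shows plain multiplicative-update NMF for a single view, omitting the consistency constraint and the decentralized exchange between neighboring views.

```python
# Plain NMF with multiplicative updates, X ≈ W @ H, for one view only.
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9):
    rng = np.random.default_rng(0)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)      # standard multiplicative updates
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.abs(np.random.default_rng(1).normal(size=(20, 12)))
W, H = nmf(X, rank=4)
print("reconstruction error:", round(float(np.linalg.norm(X - W @ H)), 4))
```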
The Spark distributed computing framework performs in-memory job computation but does not adequately account for the intermediate computation results of jobs, leading to the loss of frequently accessed data blocks; this is especially evident in iterative jobs. Spark implements Least Recently Used (LRU) caching through the hash table provided by LinkedHashMap; the elements that have not been used for the longest time are moved to the head of the list and evicted first, causing data recalculation. To address the issue of frequently accessed but recently unused hot data being evicted from the cache by the LRU cache replacement algorithm used in Spark, this paper proposes a Spark adaptive caching optimization strategy based on Resilient Distributed Dataset (RDD) reuse degree, named LCRD. It includes an automatic caching algorithm and a cache automatic cleaning algorithm. First, the automatic caching algorithm analyzes the Directed Acyclic Graph (DAG) of Spark before job execution, calculates the reuse frequency and operator complexity of RDDs, and quantifies the factors affecting execution efficiency based on the reuse degree model. During job execution, the application caches data blocks with a higher reuse degree. Second, in the case of memory bottlenecks or invalid RDD caching, the automatic cache cleaning algorithm traverses the cache queue and cleans the data blocks with low-frequency access. Experimental results indicate that, compared to LRU, when executing PageRank iterations on four public datasets (amazon0302, email-EuAll, web-Google, and wiki-Talk), the efficiency of LCRD improves by 10.7%, 8.6%, 17.9%, and 10.6%, respectively. The average increases in memory utilization are 3%, 4%, 3%, and 5%, respectively. The proposed strategy effectively enhances Spark execution efficiency and improves memory utilization.
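To make the LRU behavior being replaced explicit, the sketch below reproduces the LinkedHashMap-style eviction policy with Python's OrderedDict: the least-recently-used entry is evicted first, even if it is a hot block overall. This illustrates the baseline policy only, not LCRD itself; the key names are placeholders.

```python
# LRU cache mirroring LinkedHashMap-style eviction, for illustration only.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            evicted = self.data.popitem(last=False)   # drop the LRU entry
            print("evicted:", evicted[0])

cache = LRUCache(2)
cache.put("rdd_1", "block")
cache.put("rdd_2", "block")
cache.get("rdd_1")                  # rdd_1 becomes most recently used
cache.put("rdd_3", "block")         # evicts rdd_2, even if it is hot overall
```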
Business process conformance checking can help enterprises detect potential problems early and ensure the normal operation and security of business processes. To this end, a conformance checking method for business processes based on improved Bidirectional Encoder Representations from Transformers (BERT) and a lightweight Convolutional Neural Network (CNN) is proposed. First, trace prefixes are extracted from historical event logs and labeled as fit or unfit, and a dataset is constructed accordingly. Second, the improved BERT model, which incorporates relative contextual relationships, is used to represent the feature vectors of traces. Finally, the conformance checking classifier, constructed using a lightweight CNN model, is used to complete online business process conformance checking. This method effectively improves the accuracy of conformance checking. Experiments were conducted using five real-life event log datasets. The results show that the proposed model's accuracy is superior to that of Word2Vec+CNN, Transformer, and BERT. Furthermore, when compared with the traditional BERT+CNN, the accuracy increases by up to 2.61%.
The communication between the Supervisory Control And Data Acquisition (SCADA) system and terminal devices in the Industrial Internet of Things (IIoT) is vulnerable to tampering, eavesdropping, forgery, and other attacks. This paper presents a threshold signcryption scheme based on the SM2 algorithm, SM-TSC, to address this problem. First, terminal devices are registered and grouped, and a group secret value distribution method is designed based on Shamir secret sharing to prevent key leakage, signcryption forgery, and other problems caused by the excessive concentration of power in the terminal device nodes in IIoT scenarios. Second, using the SM2 signature algorithm as the basis, combined with the SM3 algorithm, SM4 algorithm, and group secret value distribution methods, a secure and efficient group-oriented threshold signcryption algorithm is designed to ensure the authenticity and confidentiality of communication messages between the SCADA system and terminal device groups. Finally, under the random oracle model, a security reduction method is used to analyze the security of the SM-TSC scheme. The analysis results show that the SM-TSC scheme can achieve semantic security under adaptive chosen-ciphertext attacks and existential unforgeability under adaptive chosen-message attacks, effectively ensuring the confidentiality and authenticity of group communication data. Experimental analysis shows that, compared with other threshold signcryption schemes based on elliptic curves, the SM-TSC scheme reduces the calculation cost by 75% in the threshold signcryption stage and approximately 79.66% in the unsigncryption stage; thus, it has higher feasibility in IIoT group communication.
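The group secret value distribution described above builds on Shamir secret sharing; the toy sketch below splits and reconstructs a secret over a prime field with a (3, 5) threshold, where the prime, threshold, and secret are illustrative choices rather than the SM-TSC parameters.

```python
# Toy Shamir (t, n) secret sharing over a prime field; illustrative parameters.
import random

PRIME = 2**127 - 1          # a Mersenne prime; real deployments use other fields

def make_shares(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(secret=123456789, t=3, n=5)
print(reconstruct(shares[:3]))      # any 3 of the 5 shares recover the secret
```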
Malware, Web attacks, and other malicious behaviors frequently occur on the Internet. Therefore, the large amount of user privacy information on the Internet must be protected from leakage caused by malicious network attacks, which makes network intrusion detection systems a popular research topic. Network intrusion data includes a large amount of redundant and irrelevant information. However, current detection models seldom capture the patterns and regularities in the temporal and spatial dimensions of network intrusion data, which limits their detection performance. This study establishes a new BRFE-CBIAT model for network intrusion detection by combining feature selection and feature fusion. First, a BRFE model is constructed by combining Random Forest (RF) and Recursive Feature Elimination (RFE). The BRFE model selects a subset of features after eliminating unimportant ones, thereby reducing redundant information. Second, a CBIAT model is built for the parallel extraction of spatio-temporal features. A one-dimensional convolutional layer of a Convolutional Neural Network (CNN) is used for initial spatial feature extraction from the data. Then, a Bi-directional Long Short-Term Memory (BiLSTM) network in the temporal feature module is used to model the deep sequence data, capturing the temporal relationships between features. An improved spatial attention module is used to focus on the spatial features. Finally, a Softmax classifier is used to process the fused spatio-temporal features to obtain the classification prediction results. The BRFE-CBIAT model proposed in this study achieves multi-classification detection accuracies of 99.7% and 94.0% on the NSL-KDD and UNSW-NB15 datasets, respectively, outperforming current mainstream network models. The experimental results also indicate the proposed model's effectiveness on individual attack categories: per-category performance exceeds 96% on the NSL-KDD dataset, and the F1-score for detecting DoS attacks reaches 89% on the UNSW-NB15 dataset.
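The BRFE step combines random-forest importances with recursive feature elimination; the sketch below shows that general idea using scikit-learn's RFE on a synthetic dataset, with the estimator settings, feature counts, and data as placeholders rather than the NSL-KDD/UNSW-NB15 pipeline.

```python
# Recursive feature elimination driven by random-forest importances;
# synthetic data and parameters are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = make_classification(n_samples=500, n_features=40, n_informative=10,
                           random_state=0)
selector = RFE(RandomForestClassifier(n_estimators=100, random_state=0),
               n_features_to_select=15, step=5)   # drop 5 features per round
selector.fit(X, y)
print("kept feature indices:",
      [i for i, keep in enumerate(selector.support_) if keep])
```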
This study aims to improve the accuracy of perturbation localization when generating adversarial samples under black-box attacks in the Chinese-language domain and to address the problem that existing methods ignore contextual relevance and semantic density when evaluating word importance. This study proposes a text adversarial sample generation method for Chinese with multi-level perturbation localization (MDLM). First, a set of multi-level decision models integrating different feature extraction capabilities is constructed by organically combining multi-source heterogeneous deep learning models. Then, three new evaluation functions are added to evaluate the importance of words from multiple dimensions. Finally, the multi-level decision model and the evaluation functions work together to accurately locate perturbation points in the original text. In terms of the adversarial sample generation strategy, MDLM integrates a variety of text replacement strategies, such as traditional Chinese characters, Pinyin, polyphonic words, and homonyms, aiming to ensure the success rate of attacks and improve the diversity of the generated adversarial samples. Experimental results show that when MDLM attacks multiple target models on multiple datasets, its perturbation effect is significant, and the maximum attack perturbation rate reaches 43.5%, further enhancing the attack capability of the generated adversarial samples. Simultaneously, results of ablation experiments conducted to evaluate the multi-level perturbation localization ability show that the multi-level combination of the scoring functions and decision models can significantly improve the attack effectiveness of the generated adversarial samples.
Uploading data from Internet of Things (IoT) devices to the cloud has become a mainstream data management solution. However, cloud-based data management is associated with security risks. Attribute-Based Access Control (ABAC) is considered an effective solution for safeguarding data confidentiality and preventing unauthorized access. However, existing encryption schemes are computationally burdensome and lack robust revocation mechanisms, rendering them unsuitable for dynamic IoT environments. To address these issues, this study proposes a Revocable Attribute-Based Encryption scheme Assisted by Edge and Cloud (ECA-RABE). The scheme utilizes Elliptic Curve Cryptography (ECC) to reduce computational overhead, supports decentralized attribute management among multiple authorities to eliminate single points of failure, and employs Edge Node (EN) to offload computational tasks from IoT devices. Additionally, cloud computing is leveraged for pre-decryption to reduce user-side computational pressure. The scheme incorporates both attribute and system version numbers and designs a revocation mechanism to achieve user-level attribute revocation, system-wide attribute revocation, and user revocation. Security and performance analyses demonstrate that the proposed scheme is secure under the Decisional Bilinear Diffie-Hellman (DBDH) assumption and exhibits high efficiency in encryption and decryption. Therefore, the proposed scheme is well-suited to IoT environments.
Searchable encryption technology utilizes extracted keywords as indexes to search for specific documents within a document collection. However, existing searchable encryption schemes suffer from the issues of significantly increased resource consumption with an increased number of keywords and the inability to allow index collisions in multi-user scenarios. To address the limitations of the current schemes, a searchable encryption scheme for multi-user and multi-keyword scenarios in Internet of Things (IoT) environments is proposed. Leveraging the characteristics of Bloom filters, a memory-efficient vector is employed as an index for grouping document collections, thereby enhancing the efficiency of searchable encryption while permitting index collisions. A verification ciphertext generated from encrypted keywords is used to verify whether the trapdoor contains the keywords present in a document, thereby enabling users to locate matching documents within the shared-index document collection. Based on the discrete logarithm's hardness and the Diffie-Hellman problems, the proposed scheme requires fewer computational operations for ciphertext generation at each stage. Theoretical analysis and experimental results demonstrate that the scheme is both feasible and secure, with reduced communication overhead when compared with alternative approaches.
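The grouping index described above relies on Bloom filters; the minimal sketch below shows a Bloom filter with double hashing to illustrate the memory-efficient membership test, with the bit-array size, hash count, and keywords as illustrative choices rather than the paper's construction.

```python
# Minimal Bloom filter with double hashing; sizes and keywords are illustrative.
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]   # double hashing

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
for kw in ("sensor", "temperature", "gateway"):
    bf.add(kw)
print("sensor" in bf, "humidity" in bf)   # True, and very likely False
```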
The direct-forcing immersed boundary method is widely used for solving fluid-structure interaction problems because it can effectively handle complex geometries, including moving and deforming solids. However, three-dimensional complex flow simulations are characterized by large grid scales and consume high computational time, which hinder traditional serial algorithms running on single-core processors from meeting computational requirements. Currently, the research on fluid-structure interaction problems on domestic platforms is limited, and implementing the direct-forcing immersed boundary method on such platforms can enrich their application ecosystems. To this end, this study leverages the domestic Deep Compute Unit (DCU) accelerator and designs and implements a parallel program based on CPU-DCU heterogeneous programming, to solve fluid-structure interaction problems using the three-dimensional direct-forcing immersed boundary method. First, a serial algorithm is implemented on a CPU, and a hotspot analysis is conducted to identify the computationally intensive parts of the program, which are then accelerated using the DCU accelerator in a heterogeneous manner. Second, based on the heterogeneous implementation, the kernel functions are optimized by incorporating the hardware characteristics of the DCU, such as shared memory, loop tiling, and memory access order adjustment. Finally, the correctness and performance of the program are validated and tested through case studies involving the flow around a sphere and self-propelled swimming of a biomimetic fish. Experimental results show that at Reynolds numbers of 100 and 200, the drag coefficients of the sphere are 1.11 and 0.78, respectively, which are in good agreement with the relevant literature. In the self-propelled swimming experiment with the biomimetic fish at a Reynolds number of 7 142, the average forward velocity after stable swimming is 0.396, which is consistent with the results from the relevant literature. In the flow-around-a-sphere experiment, the parallel program achieves an 83.7-fold speed-up compared with the serial program with a grid scale of 50.33 million. These fluid-structure interaction numerical experiments verify the effectiveness and accuracy of the CPU-DCU parallel direct-forcing immersed boundary method for computations on domestic heterogeneous platforms, providing a solid foundation for research on Computational Fluid Dynamics (CFD) algorithms on domestic platforms.
When diversifying projects composed of multiple C/C++ source files, most existing software diversification tools adopt the same diversification method for all functions in a single C/C++ source file, which leads to a single diversification method for each function or source file and a lack of targeted diversification. To address this issue, a diversified compilation method combining grouping obfuscation and code awareness based on the Low-Level Virtual Machine (LLVM) intermediate representation is proposed. First, this study designs a preselection library of obfuscation techniques based on different perspectives, which includes various grouping schemes for obfuscation techniques. During compilation, code analysis and processing are performed on each traversed function to determine its obfuscation characteristics. Targeted diversification grouping strategies are selected, and diversification techniques within the group are randomly selected to perform obfuscation. This achieves a significantly different diversification scheme for each function, making the generated heterogeneous execution set more diverse and providing basic software support for mimic defense and moving target defense technologies. To verify the method's effectiveness, a standard test set and typical cases are selected to verify both security and performance. The results indicate that the proposed method can ensure security while having almost no impact on performance, thus verifying the proposed method's effectiveness and feasibility in practical applications.
The feature distributions of colorectal endoscopic images differ among devices, reducing the trained model's segmentation performance on new devices. To improve the model's adaptability to new devices, a fine-tuning method based on incremental learning and an improved colorectal polyp segmentation network called CPSegNet are proposed. The incremental learning method consists of two stages: pre-training and fine-tuning on new devices. Pre-training uses data from an old device to train the polyp segmentation network adequately, and the fine-tuning stage is trained with samples from both old and new devices; it also includes a sampling rate adjustment and a regularization loss function to prevent catastrophic forgetting. CPSegNet adopts a pre-trained MiT model as the backbone network, a Multi-Layer Perceptron (MLP) as the decoding module, and an Uncertainty Region Attention (URA) mechanism as the refinement module to optimize the ambiguous boundary regions. To validate the adaptability of the learning strategy to new devices, experiments are conducted using six datasets: Kvasir-SEG, CVC-ClinicDB, CVC-300, CVC-ColonDB, Kvasir-Sessile, and ETIS-LaribPolypDB; the first two datasets are used as the training set, and the other four simulate data from new devices. The experimental results, using the Dice similarity coefficient and Intersection over Union (IoU) metrics as evaluation indicators, demonstrate that the performance of CPSegNet on new devices is superior to that of mainstream algorithms without incremental learning, particularly on the challenging ETIS-LaribPolypDB dataset, showing an increase of 3 percentage points in the Dice similarity coefficient compared with the ColonFormer algorithm when Kvasir-SEG is used as the source domain dataset. When CVC-ClinicDB is used as the source domain dataset, the Dice similarity coefficient is improved by 6 percentage points. Furthermore, both CPSegNet and mainstream algorithms exhibit performance improvements on new devices after using incremental learning, while maintaining segmentation accuracy on old devices.
Brain tissue segmentation in Magnetic Resonance Imaging (MRI) is vital in neuroimaging analysis applications, such as diagnosis, planning, and research. Current Transformer-based methods use the self-attention mechanism to extract features for segmentation, but their accuracy still needs improvement while their computational complexity remains high. To address this issue, this paper proposes a network for MRI brain tissue segmentation that incorporates a query-adaptive bilevel self-attention module into a U-shaped architecture. The query-adaptive bilevel self-attention module consists of a sparse coarse-grained layer and a pixel-level self-attention layer to balance accuracy and complexity. Specifically, the coarse-grained layer utilizes self-attention to dynamically filter out irrelevant image blocks for more flexible and efficient computation, whereas the fine-grained layer applies pixel-to-pixel self-attention for high-precision segmentation. The module achieves high performance while restricting the computation cost. The algorithm is validated on popular brain MRI segmentation benchmarks and outperforms state-of-the-art methods with a Dice Similarity Coefficient (DSC) of 0.917±0.030 and a 95th-percentile Hausdorff Distance (HD95) of 1.196±0.613 mm. Experimental results demonstrate the effectiveness and accuracy of the algorithm in segmenting brain tissue for MRI.
This study addresses the problems of detail blurring, excessive contrast, and darkness in dehazed images generated using the classical All in One Dehazing Network (AOD-Net). The study proposes a novel multiscale image dehazing algorithm, which builds upon the improvements made to AOD-Net. In the enhanced network architecture, traditional convolutions are replaced with depth-wise separable convolutions to reduce redundant parameters for analyzing and processing foggy images at different scales, thereby better capturing image details to accelerate the computation speed, effectively reduce the memory footprint of the model, and enhance the algorithm's dehazing efficiency. In addition, this study employs a multiscale structure to improve the network's capability to handle image details and mitigate the blurring of details in dehazed images. Furthermore, a pyramid pooling module is incorporated into the network architecture to aggregate contextual information from different image regions, thereby enhancing the network's ability to capture global information in hazy images and mitigate problems such as color tone distortion and detail loss. Additionally, a low-light enhancement module selectively enhances the predicted noise and improves the contrast stretch, effectively restoring noisy regions. Consequently, moderate improvements are observed in terms of the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) metrics for low-light dehazed images, concurrently achieving elevated levels of overall naturalness. Experimental results demonstrate that the proposed algorithm yields satisfactory dehazing outcomes. The processed images exhibit greater naturalness and improved balance in saturation and contrast compared with the classical AOD-Net. Finally, on the SOTS subset of the RESIDE dataset, which includes both outdoor and indoor sets, the proposed algorithm achieves improvements of 4.5593 dB and 4.0656 dB, respectively, in terms of PSNR over the classical AOD-Net. Furthermore, it achieves improvements of 0.0476 and 0.0874 in terms of SSIM.
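The parameter savings from swapping standard convolutions for depth-wise separable ones can be seen with a quick count, as in the sketch below; the kernel size and channel counts are arbitrary examples, not the dehazing network's actual layer sizes.

```python
# Parameter-count comparison: standard vs. depth-wise separable convolution.
def standard_conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    return c_in * k * k + c_in * c_out      # depth-wise part + point-wise part

c_in, c_out, k = 64, 128, 3                 # arbitrary example layer
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
print(std, dws, f"reduction: {1 - dws / std:.1%}")
```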
Image segmentation is a crucial technology for environmental perception, and it is widely used in various scenarios such as autonomous driving and virtual reality. With the rapid development of technology, computer vision-based blind guiding systems are attracting increasing attention as they outperform traditional solutions in terms of accuracy and stability. The semantic segmentation of road images is an essential feature of a visual guiding system. By analyzing the output of algorithms, the guiding system can understand the current environment and aid blind people in safe navigation, helping them avoid obstacles, move efficiently, and find the optimal path. Visual blind guiding systems are often used in complex environments, which require high running efficiency and segmentation accuracy. However, commonly used high-precision semantic segmentation algorithms are unsuitable for blind guiding systems owing to their low running speed and large number of model parameters. To solve this problem, this paper proposes a lightweight road image segmentation algorithm based on multiscale features. Unlike existing methods, the proposed model contains two feature extraction branches, namely, the Detail Branch and the Semantic Branch. The Detail Branch extracts low-level detail information from the image, while the Semantic Branch extracts high-level semantic information. Multiscale features from the two branches are processed and used by the designed feature mapping module, which can further improve the feature modeling performance. Subsequently, a simple and efficient feature fusion module is designed for the fusion of features with different scales to enhance the ability of the model to encode contextual information by fusing multiscale features. A large amount of road segmentation data suitable for blind guiding scenarios is collected and labeled, and a corresponding dataset is generated. The model is trained and tested on this dataset. The experimental results show that the mean Intersection over Union (mIoU) of the proposed method is 96.5%, which is better than that of existing image segmentation models. The proposed model achieves a running speed of 201 frames per second on an NVIDIA RTX 3090 Ti, which is higher than that of existing lightweight image segmentation models. The model can be deployed on an NVIDIA AGX Xavier to obtain a running speed of 53 frames per second, which meets the requirements for practical applications.
A multimodal remote sensing small-target detection method, BFMYOLO, is proposed to address misdetection and omission issues in remote sensing images with complex backgrounds and limited effective information. The method utilizes a pixel-level Red-Green-Blue (RGB) and infrared (IR) image fusion module, namely, the Bimodal Fusion Module (BFM), which makes full use of the complementarity of the two modalities to realize effective fusion of their information. In addition, a full-scale adaptive updating module, AA, is introduced to resolve multitarget information conflicts during feature fusion. This module incorporates the CARAFE up-sampling operator and shallow features to enhance non-neighboring layer fusion and improve the spatial information of small targets. An Improved task decoupling Detection Head (IDHead) is designed to handle classification and regression tasks separately, thereby reducing the mutual interference between different tasks and enhancing detection performance by fusing deeper semantic features. The proposed method adopts the Normalized Wasserstein Distance (NWD) loss function as the localization regression loss function to mitigate positional bias sensitivity. Results of experiments on the VEDAI, NWPU VHR-10, and DIOR datasets demonstrate the superior performance of the model, with mean Average Precision at a threshold of 0.5 (mAP@0.5) of 78.6%, 95.5%, and 73.3%, respectively. The model thus outperforms other advanced models in remote sensing small-target detection.
Owing to complex biochemical reactions and the constant change in influent flow and concentration, sewage treatment processes exhibit strong nonlinear and time-varying characteristics, making it difficult to control the process variables accurately. Therefore, this paper presents a fuzzy Proportional-Integral-Derivative (PID) controller optimized by the Sparrow Search Algorithm (SSA) to track the concentrations of dissolved oxygen and nitrate-nitrogen. First, SSA is used to optimize the initial PID parameters of the variable-universe fuzzy PID controllers of units 5 and 2. Then, a secondary optimization is carried out, in which the quantization and scaling factors are optimized. An adaptive universe adjustment strategy based on fuzzy rules is designed to adjust the controller parameters online and improve the tracking accuracy of the controller. Finally, Benchmark Simulation Model No.1 (BSM1) for the sewage treatment process is used to experimentally verify constant-value and dynamic variable-value tracking control, and the applicability of SSA to the wastewater treatment process is analyzed. Experimental results show that compared with the conventional variable-universe fuzzy PID controller based on adaptive scaling factors, the designed controller reduces the integral of absolute error. Moreover, energy consumption is reduced while effluent quality is improved.
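As a baseline reference for the controller being tuned, the sketch below runs a plain discrete PID loop tracking a dissolved-oxygen-like setpoint on a toy first-order plant; the gains, time step, and plant dynamics are illustrative stand-ins, not the SSA-optimized variable-universe fuzzy PID of the paper.

```python
# Plain discrete PID loop on a toy first-order plant; illustrative values only.
def pid_step(error, state, kp, ki, kd, dt):
    integral = state["integral"] + error * dt
    derivative = (error - state["prev_error"]) / dt
    state.update(integral=integral, prev_error=error)
    return kp * error + ki * integral + kd * derivative

setpoint, y = 2.0, 0.0                     # target and current concentration
state = {"integral": 0.0, "prev_error": 0.0}
dt = 0.1
for step in range(50):
    u = pid_step(setpoint - y, state, kp=1.2, ki=0.5, kd=0.05, dt=dt)
    y += dt * (-0.5 * y + 0.5 * u)         # simple first-order plant response
print("output after 50 steps:", round(y, 3))
```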
The high penetration of distributed energy, addition of many energy storage units, and flexible loads have hindered the optimization and scheduling of active distribution networks. Existing economic dispatch systems do not consider the integration of flexible loads and energy storage units effectively, resulting in slow convergence speed. Aligning with the national "dual carbon" goals, this paper proposes a multi-objective economic dispatch strategy for distribution networks based on the FISCO BCOS platform, combined with the consistency algorithm in communication topology optimization. This strategy comprehensively considers power generation cost, gas emissions, energy storage cost, and the benefits of flexible load electricity consumption. It uses the consistency algorithm in communication topology optimization to improve the system's convergence speed. Moreover, it combines the storage of the FISCO BCOS alliance chain and the reduced Practical Byzantine Fault Tolerance (rPBFT) consensus mechanism to optimize information sharing between nodes, reduce the centrality of leading nodes, prevent malicious nodes from attacking, and achieve multi-objective optimal power allocation in the distribution network. Simulation results show that the proposed multi-objective scheduling economic dispatch strategy converges quickly despite leader node switching, node exit and addition at different stages, power exchange instruction changes, and convergence coefficient changes. It is robust, stable, and achieves a higher convergence speed than that of the fast-consistency algorithm. It selects appropriate target weight coefficients, yielding economic and environmental outcomes that are better than those provided by the multi-objective NSGA-II algorithm.
This study presents MGW-YOLO, an algorithm based on YOLOv8n, to address the need for an accurate, real-time, robust, and lightweight algorithm for target detection during hedge trimming on both sides of a road. The study proposes a new C2f_ModuGhost+ module to replace the C2f module in the backbone network, in which modulated deformable convolution increases the number of offset feature channels, accelerating model inference and improving the algorithm's real-time performance. The Grouped Spatial Convolution (GSConv) lightweight convolution technique and slim-neck design paradigm are introduced into the neck of the network, which integrates concepts such as standard convolution, depth-separable convolution, and the Shuffle module; reduces the number of parameters; and makes the model lightweight. A focal-WIoU loss function with a double weighting mechanism is designed. The two-layer cross-attention mechanism in WIoU effectively reduces the false detection rate when multiple hedges are connected and occluded, and the focal loss weighting factor is utilized to improve the detection accuracy of difficult-to-classify samples such as irregularly shaped hedges. In addition, the adversarial training strategy of TRADES is adopted to balance robustness and accuracy in the classification problem. Experimental results show that, compared with the baseline algorithm, i.e., YOLOv8n, the mAP@0.5 and mAP@0.5:0.95 of MGW-YOLO increase by 3.29 and 2.87 percentage points, respectively. Experiments on an unmanned chassis show that the pre-processing time, average inference time per frame, and post-processing time per frame of MGW-YOLO are reduced by 0.7 ms, 10.7 ms, and 0.7 ms, respectively. The detection speed improves by 15.7 frames/s compared with that of the original algorithm, making MGW-YOLO suitable for the real-time operation of hedge trimmers on both sides of a road.
One-Class Classification (OCC) techniques, such as Support Vector Domain Description (SVDD) and One-Class Support Vector Machines (OCSVM), have received widespread attention in application fields such as computer vision, machine learning, and biometric recognition. Most current OCC models are designed based on the L2 norm; therefore, issues such as insufficiently sparse solutions, noise sensitivity, and the need for second-order or higher optimization persist, making real-time object detection difficult. To address these issues, this study proposes a one-class classifier called L1-OCSVM by replacing the interval term of OCSVM's L2 norm with the L1 norm. This substitution not only inherits the large-interval principle of Support Vector Machines (SVM) but also leads to first-order optimization problems. However, owing to the introduction of the L1 norm, the feature samples in the nonlinear L1-OCSVM model no longer appear in pairs as in the L2-norm case and therefore cannot be replaced by inner products as in the L2-norm formulation. Thus, an equivalent optimization strategy is provided, which directly minimizes the L1 norm term of the variable, resulting in extremely sparse solutions that are very conducive to real-time detection. Experiments on non-rigid object detection in forestry problems, such as forest fire, forest smoke, and tree crowns, using unmanned aerial vehicle images and ground remote sensing images verify the advantages of L1-OCSVM in object detection accuracy, sparsity, and real-time detection.
This study proposes a Proximal Policy Optimization (PPO)-based vehicle intelligence control method to improve vehicle driving efficiency in a mixed environment on highways and reduce traffic accidents. First, a hierarchical control framework is constructed, integrating deep reinforcement learning and traditional Proportional Integral Derivative (PID) control. The upper-level deep reinforcement learning agent is responsible for determining the control strategy, while the lower-level PID controller executes the control strategy. Second, an advantage distance is defined to filter the observed environmental state matrix to enhance driving efficiency, helping the ego-car to choose lanes with longer advantage distances for lane changing. A new state collection method is proposed based on the defined advantage distance to reduce the amount of data to be processed, to accelerate the convergence speed of the deep reinforcement learning model. Additionally, a multi-objective reward function is designed to balance the safety, driving efficiency, and stability of vehicles. Finally, simulation tests are conducted in a vehicle reinforcement learning task simulation environment called Highway_env, built on Gym. The proposed approach achieves a faster convergence rate than that of the Deep Q-Network (DQN) method. It also enables vehicles to safely and smoothly accomplish driving tasks at two different target speeds.