To address the large volume of defect texts in current relay protection systems and the limitations of traditional mining methods, such as insufficient text feature extraction, inaccurate semantic recognition, and low operating efficiency, a deep mining method for defect texts based on the Linear Transformer-CNN model is proposed. First, massive defect text data are preprocessed, and the processed words are input into MacBERT to generate comprehensive word embeddings. Next, a linear attention mechanism is introduced into the Transformer to improve overall computational efficiency. Then, a multi-layer CNN module is added to compensate for the Linear Transformer’s limited ability to extract defect text features. Finally, the comprehensive word embeddings are fed into the multi-layer CNN and Linear Transformer modules to extract local key features and long-distance semantic features of defect texts, respectively. The fused features are then classified using a softmax layer. Experimental results show that, compared with traditional text mining methods, the proposed method requires less training and testing time and achieves a classification accuracy of 94.24%, enabling fast and accurate classification of defect texts.
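
The efficiency gain attributed to linear attention can be illustrated with a minimal sketch. The abstract does not state which form of linear attention is used, so the code below assumes the kernel feature map φ(x) = elu(x) + 1, which is common in the linear-attention literature; the function names `feature_map` and `linear_attention` are hypothetical and written in plain Python for self-containment. The idea is to compute φ(K)ᵀV once and reuse it for every query, so the cost grows linearly with sequence length n rather than quadratically.

```python
import math


def feature_map(x):
    # Assumed kernel feature map: elu(x) + 1, which is always positive.
    # For x > 0 this is x + 1; for x <= 0 it is exp(x).
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(Q, K, V):
    """Linear attention over lists of rows.

    Q, K: n rows of length d; V: n rows of length d_v.
    Returns n rows of length d_v.
    """
    n, d, d_v = len(K), len(K[0]), len(V[0])
    phi_q = [[feature_map(x) for x in row] for row in Q]
    phi_k = [[feature_map(x) for x in row] for row in K]

    # Summaries computed once: kv = phi(K)^T V (d x d_v), k_sum = sum of phi(K) rows.
    kv = [[sum(phi_k[t][i] * V[t][j] for t in range(n)) for j in range(d_v)]
          for i in range(d)]
    k_sum = [sum(phi_k[t][i] for t in range(n)) for i in range(d)]

    out = []
    for q in phi_q:
        # Normalizer: phi(q) . sum_t phi(k_t), positive because phi > 0.
        z = sum(q[i] * k_sum[i] for i in range(d))
        out.append([sum(q[i] * kv[i][j] for i in range(d)) / z
                    for j in range(d_v)])
    return out
```

Each output row is a normalized weighted average of the rows of V, with weights φ(q)·φ(k). This matches the result of forming the full n × n weight matrix explicitly, but without ever materializing it.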