Virtual Class Matching Based Detection of Out-of-distribution Texts
-
Abstract
In the electric power industry, deep learning models for text classification often perform worse in practical deployment than they did during development. Out-of-distribution (OOD) text detection is therefore urgently needed to screen real-world text data and safeguard the model's generalization ability. In the context of text rating for power field operations, we present a comprehensive examination of the factors that give rise to OOD texts and of the challenges in detecting them, and we then propose an OOD text detection method based on virtual class matching. We employ feature decomposition to obtain the main-component and sub-component subspaces, which sharpens the distinction between in-distribution (ID) and OOD texts. The sub-component subspace is used to construct a virtual class for OOD texts; our analysis shows that this construction combines the advantages of probability-based methods and feature-distribution-space methods. Experiments on datasets with differing degrees of lexical similarity between ID and OOD texts demonstrate the feasibility and effectiveness of the proposed method. In practical application, the method significantly improved both the rating performance and the confidence of our automatic text rating system for power field operations.
-
-