Entropy (Basel). 2025 Jul 19;27(7):767.
doi: 10.3390/e27070767.

A Quantum-like Approach to Semantic Text Classification

Anastasia S Gruzdeva et al. Entropy (Basel).

Abstract

In this work, we conduct a sentiment analysis of English-language reviews using a quantum-like (wave-based) model of text representation. This model is explored as an alternative to machine learning (ML) techniques for text classification and analysis tasks. Special attention is given to the problem of segmenting text into semantic units, and we illustrate how the choice of segmentation algorithm is influenced by the structure of the language. We investigate the impact of quantum-like semantic interference on classification accuracy and compare the results with those obtained using classical probabilistic methods. Our findings show that accounting for interference effects improves accuracy by approximately 15%. We also explore methods for reducing the computational cost of algorithms based on the wave model of text representation. The results demonstrate that the quantum-like model can serve as a viable alternative or complement to traditional ML approaches. The model achieves classification precision and recall scores of around 0.8. Furthermore, the classification algorithm is readily amenable to optimization: the proposed procedure reduces the estimated computational complexity from $O(n^2)$ to $O(n)$.

Keywords: interference; quantum-like heuristic algorithms; sentiment analysis; text classification; vector-space language model.
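
The abstract contrasts a wave-based model that accounts for interference with classical probabilistic methods. The following Python sketch illustrates only the general idea: squaring the modulus of a sum of waves produces cross terms that a plain sum of intensities lacks. It does not reproduce the paper's equations. The unit-amplitude plane-wave form exp(i·k·r) and the function names `wave_amplitudes`, `classical_density`, and `wave_density` are assumptions made for illustration.

```python
import numpy as np

def wave_amplitudes(distances, wave_numbers):
    """Complex amplitudes of unit-strength waves exp(i*k*r) at one point.

    Assumption: each word contributes a wave with unit amplitude; the
    paper's actual wave functions are not reproduced here.
    """
    r = np.asarray(distances, dtype=float)
    k = np.asarray(wave_numbers, dtype=float)
    return np.exp(1j * k * r)

def classical_density(distances, wave_numbers):
    """Sum of the individual intensities |psi_j|^2 (no interference)."""
    psi = wave_amplitudes(distances, wave_numbers)
    return float(np.sum(np.abs(psi) ** 2))

def wave_density(distances, wave_numbers):
    """Squared modulus of the superposition |sum_j psi_j|^2.

    Expanding the square gives the classical sum plus cross terms
    2*Re(conj(psi_i) * psi_j), which are the interference contribution.
    """
    psi = wave_amplitudes(distances, wave_numbers)
    return float(np.abs(np.sum(psi)) ** 2)

if __name__ == "__main__":
    # Two waves arriving in phase reinforce each other.
    in_phase = ([1.0, 1.0], [2.0, 2.0])
    # Two waves arriving with a phase difference of pi cancel.
    out_of_phase = ([1.0, 2.0], [np.pi, np.pi])
    for label, (r, k) in (("in phase", in_phase), ("out of phase", out_of_phase)):
        print(label, classical_density(r, k), wave_density(r, k))
```

In this toy setting the classical sum is the same for both configurations, while the wave-based density depends on the relative phases. This is the kind of extra signal that an interference-aware classifier can draw on.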

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Color map of the probability density, ρ, calculated with the help of (4) for three wave functions corresponding to three words of the phrase. For visualization, the points reflecting the words of the phrase are placed on the plane; the axes of the plane relate to Cartesian coordinates (x, y) expressed in conventional units. The coordinates of the points have been selected in such a way that the distances between them are proportional to the semantic distances between the words. In particular, the distance between two points, $r_{ij}$, is specified as $\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$, where $(x_i, y_i)$ and $(x_j, y_j)$ are the coordinates of the corresponding points. The wave numbers are calculated using (7) and (9); for const = 4, the $r_{ci}$ correspond to the semantic distance between the individual words of the phrase, and the centroid of the triangle reflects the relative position of the words in semantic space. The narrow color box reflects the probability density levels, represented in dimensionless units; purple corresponds to the minimum level of probability density, and red corresponds to its maximum.
Figure 2
Block diagram of a semantic text classification algorithm. The input and output stages of the algorithm are marked in green, the data processing stages are blue, and the intermediate datasets are gray.
Figure 3
An example of a possible relative position of the terms defining the positive and negative classes (excellent, good, bad, and worst) on a color probability density map for a certain wave packet. It shows how, because of their high semantic proximity, “good” and “bad” can yield similar probability density values. Because they are farther apart, “excellent” and “worst” are more likely to fall into areas with different probability densities. For visualization, the points corresponding to the words are placed on the plane taking into account their mutual distances. The axes of the graph reflect Cartesian coordinates (x, y) expressed in conventional units.
Figure 4
Dependence of the algorithm running time on the data dimension.
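
The Figure 1 caption places words on a plane so that the Euclidean distance between two points stands in for the semantic distance between the words. A minimal Python sketch of that geometry follows. The function names are hypothetical and the coordinates are illustrative, not taken from the paper. Treating $r_{ci}$ as the distance from each point to the centroid of the triangle is one possible reading of the caption, not a confirmed definition.

```python
import numpy as np

def pairwise_distances(points):
    """Matrix of r_ij = sqrt((x_i - x_j)^2 + (y_i - y_j)^2) for points on a plane."""
    pts = np.asarray(points, dtype=float)
    # Broadcasting builds every coordinate difference (x_i - x_j, y_i - y_j).
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def centroid_distances(points):
    """Distance from each point to the centroid of all points.

    Assumption: this is one plausible interpretation of r_ci in the caption.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    return np.linalg.norm(pts - centroid, axis=1)

if __name__ == "__main__":
    # Illustrative coordinates for three words (conventional units).
    words = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
    print(pairwise_distances(words))
    print(centroid_distances(words))
```

The distance matrix is symmetric with a zero diagonal, so for a three-word phrase only three values are independent.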
