Artif Intell Med. 2002 Jan;24(1):71-96.
doi: 10.1016/s0933-3657(01)00091-4.

Empirical evaluation of a hybrid intelligent monitoring system using different measures of effectiveness

Bertha Guijarro-Berdiñas et al. Artif Intell Med. 2002 Jan.

Abstract

The validation of a software product is a fundamental part of its development, and focuses on an analysis of whether the software correctly resolves the problems it was designed to tackle. Traditional approaches to validation are based on a comparison of results with what is called a gold standard. Nevertheless, in certain domains, it is not always easy or even possible to establish such a standard. This is the case of intelligent systems that endeavour to simulate or emulate a model of expert behaviour. This article describes the validation of the intelligent system computer-aided foetal evaluator (CAFE), developed for intelligent monitoring of the antenatal condition based on data from the non-stress test (NST), and how this validation was accomplished through a methodology designed to resolve the problem of the validation of intelligent systems. System performance was compared to that of three obstetricians using 3450 min of cardiotocographic (CTG) records corresponding to 53 different patients. From these records different parameters were extracted and interpreted, and thus, the validation was carried out on a parameter-by-parameter basis using measurement techniques such as percentage agreement, the Kappa statistic or cluster analysis. Results showed that the system's agreement with the experts is, in general, similar to agreement between the experts themselves which, in turn, permits our system to be considered at least as skillful as our experts. Throughout our article, the results obtained are commented on with a view to demonstrating how the utilisation of different measures of the level of agreement existing between system and experts can assist not only in assessing the aptness of a system, but also in highlighting its weaknesses. This kind of assessment means that the system can be fine-tuned repeatedly to the point where the expected results are obtained.
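Two of the agreement measures named in the abstract, percentage agreement and the Kappa statistic, can be illustrated with a minimal sketch. It assumes the two-rater (Cohen's) form of Kappa applied to categorical labels; the abstract does not say which variant the authors used, and the function names, the label codes ("N", "S", "P") and the sample data below are all hypothetical, not taken from the study.

```python
from collections import Counter


def percentage_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters give the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    matches = sum(x == y for x, y in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percentage_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label
    # if each labelled independently according to their own label frequencies.
    labels = counts_a.keys() | counts_b.keys()
    p_chance = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_chance == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical labels for eight record segments (illustrative only).
system_labels = ["N", "N", "S", "N", "P", "S", "N", "N"]
expert_labels = ["N", "N", "S", "S", "P", "S", "N", "N"]

print(percentage_agreement(system_labels, expert_labels))
print(cohen_kappa(system_labels, expert_labels))
```

Running the same two functions on every system-expert and expert-expert pair would give the kind of comparison the abstract describes, in which the system's agreement with each expert is set against the agreement among the experts themselves.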
