Empirical evaluation of a hybrid intelligent monitoring system using different measures of effectiveness

Bertha Guijarro-Berdiñas¹, Amparo Alonso-Betanzos

Affiliation

¹ Laboratory for Research and Development in Artificial Intelligence (LIDIA), Department of Computer Science, University of A Coruña, Campus de Elviña s/n, 15071, A Coruña, Spain. cibertha@udc.es

PMID: 11779686
DOI: 10.1016/s0933-3657(01)00091-4

Bertha Guijarro-Berdiñas et al. Artif Intell Med. 2002 Jan;24(1):71-96.

Abstract

The validation of a software product is a fundamental part of its development, and focuses on an analysis of whether the software correctly resolves the problems it was designed to tackle. Traditional approaches to validation are based on a comparison of results with what is called a gold standard. Nevertheless, in certain domains, it is not always easy or even possible to establish such a standard. This is the case of intelligent systems that endeavour to simulate or emulate a model of expert behaviour. This article describes the validation of the intelligent system computer-aided foetal evaluator (CAFE), developed for intelligent monitoring of the antenatal condition based on data from the non-stress test (NST), and how this validation was accomplished through a methodology designed to resolve the problem of the validation of intelligent systems. System performance was compared to that of three obstetricians using 3450 min of cardiotocographic (CTG) records corresponding to 53 different patients. From these records different parameters were extracted and interpreted, and thus, the validation was carried out on a parameter-by-parameter basis using measurement techniques such as percentage agreement, the Kappa statistic or cluster analysis. Results showed that the system's agreement with the experts is, in general, similar to agreement between the experts themselves which, in turn, permits our system to be considered at least as skillful as our experts. Throughout our article, the results obtained are commented on with a view to demonstrating how the utilisation of different measures of the level of agreement existing between system and experts can assist not only in assessing the aptness of a system, but also in highlighting its weaknesses. This kind of assessment means that the system can be fine-tuned repeatedly to the point where the expected results are obtained.
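
As a rough illustration of two of the agreement measures named in the abstract, the sketch below computes percentage agreement and a kappa statistic for one parameter rated by the system and by one expert. It is not the authors' implementation: the abstract does not say which kappa variant was used, so Cohen's kappa for two raters is assumed here, and the function names and the "normal"/"abnormal" labels are hypothetical.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of cases on which two raters assign the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement corrected for chance (assumed variant)."""
    n = len(ratings_a)
    p_observed = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: probability both raters pick the same label
    # if each labelled independently with their own marginal frequencies.
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in labels) / (n * n)
    if p_expected == 1.0:
        # Both raters used a single identical label for every case.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical interpretations of one CTG parameter over eight records.
system = ["normal", "normal", "abnormal", "normal",
          "abnormal", "normal", "normal", "abnormal"]
expert = ["normal", "normal", "abnormal", "abnormal",
          "abnormal", "normal", "normal", "normal"]

print(f"percentage agreement: {percent_agreement(system, expert):.2f}")
print(f"Cohen's kappa:        {cohen_kappa(system, expert):.3f}")
```

In this made-up example the raters agree on 6 of 8 records (0.75), but kappa is lower (about 0.467) because much of that agreement would be expected by chance given how often both raters say "normal". Computing the same measures for each expert pair as well as for each system-expert pair gives the kind of comparison the abstract describes.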
