Big Data. 2016 Jun;4(2):75-88.
doi: 10.1089/big.2016.0007. Epub 2016 Jun 7.

A Framework for Considering Comprehensibility in Modeling

Michael Gleicher. Big Data. 2016 Jun.

Abstract

Comprehensibility in modeling is the ability of stakeholders to understand relevant aspects of the modeling process. In this article, we provide a framework to help guide exploration of the space of comprehensibility challenges. We consider facets organized around key questions: Who is comprehending? Why are they trying to comprehend? Where in the process are they trying to comprehend? How can we help them comprehend? How do we measure their comprehension? With each facet we consider the broad range of options. We discuss why taking a broad view of comprehensibility in modeling is useful in identifying challenges and opportunities for solutions.

Keywords: data analysis; human-computer interaction; machine learning; statistical modeling; visual analytics; visualization.

Figures

FIG. 1.
A summary of the framework proposed in this article. The specific lists for each question are initial organizations to show the broad range of aspects to consider.
FIG. 2.
Visualization of a validation experiment for a DNA-binding surface classifier that allows exploration of classification results. The corpus overview (left) is configured to display each molecule in the test set as a quilted glyph and orders these glyphs by classifier performance to show how performance varies over the molecules. Molecules that appear greener have more true positive classifications, whereas molecules that appear redder or bluer have more misclassifications (false negatives and false positives, respectively). Selected molecules (left, yellow box) are visualized as heatmaps in a subset view (middle) and ordered by molecule size to help localize the positions of errors relative to correct answers. The detailed view (right) shows a selected molecule to confirm that most errors (blue, red) are close to the correctly found binding site (green).
FIG. 3.
Visualization of example Explainers, classifiers constructed with tradeoffs that emphasize comprehensibility concerns. In this example, Shakespeare's 36 plays are measured with a set of 115 “Docuscope” features. Classifiers are constructed to identify the 12 comedies (green). Each column represents a linear SVM classifier, with the plays sorted according to their score. The leftmost classifier uses only two features with unit coefficients. It makes several mistakes (e.g., misclassifying the tragedies Othello and Romeo and Juliet as comedies), but the simplicity of the classifier makes it useful for building theory about how Shakespeare used the linguistic constructs in the different genres. In contrast, other classifiers may use more features and more complex weights to achieve better accuracy (and larger SVM margins), at the expense of how easy the functions are to comprehend. SVM, support vector machine.
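
The two-feature classifier described in the FIG. 3 caption can be illustrated with a minimal sketch. It assumes that "unit coefficients" means weights of +1 or -1, so the score is simply one feature minus another. It does not train an SVM; it only evaluates a fixed linear function, sorts items by score as in the figure, and applies a threshold. The item names, feature names, values, and the threshold of 0 are all hypothetical and are not taken from the article or from Docuscope.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def unit_coefficient_score(features: Features, positive: str, negative: str) -> float:
    """Linear score with weights +1 and -1 on two named features (assumed form)."""
    return features[positive] - features[negative]


def rank_items(items: Dict[str, Features], positive: str, negative: str) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [
        (name, unit_coefficient_score(feats, positive, negative))
        for name, feats in items.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def classify(ranked: List[Tuple[str, float]], threshold: float = 0.0) -> Dict[str, bool]:
    """Label an item as positive when its score exceeds the threshold."""
    return {name: score > threshold for name, score in ranked}


# Hypothetical toy data: three items measured on two made-up features.
toy_items = {
    "item_a": {"feat_x": 0.9, "feat_y": 0.2},
    "item_b": {"feat_x": 0.4, "feat_y": 0.6},
    "item_c": {"feat_x": 0.7, "feat_y": 0.1},
}

ranked = rank_items(toy_items, positive="feat_x", negative="feat_y")
labels = classify(ranked)

for name, score in ranked:
    print(f"{name}: score={score:.2f} positive={labels[name]}")
```

Because the function has only two terms, a reader can see directly which measurement pushes an item toward each class. That is the comprehensibility benefit the caption describes, traded against the accuracy a classifier with more features might reach.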
