On Trusting a Cyber Librarian: How rethinking underlying data storage infrastructure can mitigate risks of automation

Maria Israel; Mark Graves; Ahmed Amer

doi:10.4108/eai.1-12-2021.172359

EAI Endorsed Transactions on Creative Technologies (Dec 2021)

Affiliations

Maria Israel: Santa Clara University, Santa Clara, CA 95053, USA
Mark Graves: University of Notre Dame, Notre Dame, IN 46556, USA
Ahmed Amer: Santa Clara University, Santa Clara, CA 95053, USA

Journal volume & issue: Vol. 8, no. 29

Abstract

INTRODUCTION: The increased ability of Artificial Intelligence (AI) technologies to generate and parse texts will inevitably lead to more proposals for AI’s use in the semantic sentiment analysis (SSA) of textual sources. We argue that instead of focusing solely on debating the merits of automated versus manual processing and analysis of texts, it is critical to also rethink our underlying storage and representation formats. Specifically, we argue that accommodating multivariate metadata is an example of how underlying data storage infrastructure can reshape the ethical debate surrounding the use of such algorithms. A system that employs automated analysis may typically require manual intervention to assess the quality of its output, or demand that we select between multiple competing NLP algorithms. With storage that accommodates multiple variants, settling on whichever algorithm or ensemble produces the best results is a decision that need not be made a priori at all.

OBJECTIVES: An underlying storage and representation system that allows for the existence and evaluation of multiple variants of the same source data, while maintaining attribution to the individual sources of each variant, would be an example of a much-needed enhancement to existing storage technologies, especially in anticipation of the proliferation of AI semantic analysis technologies.

METHODS: To this end, we take the view of AI in SSA as a sociotechnical system, and describe a possible novel solution that would allow for safer cyber curation. This can be done by allowing multiple different annotations to coexist within a single publishing ecosystem, whether those different annotations are the result of competing algorithmic models or of varying degrees of human intervention.

RESULTS: We discuss the feasibility of such a scheme, using our own infrastructure model (MultiVerse) as an illustrative example of such a system, and analyse the ethical implications.

CONCLUSION: Considering an underlying storage and representation system that allows for the existence and evaluation of multiple variants of the same source data, while maintaining attribution to the individual sources of each variant within a single publishing ecosystem, helps mitigate risks of automation and enhances AI (semantic) explainability.

Published in EAI Endorsed Transactions on Creative Technologies

ISSN: 2409-9708 (Online)
Publisher: European Alliance for Innovation (EAI)
Country of publisher: Belgium
LCC subjects: Technology
Website: https://eudl.eu/journal/ct
