Last updated Aug 16, 2024 | Originally published Aug 15, 2024

The notion of data and its quality dimensions - Fox, Levitin, Redman - 1994

Fox, C., Levitin, A., & Redman, T. (1994). The notion of data and its quality dimensions. Information Processing & Management, 30(1), 9-19. 10.1016/0306-4573(94)90020-5

Fox, Levitin, and Redman presented one of the earliest fundamental conceptualizations of data and data quality. They argued that existing definitions failed on either linguistic or usefulness criteria, and instead defined data as follows: data is any collection of data items (each one a datum) that model the real world in terms of its entities, their attributes, and the values of those attributes, represented and recorded in some medium.

They draw on Tsichritzis and Lochovsky’s *Data Models* (Prentice-Hall, 1982), which defines a datum “as a triple $<e, a, v>$ where the value $v$ is selected from the domain of the attribute $a$ to represent that attribute’s value for the entity $e$” [@Notion-Data-Its-Quality-Dimensions-1994-Fox, p. 12, paragraph 6].
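To make the triple concrete, here is a minimal Python sketch of a datum and a check that its value comes from the attribute's domain. The names `Datum`, `DOMAINS`, and `in_domain`, as well as the example entity and attributes, are illustrative assumptions and do not come from the paper.

```python
from typing import Any, NamedTuple


class Datum(NamedTuple):
    """A datum as an <entity, attribute, value> triple."""
    entity: str
    attribute: str
    value: Any


# Hypothetical attribute domains; the paper does not prescribe any.
DOMAINS = {
    "age": range(0, 130),
    "status": {"active", "inactive"},
}


def in_domain(d: Datum) -> bool:
    """Return True if the value is drawn from the attribute's domain."""
    domain = DOMAINS.get(d.attribute)
    return domain is not None and d.value in domain


# "Data" in the paper's sense: a collection of such triples.
data = [
    Datum("employee-17", "age", 42),
    Datum("employee-17", "status", "active"),
]
```

In this sketch an attribute with no registered domain fails the check, which is one possible design choice rather than something the definition requires.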

Fox, Levitin, and Redman [-@Notion-Data-Its-Quality-Dimensions-1994-Fox] note that the definition allows us to examine three sets of quality issues: model quality, data quality, and representation/recording quality. The last is mostly the concern of database design and maintenance, but the first two may apply to my work on serendipity, as adopting this perspective allows us to separate dimensions of data quality from dimensions of model quality. Moreover, as shown in their Table 2, reproduced below, it also allows us to separate measures of datum quality from measures of database quality.

Table 2. Quality dimensions for data values. Reproduced from [@Notion-Data-Its-Quality-Dimensions-1994-Fox, p. 17].

| Dimensions | Target description | Typical datum measure | Typical database measure | Related notions |
| --- | --- | --- | --- | --- |
| Accuracy | Accurate or correct | Size of error | Fraction incorrect | Precision, reliability |
| Currentness | Current | How far out-of-date | Fraction out-of-date | Age, timeliness |
| Completeness | Complete | Y/N | Fraction incomplete | Duplication |
| Consistency | Consistent | Y/N | Fraction inconsistent | Integrity |
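
The relationship between the datum-level and database-level measures in Table 2 can be sketched in a few lines of Python: a database measure is the fraction of data items that fail a datum-level check. This is only an illustration under stated assumptions. The helper `fraction_failing`, the sample `recorded` and `reference` values, and the simplification of accuracy to a yes/no match (the table's datum measure is the size of the error) are all hypothetical and not taken from the paper.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def fraction_failing(items: Iterable[T], passes: Callable[[T], bool]) -> float:
    """Database-level measure: share of items failing a datum-level check."""
    items = list(items)
    if not items:
        return 0.0  # assumption: an empty database has no failing data
    failures = sum(1 for item in items if not passes(item))
    return failures / len(items)


# Hypothetical recorded values paired with trusted reference values.
recorded = {"age": 42, "status": "active", "city": "Bostn", "zip": None}
reference = {"age": 42, "status": "active", "city": "Boston", "zip": "02108"}

# Completeness (datum measure Y/N): is a value present at all?
fraction_incomplete = fraction_failing(
    recorded.items(), lambda kv: kv[1] is not None
)

# Accuracy, simplified here to Y/N: does the value match the reference?
fraction_incorrect = fraction_failing(
    recorded.items(), lambda kv: kv[1] == reference[kv[0]]
)
```

With these sample values, one of four items is missing and two of four disagree with the reference, so the two fractions come out differently even though they are computed over the same data.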