Last updated Sep 15, 2024 | Originally published Jan 18, 2024

# Data, Information, Knowledge, Understanding, and Wisdom

# Why does data matter?

Before we discuss what data is and how it is used, we should strive to understand why data matters.

The Data, Information, Knowledge, Understanding, and Wisdom (DIKUW) hierarchy is a simple mental model for appreciating the role of data in management and innovation. The hierarchy was first formally developed by Russ Ackoff, a forerunner in the study of management information systems. Ackoff described it in a 1988 address to the International Society for General Systems Research, and the address was reproduced by the Journal of Applied Systems Analysis in 1989 (Ackoff, 1989). There have since been many interpretations and iterations of the hierarchy (usually in the form of “DIKW”). In this article, I offer my own, aiming to present the hierarchy as pragmatically as possible. You will see how the DIKUW hierarchy is a fundamental paradigm underpinning how we manage and innovate with data.

There are actually six levels to the hierarchy. It begins with phenomena: things that exist and are happening in the world. We turn phenomena into data by observing and capturing that observation. We turn data into information by adding context to it. We turn information into knowledge by applying the information to something. We turn knowledge into understanding by critiquing the knowledge: diagnosing problems with what we know and prescribing what we need to learn. Finally, we turn understanding into wisdom by formalizing our understandings in the form of theories. In the following sections, we discuss each of these levels in more detail.

# Phenomena

Before we can have data, we must have something to represent with data. That something is phenomena: things that are happening in the world. In fact, the world itself is made of phenomena. Phenomena are the material, concrete stuff affecting and making up the world.¹ When we observe phenomena, they become something we can think about: data. In other words, underlying all wisdom, understanding, knowledge, information, and data, are the phenomena we are trying to observe and capture.

To make this more concrete, consider buying a coffee at a café. Almost certainly, that café has an information system of some kind supporting its business. When you purchase that coffee, the information system models at least one kind of phenomenon: money. So what kind of data may be generated about that transaction? The value of the sale, the proportion going to taxes, and how the value of the sale adds to the café’s revenues for the day. If you have a membership or rewards card for the café, the information system also models a different phenomenon: you. It registers that you (or at least, someone who has your rewards card, or who knows your rewards number) made a purchase. It probably knows what that purchase was, and associates the kind of coffee you bought with its model of you. All of these phenomena are observed and potentially captured by the café.

Reflect: what other kinds of phenomena are interacting with the café? (I.e., what else might people do while they are there?) Which of these might be interesting for the business of the café?

# Data

When we observe something about the world, and especially when we record that observation (in the form of a note, or an image, or a value in a spreadsheet or database), we capture phenomena in the form of data. When a café’s information system observes and captures records of you and your purchases, it is creating data to represent those phenomena. Data helps us ask the simplest of questions about phenomena, even if we didn’t observe them directly. What has happened? When and where did it happen? Who was involved? For instance, an analyst working for the café can use the records created by its information system to see how many coffees were purchased in the past month without having to be present to observe each purchase themselves. This is a simple and obvious idea, but it is incredibly powerful.

You might note at this point that this observing and capturing need not depend on a computer system. Indeed, workers at the café could simply write down each purchase, its value, and the number on your rewards card. An analyst later could conceivably look back through written ledgers to re-observe your purchase. What computers do is make it much easier for someone to consume data: to use the data to do something.

After all, as Palmer (2006) observes, “data is the new oil”. Technological and methodological innovations of the past several decades have turned data into an invaluable resource that many prize as the ultimate modern good. However, Palmer also notes data’s important caveat: it is not useful by itself. Just like crude oil, data needs to be refined in order to be useful. Computers help us to query (that is, to look through) and refine data in order to use it.

# Information

When we first review data, we add context. We relate pieces of data with one another, including metadata (data about data, such as how the data was captured) and our own perspective (such as why we’re reviewing the data to begin with, our assumptions about the data, or other things that we’re relating the data to in our minds).² In doing so, we create information. Information is data imbued with meaning (Bellinger et al., 2004) — and information is useful.

Information helps us ask basic questions about the present, current world. What is happening? When and where does it happen? Who is involved? Most importantly: what happens if we do this or that? As a result, we can combine and compare information to produce patterns (if that happens, then this happens) and to find outliers.

Consider the café example again. A simple question the café might have is “what products are the least popular?” Since the café has been recording each purchase made, this question is answered with a simple query of the purchases data. An analyst using this data would be able to quickly return a ranked ordering of all products by volume. This information could be very useful for a café looking to simplify its operations or cut down on costs.
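As a minimal sketch of such a query, assume the purchases are available as simple (product, price) records. The sample records and the `rank_by_volume` helper below are hypothetical illustrations, not part of any real café system.

```python
from collections import Counter

# Hypothetical purchase records: (product, price paid).
purchases = [
    ("latte", 4.95),
    ("drip coffee", 2.50),
    ("latte", 4.95),
    ("milk steamer", 3.00),
    ("drip coffee", 2.50),
    ("latte", 4.95),
]

def rank_by_volume(records):
    """Return (product, count) pairs ordered from least to most purchased."""
    counts = Counter(product for product, _price in records)
    # Sort ascending by count; ties are broken alphabetically by product name.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

ranking = rank_by_volume(purchases)
print(ranking)
```

The raw records are data; the ranked list, read with the question of popularity in mind, is information.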

# Knowledge

Knowledge is produced when we apply information, especially when we develop ways to apply information systematically (i.e., when we can create instructions for others to act on similar information in similar ways to produce similar results). For instance, when we find a pattern (if that happens, then this happens) we can say (if that happens again, we should expect this, and therefore do [something]). Or, if we do this, then that will happen. Knowledge therefore helps us to start asking the most important kinds of questions: questions about futures.³ What will happen? When and where will it happen? Who will be involved? What if we do this or that?

With a ranked list of product purchases by volume, it would be easy for a café manager looking to cut costs to identify which products to remove from the menu. They can ask the question, “What would happen if we removed the three least popular products from our product lines?” and answer it with “It would have a negligible effect on revenues.”
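One rough way to check such an answer is to compute how much revenue the least popular products actually contribute. The sketch below uses made-up records and a hypothetical `revenue_share_of_least_popular` helper, and it naively assumes that removing a product changes nothing else about what customers buy.

```python
from collections import defaultdict

# Hypothetical purchase records: (product, price paid).
purchases = (
    [("latte", 5.00)] * 5
    + [("drip coffee", 2.50)] * 4
    + [("tea", 3.00), ("milk steamer", 2.00)]
)

def revenue_share_of_least_popular(records, n):
    """Return the n least-purchased products and their share of total revenue."""
    counts = defaultdict(int)
    revenue = defaultdict(float)
    for product, price in records:
        counts[product] += 1
        revenue[product] += price
    # Least popular first; ties are broken alphabetically by product name.
    least = sorted(counts, key=lambda p: (counts[p], p))[:n]
    total = sum(revenue.values())
    return least, sum(revenue[p] for p in least) / total

least, share = revenue_share_of_least_popular(purchases, 2)
print(least, f"{share:.1%} of revenue")
```

If the share is small, the prediction “removing these products would have a negligible effect on revenues” looks plausible, at least under the stated assumption.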

# Understanding

Understanding is produced when we develop knowledge about knowledge. As we interact with knowledge, we begin to detect gaps, problems, and errors. For instance, we may have expected this to happen, but that happened instead. Understanding is therefore diagnostic: it allows us to detect, identify, and investigate problems and errors. It is also prescriptive: it allows us to identify, specify, and theorize about what we now need to know (Ackoff, 1989).
Through understanding, we develop the best kinds of questions to ask about a given problem and the most valuable types of tests to try on a new idea before we implement it. We begin to know what we don’t know, and what we need to know.

Understanding is therefore the difference between learning and memorizing. It is simple enough to memorize a set of instructions (i.e., to memorize knowledge). Some instructions are simple. How to tie a set of shoelaces is a good example. To solve the problem of tying your shoelaces, it is sufficient to have the knowledge of how to tie a set of shoelaces. You can then apply those exact same instructions to every set of laced shoes you will ever encounter. No further knowledge is necessary. Other instructions, however, are complicated: consider launching a rocket to geostationary orbit. The instructions to do so were quite tricky to figure out. As a civilization, we have gotten fairly good at launching rockets, but even now, some of our rocket launches fail. In these complicated problems, memorization is insufficient. This is where the diagnostic and prescriptive role of understanding becomes important. To solve complicated problems, we must not only apply previously memorized knowledge, but also learn how to seek new knowledge in order to make progress. Thus, we must be able to detect, identify, and investigate gaps and errors in our knowledge, and to specify and theorize about what kinds of knowledge we need to obtain in order to resolve those gaps and errors.

To illustrate the value of understanding, return again to the hypothetical café. You may have thought that there are other kinds of questions to ask: who purchases those products? When, and why? Perhaps the least frequently purchased product is a milk steamer, bought by parents of young kids who are also buying a number of other products in each transaction. If the steamer were removed from the menu, maybe those parents would take all of their other purchases elsewhere. Thus, a system tracking not only products purchased but also who purchases them may be used by an analyst to generate higher-quality knowledge from more information. That analyst, however, needs to have an understanding of the phenomena of the domain (that is, the relationship between different kinds of customers and different kinds of products) and must use that understanding to critically question the knowledge they are seeking from the café’s information system.
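To make this kind of critical question concrete, the sketch below compares a product's own revenue with the revenue of every transaction that includes it. The transactions and the `revenue_at_risk` helper are hypothetical, and the worst case it reports (losing the whole basket) is an assumption about customer behaviour, not a measured result.

```python
# Hypothetical transactions: each is a list of (product, price) items bought together.
transactions = [
    [("latte", 5.00)],
    [("milk steamer", 2.00), ("latte", 5.00), ("muffin", 3.00), ("cookie", 2.00)],
    [("drip coffee", 2.50)],
    [("milk steamer", 2.00), ("drip coffee", 2.50), ("croissant", 3.50)],
]

def revenue_at_risk(baskets, product):
    """Return (revenue from the product itself, revenue of all baskets containing it)."""
    direct = 0.0
    basket_total = 0.0
    for basket in baskets:
        if any(name == product for name, _price in basket):
            direct += sum(price for name, price in basket if name == product)
            basket_total += sum(price for _name, price in basket)
    return direct, basket_total

direct, basket_total = revenue_at_risk(transactions, "milk steamer")
print(f"direct: {direct:.2f}, whole baskets: {basket_total:.2f}")
```

In this made-up data the steamer itself earns little, but the baskets it appears in earn several times more, which is exactly the gap that volume-only knowledge would miss.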

# Wisdom

Ackoff (1989) notes that data, information, knowledge, and understanding help us increase efficiency. That is, given a particular predetermined goal, we use data, information, knowledge, and understanding to attain that goal, and to attain it more predictably and with fewer resources (time, money, or whatever). In other words, these are operational layers. With more phenomena, more data, more information, more knowledge, and more understanding, we can do more, better, faster, and more easily. However, how do we judge what we should be doing and how we should be doing it? Ackoff (1989) argues that effective judgment comes from wisdom: the development and application of values. He therefore places wisdom at the “top” of the DIKUW hierarchy. However, Ackoff (1989) does not describe where wisdom comes from nor how it is developed with respect to phenomena, data, information, knowledge, or understanding. For that, we turn to the discipline of design.

To design is to designate: to mark or give status to something in order to decide what is important about it. In other words, deciding upon values is a design decision. Liz Sanders, a scholar of design and design research, argues that the transformation of data into information, knowledge, and understanding is an analytical process: it involves investigating and breaking down the things that we’ve observed (Sanders, 2015). Sanders explains that wisdom guides the analysis of data through the mechanism of theory: our higher-level explanations, predictions, and prescriptions of how the phenomena we are interested in work (Gregor, 2006). For instance, our theory of our café’s operations might include ideas like “more product lines mean more effort involved in making them” and “fewer purchases of a product indicate that a product is not a valuable offering”. Similarly, when we organize and make sense of data, information, knowledge, and understanding, we synthesize and generate new theory, and therefore build up our wisdom about the phenomena we are interested in. When the café’s managers apply the ideas mentioned above to the information that milk steamers are low-frequency purchases, they may conclude with a prescription: cut milk steamers from the menu in order to reduce operational complexity and cut costs while having minimal impact on the café’s ability to provide value to its customers. That is where wisdom comes from: building up and formalizing our understandings about phenomena into theories that explain, predict, and prescribe for those phenomena.

# So why does data matter?

Crucially, phenomena happen whether or not we observe them. A coffee company’s regular customers may start feeling less satisfied about their newly purchased beans whether or not the company is seeking feedback on its latest roast. Similarly, data means nothing unless we add context to it. Worsening reviews of the coffee company’s latest roast become useful information when we combine them with the realization that something about the roasting process has changed. Likewise, information is useless unless we apply it. If the coffee company’s customers are liking the latest roast less, maybe the blend of coffee beans needs further fine-tuning. However, our knowledge might be wrong. Maybe it wasn’t the bean blend, but the roasting process, or a competitor’s latest light roast, or a change of season (cold brew, anyone?). We must understand the problems we’re dealing with as deeply as possible, so that we can recognize whether they are even the right problems to solve. After all, if the coffee company’s regular customers like the latest roast less … but the change in approach is bringing in many new customers, maybe it is the right thing to do.

The truth is that data matters, but only if we are collecting the right data. And even then, data doesn’t matter — not without information, knowledge, understanding, and wisdom.

That, however, leads us to a more nuanced question: how do we know if we have the right data? In fact, in many contexts, it’s not a case of “right data” or “wrong data.” Instead, we think of data quality as a spectrum. Moreover, there is no one way of thinking about data quality. Therefore, we must consider different data qualities, or dimensions of data quality, and make judgements about which ones matter the most for whatever we’re trying to achieve.

# From data quality to data qualities

Wand and Wang (1996) surveyed a set of notable data quality dimensions, finding that the five most cited dimensions were accuracy, reliability, timeliness, relevance, and completeness. Accuracy is the degree to which data precisely captures the phenomena it is supposed to represent. For example, if a transaction processing system were to record a $4.95 transaction as $5, it would be imprecise. Reliability is related to accuracy: it refers to the degree to which data correctly represents the phenomena it was meant to capture. The $5 transaction previously mentioned is not necessarily incorrect, even if it is imprecise. Timeliness refers to the degree to which captured data is up to date. A count of a population (say, the number of polar bears living in the wild) is out of date as soon as a polar bear is born or dies. Relevance is relative to what the data is used for. A count of the population of polar bears living in captivity is largely irrelevant if a data user is wondering about the health of wild polar bear populations. Finally, completeness is the degree to which captured data fully represents a phenomenon. For instance, a transaction processing system may only capture money in and money out, but a more complete record of each transaction would include what was bought or sold, who bought it, the transaction type, and so on.
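As a rough illustration, three of these dimensions can be turned into simple measurements on a single record. The field names, the required-field list, and the helper functions below are all hypothetical; Wand and Wang do not prescribe these particular metrics.

```python
from datetime import date

# Hypothetical schema: the fields a "complete" transaction record would carry.
REQUIRED_FIELDS = ("amount", "product", "customer", "recorded_on")

def completeness(record):
    """Fraction of required fields that are present and not None."""
    present = sum(1 for field in REQUIRED_FIELDS if record.get(field) is not None)
    return present / len(REQUIRED_FIELDS)

def rounding_error(recorded_amount, true_amount):
    """Absolute gap between the recorded value and the value that actually occurred."""
    return abs(recorded_amount - true_amount)

def age_in_days(record, today):
    """How many days old the record is, as a crude timeliness measure."""
    return (today - record["recorded_on"]).days

# A $4.95 sale recorded as $5, with no customer attached.
record = {
    "amount": 5.00,
    "product": "latte",
    "customer": None,
    "recorded_on": date(2024, 1, 10),
}

print(completeness(record))
print(round(rounding_error(record["amount"], 4.95), 2))
print(age_in_days(record, date(2024, 1, 18)))
```

Relevance is deliberately left out of the sketch: it depends on what the data is being used for, so it cannot be computed from the record alone.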

However, while these five dimensions are perhaps generally the most important for information systems design and use, Mahanti (2019) has demonstrated that there may be many more dimensions that matter to a given data producer or data consumer. She has identified a total of 30 dimensions of data quality (Mahanti, 2019, p. 7) that are worth considering for any information system. By definition, anyone using an information system will be collecting data and trying to use that data to produce information, knowledge, understanding, and possibly wisdom. So, they should ask themselves: which of these dimensions are the most important for my project? Then: how do I know if this data is good enough for my purposes? Finally: how might I make sure I improve my data on these dimensions in the future?

# What about ChatGPT?

The latest advancements in data capabilities — namely, ChatGPT and similar generative AI tools — are excellent examples underscoring the significance of the DIKUW hierarchy. These tools are perhaps the greatest example we have ever had of the power of data. Ask ChatGPT, Microsoft Copilot/Bing, or any of their competitors a question and you will generally get a sophisticated, conversational answer in response.

These tools are probabilistic generators. This is how they work: given some input (a “prompt”), they return a likely response to that input, based on the patterns found in their training data. Their responses are injected with a little bit of randomness (controlled by a setting called “temperature,” which makes the output more or less random) so that they rarely offer the exact same response to the exact same input. These tools are able to achieve this functionality because they are fundamentally built of data. Basically, researchers pointed very powerful computers at the vast amounts of data that now exist on the Internet and trained those computers to learn the patterns found in that data.
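The role of temperature can be sketched with a toy example. Assume the generator has already assigned a raw score to each candidate next word; the words and scores below are made up, and real tools produce such scores with a large neural network over a very large vocabulary. Dividing the scores by the temperature before converting them to probabilities is a common way to make sampling more or less random.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in scaled]
    total = sum(weights)
    return [weight / total for weight in weights]

def sample_next(options, scores, temperature, rng=random):
    """Pick one option at random, weighted by the temperature-adjusted probabilities."""
    probabilities = softmax_with_temperature(scores, temperature)
    return rng.choices(options, weights=probabilities, k=1)[0]

options = ["coffee", "tea", "juice"]
scores = [2.0, 1.0, 0.1]  # made-up scores, for illustration only

cold = softmax_with_temperature(scores, 0.2)  # nearly always picks the top option
hot = softmax_with_temperature(scores, 2.0)   # spreads probability more evenly
print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(sample_next(options, scores, temperature=1.0))
```

At a low temperature the top-scoring option dominates, so the output is close to repeatable; at a high temperature the less likely options are chosen more often.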

Given a prompt, these tools can do wondrous things with data, transforming it into information and knowledge. However, these tools need understanding in the form of well-constructed prompts in order to produce the most useful information and knowledge. Moreover, they must be paired with a wise human in order to produce useful output, as they have no real concept of their own gaps and errors. Uncritical use of their output has had disastrous consequences for foolish (or at least unaware) users, because the tools can completely make up falsehoods while seeming absolutely confident that their output is correct. (Such fabrications are called “hallucinations.”)

This is where the DIKUW hierarchy must be re-emphasized: the quality of output of these probabilistic generators varies greatly depending on the quality of input. Thus, users of ChatGPT and similar tools have learned that there are ways to artfully craft or even engineer prompts for these tools. We are now seeing the emergence of markets for high-quality prompts (e.g., https://promptbase.com/) and training for prompt engineering (e.g., https://github.blog/2023-07-17-prompt-engineering-guide-generative-ai-llms/). Working with “AI” has therefore become a modern skill that may grow in relevance and value as generative tools become more prominent.

¹ Describing phenomena as material and concrete does not mean that conventionally “subjective” things, such as someone’s opinion about something, are not phenomena. Indeed, once someone has formed such an opinion, they have materialized it into a phenomenon that can be objectively observed (such as by a listener hearing the opinion) and captured as data.

² This is particularly important in serendipity, as serendipitous discoveries are the result of both an observation and a valuable association of that observation with some other idea.

³ Bellinger, Castro, and Mills (2004) interpret Ackoff as saying that wisdom is the only dimension of the hierarchy concerned with the future (“Only the fifth category, wisdom, deals with the future because it incorporates vision and design”, para. 4). I disagree, however: Ackoff writes about how knowledge describes instruction, and instruction requires prediction, and therefore some expectation about how current actions will influence future results; the same holds for understanding. Moreover, Bellinger et al. are inconsistent in this interpretation, as they later describe similar ideas about knowledge and prediction.
