The difference between data and content may seem subtle. But if you analyze it carefully, you’ll notice that each possesses distinct characteristics that set the two worlds apart. In a short video interview clip from Informational Development World, Rahel Anne Bailie explains the difference between data and content in considerable detail. Our aim in this post is to tease out the main points of her interview, to explore the difference further, and to consider what it means for the content professional.

The difference between data and content

Bailie opens the discussion by laying out the fundamental distinction between data and content. She begins, “The difference between data and content is…context.” She continues with an example, “If I give you the number 12, it’s easy to process the number 12 as a piece of data…it goes from system to system to system, and it has no context to it.” In this brief set of statements, Bailie sums up everything we need to know about the difference between data and content. Content is contextualized data. Context situates data within a system of values, concepts, and utterances. Let’s unpack these ideas further, starting with data.

What is data?

Data is a raw fact or value: a “given.” It is a unit of information that is self-referencing and circular (it can only lead back to itself). Take Bailie’s example, the number 12. It means nothing other than the number 12, a numerical value. But, as Bailie states, data can move “from system to system.” This brings up an interesting point. Data may not have a context other than its own self-reference, but it possesses two important attributes that allow it to move through multiple systems: connectivity and transformability.

Connectivity: The number 12 has the capacity to connect to signs, concepts, and values, all of which function as variables: 12 dollars, 12 minutes, 12 children, 12 feet, 12 sets of 12, etc. Data’s potential for connectivity is what allows it to move across or combine with multiple systems.

Transformability: Each time data makes a connection, it also qualitatively transforms. The U.S. dollar sign converts a plain numerical value (12) into a domestic monetary value: $12. The phrase “$12 for a pair of shoes” plugs that monetary value into a system of comparative economic value with regard to quality (is it a true bargain or just a cheap pair of shoes?), price competitiveness, and affordability. At this point the number 12 can be considered “content.” If you add a time value to our example, as in “$12 for a pair of shoes, sale ending January 31st,” then you add not only a calendarized value to the current set but also an implicit call to action and a sense of urgency.

Caveat: The examples above seem to characterize data as something that lacks information (other than self-reference), and content as something that contains adequate information. This can be misleading. Neither the quantity nor the quality of information characterizes data or content. Information-rich content can just as easily become data, as we’ll see below.
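For readers who think in code, the shoe example can be sketched as a small Python class. This is only an illustration of the idea, not anything Bailie proposes: the `Offer` class, its field names, and the year 2025 are all assumptions made up for the sketch. Each optional field is one more connection the number 12 makes, and each connection changes what the rendered text says.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """Hypothetical container: a bare number plus optional context."""
    amount: int                        # the raw datum, e.g. 12
    currency: Optional[str] = None     # connects the number to a monetary system
    item: Optional[str] = None         # connects the price to something of value
    sale_ends: Optional[date] = None   # adds a time value and a sense of urgency

    def render(self) -> str:
        """Return the text a reader would see at this level of context."""
        if self.currency is None:
            return str(self.amount)    # no context: just the number
        text = f"{self.currency}{self.amount}"
        if self.item:
            text += f" for {self.item}"
        if self.sale_ends:
            text += f", sale ending {self.sale_ends:%B} {self.sale_ends.day}"
        return text


# The same datum at four levels of context (the year is an arbitrary assumption).
stages = [
    Offer(12),
    Offer(12, "$"),
    Offer(12, "$", "a pair of shoes"),
    Offer(12, "$", "a pair of shoes", date(2025, 1, 31)),
]
for stage in stages:
    print(stage.render())
```

The value of `amount` never changes across the four stages. Only the surrounding fields do, which is the point of the example: the number stays the same while its context grows.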

Quantity or quality of information is separate from context

What makes data “data” has little to do with the quantity of information it holds. For example, the number twelve by itself means very little. But in response to the question, “How many children are on the bus?” the answer “twelve” implicitly says a lot more. It can now be considered content. It is the preceding question, however, that contextualizes the answer and transforms it into content; the statement “twelve” means very little without the question.

Let’s look at the reverse scenario. Suppose that you come across a website containing lots of information-rich content. You see a number of headings (fastest order execution; risk disclaimer; voted #1 for customer service; menu of technological features; pricing menu; service menu; franchise opportunities, etc.), all of which are followed by clear and detailed descriptions. The quantity and quality of information may be adequate for each segment of content, but the content itself is still subject to the influence of context.

Let’s assume that perception plays a variable role in shifting or displacing context. The web content creator/manager views each piece of content as a set of data to be categorized or processed in multiple ways. If the site is poorly organized, the customer may end up experiencing the content as a jumbled mix of data.

And here’s another scenario: if every company within a given industry offers the same services, as reflected by virtually the same content, then regardless of the content’s quality or presentation, customers might interpret the content from a data perspective: the same old and not so unique UVP (unique value proposition) 1, UVP 2, UVP 3, etc., offered by almost every company across the entire industry.

Going back to Bailie’s statement, the difference between data and content is context. What is context? And what actually happens to data as it undergoes contextualization?

What is context?

A simple Google search will give you the following definitions:

  1. The circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed. (emphasis mine)
  2. The parts of something written or spoken that immediately precede and follow a word or passage and clarify its meaning. (emphasis mine)

Definition 1: Circumstances that form a setting

Starting with the first definition, let’s see how “setting” plays out in the data-to-content conversion process. We start with the number 12, a piece of data. We then give it a context: “$12 for a pair of shoes at ABC shoe store.” Now it is a short piece of content. What just happened?

Data is a raw and self-referencing statement or value. By contextualizing it, we are essentially plugging data into a network of concepts that have a directionality of thought and that exist within a wider field of related concepts. The numerical value (12) is plugged into a system of monetary concepts (specifically USD, not British pounds, in which case the same number would represent a different amount of money because of the exchange rate). This monetary value is then plugged into a wider field of concepts including economic value, quality, and affordability, as well as seller, brand, location, methods of purchase, etc.

This is also why content, unlike data, cannot easily go from system to system. As Bailie points out, content is much more complex and nuanced. Unlike data, in which there are only simple values to connect, content is plugged into a whole network of concepts that may not so easily connect with other concepts or perspectives. For instance, we can easily translate the number twelve (as a piece of data) into another language. But if the number twelve were part of a larger content piece, say, a marketing slogan rich with idiomatic nuance, then the message contained within the content might easily get lost in translation (and the number 12 might not make as much sense at that point).

Definition 2: A part of something written or spoken

When data is contextualized and transformed into content, something else happens. The number 12, which says very little by itself, now says a lot when embedded into the phrase “$12 for a pair of shoes at ABC shoe store.” It becomes an utterance; it sets into motion a communicative gesture (or it participates in an existing one). When the number 12 is stated in response to the question, “how many children are on the bus?” it participates in an utterance set forth by the question; it joins the exchange of ideas.

Conclusion – Why this distinction matters

It isn’t uncommon to come across instances where data and content are treated as the same thing rather than as separate and distinguishable elements. You might have come across content structured so poorly that it read like data. In other cases you might have noticed how important data got lost or obscured within its supporting content. From a content management perspective, you might have come across situations in which data lacked the proper variables or taxonomical structures to produce content fit for processing or automation. Recognizing the difference between data and content can help provide a clearer perspective in analyzing, categorizing, and producing content. This is especially important given that content is both a critical business asset and an extension of the user experience. Content’s capacity to contain and transmit information is secondary to its potential for eliciting and initiating action.