Rethinking the DBMS

2009/02/24

Defining relation value and relation variable

Filed under: Theory — Tags: — Ben Samuel @ 15:49

This entry is to lay a little groundwork as many readers aren’t familiar with the relational algebra, or the calculus. (Incidentally, I recently got a copy of Logic and Databases, where Date goes into exactly what the two terms mean; more to come when I’ve finished reading it.) Even if you are, there are a lot of wrong ideas about some of the basics and anyone could be stuck with them.

  • A relation value, in the proper context, is a structured assertion about the world.

Context is important: if someone asks, “How many fingers do you have?” then the number 5, an integer value, is an assertion about the world. A value has meaning only inasmuch as it is placed in context.

As an example of what it means to place something in context, consider a computer program. By itself it is just a long stream of symbols, but if you run it in a computer it actually does something. This sonds like common sense, but in reality most people don’t understand this and often succumb to magical thinking. You see this often in the phrase “there ought to be a law” which carries the implicit assumption that merely writing a proscription against something will make it stop.

Structure means that you can derive new values from a value in a reproducible fashion. In particular, I’m talking about deterministic structure, such as the fact that you can add integers to get a certain sum, or decompose them to a product of primes. Nondeterministic structure would include statistical operations. For example, say you have a plain text representation of a book, written in English. You could analyze the frequency of various letters or letter combinations and assert with a high degree of confidence that the language is English, but there’s no mathematical operation that can say, for certain, that the language is X and you can’t take an argument from the book and algorithmically determine a counter-argument. The structure is there, the book definitely is written in English, it has sentences, ideas, themes, &c, but you’re forced to compare it indirectly with a mathematical model.

  • A relation variable is a portion of the larger context adding meaning to the value it is assigned.

The structure of a relation variable is typically represented as a table with labeled columns and unlabeled rows. The problem with any representation is that it tends to add stuff that shouldn’t be there. Here’s an example:

simple-table.png

This is a pretty typical example of how people record information. There are some minor issues, such as the fact that this representation suggests ordered columns and rows, but neither are ordered in a relation value. The biggest issue that this representation has added that simply doesn’t exist in an individual relation variable is meaning. Even if you weren’t sure what a sign-in sheet was, you could plainly see a list of people signing their name in chronological order all on the same day, during business hours.

In a typical DBMS, you might have a comment that explained roughly what your table is for, sometimes euphemistically known as a natural language predicate, but that’s just a comment to you. All you can really attach to a variable is a set of constraints, saying what values aren’t allowed. Even the predicate logic view doesn’t get you very far. The idea is that each row is a proposition. A proposition is saying that p(a, b, c, d) is true. And the predicate it exists in is just the pattern of the proposition, it adds no meaning.

The reason I bring all this up is that plenty of people did buy into the idea that adding labels and comments can add meaning; the latest example of attempting this is XML. Mathematically, XML gets you nothing you didn’t already have with symbolic expressions, which date back at least to the ‘60s. (This isn’t to say that the technologies that XML replaced weren’t worse.)

Conclusion What I’m going to describe in later posts is a minimalist representation for relation values, based on a small addition to the relational algebra. We’ll explore some basic relation operators and derive a standard form.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: