How To DB

by Jay F. Davis,

So, you've decided that the spreadsheet you've been using to keep track of your company's business information is just not cutting it anymore. And a friend says that you need to ditch that spreadsheet and get a database. You don't have to rush out and hire a database expert. In fact, there are some common-sense guidelines you can follow to create an effective and growable database using any modern "desktop" database such as Microsoft Access or Filemaker Pro.

First, don't rush the process. Take your time to design a good database structure. The thinking you put in up front will save you significant time later.

Before beginning, let's define a few key terms that will be useful in the discussion below.

  • Table: A collection of data of a particular type. For example, you might have tables called Contacts, Addresses, Customers and Orders.
  • Column: A particular type of data inside a table. For example, a Mailing List table might have columns for name, address, city, state and zip code. Columns are typically defined so that you can easily search for a particular value in a particular column. For example, for a mailing list you might need to group labels by city and zip code. Thus, city and zip code should be in separate columns. Note: Columns are also called fields.
  • Row: The combined information for all the columns for one item in a table. For example, all the information for one name on a mailing list would be a row (the name, address, city, state and zip code for one person). Note: Rows are also called records.
  • Primary Key: A primary key is the information in a table which allows you to uniquely identify a particular row in a table. For example, in our mailing list table, we might add another column called "ID" which contains a number which increments each time we add a new row. The first row would have an ID value of 1, the second 2 and so on. This ID column would be the primary key for the table because it would identify a particular row in the table.
  • Unique Key: A unique key is defined by establishing one or more columns in a table which cannot be repeated in any other row. In our mailing list example, we might say we have a unique key by combining the name and address columns. That is, no row can repeat the same name and address that is in another row. (Yes, this could also be the primary key.)

With these basics, we can now follow some guidelines to build a usable database.

The Guidelines:

1. Don't duplicate data. A correctly designed database should force you to organize your data in order to avoid duplication. Fortunately, all modern databases allow you to assign primary keys and unique keys which prevent duplicate data from being entered into a table. Figure out how to set up primary and unique keys.

2. Standardize abbreviations. This will help you prevent duplicate data. In our mailing list example, even if you define a unique key on the combination of name and address, the duplicate will not be caught if you enter the same address twice with "Street" spelled out in one case and "St." abbreviated in the other case. The same goes for "Mister" and "Mr." or anything that can be abbreviated. Be sure to record and distribute your abbreviation rules so that others in your organization can follow them too.

Tip: When I am importing already existing data into a newly created database, I start with a work file in which I do every logical search and replace that I can think of. That is, I replace "Street" with "St." and "Doctor" with "Dr.", or vice-versa. I also painstakingly scan the data for other oddities that I can filter out with search and replace before I put the data into my pristine database (another good example is multiple spaces--get rid of them!). My goal is to make duplicate addresses (as one example) show up as duplicates when I load to the database. Then the database will eliminate the duplicates!

3. Take care to logically organize your data. Suppose you want to keep track of multiple people at each address in your mailing list. Mr. & Mrs. John and Debbie Jones live together at a particular address. Sometimes you want to send a mailing only to John and at other times only to Debbie and at other times to both together. Do you need separate rows for "John Jones", "Debbie Jones" and "John & Debbie Jones"?

No! That would be serious duplication of data. Instead, we divide the mailing list table into two tables and create a people table and an addresses table. In our people table, we add a pointer to the appropriate address in the addresses table.

The concept of pointers to other table is essential to creating an effective database. A pointer is called a "foreign key" in database terminology. A foreign key is a value that points to a particular row (the primary key) of another table. In our person & address example where we have three people with the same address, all three people would have the same foreign key, pointing to the same address in our addresses table. Thus, the address only has to be entered once!

Judicious use of foreign keys can greatly simplify database management by eliminating duplicate data. With foreign keys, if an address changes, it only has to be changed once!

4. Learn how to use queries. This is essential because queries are the secret to getting at your data once it is separated in non-duplicated chunks. In our mailing list example, we've separated our names from our addresses with only a seemingly random number connecting the two. (In one table we have the foreign key, in the other the primary key.)

How do we get the addresses connected with the names to create a real mailing list? We make a 'query'--a pseudo-table that is a combination of the two tables 'names' and 'addresses.' In the query, the foreign key in names is connected to the primary key in addresses. Then, we use the query instead of the names table to generate our mailing list. (As if the query is a table.) Look up 'queries' in your database documentation to find our how to create one.

Tip: In an effort to simplify this process, Apple's Filemaker Pro uses different terminology from other database vendors. For Filemaker, you'll need to set up relationships first (connect foreign and primary keys), then use portals to see the 'related' data. In our mailing list example, you would set up a layout with names as the base table and addresses would be seen through the portal. The concepts are exactly the same, just implemented differently.


Above article Last updated: 15 May 2004 02:37:20 PM

Flickr Photo Demo: Jay's Photos
jayduck_99's items Go to Jay's photostream


We use MxGuardDog spam filtering.