SQL database design...

JohnDoe8450

macrumors member
Original poster
Feb 3, 2009
61
0
Ok, the original version of my post is quite long...
Here's a short version :

Am I better off with a few (4-5) small tables or one great big table?

The option of using a few small tables gives me more flexibility but is it really much slower?
According to this presentation on scribd, "1000 small very quick unnecessary queries is worse than 1 slow query".

Help?

PS : The long version...
Hey guys,

I'm coding an exam bank for my student association and I'm unsure about how best to structure my database tables...

Each Exam has to have the following info :
Faculty, Department, Program, Course #, Course name, Professor's name, Year, Semester, Exam name, File name

I'm hesitating between two ideas...

A) One big table with each row being an exam, with the Faculty, Department, Program, Course # and Course name being stored as text

B) Four separate tables for Faculties, Departments, Programs and Classes, each row containing, for example, the Program's ID (a sequential index), the Department ID it's part of, it's abbreviation and it's full name.
And, one big table with each row being an exam, with the Faculty, Department, Program and Course # being stored as index numbers to the 4 other tables


The advantage I see in A is that I only run 1 SQL query to get all the info about a given exam and if I make Faculty, Department, Program and Course # indexes, I can quickly get listings for exams from a given Departmens, Program...

The disadvantage I see in B is that I'd need to run multiple SQL queries, although they'd all pretty much be simple index lookups (which are quicker than getting stuff from a textual index like in A?). The advantage to B is that it let's me do more stuff, makes listing the Classes from a given Program or Department much simpler, and simplifies management for certain tasks (class changes name).


I'm leaning towards B for the extra flexibility but I'm not sure if the time costs associated with multiple SQL queries will kill me if I start generating mucho traffic!


Opinions?
 

splitpea

macrumors 65816
Oct 21, 2009
1,009
241
Among the starlings
B.

The chances you'll generate enough traffic for (B) to be a problem are much lower (and the issue is much more easily solved) than what happens with option (A) if your app is used for a few years and then needs updating or the data needs migrating to a new platform.

You can still grab your data from multiple tables with a single query (if you don't know how, go research SQL JOINs, which are absolutely fundamental.) When you do hit performance problems, everything from indices to views to more hardware will be available to solve the problem.

But dealing with migrating or validating denormalized data is a huge headache, and chances are that what with sloppy data entry, after a while you'll no longer be able to even properly pull up the data you need with your queries. Take it from someone who's migrating a denormalized database from a desktop to a web interface right now.
 
Comment

JohnDoe8450

macrumors member
Original poster
Feb 3, 2009
61
0
Thanks!

I got the same advice at two different places, so I'm gonna take it and run!

I'm not intimately familiar with the notion of normalized/de-normalized data but I think I have a pretty good hunch so yeah, option B does seem like it'd be simpler to manage.
 
Comment

savar

macrumors 68000
Jun 6, 2003
1,952
0
District of Columbia
Thanks!

I got the same advice at two different places, so I'm gonna take it and run!

I'm not intimately familiar with the notion of normalized/de-normalized data but I think I have a pretty good hunch so yeah, option B does seem like it'd be simpler to manage.
"A" is what you might call a "Data Warehouse". You only use that design if you have very high throughput requirements and you need to generate a lot of reports.

Normalization is a fancy synonym for "repetition". Your "A" is less normal because each row will repeat the same information over and over... I.e. you have 400 rows with the exact same department name. Normalization is the process of moving that repeated information into a separate table and using relations to link the tables together. You can easily run large queries with 3 joins. Beyond that it can get hairy, but when I say large, I'm talking about 100,000 - 1,000,000 rows. I'm guessing you'll be much smaller than that.

Make sure to use foreign keys (use the InnoDB storage engine if you're using mySQL) so that the joins will be efficient, and you'll be far happier with "B".
 
Comment
Register on MacRumors! This sidebar will go away, and you'll see fewer ads.