SQL database design...

Discussion in 'Web Design and Development' started by JohnDoe8450, Dec 12, 2009.

  1. JohnDoe8450 macrumors member

    Joined:
    Feb 3, 2009
    #1
    Ok, the original version of my post is quite long...
    Here's a short version :

    Am I better off with a few (4-5) small tables or one great big table?

    The option of using a few small tables gives me more flexibility but is it really much slower?
    According to this presentation on scribd, "1000 small very quick unnecessary queries is worse than 1 slow query".

    Help?

    PS : The long version...
     
  2. splitpea macrumors 6502a

    Joined:
    Oct 21, 2009
    Location:
    Among the starlings
    #2
    B.

    The chances you'll generate enough traffic for (B) to be a problem are much lower (and the issue is much more easily solved) than what happens with option (A) if your app is used for a few years and then needs updating or the data needs migrating to a new platform.

    You can still grab your data from multiple tables with a single query (if you don't know how, go research SQL JOINs, which are absolutely fundamental.) When you do hit performance problems, everything from indices to views to more hardware will be available to solve the problem.

    But dealing with migrating or validating denormalized data is a huge headache, and chances are that what with sloppy data entry, after a while you'll no longer be able to even properly pull up the data you need with your queries. Take it from someone who's migrating a denormalized database from a desktop to a web interface right now.
     
  3. JohnDoe8450 thread starter macrumors member

    Joined:
    Feb 3, 2009
    #3
    Thanks!

    I got the same advice at two different places, so I'm gonna take it and run!

    I'm not intimately familiar with the notion of normalized/de-normalized data but I think I have a pretty good hunch so yeah, option B does seem like it'd be simpler to manage.
     
  4. savar macrumors 68000

    savar

    Joined:
    Jun 6, 2003
    Location:
    District of Columbia
    #4
    "A" is what you might call a "Data Warehouse". You only use that design if you have very high throughput requirements and you need to generate a lot of reports.

    Normalization is a fancy synonym for "repetition". Your "A" is less normal because each row will repeat the same information over and over... I.e. you have 400 rows with the exact same department name. Normalization is the process of moving that repeated information into a separate table and using relations to link the tables together. You can easily run large queries with 3 joins. Beyond that it can get hairy, but when I say large, I'm talking about 100,000 - 1,000,000 rows. I'm guessing you'll be much smaller than that.

    Make sure to use foreign keys (use the InnoDB storage engine if you're using mySQL) so that the joins will be efficient, and you'll be far happier with "B".
     

Share This Page