Normalize Database

  • Author
  • #44098



    i use wordpress from a lot of time, but every time i must work with database i cry :)

    Actually i’m developing a JSON-RPC interface to let an offline managment using a my software – i know well that there is an xmlrpc interface, but i’m developing a structure do manage in the same manner misc cms.
    The problem is that a lot of types, name of fields and in general tables are bad structurated!

    What i mean? look, for example look at the categories table:
    cat_id, biting(20), not unsigned
    cat_name, varchar(55)
    category_nicename, varchar(55)
    category_description, longtext
    category_parent, biting(20), not unsigned
    category_count, biting(20), not unsigned

    Ok, first: names! Some are cat others category … why not call all fields with the same prefix? long names in tables are really useful because when someone read them can understood them really fast! I know whell that cat or category is equal, but however only one instead of two is a good thing!

    Second: types! bigint is something like REALL REALLY REALLY big for this kind of stuff! And, however, it is set as signed, it can have negative values, but using autoincrement you can’t have it, so you are using half of the value! So first set unsigned field type and second change BIGINT to something like MEDIUMINT or SMALLINT … why? an unsigned bigint go from zero to 18.446.744.073.709.551.616 and i don’t think will be never a blog with all these cats :) SMALLINT can have up to 65.000 values, they are a lot too, but however are better than a bigint (SMALLINT uses 2 bytes, BITINT uses 8 bytes)

    Third: yet types :)
    category_description uses a longtext field that is TOOOOOO big for a description! longtext can contain up to 4gb of text and, like over, i don’t think that will be never a description of this size :)
    A normal text size can be good, infact it can contain up to 65kb of text.

    If we wanna to look to another table we can take the posts table:
    ID, bigint(20)
    post_author, bigint(20)
    post_date, datetime
    post_date_gmt, datetime
    post_content, longtext
    post_title, text
    post_category, int(4)
    post_excerpt, text
    post_status, enum
    comment_status, enum
    ping_status, enum
    post_password, varchar(20)
    post_name, varchar(200)
    to_ping, text
    pinged, text
    post_modified, datetime
    post_modified_gmt, datetime
    post_content_filtered, text
    post_parent, bigint(20)
    guid, varchar(255)
    menu_order, int(11)
    post_type, varchar(100)
    post_mime_type, varchar(100)
    comment_count, bigint(20)

    Look first at names, they are casual: some comment prefix, other post prefix, other no prefix :)

    Second … types … some bigint for field that will not contain a so big value, other fields like longtext that will never contain 4gb of text :)

    If you look the table you see that post_category is an INT 4 (a smallint 4 is more appropiated :)) but the table can contain a value a lot more higher :D Look at author field too … it uses a bigint 20 … a blog will cannot have all these users :D

    So, wordpress is a really good blog, but, for example, the database need to be normalized:
    – to achieve better performances
    – to make it more human friendly
    – to make it more simpler

    However, good work :D

    (note: sorry for my bad englush :D)



    I think you need to be at, not We are running different software and can’t help you here. This information is in the pink thread at the top of the page marked READ ME FIRST.



    That tag on the front page is getting bigger and bigger…

The topic ‘Normalize Database’ is closed to new replies.