Small fields speed lookup speed in MySql

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

I’d previously thought the main reason to keep VARCHAR definitions short was to save disk space. Since disk space is cheap, I had not thought this was very important. I’d happily put down VARCHAR(255) if I thought the field might ever, in the future, need 255 characters, even though in the short term I knew that the field was not going to be used that way. But now, to my surprise, I read that this has implications for lookup speed:

Index short values. Use smaller data types when possible. For example, don’t use a BIGINT column if a MEDIUMINT is large enough to hold the values you need to store. Don’t use CHAR(100) if none of your values are longer than 25 characters. Smaller values improve index processing in several ways:

Shorter values can be compared more quickly, so index lookups are faster.

Smaller values result in smaller indexes that require less disk I/O.

With shorter key values, index blocks in the key cache hold more key values. MySQL can hold more keys in memory at once, which improves the likelihood of locating key values without reading additional index blocks from disk.

For the InnoDB and BDB storage engines that use clustered indexes, it’s especially beneficial to keep the primary key short. A clustered index is one where the data rows are stored together with (that is, clustered with) the primary key values. Other indexes are secondary indexes; these store the primary key value with the secondary index values. A lookup in a secondary index yields a primary key value, which then is used to locate the data row. The implication is that primary key values are duplicated into each secondary index, so if primary key values are longer, the extra storage is required for each secondary index as well.

Post external references

  1. 1
    http://www.informit.com/articles/article.aspx?p=377652
Source