Create a Database Table for Custom Module in Drupal 7

Creating a database table for Drupal module is straight forward and one of the easiest task in Drupal. To create a table, what all you need is to create a .install file inside module directory and implement hook_schema(). Inside hook_schema() implementation, specify the structure of table in array form.

As an example, here we are going to create a table to keep ratings for individual nodes in the system. To keep a rating for a node, we need following fields:

  • nid - Field to keep node id to which rate is applicable.
  • rating - Field to store actual rating for the node. We will be using integer values representing different levels of rating. Values will be ranging from 0 to 4. 0 for 'Very Poor', 1 for 'Poor', 2 for 'Average', 3 for 'Good' and 4 for 'Very Good'.

Above two are essential fields. Besides those, we will have one more field to uniquely identify each rating.

Assuming module name is 'myrating'. Our hook_schema() implementation will look like this:

/**
 * Implements hook_schema().
 */
function myrating_schema() {
  $schema['node_rating'] = array(
    // This explains purpose of this table.
    'description' => 'Table to keep rating for a node',

    // This array keeps information about the fields in this table.
    'fields' => array(
      'rid' => array(
        'description' => 'The primary identifier for a rating.',
        'type' => 'serial',
        'unsigned' => TRUE,
        'not null' => TRUE,
      ),
      'nid' => array(
        'description' => '{node}.nid of node to which this rating applicable.',
        'type' => 'int',
        'unsigned' => TRUE,
        'not null' => TRUE,
      ),
      'rating' => array(
        'description' => 'Rating for a node',
        'type' => 'int',
        'not null' => TRUE,
        'default' => -1,
      ),
    ),
    // Index 'rating' field to speed up retrieval of rows based on rating.
    'indexes' => array(
      'rating' => array('rating'),
    ),
    'unique keys' => array(
      'nid' => array('nid'),
    ),
    'foreign keys' => array(
      // This table contains a reference field to node table.
      // 'nid' field of this  table references a row in {node} table.
      'node' => array(
        'table' => 'node',
        'columns' => array('nid' => 'nid'),
      ),
    ),
    'primary key' => array('rid'),
  );

  return $schema;
}

We have provided inline comments explaining purpose of each section within array for table. 'rid' field is the primary key of the table. We specified its type as 'serial'. Upon seeing this, Drupal will make that field as auto generated. It will be 'auto increment' in MySQL and auto generated serial in Postgresql etc.

You can see that we created an index containing 'rating' field. This will speedup retrieval of rows based on 'rating' field value. But, we should not index fields unnecessarily, doing so may cause for opposite, reduce performance.

We made 'nid' as unique, so a single node cannot have more than one rating in this table. Also, we specified 'nid' field as foreign key. Which means values of 'nid' field are key values of another table (here 'node' table).

Making 'rating' field store integers instead of string will be good in terms of database table size and performance, instead of using string values like 'poor', 'good' etc.

As you read already, rating field will be storing integer values to represent 5 different levels of rating. So, our code for handling logics of rating need to use integer values within range 0-4. As programmers are humans and each of these integers have specific means, using these numeric values directly within code will confuse programmers sometimes. So it is a good practice to define constants representing each rating value. These constants to be defined within .module file, then they can be used any files within that module. So, inside our .module file we define these constants.

// Constants representing different levels of rating.
define('MYRATING_NOT_RATED', -1);
define('MYRATING_VERY_POOR', 0);
define('MYRATING_POOR', 1);
define('MYRATING_AVERAGE', 2);
define('MYRATING_GOOD', 3);
define('MYRATING_VERY_GOOD', 4);

You might have noticed that above code contains one extra constant MYRATING_NOT_RATED. This is to denote the non-rated state of a node. You can also see that above schema definition we used -1 as default value for 'rating' field. Instead of that literal value we can use corresponding constant for readability. So, rating field definition can be changed to:

'rating' => array(
  'description' => 'Rating for a node',
  'type' => 'int',
  'not null' => TRUE,
  'default' => MYRATING_NOT_RATED,
),