Symfony World blog is not maintained anymore. Check new sys.exit() programming blog.

faster queries: indexing tables

When designing complex web applications, you have to pay attention to the project performance to make the framework handle your request as fast as possible. This involves optimising client side (clean CSS, clean HTML, fast Javascript, etc.) and server side (caching templates and queries, usage of the database and many others). We will concentrate on the database here. In short, the database should have such structure that all information fetched to handle any request should be accessible very fast. This short article will show you few facts and tricks about symfony projects performance.


built-in foreign key management


One of the brilliant features in symfony is creating indexes for foreign keys by default. This saves a lot of time for the developers and, surely, leverages the overall performance of all symfony applications. Below is an example schema:


Book:
  actAs:
    Timestampable: ~
    SoftDelete: ~
  columns:
    category_id:
      type: integer
      notnull: true
      comment: "kategoria książek"
    title:
      type: string(255)
      notnull: true
      comment: "tytuł"
    author:
      type: string(255)
      comment: "autor"
    description:
      type: string
      comment: "opis"
  relations:
    Category:
      class: BookCategory
      local: category_id
      foreign: id
      foreignAlias: Books

BookCategory:
  actAs:
    Timestampable: ~
    SoftDelete: ~
  columns:
    name:
      type: string(255)
      notnull: true
      comment: "nazwa"

Such schema will generate the following SQL code. Note that the book.category_id column is indexed (faster queries) and constrainted (no data inconsistency) at the same time, automatically:


CREATE TABLE book_category (id BIGINT AUTO_INCREMENT ...;
CREATE TABLE book (id BIGINT AUTO_INCREMENT ...
  INDEX category_id_idx (category_id) ...;
ALTER TABLE book ADD CONSTRAINT book_category_id_book_category_id
  FOREIGN KEY (category_id) REFERENCES book_category(id);

Obviously, you may create your custom indexes (and this will be discussed later). This section from The symfony and Doctrine book covers the topic of custom indexes.


optimising MySQL queries


Before you do anything to speed up your queries execution, you need to know what queries you have in your system. The obvious thing is to look at the powerful web debug toolbar. It's a great tool, but it won't tell you what to do when executing your queries takes too long. But it can point out which queries are definitely poorly written (they are highlighted with orange color then). Then it's time for you to solve the problem. Often, it may happen that you need to join tables (more about this is written in "less doctrine queries" article).


If the number of queries cannot be limited, probably you may need to add custom indexes on your tables. An index is an additional structure, bound to a table, that speeds up selecting the appropriate results (there are lots of good tutorials on this topic, such as the tizag tutorials). The database server, when executing any query, looks for the best structure that can be used to serve the results as fast as possible. We can ask our database server to analyse given any query to tell us how is it going to be executed. And the best tool to visualise this is the EXPLAIN statement (short reference here). We will optimise a heavy query executed on the homepage of a social website, using explain and adding custom index.


example - social website homepage problem


The manager of the social website wants the developers to emphasise the users who are the most active ones. For example, he wants to display last logged users at the homepage. The developers figured out that they need to create an action table that will store actions performed by users. Action and profile tables are related to each other - a simple JOIN will be used each time when the homepage action is executed: last logged x profiles are fetched from the database and displayed then.


The website has been set off. Many users have registered and the action table is growing bigger and bigger every day. After few months, it has over 300'000 records. The manager is very happy that his project is becoming popular, but he noticed that the homepage is being served few seconds slower than it was in the beginning. The developers tell him that they didn't run high performance tests and they have to spend some time on optimisation. The manager is not pleased that it was not considered before.


NOTE: always use test data when focusing on project performance

Symfony has a built-in fixture mechanism which allows you to easily generate lots of different data (see the jobeet tutorial). This is essential when you want to make sure that your project will manage with big popularity. Anyway, if you decide to generate really big amount of data, do NOT use any ORM. It consumes too much memory and generating fixtures takes a lot of your precious time. I'd suggest to generate raw SQL INSERT queries instead - they'll be a lot faster.


Okay, let's move on. Once you have got lots of data (either real or test), execute each action - one after another - and check its performance. First thing you should look at is the mentioned web debug toolbar in the top right corner of the screen when running dev application environment. You should be worried, when you see something like the following:




There is a big problem: at least one of the queries is unoptimal (orange color) and as a consequence, executing this action takes too much time (imagine, almost 5 secs per one execution is really long and it doesn't matter that I'm using my personal computer for testing). Left click on the query icon (the one to the right):






One query takes almost 4 seconds to be executed. This surely causes a serious performance problem! Don't panic, just let your database server analyse the query for you:


EXPLAIN
  SELECT a.id AS a__id, a.created_by AS a__created_by
  FROM action a
  LEFT JOIN profile p ON
    (a.created_by = p.id AND p.deleted_at IS NULL )
  WHERE (
    a.type_id_external = '2'
    AND p.avatar_id IS NOT NULL
    AND p.mode =4
  )
  ORDER BY a.created_at DESC;





Here we can see, that the query has to check at least 1690 p (profile) table rows. And each profile record stores a lot of text data, describing each website user. All this makes the query take such long time to execute. If we want to speed it up, we just have to read carefully the query and concentrate on all columns used (and the order of their appearance). The solution is to find the best index (this topic may be quite complex and independent on the framework you use - ask google about indexing database tables/optimising databases - and read those articles carefully).


In this case, the developers executed the following line in MySQL:

ALTER TABLE profile ADD INDEX deleted_at_idx (deleted_at);
which created an index on the deleted_at datetime column. Thanks to this index, the EXPLAIN command shows that only 10 different rows in the profile table have to be analysed for the query to be executed. And this is a great success - the execution time went down to 0,01 second. Imagine the 4 seconds difference for each homepage display. This is the benefit from optimising project databases.


By the way, I cannot understand why the deleted_at column in the SoftDelete behavior is not indexed by default, especially when you can turn on automatic not-null-check (sql callback):

$manager->setAttribute(Doctrine_Core::ATTR_USE_DQL_CALLBACKS, true);
Provided this line is present in the project configuration, each query that fetches a SoftDelete model will automatically add the "WHERE deleted_at IS NOT NULL" part. It's obvious that such column has to be indexed - the index can be complex though - and the deleted_at can be the last column in this index - but, anyway, default index on deleted_at is a good idea! As you can see, you have to pay attention to what queries are executed inside your projects!


Note: database server different versions use different indexes

Different database server versions may use totally different indexes to execute the same queries on the same database structure! Make sure you run your performance test in the (future) production environment. Otherwise, you may find your application execute unoptimised queries on the production server even if you spent a lot of time optimising it in your local dev machine.


In the example above, it turned out that the production server has a different db server than the developer's local machine. The developer didn't check it - he was not aware of the differences and their negative impact on project's performance. The index that has been built is useless in the production environment (so it should be deleted, because inserting each row is slowed down by this index). Moreover, it happedened, that the new index the developers needed to speed up the query should be build on the action table... pay attention to database server versions you work on!


how many indexes to create

Table indexes are really helpful and they speed up database performance. The more complex your application is, the more queries may be executed. In other words, the bigger your application is, the more queries it may need to provide good performance. But look out - do not create too much indexes and never create any index when you are not sure that it is used somewhere. Why? It's very simple - each index is an additional structure which uses some space and needs some time to be created and managed. When a record is inserted, updated or removed, each index has to be updated accordingly to the change made, which surely consumes time. If you create too many indexes, you may simply slow down your database. For example, each user login inserts new action record - then an action table with 10 indexes will be slower than an action table with only 3 indexes).


further reading: this, this and this

symfony validate database content with sfValidatorCallback

Scene from "The three Musketeers" by Stephen Herek (1993)

In my last project, I needed to create a custom validation feature that would check if there are some specific records in database tables. You can read a lot about symfony validators at the official documentation, but I didn't found an exact solution to my problem there.


There are some customers with their names and their security codes and there are some offers that are bound to customers. There is a offer submission form in the frontend application, where the customer's name and security code have to be given. I found none of the simple validators useful. In the sfDoctrineGuardPlugin there is a sfGuardValidatorUser which is used to validate passwords in the signin form:


$this->setWidgets(array(
  'username' => new sfWidgetFormInputText(),
  'password' => new sfWidgetFormInputPassword(array('type' => 'password')),
  'remember' => new sfWidgetFormInputCheckbox(),
));
 
$this->setValidators(array(
  'username' => new sfValidatorString(),
  'password' => new sfValidatorString(),
  'remember' => new sfValidatorBoolean(),
));
 
$this->validatorSchema->setPostValidator(new sfGuardValidatorUser());

Take a look at the validator's class here. In short, the validator retrieves the user object and compares its password stored in the database with the one accessible from the request. As I mentioned, no simple validator can help, so a post validator has to be used on the entire validator schema:


$this->validatorSchema->setPostValidator(new sfGuardValidatorUser());

However, you can have the same effect without creating a custom validator. Symfony framework provides a callback validator:


$this->validatorSchema->setPostValidator(
  new sfValidatorCallback(array
    ('callback' => array($this, 'customer_code_callback'))));

The customer_code_callback has to be created now:


public function customer_code_callback($validator, $values)
{
  $customer_count = Doctrine_Query::create()
    ->from('Customer c')
    ->where('c.name = ?', $values['customer'])
    ->andWhere('c.code = ?', $values['code'])
    ->count();
 
  if (!$customer_count)
  {
    throw new sfValidatorError($validator, 'Niepoprawny kod lub firma');
  }
 
  return $values;
}

Following the principle the simplest is the best, I'm doing almost the same stuff as the sfGuardValidatorUser, but with less work. Of course, it is up to you to decide whether you prefer to create a simple callback function or a reusable, entire validator class (which is more complicated, more elegant but provides the same functionality). So this is enough for the form to validate the customer's security code, before the offer file can be uploaded.


Of course, apart from the post validator, all form widgets can be validated with simple validators:


$this->setWidgets(array(
  'customer' => new sfWidgetFormInputText(),
  'code' => new sfWidgetFormInputText(),
  'file' => new sfWidgetFormInputFile(),
));
 
$this->setValidators(array(
  'customer' => new sfValidatorString(array(
    'required' => true,
    'trim' => true
  ), array(
    'required' => 'Podaj nazwę firmy',
  )),
  'code' => new sfValidatorString(array(
    'required' => true,
    'trim' => true
  ), array(
    'required' => 'Podaj kod weryfikacyjny',
  )),
  'file' => new sfValidatorFile(array(
    'required' => true,
    'path'     => sfConfig::get('sf_upload_dir').'/offer/',
  ), array(
    'required' => 'Wybierz załącznik',
  ))
));

further usage


The example I gave seems quite similar to password check. But as you can see, you can just replace the example Doctrine query with any other queries, e.g.

  • check number of offers submitted by a customer, which are still being processed (not finished), if their total count exceeds 10, the validation shall fail
  • check if there is any free machine/employee that can handle a service, requested by the user; for example, all machines are busy, the validation shows an error you have to wait 36 minutes to wait for the first free machine and, optionally, store the IP to be handled as the first in the queue

less doctrine queries in symfony admin modules

Scene from "The Damned" by Luchino Visconti (1969)

basic stuff

This short article is a hint for symfony newbies. I'll show how to configure admin modules to limit the number of SQL queries called to generate the list of result.


the problem

The problem occurs when we create an admin module for a model that has some related records which we want to show. By default, Doctrine is not smart enough to guess that joining those related records will generate less SQL queries, so - depending on the database structure - generating a list of 20 objects can generate hundreds of queries (or even thousands, if the there are so many related records to be displayed). The example below is not so tragic, there are only 20 useless queries.



By default, symfony calls a basic

SELECT * FROM model;
query first. Then, while generating the row for each record, it calls for all related objects separately. As we can see below, these are the same queries run for different ids. A great waste of resources.



Now, the thing is to make symfony call less SQL queries to do the same job.


the solution

Let's say, we have a custom model called Proc, representing a procedure.



Go to the generator.yml file in the custom admin module config directory and add the table_method option in the list section:

generator:
  class: sfDoctrineGenerator
  param:
    #...
    config:
      list:
        table_method: getListQuery
This means that the configuration tells the admin module controller that if it want to fetch objects to display in a list, it has to look for the getListQuery in the ProcTable class. So we shall define it:
public static function getListQuery()
{
  return Doctrine_Query::create()
    ->from('Proc p')
    ->leftJoin('p.Tags');
}
And it's all. Fast, easy and clean.

There's also a possibility to achieve the same goal by overriding the buildQuery method in the actions class of the custom admin module (apps/backend/modules/proc/actions.class.php), but it's the dirty solution (thus it's not advisable):
protected function buildQuery()
{
  $query = parent::buildQuery();
  return $query
    ->leftJoin($query->getRootAlias().'.Tags');
}


The result

Now, each Proc object has all related Tag objects fetched already. In the previous situation, when symfony was generating a row for an object (in the admin module list) it called the object for its related objects. None were fetched, so SQL queries were called. Now, when the Proc object is called for its Tags, they're already there, so no additional SQL queries are called.



This limits number of queries and is very important (ignoring this may cause the server to go down frequently). Each action (not only admin module) should be optimised to make symfony call as little SQL queries as possible. Of course, you may join all related models, not only one (the Proc model could have more than only Tags relation, for example, Category, Author, Department and so on - each of them would generate useless queries as well).



Enjoy ;)



EDIT: This post was updated thanks to fizyk's and sacri's opinions. Thanks, guys ;).