Symfony World blog is not maintained anymore. Check new sys.exit() programming blog.

outsourcing applications with symfony

outsourcing applications

When you need to write from scratch an application that already exists (it's too old to be maintained anymore or simply someone else has written hopelessly unmaintainable junk - and you're the big guy that got the job) - symfony is a good choice! In the following article, I'll concentrate mostly on model and data migration.


overall

The first task is to prepare model for both (old & new) systems (multiple database connection used). Then, if the application already existed, the developer has to migrate all the date from the old structure to the new one, because it has to be available for the new system (e.g. already registered users). By migration I mean simply copying data between two different structures. Data migration will be run many times during new system development and finally it will be run last time when the old system becomes history and the new one is switched on - data has to be up to date. So a migration engine should be developed - this is the second task. And finally, the last one - a useful and handy tool to enable the developer to easily modify models. Each of the three tasks will be described in details below.


model for both systems

Start with creating your local copy of the database that is used by the webpage you are about to refactor. Set up both connections in config/databases.yml file. Now run

./symfony doctrine:build-schema
to creates a schema from an existing database (for the old one). Define your new schema as well. I'd suggest to put them in separate files (e.g. new.yml and old.yml; for Doctrine you only need to put it in config/doctrine directory, it doesn't have to be called exactly schema.yml, can be anything-else.yml).

And the problem starts at this moment... For sure, you'll have to rebuild entire new system (database and all classes) - to modify originally designed database, because of some unpredicted stuff. But this will damage the old system database if you want to create SQL as well (and if the old one is really big, it'd be very annoying to load database dumps anytime you need to change schema). But the part is done - schemas for both systems are created (schema of the old system will never change - we don't care about it anymore).


data migration

For such task it's always better to create a separate class that will handle all data migration, we'll call it MigrationManager.

Basically, one static method should perform data migration for one model (or more, if they're connected, like 1:n or m:n relations). If there'd be too many methods, you can always create more classes - it all depend on the project size. And if the old system data is really big, unfortunately, you won't be able to use Doctrine to insert new data. It consumes lots of memory to create Doctrine record objects. An example of what you can do to avoid the problem is to generate MySQL code (INSERT INTO ...) - then the migration is performed in two steps - generate a database dump file and simply load it.

If the amount of data doesn't force our local machine to use all it's memory (we can use Doctrine to insert data), it should be really easy:

  /**
   * Migrates data from old.aktualnosci to new.article.
   */
  static public function migrateArticles()
  {
    $aktualnosci = Doctrine_Query::create()
      ->from('Aktualnosci a')
      ->orderBy('a.news_id')
      ->fetchArray();
 
    foreach ($aktualnosci as $row)
    {
      $article = new Article();
      $article->setId($row['news_id']);
      $article->setTitle($row['tytul']);
      $article->setTextShort($row['skrot']);
      $article->setTextFull($row['pelny']);
      $article->setActive($row['aktywny']);
      $article->setCreatedAt($row['kolumna_data']);
      $article->save();
    }
  }
As you can see it's not difficult, even if you need to split or join the data or whatever. Then, enable using the migrateArticles() method from the controller - e.g. create a migrator module in the backend application, the module will have has distinct actions migrating different data. Calling any MigrationManager method should result in inserting data to the new system which is equivalent to the old system data.


easy schema/model modification tool

Finally, the last and the most complicated task. As mentioned before, there'll be no need to modify the structure of the old system, as long as no one is working on it at the same time. But there has to be an easy way to modify new system structure.

Assuming we won't have to modify old system structure, I suggest to move all old system files into separate subdirectories of model, form and filter directories (you'll do it only once). For example, let's move them to lib/doctrine/model/old, lib/doctrine/form/old and lib/doctrine/filter/old (and base Doctrine classes to lib/doctrine/model/old/base, lib/doctrine/form/old/base and lib/doctrine/filter/old/base respectively). Moving those files to separate directories gives us two advantages: [1] we'll be able to temporarily get rid of all old connection model classes to make symfony leave old database alone (when running doctrine:build --all) and [2] it simply separates two systems - it's more clear where to find which class.

We'll also need a versioning system like Subversion: after we temporarily delete class files, we'll retrieve them back.

To make it all easier, I created two batch scripts that handle most of the job. Rebuilding new system structure requires doing 5 things:

  • execute remove_old script - delete everything that is connected with the old system (so that old system data won't be erased)
  • clearing cache - old connection classes shall be removed from cache also
  • doctrine:build --all - run the main task
  • committing all structure changes - using a versioning system, all changes will be saved, new system classes will replace their previous version so the versioning system will automatically note file differences
  • execute model_reload script - get the original settings back
Examples of both scripts are provided below.


remove old - script

#!/bin/sh
 
echo 'deleting old system files'
rm config/doctrine/old.yml
cp config/databases.without-old.yml config/databases.yml
rm -fR lib/model/doctrine/old
rm -fR lib/form/doctrine/old
rm -fR lib/filter/doctrine/old
echo 'old system files deleted.'
  • old.yml file is the schema file holding old system structure. We delete it because we don't need those classes to be rebuild, especially that we moved them to a separate directory, so if they were rebuild, we would get two parallel set of classes (too bad...)
  • databases.without-old.yml is the same as databases.yml (two connections defined) with the only difference that the old connection is commented out (I prefer not to delete it just in case anything goes wrong with using versioning system).
  • removing class files - we don't have to list all classes separately (deleting entire directories is enough), because they've been separated from new system classes


model reload - script

#!/bin/sh
 
echo 'deleting new generated schema/model/form/filter'
rm -fR config/doctrine
rm config/databases.yml
rm -fR lib/model
rm -fR lib/form
rm -fR lib/filter
echo 'model/form/filter deleted.'
 
echo 'restoring svn schema.'
svn up config/doctrine
svn up config/databases.yml
echo 'svn schema restored.'
 
echo 'restoring svn model.'
svn up lib/model
echo 'svn model restored.'
 
echo 'restoring svn form.'
svn up lib/form
echo 'svn form restored.'
 
echo 'restoring svn filter.'
svn up lib/filter
echo 'svn filter restored.'
  • removing everything
  • retrieving it from the versioning system; the project has now acces to both databases and is up to date with both schema files!


Test it yourself. Any feedback is welcome!




Take a look at the data migration with symfony & doctrine article which is a continuation of this article, inspired by Fizyk's comment.

2 comments:

  1. I'd suggest to create the "MigrationManager" as a symfony task, with some command line output and parameters that will help to migrate data in batches, in case there are tons of data and php memory won't be enough to migrate them in one run.

    ReplyDelete
  2. Well, I already did it :) but I haven't written it here.

    But even a symfony task fails if there is too much data. For example, if you have a stats table that has 3-4 columns but 300k rows, this will take hours to be migrated by Doctrine. The solution is to write a simple MySQL "SELECT FROM INSERT INTO" query. But either you have to run it manually from the root level or you shall grant privileges to MySQL users defined in databases.yml, because this script copies data from db1.table1 to db2.table2.

    To sum up, if the table structure is easy, raw MySQL is better, but if data is modified during migration, symfony/Doctrine should be used. The worst happens when thousands of rows have to be modified deeply during migration (recalculating foreign keys between variable objects for example) - then the only solution is to increase your PHP limits, run the migration and make yourself coffee.

    Maybe I'll write another post about that, thanks for the suggestion :)

    ReplyDelete