symfony crons and cron task logging

Scene from "Frankenstein" by James Whale (1931)

this is really magic...

Recently, I found an update of the sfTaskLoggerPlugin. As the description says, the plugin allows you to run custom tasks and store the results. In fact, it is a highly customizable tool that lets you monitor all cron tasks and check their performance (objects processed, running time) and so on. Installing and configuring this plugin in my company's CRM was one of the most challenging and most interesting things I have done in many months.


a few words about cron tasks

The most important of my projects is a CRM that provides tools for managing an e-commerce company. It is an application that stores lots of data in its own database, but it also has access to several other databases and connects to external systems through APIs. Such a complicated system requires several cron tasks (the number is going to double soon). Some of them have to run once per day, but others have to run every few minutes. There are a few problems you have to face when using cron tasks in a big system:

  • Running each cron consumes server resources, and estimating the best interval between the moment a cron finishes its task and the moment it is run again is not easy. If the interval is too short, lots of resources will be wasted (everything depends on the cron, of course), but if the crons are run too rarely, other employees' work becomes a lot more difficult, because their data is not up to date.
  • Another important thing is task performance: how long it takes until the cron finishes its task. When you implement a new task, you measure how long it takes to run the task once. But if your data grows incredibly fast during just a few months, will you always remember to check whether everything is OK with the application's performance?
  • Finally, you need statistics to know what is going on in your system: how often a functionality is used, what the traffic is during a specific part of the day or on different days of the week, etc.
  • Just one more thing: all the information you would like to know about your own crons should be easily accessible. Until now, I had a basic logfile system which was really far from meeting my expectations (see above).


so, is this plugin useful?

Extremely! As with almost everything in life, the most useful things are usually the simplest, and it's the same here. I must admit that at first sight the plugin's readme seemed complicated, but that is only a first impression (according to COil, the author of the plugin). In fact, my original task was handled by the plugin after 20 minutes. COil replied to my mail within a few hours and I got all the answers I needed to install his plugin into my apps.


a few examples of where sfTaskLoggerPlugin is useful

Here are some of my cron tasks:

  • migrating data between databases. Different web applications run independently, each of them having its own database. It's pretty clear that an employee won't check several different admin panels every few minutes, so everything has to be in one place. And here comes migrating orders... A customer visits a webpage and submits an order. A system cron is waiting for this to happen; within at most 5 minutes it runs and copies the order data into the CRM database, where all the tools to manage this order are available. But at what time are most orders submitted? How many, how often? From which of our applications? And how often should such a cron be run? And how long does it take...
  • e-mail sending: similar to the above.
  • redundant table columns speeding up other functionalities. In my project I have to generate lots of different XML files for external applications such as price comparators or catalogues. We're using PrestaShop to deploy shops. The source PrestaShop database structure can be highly inefficient in our case, so some product data has to be copied between tables once per day (the inefficient queries are executed only once and the result is stored in the redundant columns, which are very easily accessible). The problem is that this has to be run on all the products we have in our database. Running a time-consuming query may be dangerous for performance (the number of products is expected to grow to 4-5 times what it is now). So I need to watch and check whether there's a need to divide the process into phases (processing parts of the data separately) to prevent a breakdown.
  • Another thing to look at closely is accessing external applications' WebAPI. It happens that a WebAPI may be developer-unfriendly :) by providing methods that are really poor, while at the same time the external application generates great traffic for your company. Such an integration will be difficult. Not only will you have to watch out for lots and lots of traps and problems (and finally, bugs), but you also need to know when to run the cron task and how often. And this time it is not only about the comfort of other employees' work, but mainly about data consistency. Programming can be art, definitely :).
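The idea of dividing the daily column-copying job into phases, mentioned in the third example above, can be sketched roughly as follows. This is a minimal illustration written in Python rather than as a symfony task; `iter_batches`, `process_in_phases`, the batch size and the `max_batches` cap are all hypothetical names chosen for the example and are not part of PrestaShop or sfTaskLoggerPlugin.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


def iter_batches(ids: Sequence[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most batch_size long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        yield list(ids[start:start + batch_size])


def process_in_phases(
    ids: Sequence[int],
    batch_size: int,
    handle_batch: Callable[[List[int]], None],
    max_batches: Optional[int] = None,
) -> Tuple[int, Optional[int]]:
    """Process ids batch by batch; return (processed_count, last_processed_id).

    max_batches (a hypothetical knob) caps the work done in a single cron run,
    so that a later run can resume after last_processed_id.
    """
    processed = 0
    last_id = None
    for n, batch in enumerate(iter_batches(ids, batch_size)):
        if max_batches is not None and n >= max_batches:
            break
        handle_batch(batch)  # e.g. copy the slow-query results for these products
        processed += len(batch)
        last_id = batch[-1]
    return processed, last_id


if __name__ == "__main__":
    product_ids = list(range(1, 11))  # stand-in for real product ids
    handled: List[int] = []
    count, last_id = process_in_phases(product_ids, 4, handled.extend, max_batches=2)
    print(count, last_id)
```

The returned count and last processed id are exactly the kind of values a logged cron run would want to record, so the next run knows where to continue.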

main sfTaskLoggerPlugin features

Well, it's pretty easy: each cron run is stored in the database. You know the start and end time of each task (so you know how long it takes to finish the task, how often it is run, whether it has already been run today, and so on). You can define the count of processed and unprocessed objects (if an error occurs, or if some amount of data has to be processed later, then you define the id of the last processed object). You define the error code, stating whether there were any errors during runtime, and you're able to check how many crons are running at the moment (or how many of them broke and never finished). Finally, you define your own comments for each cron. I use them to generate HTML that is easily accessible to office workers (in case they need to find out whether some data was processed and what the result was).
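To make the "one database row per cron run" idea concrete, here is a minimal sketch in Python with SQLite. It is not the plugin's code or schema: the `task_run` table, its column names and the `run_logged` helper are assumptions invented for this illustration, loosely following the fields listed above (start and end time, running flag, processed and unprocessed counts, last processed id, error code, comment).

```python
import sqlite3
import time

# Hypothetical schema, NOT the plugin's real table definition.
SCHEMA = """
CREATE TABLE IF NOT EXISTS task_run (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    task_name TEXT NOT NULL,
    started_at REAL NOT NULL,
    ended_at REAL,
    is_running INTEGER NOT NULL DEFAULT 1,
    count_processed INTEGER NOT NULL DEFAULT 0,
    count_not_processed INTEGER NOT NULL DEFAULT 0,
    last_processed_id INTEGER,
    error_code INTEGER,
    comment TEXT
)
"""


def run_logged(conn, task_name, work):
    """Run work() and record the run as a single row; return the row id.

    work must return (processed, not_processed, last_id, comment).
    """
    conn.execute(SCHEMA)
    cur = conn.execute(
        "INSERT INTO task_run (task_name, started_at) VALUES (?, ?)",
        (task_name, time.time()),
    )
    run_id = cur.lastrowid
    conn.commit()  # the row exists with is_running = 1 while the task works
    try:
        processed, not_processed, last_id, comment = work()
        error_code = 0
    except Exception as exc:  # record the failure instead of losing it
        processed, not_processed, last_id = 0, 0, None
        comment, error_code = repr(exc), 1
    conn.execute(
        "UPDATE task_run SET ended_at = ?, is_running = 0, count_processed = ?,"
        " count_not_processed = ?, last_processed_id = ?, error_code = ?,"
        " comment = ? WHERE id = ?",
        (time.time(), processed, not_processed, last_id, error_code, comment, run_id),
    )
    conn.commit()
    return run_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_logged(conn, "migrate_orders", lambda: (3, 1, 42, "3 orders copied"))
    print(conn.execute("SELECT task_name, count_processed, error_code FROM task_run").fetchall())
```

In this sketch, a process that is killed before the final update leaves its row with the running flag still set, which is how broken, never-finished runs could be spotted later.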

It's easy to display/generate project cron statistics. For example, I'm going to use stOfcPlugin to display what percentage of cron task runs migrate any data (and how many just check that there is nothing to migrate). There are plenty of example statistics to generate. I'm going to provide some real cron examples with potential stats soon...
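The percentage mentioned above is simple arithmetic over the logged runs. A minimal Python sketch, assuming each run's processed-object count is already available as a plain list (the function name `share_of_productive_runs` is made up for this example and is unrelated to stOfcPlugin):

```python
def share_of_productive_runs(processed_counts):
    """Return the percentage of runs that processed at least one object."""
    counts = list(processed_counts)
    if not counts:
        return 0.0  # no runs logged yet, so nothing to report
    productive = sum(1 for c in counts if c > 0)
    return 100.0 * productive / len(counts)


if __name__ == "__main__":
    # Four hypothetical runs: two migrated something, two found nothing to do.
    print(share_of_productive_runs([0, 3, 0, 5]))
```

A low percentage would suggest the cron is run more often than needed; a value close to 100 would suggest it may be worth running it more frequently.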

code example

It took me some time to publish it, but finally, an example symfony task based on sfTaskLoggerPlugin is described inside this article.

why so little popularity?

To sum up, it's such a shame that there are so many good plugins that are not popular, including sfTaskLoggerPlugin. It's really difficult to break through and make your tool popular in the community. Well, using cron tasks is not the most important symfony feature - but I'm pretty sure that there are more than just 4 users who can make use of logging cron tasks... Or am I wrong? :) Just give this plugin a try!

By the way, we're looking for a Propel developer who is willing to contribute to the plugin. Any good contribution is welcome! ;-)

5 comments:

  1. Thanks for the review Tomaz. ;)

    Indeed I think there are more than 4 users, perhaps it was unpopular because I hadn't released a PEAR package for quite a long time and it was not tagged as symfony 1.4 compatible, which is now done. :)

    ++

  2. Hi!

    I'm definitely giving it a try! The thing with heavy cron task usage not being so popular is that only demanding apps require such background processing.

    In most cases you manage to set the time limit a little bit higher than usual and get the thing done.

    In my case I'm building a highly demanding app. I intended to build my own background job management system. From what I've read, this plugin will fit my initial needs. Sweet!

    Great job COil!
    Nice article Tomasz!

  3. hi there,
    I have two questions:

    - what's wrong with Propel in this plugin?

    - can I have a particular task re-run (automatically or manually) in case of an error?

    thanks

  4. Hello,

    PROPEL:
    There is nothing wrong with Propel in this plugin - there's just a new task to be implemented that supports propel. And the plugin team needs a Propel developer (dunno about COil, but I don't work with Propel anymore).

    RE-RUN:
    I'm not sure if I understand your question... You can always run a task again. But if you are asking whether you can update the database row describing a task run, the answer is no. A task run is a single action that has only one opportunity to succeed. If your task fails, you may purge it (turning its running flag off). Just run this task once again.

    As far as I understand the main concept of this plugin, there is no need to update database data about a single task run. The data inserted into the database describe one single task run, so if it fails, it should not be updated, because a failing task has stopped its work. And a task re-run (which is another task run) has its own row in the database.

  5. Example code of a symfony task based on the sfTaskLoggerPlugin is available at http://symfony-world.blogspot.com/2012/08/symfony-cron-task-logging-examples.html
