Get your FREE 30 page Developing SOLID Applications guide!

Peer Review: Looking At Abstraction – Redux

Out Of Date Warning

Languages change. Perspectives are different. Ideas move on. This article was published on August 31, 2009 which is more than two years ago. It may be out of date. You should verify that technical information in this article is still current before relying upon it for your own purposes.

This entry is part of an ongoing series involving the review of a code sample and it’s refactoring. For the original code sample, see here.

Editor’s Note: The response of the community to this series has been great, and I’ve been given a large number of suggestions. I’ve incorporated some of those suggestions into the code and into this article. Thanks to Jeff Carouth, Greg Beaver and Daniel O’Connor for their help and suggestions.

This entry will focus on our use of the database, and specifically on the already_tweeted() method. This method has a number of problems, and while we’re focusing on the implementation of the database, it’s important to note that we will also need to address some of the logic (which will be the next part of the series).

In our last entry, we focused on abstracting the HTTP request out to a seperate class. Lots of people wrote comments with suggestions of HTTP handlers, including pecl_http, the PEAR HTTP class, HTTP_Request2 and the PEAR Log class for logging. These are all great suggestions, and all will help abstract out the class without causing us to have to write our own implementation of common problems (the Not Invented Here (N-I-H) syndrome).

In focusing on the already_tweeted method, one thing becomes immediately apparent: it is private. This suggestion, provided by Greg Beaver relates to our first discussion of coding standards and we will change the class to a protected class for extendability later on.

Let’s take a look at the function and dive in to some of the mistakes in it:

/**
* Check to see if this tweet has already been posted and add it to the DB
* if it has.
* @param string $message The tweet
* @return bool
*/
protected function already_tweeted($message)
{
$text     = mysql_real_escape_string(trim($message));
$date     = date('Y-m-d');
$code     = md5($text . $date);
$sql      = "SELECT id FROM twitterLog WHERE thash = \"$code\" ORDER BY id DESC LIMIT 1";
if (!$res = mysql_query($sql))
{
die(mysql_error());
}
$num      = mysql_num_rows($res);
if ($num)
{
return TRUE;
}
$sql      = "INSERT INTO twitterLog (tdate, thash, tweet) VALUES ( \"$date\", \"$code\", \"$text \" )";
if (!$res = mysql_query($sql))
{
die(mysql_error());
}
return FALSE;
}

Let’s talk first about assumptions. This function assumes that a database connection has already been established for us. And it also assumes that we’re going to use MySQL. What if we want to use another database type (like Postgres, SQLite, or MSSQL)? These are both assumptions that I’d prefer not to make.

We have a few choices with regards to database interactions. We could refactor this using PDO, which would allow us many benefits: PDO is object-oriented, supports multiple databases with the same function calls and structure (meaning that changing database backends wouldn’t break our code), and is relatively easy to learn and understand. We could certainly refactor this using PDO.

The problem is, however, that using PDO doesn’t abstract out the development of our SQL queries. We still have raw SQL in our class. This is less than optimal.

Instead, I want to make use of an ORM (Object Relational Mapping) framework. My preference is Propel, as I’ve worked with it before, but you can use one of the many PHP-based ORMs (like Doctrine). This provides many benefits:

  • We don’t assume that a connection exists; one will be automatically created if it doesn’t.
  • We don’t have to spend lots of time filtering the data, as it will be handled for us.
  • If the database structure changes, most of the schema changes are handled by the framework, meaning our code doesn’t probably have to change all that much. Propel, for example, generates its objects automatically.
  • Most ORM frameworks create a model layer for our MVC frameworks for us, and make use of PDO so that we can change the backend database server with little or no effort.

This isn’t to say that utilizing an ORM or abstracting this out makes life easy; it doesn’t. This will hinder your ability to redistribute the code without the full ORM framework. However, I feel that the benefit outweighs the cost. You can make that decision for yourself.

Another thing that we should refactor is the use of die() statements. This is a very poor practice that I see often in code; die() is a bad way to “handle” errors because it doesn’t really handle them at all. In classes, exceptions should be used instead. ORMs (like Propel) will throw their own exceptions, and you should throw your own as well.

We’ve done a lot of refactoring at this point, so let’s take a look at the function, refactored:

/**
* Check to see if this tweet has already been posted and add it to the DB
* if it has.
* @param string $message The tweet
* @return bool
*/
protected function already_tweeted($message)
{
$hash = md5( date('Y-m-d') . $message);

$c = new Criteria();
$c->add(TwitterLogPeer::THASH, $hash);
$count = TwitterLogPeer::doCount($c);

if($count > 0)
{
return TRUE;
}

try {
$logTweet = new TwitterLog();
$logTweet->setDate(time()); // This will be turned into a DateTime object.
$logTweet->setHash($hash);
$logTweet->setTweet($message);
$logTweet->save();
} catch (Exception $e) {
throw new Exception('There was an error in saving the item to the database.');
}

return FALSE;
}

Taking a look at this, it’s easier to read, it has no raw SQL, there’s little chance of us missing something or not cleaning it, and we’re in a strong OOP position. Yes, there are a few more lines of code (3 to be exact) but this isn’t about reducing the code, it’s about making it better (the extra lines come from the exception).

In the next entry of the series, we’ll focus on refactoring some of the broken logic in the application, including refactoring this function again into separate functions.

Write better object oriented PHP today.

Object oriented programming always leaves you with a headache. What if you could master it instead?

Get the book now! »

David wrote at 8/31/2009 10:35 am:

What about setting up a storage adapter? (not sure if that’s the right pattern name…)

But basically, something like:

$twitter = new Twitter;
$twitter->setStorage(new Twitter_Storage_Propel());
….
protected function already_tweeted($message){

$hash = md5( date(‘Y-m-d’) . $message);

$storage = $this->getStorage();

if($storage->messageExists($message)){
return true;
}
try{
$storage->save($message, $hash);
}catch($e Exception){
throw new Exception(‘There was an error in saving the item.’);
}
return false;
}

Yes, it adds even another layer of abstraction, but that way the Twitter class doesn’t have to know anything about how to save the message, and instead passes that on to the storage class.

Brandon Savage (@brandonsavage) wrote at 8/31/2009 5:56 pm:

That’s certainly a valid option as well. In fact, Propel offers the ability to customize some of the functions in the data model, which is where I might put some of this logic.