1 | 2 | 3 | 4 | 5

16 - 20   [22]

MySQL Schema Via PEAR::DB_DataObject

Also relates to MySQL and PEAR

I have been updating some of my PEAR classes and stumbled upon the DB_DataObject package. I have actually been trying to implement a data modelling layer within my PHP CMS project for some time, and this package is a comprehensive collection of methods to achieve just that abstraction. Configuration was relatively painless and with a set of data objects ready to go I have been testing their application in the CMS API.

In the original interface, I was parsing initialisation files (parse_ini_file) to load the table schema from which the appropriate form or table model could be constructed utilising a mix of PEAR's HTML packages and custom classes. The main problem with this was that changes in the MySQL table schema had to be manually updated in the ini file.

When a set of data objects are created with DB_DataObject::createTables(), an ini file is also created that records the database schema. This file is parsed internally when a new data object is instantiated for a particular entity. The table schema is instantly available by calling DB_DataObject::table().


$obj =& DB_DataObject::factory('data_object');
$dbo_schema = $obj->table();
echo "<pre>";
foreach($dbo_schema as $fld_name => $fld_type)
{
  printf("%-20s", $fld_name);
  if ($fld_type & DB_DATAOBJECT_INT) echo "int ";
  if ($fld_type & DB_DATAOBJECT_STR) echo "str ";  
  if ($fld_type & DB_DATAOBJECT_DATE) echo "date ";  
  if ($fld_type & DB_DATAOBJECT_TIME) echo "time ";  
  if ($fld_type & DB_DATAOBJECT_BOOL) echo "bool ";  
  if ($fld_type & DB_DATAOBJECT_TXT) echo "txt ";  
  if ($fld_type & DB_DATAOBJECT_BLOB) echo "blob ";  
  if ($fld_type & DB_DATAOBJECT_NOTNULL) echo "not null ";  
  if ($fld_type & DB_DATAOBJECT_MYSQLTIMESTAMP)
     echo "timestamp ";  
  echo "\n";
}
echo "</pre>";

The returned schema contains key/value pairs where the key is the field (attribute) name, and the value is the datatype(s) as defined when the ini file is created. The datatype value is conditionally tested against the named constants with the bitwise & operator to determine the actual types defined. This works fine for a table model, where the schema fields and a generalised data type is all I require [the table model uses DOM scripting to allow column reordering based on general data types]. However, for the form model, I need to know the maximum allowed size for fields, default values and other metadata to compose accessible and usable form controls. For this the obvious solution is to fall back to the MySQL layer itself, and run a "SHOW COLUMNS" SQL query.

A PEAR::DB object is already a persistant property of the instantiated data object, so if this can be retrieved a standard DB::getAll() query can be run.


$db =& $obj->getDatabaseConnection();
$mysql_schema = $db->getAll("SHOW COLUMNS FROM "
                              . $obj->tableName());
echo "<pre>";
foreach($mysql_schema as $field)
{
  printf("%-8s %-25s %-10s %-6s %-20s\n",
         $field[0], $field[1],
         (empty($field[2]) ? "[ NOT NULL ]" : ""),
         $field[3], $field[4]);  
}
echo "</pre>";

DB_DataObject::getDatabaseConnection() returns a reference to the PEAR::DB object. To keep the code snippet general DB_DataObject::tableName() is called to retrieve the table name within the SQL query. Example output of this code is:


id      smallint(5)   unsigned  [ NOT NULL ]   PRI
name    varchar(80)             [ NOT NULL ]
url     varchar(100)                           MUL
version float(3,1)              [ NOT NULL ]          1.0
amended timestamp(14)
added   timestamp(14)

This provides all the data I need to create a corresponding set of form controls. If the CMS model is amending an entity, loading the data is just a case of calling $obj->get(id), and the data can be updated with $obj->update() once it has been passed through a data validator, and the new values assigned to the object's properties.

Next is to put the SQL builder aspect of DB_DataObject to test with some of the more complex triple join queries currently in the CMS model…

Posted on Feb 28, 2004 at 00:52:09. [Comments for MySQL Schema Via PEAR::DB_DataObject- 1]

Excel To MySQL Via PHP ODBC And COM

Also relates to MySQL and COM

As a keen statistician with a fascination for weather variation over time (in particular in relation to equilibrium tidal variations) I have acquired comprehensive weather data over the past few years from my digital weather station. All the archive pressure and temperature readings are tucked away in Excel spreadsheets - this year I have actually moved over to Open Office as my spreadsheet of choice. I finally decided it was time to move some of this data into a MySQL database on my server, where I could then manipulate the data for display on the web.

To start with I have decided to keep things simple, storing average daily pressure and temperature for graphical display through the GD library. In part for experimentation, I had a look last weekend at the options for achieving the transfer using the PHP Command Shell.

Firstly I tried using ODBC by creating a System DSN for the Microsoft Excel Driver, and SQL to extract the data. The inbuilt ODBC PHP functions are fairly intuitive and the script was quite simple to build. However the solution was too proprietary since I had to first use MS Query to establish the field names for the required columns and synchronising the query itself:


define("COL_DATE", "F2");
define("COL_AVG_PRESSURE", "F27");
define("COL_AVG_TEMP", "F30");

[..snip..]

$s_query = "SELECT `" . $arr_month . "$`." . COL_DATE
         . ", `" . $arr_month . "$`." . COL_AVG_TEMP . " "
         . ", `" . $arr_month . "$`." . COL_AVG_PRESSURE . " "
         . "FROM `" . $arr_month . "$` "
         . "ORDER BY `" . $arr_month . "$`." . COL_DATE;

Instead I decided to have a look at the DCOM abilities of PHP. Unfortunately documentation is sparse in the PHP Manual and across the Web generally, with only one clear example of accessing Excel that I could find. So I decided to trial and error a bit to see how much of the Excel Object Model is recognised. Based in part on the example mentioned above this was the best solution I could get to run:


$xls = new COM("Excel.sheet") or die("Did not connect");
print "Application name:{$xls->Application->value}\n";
print "Loaded Version: {$xls->Application->version}\n";
foreach($years as $year)
{
  $workbook = "C:\weather\_" . $year . ".xls";  
  $wkb = $xls->Application->Workbooks->Open($workbook) 
         or die("Failed to Open Workbook"); 
  $xls->Application->Visible = 1;
  foreach($sheets as $sheet)
  {
    $month = (array_search($sheet, $sheets) + 1);
    $ws = $wkb->Worksheets($sheet);
    $ws->activate;
    for ($day = 1, $i = First_Cell; 
         $i <= Last_Cell; $i++, $day++)
    {
      if (checkdate($month, $day, $year))
      {
        $arr_day[0] = $year . "-" 
                    . $month . "-" . $day;
        $cell = $ws->Cells($i, AD);
        $cell->activate;
        $arr_day[1] = (! empty($cell->value)) 
                    ? sprintf("%01.1f", $cell->value) 
                    : NULL;  
        $cell = $ws->Cells($i, AA);
        $cell->activate;
        $arr_day[2] = (! empty($cell->value)) 
                    ? round($cell->value) 
                    : NULL;          
        $values[] = $arr_day;
      }      
    }
  }
  $xls->Application->ActiveWorkbook->Close("False");
}
$xls->Release();
unset($xls);

This algorithm quite literally steps through each required cell in turn, by cycling through the monthly worksheets and yearly files (note the constants AA and AD define the indexes of the required columns). It is however quite inefficient, and a faster method would be to extract values as range objects using a call like $xls->Application->Range('Jan!$AA$6:$AA$36).Value, but the PHP COM library does not seem to accept this. Still, the above code works and could be manipulated and applied to future data. However, I will probably look to use Python in the future, with a broader implementation of COM via the Win32Com module and its integration with Open Office. At least I now know that MS Office documents can be manipulated by PHP.

Posted on Feb 12, 2004 at 21:26:53. [Comments for Excel To MySQL Via PHP ODBC And COM- 10]

Result Sets From Different Sources

This post does not relate to any other topics

Continuing the discussion of cleaner programming layers, one problem I find with similar sets of results from different tables, is that the API layer becomes cluttered with functions formatting similar result sets, with negligible differences. For example links to different sections of this weblog (the categories and archives). A quick solution to this is to set generic field names for the result set in the SQL query, and then using the same function to format the data in an array_walk call.


function ss_fns_build_list(&$str_item)
{
  extract($str_item);
  $str_item = sprintf("<li><a href=\"%s\"%s>%s</a></li>\n",
                      $t_url, $t_title, $t_text);
}

function ss_fns_get_data($str_query)
{  
  $var_results = ss_db_sel($str_query);
  if (PEAR::isError($var_results)) { // error trap }
  else
  {
    array_walk($var_results, 'ss_fns_build_list');
    return implode("\n", $var_results);
  }
}

The data is initially formatted in the SQL query (not shown) and assigned to generic field names (t_url, t_title, t_text). Then each item of the result set array is passed by reference to the function ss_fns_build_list where it is reformatted as a string. Then the entire array can be returned as a single string by first calling implode(). The PHP Manual states that modifying the array from inside the function may cause unpredictable behaviour, however I have not witnessed any testing the above code. A safety measure would be to pass a third parameter to the function by reference and fill that with the newly formatted data.

This solution is satisfying since it is visibly extensible. As an example, suppose the result set has groups of data (eg links to websites, categorised by region). Suppose, for this example, the grouping category has been given the generic field name t_list in the SQL query. Now an additional function can be defined at the API layer to do the additional formatting.


function ss_fns_build_list_multiple(&$arr_item, $key, &$arr_sets)
{
  $str_key = $arr_item["t_list"];
  ss_fns_build_list(&$arr_item);
  $arr_sets[$str_key][] = $arr_item;  
}

function ss_fns_get_grouped_data($str_query)
{
  $var_results = ss_db_sel($str_query);
  if (PEAR::isError($var_results)) { // error trap }
  else
  {
    array_walk($var_results,
               'ss_fns_build_list_multiple',
               &$arr_sets);
    foreach($arr_sets as $str_group_header => $arr_list_items)
    {
      $str_output .= "$str_group_header\n" 
                   . implode("\n", $arr_list_items) 
                   . "\n";
    }
    return $str_output;
  }
}

Each item of the result set is passed by reference to the function ss_fns_build_list_multiple. The category is looked up in the field t_list and then the item is formatted using the original ss_fns_build_list function. Finally it is assigned to the correct category in the third parameter, $arr_sets, which is also passed by reference. All that is left to do is to loop through the key, value pairs to produce the return data.

I have found these generic functions can be effectively applied to a number of scenarios, thus reducing the volume of code in the API. One further development would be to pass the location of a template as the third parameter in array_walk and hence remove the HTML from the layer completely, allowing more flexibility.

UPDATE - 2004-02-25 I have recently switched off the allow_call_time_pass_reference configuration. The function calls in ss_fns_build_multiple_list and ss_fns_get_grouped_data should not declare the variables $arr_item and $arr_sets, respectively, be passed by reference (this is left to the function being called). Under the new configuration, these call-time pass-by-references produce warnings. Thanks to the PHP Anthology for pointing this out to me.

Posted on Jan 30, 2004 at 00:46:40. [Comments for Result Sets From Different Sources- 0]

Selecting Next And Previous

Also relates to MySQL

Starting to get things sorted with the new look business site now, so time to catch up on some blogging. I am going to start off by looking at the common scenario of creating next and previous links for a data result set.

A typical application of the next and previous links is in a News Archive. These links offer intrinsic usability benefits presenting the user with less steps to view articles (eg they do not have to jump back to an index to find the link for the next article). However, selecting neighbouring items from a MySQL database is not that simple.

The obvious solution is to select the item with a unique key one less/more than the current record. But this has two drawbacks. Firstly it requires integers as unique keys, which is not always the case. It also assumes that there is a continuous sequence of unique keys with no breaks. This is rarely the case in a content management system, where some items might be deleted.

For this example I am going to assume that the unique key is not integer based, and instead focus on the timestamp at which each entry was added. So rather than adding or subtracting directly from a unique key, all that is required are a couple of comparative selection queries, having firstly established the timestamp for the current record.


SELECT @now := added 
       FROM table 
       WHERE uid = {x};
SELECT @prev := uid 
       FROM table 
       WHERE added < @now 
       ORDER by added DESC 
       LIMIT 0, 1;
SELECT @next := uid 
       FROM table 
       WHERE added > @now 
       ORDER by added ASC 
       LIMIT 0, 1;

This in itself is one solution, if, rather than creating SQL variables, the required data was just extracted directly in the second and third statements. But this will require two separate calls to the database abstraction layer from the API, with two sets of result data to manipulate. The code can be made more manageable at each layer if this is reduced to a single call. Having acquired the unique ID of the next and previous item, it is merely a case of collecting the data.


SELECT IF(uid = @prev,'p','n') as position, uid, title 
       FROM table 
       WHERE (uid = @prev) OR (uid = @next);

In this statement I have created an additional result field position which will be set to p or n for the previous and next records respectively. This is because I generally use PEAR::DB as my database abstraction with a set of custom functions. In this scenario, the first three statements can be executed in the standard way, $dbh->query($query), and the final statement called via $dbh->getAssoc($query). This will return an associative array of a maximum two items with keys of either n or p and the corresponding data.

Of course there is always more than one way of doing things, and I feel the logic of this alternative approach quite elegant, however, with limited permissions on my current servers, I can only test this on my local machine.


SELECT @row := 0;
DROP TABLE IF EXISTS temptable;
CREATE TEMPORARY TABLE temptable 
       SELECT @row := @row + 1 as rownum, table.* FROM table
       ORDER BY added DESC;
SELECT @current := rownum 
       FROM temptable 
       WHERE uid = {x};
SELECT IF(rownum < @current, 'n', 'p') as position, uid, title
       FROM temptable 
       WHERE (rownum = @current - 1) OR (rownum = @current + 1)

The code utilises the CREATE TEMPORARY TABLE statement to create a replica of the data that is being examined, and adding an additional field to the temporary table to act as a unique integer identifier (rownum). This resolves the problem outlined initially regarding the datatype of unique indexes, and sequence gaps.

There are no doubt copious other solutions to this problem. Of course an obvious choice is to do the work on the API layer. For example, the entire dataset could be collected and then PHP's array functions used to manipulate the dataset and extract the required information. However, I have found in the past this can get quite kludgey! Doing the work at the Database layer itself does not (appear to) harm the performance of the database, and helps keep all layers in the application that bit purer.

Posted on Jan 30, 2004 at 00:45:22. [Comments for Selecting Next And Previous- 4]

PHP Session Management

Also relates to Apache

I have rarely been content with session management in PHP. Since the web is stateless, a session must be maintained on a login site. The optimal way to achieve this is to place a session cookie, in the form of a session ID (a 32 byte alpha-numeric string), on the client. When the client has cookies disabled PHP appends the session ID to the end of each URI on the page. This latter practice comes with inherent complications. Server redirects must have the session ID hard coded into the source, which I sometimes find complicated to manage in complex validation and redirection scripts over multiple pages. Security Focus also highlights the major vulnerability of passing the session ID in the URI for PHP versions prior to 4.3.2.

When testing with cookies disabled I have also found browsing Back a number of pages can inadvertently destroy the session. If not coded carefully, the outcome can be a blank page, leaving the user confused and likely to just leave the site. My opinion is that the safest and most usable solution is to only allow the login process to proceed if cookies are enabled. If they are not a message is presented to the user with recommendations and advice on enabling session cookies and the implications of doing so.

So, the challenge is creating a user friendly process, where the user is informed immediately if they are unable to log in, and directions are given on enabling session cookies. The problem is that when a page first loads, the cookie is only passed to the client, so there is no way of knowing if the cookie has actually been set, and hence whether cookies are enabled. The following code snippet resolves this.


if (! isset($_COOKIE["PHPSESSID"]) && ! isset($_GET["PHPSESSID"]))
{  
  $sess_id = substr(SID,(strrpos(SID,"=")+1));
  $redirect_url = CMS . "init.session_" . $sess_id;
  header("Location: " . $redirect_url);
  exit;
}
elseif (! isset($_COOKIE["PHPSESSID"]) && isset($_GET["PHPSESSID"]))
{  
  $cookies_disabled = true;
}
else
{
  $cookie_disabled = false;
}

First the code tests for the cookie, and if it does not exists it looks for the PHPSESSID variable in the global $_GET array. If the tests fails, then the user has arrived at the page for the first time so it is not possible to tell if cookies are enabled or not. So, a server redirect is done to the same page (here defined by CMS in the initialisation file) with a custom string appended to the end. A custom string is used to enhance security, and the Apache mod_rewrite module is used to rebuild the URI as follows:


RewriteEngine On
RewriteRule init.session_([a-z0-9]{32})$ /cms/?PHPSESSID=$1

When the page reloads it runs the same test again. This time it passes the second condition and will only fail on the first condition if cookies are disabled. At this point a notice can be displayed to the user advising them that cookies must be enabled to actually log-in. To enhance the usability, if $cookies_enabled $cookies_disabled is true, a class is added to the login form, which overrides the normal CSS declarations, disabling the input fields and striking out corresponding labels. To aid users of assistive technologies, a title is also appended to each label stating that fields are disabled because cookies are required.

Perhaps this is common practice, but I recall in the past having trouble finding useful information for PHP session management best practices, and this seems to be an effective approach. The whole process is transparent to the user, and incorporating mod_rewrite to cloak the session id raises the level of security for the initial server redirect in which it is passed, as well as replacing the aesthetically displeasing URI string:

http://mysite/login/?PHPSESSID=sessionid

with the more comprehendable:

http://mysite/login/init.session,sessionid

If the initialisation variable session.name is changed to something other than PHPSESSID (or session_name(string) is called) then the URI becomes much harder to hack.

Posted on Nov 10, 2003 at 02:19:13. [Comments for PHP Session Management- 6]

Breadcrumbs Trail

[ Home ] -> TW Blog -> PHP
Site Map

The Severn Solutions website achieves the following standards:

[ XHTML 1.0 ] [ CSS 2 ] [ WAI AA ] [ Bobby AA ]

Page compiled in 0.013 seconds