Reason is the slave of the passionsDavid Hume
Also relates to MySQL and PEAR
I have been updating some of my PEAR classes and stumbled upon the DB_DataObject package. I have actually been trying to implement a data modelling layer within my PHP CMS project for some time, and this package is a comprehensive collection of methods to achieve just that abstraction. Configuration was relatively painless and with a set of data objects ready to go I have been testing their application in the CMS API.
In the original interface, I was parsing initialisation files (parse_ini_file)
to load the table schema from which the appropriate form or table model could be
constructed utilising a mix of PEAR's HTML packages and custom classes. The main problem with
this was that changes in the MySQL table schema had to be manually updated in
the ini file.
When a set of data objects are created with DB_DataObject::createTables(), an ini
file is also created that records the database schema. This file is parsed internally
when a new data object is instantiated for a particular entity. The table schema is
instantly available by calling DB_DataObject::table().
$obj =& DB_DataObject::factory('data_object');
$dbo_schema = $obj->table();
echo "<pre>";
foreach($dbo_schema as $fld_name => $fld_type)
{
printf("%-20s", $fld_name);
if ($fld_type & DB_DATAOBJECT_INT) echo "int ";
if ($fld_type & DB_DATAOBJECT_STR) echo "str ";
if ($fld_type & DB_DATAOBJECT_DATE) echo "date ";
if ($fld_type & DB_DATAOBJECT_TIME) echo "time ";
if ($fld_type & DB_DATAOBJECT_BOOL) echo "bool ";
if ($fld_type & DB_DATAOBJECT_TXT) echo "txt ";
if ($fld_type & DB_DATAOBJECT_BLOB) echo "blob ";
if ($fld_type & DB_DATAOBJECT_NOTNULL) echo "not null ";
if ($fld_type & DB_DATAOBJECT_MYSQLTIMESTAMP)
echo "timestamp ";
echo "\n";
}
echo "</pre>";
The returned schema contains key/value pairs where the key is the field (attribute) name,
and the value is the datatype(s) as defined when the ini file is created.
The datatype value is conditionally tested against the named constants with the bitwise & operator to determine the
actual types defined. This works fine for a table model, where the schema fields and a generalised data type is
all I require [the table model uses DOM scripting to allow column reordering based on general data types]. However,
for the form model, I need to know the maximum allowed size for fields, default values and other metadata to
compose accessible and usable form controls. For this the obvious solution is to fall back to the MySQL layer itself, and run a "SHOW COLUMNS" SQL query.
A PEAR::DB object is already a persistant property of the instantiated data object, so if this can be retrieved a standard DB::getAll() query can be run.
$db =& $obj->getDatabaseConnection();
$mysql_schema = $db->getAll("SHOW COLUMNS FROM "
. $obj->tableName());
echo "<pre>";
foreach($mysql_schema as $field)
{
printf("%-8s %-25s %-10s %-6s %-20s\n",
$field[0], $field[1],
(empty($field[2]) ? "[ NOT NULL ]" : ""),
$field[3], $field[4]);
}
echo "</pre>";
DB_DataObject::getDatabaseConnection() returns a reference to the PEAR::DB object. To keep the code snippet general
DB_DataObject::tableName() is called to retrieve the table name within the SQL query. Example output of this code is:
id smallint(5) unsigned [ NOT NULL ] PRI name varchar(80) [ NOT NULL ] url varchar(100) MUL version float(3,1) [ NOT NULL ] 1.0 amended timestamp(14) added timestamp(14)
This provides all the data I need to create a corresponding set of form controls. If the CMS model is amending
an entity, loading the data is just a case of calling $obj->get(id), and the data
can be updated with $obj->update() once it has been passed through a data validator, and the new values assigned to
the object's properties.
Next is to put the SQL builder aspect of DB_DataObject to test with some of the more complex triple join queries currently in the
CMS model…
Posted on Feb 28, 2004 at 00:52:09. [Comments for MySQL Schema Via PEAR::DB_DataObject- 1]
As a keen statistician with a fascination for weather variation over time (in particular in relation to equilibrium tidal variations) I have acquired comprehensive weather data over the past few years from my digital weather station. All the archive pressure and temperature readings are tucked away in Excel spreadsheets - this year I have actually moved over to Open Office as my spreadsheet of choice. I finally decided it was time to move some of this data into a MySQL database on my server, where I could then manipulate the data for display on the web.
To start with I have decided to keep things simple, storing average daily pressure and temperature for graphical display through the GD library. In part for experimentation, I had a look last weekend at the options for achieving the transfer using the PHP Command Shell.
Firstly I tried using ODBC by creating a System DSN for the Microsoft Excel Driver, and SQL to extract the data. The inbuilt ODBC PHP functions are fairly intuitive and the script was quite simple to build. However the solution was too proprietary since I had to first use MS Query to establish the field names for the required columns and synchronising the query itself:
define("COL_DATE", "F2");
define("COL_AVG_PRESSURE", "F27");
define("COL_AVG_TEMP", "F30");
[..snip..]
$s_query = "SELECT `" . $arr_month . "$`." . COL_DATE
. ", `" . $arr_month . "$`." . COL_AVG_TEMP . " "
. ", `" . $arr_month . "$`." . COL_AVG_PRESSURE . " "
. "FROM `" . $arr_month . "$` "
. "ORDER BY `" . $arr_month . "$`." . COL_DATE;
Instead I decided to have a look at the DCOM abilities of PHP. Unfortunately documentation is sparse in the PHP Manual and across the Web generally, with only one clear example of accessing Excel that I could find. So I decided to trial and error a bit to see how much of the Excel Object Model is recognised. Based in part on the example mentioned above this was the best solution I could get to run:
$xls = new COM("Excel.sheet") or die("Did not connect");
print "Application name:{$xls->Application->value}\n";
print "Loaded Version: {$xls->Application->version}\n";
foreach($years as $year)
{
$workbook = "C:\weather\_" . $year . ".xls";
$wkb = $xls->Application->Workbooks->Open($workbook)
or die("Failed to Open Workbook");
$xls->Application->Visible = 1;
foreach($sheets as $sheet)
{
$month = (array_search($sheet, $sheets) + 1);
$ws = $wkb->Worksheets($sheet);
$ws->activate;
for ($day = 1, $i = First_Cell;
$i <= Last_Cell; $i++, $day++)
{
if (checkdate($month, $day, $year))
{
$arr_day[0] = $year . "-"
. $month . "-" . $day;
$cell = $ws->Cells($i, AD);
$cell->activate;
$arr_day[1] = (! empty($cell->value))
? sprintf("%01.1f", $cell->value)
: NULL;
$cell = $ws->Cells($i, AA);
$cell->activate;
$arr_day[2] = (! empty($cell->value))
? round($cell->value)
: NULL;
$values[] = $arr_day;
}
}
}
$xls->Application->ActiveWorkbook->Close("False");
}
$xls->Release();
unset($xls);
This algorithm quite literally steps through each required cell in turn, by cycling
through the monthly worksheets and yearly files (note the constants AA
and AD define the indexes of the required columns). It is however quite inefficient, and
a faster method would be to extract values as range objects using a call like
$xls->Application->Range('Jan!$AA$6:$AA$36).Value, but the PHP
COM library does not seem to accept this.
Still, the above code works and could be manipulated and applied
to future data. However, I will probably look to use Python in the future, with a broader
implementation of COM via the Win32Com module and its integration with Open Office. At
least I now know that MS Office documents can be manipulated by PHP.
Posted on Feb 12, 2004 at 21:26:53. [Comments for Excel To MySQL Via PHP ODBC And COM- 10]
This post does not relate to any other topics
Continuing the discussion of cleaner programming layers, one problem I find with similar sets of results from different tables,
is that the API layer becomes cluttered with functions formatting similar
result sets, with negligible differences. For example links to different sections of this weblog (the categories and archives).
A quick solution to this is to set generic field names for the result set in the SQL query, and then using the same function to format
the data in an array_walk call.
function ss_fns_build_list(&$str_item)
{
extract($str_item);
$str_item = sprintf("<li><a href=\"%s\"%s>%s</a></li>\n",
$t_url, $t_title, $t_text);
}
function ss_fns_get_data($str_query)
{
$var_results = ss_db_sel($str_query);
if (PEAR::isError($var_results)) { // error trap }
else
{
array_walk($var_results, 'ss_fns_build_list');
return implode("\n", $var_results);
}
}
The data is initially formatted in the SQL query (not shown) and assigned to generic field names
(t_url, t_title, t_text). Then each item of the result set array is passed by reference to the function ss_fns_build_list
where it is reformatted as a string. Then the entire array can be returned as a single string by first calling implode().
The PHP Manual states that modifying the array from inside the function may cause unpredictable behaviour, however I have
not witnessed any testing the above code. A safety measure would be to pass a third parameter to the function by reference
and fill that with the newly formatted data.
This solution is satisfying since it is visibly extensible. As an example, suppose the result set has groups of data (eg links to websites, categorised by region). Suppose, for this example, the grouping category has been given the generic field name t_list in the SQL query. Now an additional function can be defined at the API layer to do the additional formatting.
function ss_fns_build_list_multiple(&$arr_item, $key, &$arr_sets)
{
$str_key = $arr_item["t_list"];
ss_fns_build_list(&$arr_item);
$arr_sets[$str_key][] = $arr_item;
}
function ss_fns_get_grouped_data($str_query)
{
$var_results = ss_db_sel($str_query);
if (PEAR::isError($var_results)) { // error trap }
else
{
array_walk($var_results,
'ss_fns_build_list_multiple',
&$arr_sets);
foreach($arr_sets as $str_group_header => $arr_list_items)
{
$str_output .= "$str_group_header\n"
. implode("\n", $arr_list_items)
. "\n";
}
return $str_output;
}
}
Each item of the result set is passed by reference to the function
ss_fns_build_list_multiple. The category is looked up
in the field t_list and then the item is formatted using
the original ss_fns_build_list function. Finally it is
assigned to the correct category in the third parameter, $arr_sets,
which is also passed by reference. All that is left to do is to loop
through the key, value pairs to produce the return data.
I have found these generic functions can be effectively applied to a number of
scenarios, thus reducing the volume of code in the API.
One further development would be to pass the location of a template as the third parameter in
array_walk and hence remove the HTML from the layer completely, allowing
more flexibility.
UPDATE - 2004-02-25 I have recently switched off the allow_call_time_pass_reference configuration. The function calls in ss_fns_build_multiple_list and ss_fns_get_grouped_data should not declare the variables $arr_item and $arr_sets, respectively, be passed by reference (this is left to the function being called). Under the new configuration, these call-time pass-by-references produce warnings. Thanks to the PHP Anthology for pointing this out to me.
Posted on Jan 30, 2004 at 00:46:40. [Comments for Result Sets From Different Sources- 0]
Also relates to MySQL
Starting to get things sorted with the new look business site now, so time to catch up on some blogging. I am going to start off by looking at the common scenario of creating next and previous links for a data result set.
A typical application of the next and previous links is in a News Archive. These links offer intrinsic usability benefits presenting the user with less steps to view articles (eg they do not have to jump back to an index to find the link for the next article). However, selecting neighbouring items from a MySQL database is not that simple.
The obvious solution is to select the item with a unique key one less/more than the current record. But this has two drawbacks. Firstly it requires integers as unique keys, which is not always the case. It also assumes that there is a continuous sequence of unique keys with no breaks. This is rarely the case in a content management system, where some items might be deleted.
For this example I am going to assume that the unique key is not integer based, and instead focus on the timestamp at which each entry was added. So rather than adding or subtracting directly from a unique key, all that is required are a couple of comparative selection queries, having firstly established the timestamp for the current record.
SELECT @now := added
FROM table
WHERE uid = {x};
SELECT @prev := uid
FROM table
WHERE added < @now
ORDER by added DESC
LIMIT 0, 1;
SELECT @next := uid
FROM table
WHERE added > @now
ORDER by added ASC
LIMIT 0, 1;
This in itself is one solution, if, rather than creating SQL variables, the required data was just extracted directly in the second and third statements. But this will require two separate calls to the database abstraction layer from the API, with two sets of result data to manipulate. The code can be made more manageable at each layer if this is reduced to a single call. Having acquired the unique ID of the next and previous item, it is merely a case of collecting the data.
SELECT IF(uid = @prev,'p','n') as position, uid, title
FROM table
WHERE (uid = @prev) OR (uid = @next);
In this statement I have created an additional result field position which will be set to p or n
for the previous and next records respectively. This is because I generally use PEAR::DB as my database abstraction with a set
of custom functions. In this scenario, the first three statements can be executed in the standard way, $dbh->query($query),
and the final statement called via $dbh->getAssoc($query). This will return an associative array of a maximum two items
with keys of either n or p and the corresponding data.
Of course there is always more than one way of doing things, and I feel the logic of this alternative approach quite elegant, however, with limited permissions on my current servers, I can only test this on my local machine.
SELECT @row := 0;
DROP TABLE IF EXISTS temptable;
CREATE TEMPORARY TABLE temptable
SELECT @row := @row + 1 as rownum, table.* FROM table
ORDER BY added DESC;
SELECT @current := rownum
FROM temptable
WHERE uid = {x};
SELECT IF(rownum < @current, 'n', 'p') as position, uid, title
FROM temptable
WHERE (rownum = @current - 1) OR (rownum = @current + 1)
The code utilises the CREATE TEMPORARY TABLE statement to create a replica of the data that is being
examined, and adding an additional field to the temporary table to act as a unique integer identifier (rownum).
This resolves the problem outlined initially regarding the datatype of unique indexes, and sequence gaps.
There are no doubt copious other solutions to this problem. Of course an obvious choice is to do the work on the API layer. For example, the entire dataset could be collected and then PHP's array functions used to manipulate the dataset and extract the required information. However, I have found in the past this can get quite kludgey! Doing the work at the Database layer itself does not (appear to) harm the performance of the database, and helps keep all layers in the application that bit purer.
Posted on Jan 30, 2004 at 00:45:22. [Comments for Selecting Next And Previous- 4]
Also relates to Apache
I have rarely been content with session management in PHP. Since the web is stateless, a session must be maintained on a login site. The optimal way to achieve this is to place a session cookie, in the form of a session ID (a 32 byte alpha-numeric string), on the client. When the client has cookies disabled PHP appends the session ID to the end of each URI on the page. This latter practice comes with inherent complications. Server redirects must have the session ID hard coded into the source, which I sometimes find complicated to manage in complex validation and redirection scripts over multiple pages. Security Focus also highlights the major vulnerability of passing the session ID in the URI for PHP versions prior to 4.3.2.
When testing with cookies disabled I have also found browsing Back a number of pages can inadvertently destroy the session. If not coded carefully, the outcome can be a blank page, leaving the user confused and likely to just leave the site. My opinion is that the safest and most usable solution is to only allow the login process to proceed if cookies are enabled. If they are not a message is presented to the user with recommendations and advice on enabling session cookies and the implications of doing so.
So, the challenge is creating a user friendly process, where the user is informed immediately if they are unable to log in, and directions are given on enabling session cookies. The problem is that when a page first loads, the cookie is only passed to the client, so there is no way of knowing if the cookie has actually been set, and hence whether cookies are enabled. The following code snippet resolves this.
if (! isset($_COOKIE["PHPSESSID"]) && ! isset($_GET["PHPSESSID"]))
{
$sess_id = substr(SID,(strrpos(SID,"=")+1));
$redirect_url = CMS . "init.session_" . $sess_id;
header("Location: " . $redirect_url);
exit;
}
elseif (! isset($_COOKIE["PHPSESSID"]) && isset($_GET["PHPSESSID"]))
{
$cookies_disabled = true;
}
else
{
$cookie_disabled = false;
}
First the code tests for the cookie, and if it does not exists it looks for the PHPSESSID variable
in the global $_GET array. If the tests fails, then the user has arrived at the page for the first time so it is not
possible to tell if cookies are enabled or not. So, a server redirect is done to the same page (here defined by CMS in the initialisation file) with a custom string appended to the
end. A custom string is used to enhance security, and the Apache mod_rewrite module is used to rebuild the URI as follows:
RewriteEngine On
RewriteRule init.session_([a-z0-9]{32})$ /cms/?PHPSESSID=$1
When the page reloads it runs the same test again. This time it passes the second condition and will only fail
on the first condition if cookies are disabled. At this point a notice can be displayed to the user advising them that cookies
must be enabled to actually log-in. To enhance the usability, if $cookies_enabled $cookies_disabled is true, a class is added to
the login form, which overrides the normal CSS declarations, disabling the input
fields and striking out corresponding labels. To aid users of assistive technologies, a title is also appended to each label stating that
fields are disabled because cookies are required.
Perhaps this is common practice, but I recall in the past having trouble finding useful information for
PHP session management best practices, and this seems to be an effective approach. The whole process is transparent to the user,
and incorporating mod_rewrite to cloak the session id raises the level of security for the initial server redirect in which it is passed, as well as replacing the aesthetically displeasing URI string:
http://mysite/login/?PHPSESSID=sessionid
with the more comprehendable:
http://mysite/login/init.session,sessionid
If the initialisation variable session.name is changed to something other than PHPSESSID (or session_name(string) is called) then
the URI becomes much harder to hack.
Posted on Nov 10, 2003 at 02:19:13. [Comments for PHP Session Management- 6]