db4free.net blog: November 2005

Wednesday, November 30, 2005

New links online

I have added a few links here in my weblog and at the Developer's page on db4free.net. This Developer's page was initially planned to end up as something like MySQLDevelopment.com - but as there is such a variety of great sites (what this page never could have competed with), I changed this plan to offer links to the best other resources, instead.

Many people who are familiar with the various resources about MySQL development will probably know most of the links, but for those who are not (yet), here are my personal recommandations:

MySQLDevelopment.com (a great place to get started with the new features of MySQL 5!)

Andrew Gilfrin's blog

Roland Bouman's blog

Beat Vontobel's blog

Giuseppe Maxia's blog

MySQL General Purpose Stored Routines Library

Jay Pipes' blog

Besides the official resources at mysql.com and planetmysql.org (where many of the articles written in the mentioned weblogs can be found, too), these are the best sites about MySQL development that I know!

Tuesday, November 29, 2005

Thanks, congratulations and continue the good work!

This blog post has moved. Please find it at:

http://www.mpopp.net/2005/11/thanks-congratulations-and-continue-the-good-work/.

Friday, November 25, 2005

PHP 5.1 released

Today (24th November), PHP version 5.1.0 has been released. Find a list of the new features, which include that PDO extension is now enabled by default, on the PHP website. The PDO extension provides a common interface to communicate to different database systems.

Also read the Changelog and the Upgrading Guide to get detailed information about what's new.

Tuesday, November 22, 2005

Many thanks ...

... to Shane Bester for finding a repeatable testcase for the bug report #14829, where I asked for help early this morning. The bug has now been acknowledged and it is likely that the bug will be fixed soon :-).

This showed me once more how great this Community works. If there is a problem that I can't solve myself, it really works to ask for help and get some assistance. That's one of the points that makes MySQL more attractive than many other products.

Monday, November 21, 2005

TX isolation levels (2) READ COMMITTED

This blog post has moved. Please find it at:

http://www.mpopp.net/2005/11/tx-isolation-levels-2-read-committed/.

Call for help to make a bug repeatable

I have found a bug which disallows me to log in with a user, after I have created and deleted another user. After restarting the server, I can log in again as if there never was a problem. I have reported this bug to

http://bugs.mysql.com/bug.php?id=14829

I have tried to create a repeatable case and I made it to create a situation, where I could reproduce this bug on 3 different machines. Unfortunately, when Miguel Solorzano tries to repeat this, it still works - although we have already checked out that everything is completely identical.

So I'm asking for help to find out more details to reproduce this. If you have a Windows (2000 or XP) test server available that you can play around with, please try the following:

* Install MySQL 5.0.15, if possible to c:\mysql
* download and install this my.ini file
* download the dump file from dump_mysql.sql and put it to C:
Log in as root (without password) and delete the test database and the contents from the mysql.db table:

drop database test;
delete from mysql.db;
flush privileges;

Then do the following steps, starting from C:
C:\>mysql -u root mysql < dump_mysql.sql

C:\>mysqladmin -u root flush-privileges

C:\>mysql -u root
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7 to server version: 5.0.15-nt-max

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create user user@localhost;
Query OK, 0 rows affected (0.00 sec)

mysql> drop user user@localhost;
Query OK, 0 rows affected (0.00 sec)

mysql> exit
Bye

C:\>mysql -u mpopp

... if everything works "fine" (to say, you can reproduce the bug), you should get following message:

ERROR 1130 (HY000): #HY000Host 'localhost' is not allowed to connect to this
MySQL server

After restarting the server, you should be able to log in as user mpopp.

If you find any clue why this bug still doesn't occur on all installations (under the given conditions), please add a comment to the bug report. Please let me know, if you can or cannot reproduce it and mail me at m.popp@db4free.net. Thanks to all who help :-).

Saturday, November 19, 2005

TX isolation levels (1) READ UNCOMMITTED

This blog post has moved. Please find it at:

http://www.mpopp.net/2005/11/tx-isolation-levels-1-read-uncommitted/.

Tuesday, November 15, 2005

db4free optimizations

Today I did some server optimizations on db4free.net, which has recently welcomed its 1,500th user. As I already wrote earlier in my weblog, I'm currently preparing for the MySQL professional exam, and I could use the new skills from learning for this exams immediately for re-configuring the db4free.net server to offer better performence to its users. I enabled the query cache, extended the buffer limits, allowed more concurrent users and enlarged the max_allowed_packet value to make it possible to import larger dump files.

But there are more benefits from learning for MySQL certification exams than that. For example, before I started, I used to create tables and users, or change user's permissions with 3rd party tools like phpMyAdmin. I knew the fundamental statements how to do it with SQL, but I didn't know the details good enough to really use it. Today, I feel even more (!) comfortable to do it in MySQL monitor using SQL than with any graphical tool. So this proofs that the exams helped me a lot to become much better in writing SQL and to be independent of any 3rd party software.

Stay steady as one of the next articles in my weblog will be to show, how the different transaction isolation levels work in practice and how a deadlock can be created artificially. Very interesting stuff, indeed!

But now back to db4free. As already mentioned, db4free has hit the 1,500 users mark. Time for some statistics. As of today, there are 1,575 users from 60 different countries. The website has been visited 9,415 times from people from 91 different countries. Most visitors (and also users) come from Germany, followed by China, USA, Taiwan, Poland, Austria and Hongkong (as it's easy to see, db4free is very popular in the Far East region). After Germany (339), most users come from Poland (259), China (192), Taiwan (166), Hongkong (160) and the USA (98).

Information schema allowed me to get some more interesting statistical data. The 1,575 users have created 22,445 tables, from which are 21,894 MyISAM tables, 470 InnoDB tables, 80 MEMORY tables and (as db4free is offering the Max version of MySQL 5.0.15) 1 BerkeleyDB table. There are 13 views, 23 Stored Procedures and 28 User Defined Functions.

Friday, November 11, 2005

Using a SP to split up a table

Lets assume that we have one big table with forum posts (called forum_posts) from quite a long period. We want to split this big table up in several smaller tables for every month. The date is stored in a datetime column called dt.

Here's a stored procedure that splits the table up in smaller tables called fp_YYMM which are identical in its structure as the original table. It uses a cursor to walk through every month, create the new table (with a CREATE TABLE ... LIKE statement) and copying the correct data into the archive tables. The original table will persist.

The procedure also uses prepared statements to dynamically build up the SQL statements for creating the new tables and copying the contents.

DELIMITER //

DROP PROCEDURE IF EXISTS splitUpTables //
CREATE PROCEDURE splitUpTables ()
BEGIN
  declare finished int default 0;
  declare month char(7);
  declare month2 char(6);
  declare dfrom date;
  declare dto date;

  declare curCreate cursor for 
     SELECT distinct date_format(dt, '%Y-%m') as month 
     FROM forum_ posts order by date_format(dt, '%Y-%m');
  declare continue handler for not found set finished := 1;

  open curCreate;

  fetch curCreate into month;

  while finished = 0 do

    select concat(substring(month, 1, 4), substring(month, 6, 2)) 
       into month2;

    select month2;

    set @sql := concat('create table if not exists fp_', month2, 
                       ' like forum_posts ');
    prepare pCreate from @sql;
    execute pCreate;

    set dfrom := cast(concat(month, '-01') AS date);
    select date_sub(date_add(dfrom, interval 1 month), interval 1 day) 
       into dto;

    set @sql1 := concat('truncate table fb_', month2);
    set @sql2 := concat('insert into fp_', month2, 
       ' select * from forum_posts where dt between \'', 
       dfrom, '\' and \'', dto, '\' order by dt');

    prepare pTruncate from @sql1;
    execute pTruncate;
    prepare pInsert from @sql2;
    execute pInsert;

    fetch curCreate into month;

  end while;

  close curCreate;

END //

DELIMITER ;

Thursday, November 10, 2005

Calculate uniqueness of text fields for setting indexes

Indexing long text fields can be more consuming than helpful. Indexes have to be maintained by the server. So if you have an indexed 255 character long varchar field, it probably means more work for the server than the index can help you.

Fortunately, it's easy to limit the index length to a specified number of characters. We can do this with e.g. a KEY(field(15)) clause when adding the index with a CREATE or ALTER TABLE or CREATE INDEX statement. In this case, only the first 15 characters would be indexed.

The question is - how long should the indexed length be? As a rule of thumb we could say, we should choose a length that's just enough to make most of the indexed contents unique. Maybe we won't reach 100 % of uniqueness, but that's not necessary. It will probably be too little to index only the first one or two characters, because they will be the same for many records. So the next question is - what's a good value?

To make the decision a little easier, we can calculate the uniqueness with following formula:

SELECT count(distinct left(field_name, indexed_length)) / 
   count(*) * 100 FROM table_name;

The higher the value, the more unique the indexed contents are.

So I thought to myself, why not put this into a stored procedure where we pass the table name, the column name and two boundaries from which value to which other value we'd like to see the percentage of uniqueness. Here's how I did it: I created a temporary table inside the procedure, prepared the insert into [temporary table] select ... statement and queried the values inside a while loop from the lower bound to the upper bound. Here it is:

DELIMITER $$

DROP PROCEDURE IF EXISTS getUniqueness $$
CREATE PROCEDURE getUniqueness (IN _table VARCHAR(255),
   IN _column VARCHAR(255), IN _noFrom INT, IN _noTo INT)
BEGIN

  create temporary table tt_uniqueness
     (noChar int unsigned not null, uniqueness decimal(10,2) not null);

  set @sql := concat('insert into tt_uniqueness ',
    'select ?, cast(count(distinct left(',
    _column,
    ', ?)) / count(*) * 100 as decimal(10,2)) from ',
    _table);

  prepare pSql from @sql;

  set @count := _noFrom;

  while @count <= _noTo do

    execute pSql using @count, @count;

    set @count := @count + 1;

  end while;

  select * from tt_uniqueness;

  drop temporary table if exists tt_uniqueness;

END$$

DELIMITER ;

Finally, lets take a look how it works. I try it on the phpbb_posts_text table from a phpBB database which holds 340 entries:

mysql> explain phpbb_posts_text;
+--------------+-----------------------+------+-----+---------+-------+
| Field        | Type                  | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+-------+
| post_id      | mediumint(8) unsigned | NO   | PRI | 0       |       |
| bbcode_uid   | varchar(10)           | NO   |     |         |       |
| post_subject | varchar(60)           | YES  |     | NULL    |       |
| post_text    | text                  | YES  |     | NULL    |       |
+--------------+-----------------------+------+-----+---------+-------+
4 rows in set (0.00 sec)

mysql> select count(*) from phpbb_posts_text;
+----------+
| count(*) |
+----------+
|      340 |
+----------+
1 row in set (0.00 sec)

mysql> call getUniqueness('phpbb_posts_text', 'post_text', 1, 10);
+--------+------------+
| noChar | uniqueness |
+--------+------------+
|      1 |       7.94 |
|      2 |      27.65 |
|      3 |      41.47 |
|      4 |      51.18 |
|      5 |      59.41 |
|      6 |      67.94 |
|      7 |      71.47 |
|      8 |      78.82 |
|      9 |      84.12 |
|     10 |      85.00 |
+--------+------------+
10 rows in set (0.03 sec)

Query OK, 0 rows affected (0.03 sec)

mysql>

So we can see, how the uniqueness rises as we choose longer index lengths. If we for example decide that we want to have a uniqueness of 80 % for the index, it would be fine to select 9 characters as the index length.

Monday, November 07, 2005

ApacheLog2MySQL

I have written a little Java program that parses an Apache Log file and stores its content in a MySQL database.

The program is published under the General Public License and can be downloaded from http://apachelog2mysql.mpopp.net/ApacheLog2MySQL.jar. You can download the source code at http://apachelog2mysql.mpopp.net/ApacheLog2MySQL.zip and find the ReadMe file at http://apachelog2mysql.mpopp.net/readme.txt. In this ReadMe file you will find all details about the limitations and the features that are planned in the future.

You can also find all the information on my website at http://www.mpopp.net/apachelog2mysql.php.

The program runs at all platforms that have the Java Runtime Environment 1.5.0 or higher installed. It can be started from the console with following command:

java -jar ApacheLog2MySQL.jar [filename] [host] [user]
   [password] [database]

The program is very basic at the moment, but I plan to extend its features, so it might become a great little tool to analyze the Apache log. Please refer to the ReadMe file to get more detailed information about where the program is still limited and which features I plan to implement.

Feel free to try the program, modify it, give me feedback and let me know your thoughts about this little tool.

Friday, November 04, 2005

Compiling PHP 5 with mysql and mysqli

This blog post has moved. Please find it at:

http://www.mpopp.net/2005/11/compiling-php-5-with-mysql-and-mysqli/.

My first look at 5.1 ...

It was a great thing when MySQL 5.0 came out, but what's next? Well, of course, the stabilization process of MySQL 5.0 continues, but the work for the next release - 5.1 - has begun quite a while ago.

So, up to now I've always been using the official binaries from MySQL. That's the best thing to do for running MySQL servers in production. But why not take a look a bit further to see what's going on beyond the public releases?

So I looked into MySQL's documentation to read about Installing from the Development Source Tree. I did all the points step by step and now I've ended up with my first MySQL 5.1 server:

These are very interesting days in terms of MySQL - two weeks ago the first Production Release of MySQL 5.0 and now my first personal MySQL 5.1 installation. It's always very exciting to have a first look at a new milestone.

db4free.net blog