Wednesday, April 12, 2006

User XYZ != User XYZ

It's nothing new to me and probably nothing new to most folks who work with MySQL on a regular basis that the username alone does not identify a MySQL user, but a combination of username@host.

But if you forget about it for a moment, you can easily begin to wonder about special behaviors. That just happened to me, so I just thought that I have discovered a very evil bug - but looking twice, it became quite clear to me why MySQL behaved this way.

I have two computers, both with MySQL servers, both with users called mpopp - but each server has two of them - one is mpopp@localhost and the other is mpopp@10.0.0.1__ (both with the same permissions), so it allows me to access user mpopp from each of my PCs. I created a little stored procedure on one computer and then copied this procedure to the other one - together with the new DEFINER option that's new in MySQL 5.0.20. But I didn't think about it (and didn't look carefully) and created the new procedure with the option DEFINER=mpopp@'10.0.0.1__' (while being logged in as user mpopp@localhost).

It was quite a shock when I called "SHOW CREATE PROCEDURE" and there was no output for the definition - so it looked like my procedure was gone. Also in information_schema.ROUTINES I couldn't find the definition, although the procedure itself was listed.

It became clear when I logged in as root and finally found that the procedure definition was still there. I looked a bit more carefully and found the mpopp@'10.0.0.1__' user in the DEFINER column. Of course, you can only output the procedure definition from the user that created the procedure (or from a root user, obviously), and in that case, it was user mpopp@'10.0.0.1__' and not mpopp@localhost (even though this was the user that I was logged in when I created it).

That makes me think - wouldn't it be more intuitive if the user name alone would identify a user and the host definition would be separated by a 1 : n relation? So for example - there's a user mpopp with all the permissions and there's a separate table that does nothing more than to define from which hosts the user has access? I know, to change that would break backwards compatibility completely - but on the other hand, it might be less confusing. I would be curious what other people think about this.

3 comments:

Beat Vontobel said...

Hi Markus,

there were in fact situations where I was extremely happy to have the possibility to "split" a user and define grants that were slightly different for the same user name from different hosts (see my blog post Too many connections: Handling of an emergency situation in MySQL).

So I think a combination of both would be best. Define one user but with exceptions for different sources. Of course roles would be great - as the more I use the different GRANT features and the more users I have, the more difficult it gets to keep everything in sync.

Regards,
Beat

Jay Pipes said...

Hi Markus and Beat!

Yes, from a logical perspective, it would make sense for there to exist a 1:n relationship between the user and host, but this perspective would only currently be for the benefit of us, the user. Internally, MySQL pulls the mysql.user table into memory (in the ACL hash called acl_cache) when starting up (see sql/sql_acl.cc below).

The hash (composed of acl_entry elements) is keyed on the host, user, and db (if applicable) together, as these components are glued together for each THD* connection instance and used for the ACL authentication (see sql/sql_acl.cc acl_get() below).

From sql/sql_acl.cc:

class acl_entry :public hash_filo_element
{
public:
ulong access;
uint16 length;
char key[1]; // Key will be stored here
};

So, the ACL cache is a hash of acl_entry elements. The elements are added to the ACL cache in acl_get(), further into the same file (stripped down for brevity):

ulong acl_get(const char *host, const char *ip,
const char *user, const char *db, my_bool db_is_pattern)
{
...
end=strmov((tmp_db=strmov(strmov(key, ip ? ip : "")+1,user)+1),db);
...
// bunch of access checks, then we add to acl_cache:
memcpy((gptr) entry->key,key,key_length);
acl_cache->add(entry);
...
}

So, as you can see, the hash key is simply a concatenation of the host (ip), the user, and the db (if it's supplied).

and from sql/sql_acl.h we see the classes and structs which describe a user in the ACL system within the server:

struct acl_host_and_ip
{
char *hostname;
long ip,ip_mask; // Used with masked ip:s
};

class ACL_ACCESS {
public:
ulong sort;
ulong access;
};

/* ACL_HOST is used if no host is specified */

class ACL_HOST :public ACL_ACCESS
{
public:
acl_host_and_ip host;
char *db;
};


class ACL_USER :public ACL_ACCESS
{
public:
acl_host_and_ip host;
uint hostname_length;
USER_RESOURCES user_resource;
char *user;
uint8 salt[SCRAMBLE_LENGTH+1]; // scrambled password in binary form
uint8 salt_len; // 0 - no password, 4 - 3.20, 8 - 3.23, 20 - 4.1.1
enum SSL_type ssl_type;
const char *ssl_cipher, *x509_issuer, *x509_subject;
};

Notice that the ACL_USER has a *single* member field "host" of type acl_host_and_ip. To properly implement the 1:n relationship of user to host within the ACL system, this would have to change to a hash or linked list of acl_host_and_ip structs. Then, upon each THD* connection instance, the supplied host from the THD* instance would have to be searched in this hash. Not a huge deal, but it would be an additional step and added complexity... Now, think about the complexity that roles (user groups) would add! :)

Ostensibly, this is to allow the localhost user and a remote user of the same name to have entirely different privileges. Normally, you'd want to lock down the remote, and grant further access rights to the local user...

So, in order to support a 1:n relationship form user to host, changes would have to be made so to support an inheritance structure (tree structure) for inheriting base privileges from the user table (without a host column) and modifying those privileges based on host lookup. Currently, the mysql.host table already complicates things, as it allows you set up blacklists for various hosts on the network using similar usernames. This complication would have to increase, as you would throw a user->host relationship into the mix (currently, you will notice that the mysql.host table has no column for User; it only allows for host-specific blacklisting).

I do agree with Beat that roles would be excellent, and I wrote about that in chapter 15 of Pro MySQL. However, as always, this feature must be weighed with a consideration of whether it would make installation and use of MySQL more difficult. As "ease of use" is a core goal for MySQL, role-based user management is being thought-out internally so as to get the benefit while remaining easy to use... :)

And of course, added complexity often breeds decreased performance. Currently, user authentication and connection to a MySQL server is lightning fast (sub millisecond) partly due to its relative simplicity. Adding roles and/or a user->host 1:n relationship would increase the calculational complexity of authentication code, and thus affect performance of a critical speed component of the server :( All of this, of course, must be taken into account.

Cheers, guys! Oh, and see you all soon!

-jay

Markus Popp said...

Hi,

this would also fit with my new idea that I've introduced in my new article. That would leave the privilege system itself as it is (hence, not reduce the performence) but still enhance its functionality through some sort of "wrapping functionality" that anybody can use or can not use ;-).