Version 6

MySQL and character encoding

How to fix character display issues with MySQL

Created by: laetzer, Last modification: 06 Dec 2008 (12:58 UTC) by laetzer

Character encoding issues

If you are using MySQL, you might run into character encoding problems on your site or in your database: non-English characters might appear as question marks or look garbled (Ã¶, ÃŸ, Ã¼, Ã¤, or Ãœ, instead of characters like ä, Ü, ß, ç, or é).
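To see where such garbled characters come from, here is a minimal Python sketch (not part of Bitweaver; the helper name misread is made up for illustration). It encodes a character one way and decodes the bytes another way, which is what happens when the two ends of a connection disagree about the encoding. It assumes that "Latin1" means Windows-1252, which is how MySQL treats its latin1 character set.

```python
def misread(text: str, stored_as: str, read_as: str) -> str:
    """Encode text with one charset, then decode the bytes with another."""
    return text.encode(stored_as).decode(read_as, errors="replace")

for ch in "äÜßçé":
    # UTF-8 bytes read as Windows-1252 turn each character into two junk characters
    print(ch, "->", misread(ch, "utf-8", "cp1252"))

# A Latin1 byte read as UTF-8 is an invalid sequence; it becomes the
# replacement character U+FFFD, which many pages show as a question mark
print(repr(misread("ä", "latin-1", "utf-8")))
```

Two-character junk usually points to UTF-8 data being read as Latin1; question marks usually point to the opposite mismatch.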

Database connection setup

The cause, as is sometimes stated, is that the encoding of the data is not the same all the way between website and database. The shortest and best solution is to have a UTF-8 system end to end.

However, a database can store its data in Latin1 and still have that data displayed with correct UTF-8 characters on a webpage in a browser. If a database stores a character encoded as Latin1, but an application requests that character expecting utf8 encoding, the database sends that character encoded as utf8. The actual problems are bugs between database applications, database abstraction layers, and web applications and, most importantly, wrong settings (or lack of access to the config files needed to apply the correct settings).
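The conversion described above can be pictured with a small Python sketch. This is only an illustration of the idea, not MySQL's actual code; the function name convert_for_connection is hypothetical, and cp1252 stands in for MySQL's latin1.

```python
def convert_for_connection(stored: bytes, column_charset: str,
                           connection_charset: str) -> bytes:
    """Re-encode stored bytes from the column charset to the connection charset."""
    text = stored.decode(column_charset)     # interpret the bytes as stored
    return text.encode(connection_charset)   # re-encode for the client

# "ä" is one byte in a latin1 column and two bytes on a utf8 connection
sent = convert_for_connection(b"\xe4", "cp1252", "utf-8")
print(sent)
```

The conversion only works if both charsets are declared correctly. If the connection is declared as Latin1 while the page treats the result as UTF-8, the bytes arrive unconverted and the page shows garbage.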

Some MySQL installations use Latin1 by default, for the database itself as well as for connections to it. This means that if you store your data as UTF-8, Bitweaver has to be told to request data as UTF-8. Unless told otherwise, Bitweaver uses your server's defaults. Since Bitweaver relies on the third-party application AdoDB for database abstraction, a fix would have to be made there.

That means: if your server is generally set up for UTF-8, especially your database and database tables, but for some reason the database connection is not, try editing the script that AdoDB uses to connect to MySQL databases:

yourbitweaver/util/adodb/drivers/adodb-mysql.inc.php


<?php
function _connect($argHostname, $argUsername, $argPassword, $argDatabasename)
    {

        // ... some stuff ...

        if ($this->_connectionID === false) return false;

        // edit adodb-mysql.inc.php around line 366 or 373
        // (depends on the version of that file)
        // function "_connect"
        //----------------------------------------------
        // THE FOLLOWING LINE IS THE ONE TO ADD:
        // it sets the client, connection, and result charsets of this connection to UTF-8
        if (mysql_query("SET NAMES 'UTF8'", $this->_connectionID) === false) return false;
        //-----------------------------------------------

        if ($argDatabasename) return $this->SelectDB($argDatabasename);
        return true;
    }
?>


Database and server setup

Databases and database tables can be set to utf8 when they are created. They can also be changed afterwards, even with tools like PhpMyAdmin. If the tables already contain user data, though, the result may once more be correct ASCII characters but wrong non-ASCII characters.

If your server OS, server software, or database software is not set up for UTF-8 yet, it is a good idea to change that. Consider some of the following steps. Bear in mind that this is web server configuration, and that many extensive tutorials about settings like these exist online; what follows can only be a starting point.

my.cnf (MySQL config file)


[mysqld]
# --------------------------------------------
# add the lines below to enforce utf8 encoding
# then restart the MySQL engine
# --------------------------------------------
collation_server=utf8_general_ci
character_set_client=utf8
character_set_server=utf8
skip-character-set-client-handshake


.htaccess (bitweaver root)


AddDefaultCharset UTF-8


php.ini


default_charset = "UTF-8"


Also: your database may not be UTF-8 in its collation settings. Consider dumping the SQL to a file and using a UTF-8-compliant text editor to change CHARSET=Latin1 to CHARSET=utf8. Create a new database with its collation set to utf8_general_ci, and issue SET NAMES 'UTF8'. Using the text editor, find and replace every bad character with the correct character. This can take hours. There are also scripts available that do this for you; search for terms like convert and charset.

Then import the cleaned dump into the new, empty database. Adjust Bitweaver's /kernel/config_inc.php file to reflect any change in database name.
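As a rough idea of what such a conversion script might do, here is a minimal Python sketch. Both function names are made up for illustration, and the sketch assumes the bad characters are UTF-8 text that was misread as Windows-1252 (MySQL's latin1), which is the pattern behind sequences like Ãœ. Other kinds of damage need a different repair, so test on a copy of the dump first.

```python
import re

def retarget_charset(dump_sql: str) -> str:
    """Rewrite CHARSET=latin1 declarations in a dump to CHARSET=utf8."""
    return re.sub(r"CHARSET=latin1\b", "CHARSET=utf8", dump_sql,
                  flags=re.IGNORECASE)

def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was misread as Windows-1252; leave other text alone."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already correct): return it unchanged
        return text

print(retarget_charset("ENGINE=MyISAM DEFAULT CHARSET=latin1;"))
print(repair_mojibake("Ãœber"))
```

The repair is safest when applied value by value or line by line rather than to the whole file at once, since one line that is already correct would otherwise make the whole attempt fall back to "unchanged". Characters whose UTF-8 bytes have no Windows-1252 equivalent cannot be recovered this way and are left as they are.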

To see how your database is set up, run SHOW VARIABLES; at the MySQL command prompt on your server (or find the link in PhpMyAdmin).
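The full SHOW VARIABLES output is long; the statement SHOW VARIABLES LIKE 'character_set%'; narrows it down to the encoding settings. As a small sketch of how that output might be checked, the Python helper below (hypothetical name, sample values invented for illustration) lists the settings that are not a UTF-8 charset. It deliberately skips character_set_filesystem, which is normally binary.

```python
CHARSET_VARS = (
    "character_set_client",
    "character_set_connection",
    "character_set_database",
    "character_set_results",
    "character_set_server",
)

def non_utf8_settings(variables: dict) -> dict:
    """Return the charset variables whose value is not a utf8 variant."""
    return {
        name: variables[name]
        for name in CHARSET_VARS
        if name in variables and not variables[name].lower().startswith("utf8")
    }

# Invented sample values: a UTF-8 database reached over a Latin1 connection
sample = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "utf8",
    "character_set_results": "latin1",
    "character_set_server": "utf8",
}
print(non_utf8_settings(sample))
```

A result like the sample, where only the client, connection, and results settings are Latin1, matches the situation in which adding SET NAMES 'UTF8' to the connection is the relevant fix.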
