Converting a MySQL database to UTF-8

I occasionally need to convert a mysql database to UTF-8. This can be a painful process and if it goes wrong can result in a nightmare of character-set collisions. Some digging around on the lazyweb led me to a nice solution I found over at oscarm.org that shows you how to quickly convert the actual data, but I wanted to convert the database too.


Here’s what I did:

1) Dump the database into a sql file using mysqldump:

mysqldump --opt --quote-names -u username -p password my_database > my_database.sql

2) Convert the dumped sql data from the current character set to utf-8, for example:

iconv -f ISO_8859-1 -t UTF-8 my_database.sql > my_database-utf-8.sql

3) Change the CHARSET definitions in the sql file to utf8. You could do this by manually editing the file, but this little perl snippet will do it for you (here we’re going from latin1 to utf8 – change to fit your needs):

perl -pi -w -e 's/CHARSET=latin1/CHARSET=utf8/g;' my_database-utf-8.sql

4) Import the database back into mysql:

mysql -u username -p password my_database < my_database-utf-8.sql

Voila. Your database and data is now converted to utf-8.