WordPress, Unicode, and ‘?’s

I previously had some problems when I mixed Unicode with WordPress. Every time I typed a Unicode character, it would display as a ‘?’ after posting. This post describes how to fix this.

Basically, the problem is that WordPress and its database are not agreeing on the character encoding, so instead of storing the Unicode characters, the database just says, “Heck, just stick a bunch of question marks in there.”
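
To make the question marks less mysterious, here is a small Python sketch. It is not WordPress or MySQL code, and the helper name `store_as_latin1` and the sample words are made up for illustration. It assumes the text ends up in a Latin-1 style character set (an assumption; your database may differ) and mimics a conversion that swaps in a ‘?’ for every character that set cannot represent.

```python
def store_as_latin1(text: str) -> str:
    """Mimic saving text into a Latin-1 column.

    Characters that Latin-1 cannot represent are replaced with '?',
    which is roughly what shows up on the published post.
    """
    stored_bytes = text.encode("latin-1", errors="replace")
    return stored_bytes.decode("latin-1")


if __name__ == "__main__":
    # 'é' exists in Latin-1, so this word survives unchanged.
    print(store_as_latin1("café"))
    # 'ĥ', 'ŝ' and 'ĝ' do not exist in Latin-1, so each becomes '?'.
    print(store_as_latin1("eĥoŝanĝo"))
```

Note that the replacement is lossy: once a character has been turned into a ‘?’, nothing in the stored text says what it used to be.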

Of course, this can be easily fixed in two steps. All you’ll need is FTP access to your server and a fair comprehension of how to type. So, let’s get started!

  1. Open up ‘wp-config.php’ from the root directory of your WordPress installation.
  2. Add ‘//’ at the very beginning of these two lines:
    define('DB_CHARSET', 'utf8');
    define('DB_COLLATE', '');

So that section should now look like this:
//define('DB_CHARSET', 'utf8');
//define('DB_COLLATE', '');

You’re already finished. How easy was that?

Important notes:

  1. The quotes surrounding // in step 2 should not be inserted. Those are just indicating that the // is the part you should insert.
  2. If you’ve meddled with that part of ‘wp-config.php’ before, it may look a bit different. But pay no attention to the differences. Just be sure to add // to the lines containing DB_CHARSET and DB_COLLATE.
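
If you would rather not edit the file by hand, the two steps can be sketched in Python. This is only an illustration under a few assumptions: the function name `comment_out_charset_lines` is hypothetical, it works on the text of ‘wp-config.php’ rather than on the file itself, and it expects the two lines to be written as ordinary `define(...)` calls. Try it on a copy of your config first.

```python
import re

# An active (not yet commented out) define() line for either constant.
ACTIVE_DEFINE = re.compile(r"\s*define\(\s*['\"](?:DB_CHARSET|DB_COLLATE)['\"]")


def comment_out_charset_lines(config_text: str) -> str:
    """Return config_text with '//' added at the very beginning of the
    DB_CHARSET and DB_COLLATE lines. All other lines are left alone."""
    result = []
    for line in config_text.splitlines(keepends=True):
        if ACTIVE_DEFINE.match(line):
            line = "//" + line
        result.append(line)
    return "".join(result)


if __name__ == "__main__":
    sample = "define('DB_CHARSET', 'utf8');\ndefine('DB_COLLATE', '');\n"
    print(comment_out_charset_lines(sample), end="")
```

Lines that already start with // are not matched, so running it twice does not add a second pair of slashes.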

Did you find this article useful? Please leave a comment to let me know. Don’t worry, you don’t need to register for a simple comment.

70 Responses to “WordPress, Unicode, and ‘?’s”


  1. 1 geoffreyking October 13, 2007 at 2:36 pm

    Looks to me like you’re going a long way round here. My computer just does Unicode unless I tell it to do something else. A few programs still come with Latin 3 or even Latin 1 by default but you can easily change this in the ‘Tools’. Or am I missing something?

  2. 3 engel October 13, 2007 at 4:27 pm

    But by default WordPress won’t parse Unicode characters correctly. At least with Esperanto characters, that’s what I used to try it..

  3. 4 CowDir October 31, 2007 at 7:38 pm

    Pretty awesome article. Thanks! – CowDir

  4. 5 Nolawi December 25, 2007 at 9:26 pm

    thank you so so so much for this… i wasted an hour trying to fix till i found your post and then violla… FIXED

  5. 6 Chris January 24, 2008 at 10:59 am

    Are you kidding me?!? After hours of trying to figure out what in the blazes was going on, all it took was a few //s? Incredible. You are a genius in my mind. Thanks!

  6. 7 dinu May 14, 2008 at 4:20 am

    it worked without making this change… for malayalam…
    :)

  7. 8 fairbro June 28, 2008 at 12:00 am

    I was pulling my hair on this one – the source code for the web page says that WordPress IS doing Unicode – but it isn’t. Bill Gates to thank for why this is a problem only on some computers.

    I am so glad to fix this after only one day.

    Actually, you fixed it.

    Thanks!

    (If you think Bush is bad, wait’ll you see what’s next…)

  8. 9 vamana October 2, 2008 at 4:09 am

    Wonderful insight. This made such a huge difference to the effort I was putting to get this working.

    Thank you so much. You are making the blog a wonderful learning and sharing tool.

  9. 10 Vinayak Anivase October 6, 2008 at 9:36 am

    Thanks a lot,its so easy n working.
    Thanks once again.:)

    keep goin!!

  10. 11 David October 26, 2008 at 9:49 am

    Thanks. There must be a reason why unicode is not enabled by default. I was very puzzled initially because the upper characters would display properly while editing the post initially. Only later did I find that they were converted to ???s when I saved/published the post. That was the clue that led me to your post.

    I’ll be reading more of your site – thanks for documenting your insights!

    David

  11. 12 rithy November 4, 2008 at 8:47 pm

    i want to create blog with khmer language.
    can u help me, how to do?

  12. 14 web design December 21, 2008 at 6:54 pm

    Thanks! It was very helpful!

  13. 15 tyson April 27, 2009 at 10:35 am

    Just surfing the web and found your site,I am also involved with people search and background checks.Your site has been really helpful thanks.

  14. 16 ramag June 10, 2009 at 12:26 pm

    Thank you very very much dear, I was struggling a lot to fix this problem, how easily you described, thanks a lot,

  15. 17 phaseill September 7, 2009 at 5:26 am

    Awesome, my site is in the maori language using macrons etc. Yours is a tip i’ll no doubt use time and time again :) Thanks! (Unless WP fixes it for us?!?!?)

  16. 18 kanishka September 27, 2009 at 9:26 am

    i tried making such change and reviewed twice to make sure, i didnt commit any mistake..
    its not working for me. i just started developing a website and you may see it ..the demo on http://blogprahari.in

    i wish to show hindi unicode characters.. and the same ? ? ?????
    signs appear.. please help me at earliest.

  17. 19 kanishka September 27, 2009 at 9:34 am

    ooh!
    it did .. but for the newer posts I made..
    it didnt work for the earlier posts..
    Thanks a lot..

  18. 20 vamshee October 1, 2009 at 11:12 am

    I had this same problem – couldn’t get my new blog to show Unicode characters. Then I figured out where the problem is. I just wanted to share it here for future reference.

    At the time of installing WP, in the config file the following settings need to be present
    define('DB_CHARSET', 'utf8');
    define('DB_COLLATE', 'utf8_general_ci');

    By default the collate setting is left as ''.
    That is the problem. In the MySQL database tables that wordpress creates, all the table fields will be left as default collation which is the ‘latin1_swedish_ci’ .

    This causes an inconsistency. You are writing UTF8 chars into a latin collation. so the data gets lost(turns to ‘???’)

    So if you are installing fresh, make sure you set both the settings.

    Now as suggested here, when you comment out the first DB_CHARSET setting, then you will be using a ASCII and latin collation combo which works. (because, you are no longer storing the data as Unicode – they will be stored as some funky characters – something like à°°à°¾à ±‡…)

    And, another point, no matter what you do, you will not be able to get your UTF8 data once its corrupted.(ie. when they turn to “???” and not these -”à°° ±‡” ).

    Hope this helps.

    -V

  19. 21 兜兜爸爸 November 19, 2009 at 2:58 am

    simple but effective!! thanks

  20. 22 viiral November 19, 2009 at 2:06 pm

    Omg.. was it that easy o.o cheers!

  21. 23 Emil December 1, 2009 at 5:22 pm

    Great tip, solved my problems with Serbian characters šđčćž :)

  22. 24 kanishka December 14, 2009 at 1:16 pm

    Sir!
    I have the similar “??? ” problem with my wp installation I made using fantastico installer. I used your trick and it made the things go perfect. Now I upgraded the installation script, had the same error ( characters were now “-”à°° ±‡”). I again repeated the trick, but it made partial correction with the display. Still some posts have the similar ???? problem.
    help me out please!

  23. 25 'Pong January 3, 2010 at 11:17 pm

    I’ve been stumbling over this issue for a long time. The two-line change works like magic. Thank you.

  24. 26 Subhash Makkena January 20, 2010 at 11:41 am

    When you are recreating your blog from an import XML file, Do this modification and then import.

    • 27 Adam February 17, 2010 at 8:35 pm

      You know I tried this method and it worked, but it caused issues all over my site in other posts. It started to mess up apostrophes and such. Any recommendations?

  25. 29 uldis April 12, 2010 at 6:58 am

    Thank you so very, very much! :)

  26. 30 GujaratiSMS May 22, 2010 at 9:59 am

    Thanks.

    Its working for me. I am searching for this from last many days.for my website http://www.gujaratisms.com for Gujarati Language.You help me out.

    Thanks again.

  27. 31 Daniel June 21, 2010 at 7:11 pm

    thank you very much! Now my blog works great!

  28. 32 amila July 7, 2010 at 7:51 pm

    this work fine. But the problem is

    define('DB_CHARSET', 'utf8');

    is an essential line.If u commented it you cannot use your database for do any other work.

    Best way is , when creating a database make collation to either
    utf8_general_ci or utf8_unicode_ci. Both work same manner except sorting method. cheers :-)

  29. 33 masud August 1, 2010 at 5:06 am

    thank u for a great problem solve. it worked for our site accurately, :) thanking is not enough for this

  30. 34 zerovic October 5, 2010 at 1:12 pm

    thanks for the solution! it actually works for the posts, but I still have this bug on the main page…i’m stonned

  31. 35 Irini November 22, 2010 at 4:10 am

    You are God!Thanks so much!!!

  32. 36 Ali Raza Cheema January 23, 2011 at 3:31 am

    Thanks Dude, you solve my problem so easily, I was searching for this problem for an hour over google. Thanks Again

  33. 37 aqeel February 9, 2011 at 5:28 am

    Thanks
    You solved my problem –YEEEHAW!

  34. 38 Victor February 10, 2011 at 1:44 am

    Perfect fix, very simple step. Despite I wonder what is the implication of removing these two lines, will the db not know what charset to use? or sth else?

  35. 39 Todd February 18, 2011 at 9:17 am

    Tried it, but it made all these diamond shaped question marks after every period. Weird… back to searching the web for a different solution I guess

    • 40 Clifton March 9, 2012 at 7:46 pm

      I had problems with this too, Todd. I have a lot of Spanish text on my site and any Spanish character where there’s an accent or tilde is now corrupt, even though I am now able to post new material in Hebrew just fine. I also wasn’t able to reverse the damage that this “fix” did. What did you end up doing?

  36. 41 manoj March 3, 2011 at 3:04 pm

    hai , i did this. but not fixed.
    and tell me one more thing. how can i change wordpress site’s dashboard interface language? that means i want full telugu wordpress.

  37. 42 sandal distro March 9, 2011 at 2:31 am

    Thank you for sharing the blog. I look forward to many more interesting articles.

  38. 43 sathish March 9, 2011 at 6:30 am

    thank a lot

  39. 44 Rob Jones March 23, 2011 at 3:30 am

    Fab! You were the last ‘find’ before I gave up. Worked a treat and in seconds. Thank You!

  40. 45 laurux77 May 10, 2011 at 3:10 am

    Thank you so much! I was having the same problem in Latvian language. And now I have fixed it finally. So THANK YOU!

  41. 46 manoj May 10, 2011 at 7:05 am

    hi, atlast fixed my problem through this method. thank a lot.

  42. 47 Paul Cable May 27, 2011 at 4:22 am

    Thanks! What an easy fix – I wonder why that’s not default.

  43. 48 arunpdl June 23, 2011 at 1:37 am

    thanx a lot man…that really worked for me :)

  44. 49 Evi Helviani July 10, 2011 at 12:26 am

    I am very lucky to find your web site. Your article is very useful. Thank you for share

  45. 50 evi helviani July 27, 2011 at 9:22 am

    I think your article is very useful for me. Thanks

  46. 51 Alan Porter August 14, 2011 at 7:04 pm

    I think this might be a short-sighted solution. The problem may be that your database tables are defined with a different encoding. In my case, they were mysql tables with “latin1” encoding. I changed their encodings (using mysqldump, an editor, and mysql < dumpfile) and now they are properly encoded in UTF-8.

    Alan

  47. 52 RR August 17, 2011 at 10:52 am

    Thank you.
    This solved my problem.
    Simple solution for A BIG PROBLEM.

  48. 53 sanjay August 19, 2011 at 11:28 pm

    Yes, worked for me. Thanks a lot.

  49. 54 Max October 26, 2011 at 12:30 pm

    Thank you very-much!!!!! This world works because of you nice people who like helping others…

  50. 55 Dhara November 17, 2011 at 3:34 am

    Thank you very much.
    I am able to create a wordpress website in gujarati

  51. 56 lucholibre December 2, 2011 at 6:38 pm

    Thanks! I was going nuts with some articles a writer had sent me!

  52. 57 Dennis Tran February 14, 2012 at 5:31 pm

    Thank’s alot for your Great information….
    It helps many people with this problem…

    Dennis Tran


