Learn Languages Through Games on Steam
If you like playing PC games and want to learn a language at the same time, try this: set your Steam language to the language you’re trying to learn, and you can play most of your library in that language. The setting is under Preferences > Interface; choose your preferred language from the first drop-down.
Sometimes Steam will need to re-download the data to get the localised content, but once that’s done, you should be able to play without any trouble. For example, both Portal games, Civilization, Team Fortress 2 and Skyrim all work in French. Check the languages section of a game’s Steam store page to see which languages are available.

The only problem with learning from games is you might learn a lot of weird vocabulary. Just think of it as gaming with a bonus, rather than actual language practice.
Japanese Equivalent of The Onion
I love The Onion. For a long time I wished there was an equivalent satirical site in Japanese. Someone just told me about the Kyoko Shimbun, a Japanese site full of made-up amusing stories.
For example, they have a story on McDonald’s Japan releasing the McDonal-don, a bowl of rice (don) topped with a burger.
Another story is about naturally drying baumkuchen cakes in the sun.
Another reports the discovery that pi is only 10 digits long, with the calculations until now having been caused by a bug in the program running them. The quote from the researcher at the end is great.
The stories are short and have a good variety of vocabulary, so I think they’re a pretty good way to practice Japanese.
Learn French with Bouletcorp

Bouletcorp is a superb French comic blog, with a combination of mind-expanding topics, humour and great artwork. The French is quite challenging and idiomatic, which makes it great to study but tricky to look up. Luckily, most of the comics are translated into English, so open up both versions in your browser and get reading.
The image is taken from the comic Kitchen Darwinism, which is available in both French and English.
Le Lotus Bleu in French. I read Tintin a lot as a child. This one is set in Japanese-occupied China which is a sensitive topic at the best of times. Some of the depictions and stereotypes are very dated if not borderline racist. Interesting though. Merci!
"Why doesn't the Unicode Standard adopt a compositional model for encoding Han ideographs?"
Found something interesting on the Unicode FAQ:
The compositional nature of the script makes it attractive to propose a compositional encoding model, such as can be used for Hangul. Such a mechanism would result in the savings of thousands of code points and relieve the IRG from the burden of having to examine potential candidates for encoding.
Unfortunately, there are some difficulties involved with a compositional model for Han.
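To make the Hangul comparison concrete, here is a minimal Python sketch of what a compositional model looks like in practice. The function name `compose_hangul` and the index arguments are my own for illustration; the arithmetic is the standard Hangul syllable composition rule, and `unicodedata.normalize` is from the Python standard library. Nothing equivalent exists for Han ideographs, which is exactly what the FAQ is discussing.

```python
import unicodedata

# Constants for arithmetic Hangul syllable composition.
S_BASE = 0xAC00   # first precomposed syllable, 가
V_COUNT = 21      # number of vowel jamo
T_COUNT = 28      # number of trailing consonants, including "none"


def compose_hangul(lead_index, vowel_index, tail_index=0):
    """Build one precomposed syllable from zero-based jamo indices."""
    offset = (lead_index * V_COUNT + vowel_index) * T_COUNT + tail_index
    return chr(S_BASE + offset)


# 한 = ㅎ (lead 18) + ㅏ (vowel 0) + ㄴ (tail 4)
han = compose_hangul(18, 0, 4)
print(han)  # 한

# Normalisation converts between the single code point and its jamo sequence.
decomposed = unicodedata.normalize("NFD", han)
print([hex(ord(ch)) for ch in decomposed])  # ['0x1112', '0x1161', '0x11ab']
print(unicodedata.normalize("NFC", decomposed) == han)  # True
```

Hangul works this way because every syllable is built from a small, fixed set of jamo in fixed positions. Han characters have no such regular layout.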
Dictionaries and Copyright Law
Disclaimer: Before starting this I want to make it clear that I am not scraping dictionaries, nor do I plan to. I am part of a project to create an open-source Korean dictionary, and so I’m wondering what the law is so we don’t run into any trouble. The data for the project so far comes from freely-published Korean government data, and manually-made definitions, so we should be in the clear. I’m also interested in what it means to “own” some fundamental parts of language.
There are hundreds of commercially-produced and (I assume) copyrighted dictionaries in many languages. Dictionaries are extremely useful not only to language learners but also to developers of language learning tools. However, these commercial dictionaries are out of reach of most developers due to copyright or exorbitant licensing costs.
I wonder what the extent of the copyright law is.
The process of creating a new dictionary could be broken into two parts.
- Scrape the data.
- Reformat the data to the extent that it is not a copy of the original.
The first is the obvious grey area. “Making unauthorised copies” would probably cover it, but simply viewing the dictionary also creates a copy, and the other party has no way of knowing whether you store it or not.
What would be the difference between the following two methods of creating the dictionary:
- Manually: humans write definitions of words they know, use existing dictionaries to look up those they don’t, and write their own new unique definitions.
- Programmatically: existing dictionaries are combined with paraphrasing, language models and bilingual texts to generate new unique definitions.
The first seems completely natural. That’s how new dictionaries get created all the time. The other dictionaries are used for reference, but the definitions are clearly the creations of the new authors, and as such (I assume) there is no copyright infringement.
However, the second seems a lot greyer. Although the same rewriting and paraphrasing has taken place, because it was automated it seems less clear who owns the copyright.
Does anyone have any answers to this? Past experiences?
BBC Horizon: Do you see what I see? The Himba tribe
The second half covers the Himba tribe, whose language includes only 4 colour terms. An experiment shown in the programme seems to indicate that language somehow affects people’s ability to differentiate colours.
Having fun with languages. Prerequisites: Kanji, English, a slightly twisted mind.
Interesting documentary on Old English
Remembering 雨 Related Kanji
I sometimes find it easier to make up stories about Kanji to remember how to write them, especially if they look really similar. Here’s how I remember weather-related Kanji that use 雨. Most of them are easy, but I often get the right-hand parts of dew and mist confused.
- 雨 rain - Basic.
- 雲 cloud - (Can’t think of one).
- 雪 snow - Katakana ヨ at the bottom which is like ユ in ゆき.
- 露 dew - Has 足 at the bottom, you get dew on your feet.
- 霧 mist - Mist is hard to predict, so it contains something that looks like 予 (strictly speaking the component is 矛). Also because mist is really fine, it has no power… so it has 力 at the bottom right.
- 霰 hail - Not sure about this one, bottom looks like 昔, hail is an old word?
Does anyone else use systems like this for remembering Kanji?