Introducing HanziCraft

By Confused Laowai | Date: January 11th, 2013 | Category: Learning Tools

Today, I introduce HanziCraft. It’s a project that I’ve been working on for some time now. It started with my thesis, where I needed to easily decompose Chinese characters for my research. I then found data and wrote a little decomposition tool for myself. This was called HanziJS. However, as time went on I realized I wanted more than just decomposition data. Why not create a site that pushes the value of Chinese character dictionaries to a new level?

What I mean is that when I look up a Chinese character to learn (especially for something like the Chinese character challenge), I want as much information as possible to help me learn that character, especially regarding how some radicals affect your reading on a subconscious level.

Questions that come up:

  • What are the radicals?
  • How does the character break down, step by step, into smaller components?
  • Is there any phonetic information available?
  • Does this character have many definitions? How is it pronounced?
  • Now that I know more of the character, where does it fit into vocabulary?
  • Is the vocabulary useful?

Those are the questions I want a Chinese character dictionary to answer. I have found some sites that serve this goal, but they are badly designed, don’t have all the answers, and/or are in Chinese only.
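
To make the decomposition question concrete, here is a minimal sketch of how a character might be broken down recursively. It does not use HanziJS or its data: the small table and the names `DECOMPOSITION`, `decompose` and `leaves` are hypothetical and exist only for illustration.

```javascript
// Hypothetical toy table: each character maps to its immediate components.
// This is NOT the HanziJS data set, just three hand-written entries.
const DECOMPOSITION = {
  '想': ['相', '心'],
  '相': ['木', '目'],
};

// Recursively expand a character into a tree of components.
// Characters without an entry in the table are treated as leaves.
function decompose(character, table = DECOMPOSITION) {
  const parts = table[character] || [];
  return {
    character,
    components: parts.map((part) => decompose(part, table)),
  };
}

// Collect the leaves of the tree, i.e. the smallest components in the table.
function leaves(node) {
  if (node.components.length === 0) return [node.character];
  return node.components.flatMap(leaves);
}

const tree = decompose('想');
console.log(JSON.stringify(tree, null, 2));
console.log(leaves(tree)); // [ '木', '目', '心' ]
```

A tree like this is what lets a dictionary show one level at a time (想 into 相 and 心) or go all the way down to the smallest parts.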

What will become of HanziJS?

HanziJS is the code behind HanziCraft. It is an open-source module for Node.js. This is the backbone. HanziCraft is thus an application of the code itself (hopefully other people will create their own apps with the HanziJS code in the future!).

If you are a coder, go check out the GitHub repo. There are quite a lot of updates to it, as well as quite a bit of refactoring (thanks Dusan!). I’m busy writing proper documentation for it.

Introducing HanziCraft

I think the best thing you can do is visit the site and see for yourself. Check the info for the character 魔, for instance.

Delicious juicy info!

If you want to know how I determine the example words, find the question in the FAQ.

P.S. – If you see any question marks or blocks, then you need to install the proper font to display all the components. Download the font here.

Beta

HanziCraft is now in its beta phase. There will be bugs! But with this beta phase, you’ll get a discounted premium account. What do you get with a premium account (besides my eternal gratitude)?

  • Look up more than one character at a time
  • Favorite lists
  • Your own user dashboard (currently showing your lookup history. I will implement character analytics in the future)
  • No ads
  • All future premium features forever free.
  • AND Less hassle in learning Chinese characters (and who doesn’t want that!?)

Future Features

I’ve got quite a few more features planned for HanziCraft, some premium and some free. Here’s what to expect:

  • Display potential phonetic information in the radicals
  • Display similar characters based on components (if a character shares more than 50% similar components)
  • Text Analysis (this will be a premium feature that takes a group of characters, perhaps an article, and computes what you need to know from the text. This will include unique characters, unique radicals, frequency counts and other cool information)
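
As a rough idea of how the similar-characters feature could work, here is a minimal sketch. It is not HanziCraft’s actual code. The component table is hypothetical toy data, and measuring “more than 50% similar components” as shared components divided by all distinct components of the two characters is my assumption; other measures are possible.

```javascript
// Hypothetical toy data: character -> flat list of its components.
const COMPONENTS = {
  '想': ['木', '目', '心'],
  '相': ['木', '目'],
  '箱': ['竹', '木', '目'],
  '思': ['田', '心'],
};

// Fraction of distinct components that two characters have in common
// (shared components divided by all distinct components of both).
function overlap(a, b, table = COMPONENTS) {
  const setA = new Set(table[a] || []);
  const setB = new Set(table[b] || []);
  const shared = [...setA].filter((c) => setB.has(c)).length;
  const total = new Set([...setA, ...setB]).size;
  return total === 0 ? 0 : shared / total;
}

// Every other character in the table whose overlap exceeds the threshold.
function similarTo(character, threshold = 0.5, table = COMPONENTS) {
  return Object.keys(table).filter(
    (other) => other !== character && overlap(character, other, table) > threshold
  );
}

console.log(similarTo('相')); // [ '想', '箱' ]
```

With this measure, 相 shares two of three distinct components with both 想 and 箱, so both pass the 50% mark, while 思 shares none and is left out.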

All that said, HanziCraft is a tool I created for myself, mainly because I had trouble finding all the useful information I needed. I was eager, like a crazy addict, to find the information I craved, and after countless hours HanziCraft was born.

I hope HanziCraft becomes your go-to Chinese character dictionary. I’m building it to be my own go-to dictionary, so tell me, what do you need to make it the ultimate Chinese character dictionary? I’ll try my best to implement the features you need. Happy learning!

  • http://twitter.com/HackingChinese Olle Linge

    This tool has great potential (which I’ve already told you in person). With the current features, it’s already a very useful tool that I recommend to learners on all levels. You’ve already come a long way, but I’m sure this can be developed into something even better. Keep up the good work!

  • http://mykafkaesquelife.blogspot.com/ My Kafkaesque Life

    Looks like the frequency is based on Putonghua, not Guoyu, so I guess it’s not very useful to those like me, who just want to learn the Taiwanese version of Mandarin. Or am I wrong? I usually know how to decompose a character into its main parts, but when I want to know how common a term or phrase is, I usually ask my wife.

  • Greg

    Hey Niel, I’ve been watching the posts on this for a while, because the idea is really interesting. The place where I have done this break-down most extensively is in the Heisig method of learning to read, which I’ve blogged about a lot. (For the sake of readers who don’t know Heisig, I’ll give a quick summary …)

    The individual components are given ‘images’ – usually a close match for their meaning, but sometimes changed to make the process easier. For example, 白 is given the image of ‘white dove’ and not just ‘white’ – because it makes the visualisation much easier. And as you work through the 1500 characters in Book 1 (and then another 1500 in Book 2) the imagery builds up off these components.

    Using your 魔 as an example, the two components in the image in constructing ‘devil’ are 麻 (hemp) and 鬼 (ghost) – and it’s easy enough to make a visualisation that lets you connect these 3 components. But of course this assumes you know how to write 麻 and 鬼 – which you would have done earlier in the book, again building it up with images and components.

    I was thinking – if you were really bored :) – at some stage it might be interesting to add a Heisig extension to your HanziCraft. A character break-down would take place into the defined Heisig primitives, using the keywords he has allocated, allowing the user to zoom in as far as he/she wants. Anyway, just a thought – since there are so many Heisig students out there …

  • http://niel.delarouviere.com NielDLR

    The word frequencies are based on data mined from Weibo by a university in the Netherlands in January 2012 (over 4 million posts). So, yes, there might be a slight bias towards mainland vocabulary due to the potential distribution of users on Sina Weibo. However, Sina Weibo is probably the most neutral source of frequency data for Chinese language usage, as it is not tied to specific locations.

    So I assume it will still be mostly useful. In any case, the frequency is related to characters. For instance, one difference between Taiwanese and Mainland vocabulary is the word for potato (马铃薯 vs 土豆 respectively). If you search for 马 or for 土, you’ll see both words pop up in the example words.

    Therefore, this is not necessarily biased towards a certain dialect, but rather gives a neutral view regarding the usages of characters. I also count both simplified + traditional versions of the characters as one character when computing the frequencies, because again, Sina Weibo would have more simplified characters.

  • http://niel.delarouviere.com NielDLR

    Hi Greg,

    I’ve always been in two minds about Heisig (it’s a long discussion I know!), but it’s definitely not a bad idea to implement that into HanziCraft. I’ll see if I can find some data regarding the primitives, otherwise it might take some time to create the data needed for lookups. But it’s going in my future features list for sure!

  • Greg

    Well, when I’m next in Taipei :-) let’s sit down over a 珍珠奶茶 and chat, maybe I can help?

  • http://niel.delarouviere.com NielDLR

    I quickly did some searching and found this document. It looks pretty good, but for instance 白 is ‘white’ according to this list? How does it look? https://docs.google.com/spreadsheet/ccc?key=0Auaa1SVOGeeBdGNDZWU1UGJsNmNlM0lpMTBmN2xnUVE&hl=en#gid=0

  • Greg

    I haven’t seen that before – looks like a good version. Have book-marked it :-)

    In terms of 白, for example, of course the key meaning is white, and that will appear in the table. But the book presents a visualisation of a white dove, which is more about the method than the meaning.

    The doc you found gives you the starting point, but it doesn’t give the deconstruction; that’s what HanziCraft is for :-) And I guess there are two stages: the first is what THAT character’s component parts are, then the second is how they each in turn deconstruct.

    But this is much easier face2face – see you in Taipei.

  • Ruben Moor

    Hey Niel

    first of all: awesome tool! Looks pretty unique to me.

    I found HanziCraft when looking for digitally available data on Heisig’s primitives. I am working on a tool which now turns out to be quite similar to HanziCraft. Main difference: I got inspired by Heisig and wanted to base it on his keyword logic (and stories).

    Now that I found Hanzicraft I don’t know if my tool would add any value. And I envy you for having already such an advanced tool!

    Say, your FAQ states that the decomposition data is from Gavin Grover. I am using Heisig’s way of decomposing characters, which will have similarities, yet is focussed on supporting his mnemonic method.
    I am afraid I have to extract Heisig’s decomposition data manually from the book … (don’t ask me about copyright issues :S). This might be useful for a respective extension of your tool as well, if you are interested.

    Also: Is there a particular reason, why your tool does not show characters that *contain* the one (like the reverse to decomposition)?

    Aah … and later on I might have questions about the dictionary data you added to your characters.

    Best regards,
    Ruben

  • Greg

    Yes agreed – I would love it if the tool had a reverse-decomposition function :-)

  • http://niel.delarouviere.com NielDLR

    Hi Ruben,

    at the moment, I went with establishing a neutral base for the project: Standard Mandarin for component decomposition and definitions. I’m considering adding Heisig, yes! But like you said, data is a bit of a problem, as are copyright issues. If you get the data, I’ll definitely consider adding it after consulting with Heisig regarding the data.

    With regards to the reverse-decomposition, it’s definitely a feature I’m working on. I’m working on a function that allows me to compute similar characters based on the decomposition. At the moment, I don’t have an eta, but it’s the next high priority!

  • http://twitter.com/DecryptChinese Decipher Characters

    Niel,

    This looks like a great tool. It reminds me of the work I did a few years back when I created chinese-characters.org. Would you be interested in collaborating on a future project?

    Ed

  • http://niel.delarouviere.com NielDLR

    Hi Ed,

    yes, I’m a big fan of your website! Been using it for some time now.

    I would definitely like to collaborate. What do you have in mind?

    Send me an email: confusedlaowai@gmail.com

  • disqus_Bfn4tu9SQx

    Hi Neil, I just subscribed to your new tool, Been reading some of your posts, great to share in your passion and frustration of learning mandarin!

    Just a query, when I look up a character, and then click on a word formed by that character, I get a great breakdown of the individual characters in that word, but I can’t find the definition anywhere of the actual compound word…some I know, some I don’t, and it would be great to find out the meaning of all these compound words there on the screen… Otherwise I am going to have to look them up in Pleco or another dictionary…

    Are the definitions somewhere and I just can’t find them? Also then, sentences with these words in would definitely be a great feature….

    Cheers, and keep up the posts and passionate work!

    Donna

  • http://niel.delarouviere.com NielDLR

    Hi Donna,

    yes, that is something I’ve thought of, but it was never really an issue for me because I use a pop-up dictionary. I understand that not everyone does, though. I’m adding this to my to-do list!

    Thanks for subscribing by the way. Really appreciate it!

    Regards,
    Niel

  • disqus_Bfn4tu9SQx

    Hi Neil, oh yeah I forgot, I use mdbg reader tool on my computer, so I should be able to get the meanings when on my comp, but I mostly use my iPad these days, is there a pop up dictionary that you know of that I can use on the iPad?

    Ta, Donna

  • disqus_Bfn4tu9SQx

    Niel sorry about your name appearing wrong, I made sure both times I had typed in your name correctly, the autocorrect obviously overrode it both times :-)