Some thoughts on testing icons

December 14th, 2009 by Björn Balazs

It is really great to see how many of you have already taken part in our KMail-Icon-Test. A lot of questions arrived by mail or on my blog, so I thought I would explain a little about testing icons…

We do this testing because icons are useful and beautiful - they save space and people can recognize them faster than they can read text. In practice, icons work via a visual metaphor. If the user does not understand that metaphor, however, the icon does not work. It is easy to see that finding a visual metaphor for “save” is much easier than finding one for “save as” or for even more complex or abstract computer actions.

With the tests we mainly want to answer two questions:

  • We only have limited resources. Now we want to work on some new icons - but which icons should we work on?
  • We want KDE SC to be useful for everyone in the world. So, do our icons and our terms work for everyone in the world?

The icon test finds indicators for answering these questions by asking you - the user - to assign icons to terms. We can measure how well icons and terms match. And we can do this individually for any language the study runs in (so we get the worldwide answer). Technically, we have found a lot of indicators for the quality of icons; they focus on clear-cut or ambiguous assignments of icons, missing icons, and the time spent deciding on an icon. All these indicators help us to answer the above questions.
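To make the idea of such indicators a little more concrete, here is a purely illustrative sketch in Python. It is not the analysis code used for the study; the function name, the data layout (one `(icon, seconds)` pair per answer, with `None` for a skipped term) and the choice of normalised entropy as the ambiguity measure are all assumptions made for this example.

```python
import math
from collections import Counter
from statistics import median


def term_indicators(responses, n_icons):
    """Summarise the answers given for one term (hypothetical helper).

    responses: list of (icon_id or None, seconds) pairs; None means skipped.
    n_icons:   number of icons offered to choose from (assumed to be known).
    """
    total = len(responses)
    if total == 0:
        raise ValueError("no responses for this term")

    chosen = [icon for icon, _ in responses if icon is not None]
    counts = Counter(chosen)
    top_icon, top_count = counts.most_common(1)[0] if counts else (None, 0)

    # Spread of the choices: 0.0 means everyone picked the same icon,
    # 1.0 means the picks are spread evenly over all offered icons.
    spread = 0.0
    if chosen and n_icons > 1:
        shares = [c / len(chosen) for c in counts.values()]
        entropy = -sum(p * math.log2(p) for p in shares)
        spread = entropy / math.log2(n_icons)

    return {
        "top_icon": top_icon,
        "agreement": top_count / total,            # share choosing the top icon
        "skip_rate": (total - len(chosen)) / total,
        "spread": spread,
        "median_seconds": median([s for _, s in responses]),
    }
```

In this sketch a term with high agreement, low spread and a short median decision time would count as well covered, while low agreement, high spread and long decision times would point to a missing or weak icon. Where exactly to draw those lines is a judgment call that the sketch leaves open.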

So this is mainly a method for evaluation and not for finding inspiration. But this is not absolutely true. Some people complained that we do not offer a “no icon fits”-button. There are two things to say about this:

  1. The evaluative result is the same - whether we offer the “no icon fits”-button or not - and whether people read the instruction to skip a term or not. To ensure this we use a multiple-indicator approach.
  2. The inspirational result is worse when we offer an explicit “no icon fits”-button, because we then get no idea of possible approaches to visualising that term. Typically you do not get a random result if a term has no well-matching icon. You do get some icons that stand out and you can try to understand why. So you might just find ideas about how to construct the metaphor for that missing icon.

So we will continue to explain that a term can be skipped, but we will not offer a “no icon fits”-button.

There have also been some complaints about the terms, missing context and the meaning of life. Well, it’s not always sunshine. Testing typically means not being “real”. In other words: Testing always has some restrictions. With the icon test we focus on the association between the icon and a corresponding term. If the term is rubbish, the icon has no chance. As terms for our KDE SC icon tests we use the labels next to the icons, or the tool-tips of the icons (if no label is around), that you can find in the application. We tell the participants that they can find these icons in the context of a certain application - this time KMail.

And, as I mentioned above, we can compare the results of different groups, for example Spanish-speaking people with Windows(TM)-only users. Thus we can thoroughly investigate the relationship between icon and term. We think the indicators we get are much better than no indicators, and we are well aware of the existing restrictions. You can find a sample result of an icon-test here.
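The group comparison mentioned above could be sketched roughly as follows. Again, this is only an illustration under assumptions: the function name, the `(group, term, icon)` row layout and the group labels are made up for the example and do not describe the actual study data.

```python
from collections import Counter, defaultdict


def agreement_by_group(rows):
    """Return {term: {group: (top_icon, share)}} for hypothetical answer rows.

    rows: iterable of (group, term, icon_id) triples, one per given answer.
    """
    tallies = defaultdict(Counter)
    for group, term, icon in rows:
        tallies[(term, group)][icon] += 1

    result = defaultdict(dict)
    for (term, group), counts in tallies.items():
        top_icon, top_count = counts.most_common(1)[0]
        # Share of this group's answers for the term that went to the top icon.
        result[term][group] = (top_icon, top_count / sum(counts.values()))
    return dict(result)
```

If the same term gets a clearly different top icon or a much lower share in one group than in another, that would be a hint that the icon, the term or its translation does not work equally well for everyone.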


10 Responses to “Some thoughts on testing icons”

  1. Dion Moult Says:

    You should offer an optional checkbox “could be improved” or “unsure” alongside the “pick the closest icon”. That’ll provide both a visual indicator and an ambiguity indicator.

  2. Björn Balazs Says:

    Again, this idea would make the interface more complex and at the same time we would not gain more information. We can see from the answering behaviour whether an icon is working fine or not. We do not need participants to explicitly state this.

  3. Fri13 Says:

    I do not understand how you can see that an icon is working fine when the user selects the icon but still thinks it is not the best one for that?

    I made many such choices because I did not understand at all that I could just vote “empty” by not selecting an icon and pressing Next.

    There were many icons that should be fixed, but they were already familiar from KMail.

    There really should be a “None of the shown is good” choice there, or some way to say “I see this fits best out of all these options, but it could be much better”.

    For example, there was the question of which icon is best for the inbox… who would really choose the arrow left or the arrow right icon? Or who would choose an icon showing a STOP sign?

  4. Björn Balazs Says:

    @Fri13: These tests do not work because of an individual response. They work because of masses of individual responses. And if there is no real matching icon for a term, people will randomly choose icons. We can see this random choice, because different people choose different random icons and it takes much longer to choose a random icon, than to spot the good ones. This is why we do not need a “None of them is good” option.

  5. Fri13 Says:

    Björn Balazs, you have a point, but the many choices people make are gathered from individuals, and it should always be possible to let them express the opinion that none of the icons fits the function, even if they are familiar with the current icons. Just not choosing anything is harder and casts doubt on the results, especially when the users are long-time KMail users, as the test asks.

    There is no way to be sure whether the user selected the icon because she/he knows it is the correct one (already in use), or because he believes it merely fits the question but is not the best for it.

    And the range of icons is very limited: the test had ~30 icons(?), and if the question is “Draft”, there are very few icons that can actually fit that function, because the user does not have multiple choices among different icons that would actually resemble a “draft”. If 90% of the icons are arrows in different directions, a lock or a camera, and 1-2 have some similarity to paper and pen or to storing for later use, those 90% do not give the user the possibility to express which icon is really best there.

    I was expecting multiple versions for every action, so we could choose the best of them. Not that over 90% of the icons do not even resemble the function in question.

    To make an analogy, it was like 90% of the people you can vote for are white and only 10% are black, and there are 10 people you can vote for. You cannot get truthful results if you ask “Select the person who would not be seen so easily in the dark”. But if all the people you can vote for have dark skin, the voter needs to actually think more and you get more accurate results. And if 10% are white, only for a very weird reason would someone vote for him/her as the answer to the same question.

  6. Stuart Jarvis » Blog Archive » How to stop worrying and love the rebranding Says:

    [...] (CC-by) I just had a quick scan through Planet KDE for examples and I’m going to pick on Björn’s excellent post about icon usability (if you didn’t already, please complete the test). Not because it’s bad – or [...]

  7. Random Dork #42523 Says:

    Have to say that I agree that some of the icons just did not fit and I had to think a while about what to put, even when nothing fit. So eventually I just used the trash can icon for such things. Had I known I could leave stuff blank, I would have chosen that. :-/

    I mean, really, sometimes an icon is more cumbersome than the words. I kept thinking to myself, “I’m pretty sure I know which one they had in mind, but it really doesn’t fit and/or make sense.”

    Also: I’m on a small form factor machine with a max resolution of 1024×768, and I generally don’t like to browse full screen. It would be nice if your survey would fit into the browser window instead of using a fixed size and forcing me to scroll for every answer. Now *that’s* bad usability…

  8. Björn Balazs Says:

    I tend to repeat myself: We know whether the users chose an icon directly because it simply fits or not. We can even figure out when an icon was chosen by 100% of the users, but only as a result of sorting out all other icons as less fitting. So we just do not need a “no icon fits”-button.

    We ask the questions concerning experience, usage behaviour etc. simply because we want to be able to analyse the results further and to understand how the background influences the understanding of icons.

  9. Random Dork #42523 Says:

    Don’t get me wrong here, I think you guys are doing important work. But please, please, pretty please, get rid of the fixed width on the survey HTML. :) Not everyone has 1920×1280 monitors. ;)

  10. Björn Balazs Says:

    @Random Dork #42523: Sorry, we will keep this on the agenda, but it won’t be highest priority. So, you will have to live with the way it works at the moment for some more time.
