Delivering more diverse search results

31Dec09

If you’ve ever tried searching for something on Flickr, you may have been inundated with dozens or even hundreds of photos that were taken by the same person. In my experience, this happens when someone takes a ton of pictures at a concert, trade show or company gathering, and then tags them all with the same keywords. If your search matches those keywords but you aren’t interested in that person’s photos, it can be tricky to get past the virtual roadblock that the pictures create on the search results pages.

In Flickr’s case, the obvious solution is to limit how many results are shown from the same user. Most web search engines set this cap at two results per website, but maybe five or ten makes the most sense for image searches. Either way, common sense dictates that there ought to be a cap.

Generalizing this a bit, if your dataset tells you the origin or author of each document, then you should set a hard limit on how many entries from the same source you show at once. In doing so, you’ll make the search results more diverse, increase the chance that users find what they need, and reduce the frustration that comes from seeing nearly identical results over and over again.
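
To make the idea concrete, here is a minimal sketch of a per-source cap in Python. It assumes the results arrive already ranked and that each one can be mapped to a source (a user, a website, an author). The names `cap_per_source`, `source_of`, `limit` and the sample data are made up for illustration; this is not how Flickr or any particular search engine does it.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, List, TypeVar

T = TypeVar("T")


def cap_per_source(
    results: Iterable[T],
    source_of: Callable[[T], Hashable],
    limit: int = 2,
) -> List[T]:
    """Return results in their original order, keeping at most
    `limit` entries from any single source."""
    shown = defaultdict(int)  # source -> number of results kept so far
    kept = []
    for result in results:
        source = source_of(result)
        if shown[source] < limit:
            shown[source] += 1
            kept.append(result)
    return kept


# Hypothetical ranked results: one prolific user dominates the top of the list.
ranked = [
    {"id": 1, "owner": "alice"},
    {"id": 2, "owner": "alice"},
    {"id": 3, "owner": "alice"},
    {"id": 4, "owner": "bob"},
    {"id": 5, "owner": "alice"},
    {"id": 6, "owner": "carol"},
    {"id": 7, "owner": "bob"},
    {"id": 8, "owner": "bob"},
]

diverse = cap_per_source(ranked, source_of=lambda r: r["owner"], limit=2)
print([r["id"] for r in diverse])  # [1, 2, 4, 6, 7]
```

The filter runs in a single pass and preserves the original ranking, so it can sit between the ranking step and pagination. Whether the right limit is two, five or ten depends on the kind of content, as noted above. A gentler variant might push the overflow results to later pages instead of dropping them.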