Time to start thinking about data and IR algorithms
Search remains one of the largest components of information architecture, and it is people’s most familiar form of discovery in today’s apps and websites. We search every day, every hour. It’s become second nature for humans to tap that little magnifying glass when in need of information. Delivering a great search experience is critical to success. I mean, how many of you use Bing?
Here are my thoughts on how designers and teams can build better search and discovery within their products.
I like to see the core of searching split into two simple components: the query and the result. That word, phrase (or sentence if you’re getting on in age) that you type into the box before pressing enter: that’s the query.
I remember laughing at my Dad typing long questions into Google, ending them with a please or thank you, then getting annoyed when he couldn’t find the result he was looking for. I joke about writing sentences as a query, but I’ve found myself doing it more than I used to. Google has come a long way since then; now it’s aware when you’re asking a question, when you’re looking for an article, and when you’re searching for a video, all through the language you use.
Type “When is ___ ?” or “How long until ___ ?” into Google and see how it prioritises one answer/result if it has enough confidence it understood what you were looking for.
The important takeaway here is to be aware of language in the query. If you detect predefined phrases in the query, for example “author of ___ ”, you can be confident the user is searching for a person, not a book, and improve the experience by showing authors first.
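To make the idea concrete, here is a minimal Python sketch of phrase detection. The patterns, the result type labels and the function name `detect_intent` are all hypothetical, chosen only for illustration; a real product would build its list from its own search analytics.

```python
import re

# Hypothetical phrase patterns, each mapped to the result type to show first.
INTENT_PATTERNS = [
    (re.compile(r"^author of\s+(?P<subject>.+)$", re.IGNORECASE), "person"),
    (re.compile(r"^when is\s+(?P<subject>.+?)\??$", re.IGNORECASE), "date"),
]


def detect_intent(query: str) -> tuple[str, str]:
    """Return (result_type, subject). Falls back to ("any", query)."""
    cleaned = query.strip()
    for pattern, result_type in INTENT_PATTERNS:
        match = pattern.match(cleaned)
        if match:
            return result_type, match.group("subject").strip()
    # No known phrase found, so treat the query as a plain keyword search.
    return "any", cleaned


print(detect_intent("Author of Dune"))
print(detect_intent("red shoes"))
```

When the phrase is recognised, the interface can show that result type first and search on the remaining subject rather than the whole sentence.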
Another way to improve language detection in your search is to power it with a thesaurus. Looking up synonyms for the words in a query can vastly improve the experience for international users and catch colloquialisms and slang.
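A rough sketch of thesaurus-backed search, assuming a small hand-made synonym table. The `THESAURUS` entries and the `expand_query` name are made up for this example; in practice the table might come from a dictionary service or from your own search logs.

```python
# Hypothetical thesaurus: each word maps to terms we treat as equivalent.
THESAURUS = {
    "sneakers": {"trainers", "kicks"},
    "trainers": {"sneakers", "kicks"},
    "sofa": {"couch", "settee"},
}


def expand_query(query: str) -> set[str]:
    """Return the words in the query plus any synonyms found in THESAURUS."""
    terms = set()
    for word in query.lower().split():
        terms.add(word)
        terms |= THESAURUS.get(word, set())
    return terms


print(sorted(expand_query("Red Trainers")))
```

The expanded set of terms is then matched against your content, so someone searching for "trainers" also finds items labelled "sneakers".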
Everyone has their own style and way of saying things, so it’s impossible to get this right in one go. The key is analytics. You’ll be surprised by what people search for; it’s eye-opening to see real search data from your product.
Remember, rather than saying “Wow, what an idiot!” — try to understand what that person was really looking for.
Most people will, again, think of Google when I say suggestions. That really clever suggestion engine that autocompletes your sentences as you type, powered by extensive machine learning. In reality, we don’t all have the budget for this type of search in our products.
What we can do is offer real-time content suggestions. An example I love is Medium’s search. While you type a query, Medium matches it to content (which could be an article, a publication, a person, etc.) and updates the suggestions in real time. The difference here is that they’re suggesting content, not language. Hopefully, the developers will agree that this is a much more achievable solution!
The nice thing about suggesting content is that you can help the user find what they’re looking for much faster, completely bypassing the process of entering the full query, loading a new page, then locating the correct result.
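As a sketch of how simple content suggestions can be, the Python below matches what has been typed so far against the start of any word in a title. This is not how Medium does it (that is not public); the `Content` type, the `suggest` function and the sample catalogue are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Content:
    kind: str   # e.g. "article", "publication" or "person"
    title: str


def suggest(typed: str, catalogue: list[Content], limit: int = 5) -> list[Content]:
    """Return up to `limit` items whose title has a word starting with `typed`."""
    needle = typed.strip().lower()
    if not needle:
        return []
    matches = []
    for item in catalogue:
        title = item.title.lower()
        # Match at the start of the title or at the start of any later word.
        if title.startswith(needle) or f" {needle}" in title:
            matches.append(item)
    return matches[:limit]


catalogue = [
    Content("article", "Designing Search"),
    Content("person", "Desmond Lee"),
    Content("publication", "UX Planet"),
]
print([item.title for item in suggest("des", catalogue)])
```

Calling `suggest` on every keystroke (or after a short pause in typing) gives the real-time feel, and the `kind` field lets the interface label each suggestion as an article, publication or person.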
People have poor short-term memory; it’s a fact. Perhaps they found what they were looking for but it was a bad time, or they got distracted by something else. When they finally come back to find that information (which could be weeks or even months later), show them their history of recent searches and allow them to easily restart the journey.
It’s much easier for people to retrace actions from their memory than it is to remember pieces of information. You need to be their short-term memory.
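A recent-searches list needs very little logic. The sketch below assumes an in-memory store per user; the `RecentSearches` class and its limit of ten items are hypothetical, and a real product would persist the history to the user's account or device.

```python
from collections import OrderedDict


class RecentSearches:
    """Keeps a user's most recent unique queries, newest first."""

    def __init__(self, max_items: int = 10) -> None:
        self._items = OrderedDict()
        self._max_items = max_items

    def record(self, query: str) -> None:
        key = query.strip().lower()
        if not key:
            return
        # Re-inserting moves a repeated query to the newest position.
        self._items.pop(key, None)
        self._items[key] = None
        # Drop the oldest entries once the history is full.
        while len(self._items) > self._max_items:
            self._items.popitem(last=False)

    def latest(self) -> list[str]:
        return list(reversed(self._items))


history = RecentSearches(max_items=2)
history.record("Blue shoes")
history.record("red hat")
history.record("blue shoes")
print(history.latest())
```

Repeated queries move to the top instead of appearing twice, which keeps the list short enough to scan at a glance.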
Avoiding the query
What would be better than entering a query and getting an exact match on the information you wanted? Not having to enter the query.
As I mentioned at the start, search remains the most familiar form of discovery in today’s apps and websites. This could all change as some of the most popular apps rethink search and try to associate the well-known magnifying glass icon with discovery.
Take the examples below from Twitter, Instagram and Pinterest.
All three contain a standard input query search, but they also attempt to give the user valuable content before they perform a search — completely removing the manual interaction.
This works both ways, perhaps even more so. People will start to come here to discover new suggested content; then, if they can’t find what they like, the query input is right there when they need it.
When designing for search, ask yourself whether there’s anything you can do to evolve your search feature into a discovery feature.
So you’ve nailed the query design, great! But how do you know what results to show?
Information Retrieval (IR) Algorithm
This sounds very scary, I know. Something you’d probably expect your engineer to deal with — I mean it does have the word algorithm in the title! I can’t pretend I’m an expert in writing these algorithms, but I want to stress how important it is that designers and information architects are involved in building them.
An IR algorithm calculates and retrieves information based on the query supplied to it. These algorithms don’t have to be complex; in fact, they can be very simple. For example, when a hashtag is passed in the query, the algorithm retrieves all posts with that exact hashtag and displays them in reverse-chronological order.
This algorithm finds posts by an exact match on one property, then orders them by sorting on another. There’s no fancy thesaurus work going on here, no advanced calculations, no data weighting.
This simple algorithm can be seen in both Instagram and Twitter.
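Written out in Python, the whole hashtag algorithm fits in a few lines. The `Post` type and `search_hashtag` function are illustrative names only, and the sketch assumes hashtags are stored in lowercase without the leading "#".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    text: str
    hashtags: frozenset   # assumed lowercase, stored without the leading "#"
    posted_at: datetime


def search_hashtag(tag: str, posts: list[Post]) -> list[Post]:
    """Exact match on one property (hashtag), sorted by another (date)."""
    wanted = tag.lstrip("#").lower()
    matching = [post for post in posts if wanted in post.hashtags]
    # Newest first, i.e. reverse-chronological order.
    return sorted(matching, key=lambda post: post.posted_at, reverse=True)


posts = [
    Post("Old sketch", frozenset({"design"}), datetime(2017, 1, 1)),
    Post("Lunch", frozenset({"food"}), datetime(2017, 1, 2)),
    Post("New sketch", frozenset({"design", "ux"}), datetime(2017, 1, 3)),
]
print([post.text for post in search_hashtag("#Design", posts)])
```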
Now we’ve looked at a simple one, let’s start thinking about a more complex one. The one above used only two data properties, but it wouldn’t be uncommon for an algorithm to look at hundreds or even thousands of properties to match and display the information. As I’ve used Instagram and Twitter as examples, let’s stick with them.
Both social apps have a section called ‘Top’ when performing a search. This section attempts to show you the most valuable information first: perhaps it’s trending right now, or perhaps it’s by somebody you know. Unfortunately we don’t know the exact algorithm, but we can start to think about how it might work.
What we can assume is that each post has an invisible property that ranks its top value, calculated from lots of pieces of data. If the post’s top value passes the benchmark top value, it is displayed within the Top section.
Some data it could be using:
- Number of followers
I’m guessing they use all of this, and probably much more. The interesting part here is weighting. It’s not as simple as the sum of these numbers equals the top value. Each property can be assigned a multiplier to weight the most important data; for example, Likes has more user value than Views, in my opinion. For something like Twitter, where the real-time feed is so important to the product and therefore the user experience, I’m sure Date has a very high weighting to keep content moving and updating. Perhaps Date is more than just one piece of data: what if Likes per Hour is a property we don’t know about? What if Date was applied to all properties? It would put a huge emphasis on ‘What’s popular now’. Makes sense for Twitter, right?
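Here is a small Python sketch of that weighting idea. To be clear, this is a guess at the general shape and not Twitter's or Instagram's actual algorithm: the weights, the benchmark and every name below are invented for illustration. Dividing by the post's age applies Date to all properties at once, so each count behaves like a per-hour rate.

```python
from dataclasses import dataclass

# Hypothetical multipliers: a like counts for far more than a view.
WEIGHTS = {"likes": 3.0, "views": 0.1, "followers": 0.01}
TOP_BENCHMARK = 50.0   # made-up cut-off for the Top section


@dataclass
class PostStats:
    likes: int
    views: int
    followers: int
    age_hours: float


def top_value(stats: PostStats) -> float:
    """Weighted sum of the properties, scaled down as the post gets older."""
    weighted_sum = (
        WEIGHTS["likes"] * stats.likes
        + WEIGHTS["views"] * stats.views
        + WEIGHTS["followers"] * stats.followers
    )
    # Posts under an hour old are treated as one hour old.
    return weighted_sum / max(stats.age_hours, 1.0)


def is_top(stats: PostStats) -> bool:
    return top_value(stats) >= TOP_BENCHMARK


fresh = PostStats(likes=20, views=400, followers=1000, age_hours=1)
stale = PostStats(likes=20, views=400, followers=1000, age_hours=10)
print(is_top(fresh), is_top(stale))
```

The two example posts have identical numbers, yet only the fresh one passes the benchmark. Changing a single weight changes which posts surface, and that is exactly the kind of decision designers can help make.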
The point of this is not to start reverse-engineering Twitter’s algorithms, which are most probably powered by machine learning and built by data scientists much more intelligent than us.
It’s to start thinking about data priority in search and within your product, and how it can affect the user’s experience.
One last example
IR algorithms are used in more than just manual search queries. Let’s look at Dribbble.
As I’m sure most will know, Dribbble has a ‘popular’ page. It displays trending work, uploaded by designers in the community.
The popular page uses an IR algorithm just like the Twitter ‘Top’ section, but it doesn’t take a manual query; it defaults to showing all content. From a high-level view, the algorithm seems to take into account Likes and Views within a timeframe, perhaps a day.
The page is actually heavily criticised by new members. The same names seem to appear on the page each day, people with a large social following. This makes it really hard for new users (who could be doing outstanding work) to get visibility on the platform. Ultimately they become demotivated and stop using the platform after a couple of posts. So, how could Dribbble improve this?
By experimenting with data priority. They could increase the weighting if the user has posted fewer than 10 times. They could shorten the timeframe so the page updates more frequently; if these popular designers aren’t uploading every hour, that gives other designers room to appear higher up the page. They could consider the ratio of views to number of followers: if two designers both get 1,000 views, the one with fewer followers is technically more popular for that post.
It’s not easy: while you want to help lesser-known users, the page still needs to be populated with quality content. It’s all about playing with the numbers, testing with real users, and finding the right balance.
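Those experiments can be prototyped in a few lines of Python. This is only a sketch of the levers described above, not Dribbble's real ranking: the weights, the boost, the cap and all names are assumptions, and the like and view counts are assumed to be already limited to whatever timeframe you choose.

```python
from dataclasses import dataclass

NEWCOMER_POST_LIMIT = 10   # "posted fewer than 10 times"
NEWCOMER_BOOST = 1.5       # hypothetical multiplier for newcomers
MAX_REACH_BONUS = 5.0      # cap so tiny accounts cannot dominate the page


@dataclass
class Shot:
    title: str
    likes_in_window: int    # likes received inside the chosen timeframe
    views_in_window: int    # views received inside the chosen timeframe
    author_followers: int
    author_post_count: int


def popularity(shot: Shot) -> float:
    base = shot.likes_in_window + 0.1 * shot.views_in_window
    # Views per follower: the same views from a smaller following score higher.
    reach = shot.views_in_window / max(shot.author_followers, 1)
    score = base * (1.0 + min(reach, MAX_REACH_BONUS))
    if shot.author_post_count < NEWCOMER_POST_LIMIT:
        score *= NEWCOMER_BOOST
    return score


veteran = Shot("Veteran shot", 100, 1000, 50000, 200)
newcomer = Shot("Newcomer shot", 100, 1000, 200, 3)
ranked = sorted([veteran, newcomer], key=popularity, reverse=True)
print([shot.title for shot in ranked])
```

With these made-up numbers the newcomer ranks first despite identical likes and views. Whether that is the right outcome is a question for testing with real users, not something the code can answer.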
Writing and building algorithms will rightfully always be a developer’s role. I’m suggesting that, as designers, we don’t shy away from these data-driven features, but instead steer into them. Your team will thank you.