A man wearing the first hardware for eyetracking

Attention Please! To Eyetrack or Not to Eyetrack for Usability Evaluation

The UX Research Toolbox

by Philipp Ehrle • 7min read

Eyetracking as an Evaluation Method

Eyetracking is a common tool for recording the gaze of individuals. The goal is to study visual attention by examining eye movements, fixations, their duration and sequence. This allows for the investigation of the complex interplay between visual elements and spatial orientation of individuals. The areas of application are diverse; it is used in advertising, product development and scientific research, among other fields. For example, eyetracking is an established and highly standardized tool in the automotive industry for measuring visual behavior while driving, as demonstrated by the international standard ISO 15007:2020.

Eyetracking is appealing because it delivers objective, reliable data and thereby promises that clear recommendations for action can be derived. Additionally, thanks to technological advancements, it now offers a certain degree of flexibility in recording. Mobile apps, websites, and physical products can all be tested, since a head-mounted or remote eyetracking system can be chosen depending on the application.

But how relevant is eyetracking really for product development that aims to create user-friendly interactive systems? For usability evaluation, the question arises: To eyetrack or not to eyetrack?

The cumbersome beginnings of eyetracking, shown in a study by Shackel in 1960 (Note on Mobile Eye Viewpoint Recording, https://www.researchgate.net/publication/9143696_Note_on_Mobile_Eye_Viewpoint_Recording). Fortunately for study participants, eyetracking systems have since become noticeably lighter and less obtrusive.

The Valuable Insights of Eyetracking

Eyetracking provides a multitude of data: standardizable metrics such as gaze durations and gaze sequences, as well as intuitive visualizations like heatmaps, opacity maps, and gaze plots.

Heatmap, opacity map and gaze plot (https://www.usability.de/leistungen/ux-testing-nutzerforschung/eyetracking.html)
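To make the standardizable metrics more concrete, here is a minimal Python sketch of how gaze durations and gaze sequences could be derived from a list of fixations. The `Fixation` and `AreaOfInterest` types, the screen regions, and the sample data are all hypothetical and only serve as an illustration; real eyetracking software exports its own formats and metric definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixation:
    x: float            # horizontal gaze position in pixels
    y: float            # vertical gaze position in pixels
    duration_ms: float  # how long the gaze rested here


@dataclass(frozen=True)
class AreaOfInterest:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation: Fixation) -> bool:
        return (self.left <= fixation.x < self.right
                and self.top <= fixation.y < self.bottom)


def dwell_times(fixations, areas):
    """Sum the fixation durations per area (total gaze duration in ms)."""
    totals = {area.name: 0.0 for area in areas}
    for fixation in fixations:
        for area in areas:
            if area.contains(fixation):
                totals[area.name] += fixation.duration_ms
                break  # the first matching area wins
    return totals


def first_visit_order(fixations, areas):
    """Return the area names in the order they were first looked at."""
    order = []
    for fixation in fixations:
        for area in areas:
            if area.contains(fixation):
                if area.name not in order:
                    order.append(area.name)
                break
    return order


# Hypothetical page layout (1280 px wide) and hypothetical fixations.
AREAS = [
    AreaOfInterest("navigation", 0, 0, 1280, 100),
    AreaOfInterest("banner", 900, 100, 1280, 400),
    AreaOfInterest("content", 0, 100, 900, 800),
]
FIXATIONS = [
    Fixation(200, 300, 250),
    Fixation(640, 50, 180),
    Fixation(300, 350, 400),
    Fixation(1000, 200, 90),
]

print(dwell_times(FIXATIONS, AREAS))
print(first_visit_order(FIXATIONS, AREAS))
```

Numbers like these can be compared across participants and designs, which is much harder to do with a heatmap alone.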

In the analysis, the data can be used to answer various research questions:

What do users perceive, and what do they overlook?

Which texts do users skim, and which do they read?

What distracts users?

What are the causes of usage errors and near misses?

In what order do users perceive content?

Why do users need so much time?

How user-friendly is the navigation and layout?

How do two products or designs differ from each other?

With questions like these, UX research has revealed some interesting insights through eyetracking in the past:

Inattentional blindness (Mack & Rock, 1998, Inattentional blindness. https://psycnet.apa.org/record/1998-07464-000): People are practically blind to visual elements that are irrelevant to their goals and actions. This is also demonstrated by the well-known Selective Attention Test by Daniel J. Simons in 1999 (https://www.youtube.com/watch?v=vJG698U2Mvo).

Banner Blindness: Based on inattentional blindness, it has been shown that people ignore elements on the web that look like advertising. (https://www.nngroup.com/articles/banner-blindness-original-eyetracking/)

Pattern Scanning: People scan websites according to specific patterns. The pattern depends on the type of content and the goal of the person. The most well-known and frequently cited is probably the F-pattern. However, the Layer-Cake Pattern and the Spotted Pattern are also relevant. (https://www.nngroup.com/articles/f-shaped-pattern-reading-web-content/)

Love at first sight phenomenon: People do not always aim for the best possible solution. Especially in visual information searches on websites, it has been shown that a large proportion of people are satisfied with the first result and do not look any further. (https://www.nngroup.com/articles/love-at-first-sight-pattern/)

But What About Pitfalls, Challenges and Limitations?

Limited significance and difficult-to-interpret data

Usability is commonly measured by effectiveness, efficiency, and satisfaction. However, eyetracking only provides insights into potential issues for efficiency and effectiveness and cannot make any statements about satisfaction. Therefore, it is recommended to use the method in combination with qualitative usability tests and think-aloud techniques.

Additionally, eyetracking only shows foveal gaze direction, not peripheral perception or cognitive attention. When a user looks at an element, it is not clear whether they have understood its meaning. However, observation over time can be informative here. Similarly, if they ignore an element, it is unclear whether they have already perceived it in their peripheral vision and deemed it unimportant for the task (https://www.interaction-design.org/literature/article/eye-tracking-ux). This is a blind spot in the method.

Moreover, the focus in the data analysis quickly shifts to visual representations like heatmaps because they are intuitively accessible to many people. However, these are difficult to standardize and context-sensitive, making them hard to interpret. The objective and well-standardizable data, which actually represents the primary value of eyetracking, receives less attention than it deserves.

High levels of expertise required

The gaze behavior of individuals is strongly influenced by the task and the specific scenario (see Yarbus, 1967). This means that eyetracking studies should be designed and conducted by experts who can ensure the accuracy and validity of the data. Without proper planning, the findings might not reflect the intended user behavior.

Even after data collection, researchers are left with a large volume of unstructured data. Modern eyetrackers can capture over 7,500 readings per minute, which means the data needs to be processed, analyzed, and interpreted. This step requires a high level of expertise to ensure that the insights drawn are actionable and relevant to the product design.
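To give an impression of what this processing can involve: 7,500 readings per minute correspond to 125 samples per second, and a common first step is to condense these raw samples into fixations. The following Python sketch uses a simple dispersion-threshold approach, one of several established ways to do this. It is not the method of any particular eyetracking product, and the thresholds (35 pixels, 100 ms) as well as the sample data are assumptions chosen purely for illustration.

```python
import math

# From the article: "over 7,500 readings per minute", i.e. at least 125 Hz.
READINGS_PER_MINUTE = 7_500
SAMPLE_RATE_HZ = READINGS_PER_MINUTE / 60  # 125.0 samples per second


def dispersion(points):
    """Spread of a window of (x, y) points: x range plus y range."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, sample_rate_hz=SAMPLE_RATE_HZ,
                     max_dispersion_px=35.0, min_duration_ms=100.0):
    """Group raw gaze samples into fixations (dispersion-threshold approach).

    Returns a list of (centroid_x, centroid_y, duration_ms) tuples.
    Both thresholds are illustrative assumptions, not recommended values.
    """
    min_len = max(1, math.ceil(min_duration_ms * sample_rate_hz / 1000))
    fixations = []
    start = 0
    while start + min_len <= len(samples):
        end = start + min_len
        if dispersion(samples[start:end]) > max_dispersion_px:
            start += 1  # window too spread out: skip one sample and retry
            continue
        # Grow the window while the points stay close together.
        while (end < len(samples)
               and dispersion(samples[start:end + 1]) <= max_dispersion_px):
            end += 1
        window = samples[start:end]
        cx = sum(x for x, _ in window) / len(window)
        cy = sum(y for _, y in window) / len(window)
        fixations.append((cx, cy, len(window) * 1000 / sample_rate_hz))
        start = end
    return fixations


# Hypothetical recording: a fixation, a quick jump, then a second fixation.
SAMPLES = (
    [(100.0 + (i % 3), 100.0) for i in range(20)]
    + [(180.0 + 80.0 * i, 140.0 + 40.0 * i) for i in range(4)]
    + [(500.0, 300.0 + (i % 2)) for i in range(15)]
)

for cx, cy, duration_ms in detect_fixations(SAMPLES):
    print(f"fixation at ({cx:.1f}, {cy:.1f}) for {duration_ms:.0f} ms")
```

In this example, 39 raw samples collapse into two fixations of 160 ms and 120 ms. Choosing thresholds that fit the hardware, the stimulus, and the research question is exactly the kind of decision that requires expertise.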

High costs and efforts

The greatest effort in eyetracking studies is tied to the participants. They need to be recruited, incentivized, and guided through the study. Jakob Nielsen recommends at least 30 participants (Nielsen, Pernice, 2009, Eyetracking Web Usability, https://dl.acm.org/doi/10.5555/1823564). However, since things can go wrong, such as data not being recorded, it is better to aim for 40 users. There is also the option of conducting smaller qualitative studies with a sample size of 5-10 participants. However, this rules out validating the design with inferential statistics. Instead, the data allows for deriving hypotheses about the design, which must be validated later in development.
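The step from 30 to 40 participants can be read as a buffer for unusable recordings. As a rough sketch of that arithmetic in Python: the function name is hypothetical, and the loss rate of 25% is an assumption back-calculated from the two numbers above, not a figure stated in the source.

```python
import math


def participants_to_recruit(required_valid: int, expected_loss_rate: float) -> int:
    """Number of people to recruit so that enough valid recordings remain.

    expected_loss_rate is the assumed share of sessions that end up
    unusable (for example, because data was not recorded).
    """
    if not 0 <= expected_loss_rate < 1:
        raise ValueError("expected_loss_rate must be in the range [0, 1)")
    return math.ceil(required_valid / (1 - expected_loss_rate))


# 30 valid recordings needed, assuming one in four sessions is lost.
print(participants_to_recruit(30, 0.25))
```

With these assumptions, the result is the 40 participants mentioned above. A team that has its own experience with dropout and recording failures would plug in its own rate instead.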

Another cost factor is the tooling, both software and hardware. This includes acquisition costs, setup, and pilot testing. Technical difficulties always occur, so sufficient time buffers should be planned.

The setup also includes calibration. How difficult this is varies between individuals; for example, capturing is more challenging for people with pronounced facial expressions. Different eye shapes and sizes, as well as glasses, also make calibration more difficult. With a remote eyetracking system, the participant must hold a similar position throughout the recording. Otherwise, the calibration is lost. However, it should be noted that calibration has become much easier over time due to technological advancements.

Soo… To Eyetrack or Not to Eyetrack?

As always in design, the answer is... it depends. In the right hands and with the right research question, eyetracking can provide great value. It can bring a completely new behavioral dimension to usability testing, which can be particularly insightful because it highlights usability issues that would remain hidden with conventional observation and survey methods. At the same time, the method has blind spots that should be compensated for by other methods.

Jakob Nielsen also poses the question and answers it himself: "Should you use eyetracking in your usability studies? Probably not." He advises most companies against using eyetracking. If the method is to be used, he recommends relying on experts.

The key factor in determining whether eyetracking is useful is which product is to be evaluated and with what goal. Especially for critical domains with safety risks or complex contexts of use, eyetracking is an important tool in the methodological toolbox. The automotive, aviation, military and medical technology industries come to mind. For simple marketing websites, the benefit of an eyetracking study will rarely justify its cost. However, there are attempts to replace costly participant studies with a single click, using large databases and AI (e.g. https://www.neuronsinc.com/). Whether these tools go beyond a basic analysis of visual hierarchy remains to be seen. For the usability evaluation of complex, safety-critical systems, they will not be helpful in most cases. These systems and their use cases are too individual and too complex. But we will see how technological advancements in the field of AI will surprise us anew.


Philipp Ehrle

Digital Designer