But What About Pitfalls, Challenges and Limitations?
Limited significance and difficult-to-interpret data
Usability is commonly measured by effectiveness, efficiency, and satisfaction. However, eyetracking only reveals potential issues with efficiency and effectiveness; it says nothing about satisfaction. It is therefore recommended to combine the method with qualitative usability tests and think-aloud techniques.
Additionally, eyetracking only shows foveal gaze direction, not peripheral perception or cognitive attention. When a user looks at an element, it is not clear whether they have understood its meaning. However, observation over time can be informative here. Similarly, if they ignore an element, it is unclear whether they have already perceived it in their peripheral vision and deemed it unimportant for the task (https://www.interaction-design.org/literature/article/eye-tracking-ux). This is a blind spot in the method.
Moreover, the focus in data analysis quickly shifts to visual representations such as heatmaps because they are intuitively accessible to many people. However, heatmaps are context-sensitive and difficult to standardize, which makes them hard to interpret. The objective, readily standardized data, which represents the primary value of eyetracking, receives less attention than it deserves.
High levels of expertise required
The gaze behavior of individuals is strongly influenced by the task and the specific scenario (see Yarbus, 1967). This means that eyetracking studies should be designed and conducted by experts who can ensure the accuracy and validity of the data. Without proper planning, the findings might not reflect the intended user behavior.
Even after data collection, researchers are left with a large volume of unstructured data. Modern eyetrackers can capture over 7,500 readings per minute, which means the data needs to be processed, analyzed, and interpreted. This step requires a high level of expertise to ensure that the insights drawn are actionable and relevant to the product design.
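To get a feel for the data volume, the arithmetic can be sketched in a few lines of Python. Only the 7,500 readings per minute come from the text above. The session length of 20 minutes and the 30 participants are illustrative assumptions, and the function names are hypothetical.

```python
# Figure quoted in the text: readings captured per minute.
SAMPLES_PER_MINUTE = 7_500


def sampling_rate_hz(samples_per_minute: int) -> float:
    """Convert readings per minute into a sampling rate in Hz."""
    return samples_per_minute / 60


def total_samples(samples_per_minute: int, session_minutes: int, participants: int) -> int:
    """Raw gaze samples collected across a whole study."""
    return samples_per_minute * session_minutes * participants


# Assumed study design (not from the source): 20-minute sessions, 30 participants.
rate = sampling_rate_hz(SAMPLES_PER_MINUTE)
total = total_samples(SAMPLES_PER_MINUTE, session_minutes=20, participants=30)

print(f"Sampling rate: {rate:.0f} Hz")
print(f"Raw samples in the study: {total:,}")
```

Under these assumptions the study produces several million raw gaze samples, which is why they are usually condensed into fixations and other metrics before anyone interprets them.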
High cost and effort
The greatest effort in eyetracking studies goes into the participants. They need to be recruited, incentivized, and guided through the study. Jakob Nielsen recommends at least 30 participants (Nielsen, Pernice, 2009, Eyetracking Web Usability, https://dl.acm.org/doi/10.5555/1823564). However, since things can go wrong, such as data not being recorded, it is better to aim for 40 users. Smaller qualitative studies with a sample size of 5-10 participants are also an option, but they rule out inferential statistical validation of the design. Instead, the data can be used to derive hypotheses about the design, which must then be validated later in development.
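The buffer between 30 required and 40 recruited participants can be expressed as a small calculation. The sketch below is only an illustration: the 25% loss rate is back-calculated from those two numbers rather than taken from the cited source, and the function name is hypothetical.

```python
import math


def participants_to_recruit(target_valid: int, expected_loss_rate: float) -> int:
    """Number of people to recruit so that, after expected data loss,
    at least `target_valid` usable recordings remain."""
    if not 0 <= expected_loss_rate < 1:
        raise ValueError("expected_loss_rate must be in [0, 1)")
    return math.ceil(target_valid / (1 - expected_loss_rate))


# Assumption: one in four recordings turns out unusable (25% loss).
print(participants_to_recruit(30, 0.25))
```

With a lower assumed loss rate the buffer shrinks accordingly, so the rate is worth estimating from a pilot test rather than guessing.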
Another cost factor is the tooling, both software and hardware. This includes acquisition costs, setup, and pilot testing. Technical difficulties always occur, so sufficient time buffers should be planned.
The setup also includes calibration. Its difficulty varies between individuals; for example, capturing is more challenging for people with very pronounced facial expressions. Different eye shapes and sizes, as well as glasses, also make calibration more difficult. With a remote eyetracking system, the participant must hold a similar position throughout the recording. Otherwise, the calibration is lost. However, it should be noted that calibration has become much easier over time due to technological advancements.