Credit Suisse
Abstract
Mobile apps have increasingly become the preferred channel for banking, as clients can access their accounts and conduct transactions at their convenience, anytime and anywhere. As the scope and adoption of mobile banking widen, the usability of these interfaces becomes an increasingly important focus of study and directly shapes the experience of clients across all demographics.
Given the opportunity to test the latest Credit Suisse iOS mobile banking app prototype, we focus on the usability of the user interface and the payments workflow. We seek to uncover possible usability issues and identify new ways to improve payments on the mobile platform from an international point of view.
For the study, we tested the prototype with thirteen participants, comprising relationship managers and multinational university students. Our findings suggest that simplicity, efficiency, and contextual convenience are the highest priorities when designing banking experiences for the mobile platform.
We conclude with recommendations for the final design of the interface and underscore the importance of aligning the feature set with the purpose of mobile banking, preventing errors, and educating users on new design and functionality.
Theory
An entire subfield within Human-Computer Interaction focuses on testing the usability of interfaces in the hope of developing them into systems that are as user-friendly as possible. The main goal of usability testing is to “improve the quality of an interface by finding flaws in it” that “cause problems for users” (Lazar, 2010). In contrast to summative testing, which takes place at the later stages of the design process, formative user-based testing occurs earlier in the process. At this stage, users are often given wireframes and prototypes that do not yet have the full feature set implemented. As a result, the main focus of formative usability studies is to provide an opportunity for problem discovery, generate data, and gather feedback that can help guide the final design of the interface.
It is often said that five users will find around 80% of the usability problems in a given interface (Virzi, 1992). However, that “magic number” also depends on other factors, such as the complexity of the interface and the scope of the tasks (Lazar, 2010). Regardless, the goal of usability testing is not to find every usability issue, but to uncover the major, high-priority, high-severity ones.
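The 80% figure follows from a simple discovery model: if each user independently encounters a given problem with probability p, the expected share of problems found by n users is 1 − (1 − p)^n. The sketch below illustrates this; the value p = 0.31 is an assumption chosen for illustration, not a figure measured in this study.

```python
# Minimal sketch of the problem-discovery model behind the "five users" claim.
# The per-user detection probability p is an illustrative assumption.

def share_of_problems_found(n_users: int, p: float = 0.31) -> float:
    """Expected proportion of usability problems uncovered by n_users."""
    return 1.0 - (1.0 - p) ** n_users

if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} users: {share_of_problems_found(n):.0%}")
    # With p = 0.31, five users are expected to uncover roughly 84% of the
    # problems, in line with the ~80% figure cited above.
```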
Usability testing is built around task scenarios that emulate real-world use. Working through these scenarios generates both quantitative and qualitative data. Quantitative metrics such as task success, satisfaction, and error rates can be evaluated throughout the interviews, while qualitative data emerge as users talk through their thought process aloud, a technique known as the think-aloud protocol (Lazar, 2010). This protocol is particularly helpful for formative usability testing, given that not every part of the interface is interactive or fully implemented.
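As a minimal sketch of how such quantitative metrics might be tallied, the snippet below computes task success rate, mean errors, and mean time on task from per-attempt records. The data structure, field names, and values are illustrative assumptions, not the format or results of this study.

```python
from dataclasses import dataclass

@dataclass
class TaskAttempt:
    participant: str
    task: str
    completed: bool   # task success
    errors: int       # errors observed during the attempt
    seconds: float    # time on task

# Placeholder attempts for a single task scenario (not real study data).
attempts = [
    TaskAttempt("P01", "make_payment", True, 1, 74.0),
    TaskAttempt("P02", "make_payment", False, 3, 152.0),
    TaskAttempt("P03", "make_payment", True, 0, 61.0),
]

success_rate = sum(a.completed for a in attempts) / len(attempts)
mean_errors = sum(a.errors for a in attempts) / len(attempts)
mean_time = sum(a.seconds for a in attempts) / len(attempts)

print(f"Task success rate: {success_rate:.0%}")
print(f"Mean errors per attempt: {mean_errors:.1f}")
print(f"Mean time on task: {mean_time:.0f} s")
```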
Another critical part of usability testing is surveys and questionnaires. Questions that probe users’ perceptions of a system may include: simple rating scales, such as Likert scales, to rate difficulty and quantify expectations before and after a given task or scenario; ratings of specific qualitative attributes, such as visual appeal and enjoyment, that capture users’ emotional connection with the system; and open-ended questions that encourage critical reflection, such as asking users to rank the top three things they liked most (Tullis, 2008).
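A minimal sketch of the before/after expectation ratings mentioned above is shown below: each participant rates expected ease before a task and experienced ease afterwards on a 1–5 Likert scale. The participant IDs and ratings are placeholders, not data from this study.

```python
# Expected vs. experienced ease on a 1-5 Likert scale (5 = very easy).
# Values are illustrative placeholders.
before = {"P01": 4, "P02": 3, "P03": 5}   # expected ease, rated pre-task
after = {"P01": 2, "P02": 3, "P03": 4}    # experienced ease, rated post-task

for pid in before:
    delta = after[pid] - before[pid]
    print(f"{pid}: expected {before[pid]}, experienced {after[pid]}, delta {delta:+d}")
# Tasks whose experienced rating falls well below expectation are good
# candidates for closer review in the qualitative notes.
```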
In particular, the System Usability Scale (SUS) provides a standardized metric for benchmarking and evaluating the overall usability of a given system (Brooke, 1996). The survey consists of ten questions, five positively worded and five negatively worded. The responses are combined into an overall score from 0 to 100, with 100 being the best possible score. Scores above 80 are generally considered good, and systems should aim to surpass this threshold (Tullis, 2008).
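The standard SUS scoring procedure (Brooke, 1996) can be expressed as a short sketch: each item is answered on a 1–5 scale, positively worded items contribute (response − 1), negatively worded items contribute (5 − response), and the sum is multiplied by 2.5 to yield a 0–100 score. The example responses below are illustrative only.

```python
def sus_score(responses: list[int]) -> float:
    """Compute the SUS score from ten 1-5 responses (item 1 is positively worded)."""
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = 0
    for i, r in enumerate(responses):
        # Odd-numbered items (index 0, 2, ...) are positive; even-numbered are negative.
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5

# Example with placeholder responses, not data from this study.
print(sus_score([5, 1, 4, 2, 5, 1, 4, 2, 5, 2]))  # 87.5
```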