
Citibeats

Query editor powered by AI.

User Research
Task Analysis
Data-Driven
Role
Lead Product Designer
Duration
3 months
Collaborators
Alex, Data Science
Irene Chausse, PM
Marc Ribera, Front-End
Oscar Delgado, Back-End
Anna Ruiz, Back-End
Tools
Figma
01
The company

Social listening platform

Citibeats is a B2G SaaS start-up collaborating with leading multilateral agencies, including the World Health Organization (WHO), the World Bank, and other United Nations entities. It offers a social listening platform powered by ethical AI and NLP that processes large volumes of unstructured social media data, delivering actionable insights into population concerns across key impact areas.

02
The problem

Too long, too costly

The lengthy, tedious process of setting up and defining research topics, built around Boolean queries, was costing the company thousands, with onboarding stretching beyond a month and requiring long hours from the customer success team.

Our objective was to empower users to independently define topics for analysis, reducing the onboarding time from a month to under 30 minutes.

Complex Boolean query definition.
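For context, a single topic previously had to be defined with a hand-written query of roughly this shape (a hypothetical example for illustration, not an actual Citibeats query):

```python
# Hypothetical example of the kind of hand-written Boolean query a single topic
# required (illustrative only, not an actual Citibeats query).
vaccine_hesitancy_query = (
    '("vaccine" OR "vaccination" OR "jab" OR "booster") '
    'AND ("side effect*" OR "unsafe" OR "refuse*" OR "don\'t trust") '
    'NOT ("veterinary" OR "pet" OR "giveaway")'
)
```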
03
Design brief

Reimagined, not refined

As the lead on product design and UX strategy, I collaborated with major stakeholders to set the design brief, KPIs, and success criteria.

Redesign the dataset creation process so that non-data-analysts can create relevant topics for their analysis, without Citibeats' intervention, in a timely manner and within budgetary constraints.

Other key areas of focus included:

  • Streamlining the complexity through user-centric design instead of a technology-driven interface.
  • Exploring the integration of ChatGPT for intelligent query support.
  • Implementing new clustering and summarisation models to enhance functionality.
04
User Personas

Catering to generalists

We quickly identified that improving the usability of the topic creation process required us to cater to various personas beyond "Data Analysts" familiar with Boolean logic and complex queries.

Quick Search for speed, Query Wizard for simplicity, and Query Editor for precision—tailored for every type of user.

We introduced three approaches for topic creation: Quick Search for instant data previews, Query Wizard for complex analysis targeting non-technical users, and the Query Editor for advanced users needing detailed control.

This case study focuses on the development of the Query Wizard, which was designed to maintain the depth of the Query Editor while being accessible to all user personas.

05
Design: part one

Defining keywords

The first goal was to translate the complex query definition into a user-centric solution that would be accessible for all. Here is the evolution of iterations (from left to right):

06
Design: part two

Leveraging LLMs

People think in questions rather than keywords. This insight led us to explore how Large Language Models (LLMs) like ChatGPT could be integrated to make the query process more intuitive.

A lot of users were stuck at the starting line. Our approach was simple: let ChatGPT suggest keywords based on the topic name to get the ball rolling.
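The case study doesn't cover implementation details, but a minimal sketch of this suggestion flow, assuming the OpenAI chat completions API, could look like the following; the helper name, model choice, and prompt wording are illustrative assumptions, not the actual Citibeats code.

```python
# Minimal sketch of AI keyword suggestions, assuming the OpenAI chat completions API.
# The helper name, model choice, and prompt wording are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def get_keyword_suggestions(topic_name: str, n: int = 10) -> list[str]:
    """Ask the model for social listening keywords related to a topic name."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You suggest social listening search keywords. "
                        "Reply with one keyword or short phrase per line."},
            {"role": "user",
             "content": f"Suggest {n} keywords for the research topic: {topic_name}"},
        ],
    )
    text = response.choices[0].message.content or ""
    return [line.strip("-• ").strip() for line in text.splitlines() if line.strip()]

# The suggestions are then shown to the user, who accepts or discards each one
# before they are added to the topic's keyword set.
```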

However, we identified some issues with the first iteration.

The suggestions were pushed into the keyword area, leading users to over-rely on AI suggestions rather than considering keyword relevance.

It created a subtractive experience, where users pruned the AI's output rather than building their own query, which felt disjointed.

For the next iteration, we separated AI suggestions from manual inputs, letting users auto-generate suggestions while still carefully considering each keyword choice, which ultimately enhanced the user experience.

Variant B: AI suggestions moved to a different section.

Lastly, as an alternative to a research-question-driven UI, we moved the AI query input into the AI Suggestions section, eliminating the need for a research question and leaving the setup to pure keyword definition.

Variant C: AI suggestions with a text input.

In the end, after usability studies and primary interviews, we confirmed that having users formulate a research question yielded better usability results, so we kept proposal B.

07
Design: part three

Cluster, summarise

To further enhance user experience and time-to-insight, I worked with our data science team to integrate the new clustering and summarisation ML models into the interface.

We wanted to use AI as a partner in reducing cognitive load.

In the first iteration, we introduced a Data Preview feature, allowing users to see a sample of the most relevant documents based on their query.

Twitter documents preview.

Going through every conversation to grasp the overall meaning of the data takes a lot of time. Using these algorithms, I designed an interface that groups the conversations into subtopics and summarises them. We also gave users the ability to rate the relevance of each subtopic, which helped refine the system's machine learning model.

Data preview section with subtopic summaries.
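The production models were built by the data science team; as a rough sketch of the shape of this pipeline, here is a stand-in using TF-IDF features and k-means clustering from scikit-learn (library choices, feature extraction, and cluster count are all assumptions):

```python
# Rough sketch of grouping conversations into subtopics. TF-IDF features and
# k-means from scikit-learn are stand-ins; the production clustering and
# summarisation models were different.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_conversations(documents: list[str], n_subtopics: int = 5) -> dict[int, list[str]]:
    """Group raw conversation texts into n_subtopics clusters."""
    vectors = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(documents)
    labels = KMeans(n_clusters=n_subtopics, n_init=10, random_state=42).fit_predict(vectors)
    clusters: dict[int, list[str]] = {}
    for doc, label in zip(documents, labels):
        clusters.setdefault(int(label), []).append(doc)
    return clusters

# Each cluster is then summarised (in production, by a dedicated model) and shown
# as a subtopic card; user relevance ratings feed back into model refinement.
```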

However, usability tests revealed that some summaries were occasionally faulty, making it clear we couldn't get rid of the conversations completely. Hence, we added a more detailed view of each cluster and tested two different layouts:

The vertical layout improved comprehension: participants spent 60% more time reviewing conversation data and made twice as many keyword refinements.

08
Design KPIs

Focusing on what matters

Having iterated on all the design components, I evaluated the new designs using the Top Tasks methodology, previously established through card sorting research. We focused on the following KPIs:

Task Success Rate

We aimed for a success rate above 80%, testing 8 tasks per one-hour session with at least 15 participants. These included both frequent and infrequent users, ranging from data analysts to non-technical users.

Task Completion Times

We set target times for each task in collaboration with the whole team and tracked progress every 3 months to ensure continued improvement.

System Usability Scale

To ensure ease of use, we aimed for a System Usability Scale (SUS) score above 75, meaning most users could easily navigate and interact with the interface.
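For reference, a SUS score is computed from the standard 10-item questionnaire; a small helper along these lines (a sketch of the standard scoring formula, not our testing tooling) turns raw 1-to-5 responses into the 0-to-100 score we tracked:

```python
# Standard System Usability Scale scoring: 10 items rated 1-5, odd items
# contribute (rating - 1), even items contribute (5 - rating), and the sum
# is scaled by 2.5 to give a 0-100 score. A sketch, not our actual test tooling.
def sus_score(responses: list[int]) -> float:
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("Expected 10 responses, each rated 1-5.")
    total = 0
    for i, rating in enumerate(responses, start=1):
        total += (rating - 1) if i % 2 == 1 else (5 - rating)
    return total * 2.5

# Example: a fairly positive participant lands above the >75 target.
print(sus_score([5, 2, 4, 1, 5, 2, 4, 2, 5, 1]))  # 87.5
```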

09
Usability testing

Data relevance comes first

To validate our design hypotheses, we tested three different layout patterns, gathering feedback on overall user experience and task performance.

Variant A: Stepper
Variant B: Top-to-bottom
Variant C: Split (1/2)
Variant C: Split (2/2), slide-out pane

Overall, we conducted usability testing with 31 participants spread across three groups, one per layout, yielding the following results:

Design KPIs results.

The usability studies revealed that faster task completion times and more clicks correlated with the relevance of data and insights users gathered after topic creation.

It's not always about task completion; sometimes the quality and relevance of data and insights are more important.

We ultimately prioritised the "Relevancy of Data" scores, choosing the "Top-to-bottom" variant for its superior usability across all criteria.

10
The impact

Positive results and much more to do

Six months after launch, the project showed its impact:

We onboarded 9 times more users, cutting onboarding time by 300 hours per user.

After three rounds of usability testing, we achieved an 80%+ task completion rate.

We observed a 20-fold increase in topics created, signalling more citizen insights.

These outcomes highlight the project's significant impact on the user journey and the overall business performance. The integration of AI-powered clustering and summarisation models has shown measurable improvements in reducing cognitive load and enhancing user control over large datasets.