AI Summary. In this interview, Peter Spear talks with Hugo Alves, co-founder of Synthetic Users, a pioneering company in AI-powered user research. Hugo shares his journey, the inspiration behind Synthetic Users, and the challenges of being at the forefront of a new category. He discusses the potential of synthetic user research to complement traditional methods and envisions Synthetic Users evolving into an end-to-end innovation engine. The interview explores the world of AI-powered user research and its potential impact on how companies innovate and develop products in the future.
Hugo Alves is a co-founder and the Chief Product Officer of Synthetic Users, which offers AI-generated personas for use in user research.
He started out in clinical psychology and got interested in building something, which led to his current company and role. I spoke with him when they launched almost a year ago and created quite a splash. I had recently been asked for my thoughts on the impact of synthetic qualitative research on face-to-face qualitative research, so I thought of Hugo. I really enjoyed getting to know him and hearing the origin story of Synthetic Users.
So this is a new thing for me. I have this newsletter, and I just take advantage of the opportunity to invite people into a conversation, and I'm really enjoying it, of course. But I start all of my interviews with the same question, and I'll pose it to you. Before I ask it, I always over-explain it, because it's a beautiful question, but it's a big question, and you can answer or not answer it any way that you want to. You are in total control.
And the question is, where do you come from?
Got it. It's a big one. Particularly for me. I'm going to maybe overshare and get a little bit too personal. But I'm adopted. I was adopted at six months old, so quite young. The reality I knew until I was five or six was just that family. I had no idea.
But then I discovered that there was some weird dynamic in the family about someone who was visiting, who was my biological mother, and it brought me to that question: where do I come from? What is my background? Having said this, it's not an existential question for me. I made peace with it a long time ago.
If I had to describe where I come from, I think I come from essentially a Venn diagram: an intersection between technology and what's new, because I'm a neophile, and what's possible with the amazing things that human beings can learn to change, and humanity. I studied psychology, and for me it's about who we are, how we work, how we reason about things, what makes us feel emotions, what our biases are and how they can be helpful while at the same time influencing some decisions.
That's essentially, I think, where I come from. It's from a mix of two worlds. From a really humanistic perspective in which I think that humans are core to everything we do. We're always about relationships. We're always about connections, while at the same time believing that humans can extend themselves beyond just themselves and have an impact on the world through technology.
And when I say technology, I use it in the broad sense. Kevin Kelly has a beautiful book called "The Technium" and he mentions that a pencil is technology. We tend to forget that even the really basic steps, like fire, like water, the wheel, are technologies that humans created to act upon the world and to act upon other people. So I think that's where I come from.
Did you have an idea as a kid what you wanted to be when you grew up?
So first, from my grandmother's recollection, I wanted to be a fisherman. That was the first thing I wanted to do. Then it seems I drifted into more standard stuff for someone who grew up in the 80s: an astronaut.
Then I wanted to be a football (soccer) player, which of course I would never be good at because I'm quite clumsy. But when I started to have more of a notion of who I am and what my aspirations were, I wanted to be a psychologist. I wanted to help people understand themselves, solve issues they might have, and help them navigate their relationships, which is probably one of the core challenges of human nature: how we relate to others.
So I wanted to be a psychologist and a scientist. There was a dichotomy in me. I applied to psychology and didn't get in. I did get into marine biology in the Azores, but it was too far away for me. I didn't want to leave my small family; the Azores are quite far from continental Portugal.
But what I think I always wanted to be was someone who tries to understand stuff, be it on the human side or the scientific side. I always wanted to be someone who figures things out and makes sense of them.
And tell me a little bit about where you are now. Did you grow up in Portugal? Tell me a little bit about where you are now and what you're doing.
Last year was what I call my switch year. It was the first time I had spent more time in a different city than in my hometown: I lived in my hometown for 21 years, and last year marked 22 years in Lisbon, Portugal's capital. What I've been doing for those 22 years was essentially studying.
I have a master's in clinical psychology, and then I worked at my university for some years in a psychology lab, helping researchers set up their studies, running experiments, and managing the university's participant pool. Then I felt that I wanted to build something. So I left and joined a web design and development agency, which is where I met my co-founder from Synthetic Users, to try to build a product that would help researchers collect data in a faster way. I wanted to build an Amazon Mechanical Turk, but with high-quality participants. It didn't end up happening.
And I ended up being a product manager for a lot of different companies. Five different companies: B2B, B2C, small teams of about five people, larger corporations of around 200 employees. So this was my path of figuring out how we can deploy human capital, and I hate that phrase, human capital, it's a little too transactional, but how we can use the skills of designers, developers, and strategists to build something that solves someone's problem.
When did you first discover that this idea of understanding people was something you could make a living doing? When did you first encounter, "Oh my gosh, this is what I want to do. This is what I can do."
I'm honestly not sure, because psychology as an academic field is an amazing thing. There are a lot of people doing amazing research, but what I wanted to be was a clinical psychologist, and it's not a fulfilling job market because there are a lot of psychologists.
Most people, although they should have access to them, can't afford mental health services. So we end up, particularly in Portugal, in this weird situation where there are a lot of psychologists, a lot of people working in mental health support, but the people who need it don't have the money to pay.
So a lot of psychologists end up in different careers. I think I understood that I could leverage my psychology and empathy skills in this new function of product management. Essentially, what a product manager does, and I normally make this joke with friends when they ask me what I do, is help people build products, but I'm the only one on the team who doesn't do anything.
Because I don't code, I don't design, I don't use Figma. There's no real deliverable from a product manager. What product managers are supposed to do is understand whether a particular group of people have a particular problem, whether that problem is even worth solving for those people, because we have a lot of problems that are not worth solving and that we just live with, and how to best leverage certain capabilities, typically engineering and design, to build a solution for that problem. A solution that is feasible and viable, the Marty Cagan stuff, so that they can build a business that solves people's problems and ends up improving their lives.
So I think that was my realization that I can use my understanding of how humans work and how humans reason and how humans face challenges in my product management role.
And tell me a little bit about how you introduce people to Synthetic Users. How do you describe what it is?
It depends on the mood. If I'm in a joking mood, I tell them that I build fake people. That's my joke for friends who are not familiar with market research, user research, or consumer research. And they're like, "You build people? But why would you do that?"
When I'm talking to people who are more in the industry, I focus on the problem first. I help them remember that to build a valuable product, you need to understand whether people have tried to solve the problem in some way before, and if they did, why the solution didn't work. All of that. So user research, consumer research, and market research are fundamental to building great products that really provide value to the world. But as central as this process is to product creation, it's also a process that involves an amount of friction that is sometimes almost impossible to believe.
For example, if you want to study a hard-to-reach segment of the population, imagine someone who has a rare disease: it's really hard to find those people and have a conversation with them to understand how they go about it. If you want to build a product for CEOs, you're not going to get them to sit down with you for an hour for a $50 Amazon voucher. For people who are extremely busy, it's not easy to do customer interviews.
That is exactly what we're trying to solve: helping people who need to do research do it in a faster and more effective way, so they have at least an early view of what potential customers and potential users might say when confronted with particular situations or asked particular questions, and doing that in a way that really reflects what those people would say out there in the real world.
Of course, the overlap can't be 100%. I'm not even sure it ever will be, but we believe the overlap we now get between our synthetic users and the real research that teams would run, combined with the difference in cost and speed, is sufficiently valuable to help teams.
And how does that work? Can you tell me a story, like in the example of the CEO, if I come to you and I say, listen, I'm developing something for CEOs. They're very hard to get. What do you do? How do you make fake people?
So first of all, "CEOs" is typically too broad a group to study as a homogeneous group. We are now re-implementing some of our interface exactly because of that, because sometimes we get people who, either because they're not as experienced at research or just because they don't know how to do it, still haven't understood the power of focusing on a particular segment.
They come to us and ask, "Hey, can you generate nurses?" And I'm like, "Yes, our system will generate nurses, but you're going to get nurses that work in so many different conditions that might not be exactly what you want."
Right now, if you come to us and say, "Hey, please generate CEOs," our interface will ask you, "Okay, great. Let's study some CEOs. Is there any particular segmentation that you want to apply? Are they CEOs of Fortune 500 companies? Are they CEOs of small startups? Any particular geography you might be interested in?" So what we try to do now is clarify the audience and the user group a little bit more, because the more well-defined the audience is, the better the results we get in terms of product building. We need to be focused on particular segments and not just build products for everyone.
That's an error I see first-time entrepreneurs make: wanting to build a product for everyone. That doesn't work very well. But assuming you came to me and said CEOs, and I was able to ask some clarifying questions, we're now at "CEOs of consumer packaged goods companies in the US," which is already quite well defined. One thing we know is that no matter how well defined your segment is, there will still be some diversity within that group, because within the group you defined you still have diversity in the kind of consumer packaged goods, the size of the company, all of that.
So what we do is generate what we call dynamic parameters. For any audience that our customers ask us to generate, we generate a set of five parameters that we feel are most relevant for segmenting and capturing the full diversity of that audience. And then we generate the number of synthetic users we need. On our platform we can go up to 30; most people generate around six. Each one of them will have a name, an age, a location, and a profession, and will vary within those five parameters we generated specifically for that audience. We can then edit those parameters, add them, and remove them, but in general terms this is the third step: generating the synthetic profiles.
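To make that flow concrete, here is a minimal sketch of how dynamic-parameter profile generation could look. It is an illustration only, not Synthetic Users' actual implementation: the `call_llm` stub, the prompts, and the `SyntheticUser` fields are all assumptions.

```python
import json
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion call; plug in a real client here."""
    raise NotImplementedError

@dataclass
class SyntheticUser:
    name: str
    age: int
    location: str
    profession: str
    parameters: dict = field(default_factory=dict)  # values for the five dynamic parameters

def generate_dynamic_parameters(audience: str) -> list[str]:
    # Ask the model for ~5 axes of diversity relevant to this audience, e.g. for
    # "CEOs of consumer packaged goods companies in the US": company size,
    # product category, tenure, region, route to market.
    raw = call_llm(
        f"List 5 parameters that best capture the diversity within this audience: {audience}. "
        "Return a JSON array of short parameter names."
    )
    return json.loads(raw)

def generate_profiles(audience: str, n: int = 6) -> list[SyntheticUser]:
    # Generate n profiles that vary across the dynamic parameters.
    params = generate_dynamic_parameters(audience)
    profiles = []
    for _ in range(n):
        raw = call_llm(
            f"Create one synthetic user from the audience '{audience}', varying across {params}. "
            "Return a JSON object with keys: name, age, location, profession, parameters."
        )
        profiles.append(SyntheticUser(**json.loads(raw)))
    return profiles
```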
From that point on, we recently introduced a new type of interview, which we call dynamic interviews. For each interview we generate on our platform, we generate not only the interview but also a synthetic researcher that takes the research goal our customers provide, for example, "I want to understand what their challenges are when trying to change pricing on their products," and dynamically asks follow-up questions to try to achieve that research goal.
When all the interviews are done, you generate a report that summarizes at a high level what was mentioned in those interviews, and then you can ask follow-up questions of the synthetic users. Now you can even chat with the report itself.
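A rough sketch of that loop, under the same hedged assumptions as above: one model call plays the synthetic researcher pursuing the research goal, another answers in character as the synthetic user, and the report is generated from the accumulated transcripts. The `call_llm` stub, prompts, and turn limit are hypothetical, not the platform's actual pipeline.

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call, as in the earlier sketch."""
    raise NotImplementedError

def run_dynamic_interview(profile: dict, research_goal: str, max_turns: int = 8) -> list[dict]:
    """One model plays the synthetic researcher, another answers as the synthetic user."""
    transcript: list[dict] = []
    for _ in range(max_turns):
        question = call_llm(
            "You are a user researcher. Research goal: " + research_goal
            + "\nTranscript so far: " + json.dumps(transcript)
            + "\nAsk the single most useful next question, following up on earlier answers when promising."
        )
        answer = call_llm(
            "Stay in character as this person: " + json.dumps(profile)
            + "\nAnswer the interviewer's question.\nQuestion: " + question
        )
        transcript.append({"question": question, "answer": answer})
    return transcript

def generate_report(transcripts: list[list[dict]], research_goal: str) -> str:
    # The report is grounded in the interview transcripts; as Hugo notes later,
    # skipping the interviews and generating a report directly tends to come out shallow.
    return call_llm(
        "Summarize the key themes across these interviews, relative to the research goal: "
        + research_goal + "\n" + json.dumps(transcripts)
    )
```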
And how has it been, how long - I feel like we talked maybe a year ago or a year and a half ago. How long has it been?
I think we talked around one year ago, not long after we launched. We launched in March last year. It's been a wild ride; we were completely bootstrapped until October last year. It was mostly about trying to understand how people were using our platform, watching usage, and there was a lot of stuff that needed improvement in terms of the quality of the interviews themselves.
When I look back now, I still say there's a lot to improve here, but when I look at the early ones I'm like, "Oh my God, how could I put this out there in the world?" Compared to the ones we have right now, they were shallow and repetitive. But I think at the time, they were already better than not doing any research.
A lot of people who came to us were a little bit in that mindset. We had early adopters, entrepreneurs, people who just wanted to build a business and check whether they had the right feel for it or not.
And we iterated on the system. The level of complexity of our system today, compared to one year ago, is probably 10x, because although users don't see it, there's a lot of stuff going on in the background that really makes the interviews a lot better.
In terms of our team, when we launched last year we were three - myself, my co-founder Kwame, and Bart who was our developer at the time. So we were really constrained in terms of being able to iterate on the product quickly. Now we have four developers and still growing, so the team has a lot more leeway, has a lot more expertise. We're working on some cool new stuff like surveys and the ability to import existing data to enrich the synthetic users. So it was quite a big change.
Also quite a big change in terms of market reception. One year ago, almost every post I saw on LinkedIn was kind of, "Oh, these guys are crazy, they should just stop what they're doing. It will never work." And now with people using the product, people sharing the outcomes that they get with their teams and with their colleagues, we now already see a lot of positive feedback, which was a welcome change. I think I'm a tough guy in terms of how I handle criticism, but there were some tough weeks getting roasted by big user research names on LinkedIn. It was quite tough to handle that criticism, but we used it to make the product better.
And I think in the meantime we've convinced some people that there's value in what we do. I also think there are some people who will never be convinced, because theirs is more of a moral position regarding what we do than a pragmatic one about effectiveness.
Can you tell me a little bit about what it's like to be in the middle of a storm like that? I can only imagine. I think there's so much about your name too. You're really creating this category, just showing up and bringing automation to a place that everybody has so many feelings about, and that is so humanist. Can you tell me what it was like to be in that kind of storm, to be carrying, to be representing generative AI in a humanist place?
So one interesting thing about our product is that although right now a lot of people say, "Oh, your name is amazing. It really conveys everything", until two days before we launched, we had a different name. We were called "Synths" with a Z at the end. And then I think someone on the team said, "It's a mouthful, I need to explain to people how to spell it and all of that." And I was like, "Okay, I agree. But I'm not stopping the launch. We're going to launch in two days. Anyway, I'm not stopping the launch. It can be whatever name you want. It can be Synthetic Users. Let's go with whatever you want, but I'm not stopping the launch." We decided we're going to launch on Friday and we're going to launch on Friday. So we ended up with Synthetic Users almost by accident. It was a placeholder name. Eventually, if we need to rebrand, we will. That's how it started.
But being at the forefront: it's the first time in my life that I felt I was building a product that was genuinely something I had never seen in the market. I was like, "Okay, I'm at the forefront right now." That by itself doesn't mean anything, because you can be at the forefront of something that never takes off or has no market, but I felt it. And since then, as the category grew and established itself a little more, with different companies now trying to do similar stuff to us, I've been satisfied that we really were pioneers.
But because of that, we were also the ones taking all of the criticism, because as you say, most people who go into user research and consumer research are people who really care about feelings, who are deeply empathetic, who want to make the world, sometimes even in an idealistic way, a better place.
And I think there's just fundamentally a moral discomfort that people see in a product like ours, because they feel that we're taking away the essence of what's really meaningful. I get user researchers sending me messages saying, "No, my job is to talk to people, and that's the best thing I do in my life." And I have to restrain myself from saying to them: no, your job is helping whatever company or whatever product you're working on be better. It might require talking to people. It might not. So don't identify with the task, identify with the goal. A lot of people carry this dilemma.
I think it's okay to get the criticism. To be honest, the fact that we knew what we were going for helped, so when they came at us with posts saying, "User research without users? Damn, that's stupid," I knew. I even told my co-founder that we should have a page on the website with prebuilt tweets, so people could just tweet "This is really stupid, this product shouldn't exist." So we knew, and our communication saying "user research without users" was intentionally provocative, because the fact that people were saying this is something that shouldn't exist was putting our name out there, and it ended up helping us a lot in our early growth.
And it helped bring people who were essential in giving us feedback about what was and wasn't working in the product, so we could improve it.
You said a couple things I want to follow up on. You mentioned that you can be out in front of something, but you pointed out that you could also be out in front of nothing. At what point did you realize that you had traction, that you had something that people wanted? As I understand it, you've introduced a whole new behavior, a new behavior for organizations to learn. Is that true?
The thing is, I knew one thing, which was that the category would exist even if we weren't successful. That was the first thing I understood, because this happens a lot: the company that starts a particular category often ends up not being the one that dominates it.
But what I understood most came from academic research, because half my day is reading papers: machine learning papers where I don't understand the math but do understand what is now possible, so I can try to apply it on our side, and papers from sociology, psychology, and human-computer interaction, in which a lot of academic researchers are examining this idea of silicon samples, synthetic users, synthetic personas, and seeing how good these large language models really are at mimicking people in diverse situations, even morally charged ones, all of that.
So I had that observation on the academic side: hey, we're not crazy. There are people doing exactly the work that we also want to do, which is to compare the results and show where the overlap is. At the same time, I was seeing that, of course, this is going to be a thing, because models don't regress. The next model won't be worse than this one, and if we're at this stage with this one, it can only get better. So I saw that this is going to be an industry, and a big one, I believe.
I believe we'll get to that place where people are more comfortable. We saw this with online research: in the beginning, when online research came out, there was huge resistance from the industry saying, "No, you need to talk to people face to face, otherwise it's not real." And then, for the efficiencies you get from online research, people adopted it. I think it's going to be the same with synthetic research, and synthetic users in particular: people just need to get more accustomed to the idea and try it themselves.
With most people I've seen commenting on LinkedIn saying, "Oh, I saw these guys building synthetic users, this will never be possible," I always go there and ask, "Are you saying this because you tried it and feel the results are not good, or because it's an absolute position that nothing could prove wrong?" Because I think most people are so skeptical because they haven't even tried it.
What have you learned from your customers or your users in terms of how it changes how they learn? What does it do to face-to-face qualitative interviews? And what have you learned about the ideal use case for synthetic user research?
So we have quite diverse usage in terms of how people and companies are adopting our platform. In the beginning I was more forceful in how I pitched synthetic users, but I think I was making a mistake at that time. What I now recommend is that companies, to start feeling comfortable with what we do, use this as shadow research.
What I mean by shadow research is: you already have a research project set up about a particular category of users, one you're going to invest in with surveys and user interviews. Keep going, keep doing that, and use us to try to reproduce the research, not the results. Just use synthetic users as a platform to ask the same questions. Then you're the one with all the data: you can see what the synthetic users said, you can see what the humans said, and you can compare it with your own eyes. In the beginning, I was like, "No, just trust us. Just go in and use us."
So now I pitch synthetic users more as the first step, a stepping stone, so you can get comfortable with the overlap you see in our results. A lot of people use us to pre-test ideas before they ultimately validate them with humans, which I think should be the case for high-risk decisions. Some people use us just to decide: they're planning to prioritize three features and aren't sure which one to start with, so they use synthetic users to help them make that decision.
I know some people, and this is not necessarily something I recommend, who are now doing almost everything with synthetic users and only going to humans for really high-stakes decisions. I'm not comfortable with this. I think there's a lot to improve in our product, and I want to put some metrics and evaluations out there so people understand better where to use it. But the teams that have been doing this, mostly early-stage startups, are quite comfortable with it, which is pleasant to hear. At the same time, I think there always will be a role for humans in research. I don't think synthetic research will ever replace humans 100 percent, and I don't think that should be the goal of any company. What we want to replace are some of the inefficiencies in the standard process.
What are other types of customers or use cases that you've been really surprised by?
When we launched, we were really focused on a particular segment: early adopters, entrepreneurs, people who don't have a lot of experience doing research, who don't have access, who might not even have the money to do conventional user research and market research. Those were the ones we were going to focus on. We were going to be like Canva, which lets you design postcards and things like that: take away all the bells and whistles, really simplify, and help those people.
And when we launched, that was our ideal customer profile. The idea was: let's go for the lower end of the market, people who are not experts and really need some help doing this. The surprising thing was that suddenly, and I can't say names because of NDAs, we started getting emails that I was forwarding to my co-founder Kwame, asking him, "Can you confirm this is really coming from them?" Big financial institutions, big consumer packaged goods companies, some big research agencies reaching out to us and saying, "Hey, we think your proposition might have something to add to our company and our process. Can you do a demo?"
And suddenly I'm doing a demo for a company that almost everyone in the world has bought a product from. And I'm like, "Oh my God, maybe we focused on the wrong segment. We need to refocus." That's a little bit of what we've been doing now. We had assumed that cost was the most challenging part of research we were solving. What we came to understand is that big companies have the money. They can pay for research; they've been paying for research for a long time. Some of them have their own panels. It's not a lack of access to customers or to interviews. It's the amount of time it takes that really slows those companies down.
And those companies, almost by definition because of their size, are already quite slow, because there's organizational friction that comes with being a big consumer packaged goods or financial services company. If you combine the organizational friction with the research friction, sometimes they're really slow to react to the market and launch new products, because the iteration cycle, doing research, coming back to the core team, the team iterating, doing research again, really slows them down.
They come to us exactly because of that. One of the insights managers told me, "I know that I don't need 100 percent certainty in all the decisions I make. Sometimes I'm okay with 60%. Sometimes I'm okay with 80%. Sometimes I'm okay just having a better notion of what the consumer might say when hearing this value proposition." The trade-off between the time conventional research takes and what they can get with synthetic users is really what they want.
So we've refocused a little more on the enterprise side. It was something really painful for me, because I remember saying to Kwame when we launched and were going for the lower end of the market, "I don't want to do compliance. I don't want to do SOC 2 Type 2 security checks. I don't want to do all those things, because it's a pain." I know it's needed when you work with companies that size, but I didn't want to do it, because I'm a guy who likes to iterate fast. I don't want to spend time just implementing SSO and SAML and all of that. But the truth is that here I am doing it, because that's where the market is, so those are the customers we're focusing on right now.
You've used the word overlap. How do you compare the data you get from synthetic user research with the data you get from what you would consider qualitative research?
I do consider it qualitative research. I think one challenge when people think about products like ours is that they tend to think of an algorithm that is just math and say, "Oh no, someone wrote an algorithm and that algorithm spits out words, and those words mean nothing because it's just mathematics."
What people tend to forget is that the training process of these large language models involves vast quantities of human data. Essentially, these models were trained on all the text that could be found on the internet. Some people say not only text, but even transcripts from YouTube; there's now a question of whether OpenAI trained GPT-4 on transcripts from YouTube videos, which might or might not be the case. I don't want to get in trouble with OpenAI for saying that.
But the training data used for these models is human data. It's parents on parenting forums asking the questions they really struggle with. It's people on Reddit, on some subreddit, talking about their cars. It's people talking about their beliefs, their challenges, their fears, their aspirations. And that is what makes these models so good. It's not that the algorithm is some kind of magic math. It is, of course, beyond my comprehension, but the fact that it's trained on human data is what makes these models so good.
Because then, when you're running an interview with synthetic users, it's not just the algorithm. What you're doing is going into the model and making it talk the way the people you are interviewing talk online, in different forums and in different contexts. So I do think it's qualitative research.
Some people prefer to call it a kind of desk research. They think it's the same thing as going to Reddit and searching for what people are discussing in forums. That practice already exists: a lot of smart user researchers already use Reddit as a source, because if you're studying a particular group, it's an amazing source, and they go and read what people are discussing in forums. But even if that's the case, we make it so much easier.
It's interesting that it's secondary research as opposed to primary, though it's dressed up a little bit like primary. It has a primary interface on a secondary methodology.
Exactly. I think there are also some nuances. There's a reason we lead with the user interviews. Some people ask us, "Wouldn't it be easier for you to generate the report that you give us at the end directly, instead of doing the interviews first?" And we tried it. If we don't do the user interviews and don't include them as the base material for the report generation, the report is extremely shallow. It has no depth, no nuance. It's going to spit out what a random person on the street would say if you asked, "Hey, what are the challenges of truck drivers?" That's what you get with a report without the interviews.
What the interviews allow these models to do is explore what machine learning people call "latent space." Within these models there is a space of connections and correlations, and when you generate interviews with different profiles, each with its own particulars, the model goes into parts of that space it wouldn't reach if you just generated the report directly.
As much as I would like to take the interviews away, because they take time for us to generate, I can't. If I could get the same results with just the report, I would, but the interviews really play a role.
And then there's another aspect. In the same way that with human interviews it's sometimes just one sentence that makes you think, "There's something here that I might need to probe," and you can do that, it's the same with our interviews. Sometimes, if you just summarize the interview, it's cool stuff, but then you look at one sentence, "because of the rush I'm in every day to feed my kids," and you're like, "Wait, there's an adjacent problem here that is worth exploring." So you ask a follow-up question, "Can you tell me about that?" and suddenly that's where the magic of the insight really is. It's when you can use synthetic users as you would humans, asking follow-up questions, going deep on the conversation that already took place, taking notes, and investigating deeply. That's where you get the most value.
As you were answering that question you used your hands. You really were describing the experience of the interview, the value of the interview, but you were describing it with your hands as this almost exploratory thing, something that goes places you don't know it's going to go. So what's happening in that interview?
It depends on the interviews we're talking about. If I focus on the ones I described earlier, the dynamic interviews, in which it's an AI talking to an AI, essentially what's happening is that you can think of the problem space as a map. You could decide to start in this position and go all the way over here, and you need to have some view of that problem space.
But what happens when you generate several interviews around the same problem space is that one of the researchers decides to go down this path and push further along it, another one goes here, another goes way over there, and another goes off to this side. So in conjunction, they end up having a lot more visibility over that entire map, that entire problem space, than any individual one would. And if you don't use profiles, and if you don't make the synthetic users diverse, what ends up happening is that all the researchers go into the same area, so you don't get the diversity you need to really capture everything happening in that space.
What's your secret sauce that keeps you differentiated?
On one side we use open data: data sets that are out there, public, collected in academic settings. Some of them you might be aware of, like the World Values Survey and the European Social Survey. That allows us to have a better understanding of people and how they think about particular topics. But that's open data; I don't know if anyone else is using that kind of data. Maybe I'm talking too much and shouldn't have said this. But this is some of the data that we use.
What we've been doing since maybe July last year is also collecting our own data. We collect data through surveys on several platforms: one of them is Prolific, another is Amazon Mechanical Turk, and there are lots of other micro-task platforms we collect data from. We have been collecting surveys and some qualitative interviews. Typically, the interviews we do are not about problems or concept testing, nothing like that. We have designed interviews that are, in our view, essential to get a better understanding of how different aspects of people's lives are related and how they go together. And this is really part of the core of the generation of the synthetic users and of the interviews themselves.
We use a technique called retrieval-augmented generation, RAG. Before any interview generation starts, we go and try to find relevant data in our data set to enrich the profile of that user. This is not visible to our end customer, but we enrich the profile of that user in terms of psychographics and aptitudes, and then we generate the data. So this is a little bit of the work we've been doing.
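As a rough illustration of what a retrieval step like that could look like, here is a minimal sketch, with the caveat that the embedding model, the in-memory store, and the profile fields are assumptions made for illustration rather than a description of Synthetic Users' actual pipeline.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical call to a text-embedding model, returning a unit-length vector."""
    raise NotImplementedError

class SurveyStore:
    """Tiny in-memory vector index over collected survey and interview snippets."""
    def __init__(self, snippets: list[str]):
        self.snippets = snippets
        self.vectors = np.stack([embed(s) for s in snippets])

    def top_k(self, query: str, k: int = 5) -> list[str]:
        scores = self.vectors @ embed(query)  # dot product = cosine similarity for unit vectors
        return [self.snippets[i] for i in np.argsort(-scores)[:k]]

def enrich_profile(profile: dict, store: SurveyStore) -> dict:
    # Retrieve survey evidence about people like this profile and attach it as
    # grounding context (e.g. psychographics) before the interview is generated.
    query = f"{profile.get('profession')} in {profile.get('location')}, age {profile.get('age')}"
    return {**profile, "grounding": store.top_k(query)}
```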
And we're also now setting up a collaboration with some academic researchers in which we're going to be doing several studies comparing the results you get with synthetic users with the results you get with human interviews and human surveys.
That's exciting.
Yes, it is. It is. I'm really happy, because it's something that has been on our wall in the office since day one: evaluation, the need to see how well the systems perform. It needs to be our number one priority, and we've been doing it, but informally and without sharing it with people out there, because we felt that just sharing numbers without sharing the full process wouldn't give people the confidence they need. They would just look at the numbers and say, "Yes, these guys have something to sell, and they're hyping this just because they want to sell it."
So I'm really happy to be doing this academic collaboration, because a neutral group is going to be running it. We're going to be paying for the participants and everything else; the point is that academia is neutral. What I told the team that is doing this with us is: I want to see where we fall short. I want really concrete results. Where do we perform best? Where do we fall short? What kinds of topics do we not perform well on?
I can tell you one. I'm going to be really honest: we don't perform well on topics that touch sustainability. I'll explain why. The synthetic users tend to be a lot more worried about sustainability than real people are. It has to do with the way the models were trained, and with some of the methodology used to fine-tune them, reinforcement learning from human feedback. They are a lot more concerned about the climate than most people are.
That is a topic we have some measures in place to address. But it used to happen, not now, that you'd go and ask for feedback on some headphones and they would always end up asking, "But I'm really worried about the process of building those headphones. Is it safe for the planet?" So they tend to be a little more worried about it than real people are, though we also know that real people tend to say they are more worried about it than their actions reflect.
There seem to be lots of questions about bias in the data. How do you manage the bias in the data? You've given one example; what other ways do you manage it?
Yes. In terms of our data collection process, we started it partly because of that, because we wanted to make sure that at least on our side we had really diverse data. We collect data about fishermen in Angola, which is not a group I expect anyone to ever run synthetic user interviews on. But we feel it's important that our data sets have that kind of diversity, even if those groups never end up being the audiences generated through our platform. So that's the kind of problem we keep in mind.
The thing is, a lot of the data sets used as the training corpus for these models are not representative. The corpus doesn't represent all human populations in the same way. English-speaking populations are, of course, over-represented. If you're thinking about a farmer in Botswana, it's quite improbable that he has an online footprint captured there. So there are populations that we know are not represented in the data.
In our particular case, since we are B2B, what we're finding is that the overlap between the populations represented in online data and the populations that companies want to study is quite good. I imagine most companies don't want to interview fishermen in Angola or farmers in Botswana, so I'm less worried about the lack of representation of those groups in the data used to train these models, because I also don't think people are coming to us to ask about those groups.
Having said this, in our own data collection efforts, we're doing the best we can to mitigate gender, race, and all the kinds of biases that we know are part of the original training corpus.
As you look ahead, what are you most looking forward to, or how do you see this space, synthetic user research changing or evolving in five years?
I think we're going to go beyond research. One of the things I'm really looking forward to, and we're starting to do that now with a feature coming in two weeks, is the idea of agents: going from just discovery, understanding the market size and the user needs, into also helping companies ideate, brainstorm, assess feasibility, assess market pull, test and validate with real people, and even build the products themselves.
I think we will have a complex swarm of AI agents that can build a product end to end, in the sense that I can go to Synthetic Users and say, "I want to help single parents who live in Lisbon, Portugal, and have an income of less than X." Our system will put agents in those roles: they will simulate those personas, figure out the biggest pains they feel, figure out the market alternatives already in place, figure out a sufficiently differentiated new proposition that might solve those issues, figure out whether there's business viability, whether you can even make money building that product, and assess the technical viability of building it. And then, with the advances we've been seeing, particularly on the software side, it might even build the product itself, might even write the copy and the marketing material for it.
So in five years, I see this more as an end-to-end innovation engine than just a research piece, but we felt that the research piece was the right place to start. Board of Innovation, a big agency (they like us a lot and we like them a lot), came up with a really good name for their summit last year. They called it "Autonomous Innovation," and I think it captures how I think about this space too.
I'm curious, I want to return to the idea of being what I've heard described as "category generic." The name has captured the category. As somebody who works in branding and marketing, what's the upside, the benefit of that? And what's the downside?
So the benefit is that we became the default name people use when talking about the category, even when they're not aware that we exist as a company. Some people, because they saw a competitor start talking about synthetic users, go and Google it and discover that we exist. So there's that upside.
There's a downside, which is that it's generic, so it's hard to build it as a brand with the artifacts you sometimes need to have. It ends up capturing the category rather than being specific to us, although at this stage, where the category is still growing, that's a good thing.
But I'm really happy that we made the decision to go with "Synthetic Users," because we almost don't need to explain it. We just say the name, and most people immediately have a sense of what we're discussing without any explanation. And that's a really good thing when you're launching something as new as what we are.
Listen, I want to really thank you so much for your time. You were generous a year ago by saying hello to me and especially today. So I really appreciate it.
You're welcome. It's always a pleasure to discuss these topics with you. You are a great conversationalist, so it's an amazing experience to have these kinds of conversations with someone who understands the industry, understands what we're trying to do, and makes the discussion the best it can be.
I wish you the best of luck and thank you again. Have a great weekend, Hugo.
Thank you so much. Have a great weekend, Peter.