How UX Shapes Generative AI Adoption


Introduction
In this conversation, Yash, Jay, and Koushik explore the intersection of user experience (UX) and generative AI, discussing how effective UX design can facilitate the adoption of generative AI technologies in SaaS products. They delve into the importance of choosing the right interface, addressing latency challenges, and optimizing user experience to enhance output quality. The discussion also covers the significance of feedback in improving AI products and the differences in user behavior between mobile and desktop interfaces. The conversation concludes with a call to action for SaaS founders to consider UX improvements in their generative AI applications.
Key Takeaways
- UX plays a crucial role in generative AI adoption.
- Generative AI allows for multiple outputs based on input.
- Choosing the right interface is essential for user engagement.
- Latency can be mitigated through effective UX design.
- Feedback mechanisms are vital for improving AI outputs.
- Understanding user reading capabilities can enhance UX.
- Mobile interfaces often see higher drop-off rates than desktop.
- Streaming responses can improve user experience during latency.
- Continuous feedback helps in refining AI products.
- AI and UX must evolve together for better user satisfaction.
Transcript
Okay, it says that we are live. We'll wait for a second or so to see whether we get viewers. Yes, we have our first viewer. So we are live. Awesome, let's begin. Hello and welcome to Momentum Officers. My name is Yash and I'm joined by my co-founders Jay and Koushik to discuss the topic of the week: how UX shapes generative AI adoption.
Our goal is to provide you with actionable insights and practical strategies that you can apply to your own business. Throughout the session, we encourage you to engage with us by asking questions and sharing your thoughts. This is a fantastic opportunity to learn from each other and gain new insights that can help drive your initiatives forward. So let's get started. Jay, Koushik, how are you doing today? Good, nice. Finished as usual client meeting and working.
Jay's background is better today. Jay, is your background better today than the usual that we have at our office? Yeah, because I'm traveling right now and I'm at one of our client's spaces, exploring new business initiatives as well. It has been quite hectic but very interesting, learning new things, exploring new opportunities and, yeah, getting things ready. Visiting our client's space and doing our work.
Which is another level of productivity. On my side, I am working on our own initiative. So, tell me Koushik, what's the topic about today? Why are we discussing this? Because when I read the title of the topic for the first time, it felt like Sam Altman is the only person for whom this topic will make sense.
So why are we talking about UX and generative AI? Tell us a little bit about this. So the most common use case of AI adoption that we are seeing, and will keep seeing, is GenAI adoption. If you ask me, then what is the other kind of adoption that exists? That is the next question. So you have supervised learning.
Yash Shah (02:26.274)
I'll give you a very quick example. Your spam filter is basically supervised learning. In the input, you have a certain set of emails, the AI is trained upon that set of emails, and then you want the output to tell you whether a given email is spam or not. That is supervised learning. The artificial intelligence used inside self-driving cars is also supervised learning. What it does is it captures images of everything around you.
And then it also has radar, which gives relative positioning and depth between each object within the image. That is the input you're training the model on. And the output that you want the model to give is basically the distance of the next car or the next object in front of me. So this is supervised learning, right? It gives a predefined output, a yes-or-no kind of answer to a question I already know I want to ask.
Generative AI is the new big shift in AI, right? You give an input, and the output gets generated based on that input. Neither is the input something that you planned and gathered together beforehand, nor is the output a single fixed answer; I can keep creating as many outputs as I want. Now, GenAI has come to a level where it is being used
across businesses in their platforms, across products in their platforms, so the UX of how to design it, and how UX should support the GenAI feature, really matters. The UX here also helps the platform itself get better, so the timing is right. So yeah, that's what we are trying to understand: what measures you need to take that will also help product development internally,
where the UX plays a big role in this, actually. That's what we are discussing. Just to get on the same page a bit, as an example, let's say I am a SaaS founder who's built a social media scheduling platform, something like Buffer or something like Hootsuite. And if I am building capability within my social media scheduling platform
Yash Shah (04:50.979)
for my users to be able to generate posts, generate images for their social media, write captions for their Instagram, whatever the case may be, it would most likely be a wrapper of ChatGPT or a wrapper of some other LLM, Llama or whatever. However, the UX I will have to build within my systems. That UX...
is what we are talking about. Would that be a fair way to say it? Yeah, the UX of the final application; it could be a web-based interface or an app-based interface, depending on the user base that is there. But there are more nuances in it. I'll give you one example, right? When the output comes, very often in ChatGPT also you will see the thumbs up and thumbs down symbol. It is there for every single output.
So why would ChatGPT even introduce that? Because feedback is a huge part of model development with respect to the entire application itself. That's what makes the actual application better at serving you better outputs. Now, how do you design the UX of errors for GenAI is itself a topic. The way we handled and designed errors previously versus how we have to design errors now is itself changing.
So those key differences are what we are talking about. Got it. Interesting.
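As a rough sketch of what capturing that feedback might look like on the product side, here is a minimal example; the schema and the `record_feedback` helper are hypothetical, invented for illustration rather than taken from any specific library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class OutputFeedback:
    """One user reaction to one generated output."""
    output_id: str   # id of the generated response that was shown
    prompt: str      # the prompt that produced it
    rating: str      # "thumbs_up", "thumbs_down", or "regenerated"
    comment: str = ""  # optional free-text reason
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


FEEDBACK_LOG: list[OutputFeedback] = []


def record_feedback(output_id: str, prompt: str, rating: str, comment: str = "") -> None:
    """Store the reaction so the team can review it or feed it into later fine-tuning."""
    FEEDBACK_LOG.append(OutputFeedback(output_id, prompt, rating, comment))


# Example: the user pressed thumbs-down on a generated caption
record_feedback("out_123", "Write a caption for our product launch", "thumbs_down",
                comment="Too formal for our brand voice")
```

The point is simply that every visible output gets an identifier, so a thumbs up, a thumbs down, or a silent regenerate can all be tied back to the prompt that caused it.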
Yash Shah (06:24.687)
So Koushik, I had this question: largely, whatever GenAI tools we see, most of them have this, you know, chat interface. And now, newly, we also have the voice interface as well. Is it mandatory, or can you throw some light on when someone should prefer the chat interface over the other options? It comes down to the use case. Again, we have a few broad categories. With GenAI,
in terms of input and output, the input could be anything and the output could be text, it could be an image, it could be voice, it could be video, any of these, right? Most applications are text to text, right? You're inputting some text, and the output that gets generated is some text that is of some use to you. Now, within this, when it comes to chat, chat is only one aspect. There is writing, there is reading, and then there is chat. So your use case could fall into any of these categories.
For example, if I'm building an internal HR assistant, where any employee who wants to ask anything about the company immediately needs a place to go, ask something, and get the right information, then that's where I need to build a chatbot. Now, let's say I'm a restaurant trying to build something based on the reviews coming in about my restaurant on different platforms, could be Zomato,
Google, or anywhere else. I want a sentiment analysis built, which basically tells me how many positive reviews I got in a day, how many negative reviews I got in a day, and which part of the day was positive and which part was negative. If I want that data, you don't need a chat interface there; rather, it is a representation, a graph or something like that that is being shown.
So your use case decides what the final interface needs to be. Chat is not mandatory for everything. We are seeing more chat because chat is the general-purpose web application. And there's also one more reason: most of the GenAI adoption is currently happening in customer service, sales, and product development, areas where chat is the majority use case. Now, that is most of the adoption
Yash Shah (08:52.271)
which is happening, but once it starts to happen in supply chain, once it starts to happen in different areas, you will start seeing other interfaces. Yeah, yeah. And so, as an example, we have a client for whom generative AI was a canvas-based use case, right? We were building mind maps and things like that. So over there,
generative AI was like the response you would get, or even some parts of the prompting, happening on, I don't know what it's called, a canvas, right? Or a whiteboard of sorts. Like an editor, yeah. Yeah, like an editor, right? So there are different types of interfaces, essentially. So Jay, what you're talking about are prompt-based use cases, which is where the chat interface works reasonably well.
But then there are canvas-based use cases, there are adaptive UI components, there are voice-first interfaces also, and so on and so forth, I think. Got it. So another thing: one of the challenges that I see a lot of SaaS founders trying to fix when building a wrapper on top of LLMs and then
introducing those capabilities within their products is the challenge of latency. And while latency is a tech problem, engineering and technology can only fix it up to a certain extent. It will become better when the technology becomes better, which is something all of us can wait for, but the customers won't wait for it. Are there any UX ways of fixing latency?
Yeah, so this is where the need for the topic comes in. It's nice that you asked this question. UX designers should also understand concepts like latency, what prompt engineering is, and what the output should be. Now, latency is related to prompt engineering, and I'll tell you why from a UX perspective. The quality of the output that I get depends on the input that I give.
Yash Shah (11:19.341)
Now, you could optimize the UX in such a way that when you give a bad prompt, the chat interface or whatever is there could ask questions that help build the prompt. Then eventually that gets submitted and a better output comes in. This is one UX use case that we are trying to come up with, right? So you gave 20% of a good prompt; what sort of questions need to come up such that it becomes, you know, a better prompt?
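A minimal sketch of that prompt-building idea follows; the checks and follow-up questions are invented for the example, and a real product might use the model itself to decide what is missing.

```python
def clarifying_questions(prompt: str) -> list[str]:
    """Return follow-up questions if the prompt looks too thin to produce a good output.

    These checks are deliberately naive placeholders; the point is that the UI asks
    before the prompt ever reaches the model, so the user waits fewer times.
    """
    questions = []
    if len(prompt.split()) < 12:
        questions.append("Who is the audience for this?")
    if "tone" not in prompt.lower():
        questions.append("What tone do you want: formal, casual, playful?")
    if "length" not in prompt.lower() and "words" not in prompt.lower():
        questions.append("Roughly how long should the result be?")
    return questions


draft_prompt = "Write a post about our new feature"
missing = clarifying_questions(draft_prompt)
if missing:
    # The interface would surface these questions and fold the answers into the prompt
    print("\n".join(missing))
```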
So your interface, the UX, is helping you engineer the prompt itself; then it gets submitted, then it gets processed. But the UX designer should come up with this idea. You should say, okay, fine, we should have this kind of pre-submission step where we help the user build the prompt and then submit it. And then, let's say you have submitted the prompt; even at the output level,
you could bring in multiple drafts of the output at the same time. So instead of the user doing an extra click every single time to get a new output, you create four outputs, four drafts, and your user can compare the drafts and choose the one they want to go ahead with. That could be another way. These decisions are simple UX flows to deal with the latency. Now,
circling back to your question, latency automatically feels better if the input is better. And my experience, the user experience, automatically becomes better once I'm given more options to choose the better one from. So these both come as a combination, and this is one way in which we could solve it. These kinds of initiatives, and also this is where errors play a great role: me giving good feedback.
How do you even get the feedback? We can ask this question of ourselves, right? How often do we press the thumbs up or thumbs down button for an output that comes in? So that means UX designers should also treat a refresh or regenerate action as equivalent to a thumbs down, because it means the user didn't get the right output. Now we have seen a few interfaces, for example Zapier Agents, where
Yash Shah (13:46.302)
the way Zapier Agents is built is that when the output comes in, it gives you several drafts, and each draft is blurred out initially and you open them one by one. So let's say you are happy with the second one; then you don't have to open the others. But there are some shortcomings to this also, like, why did you show me four of them? Those sorts of things keep coming up.
You keep iterating to figure out what better error implementations you can do such that the perceived latency feels much lower.
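As a rough sketch of the multi-draft idea discussed above, assuming the OpenAI Python client (`openai` package); the model name and the four-draft count are placeholders, not recommendations from the conversation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate_drafts(prompt: str, n_drafts: int = 4) -> list[str]:
    """Ask for several drafts in one request so the user can compare and pick."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        n=n_drafts,           # one request, several candidate outputs
        temperature=0.9,      # a bit of variety between drafts
    )
    return [choice.message.content for choice in response.choices]


drafts = generate_drafts("Write a LinkedIn caption announcing our v2 launch")
for i, draft in enumerate(drafts, start=1):
    print(f"--- Draft {i} ---\n{draft}\n")
```

The UX decision of how to present the drafts, blurred, side by side, or one at a time, sits on top of a request like this.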
So there was a question related to it. We have often seen a lot of tools showcasing how the output is being processed once the input is entered. Is this also related to this angle, where rather than just waiting for the output to be generated, you showcase that it is being processed, it is being analyzed, and this is how the thought process is going on the model side, and then it shows the result? I wanted to know,
from my personal perspective, at times I just want to get the answer and I don't read what is being processed; in certain cases I do like to know. But I wanted to ask, in general, does this annoy users, or do people also prefer to see how the output is being processed, so they're aware that the machine has understood it properly?
I think what you're talking about is the reasoning aspect that we are seeing currently, where you give an input and then it says, I'm thinking this, I'm doing this. There are two aspects to it. One is that they are showing that to you to cover the latency it is taking, and that is fine. But let's say you are someone who is actually
Yash Shah (15:53.172)
reading the reasoning. Then, let's say it is showing a particular step, "I'm thinking this", and you don't want the AI to think that. There is no way for you to stop it, other than stopping the entire operation and giving a better prompt again, which makes you feel more frustrated because you're doing everything twice. So that is still a problem, a problem that came along with the innovation.
So as DeepSeek came up with the reasoning aspect, everyone started adopting it. The difference here is that previously internet access was not there; now internet access is there, which means a large amount of data to process, hence the increased latency. It's a pure tech problem, which will also take significant time to solve. But what would be better is that you don't need to show
such massive reasoning documents. What was created as a way to divert people from the delayed latency is now almost becoming frustrating, seeing long documents there. That could all stay in the back end; instead, we could have a much more simplified version, just showing four or five tasks at a time, and then the next four tasks, and the next four. What currently happens is it shows
all 50 steps, if it is doing 50 steps, at a single time, which is of no use to the user, right? If you show the first four steps, then the next four, then the next four, a smaller window design would help. ChatGPT even iterated on this with the UX; some users started seeing a version where you won't see the entire documentation, you would see a simple loading indicator while the data gets loaded.
Or an even better way of dealing with it: while the AI is doing the research or doing the task, it could give a message saying you can go and do anything you want, and once the result is ready we will intimate you, a pop-up or a push notification appears. That is a much better UX: you're saying, go and do whatever you want, and whenever it's ready I'll let you know. That is a much better UX to go with. And in one of the products we worked on, that is what we tried to do.
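A rough sketch of that "keep working, we'll notify you" pattern; the threading approach and the `push_notification` function are illustrative stand-ins for whatever job queue and notification channel a product already has.

```python
import threading
import time


def call_llm(prompt: str) -> str:
    """Stand-in for the real, slow model call."""
    time.sleep(3)
    return f"(generated answer for: {prompt})"


def push_notification(message: str) -> None:
    """Stand-in for a real push or in-app notification."""
    print("PUSH:", message)


def run_long_generation(prompt: str, notify) -> None:
    """Run the slow generation off the request path and notify the user when it is done."""
    def job():
        result = call_llm(prompt)
        notify(f"Your research is ready: {len(result)} characters generated.")
    threading.Thread(target=job, daemon=True).start()


run_long_generation("Summarise this week's customer reviews", push_notification)
print("User is free to keep working in the rest of the app...")
time.sleep(4)  # keep the demo process alive until the background job finishes
```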
Yash Shah (18:15.314)
Research is happening, and once it's done, a push notification comes in, you get back to the platform, and you see, okay, fine, this is what I'll do. Yeah, and just on this piece, right, I was listening to a conversation with the CEO of Perplexity. And one of the things he mentioned was that instead of sending the full response, they started streaming the response. What that essentially means is that there's a difference between how AI
generates a response and how a human generates a response. Jay, if you ask me a question, I will first think of the complete answer and then start speaking it. But that's not how AI works. AI is generating on the go. So let's say you give a prompt and it generates a response that is 300 words; it will start to stream the response to you as soon as it has written, like, 50 words.
When it has written 50 words, it doesn't know what the last 50 words are; it's just going word after word. And that's why, with OpenAI, Gemini, ChatGPT, Perplexity, you will not feel that much latency, because the response is being streamed to you. It will take you longer to read the 50 words than it takes
ChatGPT to write the next 50 words, so you start to get the response very, very quickly. However, this becomes a huge problem for SaaS products that are wrappers on top of it, because they cannot stream the response; they have to wait for ChatGPT to stream the response to them and then send it out to the user. And that's why the solutions Koushik is talking about matter, right? Which is: how do you deal with latency within your SaaS product, which is a wrapper on top of some other LLM?
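For reference, this is a minimal sketch of what consuming a streamed response looks like with the OpenAI Python client; the model name is a placeholder, and in a product these chunks would typically be forwarded to the browser (for example over server-sent events) as they arrive.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

stream = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Explain latency in one paragraph"}],
    stream=True,          # tokens arrive as they are generated
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        # In a real product this fragment would be pushed to the frontend immediately,
        # so the user starts reading long before the answer is finished.
        print(delta, end="", flush=True)
print()
```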
How do you deal with that? Because the user uses your product as well as ChatGPT, and the user feels that ChatGPT is significantly faster while you, a wrapper on ChatGPT, are extremely slow. And so that's why it needs to be fixed. But this brings me to another point, another question that I have, in terms of generative AI adoption by SaaS founders and their customers and UX.
Yash Shah (20:38.76)
What are the metrics that you recommend a SaaS founder or a product manager measure, such that they will know the UX change they have made is actually working? So as an example, if I make a UX change and I start to see more people using it more frequently, that could also be because my engine is poor and people need to use it more often
to get the response, or it could be that my engine is extremely valuable. So what are the metrics that would tell a product manager, hey, good job on the UX front? So the usual set of things that we would look out for is: what is the per-hour
reading capability of your user. It is a different kind of metric. What we are talking about is that the average human reading capability is 250 words per minute. Wow, so it's like four words a second. I mean, you're just glancing through and going. So now let's say, overall,
there is research from OpenAI which says that around 30,000 words, that is 15,000 input words and 15,000 output words, is what a human can comprehend per hour. This is what they used for the costing of tokens and everything; that's why the research went into it. Now let's say you have designed a GenAI product, right?
Now it is creating an output. Let's say you created an output so detailed and so long that it is not useful for the user. That is what very often happens. In fact, with the reasoning models, when you click on it, the output that comes is almost pages and pages long. You are probably again using an AI to summarize the output that came out. So trying to understand the right amount of output
Yash Shah (23:02.644)
that your user can comprehend is very important. So that is a metric to look out for. That is something the UX designer will report to the product manager, saying, fine, this is the set of things that has come up, this is the output that came from the LLM; then your UX designer will think, I'm pretty sure my user is going to say, I want this to be summarized. So what are the ways to summarize? Should I give an automatic
"simplify this" button over there, such that when you click on it, it gets simplified automatically? Do I have to introduce some new innovations over there because of this particular factor? This is one. Then, the amount of feedback that is coming in on a regular basis is another metric that you should very keenly look out for. It is a metric for two reasons. Any GenAI product that gets built is not perfect;
you have to be ready to be dissatisfied with the output it is giving. This is a given, because the process itself is based on fine-tuning. The key to happiness is low expectations. Yeah, extremely low expectations. So the way it works, a small deviation but I'll brief it very quickly: you scope your product, whatever the feature or AI feature you're trying to build, then you deploy it.
You think that's where it is done, but that is just 10% of your AI product. Then comes internal evaluation, where your AI engineers are giving an input and checking whether the right output comes. The right output won't come, so they keep going back and fine-tuning it on a regular basis until they are satisfied that, fine, it's ready for deployment. And once it goes to deployment, you'll start seeing biases. I'll give you an example.
There was a chatbot deployed in the healthcare industry, and in deployment what they observed is that if you type "the surgeon needs to do so and so, what should be the output", or anything of that kind, the output says that "he" should do something like this. Now how did it come to the conclusion that a surgeon must be a "he"? Right? So this is something which
Yash Shah (25:30.068)
you wouldn't notice, you cannot notice, because some user is typing and that is the output coming in. So from the UX side, if a user is reporting it, or any sort of bias, you need to categorize your feedback: these are biases, these are, you know, prompt or latency based issues. You have to divide the feedback itself this way and then report back. So the number of feedbacks that you are getting and also the quality of feedbacks that you are getting
need to be monitored continuously. So these are a few things that you need to pay attention to. I would even go to the extent of saying that if you are giving multi-draft outputs, look at the quality of each output and which output is getting clicked the most. Sorry, what did you say? Multi what? The draft option that I mentioned, right? Multi-draft. Okay. So mostly what happens is that you have draft one, draft two, draft three, draft four.
Draft 2 is probably coming from one fine-tuned model, Draft 3 from another fine-tuned option, Draft 4 from another. So irrespective of the question, Draft 2 always comes from the same fine-tuned option. If more people are opting for Draft 3 over Draft 2, that means there is something wrong with your Draft 2 fine-tune, and you need to go back and check it. So the click ratio within the draft outputs is also something to look out for.
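Pulling those suggestions together, here is a minimal sketch of the kind of metrics a product manager could track; the 250-words-per-minute figure comes from the discussion above, and the usage numbers and event structure are invented for illustration.

```python
WORDS_PER_MINUTE = 250  # average adult reading speed mentioned above


def reading_time_minutes(output_text: str) -> float:
    """Estimate how long the user needs just to read what the model produced."""
    return len(output_text.split()) / WORDS_PER_MINUTE


def draft_click_ratio(clicks: dict[int, int]) -> dict[int, float]:
    """Share of opens per draft slot; a slot nobody picks hints at a weak variant."""
    total = sum(clicks.values()) or 1
    return {slot: count / total for slot, count in clicks.items()}


def feedback_rate(outputs_shown: int, feedback_events: int) -> float:
    """How often users bother to rate an output at all."""
    return feedback_events / outputs_shown if outputs_shown else 0.0


# Example week of (invented) usage data
print(reading_time_minutes("word " * 3000))                   # ~12 minutes of reading
print(draft_click_ratio({1: 40, 2: 5, 3: 35, 4: 20}))         # draft 2 is rarely chosen
print(feedback_rate(outputs_shown=1200, feedback_events=90))  # 7.5% feedback rate
```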
So these kinds of things are some key factors to look out for, at least at this stage. It's so interesting, because just yesterday, or the day before, I was trying out a product which made a promise that it would generate ebooks in seconds. I remember the website saying "write ebooks in seconds, not weeks". I was thinking, oh my god, this is amazing, this is going to change the game, this is going to change my world.
And I tried it out and it was so frustrating. It probably also speaks to my entitlement in thinking that I will put in a prompt and it will give me exactly what I need, when I need it. But yeah, this is a genuine, real problem. But sorry Jay, you had a question. Yeah, yeah. So this is a little different but still related to UX and GenAI. So we have seen, and I actually
Yash Shah (27:49.596)
read it in one of the articles in a different context, but it mentioned that there are more drop-offs in GenAI tools which have a mobile interface compared to desktop. So is there any specific reason you have spotted why there are more drop-offs on mobile, why people prefer to use these tools on a desktop? This comes down to two reasons. One is,
if you noticed, OpenAI also released the mobile version much later; they introduced the web version first. And the reason for that was purely from a model perspective. LLMs are very efficient now, but initially, when they came up, they were very good on web and very hard to deploy well on mobile. Currently, we are seeing similar problems with SLMs also.
They still struggle with it: on desktop they may function well, but when it comes to the multiple interfaces they need to live on, whether it's mobile or, for example, in the healthcare industry, where you might want some GenAI output displayed within a medical device,
it hasn't reached the level where I can send that data and reflect it over there. So those sorts of small limitations exist. It's a tech-based problem, actually, not a UX-based problem. The drop-offs shouldn't be there, because if you think about it, we always say mobile first; even for websites we say mobile first. It would be even better if AI were actually mobile first, but
from the tech side there has been some lag. Still, I think till today even Gemini struggles with being on mobile, and we are talking about one of the leading tech companies out there. Very recently they came to mobile, and they own Android. Own Android. So it's a tech deployment difficulty that we are seeing, basically, with being on mobile.
Yash Shah (30:10.105)
And one more thing: for example, let's say I'm writing code to generate a game. On desktop I can display it in multiple windows within a screen; I have more room for it. On mobile, I don't have room to display it; I probably have to scroll two or three times to show it. So that again becomes a problem: would I see the code or would I see the preview? That itself is a problem. So those are UX issues.
From a UX perspective, those are things that should be adjusted. For example, if I don't have the space, I can have multiple tabs at the top and just shift across from code to preview, preview to code; I can keep moving. But the code-to-preview toggle itself only came out like two weeks ago. Going forward is when you will see these changes, yeah. Interesting. And I want to call out one thing that is
interesting information for all our viewers and listeners, and both of you may have also missed this: just yesterday evening we held an event where we talked about how AI shapes UX, you know, pixels, prompts, and power, how generative AI is a UX designer's best friend, and today we are talking about how UX shapes generative AI adoption, right?
So this was not even planned; it was not planned to happen this way, but the stars aligned in some way, shape, or form, and here we are. Which essentially speaks to the deep collaboration between AI and UX: what UX is doing for AI and also what AI is doing for UX. But this brings us to the end of our conversation for today. Thank you to everyone
who joined in, and we hope this conversation was valuable and meaningful for you. If you're a SaaS founder who has built a wrapper on top of any LLM within your product and you're looking to improve the UX, you're facing challenges with its adoption by your users or customers, or you're getting critical feedback on it on your review platforms or in your chat support, do consider reaching out to us. Koushik has done
Yash Shah (32:29.883)
a very interesting amount of research, and we'd be happy to share with you the things that we've learned, some examples that you could learn from and implement as well. The other thing that I would also like to mention is that our cat budget is dependent on your subscriptions. If you subscribe to the channel, if you comment, if you like, then our cat gets food.
If you don't think about us, think about the kittens and the food that you are not giving them by not subscribing. So please consider subscribing or liking or commenting. We just introduced Schrödinger's cat with that statement. The cat lives or dies based on the subscription. Right, based on the subscription, right? So it's in a state of uncertainty. But as soon as you click subscribe, we will be certain.
But until you hit that, it's in a state of uncertainty. But thank you for being with us. And we'll see you again next week with an interesting topic on how SaaS and AI can work together. Until next time. Bye-bye.