Did you have to become an LLM prompt engineer? Who took on those roles? “It was really a partnership. It was largely Pathlight. We had a great team we were working with and we’d get on the phone once a week and look at the data coming in. For a while we did double the work. What I would do is have Pathlight’s AI create the conversation and then I’d separately have my managers do it in isolation. Then we’d compare the results to see not just what they both thought of the interaction, but what did they determine the root issue was? Was this a product return? Was this a product issue? How did they think the rep did?
“That really helped us identify areas where the prompt needed to be updated. We’d talk that through with Pathlight and they’d provide suggestions on how to tweak it or how to ask a question in a certain way to get more detail out of the AI, and that process has continued to work.”
What was the result of using the genAI from Pathlight? “A couple things came out of it, some expected and some not. We knew from our understanding of the product it was going to be able to tell us some of the basics. What did the customer call about? What was their sentiment?
“Some unexpected findings also came out of it. We found opportunities to double down on hospitality training. There was a higher number of conversations than we were comfortable with where the AI was grading the agent’s conversations as negative. Our initial impression was that this had to be wrong, and we adjusted the prompt and made it a little more forgiving. But what we found was that for a lot of customers, when they get very demanding, our reps didn’t necessarily have all the tools they needed to de-escalate, avoid conflict, or pivot. Or, more importantly, to know when to get themselves out and pass that conversation along to someone else. I think that was a huge finding for us.”
Did the AI surface anything unexpected? “We identified some product-related themes even before we went live, and this was one of the moments that totally sold me. We’d given Pathlight 150 recordings of totally random conversations, and they were just going to build us something to show us what their tool could do at scale. It just so happened within those 150 conversations they identified four or five customers with similar complaints about a product that ended up being a manufacturing defect that was causing rusting on electrical components. We really lucked out, because we’re not usually able to see something like that this clearly and identify we have a manufacturing issue with this specific set of products...
“We were able to trace it all the way back to the factory. We identified the serial numbers. It ended up being a small subset of a specific [wine] cellar we made. And we addressed it. We also made it right with those customers. That was one of those moments when I thought, ‘I don’t know how we ever would have found that insight out.’ That’s not something we were looking for. It’s never happened to one of our products before. It’s not part of any checklist. It wasn’t even part of our evaluation process. We were looking at the rep’s performance, not the products. This was cool.”
Were you able to calculate an ROI? “For many others I’ve spoken with, that’s been the most elusive part of rolling out AI. This is where I continue to struggle. The direct answer is, no. I know it’s delivering value for our business and I can demonstrate that by what my team’s doing on a daily basis. We’ve been able to directly map the investments we’ve made in hospitality training to hire [customer service reps] because we know these conversations are happening the way they’re supposed to. We’ve been able to assess and confirm procedural compliance, and we know our agents are using the tools we’ve given them to make the right business decisions. But I don’t know that we’ve ever been able to put a specific number on it to be able to say this drove this percent of ROI.
“It’s been a validation tool more than anything. It allows us to assess what’s going on. It has given us the ability to be more nimble. Perfect example: we ran a promotion with a sale a couple weeks ago. We ran a test to see how people would respond to a shipping-related discount. It was not something we’d ever done before. The marketing team asked if we could figure out what customers were saying.
“So, we put in a prompt for the system to look at conversations over the past two days: who’s talking about this shipping discount and what are they saying? We were very [quickly] able to pull out an analysis of it. Alternatively, I can’t think of another good way of doing that, short of saying to the team, ‘Listen to every call and have every rep make a tally every time a customer mentions that.’ And a lot of that would have to be planned ahead of time. There’s almost no way to go back afterward and say, ‘Hey, how did that promotion go?’”
Is there a next step in evolving your AI strategy? “What we have done thus far has largely been retrospective. Let’s look back on how the team performed over the past six months. Let’s look at how reps are improving on certain metrics. Let’s look at how they’re reacting to products. I want to get to a more proactive state, like with the corrosion issue. I want to find more of those. Show me the thing that maybe a human wouldn’t even put together that we can go out and action off of and maybe fix something before it’s really a problem.
“There are all these examples of times when we present an issue and ask, how many examples of it do we have? Well, maybe one or two. Well, then is it a coincidence? Do we invest resources in fixing it? I think we can move to a place where we’re prompting the model not just to look back on what it’s done but to say, ‘You know our business well enough now. Tell us if something seems odd to you.’”