Your teams face constant questions they can’t predict. Whether it’s employees searching for obscure policies or customers asking for their order status, questions arise from complex workflows. Traditional chatbots and web-based GPTs fall short, leaving businesses struggling to find answers quickly.
AI-powered solutions, like Krista, step in to answer these questions. By leveraging intelligent agents and real-time data, AI cuts through massive documents and delivers accurate answers when needed. This article explores how AI transforms document understanding, helping companies move beyond guesswork and slow processes to provide instant, actionable responses.
What Are Unknown Questions?
Unknown questions are unpredictable inquiries your employees or customers ask, and you can’t know them ahead of time. These questions surface in complex processes, from obscure policy details to hidden insights buried in lengthy documents. The challenge is clear: you don’t know the exact question, but you need to supply the answer fast.
Imagine a customer asking about an uncommon policy buried in a 200-page manual. Without a clear path to the right information, teams waste valuable time searching. AI solves this by using natural language processing (NLP) to interpret and respond to these unknown questions instantly, cutting through the noise to provide precise answers where and when they’re needed.
The Role of AI in Answering Questions
AI excels at tackling unknown questions by combining NLP with intelligent data retrieval. It doesn’t mimic a basic chatbot—it integrates with your systems to pull relevant answers from knowledge management, CRM systems, and other data sources. Whether it’s a customer inquiry or an internal query, AI understands the question’s context and delivers precise information fast.
Unlike traditional search tools, AI agents like Krista don’t require users to sift through documents manually. They quickly scan documents, databases, and real-time systems, providing answers even when you don’t know the question in advance. This capability helps teams move faster, solve problems instantly, and avoid bottlenecks caused by outdated processes or inaccessible knowledge.
Challenges in Building an AI Agent to Answer Questions
Building an AI agent that effectively answers unforeseen questions presents several key challenges. To succeed, businesses must address these issues directly:
- Hallucinations: AI can generate inaccurate answers, known as hallucinations. Robust quality control is essential to prevent this.
- Security and Privacy: When handling sensitive company data, strict security and privacy protocols are non-negotiable.
- Role-Based Access: AI must deliver answers based on the user’s role, ensuring only authorized individuals access confidential information (a minimal sketch of this follows the list).
- Real-Time Data Integration: Static documents won’t suffice. AI needs access to real-time data from dynamic systems to provide accurate, up-to-date answers.
- Data Strategy: You need a strategy that combines access to both static and live data while aligning with clear business goals.
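To make the role-based access requirement concrete, here is a minimal Python sketch of a retrieval layer that filters content by the asker’s role before anything reaches the model. The corpus, role names, and keyword matching are illustrative assumptions, not Krista’s implementation.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    allowed_roles: set[str]  # roles permitted to see this content
    source: str

# Hypothetical in-memory corpus; a real deployment would query a
# document store or vector index instead.
CORPUS = [
    Document("Severance policy: two weeks per year of service.", {"hr", "executive"}, "hr-manual.pdf"),
    Document("Return policy: 30 days with receipt.", {"support", "sales"}, "policies.pdf"),
]

def retrieve(question: str, user_role: str) -> list[Document]:
    """Filter by role *before* retrieval, so confidential content never
    reaches the language model for an unauthorized user."""
    candidates = [d for d in CORPUS if user_role in d.allowed_roles]
    terms = set(question.lower().split())
    # Naive keyword overlap stands in for semantic search here.
    return [d for d in candidates if terms & set(d.text.lower().split())]

print(retrieve("what is the return policy?", user_role="support"))  # matches policies.pdf
print(retrieve("what is the return policy?", user_role="guest"))    # empty: role not authorized
```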
Why Real-Time Data Matters
Real-time data makes AI truly effective. When AI answers questions, outdated information leads to frustration and mistrust. With real-time data, AI provides the most current and accurate information, ensuring responses are relevant and actionable.
By integrating AI with live systems—such as CRM, ERP, or customer support platforms—you empower your team with up-to-the-minute data. This capability sets advanced AI apart from basic chatbots that rely solely on static documents. With access to real-time information, AI can handle complex, evolving questions that require immediate, reliable answers.
The Role of Retrieval-Augmented Generation (RAG)
Retrieval-augmented generation (RAG) elevates real-time data integration by pairing document retrieval with answer generation. Instead of relying solely on pre-trained data, RAG allows AI to search structured and unstructured sources and pull relevant information in real time.
Without RAG, AI solutions often hit roadblocks. Large Language Models (LLMs) don’t know everything, and you can’t train them to manage constantly changing information. RAG gives your AI the power to pull live data from multiple sources and integrate it with your documents, ensuring contextually accurate answers, no matter how complex the query or how fragmented the data.
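As a rough illustration of that loop, the sketch below retrieves from static documents, pulls a live value when the question is dynamic, and grounds the generated answer in both. The `llm.complete` and `crm.get_forecast` calls are hypothetical stand-ins for whatever model client and CRM API you actually use.

```python
def relevant(doc: str, question: str) -> bool:
    # Keyword overlap stands in for embedding similarity.
    return bool(set(doc.lower().split()) & set(question.lower().split()))

def answer(question: str, llm, crm, static_docs: list[str]) -> str:
    # 1. Retrieve: gather context from static content...
    context = [d for d in static_docs if relevant(d, question)]
    # 2. ...and from live systems when the question is dynamic.
    if "forecast" in question.lower():
        context.append(f"Live CRM forecast: {crm.get_forecast()}")
    # 3. Augment: ground the prompt in the retrieved context.
    prompt = (
        "Answer using ONLY the context below. If the context is "
        "insufficient, say you don't know.\n\n"
        + "\n".join(context)
        + f"\n\nQuestion: {question}"
    )
    # 4. Generate: the model composes an answer from the grounded prompt.
    return llm.complete(prompt)
```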
Use Cases for AI Virtual Assistants and Enterprise Search
AI excels at filling knowledge gaps in processes where finding answers is challenging. Key use cases include:
- Customer Support: AI reduces escalations by instantly answering uncommon questions that usually require senior staff intervention. This decreases training time and keeps support teams efficient, even with high turnover.
- Sales Teams: AI provides quick access to trends, forecasts, and key customer insights. Sales reps no longer need to manually sift through reports—they get immediate answers to critical questions, helping them close deals faster.
- Standard Operating Procedures (SOPs): AI helps employees retrieve rarely used procedures instantly, ensuring compliance and reducing the risk of costly mistakes.
In all of these cases, AI closes the knowledge gap between scattered data and actionable insights, driving efficiency and minimizing delays.
Actionable Steps for Implementing AI to Answer Questions
Implementing an AI solution that effectively answers unknown questions requires a strategic approach. Follow these steps to get started:
- Identify Key Audiences and Content Gaps: Begin by determining who in your organization needs the answers and where information bottlenecks exist. Focus on areas where AI can reduce costs, increase revenue, or improve employee efficiency. Address not only tangible outcomes but also factors like morale and turnover that impact long-term success.
- Develop a Clear Business Case: Invest in solving problems that affect dollars and cents, not just annoyances. Overburdened regulatory processes or complex internal knowledge systems are prime areas for AI-driven solutions that relieve teams and accelerate workflows.
- Automate Tribal Knowledge Collection: Use AI to extract valuable knowledge from employees and fill in documentation gaps. This automation reduces reliance on key individuals, prevents knowledge loss, and creates a sustainable, growing knowledge base (a sketch of this capture loop follows these steps).
- Integrate Real-Time Data and Static Content: Ensure your AI solution pulls data from both live systems (like CRM and ERP platforms) and static documents. This allows AI to deliver up-to-date, accurate responses, giving teams the context they need to make fast, informed decisions.
- Start Small and Scale: Launch with a manageable project that can demonstrate value within a short period (e.g., one month). Show a demo to key stakeholders, get early wins, and build momentum from there.
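For the tribal-knowledge step above, a minimal sketch of the capture loop might look like the following. The exact-match lookup, `input` prompt, and confidence threshold are placeholders for a real search index, a messaging integration that reaches the right expert, and a tuned score.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Hit:
    answer: str
    confidence: float

# Illustrative seed content; grows as experts answer new questions.
KNOWLEDGE_BASE: dict[str, str] = {
    "What is the return window?": "30 days with receipt.",
}
CONFIDENCE_FLOOR = 0.7  # illustrative threshold

def search(question: str) -> Optional[Hit]:
    # Exact-match lookup stands in for a real search index.
    text = KNOWLEDGE_BASE.get(question)
    return Hit(text, 1.0) if text else None

def answer_or_escalate(question: str) -> str:
    hit = search(question)
    if hit and hit.confidence >= CONFIDENCE_FLOOR:
        return hit.answer
    # No documented answer: route the question to a subject-matter
    # expert and save the reply, so the knowledge base grows from
    # questions people actually ask.
    reply = input(f"SME, please answer: {question}\n> ")
    KNOWLEDGE_BASE[question] = reply
    return reply
```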
By following these steps, you enable AI to provide timely, accurate responses that enhance workflows, reduce costs, and drive better decision-making across the organization.
Get Started with Krista
AI-driven solutions like Krista transform the way businesses answer questions by leveraging both real-time data and existing content. By integrating intelligent agents, companies can overcome the challenges of siloed information, security concerns, and outdated processes. AI enables teams to find answers faster, streamline workflows, and make more informed decisions.
As you look to improve efficiency and reduce bottlenecks in your organization, consider how AI can fill the knowledge gaps for your employees and customers. Start by identifying key content gaps, automating knowledge gathering, and integrating real-time data to build an AI solution that delivers real business value. In the next phase, explore how AI handles known questions, driving even greater impact on operations.
Transcription
Scott King
Well, hey everyone, I’m Scott King, and thanks for joining this episode of the Union Podcast. I’m joined by my co-host, Chris Kraus. Hi Chris.
Chris Kraus
Hey Scott.
John Michelsen
How you doing, guys?
Scott King
We’re doing great. We’re going to follow up on our last episode, where we talked about three different use cases for document understanding: answering questions from users, whether employees or customers. The first use case is the unknown question. This is the typical chatbot use case people think about. People need to figure out where all their documents are so that when someone has a question, they can find an answer. You don’t know what the question will be, so you’d have to foresee all the questions if you were going to code this into a chatbot.
Then we talked about the known questions, the ones you send out to people and get various answers back. How do you handle that? Finally, we discussed how you’d enable natural language processing across your organization to find information and run processes. We want to dive deeper into the unknown questions. Some call this enterprise search. People want all their information available in some format. They experiment with internet content, like ChatGPT, but want their own content, such as manuals, sales systems, CRM, ERP, and so on.
Chris, if people want to do this, everyone who’s been on the internet in the past year and a half has played with Gemini or ChatGPT. There’s a huge buzz around AI agents. What’s the overview, and how can John help us dive into the details?
Chris Kraus
The thing people like is that it has answers. They can ask anything, and it gives an answer. Now, there may be hallucinations, or it might not be the right answer, or it’s a very general response. At the enterprise level, I see two problems with my customers. First, they’ve done their homework. They’ve put all their documents into something like SharePoint, OneDrive, or Box. They’ve curated them and implemented strict change management processes, and people still can’t find things. The answers are there, but they can’t be found.
That’s where Krista comes in to read those documents and answer questions based on them. The second half of my customers have so much tribal knowledge locked in people’s heads, and they’re worried about these people retiring. They can’t ask them to write everything down because they wouldn’t even know where to start. They know so much it would take years to document it all.
So when you talk about Q&A (enterprise search) and answering questions, it’s a great use case because you want automation. If the answer isn’t documented, you can ask a subject matter expert and build the documentation in the background. The software can help, but there’s no gate where you need all the answers upfront. In this use case, we don’t know all the questions, so we don’t know all the answers, but there’s a path forward for everyone to get value quickly.
One thing I’ve noticed is that people think narrowly. If I’m on my desktop, I can take an HR policy, paste it into ChatGPT, and ask a question. Or I can take notes from three meetings, paste them into ChatGPT, and get an answer. People talk about doing the work, finding things, and then using ChatGPT to summarize or fix grammatical errors.
That’s probably not the best way to do this because you’re forcing the person to be the integration glue for everything. There’s no way to control security, and you can’t handle hallucinations because you’re mixing data from what Gemini or ChatGPT knows with what you know. Sometimes, you also need access to different data sources. How have we actually solved that with Krista? We’ve done some cool things, and people don’t realize there are challenges you have to overcome.
John Michelsen
Yeah, well, you named a bunch of them. The fundamental issue is that no one has perfectly clean data that’s properly attributed to roles or characteristics like region or time frame. There are gaps that need to be filled, and there’s inconsistency in the data. Many people would love to have all their content in one repository, clean and ready for people to ask questions, instead of managing it themselves. And you know, actually, you can, despite the gaps, because really good orchestration of people into the process of curating that content gets you there.
When an answer isn’t sufficiently supported by the content provided, it invokes the right person in the appropriate role. They might dig up a document, or they could type out a really good answer on the spot. That helps fill in the gaps. You mentioned enterprise search earlier. Enterprise search is typically architected to replicate data from existing systems and file systems into another repository, making it easier for that system to index. You query that new data representation, but we think you’ll never get there by replicating everything.
There will always be gaps, and workflow curation isn’t built into that process. It often becomes a data dump. Many people talk about their data lakes and how difficult they are to manage, which is expected given what’s being done. Instead of dumping static files into a system, we want to ingest that content directly. For structured data in systems, it’s usually carefully architected to ensure correct structure, so we leave it there and query it in real time.
You don’t want to replicate it into another system, lose the schema, and then try to apply a new schema when asking a question. Leave those systems in place, and query them as needed, because they change so often. For example, if I ask, “What’s the sales forecast for my West Coast manager this quarter?” it might have changed since this morning.
You can’t just replicate the content into another system, index it, and then assume it’s all in one place. Then you’d need to reimplement a whole set of role-based access controls, which is a losing proposition for most. The key is intelligent orchestration: determining whether a question is better answered from static content or by querying live systems, and understanding who can access which systems based on their role. That’s where we’ve really made progress.
One more quick point: Chris mentioned a few customer examples where we’ve identified and overcome challenges, but there are many more. One of my personal favorites is dealing with authoritative versus non-authoritative content. There’s a big difference between content where we don’t know its source, who wrote it, or when, and using it to answer a customer question versus authoritative content.
For instance, an actual product specification with a date and model number, or a third-party certification stating certain facts about a product, are authoritative. These are three different levels of authority. One thing we’ve done that I’m really proud of—though it might seem minor—is that when Krista ingests content, Krista establishes its authority as low, moderate, or high. This transparency helps when Krista generates answers.
The ability to decide whether the information you have is sufficient to make a decision or provide to an end customer is crucial. You can look at the confidence, authority score, and trust score, and decide, “I think that’s not sufficient” or “I need to get back to you, Mr. Customer, because the content I have is unreliable.” These are the things that set this apart from what the industry was trying to achieve with enterprise search. What we ended up with there was massive IT infrastructure and business people forced to learn SQL or NoSQL just to perform queries. We’re not really moving the ball forward by turning them into prompt engineers either, but at least we’re getting closer to natural language.
Chris Kraus
John, can you talk a little about the differences between the effectivity of a document, like “This is the return policy for 2022,” versus real-time data? Some things you need for historical purposes, but for others, like sales numbers, when they change, the history is gone.
What’s the difference between effectivity and real-time?
Scott King
I like that question, Chris. When John mentioned data lakes, I thought about all the executive dashboards that take so much work. The data is replicated, and by the time you present it, it’s already weeks old. I’m really interested in the effectivity dates, John.
John Michelsen
It’s funny because, in every large organization we’ve worked with, when they talk about their data lake, it’s not just about how long it takes to get data out—it’s how many years their project has been on the roadmap to even get data into the data lake. You wouldn’t create a spreadsheet of every possible time someone might ask questions about the current forecast or inventory level for anything dynamic. It’s moronic.
Every piece of content, even static files on a drive, has a useful life. There’s a date when it becomes relevant and a date when it’s no longer relevant, except for historical purposes. For example, last year’s tax code isn’t the one you use to decide how to pay taxes. You must use this year’s tax information. But while working on taxes, you also need last year’s information for filing. This is a good example of content’s effectivity—content that applies to different periods.
This applies to static content as well, like contracts, laws, and regulations that have dates stating when they become effective or supersede another document. You need to account for these dates if you want a system that replicates the work your people are doing manually, beyond a simple Google search. These systems need the capability to unlock real value, not just something like SharePoint search or Googling publicly available content.
In the static content world, this applies, but you’d never generate a static spreadsheet for inventory levels that are constantly changing. You always want real-time access to that data. When your content management solution—like Krista—manages this, it knows what static content is relevant and how to navigate roles, effectivity dates, and attributes. It also knows which systems have the real-time data, and it can combine both to give a complete answer to a tough question.
Scott King
Yeah, asking good questions is important. When you mentioned Google search and effectivity dates, even Google lets you filter by date range—last week, last month, last year. But that process is manual. You should be able to automate that with your prompt or just by how you’re asking the question. With the unknown questions, people understand how answers are generated, but they don’t understand how LLMs are prompted, which is why they say, “I need to train my own LLM.”
John Michelsen
That’s right.
Scott King
The market popularized retrieval-augmented generation (RAG), which is based on your question—where do you get the information, and how do you generate the answer? It’s more complicated than people think. You talk to customers about the problems they encounter with training, hallucinations, and security. Talk to us a little bit about the RAG aspect and how it needs to be done correctly from the start, instead of assuming too many unknowns and ending up with an out-of-control project.
John Michelsen
Right. Unfortunately, we’ve seen too many customers who start using Krista after trying to build a solution themselves. They think, “LangChain is free, I can hack away at some Python, and we’ll see what we can do.” They end up with something that’s basically a surrogate for Google search, and not much better. Maybe they’ve got proprietary content, but otherwise, it’s not an improvement.
As soon as they encounter the kinds of issues we’ve been discussing, they face a heavy lift. Even after completing the solution, it’s still a full-on application development lifecycle. You’re starting a backlog in Jira or VersionOne, hiring teams, planning sprints, and scheduling meetings.
Some customers think, “I can give a document to an LLM, ask it a question, and get an answer. Case closed.” But no, you’ve got gigabytes, even terabytes, of content, and not all of it should be considered for every question. If you don’t account for that, the answer could be wrong or misleading.
Scott King
I wanted to double-click on that because I don’t fully understand.
John Michelsen
Right. I was prompted with too much information.
Scott King
Yeah, that’s a great example—a bad prompt.
John Michelsen
Exactly. The idea of prompts themselves is part of the problem. We’ve been making simplifying assumptions from an IT or technology perspective and pushing the burden onto the users. We’re saying, “You guys can learn SQL, right? You can learn NoSQL, right? You can learn prompt engineering, right?” It’s just pushing the problem further away from the solution.
The point is, Krista engineers the prompts for you, but even then, the notion of asking good questions is still important. A vague question gets a vague answer, while a specific question gets a specific answer, or “I don’t know that,” which is incredibly valuable. Being told “I don’t know” is much better than getting a long-winded, incorrect response.
The funny thing about most public-facing LLMs is that the longer the answer, the less likely you’re getting what you actually need. You probably didn’t ask for an essay; you asked for a precise answer. If you did want a long discussion, well, you’ve got a lot of time on your hands. But that’s not what these systems are built for.
We need to ensure people ask good questions without having to engineer prompts. This morning, there was buzz in the AI world about Claude supposedly eliminating the need for RAG because its memory is so large. You can just shove everything into it and access it repeatedly. The argument is that you no longer need to deal with specifics.
But that’s completely insane. There’s no subject-matter, role-based, time-sensitive, authority-oriented approach. You miss everything that makes this useful. Yes, for the simplest use case, you could do what this guy suggests—dump a bunch of content into Claude, cache it, and access it for follow-up questions. But that won’t reduce a single customer service call or increase sales proficiency. Whatever goal you had, it won’t work because it takes much more sophistication to get real value.
After a few months, you’ll realize you need something more advanced. The real challenge is that you don’t know the questions people will ask, so you just go looking for content. But different content has different shapes—it’s only appropriate in certain subject matters, for certain people, or for specific time periods.
Who’s asking the question matters. For executives, I’ll only use highly authoritative content. For rank-and-file employees, I might say, “It could be this,” and follow up to validate the content. So you can get where you want to go if you start with the end in mind. Ask yourself, where am I really trying to get to? It’s a sophisticated place, and a very cool one. We just had a company meeting where we announced several new customers who launched in the past seven days. One of them deployed in four weeks and transformed their organization by unlocking all that information.
They solved an executive problem of too much data being locked in tribal knowledge. Executives were constantly being asked for answers, and now they don’t have to be. It’s really powerful when you start with that end goal and then backtrack to what you should do in the next month.
Chris Kraus
Yeah. One thing people don’t understand is that you need automation to capture the user’s context to get the right answer. Claude might have the company’s entire roster, but it doesn’t understand the reasoning needed to differentiate between U.S. and U.K. employees. Some of that context comes from system APIs, and then it’s applied to the search. Can you explain how that works? It can be sophisticated, but it’s not difficult if the platform handles it for you.
John Michelsen
Right, and of course, Chris, given your role in building this and ensuring customers use it effectively, this is one of the key components. We talk about this often, but to be quick: maybe 30% of what people do with a system like this is a surrogate for a Google search. They don’t need context—they just need an answer. For example, “What’s Chris Kraus’s birthday?” If that’s in the content, hopefully it’s accurate enough. Many times, though, without a lexical analysis, it may struggle depending on how you phrase the question. But assuming it works, you get your answer, and it didn’t require any special context.
However, the other 70% falls into different categories. You touched on one of the most important ones. Let’s say I’m sitting in Dallas, Texas, and I ask a question. Someone in the same company, even in the same role but in a different region or country, could ask the exact same question but need an entirely different answer. It’s not even about security authorization—it’s about context. When I ask, “Who are my most important customers?” the “my” implies a lot. My scope might be the central region, and that requires orchestrating systems to gather context in real time.
It might require orchestration to get that answer in real time, but if there’s a customer list somewhere, it can automatically pull the relevant ones from the central region, giving me the right answer. My colleagues elsewhere would get theirs. Without context—typically, who am I talking to—you won’t get there. Another part of this is…
Scott King
The most important customer list is probably on the Chief Revenue Officer’s laptop, right? So, John, we’ve talked about use cases for unknown questions. Tribal knowledge is important to me because I try to write everything down and share it so that nothing I know is inaccessible to others.
John Michelsen
Exactly.
Scott King
Customer support and sales are great internal use cases because situations change often—like who the customer is, or with employee churn. We have to move beyond tribal knowledge; we can’t keep asking “Brent” from The Phoenix Project all the IT questions just because he knows everything.
John Michelsen
Right.
Scott King
And then there are standard operating procedures—things that don’t come up often, like “How do I do this?” So, if I’m an executive leading a function and we need to solve these problems, what are some concrete action items we can take?
John Michelsen
Super important, yes.
Scott King
What are three or four concrete steps to get started, whether with Krista or not?
John Michelsen
Well, they’ve got to start by identifying the audience, the content, and the business case for why this problem needs solving. Annoying problems are one thing, but issues that impact dollars and cents or generate revenue are worth investing in. For example, reducing costs, increasing revenue, or improving morale and reducing turnover are important. If you make the work environment better, you’ll attract better employees and improve quality of life.
Another example is regulatory content that’s overwhelming everyone. How does anyone know how to comply with all the regulations specific to their use case, industry, or location? In the U.S., rules can vary down to the county or even the city level. Before doing anything, you’d almost have to ask, “What’s the rule on this?”
Alleviating that burden helps people work faster, brings in revenue faster, reduces costs, improves morale, and decreases churn. It’s magic.
So, to start, identify the audience and the knowledge gap. For example, if you have content but don’t know what questions will be asked, you need to know who your audience is and what the content is. Then, identify the critical people who can solve content gaps and inconsistencies. After that, you can either hire a team of developers or talk to the Krista team.
Scott King
Perfect. Chris, is there anything else about the action items John didn’t cover?
Chris Kraus
I want to go back to something I mentioned earlier. Some people might think, “I don’t have all my documents, so I can’t start.” But we’ve seen both use cases—those with organized content and those with tribal knowledge—be equally successful. We’ve had a 50-50 split between people who say, “We created a content warehouse, but people don’t know how to find it,” and small companies with just a few experts. These experts have all the tribal knowledge, but we can’t extract their brains.
We can, however, prompt them to answer questions we don’t have content for, and then automatically fill the content backlog. This way, you don’t need all 400 things they know—you just need the hundreds of things people are actually asking about. Both paths lead to success in four weeks, whether you have the content already or not.
John Michelsen
Yeah, that’s a good point.
Scott King
I think people don’t realize that you do need to gather your content and get it into a system, but we automate a lot of that. It’s not a manual exercise. If your data is extremely messy, yes, you might need to clean it up a bit.
Chris Kraus
Right.
Scott King
Using machine learning to prepare content for machine learning, so everything just works, is a huge step. If you want to solve a problem but don’t know how to get your data into a system, contact us and we’ll help. We covered a lot today, from security to context and handling unknown questions.
Next time in our document understanding series, we’ll cover known questions—questions you would ask another person. We’ll dive into how to structure documents, run processes, and create hours of capacity using automation. Thanks for sticking with me through this series, and until next time.