# Five Enterprise AI Wins: Llama Index with Laurie Voss

## Metadata

- **Channel:** Jason Liu
- **YouTube:** https://www.youtube.com/watch?v=JI7p1ifAILo
- **Date:** 06.01.2026
- **Duration:** 48:11
- **Views:** 188
- **Source:** https://ekstraktznaniy.ru/video/52971

## Description

Complex document processing still consumes enormous effort in enterprise environments, from government RFPs to financial contracts to insurance claims. What if the key to effective AI agents isn't just the technology, but understanding your actual business processes first?

In this talk, Laurie Voss (VP, Developer Relations at LlamaIndex) shares real-world case studies from LlamaIndex customers who've successfully deployed document agents in production, revealing the patterns that separate successful implementations from failed experiments.

We discuss:
• Why document agents (RAG + agentic capabilities) are essential for real-world enterprise applications
• The critical importance of LlamaParse for turning complex documents into LLM-readable formats
• Real-world case studies: government construction RFPs, financial services document processing, insurance claims automation, and healthcare documentation workflows
• Why understanding your existing manual process is more important than the

## Transcript

### Introduction and Overview []

So today I'm really excited to bring in a longtime friend of the class, Laurie Voss, still leading the DevRel team now, right? He's going to share with us, I think, five different case studies on how different companies have been using LlamaIndex, from parsing to retrieval to maybe some of the more document intelligence work that LlamaIndex has been developing. I'm really excited to learn a little bit more from the team. As always, we have a Slido link in the Zoom chat. You can go there to ask any questions and upvote questions, and we'll do that after the talk is finished, and we can just have a conversation on some of the ideas that we're thinking about. And with that, Laurie, take it away. — Thanks very much, Jason. Hi, everybody. As Jason said, I'm head of developer relations at LlamaIndex. In a former life, I co-founded npm Inc., so some of you may remember me from when I used to talk incessantly about JavaScript. These days I talk incessantly about AI. The through line being that I talk incessantly. Today I'm going to be talking about LlamaIndex, starting with what it is and what you can build with it, namely document agents. Then we're going to dive into the case studies: some real problems that our customers solved with LlamaIndex, what they were doing before, what they actually built with us, and what they got out of it.

### What is LlamaIndex? [1:23]

But first I should say: what is LlamaIndex? Well, it is a bunch of things. Start with the most obvious. We are a framework in both Python and TypeScript that helps you build generative AI applications. Those frameworks are both open source and free to use. There's the core framework, which is focused mostly on data ingestion and manipulation. And then there is Workflows, an event-based orchestration framework built to put guardrails around the otherwise total autonomy that an LLM-empowered agent gets; workflows guide your agent into useful, predictable patterns. There's also LlamaCloud, which is our enterprise service. It allows you to connect your enterprise data sources, whatever they are (documents, Google Drives, databases), and have their contents automatically parsed and indexed into semantically searchable form so that you can build AI applications on top of your data, as I'll talk about later in this talk. You can think of LlamaCloud as LlamaIndex building most of a LlamaIndex application for you. Key to LlamaCloud, and something you're going to hear about a couple of times today, is LlamaParse, which is our flagship service that will parse complicated documents in any format into a form that can be understood by an LLM. So this is not parsing for human purposes. This is parsing for LLM purposes. It can take PDFs, Word files, PowerPoint documents, whatever, and turn them into a format that LLMs find easy to understand, which turns out to be critical for lots of GenAI applications because if your LLM can't understand what it's reading, you're going to get nonsense results. LlamaParse is free to use for 10,000 pages a month, so it's easy to try and get started with. But a good question to ask, I think, is why should you use LlamaIndex? In fact, why use any framework at all, or any cloud service, to help you get started? Why shouldn't you just roll your own?
And the answer is because we will help you go faster. As a technologist you have limited time. You have actual business and technology problems to solve. You don't want to get stuck figuring out the basics. Our frameworks have solved a bunch of the foundational problems for you so you can focus on actual business problems, which is what we're going to talk about in the case studies. But what do people actually build with LlamaIndex? We call them document agents. I was told this crowd was pretty well informed, so you already know what agents are and what RAG is. I'm not going to go into explanations about either of those. I can jump straight into why document agents, which are the combination of RAG and agents, are important. RAG is essential to any real-world use case because LLMs are trained on mountains of data, but they're not trained on your data. They don't have your data. So to do anything useful you have to feed them your data, and you have a lot of data. You probably have hundreds of millions of tokens' worth of data. LLMs have limited context windows, so you can't give them all of that data at the same time. And even if they had infinite context windows, giving them all of your data every time would be prohibitively slow and expensive. So you have to be selective, which is what the retrieval part of RAG is about. It is finding only the most relevant data to give your LLM at the time that you're asking it to solve a problem. But RAG by itself is limited. RAG is semantic search, and search can only do so much. It can only answer very specific questions, usually about a single focused topic. It often fails at complex questions or multi-part questions or things that require planning and comparison. For that you need the greater abilities of agents.
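The retrieval step described here, picking only the most relevant pieces of data to hand to the LLM, can be sketched in a few lines. This is a toy illustration, not LlamaIndex code: it swaps real embeddings for simple word-count vectors, so it behaves more like keyword matching than true semantic search, and the names `tokenize`, `cosine`, and `retrieve` are made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:top_k]


chunks = [
    "The contract covers concrete foundations for a municipal bridge.",
    "Office catering services for the city council.",
    "Bridge repair and concrete resurfacing, budget 2 million.",
]

# Only the selected chunks go into the prompt, not the whole corpus.
context = retrieve("concrete bridge work", chunks, top_k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

A production system would replace the word-count vectors with embeddings from a model and a vector index, but the shape is the same: score, rank, and pass only the top results to the LLM.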
Agents can reason and plan and reflect and come up with many pertinent questions that they can feed to your RAG system or your other systems, get multiple answers that they synthesize into single coherent answers, and act on those answers by taking real-world steps. So RAG needs agents to be good, and agents need RAG to be useful. And that is what document agents are. They are agents that have been enabled with a powerful RAG system. But let's not talk about them in the abstract. Let's

### Case Study 1: SOFTIQ [5:36]

get into some actual case studies from actual LlamaIndex customers. First up is a company called SOFTIQ. They are a software consultancy based in Poland. Their document-heavy domain was RFPs for government construction contracts. RFPs, or requests for proposal, as I'm sure you have heard, are lengthy, often 100-plus pages, highly technical, and follow no standardized format. They are documents written by humans and are frankly barely readable by humans. They are extremely dense. But satisfying these contracts, coming up with proposals for these RFPs, in Poland is a 7 billion dollar market. Previously, their solution was humans. It would take one human a couple of hours, or for a lengthy report sometimes days, to read a single RFP and the prospective contracts. And out of all of the RFPs that were out there, they were finding them by doing keyword search. So they were just saying, I'm a construction company and I do concrete or whatever. They were searching for the word concrete and hoping that the RFP contained that word. That led to both false positives and missed opportunities: RFPs that they could have satisfied but didn't know about, and RFPs that they had to read all the way through to discover that they couldn't satisfy them. What they built instead was a series of agents, each of which mimicked a specific part of their manual process, and produced standardized 20 to 30 page reports which included executive summaries, risk assessments, and recommendations, which can be read and immediately acted upon. What that resulted in was significantly better results across the board. First, they found more, and more relevant, RFPs by using semantic search instead of keyword matching.
So instead of searching for concrete and getting contracts that had that word in it, they were getting construction contracts that were about the specific things that they worked on, based on an LLM's understanding of what it is that the specific company doing the bidding did. And the processing time for each RFP was cut dramatically, down from multiple hours to something like 10 minutes. So they went from an employee handling something like three tenders per day to 20 to 30. A huge jump in the efficiency of the humans involved. And that represents both a cost savings, because they are working more efficiently, and an actual increase in sales, because they are bidding on things that they would have missed entirely before. For each of these case studies, I'm going to pull out the key things that made them important. So the first takeaway is: why was this such a good use case? It was because this was a bullseye use case for LLMs. LLMs are really good at taking a giant pile of unstructured data like a contract and boiling it down into something shorter and more readable and extracting the salient information. So this is exactly what you want an LLM to do: taking more text and turning it into less text. The second key point is that it really worked well. Going from two or three to 30 RFPs in a day is a huge outcome. And the reason it worked so well is because they already had well-defined processes. They weren't just coming in blind and prompting the LLM with "come up with a recommendation for what we should do from this RFP." They already had, as humans, a multi-stage process with defined inputs and outputs that generated these executive summaries, which is what they wanted, right? So they had five or six different things that they did that were well defined, that went into producing different parts of this report.
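A multi-stage process with defined inputs and outputs, where each stage produces one part of a standardized report, might be sketched like this. It is plain Python rather than the LlamaIndex Workflows API, and the stage names, the `Report` class, and the keyword-based placeholder logic are all assumptions made for illustration; a real stage would call an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Report:
    """Accumulates the sections of a standardized RFP report."""
    rfp_text: str
    sections: dict[str, str] = field(default_factory=dict)


# Each stage reads the report so far and returns the text of one section.
Stage = Callable[[Report], str]


def summarize(report: Report) -> str:
    # Placeholder: a real stage would ask an LLM for an executive summary.
    return report.rfp_text[:60]


def assess_risk(report: Report) -> str:
    # Placeholder rule standing in for an LLM-based risk assessment.
    return "penalty clause found" if "penalty" in report.rfp_text.lower() else "no obvious risks"


def recommend(report: Report) -> str:
    # Later stages can depend on the outputs of earlier ones.
    return "review before bidding" if "penalty" in report.sections["risk_assessment"] else "bid"


# The ordered pipeline mirrors the steps of the existing human process.
PIPELINE: list[tuple[str, Stage]] = [
    ("executive_summary", summarize),
    ("risk_assessment", assess_risk),
    ("recommendation", recommend),
]


def run_pipeline(rfp_text: str) -> Report:
    """Run every stage in order, storing each stage's output under its section name."""
    report = Report(rfp_text)
    for name, stage in PIPELINE:
        report.sections[name] = stage(report)
    return report


report = run_pipeline("Tender for concrete works. Penalty of 5% for late delivery.")
```

The point of the fixed stage list is that the order and the outputs come from the known manual process, not from the model deciding what to do.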
And so they were able to codify that into what this set of six agents did. That translated directly into LlamaIndex workflows. They were taking a human process and turning it into a LlamaIndex workflow. So they weren't inventing a process, they were automating an already defined process. And that makes it a particularly great bet. You're going to get better results if you have a battle-tested human process that you are trying to automate rather than going in from scratch and trying to get the LLM to figure out what to do. The next use case, the next case

### Case Study 2: Pursuit [10:23]

study is a company called Pursuit. They are US-based. They are a business-to-government sales intelligence platform. What that means is they are looking at the business of government, everything from federal organizations down to local county organizations, who are conducting business and require outside contracting services. These opportunities come up in public documents. These public government entities often have public meetings. These public meetings have public minutes. They have strategic plans. They have budgets that they publish. They have council meeting transcripts that they publish. And inside of those is a gold mine of opportunities for companies who want to sell services to those entities. There are over 90,000 of these entities scattered across the United States, and these 90,000 entities have wildly different ways of codifying this information. So it's extremely messy. It's extremely unstructured. It's unstandardized. It's scattered across websites. It's got PDFs. It's got all sorts of formats with no common structure. But it is, again, a multi-billion dollar opportunity for people who are trying to sell services to government entities. Previously, identifying these opportunities was largely manual. They would tune in to individual entities that they knew about. They would look for well-known documents, and they would read them manually with humans to see if there were opportunities represented. This is obviously incredibly slow and laborious, and in fact it is impossible to keep on top of 90,000 different entities. Budget allocations, upcoming initiatives, funding streams, all of that stuff required manual review, and there just weren't enough humans doing it. So a lot of this stuff was just going completely unseen.
What they built at Pursuit for these companies was a massive document ingestion and search system. Using LlamaParse, which I mentioned earlier, they parsed something like 4 million pages of government documents in a single weekend, and they have maintained that volume consistently. Pursuit is one of our larger customers. What the system does is read these documents and extract key data points, both from plain text but also from tables and figures and scanned images, and they make everything searchable and filterable. So what Pursuit provides is a single unified search across all 90,000 government entities to find initiatives, budget items, and opportunities that match what your company is trying to sell to the government. And the results were transformative. They achieved a 25 to 30% increase in accuracy over their previous manual extraction methods. So it's doing better than humans. But also it was doing more than humans. They were finding opportunities that would previously have been entirely missed, by dint of scale. They were able to do so much more and cover so many more entities in so much more detail. So they've made the entire public sector of the US searchable and actionable for the first time. What made this work? First was the scale problem. As I mentioned, manually processing even a fraction of 90,000 entities is simply impossible. But LlamaParse gave them the ability to process millions of pages in a scalable framework. So they were able to tackle a problem that was completely outside of human capabilities before. Second was the accuracy problem. These documents are incredibly messy. They have inconsistent formatting. They have complex tables. They needed parsing that could handle that real-world chaos, not just clean, well-formatted documents. And third, they knew exactly what data points mattered. This is similar to the previous use case.
They weren't just going in and telling the LLM to figure out what is important in this set of document minutes. They knew from their customer base exactly what their customers were looking for. So they were able to have specific fields to extract: budget line items, initiative names, contact information. They had very clear requirements of what information would be relevant in these documents. So they could then run these parsed documents through an LLM and very efficiently extract production-ready results.
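Knowing the exact fields in advance means the LLM's output can be checked against a fixed schema instead of accepted as free text. The sketch below assumes the model returns JSON; the `Opportunity` class, its field names, and the choice of which fields are required are hypothetical and only loosely based on the fields named in the talk.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Opportunity:
    """One extracted record; field names are illustrative, not Pursuit's schema."""
    initiative_name: str
    budget_line_item: str
    amount_usd: Optional[float]
    contact: Optional[str]


# Assumption: these two fields must always be present for a record to be usable.
REQUIRED = ("initiative_name", "budget_line_item")


def parse_extraction(raw_json: str) -> Opportunity:
    """Validate the model's JSON output and convert it into a typed record."""
    data = json.loads(raw_json)
    missing = [key for key in REQUIRED if not data.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    amount = data.get("amount_usd")
    return Opportunity(
        initiative_name=data["initiative_name"],
        budget_line_item=data["budget_line_item"],
        amount_usd=float(amount) if amount is not None else None,
        contact=data.get("contact"),
    )


# Example of what a model response might look like for one parsed document.
llm_output = (
    '{"initiative_name": "Road resurfacing", '
    '"budget_line_item": "Public works", "amount_usd": 250000}'
)
record = parse_extraction(llm_output)
```

Records that fail validation can be retried or routed to a human instead of silently entering the search index.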

### Case Study 3: Scaleport AI [15:27]

Our next study is called Scaleport. They are another consultancy. They work with a leading travel insurance provider, and their document-heavy domain was medical claims. So they had hospital reports, diagnostic images, and handwritten notes submitted from around the world in dozens of formats and, crucially, lots of different human languages. These are very challenging documents to process. They have inconsistent layouts. They have poor scan quality. There are lots of critical details buried in tables or sometimes in handwritten notes scribbled in the margins of things. But it's also medical information, so you have to process this data accurately to be fair. And you also need to be fast, because people are not hanging around waiting to get reimbursed. They want their money right now. Previously every claim was handled manually. A claims adjuster would spend 20 to 40 minutes carefully reading through every medical report, extracting diagnoses and treatments and costs, checking against policy exclusions, and estimating the claim amount. With hundreds of claims per month, the team was constantly underwater. And there was no way to scale this without linearly growing the team, without hiring proportionally more adjusters to deal with the extra scale, which was not economical. So what they built was an AI-powered insurance case analysis agent. The foundation was again LlamaParse, which is about OCR and document parsing. They could extract data from these messy documents: handwritten notes, low-quality scans, and complex tables. Where traditional OCR failed on real-world medical documents, LlamaParse was able to get there. And the agent goes through multiple checks. It has the coverage documents that these medical claims are being made against.
So it can check against exclusion criteria. It can assess whether or not the treatment was relevant to the claim, and it can search for similar historical cases using RAG to find out how those claims went and what human-powered analysis did in previous cases, allowing them to estimate claim amounts based on location and on precedent. So the output was a structured analysis ready for a human adjuster to review. The result was that processing time dropped from 20 to 40 minutes to just 10 minutes per claim, so something like a 75% improvement. So the team could handle something like twice as many claims without adding staff. But a crucial thing here is that the system didn't replace adjusters, right? It augmented them. This was taking over the tedious extraction and initial analysis work and leaving just the final part of what the adjusters were previously doing: the human judgment calls and customer service that is required. So why did this work? First, it was the right parsing technology. They had previously been trying OCR systems that were more traditional, and they just failed on these documents entirely. It wasn't that the quality was poor; they just didn't get any results at all out of traditional OCR. Whereas LlamaParse is purpose-built for this kind of thing. LlamaParse is itself an agent. It is built on top of a number of language models and vision models, and it is able to check its own work, see if it is doing a good job, reflect, and try different strategies. As a result, it gets significantly better results than any kind of one-shot OCR. And like I said earlier, that turns out to be critical for LLM-powered applications. It went from "we can't do this with OCR" to "we can do this with OCR." This was not a percentage increase. This was a zero to one.
The second takeaway here was that they built a complete workflow, not just extraction. The agent doesn't just pull out the data. It's also doing part of the work that the humans were previously doing. It is checking exclusions, searching for precedents, estimating amounts. It is doing a big chunk of the job of what a claims analyst was previously doing. And third, and very importantly, they were designing for human-AI collaboration. They were not attempting to make an agent so smart that it could replace a claims adjuster entirely, partly because that would be very difficult and partly because for regulatory reasons you can't do that. Instead, the system produces structured outputs that adjusters can very quickly verify and refine. So it is augmenting these adjusters, not replacing them, and that is often key to an effective AI use case. LLMs are very smart, but in their current forms they are not usually smart enough to completely replace a human. So systems that attempt to completely replace humans often fail, whereas systems that are accelerating or augmenting existing human processes tend to be a better fit.
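The shape of such a claims workflow (check exclusions, look at precedent, estimate an amount, then hand a structured result to a human) could look roughly like the sketch below. Everything here is an assumption for illustration: the `Claim` and `Analysis` classes, the exact-match exclusion check, and the rule of capping the estimate at the mean of similar past payouts are not Scaleport's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    diagnosis: str
    treatment: str
    billed_amount: float


@dataclass
class Analysis:
    excluded: bool
    estimated_amount: float
    notes: list[str] = field(default_factory=list)
    needs_human_review: bool = True  # the adjuster always makes the final call


def analyze(claim: Claim, exclusions: set[str], precedents: list[float]) -> Analysis:
    """Produce a structured analysis for an adjuster to verify and refine."""
    notes: list[str] = []

    # Step 1: check the diagnosis against policy exclusions (simplified to exact match).
    excluded = claim.diagnosis.lower() in exclusions
    if excluded:
        notes.append(f"diagnosis '{claim.diagnosis}' matches a policy exclusion")

    # Step 2: estimate from precedent. Hypothetical rule: cap the billed amount
    # at the mean payout of similar historical cases.
    if precedents:
        typical = sum(precedents) / len(precedents)
        estimate = min(claim.billed_amount, typical)
        notes.append(f"{len(precedents)} similar historical cases, mean payout {typical:.2f}")
    else:
        estimate = claim.billed_amount
        notes.append("no similar historical cases found")

    return Analysis(
        excluded=excluded,
        estimated_amount=0.0 if excluded else estimate,
        notes=notes,
    )


result = analyze(
    Claim(diagnosis="Fractured wrist", treatment="Cast", billed_amount=1200.0),
    exclusions={"cosmetic surgery"},
    precedents=[900.0, 1100.0],
)
```

The output is deliberately a record with notes rather than a decision, so the adjuster can see why each number was proposed before approving it.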

### Case Study 4: Arcee AI [21:28]

Our next case study is called Arcee AI. They build specialized small language models for enterprise applications. Their document-heavy domain was NLP research papers. They had a corpus of every NLP scientific paper published since 2017. They were stored in one giant S3 bucket totaling something like 4 million pages. These things are dense academic PDFs. They have multiple columns, complex equations, detailed tables. They have complex charts and specialized notation, and they needed to extract all of this to create a high-quality training data set to create a research-focused LLM. So they are ingesting data to train LLMs. Previously they had tried traditional OCR solutions, and they'd also tried open-source OCR alternatives. But what they found was that these tools consistently failed on the most important parts of these documents. The complex tables were mangled, the equations were lost or hallucinated, and charts were sometimes ignored entirely. Manual extraction at this scale, across four million pages, was again totally impractical. They are a startup. They could not do this. So they needed parsing that could handle the full complexity of academic papers. What they built was a large-scale document processing pipeline powered by LlamaParse. LlamaParse's parsing instructions feature was the critical feature here. Like I said, LlamaParse is an agent. So LlamaParse can take natural language instructions, and one of the things that you can do with that is tell it what kind of document it's reading and give it hints about how the information in those documents is going to be arranged and how it should handle edge cases or complex formats or things like that. So that is the feature that they made heavy use of here.
It allowed them to run through a representative sample of these scientific papers over and over to refine their prompts and refine their parsing instructions, to come up with instructions that iteratively improved their accuracy for things like tables and charts and equations. Once they had got that system set up and got the fidelity to a place that they liked, they were able to process all four million pages in a way that preserved the semantic meaning of even very technical content, and that became the foundational data set for fine-tuning a specialized NLP research model. The result was exactly what they wanted. They achieved reliable conversion of PDFs to text with minimal data loss. So the tables were intact, the equations were preserved, and the context was maintained. The iterative prompt tuning allowed them to continuously improve the output quality as they encountered edge cases. But most importantly, they created a complete high-quality data set in a fraction of the time that it would have taken with manual extraction or lower-quality parsing tools. As anybody who has worked with fine-tuning or model training knows, the data set quality directly translates to the performance of the model. So what are the takeaways here? What made this work? Why was this a good use case? First is the complexity of the content. That is one of the things that we have spent a lot of time working on at LlamaIndex: making sure that we are not just good at simple documents but at particularly complicated documents. Academic papers are some of the hardest documents to parse. They are designed for human reading, but not by your average human; they are meant for specialized scientists. So they have complex visual layouts that convey meaning, and LlamaParse was built to handle that. Second was, like I mentioned, the parsing instructions.
They didn't just run parsing once and hope for the best. They could guide the extraction process with the prompts and refine iteratively as they found issues. So parsing was not a black box. It was a controllable process. And third, again, was the scale. Arcee's case was unusual in that this was technically a one-off process. They didn't have to do this over and over. They needed to do it once for four million pages. And when you have a one-off challenge like this, it's very tempting to build something quick and messy. But they found that quick wasn't quick and messy didn't work at all. 4 million pages is too big. It is not a prototype problem. So they handled it as a production workload, and that was what they needed.
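The iterate-on-a-sample loop described for the parsing instructions can be sketched as a small evaluation harness. The parser is passed in as a plain callable with a hypothetical `(document, instructions) -> text` signature, so nothing here depends on the real LlamaParse API; `fake_parse` is a stand-in for demonstration only, and the snippet-survival score is one simple assumed way to measure fidelity.

```python
from typing import Callable

# Hypothetical signature: (document, instructions) -> parsed text.
ParseFn = Callable[[str, str], str]


def score(parsed: str, expected_snippets: list[str]) -> float:
    """Fraction of expected snippets (table cells, equations) that survived parsing."""
    if not expected_snippets:
        return 1.0
    return sum(s in parsed for s in expected_snippets) / len(expected_snippets)


def pick_best_instructions(
    parse: ParseFn,
    candidates: list[str],
    sample: list[tuple[str, list[str]]],
) -> tuple[str, float]:
    """Evaluate each candidate on the sample; return the best one and its mean score."""
    best, best_score = candidates[0], -1.0
    for instructions in candidates:
        scores = [score(parse(doc, instructions), expected) for doc, expected in sample]
        mean = sum(scores) / len(scores)
        if mean > best_score:
            best, best_score = instructions, mean
    return best, best_score


def fake_parse(doc: str, instructions: str) -> str:
    # Stand-in parser: drops the equation unless the instructions mention equations.
    return doc if "equations" in instructions else doc.replace("E=mc^2", "")


# A representative sample: each document paired with content that must survive.
sample = [("Intro text. E=mc^2. Table: 1 | 2", ["E=mc^2", "1 | 2"])]
candidates = [
    "This is a research paper.",
    "This is a research paper. Preserve equations as LaTeX.",
]
best, best_score = pick_best_instructions(fake_parse, candidates, sample)
```

Running the full corpus only after the sample score is acceptable is what keeps a one-off job of this size from turning into repeated full re-parses.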

### Case Study 5: 11x.ai [26:40]

The final case study is from 11x, which you've probably heard of. They are a company that builds AI sales development representatives. So they have AI agents like Alice, who is their flagship SDR. What Alice does is handle outbound sales: researching leads, personalizing campaigns, booking meetings. Their document-heavy challenge was not in the flow of their SDR but in the onboarding to that SDR. Enterprise clients would sign up to 11x with hundreds or thousands of products that they sold, and they needed to be able to train this AI SDR to understand all of these documents so that the AI SDR could talk about what was available quickly. These documents were everywhere. They were PDFs, they were PowerPoints, they were websites, sometimes they were audio recordings of calls, in multiple formats, completely unstructured. And how they were doing this before was manual customer onboarding. Individual humans were white-glove taking each new customer on board, reading all of their documents and turning them into training materials for the AI SDR. Someone had to write individual sequences, they had to craft personalization, they had to import context from all of these sources. And this was a critical scaling problem for 11x's business, because again, they would need to hire humans in proportion to the onboarding they were doing. This was especially a problem as they went down into mid-market companies, which had smaller contract values, where the white-glove service didn't make economic sense. So sometimes they were facing bottlenecks of weeks or months waiting for humans to be available to do this onboarding process. They evaluated building an in-house OCR pipeline, but it required massive engineering effort, produced quality issues, and needed dedicated maintenance that they didn't want to get into. This was not their core business.
Their core business is not document ingestion; it is AI SDRs. So what they built was an automated knowledge ingestion system, again powered by LlamaParse. It handles multimodal content: PDFs and Word docs and web pages, but also audio. Clients just drop their resources into a shared Google Drive, and the system automatically ingests everything. It extracts context about products. It extracts marketing information about how you message the product and position the product. And the AI agent then uses this knowledge base to generate campaign messaging at scale, just like a human SDR would. The impact was great. SDR onboarding time dropped from weeks to days. And their solution moved from prototype to production in just three days. This was one of our fastest onboardings ever of our own customers. They went from building this thing to "this is how we do stuff now" in three days, because there was minimal engineering effort beyond the initial setup. They also found that after launch, adoption was immediate. Users who had hesitated to migrate quickly embraced doing things the new way. And the system now auto-ingests all sorts of resources and crafts messaging at scale, enabling teams to roll out campaigns faster and at higher quality than they previously did. What made this succeed? First was that they chose to buy instead of build. Building a custom OCR pipeline would have consumed engineering resources and created an ongoing maintenance burden. LlamaParse was an API that they could just pick up and use without having to figure out all the scaling and maintenance. Second, the developer experience mattered. Most data science people are used to working in Python. But LlamaIndex, as I mentioned right at the beginning, has a robust TypeScript SDK. 11x's team was built around TypeScript, so this fit into their existing processes.
And third, they got fine-grained control, so they could toggle deep parsing for complex documents and lightweight extraction for simple documents. So they were able to do this cost-effectively. And that is how you make practical production deployment work. It doesn't just have to work once; it has to work at scale and it has to be cost-effective.
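Toggling between deep parsing and lightweight extraction amounts to a routing decision per document. The sketch below shows one way that decision and its cost effect could be modeled; the mode names `"deep"` and `"light"`, the `DocInfo` fields, the routing heuristic, and the relative per-page costs are all invented for illustration and are not real LlamaParse options or pricing.

```python
from dataclasses import dataclass


@dataclass
class DocInfo:
    name: str
    pages: int
    has_tables: bool
    is_scanned: bool


def choose_mode(doc: DocInfo) -> str:
    """Return 'deep' for documents likely to need heavier parsing, else 'light'."""
    if doc.is_scanned or doc.has_tables:
        return "deep"
    return "light"


# Illustrative relative costs per page; not real pricing.
COST_PER_PAGE = {"light": 1, "deep": 10}


def estimate_cost(docs: list[DocInfo]) -> int:
    """Total relative cost when each document is routed to its own mode."""
    return sum(COST_PER_PAGE[choose_mode(d)] * d.pages for d in docs)


docs = [
    DocInfo("faq.docx", pages=4, has_tables=False, is_scanned=False),
    DocInfo("pricing.pdf", pages=10, has_tables=True, is_scanned=False),
]

routed_cost = estimate_cost(docs)
all_deep_cost = sum(COST_PER_PAGE["deep"] * d.pages for d in docs)
```

With these made-up numbers, routing costs 104 units against 140 for parsing everything in the deep mode; the saving grows with the share of simple documents.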

### Key Takeaways from Case Studies [31:21]

So in all of these case studies uh I have been giving you takeaways but what should your takeaway from all the takeaways be? Um the top level stuff the really uh big points are first that you've got to have a problem in the right domain. Uh if you've got a ton of unstructured data in some form and you probably do uh then llama index is engineered to be good at that thing. If you just want generic agents, uh you can build those in Lama Index, but you're not going to get anything extra out of us. Llama index is designed to be good at document agents specifically. Uh second, workflows are not just a buzzword that we use. Llama index workflows are a direct reflection of business processes and guidelines. Um so, as I mentioned a couple of times, you need to know what your business is before you can successfully automate it. You've got to have your processes defined in a way that a human can understand before you can give them to a robot. Uh you can't just hope the LLM comes up with a good way of handling your data. Uh you've got to come up with guidelines and guard rails and hardcode [clears throat] them into the system to get better reliability. Uh and thirdly, as I believe I've hammered home in this, uh the parsing quality is absolutely paramount. Uh when you're working with unstructured data, um there is a reason why we talk about llama parse all the time and it is because the difference between badly parsed documents and well parsed documents is the difference between a system that doesn't work and a system that does. It is not just a percentage improvement. It is a uh qualitative leap from uh non-functional to functional. Um and that is all I've got for you today. I hope this has been a useful look at a business level of what Llama Index actually does uh and how you use it. Um and I'm happy to answer more detailed questions about the technical side of how this framework actually does this stuff or details about the case studies. — Neat. Thank you so much for that. 
Let's jump into some questions. In case people have not seen this yet, I'm going to share a link in the Zoom chat. Feel free to upload any questions that you have, and then we can go from top to bottom. The most upvoted question

### Addressing Data Privacy and Access Control [33:40]

was actually around data privacy. Does LlamaIndex have an on-prem solution, or how should we think about using LlamaParse or LlamaIndex in these privacy-sensitive situations? — That is an excellent question. I'm surprised I didn't mention it. We have both an on-prem and a SaaS solution, and the SaaS solution exists both in the US market and in a dedicated EU market. So for a lot of our customers, data locality is enough. We can sign data processing agreements and things like that to guarantee the privacy of the data. It's never used for training, and so on. A lot of people come in the door at that level. But there are some customers for whom even that is not enough, and they're like, no, we need to keep everything inside of our network. For them we have a completely on-premise solution. — Can you also talk a little bit more about access-level controls? I think some people are on the same track as well, asking about access control. — That is a good question. We build that in at the API level. Especially when we're talking about systems that have existing access controls, like SharePoint, which is one of the data sources that we integrate with, at the API level we respect the same data-level controls that SharePoint, or whatever source you're connecting to, does. So you can be sure that the results you're getting are based on data that the human who is asking the question actually has access to. — Makes sense. Earlier you talked about

### LlamaIndex's Built-in Agents and Customization [35:23]

specific agents in the document intelligence space, but could you also talk a little bit more about the built-in agents that people are commonly using? I think someone asked about summarization agents. I know we used to see the map-reduce type of summarization. Could you talk a little bit more about the generic document agents that LlamaIndex provides out of the box? — Yeah, so our experience of the market is that generic agents do not perform particularly well. So when we provide you with an agent, it is in the form of a template that we would expect you to take and customize to your own needs. We have a lot of existing use cases, in the finance space like invoice reconciliation, in the medical space like medical record parsing, where we would give you an agent that is the 80% solution to your business. But we find that every customer will need to customize the agents to their specific domain. — I just saw a question in the Zoom chat around this LlamaAgents product. Can you talk a little bit more about how that differs from the other offerings that you have? — Yeah, that's an excellent question. LlamaAgents are a systematized version of those templates that I was just talking about. Previously, what we did was hand you a pile of code and say, here is this template agent that you can take and customize and deploy inside of your system. LlamaAgents turn that into a one-click process. You can pick your template and have it automatically deployed on our cloud or on your on-prem installation, build your prototype in a no-code way, and then fall back to the code to get the remaining customization that you need. — Makes sense. 
I guess another question here is mostly around some of the stuff that we talk about in the course, which is that it's very valuable to think about the different data types that might be coming in. When people start building out RAG agents, they just say, I have all these documents, let me chunk everything and throw it into a vector database. But in reality we have to think a little bit more about what data we can extract and filter on. Maybe there's a contract data type versus an RFI data type. How does LlamaIndex think about these kinds of ontologies and breaking the document space down into specialized indices? — That is definitely a pattern that we see all the time in production: multiple indices, split by data type, just to avoid confusion. And again, it's been such a common use case that we built a service into LlamaCloud called LlamaClassify. All of our things are called Llama-something. It acts as the gateway to your indexing process. You give it a bunch of ontological rules, and it will classify your documents before ingesting them into the correct index. — Makes sense. I think one

### Ingestion Infrastructure and Challenges [38:57]

question I'm actually also very curious about is how you think about the general ingestion and online architecture. Most other companies I've seen just do some kind of Kafka queue where data is being read in and then saved somewhere else. How much of this ingestion infrastructure does LlamaIndex handle? — Entirely. That is part of the secret sauce: it's how we managed to scale to so many documents so fast. But anybody who's built a distributed system would recognize a lot of our architecture. — Can you talk a little bit more about that, or is that really the secret sauce? — I mean, it's not the part of the system that I work on personally, so it's the part where I'm hand-waving most. But there is a ton of Docker agents and a lot of Helm under the hood, scaling up for bursty workloads to make sure that we can scale out and handle millions of documents in parallel. — Mhm. On the same side of these ingestion challenges, one of the questions someone had was around the main challenges when implementing these systems with LlamaIndex. How do we know we're ready to come to you, and how do we prepare ourselves for success working with the team? — That is an excellent question. I think it's mostly what I covered in the talk. Understanding your domain first is the key. When we see failure, it's when people are expecting LLMs to be magic and say: I have this pile of complicated documents, I have a goal, can we build an agent that achieves the goal? LLMs are not smart enough to do that yet. 
You have to know how you, as a human, would have solved that problem, so that you can codify it and build a workflow that understands how this is going to work. And it can be very complicated. It can be: read a contract and tell me what the rules are from the contract. But that is a step that the LLM wouldn't have worked out on its own: oh, I have to read a contract and figure out what the rules are. So the more of your process that you can walk in the door with, the better your adoption is going to go. — Makes sense. I think another question was around OpenAI's AgentKit and things like that. Given that OpenAI has come out with these kinds of tools, I think some folks would love your thoughts on how you think about using something like AgentKit, or LlamaIndex, or building from scratch. I don't know if he's muted real quick. I don't know if I can hear you. I don't know if it's just me. — No, I can hear you. — Okay, great. — Okay, now I can. — Perfect. — Okay, excellent. My dodgy AirPods ran out of battery. — At the perfect time. — I know. What was the last question you asked me?
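The point above about codifying a human process into explicit steps, with guardrails hard-coded rather than left to the model, can be sketched in plain Python. This is a minimal illustration and not the LlamaIndex Workflows API: `ContractState`, `extract_rules`, `check_guardrails`, `run_workflow` and `fake_llm` are hypothetical names, and the canned model response is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ContractState:
    """Data handed from one workflow step to the next (hypothetical shape)."""
    contract_text: str
    rules: List[str] = field(default_factory=list)
    approved: bool = False


def extract_rules(state: ContractState, llm: Callable[[str], str]) -> ContractState:
    # Step 1: the explicit "read the contract and list the rules" step.
    # A human defined this step; the model is only asked to carry it out.
    prompt = "List the payment rules in this contract, one per line:\n" + state.contract_text
    raw = llm(prompt)
    state.rules = [line.strip() for line in raw.splitlines() if line.strip()]
    return state


def check_guardrails(state: ContractState) -> ContractState:
    # Step 2: a hard-coded guardrail. If nothing was extracted, stop
    # instead of letting later steps act on an empty rule set.
    if not state.rules:
        raise ValueError("No rules extracted; route this contract to a human reviewer")
    state.approved = True
    return state


def run_workflow(contract_text: str, llm: Callable[[str], str]) -> ContractState:
    # The step order is fixed in code, mirroring the manual process.
    state = ContractState(contract_text=contract_text)
    state = extract_rules(state, llm)
    return check_guardrails(state)


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns an invented, canned answer.
    return "Invoices are due within 30 days\nLate payments incur a 2% fee"
```

Calling `run_workflow("Payment terms: net 30.", fake_llm)` returns a state with two rules and `approved` set to `True`, while a model that returns nothing trips the guardrail. The design choice is that the sequence of steps and the failure condition live in code, so reliability does not depend on the model inventing the process.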

### Choosing the Right Technology for Your Needs [42:20]

— I think people just saw AgentKit, for example, which seems like a very simple no-code agent builder. On the other end of that is building everything from scratch. How would you walk people through the decision-making when choosing between these technologies? — We have our own low-code solution for the same reason, which is that if you're building a prototype, the initial steps in putting together a document ingestion pipeline are often similar enough that you can use drag and drop to say, this document goes here, then an LLM acts on it, and then it goes over there. So for getting a prototype going, it's an excellent solution. For any kind of real-world use case where you need fine-grained control, and you have edge cases, and you have specific external systems that you want to integrate with that were not conceived of by the drag-and-drop system, you're going to want to drop down to code. So we think of those sorts of things as being useful but not sufficient. And in particular with OpenAI, and this is more of a personal opinion, I think a model provider is the wrong place to put that. One of the ways that we get superior results is that we use all of the models at the same time. Like I said, LlamaParse is an agent. It's talking to every model and using each model for the thing that it is best at. We have hundreds and hundreds of benchmarks that split up workloads according to what kind of workload it is, and there's no one winning model. So I would be very hesitant, if I were building an agent, to tie that agent to a specific frontier model. — That makes sense. 
Which I think leads perfectly to the top question right now: when it comes to things like document parsing and generating reports, which LLMs or VLMs has the team been using? How should we think about distributing this work? — Like I said, we are using all of them. We are enormous customers of OpenAI. If you watched Dev Day yesterday, there was that wall of people who have sent billions of tokens through, and Jerry was on the wall. We are also big Anthropic customers, and Gemini customers as well. We are using all of those models for different things. Like I said, we have a lot of benchmarks and a lot of different document types that we run through this stuff, and we have not found a clear winner, even within a specific model provider. There's still some stuff where we will fall back to Gemini 1.5 because Gemini 2.5 is not as good at it, or we'll fall back to Claude 4 because 4.5 is not as good. So I think that's one of our value adds: we are doing all of that tedious work of figuring out what is the best model to use in what case, and doing it for you. — That makes sense. It speaks to the importance of having evals. Let me take another minute to find other questions here. One question I'd like to ask, to buy us some time really, is: what is the thing that you feel folks are not asking themselves as they build agents? You talked about having well-defined workflows and understanding your data sets, but what is the common mistake you feel folks are still making these days?
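The idea of choosing a model per workload from benchmark results, rather than tying an agent to one model, can be sketched as a lookup over scores. This is only an illustration under stated assumptions: the document types, the model names `model-a` and `model-b`, the scores, and the `best_model` helper are all hypothetical and do not describe how LlamaParse is implemented.

```python
from typing import Dict, Tuple

# Hypothetical benchmark table: (document_type, model_name) -> accuracy in [0, 1].
# These numbers are invented for the example.
BENCHMARKS: Dict[Tuple[str, str], float] = {
    ("scanned_table", "model-a"): 0.91,
    ("scanned_table", "model-b"): 0.84,
    ("slide_deck", "model-a"): 0.78,
    ("slide_deck", "model-b"): 0.88,
}


def best_model(document_type: str,
               benchmarks: Dict[Tuple[str, str], float] = BENCHMARKS) -> str:
    """Return the model with the highest benchmark score for this document type."""
    candidates = {
        model: score
        for (doc_type, model), score in benchmarks.items()
        if doc_type == document_type
    }
    if not candidates:
        # No benchmark data: fail loudly rather than guess a default model.
        raise KeyError("No benchmark results for document type: " + document_type)
    return max(candidates, key=candidates.get)
```

With this invented table, `best_model("scanned_table")` picks `model-a` and `best_model("slide_deck")` picks `model-b`, which is the "no one winning model" situation described above. Keeping the routing table separate from the agent code means a new benchmark run can change the choice without touching the agent.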

### Common Pitfalls and Best Practices [46:08]

— I think a really common early pitfall that we see with our enterprise customers is that they come in with a generic set of complicated documents. Our white whale is people who walk in with handwriting. They're like, "Here's a scanned page of handwriting. Can you OCR it?" And we're like, "Yes." But you're almost never going to have a scanned page of handwriting in a practical case. What you're actually going to have is your sales team's PowerPoint deck, which has a very specific format that your sales team has refined over the course of years, and all of their PowerPoint decks look the same. We should be training on your PowerPoint decks. We should not be proving ourselves on a scanned page of handwriting, because that's not what your actual workflow looks like. So the more real-world data they can give us from the get-go, the better and faster the integration goes. — That totally makes sense. I've definitely seen demos like that in the past, where either it's perfect data in the evals and everything scores super high, or it's data that doesn't look anything like the production data and you get really poor results. I think with

### Conclusion and Final Thoughts [47:29]

that said we can wrap things up. We can have a little bit of time before the next talk and let people stretch their legs. But if there's anything else you wanted to say to the cohort, now's the time. — No, I think this has been a very unusual talk, in that I usually talk very much at the code level: this is how you use this and integrate it. So it's been a refreshing change to be talking at the business level for once. — Yeah. I'm definitely looking forward to hearing a little bit more about the security things as well. Thinking about things like on-prem and access-level controls has been a big conversation point for a lot of the students. Thank you, Laurie. We'll see you guys. — Thanks, everybody. — 8 minutes. — See you.