Build a Local AI-Powered Document Summarization Tool (2025)

Learn how to build a simple document summarizer using Streamlit for the interface and Ollama for running AI models locally.

By

Vamsi Kavuri

Feb. 24, 25 · Tutorial


When I began my journey into the field of AI and large language models (LLMs), my initial aim was to experiment with various models and learn about their effectiveness. Like most developers, I also began using cloud-hosted services, enticed by the ease of quick setup and availability of ready-to-use LLMs at my fingertips.

But I quickly ran into a snag: cost. Cloud-hosted LLMs are convenient, but the pay-per-token model adds up fast, especially when you are processing long documents or asking many questions. I realized I needed a way to learn and experiment with AI without blowing my budget, and that is where Ollama came in with a rather interesting solution.

By using Ollama, you can:

  • Load and experiment with multiple LLMs locally
  • Avoid API rate limits and usage restrictions
  • Customize and fine-tune LLMs

In this article, we will explore how to build a simple document summarization tool using Ollama, Streamlit, and LangChain: Ollama runs LLMs locally, Streamlit provides a web interface for interacting with them, and LangChain supplies pre-built chains that simplify development.

Environment Setup

  • Ensure Python 3.12 or higher is installed.
  • Download and install Ollama.
  • Fetch the llama3.2 model via ollama run llama3.2.
  • I prefer Conda for managing dependencies and creating isolated environments. Create a new Conda environment, then install the packages below.

Shell

pip install streamlit langchain langchain-ollama langchain-community langchain-core pymupdf
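If you use Conda, the same environment can also be described declaratively in an environment.yml file. This is just a sketch of the setup steps above; the environment name local-summarizer is my own choice, not from the tutorial:

```yaml
name: local-summarizer
channels:
  - defaults
dependencies:
  - python=3.12
  - pip
  - pip:
      - streamlit
      - langchain
      - langchain-ollama
      - langchain-community
      - langchain-core
      - pymupdf
```

You would create and activate it with conda env create -f environment.yml followed by conda activate local-summarizer.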

Now, let's dive into building our document summarizer. We will start by creating a Streamlit app to handle uploading documents and displaying summaries in a user-friendly interface.

Next, we will focus on extracting text from the uploaded documents (the tool supports only PDF and plain-text files) and preparing everything for the summarization chain.

Finally, we will bring in Ollama to perform the summarization, using its locally hosted model to generate concise, informative summaries.
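Before looking at the full implementation, it helps to see what the map_reduce summarization strategy does conceptually: the text is split into overlapping chunks, each chunk is summarized independently (map), and the partial summaries are then combined into one final summary (reduce). The sketch below is a simplified pure-Python illustration, not LangChain's actual implementation; summarize_chunk is a stand-in for the LLM call, and split_text mimics a fixed-size character splitter:

```python
def split_text(text, chunk_size=500, chunk_overlap=100):
    """Naive fixed-size splitter, a stand-in for CharacterTextSplitter."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def summarize_chunk(chunk):
    """Stand-in for an LLM call: here we just keep the first sentence."""
    return chunk.split(". ")[0]

def map_reduce_summary(text):
    # Map step: summarize each chunk independently
    partial_summaries = [summarize_chunk(c) for c in split_text(text)]
    # Reduce step: combine the partial summaries into one final summary
    return summarize_chunk(" ".join(partial_summaries))
```

In the real chain, both the map and reduce steps are prompts sent to the llama3.2 model; the overlap between chunks helps preserve context that would otherwise be cut mid-sentence.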

The code below contains the complete implementation, with detailed comments to guide you through each step.

Python

import os
import tempfile
import streamlit as stlit
from langchain_text_splitters import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain_ollama import OllamaLLM
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_core.documents import Document

# Create the Streamlit app: page configuration, title, and a file uploader
stlit.set_page_config(page_title="Local Document Summarizer", layout="wide")
stlit.title("Local Document Summarizer")

# File uploader that accepts PDF and txt files only
uploaded_file = stlit.file_uploader("Choose a PDF or Text file", type=["pdf", "txt"])

# Process the uploaded file and extract text from it
def process_file(uploaded_file):
    if uploaded_file.name.endswith(".pdf"):
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            temp_file.write(uploaded_file.getvalue())
        loader = PyMuPDFLoader(temp_file.name)
        docs = loader.load()
        extracted_text = " ".join([doc.page_content for doc in docs])
        os.unlink(temp_file.name)
    else:
        # Read the content directly for text files, no need for a tempfile
        extracted_text = uploaded_file.getvalue().decode("utf-8")
    return extracted_text

# Process the extracted text and return a summary
def summarize(text):
    # Split the text into chunks for processing and create Document objects
    chunks = CharacterTextSplitter(chunk_size=500, chunk_overlap=100).split_text(text)
    docs = [Document(page_content=chunk) for chunk in chunks]
    # Initialize the LLM with the llama3.2 model and load the summarization chain
    chain = load_summarize_chain(OllamaLLM(model="llama3.2"), chain_type="map_reduce")
    return chain.invoke(docs)

if uploaded_file:
    # Process and preview the uploaded file content
    extracted_text = process_file(uploaded_file)
    stlit.text_area("Document Preview", extracted_text[:1200], height=200)
    # Generate a summary of the extracted text
    if stlit.button("Generate Summary"):
        with stlit.spinner("Summarizing...may take a few seconds"):
            summary_text = summarize(extracted_text)
            stlit.text_area("Summary", summary_text['output_text'], height=400)

Running the App

Save the above code snippet into summarizer.py, then open your terminal, navigate to where you saved the file, and run:

Shell

streamlit run summarizer.py

That starts your Streamlit app and automatically opens it in your web browser at a local URL like http://localhost:8501.

Conclusion

You've just built a document summarization tool by combining Streamlit's simple UI with Ollama's local model hosting. This example uses the llama3.2 model, but you can experiment with other models to find what works best for your needs; you could also add support for more document formats, error handling, and configurable summarization parameters.

Happy AI experimenting!

