## Features
- Parse complex PDF documents (hundreds of pages)
- Chunk them and compute embeddings for each chunk
- Compute embeddings efficiently using batch mode
- Store the computed embeddings in a vector database for fast retrieval
- When the user asks a question, send the question and the relevant document chunks to an LLM to generate the answer
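The question-answering step above can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes Nebius AI Studio exposes an OpenAI-compatible endpoint (the `base_url` below is an assumption), that the relevant chunks have already been retrieved, and that the model name is one of the listed open-source options.

```python
# Sketch: pack the user's question plus retrieved chunks into one prompt
# and send it to an LLM on Nebius AI Studio (OpenAI-compatible API assumed).
import os


def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine retrieved document chunks and the question into a single prompt."""
    context = "\n\n".join(f"[chunk {i + 1}]\n{c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(question: str, chunks: list[str], model: str = "Qwen/Qwen3-32B") -> str:
    # Imported here so the pure helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.studio.nebius.com/v1/",  # assumed endpoint
        api_key=os.environ["NEBIUS_API_KEY"],          # assumed variable name
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": build_prompt(question, chunks)}],
    )
    return resp.choices[0].message.content
```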
## Tech Stack
- PDF parsing: llama-index
- Embedding model: Qwen/Qwen3-Embedding-8B, running on Nebius AI Studio
- Vector database: Milvus
- LLMs: open-source LLMs (GPT-OSS / Qwen3 / DeepSeek) running on Nebius AI Studio
## Prerequisites
- A Nebius API key. Sign up for free at AI Studio.
## Getting Started

1. Get the code:

2. Install dependencies:

   If using `uv` (preferred):

   If using `pip`:
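A sketch of the install step, assuming the standard layouts each tool expects (a `pyproject.toml` for `uv`, a `requirements.txt` for `pip`); the actual file names in this repo may differ:

```shell
# If using uv (preferred): creates a virtual environment and installs dependencies.
uv sync

# If using pip: install into your own virtual environment.
pip install -r requirements.txt
```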
3. Create a `.env` file in the project root and add your Nebius API key:
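For example (the variable name `NEBIUS_API_KEY` is an assumption; match whatever name the code reads):

```shell
# .env — keep this file out of version control
NEBIUS_API_KEY=your_nebius_api_key_here
```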
## Running the Agent

If using `uv`:

If using `pip`:
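A sketch, assuming the agent's entry point is a script named `main.py` (a hypothetical name; substitute the actual entry point):

```shell
# With uv (runs inside the project's virtual environment):
uv run main.py

# With pip (after activating your virtual environment):
python main.py
```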
## Process PDFs

We will:

- parse the PDF files
- compute embeddings
- store them in the vector database
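The three steps above can be sketched as follows. This is a minimal sketch, not the project's actual code: the directory name `data/`, the collection name, the local Milvus Lite file, the batch size, and the Nebius endpoint are all assumptions, and the third-party imports are deferred into `ingest()` so the pure batching helper stands on its own.

```python
# Sketch of the ingestion pipeline: parse PDFs -> chunk -> embed in batches
# -> store in Milvus. Names marked "assumed" are illustrative, not from the repo.
import os


def batched(items: list, size: int) -> list[list]:
    """Split items into consecutive batches for batch-mode embedding calls."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def ingest(pdf_dir: str = "data", batch_size: int = 32) -> None:
    # Third-party imports kept local so the helper above is importable alone.
    from llama_index.core import SimpleDirectoryReader
    from llama_index.core.node_parser import SentenceSplitter
    from openai import OpenAI
    from pymilvus import MilvusClient

    # 1. Parse the PDF files (llama-index selects a PDF reader automatically).
    docs = SimpleDirectoryReader(pdf_dir, required_exts=[".pdf"]).load_data()

    # 2. Chunk the documents into overlapping text pieces.
    splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
    nodes = splitter.get_nodes_from_documents(docs)
    texts = [n.get_content() for n in nodes]

    # 3. Compute embeddings in batch mode via Nebius's OpenAI-compatible API
    #    (base_url is an assumption).
    client = OpenAI(
        base_url="https://api.studio.nebius.com/v1/",
        api_key=os.environ["NEBIUS_API_KEY"],
    )
    vectors = []
    for batch in batched(texts, batch_size):
        resp = client.embeddings.create(model="Qwen/Qwen3-Embedding-8B", input=batch)
        vectors.extend(d.embedding for d in resp.data)

    # 4. Store text + embeddings in Milvus (local Milvus Lite file, assumed).
    store = MilvusClient("milvus_rag.db")
    store.create_collection(collection_name="pdf_chunks", dimension=len(vectors[0]))
    store.insert(
        collection_name="pdf_chunks",
        data=[{"id": i, "vector": v, "text": t}
              for i, (v, t) in enumerate(zip(vectors, texts))],
    )


# Usage: ingest("path/to/pdf/folder")
```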