This open-source application lets users ask questions of various models (such as ChatGPT, Ollama, and Gemini) and receive answers. It also supports uploading documents and getting answers grounded in their content. Users can customize the processing pipeline by selecting different options for models, text extraction, splitting, embedding generation, and storage, providing a single platform that brings all of these capabilities together.
Model Selection: Users can choose from various models (ChatGPT, Ollama, Gemini, etc.) to answer their queries.
Document Upload: Supports uploading documents in various formats (PDF, text, Excel, etc.).
Text Extraction: Users can select different text extractors for extracting text from documents.
Text Splitting: Provides multiple options for splitting the text into chunks (Langchain RecursiveCharacterTextSplitter, etc.).
Embedding Generation: Users can choose from several transformers (SentenceTransformer, etc.) for generating embeddings.
Storage Options: Supports multiple storage options for embeddings (Faiss, etc.).
Customizable Pipeline: The platform allows users to customize each step of the processing pipeline to suit their needs.
Local and Cloud Deployment: The application can run both locally and on cloud platforms, ensuring data privacy when run locally.
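The pipeline these features describe (extract → split → embed → store) can be sketched as follows. This is a toy illustration with stand-in components, not the actual QueryGenie code: a real run would swap in a PDF extractor, Langchain's RecursiveCharacterTextSplitter, a SentenceTransformer model, and a Faiss index.

```python
# Toy stand-ins for each configurable pipeline stage.

def extract_text(raw: bytes) -> str:
    """Toy extractor: assumes the upload is plain UTF-8 text."""
    return raw.decode("utf-8")

def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Naive fixed-size splitter with overlap, similar in spirit to
    Langchain's RecursiveCharacterTextSplitter."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(chunks: list[str]) -> list[list[float]]:
    """Toy embedding: character-frequency vectors (a real pipeline would
    call e.g. SentenceTransformer here)."""
    return [[chunk.count(c) / len(chunk) for c in "etaoinshr"] for chunk in chunks]

# In-memory "vector store" mapping chunk id -> (chunk, embedding).
store: dict[int, tuple[str, list[float]]] = {}
chunks = split_text(extract_text(
    b"QueryGenie answers questions over your documents." * 5))
for i, (chunk, vec) in enumerate(zip(chunks, embed(chunks))):
    store[i] = (chunk, vec)
```

Each stage is an independent function, which is what makes every step user-selectable in the UI.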
Python 3.8+
PDM (Python Dependency Manager)
Docker
Clone the repository:
git clone https://github.com/gokulnath30/QueryGenie.git
cd QueryGenie

Build the Docker container (Docker image names must be lowercase):

docker build -t querygenie .

Install the dependencies using PDM:

pdm install

Run the application locally:

streamlit run app.py

Launching the Application: Open your browser and go to http://localhost:8501 to access the Streamlit UI.
Uploading a Document:
Click on the upload button and select your document (PDF, text, Excel).
Choose the text extractor from the provided options.
Customizing the Processing Pipeline:
Select the text splitter to chunk the document.
Choose the transformer to generate embeddings.
Pick the storage option to store embeddings.
Asking Questions:
Select the model you want to use (ChatGPT, Ollama, Gemini).
Enter your question in the input box and submit.
View the answer generated based on the selected options and document content.
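Under the hood, answering a question amounts to retrieving the stored chunks closest to the question's embedding and passing them to the chosen model as context. A minimal sketch of the retrieval half, using plain cosine similarity over an in-memory store (the store contents and embeddings below are illustrative; the real app would query the selected storage backend, e.g. Faiss, then call ChatGPT, Ollama, or Gemini):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question_vec: list[float],
             store: dict[int, tuple[str, list[float]]],
             k: int = 2) -> list[str]:
    """Return the k chunks whose embeddings are most similar to the question."""
    ranked = sorted(store.items(),
                    key=lambda kv: cosine(question_vec, kv[1][1]),
                    reverse=True)
    return [chunk for _, (chunk, _) in ranked[:k]]

# Toy store: chunk id -> (chunk text, embedding).
store = {
    0: ("Faiss stores embeddings in memory or on disk.", [0.9, 0.1]),
    1: ("Streamlit renders the upload widget.", [0.1, 0.9]),
}
top = retrieve([1.0, 0.0], store, k=1)
# The retrieved chunks are then sent as context to the selected model.
```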
The application allows for easy customization through the Streamlit UI. Here are the configurable components:
Text Extractors: Choose from a variety of text extraction methods.
Text Splitters: Options for splitting the text into manageable chunks.
Transformers: Select from different transformers for embedding generation.
Storage Options: Pick the preferred storage for embeddings.
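One common way to wire such UI choices to implementations is a small registry per component. The option keys and callables below are hypothetical stand-ins, not QueryGenie's actual configuration:

```python
# Hypothetical component registries: each Streamlit selectbox value maps
# to a callable. Keys and implementations here are illustrative only.
EXTRACTORS = {"plain-text": lambda raw: raw.decode("utf-8")}
SPLITTERS = {"fixed-size": lambda t: [t[i:i + 50] for i in range(0, len(t), 50)]}
EMBEDDERS = {"toy-freq": lambda chunks: [[c.count("e") / len(c)] for c in chunks]}
STORES = {"in-memory": lambda pairs: dict(enumerate(pairs))}

def build_pipeline(extractor: str, splitter: str, embedder: str, storage: str):
    """Compose the user's selections into one document-processing function."""
    ex, sp, em, st = (EXTRACTORS[extractor], SPLITTERS[splitter],
                      EMBEDDERS[embedder], STORES[storage])
    def run(raw: bytes):
        chunks = sp(ex(raw))
        return st(zip(chunks, em(chunks)))
    return run

pipeline = build_pipeline("plain-text", "fixed-size", "toy-freq", "in-memory")
index = pipeline(b"QueryGenie lets you mix and match every pipeline stage.")
```

Because each stage is looked up by name, adding a new extractor, splitter, transformer, or store is just another registry entry.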