Try the Full Application
The demo runs on a GPU-powered server with Mistral 7B, Qwen 2.5 7B, and Phi-3 14B. Create an account to explore the full memory system with your own data.
Launch Demo Application
What You Can Explore
Memory Tiers
See how short-term, long-term, and persistent memory work together to provide context.
Data Ingestion
Connect data sources and watch your personal knowledge base grow in real-time.
Multiple Models
Switch between Mistral, Qwen, and Phi-3 to compare how different models handle your context.
Context Search
Search your memories semantically: find information by meaning, not just keywords.
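Under the hood, this kind of search typically embeds each memory as a vector and ranks results by similarity rather than exact word overlap. The sketch below illustrates the general technique using the sentence-transformers library and cosine similarity; Memory's own retrieval pipeline may use a different embedding model or vector store, so the model name and sample data here are illustrative assumptions.

# Illustrative sketch of embedding-based search, not Memory's actual code.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

memories = [
    "Booked flights to Lisbon for the first week of June.",
    "Dentist appointment moved to Thursday afternoon.",
    "Quarterly report draft shared with the finance team.",
]

# Encode the stored memories and the query into dense vectors.
memory_vecs = model.encode(memories, convert_to_tensor=True)
query_vec = model.encode("When am I travelling to Portugal?", convert_to_tensor=True)

# Rank by cosine similarity: "Lisbon" matches "Portugal" by meaning,
# even though the query shares no keywords with that memory.
scores = util.cos_sim(query_vec, memory_vecs)[0]
best = int(scores.argmax())
print(memories[best], float(scores[best]))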
Demo Limitations
The public demo has some limitations compared to running Memory locally:
- Shared resources - GPU inference is shared across users, so responses may be slower during peak times
- Limited data sources - Some data source adapters require local file access and aren't available in the demo
- Limited session persistence - Demo accounts may be cleaned up periodically
Run It Locally
For the full experience, clone the repository and run Memory on your own machine. With Ollama installed, you can run entirely offline with complete privacy.
Getting Started Locally
# Clone the repository
git clone https://github.com/rod-higgins/memory.git
cd memory
# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -e ".[web]"
# Install Ollama and pull a model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull mistral:7b
# Run the application
python -m memory.web.app
Visit http://localhost:8765 and create your first account. The setup wizard will guide you through connecting data sources and configuring your personal language model.
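If the setup wizard can't reach your model, you can confirm that Ollama is serving it by calling its local REST API directly. The snippet below assumes Ollama's default endpoint at http://localhost:11434 and the mistral:7b model pulled above; the prompt text is just an example.

# Quick sanity check that the local Ollama model responds.
# Assumes the default Ollama endpoint at http://localhost:11434.
import json
import urllib.request

payload = {
    "model": "mistral:7b",
    "prompt": "Reply with the single word: ready",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])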