Try the Full Application
The demo runs on a GPU-powered server with Mistral 7B, Qwen 2.5 7B, and Phi-3 14B. Create an account to explore the full memory system with your own data.
Launch Demo Application
What You Can Explore
Memory Tiers
See how short-term, long-term, and persistent memory work together to provide context.
Data Ingestion
Connect data sources and watch your personal knowledge base grow in real-time.
Multiple Models
Switch between Mistral, Qwen, and Phi-3 to compare how different models handle your context.
Context Search
Search your memories semantically: find information by meaning, not just keywords.
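Under the hood, this kind of search typically embeds each memory as a vector and ranks results by similarity rather than exact word overlap. The sketch below illustrates the general technique using the sentence-transformers library and cosine similarity; Memory's own retrieval pipeline may use a different embedding model or vector store, so the model name and sample data here are illustrative assumptions.

# Illustrative sketch of embedding-based search, not Memory's actual code.
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

memories = [
    "Booked flights to Lisbon for the first week of June.",
    "Dentist appointment moved to Thursday afternoon.",
    "Quarterly report draft shared with the finance team.",
]

# Encode the stored memories and the query into dense vectors.
memory_vecs = model.encode(memories, convert_to_tensor=True)
query_vec = model.encode("When am I travelling to Portugal?", convert_to_tensor=True)

# Rank by cosine similarity: "Lisbon" matches "Portugal" by meaning,
# even though the query shares no keywords with that memory.
scores = util.cos_sim(query_vec, memory_vecs)[0]
best = int(scores.argmax())
print(memories[best], float(scores[best]))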
Demo Limitations
The public demo has some limitations compared to running Memory locally:
- Shared resources - GPU inference is shared across users, so responses may be slower during peak times
- Limited data sources - Some data source adapters require local file access and aren't available in the demo
- Limited session persistence - Demo accounts may be cleaned up periodically
Run It Locally
For the full experience, clone the repository and run Memory on your own machine. With Ollama installed, you can run entirely offline with complete privacy.
Getting Started Locally
# Clone the repository
git clone https://github.com/rod-higgins/memory.git
cd memory
# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -e ".[web]"
# Install Ollama and pull a model
curl -fsSL https://ollama.com/install.sh | sh
ollama pull mistral:7b
# Run the application
python -m memory.web.app
Visit http://localhost:8765 and create your first account. The setup wizard will guide you through connecting data sources and configuring your personal language model.
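If the setup wizard can't reach your model, you can confirm that Ollama is serving it by calling its local REST API directly. The snippet below assumes Ollama's default endpoint at http://localhost:11434 and the mistral:7b model pulled above; the prompt text is just an example.

# Quick sanity check that the local Ollama model responds.
# Assumes the default Ollama endpoint at http://localhost:11434.
import json
import urllib.request

payload = {
    "model": "mistral:7b",
    "prompt": "Reply with the single word: ready",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])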