News
Posted: 24 July 2025

Why We Backed Memories.ai: Building the Memory Layer for AI’s Next Chapter

Some of the most profound advances in AI don’t come from bigger models or faster inference. They come from solving bottlenecks hidden in plain sight. Memory is one of them.

That’s what struck us immediately when we met Shawn Shen and Enmin Zhou: their clarity of thought, depth of technical insight, and vision for a memory-augmented future. What they were building at Memories.ai wasn’t an iteration on vision models or a smarter search engine; it was something foundational: persistent, structured memory for visual data.

Today we’re proud to share that Crane has invested in the $8M seed round for Memories.ai, alongside Susa Ventures (lead), Samsung Next, Fusion Fund, Seedcamp, and Creator Ventures.

The Problem: Stateless AI

LLMs like ChatGPT can recall short-term context but lose continuity across sessions. Vision models typically process frames or short clips in isolation, without building persistent understanding. Even advanced multimodal systems struggle to reason across time. This lack of long-term memory, context accumulation, and temporal coherence is especially limiting for video, one of the richest data types in AI.

This limitation holds back a wide range of real-world applications, including surveillance, sports analytics, AR/VR, and robotics. An AI assistant might summarize a 10-minute clip but can’t relate it to a similar event from last week. Security systems can flag an intruder in the moment, but struggle to recognize patterns of suspicious behavior across days or locations. Without memory, AI remains reactive, unable to connect the dots over time.

What Memories.ai Is Building

Memories.ai is creating a new layer in the AI stack: a memory infrastructure purpose-built for long-term visual understanding. Their Large Visual Memory Model (LVMM) doesn’t just process frames. It builds dynamic, persistent memory graphs that span sessions, devices, and timelines. It transforms video into rich memory units: graph-based, contextual, evolving representations that can be searched, reasoned over, and connected indefinitely.

Technically, this means:

  • Compressing visual inputs into rich semantic memory representations
  • Structuring that memory into searchable, indexable graphs
  • Enabling fast, memory-aware retrieval that doesn’t require reprocessing entire video streams

The result? AI systems that can search, reason, and learn from video the way humans do, by accumulating context, linking behaviors, and retaining meaningful patterns.
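
To make that concrete, here is a minimal, purely illustrative sketch in Python. Memories.ai has not published the LVMM’s internals, so every name below (MemoryNode, VideoMemoryGraph, the toy embeddings) is our own assumption rather than their API; it only shows the shape of the pattern described above: compress clips into semantic summaries and vectors, link them into a graph, and retrieve by similarity without reprocessing the raw video.

    # Hypothetical sketch: the LVMM's real design is not public.
    from dataclasses import dataclass, field

    @dataclass
    class MemoryNode:
        clip_id: str
        summary: str                                  # semantic compression of the clip
        embedding: list[float]                        # stand-in for a learned vector
        edges: set[str] = field(default_factory=set)  # links to related clips

    class VideoMemoryGraph:
        def __init__(self):
            self.nodes: dict[str, MemoryNode] = {}

        def add(self, clip_id, summary, embedding):
            self.nodes[clip_id] = MemoryNode(clip_id, summary, embedding)

        def link(self, a, b):
            # Connect clips that share an entity, place, or behavior.
            self.nodes[a].edges.add(b)
            self.nodes[b].edges.add(a)

        def query(self, embedding, top_k=3):
            # Memory-aware retrieval: rank stored nodes by cosine similarity
            # to the query vector; the raw video streams are never touched.
            def cosine(u, v):
                dot = sum(x * y for x, y in zip(u, v))
                nu = sum(x * x for x in u) ** 0.5
                nv = sum(x * x for x in v) ** 0.5
                return dot / (nu * nv) if nu and nv else 0.0
            ranked = sorted(self.nodes.values(),
                            key=lambda n: cosine(n.embedding, embedding),
                            reverse=True)
            return ranked[:top_k]

    graph = VideoMemoryGraph()
    graph.add("cam1-mon", "person loiters by loading dock", [0.9, 0.1, 0.0])
    graph.add("cam1-wed", "same person returns after hours", [0.85, 0.2, 0.05])
    graph.link("cam1-mon", "cam1-wed")         # cross-day behavioral link
    matches = graph.query([0.88, 0.15, 0.02])  # finds both events, no re-decode

The point of the toy is the last line: once video has been distilled into linked memory units, recalling a related event from last week is an index lookup, not a second pass over the footage.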

Why It Matters

Video is one of the richest sources of information in the world, but also one of the most underutilized. Over 1 billion cameras generate petabytes of video data daily, yet most of it remains unused. There’s no way to store it economically, analyze it continuously, or make it truly useful. Memory is the bridge between intelligence and understanding. Without it, AI is trapped in the present, analyzing single inputs and forgetting everything else. With memory, AI can become contextual, relational, and personal. It can track, learn, recall, and adapt.

Memories.ai is doing for video what vector databases and retrieval-augmented generation did for text: unlocking persistent, structured recall. But their system goes further: it doesn’t just index past frames; it builds an evolving representation of reality. This opens the door to entirely new forms of machine perception: AI that remembers what it sees, builds understanding over time, and acts with long-term awareness. Imagine assistants that recall your environment across days, robots that learn from lived experience, or wearable devices that augment your memory in real time.

Why Now

AI is crossing a threshold as systems move from single-turn tools to agentic systems that operate continuously, across space and time. In a world of autonomous agents, including robotics, driverless cars, and AR devices, we need infrastructure that can support long-term memory and contextual reasoning. We believe video will become the dominant form of data, and edge computing has matured enough to support lightweight, real-time intelligence.

No one else is solving memory infrastructure at the video level. Memories.ai enables this kind of temporal understanding, unlocking a new layer of intelligence that today’s stateless systems simply can’t reach.

The Team

We’ve been consistently impressed by Shawn Shen and Enmin Zhou, former Meta Reality Labs engineers with deep expertise in visual AI and on-device learning. They’ve published top-tier research, shipped production systems, and are building with the clarity, urgency, and humility we look for in the very best technical founders. We’re proud to be backing them and excited to support what we believe is a generational team building foundational infrastructure for the future of AI.

What’s Next

Memories.ai unlocks a new dimension for AI, one where systems can build knowledge over time, not just react in the moment. It’s the missing layer that turns perception into understanding, and makes long-term, human-like intelligence possible.

We’re excited to support Shawn and Enmin as they grow their team, deepen their platform, and help AI evolve from reactive to truly reflective. If you’re building with video, working on AI systems, or simply curious about what happens when machines can remember, go check them out at memories.ai.

Because the future of AI won’t just be about thinking faster.
It will be about remembering better.
And that starts here.

–Max Chapman, Crane Venture Partners