Proprius Labs - AI-First Personal Identity Infrastructure

Every day, millions of people open dozens of browser tabs—research articles, product pages, social media, work documents—and never close them. Why? Because closing a tab feels like losing it forever.

Calvin, our tab manager, solves this with AI. Here's how it works under the hood.

The Architecture Overview

Calvin's intelligence comes from three core systems working together:

**Content Extraction**: Understanding what each tab actually contains

**Semantic Embeddings**: Converting that understanding into mathematical representations

**Intelligent Clustering**: Grouping related content automatically

Let's dive into each.

Content Extraction: Beyond the URL

A URL tells you almost nothing. medium.com/p/abc123 could be about quantum physics or sourdough bread. Calvin needs to understand the content.

When you save a tab, Calvin extracts:

Title and meta description: the easy stuff

Main content body: extracted with readability algorithms that strip navigation and ads

Key entities: people, companies, and topics mentioned

Content type: article, product page, documentation, or social post

This extraction happens locally in your browser for privacy. Only the processed, anonymized features get sent to our servers.
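
As a rough illustration of the extraction step, here is a minimal sketch using only Python's standard library. It is not Calvin's actual extractor: a browser extension would do this in JavaScript against the live DOM, and the `SKIP_TAGS` set, the `TabExtractor` class, and the `extract_tab` function are hypothetical names invented for this example. Entity and content-type detection are left out.

```python
from html.parser import HTMLParser

# Tags treated as page chrome rather than main content (an assumption for this sketch).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TabExtractor(HTMLParser):
    """Collects the title, meta description, and a rough main-text body."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.body_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside any SKIP_TAGS element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.description = attrs.get("content") or ""
        elif tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif self._skip_depth == 0:
            self.body_parts.append(text)


def extract_tab(html):
    """Return the extracted fields for one saved tab as a plain dict."""
    parser = TabExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "description": parser.description,
        "body": " ".join(parser.body_parts),
    }
```

Calling `extract_tab(page_html)` returns a dict with `title`, `description`, and `body` keys. A production readability algorithm scores blocks by text density and link ratio rather than relying on a fixed tag list, so treat the tag filter here as a stand-in.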

Semantic Embeddings: The Math of Meaning

Here's where it gets interesting. Calvin converts each tab's content into a 768-dimensional vector—a point in high-dimensional space where similar content clusters together.

We use a fine-tuned sentence transformer model optimized for:

Cross-domain understanding: A machine learning tutorial and a TensorFlow documentation page should be "near" each other

Intent recognition: Shopping research tabs cluster differently from academic research tabs

Personal context: Your "travel planning" tabs form coherent groups even if they span flights, hotels, and restaurants

The embedding model runs on our infrastructure, but we never store your raw content—only the vectors and metadata needed for organization.
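
To make "near each other" concrete, here is a small sketch of cosine similarity, a common way to compare embedding vectors. Whether Calvin uses cosine similarity specifically is an assumption, and the 4-dimensional vectors below are made-up stand-ins for real 768-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


# Toy vectors invented for illustration; a real model would produce these.
ml_tutorial = [0.9, 0.8, 0.1, 0.0]
tensorflow_docs = [0.8, 0.9, 0.2, 0.1]
sourdough_recipe = [0.0, 0.1, 0.9, 0.8]

related = cosine_similarity(ml_tutorial, tensorflow_docs)
unrelated = cosine_similarity(ml_tutorial, sourdough_recipe)
print(related > unrelated)  # the two ML tabs score as more similar
```

With these toy values, the tutorial and the documentation page score far higher against each other than either does against the recipe, which is the property the clustering step relies on.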

Intelligent Clustering: Finding Natural Groups

With thousands of tabs represented as vectors, Calvin uses a modified HDBSCAN algorithm to find natural groupings. Why HDBSCAN?

No predefined cluster count: We don't force your tabs into exactly 10 folders

Handles noise: Outlier tabs that don't fit anywhere stay uncategorized

Hierarchical structure: "Work" can contain "Project Alpha" which contains "Design Specs"

The clustering considers:

Semantic similarity: from the embeddings

Temporal proximity: tabs saved together often relate

Domain patterns: your GitHub tabs probably relate to each other
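
One way to combine these three signals is a weighted pairwise distance, sketched below. The weights, the one-hour time scale, the tab dictionary fields (`vector`, `saved_at`, `url`), and the `tab_distance` function are all assumptions made for illustration; the post does not say how Calvin's modified HDBSCAN actually blends them.

```python
import math
from urllib.parse import urlparse

# Hypothetical weights and time scale, chosen only for illustration.
W_SEMANTIC, W_TEMPORAL, W_DOMAIN = 0.7, 0.2, 0.1
TIME_SCALE_SECONDS = 3600.0  # gaps much longer than an hour count as "far apart"


def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def tab_distance(tab_a, tab_b):
    """Blend semantic, temporal, and domain signals into one distance."""
    semantic = cosine_distance(tab_a["vector"], tab_b["vector"])

    # Temporal term grows from 0 toward 1 as the save times drift apart.
    gap = abs(tab_a["saved_at"] - tab_b["saved_at"])
    temporal = 1.0 - math.exp(-gap / TIME_SCALE_SECONDS)

    # Domain term is 0 for tabs on the same host, 1 otherwise.
    same_host = urlparse(tab_a["url"]).netloc == urlparse(tab_b["url"]).netloc
    domain = 0.0 if same_host else 1.0

    return W_SEMANTIC * semantic + W_TEMPORAL * temporal + W_DOMAIN * domain
```

A matrix of such distances could then be handed to an HDBSCAN implementation that accepts precomputed distances, which leaves the clustering algorithm itself untouched while letting the extra signals shape the result.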

The Learning Loop

Calvin gets smarter the more you use it. When you:

Move a tab between folders → reinforcement signal

Merge folders → these concepts belong together for you

Split a folder → these concepts are distinct in your mental model

This feedback fine-tunes your personal organization model without ever sharing your data with other users.
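
The sketch below shows one simple way such feedback could be recorded per user. It swaps in a plain pairwise-adjustment table rather than actual model fine-tuning, and the `FeedbackModel` class, its method names, and the `step` size are hypothetical.

```python
class FeedbackModel:
    """Per-user table of distance adjustments learned from folder edits."""

    def __init__(self, step=0.1):
        self.step = step
        # Maps frozenset({tab_a, tab_b}) to an adjustment in [-1, 1].
        # Negative pulls the pair together; positive pushes it apart.
        self.adjustment = {}

    def _nudge(self, a, b, delta):
        if a == b:
            return
        key = frozenset((a, b))
        value = self.adjustment.get(key, 0.0) + delta
        self.adjustment[key] = max(-1.0, min(1.0, value))

    def record_move(self, tab, old_folder_tabs, new_folder_tabs):
        """The moved tab belongs less with its old folder, more with its new one."""
        for other in old_folder_tabs:
            self._nudge(tab, other, +self.step)
        for other in new_folder_tabs:
            self._nudge(tab, other, -self.step)

    def record_merge(self, folder_a_tabs, folder_b_tabs):
        """Merging says every cross-folder pair belongs together."""
        for a in folder_a_tabs:
            for b in folder_b_tabs:
                self._nudge(a, b, -self.step)

    def record_split(self, part_a_tabs, part_b_tabs):
        """Splitting says every cross-part pair is distinct."""
        for a in part_a_tabs:
            for b in part_b_tabs:
                self._nudge(a, b, +self.step)

    def adjusted_distance(self, a, b, base_distance):
        """Apply the learned adjustment to a base distance, never below 0."""
        delta = self.adjustment.get(frozenset((a, b)), 0.0)
        return max(0.0, base_distance + delta)
```

Because the table is keyed only by one user's own tab identifiers, nothing in it needs to be shared across accounts, which matches the isolation described above.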

Privacy by Design

Everything described above was built with privacy as a core constraint:

Extraction runs in your browser, so raw page content never leaves it

Embeddings are computed on anonymized features

Your personal model lives in your account, not a shared system

We can't reconstruct your browsing from the vectors we store

What's Next

We're working on:

Real-time clustering: grouping tabs as you browse, not just when you save

Cross-device context understanding

Natural language queries: for example, "find that article about React hooks I saved last month"

Tab management is just the beginning. The same architecture will power our future products for organizing your entire digital life.

Want to try it yourself? Get Calvin and experience intelligent tab management.