DocumentAccessPOC
A proof-of-concept secure document management system that solves the critical challenge of granular access control in organizational environments where traditional permission systems fall short.
Problem Statement
In modern organizations, controlling document access is complex and fraught with security risks. Traditional approaches like Role-Based Access Control (RBAC), file system permissions, and Access Control Lists (ACLs) struggle with:
- Overly Broad Access: Developers and DevOps teams often have administrative access to everything, undermining confidentiality.
- Lack of Granularity: Difficulty controlling access based on multiple dimensions (e.g., Teams, Projects, and Organization Levels).
- Document Duplication: Sharing documents often requires creating multiple copies with different permissions, leading to versioning chaos.
- Scalability Issues: Managing permissions becomes unmanageable as users and documents grow.
- Security Vulnerabilities: System administrators can access sensitive content regardless of intended restrictions.
Solution Overview
This project introduces a secure system where each document is encrypted with a unique key. Access is then managed through a central SharedKeyRegistry, which grants permissions on a per-user, per-document basis without ever storing the document's content or its key in plaintext on the server.
- Zero-Trust for Data at Rest: Even system administrators cannot access document content without authorization.
- Granular Control Architecture: The database schema is built for fine-grained permissions (see Roadmap).
- No Duplication: A single encrypted document is stored, with access managed dynamically.
- Scalable by Design: Efficiently manage permissions for thousands of users and documents.
- Secure Collaboration: Multiple users can work on the same secure document version.
Project Genesis: Solving a Real-World Challenge
This project wasn't born from a theoretical exercise; it was created to solve a critical, real-world security gap encountered in modern software development.
The journey began while a colleague was building an internal project management tool. The tool required a document storage system where teams (like Finance or HR) could upload highly sensitive files. The core problem was this: how can you guarantee that sensitive documents are inaccessible even to the DevOps and Cloud Admins who manage the infrastructure? With direct access to storage backends like AWS S3, traditional permissions are easily bypassed.
While concepts like using a Document Encryption Key (DEK) were known, existing systems didn't offer a clear solution for the most complex part: managing access for multiple, specific users in a zero-trust way.
The breakthrough came from tackling that multi-user challenge from first principles:
1. A single, encrypted document should exist, avoiding duplication.
2. To grant access, its unique DEK must be shared securely.
3. Instead of sharing the DEK directly, it could be encrypted separately for each authorized user using their individual public key.
4. A central registry could then map which users have access to which documents by storing these individually-encrypted DEKs.
This design elegantly solved the problem. Sharing a document with a new user becomes a lightweight operation of encrypting the DEK one more time, and revocation is as simple as deleting an entry, all without the server ever needing to see the plaintext keys.
Only after its conception did it become clear that this independently derived architecture aligns with established industry practice. It is an implementation of what is known as envelope encryption, combined with a dynamic key registry for cryptographic access control. This PoC serves as a blueprint for developers facing the same dilemma, demonstrating a practical path to building secure systems.
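The four principles above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: it assumes AES-256-GCM for the document and RSA-OAEP (SHA-256) for wrapping the DEK, uses the `cryptography` library, and stands in for the SharedKeyRegistry with a plain dictionary. The function names, user names, and document ID are hypothetical.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumed padding scheme for wrapping the DEK with a user's public key.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def encrypt_document(plaintext: bytes):
    """Encrypt a document once with a fresh DEK; return (dek, nonce, ciphertext)."""
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # 96-bit nonce, generated fresh for every encryption
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)
    return dek, nonce, ciphertext


def wrap_dek(dek: bytes, public_key) -> bytes:
    """Encrypt the DEK so that only the holder of the private key can recover it."""
    return public_key.encrypt(dek, OAEP)


def unwrap_dek(wrapped: bytes, private_key) -> bytes:
    """Recover the DEK with the user's private key."""
    return private_key.decrypt(wrapped, OAEP)


# Two hypothetical users, each with their own RSA key pair.
alice_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
bob_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# 1. A single encrypted copy of the document exists.
dek, nonce, ciphertext = encrypt_document(b"Q3 payroll figures")

# 2-4. The registry maps (doc_id, user_id) to a DEK wrapped for that user.
registry = {("doc-1", "alice"): wrap_dek(dek, alice_key.public_key())}

# Sharing with Bob is one more wrap of the same DEK; the document is untouched.
registry[("doc-1", "bob")] = wrap_dek(dek, bob_key.public_key())
del dek  # the plaintext DEK is no longer needed once it has been wrapped

# Bob reads the document by unwrapping his registry entry.
bob_dek = unwrap_dek(registry[("doc-1", "bob")], bob_key)
recovered = AESGCM(bob_dek).decrypt(nonce, ciphertext, None)

# Revocation deletes Bob's entry; Alice's entry is unaffected.
del registry[("doc-1", "bob")]
```

Note that the stored ciphertext never changes when access is granted or revoked; only registry entries are added or removed.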
Architecture
┌───────────────┐      ┌────────────────────┐      ┌───────────────────┐
│  User Client  │      │   FastAPI Server   │      │  Storage Backend  │
│               │◄────►│                    │◄────►│ (e.g., Local, S3) │
│ • JWT Token   │      │ • API Endpoints    │      │                   │
│               │      │ • JWT Validation   │      │ • Encrypted       │
└───────────────┘      │ • Access Logic     │      │   Documents       │
                       │ • Crypto Ops       │      │                   │
                       └────────────────────┘      └───────────────────┘
                                 │
                                 ▼
                     ┌───────────────────────┐
                     │       Database        │
                     │                       │
                     │ • Users               │
                     │ • Documents           │
                     │ • Public Keys         │
                     │ • SharedKeyRegistry   │
                     └───────────────────────┘
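To make the database box more concrete, here is a hedged sketch of how the registry could be laid out relationally. It uses the standard-library `sqlite3` module rather than the project's SQLModel models, and every table and column name is an assumption chosen for illustration. For brevity it also folds the public key into the users table, whereas the diagram lists Public Keys separately.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    username   TEXT NOT NULL UNIQUE,
    public_key BLOB NOT NULL
);
CREATE TABLE documents (
    id           INTEGER PRIMARY KEY,
    owner_id     INTEGER NOT NULL REFERENCES users(id),
    storage_path TEXT NOT NULL
);
CREATE TABLE shared_key_registry (
    doc_id        INTEGER NOT NULL REFERENCES documents(id),
    user_id       INTEGER NOT NULL REFERENCES users(id),
    encrypted_dek BLOB NOT NULL,
    PRIMARY KEY (doc_id, user_id)
);
"""


def accessible_documents(conn, user_id):
    """Return the IDs of documents for which the user holds a wrapped DEK."""
    rows = conn.execute(
        "SELECT doc_id FROM shared_key_registry WHERE user_id = ? ORDER BY doc_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def revoke(conn, doc_id, user_id):
    """Remove one user's registry entry for one document."""
    conn.execute(
        "DELETE FROM shared_key_registry WHERE doc_id = ? AND user_id = ?",
        (doc_id, user_id),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "alice", b"<alice-public-key>"), (2, "bob", b"<bob-public-key>")],
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(10, 1, "docs/10.bin"), (11, 1, "docs/11.bin")],
)
# One row per (document, user) pair; the blobs stand in for wrapped DEKs.
conn.executemany(
    "INSERT INTO shared_key_registry VALUES (?, ?, ?)",
    [(10, 1, b"<dek-10-for-alice>"), (10, 2, b"<dek-10-for-bob>"), (11, 1, b"<dek-11-for-alice>")],
)

bob_before = accessible_documents(conn, 2)  # [10]
revoke(conn, 10, 2)
bob_after = accessible_documents(conn, 2)   # []
alice_docs = accessible_documents(conn, 1)  # [10, 11]
```

The composite primary key on `(doc_id, user_id)` is the design choice that matters here: it allows exactly one wrapped DEK per user per document, so listing, sharing, and revoking each touch a single row.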
API Reference
Authentication
- POST /users: Create a new user account.
- POST /token: Authenticate and get a JWT token.
Users
- GET /users/me: Get the profile of the currently authenticated user.
- GET /users: Get public profiles of users by their IDs.
Documents
- POST /documents: Upload, encrypt, and share a new document.
- GET /documents: List all documents accessible to the current user.
- GET /documents/{doc_id}: Download and decrypt a document.
- PUT /documents/{doc_id}/share: Share a document with more users.
- PUT /documents/{doc_id}/revoke: Revoke user access from a document.
- DELETE /documents/{doc_id}: Securely delete a document.
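As a rough illustration of how a client might call the document endpoints, the sketch below builds authenticated requests with the standard library only. Several details are assumptions not stated in this README: the base URL, the use of an `Authorization: Bearer` header for the JWT, and the `user_ids` field in the request body. Check the running API's schema for the real request shapes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: a locally running server


def build_request(method, path, token, payload=None):
    """Build (but do not send) a JSON request carrying the JWT as a bearer token."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


token = "<jwt-from-POST-/token>"  # placeholder for a token returned by POST /token

# Hypothetical body shape: a list of user IDs to share with or revoke.
share = build_request("PUT", "/documents/42/share", token, {"user_ids": [7, 9]})
revoke = build_request("PUT", "/documents/42/revoke", token, {"user_ids": [9]})
download = build_request("GET", "/documents/42", token)

# With a server running, each request could be sent with:
#     urllib.request.urlopen(share)
```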
Positioning in the Security Landscape
When solving a fundamental problem like secure data access, it's common for independent efforts to converge on similar architectural patterns. After developing this PoC, a review of the landscape shows that the core principles used here, like envelope encryption and cryptographic access control, are industry best practices, validating the soundness of the approach.
However, the focus and purpose of this project are fundamentally different from existing commercial applications. This PoC is not intended to be a feature-complete alternative to a platform like Tresorit or Nextcloud. Instead, its unique value lies in being a:
Developer-centric, unopinionated, and self-hostable backend blueprint.
The following comparison clarifies this distinction, showing how this project fills a specific niche for developers who need to build secure document functionality into their own applications, rather than adopting a monolithic, all-in-one platform.
| Feature / Aspect | DocumentAccessPOC (This PoC) | Commercial SaaS (e.g., Tresorit) | Open-Source Platform (e.g., Nextcloud) |
|---|---|---|---|
| Security Model | ✅ Zero-Trust | ✅ Zero-Knowledge | ✅ (With E2EE Module) |
| Deployment Model | Self-Hosted API | Managed SaaS | Self-Hosted App |
| Granular Control | ✅ (Architectural) | ✅ (Full UI) | ✅ (Full UI) |
| Cryptographic Revocation | ⚠️ Partial (No Key Rotation) | ✅ Complete | ⚠️ Partial / Complex |
| Client-Side Keys | ⚠️ Server-Side (In-Memory) | ✅ Yes | ✅ Yes |
| Focus | Secure Backend/API | End-User Application | Full Collaboration Platform |
Key Takeaway: This PoC serves as a foundational secure backend. Its purpose is to provide a clear, working model of the core cryptographic and access control logic that developers can learn from, adapt, and integrate into their own products.
PoC Scope & Production Considerations
This project is a Proof-of-Concept designed to demonstrate a secure architectural pattern. The following table outlines key production-level features that were intentionally scoped out to maintain a tight focus on the core backend logic. It details the current PoC's behavior, the ideal production approach, and the rationale behind each scoping decision.
| Area / Feature | Current PoC Implementation | Production-Ready Approach | Scoping Rationale for PoC |
|---|---|---|---|
| Cryptographic Revocation | Removes user's key from the registry, preventing future API access. | Key Rotation: Re-encrypt the document with a new key and distribute it to remaining users. | This is a complex and computationally expensive workflow, outside the core goal of demonstrating the primary access and sharing mechanism. |
| Private Key Handling | Private key is briefly decrypted in server memory during authentication. | Client-Side Cryptography: All private key operations occur on the client (browser/app), achieving a full zero-knowledge architecture. | This requires a dedicated frontend application with crypto libraries. The PoC's focus was on the backend API that such a client would consume. |
| Account Recovery | None. A lost passphrase results in permanent data inaccessibility. | User-Managed Recovery Key: A one-time key is generated during onboarding that the user must save to restore access if their passphrase is lost. | This is an application-layer user flow, separate from the core cryptographic backend this PoC aims to demonstrate. |
| Collaboration & Compliance | No document versioning or audit trails are implemented. | Dedicated Subsystems: A robust versioning engine to prevent data loss and a tamper-resistant audit log for security and compliance. | These are both major subsystems. Implementing them would have detracted from the PoC's primary goal of proving the core zero-trust access model. |
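The key-rotation approach listed under Cryptographic Revocation can be sketched as follows. This is not part of the PoC; it is a minimal illustration that assumes AES-256-GCM for documents and RSA-OAEP (SHA-256) for wrapping DEKs, using the `cryptography` library. The function name `rotate_document_key` and the user name are hypothetical.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumed padding scheme for wrapping the DEK with a user's public key.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def rotate_document_key(old_dek, nonce, ciphertext, remaining_public_keys):
    """Re-encrypt a document under a fresh DEK and wrap it for remaining users only."""
    plaintext = AESGCM(old_dek).decrypt(nonce, ciphertext, None)
    new_dek = AESGCM.generate_key(bit_length=256)
    new_nonce = os.urandom(12)
    new_ciphertext = AESGCM(new_dek).encrypt(new_nonce, plaintext, None)
    wrapped = {
        user_id: public_key.encrypt(new_dek, OAEP)
        for user_id, public_key in remaining_public_keys.items()
    }
    return new_nonce, new_ciphertext, wrapped


# Alice keeps access; a revoked user is assumed to have kept a copy of old_dek.
alice_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
old_dek = AESGCM.generate_key(bit_length=256)
old_nonce = os.urandom(12)
old_ciphertext = AESGCM(old_dek).encrypt(old_nonce, b"board minutes", None)

new_nonce, new_ciphertext, wrapped = rotate_document_key(
    old_dek, old_nonce, old_ciphertext, {"alice": alice_key.public_key()}
)

# Alice unwraps the new DEK and reads the re-encrypted document.
alice_dek = alice_key.decrypt(wrapped["alice"], OAEP)
alice_plaintext = AESGCM(alice_dek).decrypt(new_nonce, new_ciphertext, None)

# The stale DEK no longer decrypts the stored ciphertext.
try:
    AESGCM(old_dek).decrypt(new_nonce, new_ciphertext, None)
    stale_key_rejected = False
except InvalidTag:
    stale_key_rejected = True
```

Rotation only protects the ciphertext stored after the rotation. A user who already downloaded the plaintext, or who kept both the old ciphertext and the old DEK, retains that copy.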
Project Structure
DocumentAccessPOC/
├── main.py              # FastAPI application entry point
├── requirements.txt     # Python dependencies
├── config.py            # Configuration settings (DB, JWT, Storage)
├── models/              # Data models and schemas (SQLModel)
├── helpers/             # Cryptographic utilities (AES, RSA, JWT)
├── backends/            # Storage and database backend abstractions
├── logic/               # Business logic and dependencies
└── docs/                # Additional documentation
Acknowledgments
Thanks to the FastAPI team and the contributors to the PyCryptodome and Cryptography libraries for their excellent tools.
A Note on Passphrase Generation
The unique, one-time passphrases in this project are generated using a custom library, BetterPassphrase, also created by me. It's designed to create secure, memorable passphrases from grammatically correct phrases.
Key Features of BetterPassphrase:
- Secure & Memorable: Generates grammatically correct phrases that are easier to remember than random strings.
- Security Focused: Includes entropy calculations to balance memorability with cryptographic strength.
- Highly Customizable: Control word count, separators, capitalization, and more.
- Powerful CLI: A full command-line interface for batch generation and file output.
You can check it out on PyPI or explore the source code on GitHub.
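To give a sense of what an entropy calculation for a phrase-based passphrase involves, here is a generic sketch. It does not use or reproduce BetterPassphrase's API or its actual method; the template and word-list sizes are hypothetical, and the formula assumes each word is drawn uniformly and independently from its list.

```python
import math


def passphrase_entropy_bits(pool_sizes):
    """Entropy in bits when each slot is drawn uniformly and independently."""
    return sum(math.log2(size) for size in pool_sizes)


# Hypothetical template: adjective + noun + verb + adverb,
# with made-up word-list sizes for each part of speech.
phrase_bits = passphrase_entropy_bits([2048, 4096, 1024, 512])  # 11 + 12 + 10 + 9 = 42.0

# For comparison: 8 characters drawn uniformly from 26 lowercase letters.
random_string_bits = passphrase_entropy_bits([26] * 8)  # about 37.6
```

Under these assumed list sizes, a four-word grammatical phrase carries more entropy than an eight-letter random lowercase string while being easier to remember.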
License
This is a Proof-of-Concept project and is not currently available under a formal open-source license. It is intended for demonstration and educational purposes.