DocumentAccessPOC

A proof-of-concept secure document management system that solves the critical challenge of granular access control in organizational environments where traditional permission systems fall short.

Python FastAPI Cryptography

🚩 Problem Statement

In modern organizations, controlling document access is complex and fraught with security risks. Traditional approaches like Role-Based Access Control (RBAC), file system permissions, and Access Control Lists (ACLs) struggle with:

  • Overly Broad Access: Developers and DevOps teams often have administrative access to everything, undermining confidentiality.
  • Lack of Granularity: Difficulty controlling access based on multiple dimensions (e.g., Teams, Projects, and Organization Levels).
  • Document Duplication: Sharing documents often requires creating multiple copies with different permissions, leading to versioning chaos.
  • Scalability Issues: Managing permissions becomes unmanageable as users and documents grow.
  • Security Vulnerabilities: System administrators can access sensitive content regardless of intended restrictions.

💡 Solution Overview

This project introduces a secure system where each document is encrypted with a unique key. Access is then managed through a central SharedKeyRegistry, which grants permissions on a per-user, per-document basis without the server ever storing the document's plaintext content or its key.

✅ Zero-Trust for Data at Rest: Even system administrators cannot access document content without authorization.
✅ Granular Control Architecture: The database schema is built for fine-grained permissions (see Roadmap).
✅ No Duplication: A single encrypted document is stored, with access managed dynamically.
✅ Scalable by Design: Efficiently manage permissions for thousands of users and documents.
✅ Secure Collaboration: Multiple users can work on the same secure document version.

🌱 Project Genesis: Solving a Real-World Challenge

This project wasn't born from a theoretical exercise; it was created to solve a critical, real-world security gap encountered in modern software development.

The journey began while a colleague was building an internal project management tool. The tool required a document storage system where teams (like Finance or HR) could upload highly sensitive files. The core problem was this: how can you guarantee that sensitive documents are inaccessible even to the DevOps and Cloud Admins who manage the infrastructure? With direct access to storage backends like AWS S3, traditional permissions are easily bypassed.

While concepts like using a Document Encryption Key (DEK) were known, existing systems didn't offer a clear solution for the most complex part: managing access for multiple, specific users in a zero-trust way.

The breakthrough came from tackling that multi-user challenge from first principles:

  1. A single, encrypted document should exist, avoiding duplication.
  2. To grant access, its unique DEK must be shared securely.
  3. Instead of sharing the DEK directly, it could be encrypted separately for each authorized user using their individual public key.
  4. A central registry could then map which users have access to which documents by storing these individually encrypted DEKs.
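The four principles above can be sketched in a few lines. This is a minimal illustration, not the PoC's actual code: it assumes the `cryptography` package, AES-256-GCM for the document, and RSA-OAEP for wrapping the DEK. The function names and the `alice` and `bob` users are hypothetical.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# RSA-OAEP padding used to wrap (encrypt) the DEK for each recipient.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def encrypt_document(plaintext: bytes) -> tuple[bytes, bytes, bytes]:
    """Encrypt a document once with a fresh DEK; returns (dek, nonce, ciphertext)."""
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # 96-bit nonce, generated fresh for each encryption
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)
    return dek, nonce, ciphertext


def wrap_dek(dek: bytes, public_key) -> bytes:
    """Encrypt the DEK for one user with that user's public key."""
    return public_key.encrypt(dek, OAEP)


def unwrap_dek(wrapped: bytes, private_key) -> bytes:
    """Recover the DEK with the matching private key."""
    return private_key.decrypt(wrapped, OAEP)


# Hypothetical users, each with their own RSA key pair.
alice = rsa.generate_private_key(public_exponent=65537, key_size=2048)
bob = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# One ciphertext is stored; the DEK is wrapped once per authorized user.
dek, nonce, ciphertext = encrypt_document(b"Q3 payroll summary")
wrapped = {
    "alice": wrap_dek(dek, alice.public_key()),
    "bob": wrap_dek(dek, bob.public_key()),
}

# Bob unwraps his copy of the DEK and decrypts the shared ciphertext.
bob_dek = unwrap_dek(wrapped["bob"], bob)
recovered = AESGCM(bob_dek).decrypt(nonce, ciphertext, None)
```

Only the ciphertext, the nonce, and the wrapped DEKs would need to be persisted; the plaintext DEK exists only while a document is being encrypted or decrypted.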

This design solved the problem. Sharing a document with a new user becomes a lightweight operation of encrypting the DEK one more time, and revoking access is as simple as deleting a registry entry, all without the server ever storing plaintext keys.
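The share and revoke operations described above can be modeled with a small in-memory registry. This is a sketch under stated assumptions: the class, its method names, and the string IDs are hypothetical, and the PoC itself keeps this mapping in its database. Wrapped DEKs are treated as opaque bytes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedKeyRegistry:
    """Maps (doc_id, user_id) to that user's wrapped copy of the document's DEK."""

    _entries: dict[tuple[str, str], bytes] = field(default_factory=dict)

    def share(self, doc_id: str, user_id: str, wrapped_dek: bytes) -> None:
        # Granting access stores one more wrapped DEK; the document is untouched.
        self._entries[(doc_id, user_id)] = wrapped_dek

    def revoke(self, doc_id: str, user_id: str) -> bool:
        # Revoking access deletes the entry; returns False if there was none.
        return self._entries.pop((doc_id, user_id), None) is not None

    def wrapped_dek_for(self, doc_id: str, user_id: str) -> Optional[bytes]:
        return self._entries.get((doc_id, user_id))

    def documents_for(self, user_id: str) -> list[str]:
        return sorted(doc for doc, user in self._entries if user == user_id)


registry = SharedKeyRegistry()
registry.share("doc-1", "alice", b"<DEK wrapped for alice>")
registry.share("doc-1", "bob", b"<DEK wrapped for bob>")
registry.share("doc-2", "alice", b"<another wrapped DEK>")

registry.revoke("doc-1", "bob")  # bob loses access; alice is unaffected
```

Deleting the entry stops the registry from handing out the wrapped DEK, but it does not invalidate a DEK the user has already unwrapped, which is why full cryptographic revocation also needs key rotation.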

Only after its conception did it become clear that this independently derived architecture matches an established industry pattern: envelope encryption combined with a dynamic key registry for cryptographic access control. This PoC serves as a blueprint for developers facing the same dilemma, demonstrating a practical path to building such systems.

🏗️ Architecture

┌───────────────┐      ┌────────────────────┐      ┌───────────────────┐
│  User Client  │      │   FastAPI Server   │      │  Storage Backend  │
│               │◄────►│                    │◄────►│ (e.g., Local, S3) │
│  • JWT Token  │      │  • API Endpoints   │      │                   │
│               │      │  • JWT Validation  │      │  • Encrypted      │
└───────────────┘      │  • Access Logic    │      │    Documents      │
                       │  • Crypto Ops      │      │                   │
                       └────────────────────┘      └───────────────────┘
                                  │
                                  ▼
                      ┌───────────────────────┐
                      │       Database        │
                      │                       │
                      │  • Users              │
                      │  • Documents          │
                      │  • Public Keys        │
                      │  • SharedKeyRegistry  │
                      └───────────────────────┘

📚 API Reference

Authentication

  • POST /users: Create a new user account.
  • POST /token: Authenticate and get a JWT token.

Users

  • GET /users/me: Get the profile of the currently authenticated user.
  • GET /users: Get public profiles of users by their IDs.

Documents

  • POST /documents: Upload, encrypt, and share a new document.
  • GET /documents: List all documents accessible to the current user.
  • GET /documents/{doc_id}: Download and decrypt a document.
  • PUT /documents/{doc_id}/share: Share a document with more users.
  • PUT /documents/{doc_id}/revoke: Revoke user access from a document.
  • DELETE /documents/{doc_id}: Securely delete a document.
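As a rough illustration of how a client might call these endpoints, the sketch below only builds the HTTP requests with the standard library; it does not send them. The base URL, the form fields for `/token` (FastAPI's usual OAuth2 password form), and the `user_ids` JSON field are assumptions, since the request bodies are not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local development server


def token_request(username: str, password: str) -> Request:
    """Build the POST /token request (form fields are an assumption)."""
    body = urlencode({"username": username, "password": password}).encode()
    return Request(
        f"{BASE_URL}/token",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def share_request(token: str, doc_id: int, user_ids: list[int]) -> Request:
    """Build the PUT /documents/{doc_id}/share request (body shape is hypothetical)."""
    body = json.dumps({"user_ids": user_ids}).encode()
    return Request(
        f"{BASE_URL}/documents/{doc_id}/share",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",  # JWT obtained from POST /token
            "Content-Type": "application/json",
        },
    )


login = token_request("alice", "example-passphrase")
share = share_request("example.jwt.token", 42, [7, 9])
```

Passing either object to `urllib.request.urlopen` would send it to a running server.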

🆚 Positioning in the Security Landscape

When solving a fundamental problem like secure data access, it's common for independent efforts to converge on similar architectural patterns. After developing this PoC, a review of the landscape shows that the core principles used here, such as envelope encryption and cryptographic access control, are industry best practices, which supports the soundness of the approach.

However, the focus and purpose of this project are fundamentally different from existing commercial applications. This PoC is not intended to be a feature-complete alternative to a platform like Tresorit or Nextcloud. Instead, its unique value lies in being a:

Developer-centric, unopinionated, and self-hostable backend blueprint.

The following comparison clarifies this distinction, showing how this project fills a specific niche for developers who need to build secure document functionality into their own applications, rather than adopting a monolithic, all-in-one platform.

| Feature / Aspect | DocumentAccessPOC (This PoC) | Commercial SaaS (e.g., Tresorit) | Open-Source Platform (e.g., Nextcloud) |
| --- | --- | --- | --- |
| Security Model | ✅ Zero-Trust | ✅ Zero-Knowledge | ✅ (With E2EE Module) |
| Deployment Model | Self-Hosted API | Managed SaaS | Self-Hosted App |
| Granular Control | ✅ (Architectural) | ✅ (Full UI) | ✅ (Full UI) |
| Cryptographic Revocation | ⚠️ Partial (No Key Rotation) | ✅ Complete | ⚠️ Partial / Complex |
| Client-Side Keys | ⚠️ Server-Side (In-Memory) | ✅ Yes | ✅ Yes |
| Focus | Secure Backend/API | End-User Application | Full Collaboration Platform |

Key Takeaway: This PoC serves as a foundational secure backend. Its purpose is to provide a clear, working model of the core cryptographic and access control logic that developers can learn from, adapt, and integrate into their own products.

⚠️ PoC Scope & Production Considerations

This project is a Proof-of-Concept designed to demonstrate a secure architectural pattern. The following table outlines key production-level features that were intentionally scoped out to maintain a tight focus on the core backend logic. It details the current PoC's behavior, the ideal production approach, and the rationale behind each scoping decision.

| Area / Feature | Current PoC Implementation | Production-Ready Approach | Scoping Rationale for PoC |
| --- | --- | --- | --- |
| Cryptographic Revocation | Removes user's key from the registry, preventing future API access. | Key Rotation: Re-encrypt the document with a new key and distribute it to remaining users. | This is a complex and computationally expensive workflow, outside the core goal of demonstrating the primary access and sharing mechanism. |
| Private Key Handling | Private key is briefly decrypted in server memory during authentication. | Client-Side Cryptography: All private key operations occur on the client (browser/app), achieving a full zero-knowledge architecture. | This requires a dedicated frontend application with crypto libraries. The PoC's focus was on the backend API that such a client would consume. |
| Account Recovery | None. A lost passphrase results in permanent data inaccessibility. | User-Managed Recovery Key: A one-time key is generated during onboarding that the user must save to restore access if their passphrase is lost. | This is an application-layer user flow, separate from the core cryptographic backend this PoC aims to demonstrate. |
| Collaboration & Compliance | No document versioning or audit trails are implemented. | Dedicated Subsystems: A robust versioning engine to prevent data loss and a tamper-resistant audit log for security and compliance. | These are both major subsystems. Implementing them would have detracted from the PoC's primary goal of proving the core zero-trust access model. |
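To make the key rotation row concrete, here is a minimal sketch of what a production revocation flow could look like. It is not part of the PoC: it assumes the `cryptography` package, AES-256-GCM documents, and RSA-OAEP wrapped DEKs, and the `rotate_dek` function and its parameters are hypothetical.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def rotate_dek(old_dek, nonce, ciphertext, remaining_public_keys):
    """Re-encrypt under a fresh DEK and wrap it only for users who keep access."""
    plaintext = AESGCM(old_dek).decrypt(nonce, ciphertext, None)
    new_dek = AESGCM.generate_key(bit_length=256)
    new_nonce = os.urandom(12)
    new_ciphertext = AESGCM(new_dek).encrypt(new_nonce, plaintext, None)
    new_wrapped = {
        user_id: public_key.encrypt(new_dek, OAEP)
        for user_id, public_key in remaining_public_keys.items()
    }
    return new_nonce, new_ciphertext, new_wrapped


# Hypothetical state before revocation: a document encrypted under old_dek.
alice = rsa.generate_private_key(public_exponent=65537, key_size=2048)
old_dek = AESGCM.generate_key(bit_length=256)
old_nonce = os.urandom(12)
old_ciphertext = AESGCM(old_dek).encrypt(old_nonce, b"board minutes", None)

# A second user is revoked, so only alice's public key is passed in.
new_nonce, new_ciphertext, new_wrapped = rotate_dek(
    old_dek, old_nonce, old_ciphertext, {"alice": alice.public_key()}
)


def old_key_still_works() -> bool:
    """Check whether a previously leaked DEK can read the rotated ciphertext."""
    try:
        AESGCM(old_dek).decrypt(new_nonce, new_ciphertext, None)
        return True
    except InvalidTag:
        return False
```

The cost noted in the table is visible here: rotation has to decrypt and re-encrypt the whole document and re-wrap the new DEK for every remaining user, and it can only be run by a party that currently holds the old DEK.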

🗂️ Project Structure

DocumentAccessPOC/
├── main.py                 # FastAPI application entry point
├── requirements.txt        # Python dependencies
├── config.py               # Configuration settings (DB, JWT, Storage)
├── models/                 # Data models and schemas (SQLModel)
├── helpers/                # Cryptographic utilities (AES, RSA, JWT)
├── backends/               # Storage and database backend abstractions
├── logic/                  # Business logic and dependencies
└── docs/                   # Additional documentation

🙏 Acknowledgments

Thanks to the FastAPI team and the contributors to the PyCryptodome and Cryptography libraries for their excellent tools.

A Note on Passphrase Generation

The unique, one-time passphrases in this project are generated using a custom library, BetterPassphrase, also created by me. It's designed to create secure, memorable passphrases from grammatically correct phrases.

Key Features of BetterPassphrase:

  • Secure & Memorable: Generates grammatically correct phrases that are easier to remember than random strings.
  • Security Focused: Includes entropy calculations to balance memorability with cryptographic strength.
  • Highly Customizable: Control word count, separators, capitalization, and more.
  • Powerful CLI: A full command-line interface for batch generation and file output.

You can check it out on PyPI or explore the source code on GitHub.

📜 License

This is a Proof-of-Concept project and is not currently available under a formal open-source license. It is intended for demonstration and educational purposes.