VaultBook Privacy Spotlight: Local JSON Setup for Sensitive Data

There is a distinction in data security thinking that does not receive the attention it deserves, because the dominant conversation about data privacy has been shaped primarily by the interests of the cloud service industry - an industry that has strong reasons to make the conversation about security practices rather than about data custody.

The distinction is this: the question of who can access data that has been secured with strong encryption on a remote server is a different question from the question of who holds custody of that data. The first question is about the effectiveness of the cryptographic barriers between an attacker and the data’s plaintext content. The second question is about the structural facts of where the data exists, who controls the infrastructure it exists on, and what legal, regulatory, and technical mechanisms could give a third party access to the data independent of the cryptographic barriers.

Cloud service providers answer the first question well and the second question poorly. They answer the first prominently in their marketing while treating the second as a technical detail that most users need not concern themselves with. The encryption is strong. The security practices are certified. The privacy policy commits to not using the data in ways the user has not authorized. What is less prominently discussed is everything that follows from custody. The data, however strongly encrypted, resides in a data center that the cloud provider operates. Legal process can be served on the provider requiring production of the data. The provider’s employees may be able to access the data in ways that the user cannot monitor or prevent. The provider’s infrastructure is a target for sophisticated attackers who may find attack vectors that the encryption does not close. And the provider’s ongoing commitment to the privacy practices described in today’s privacy policy is not technically enforceable: it is a promise that the organization making it can revise, transfer through acquisition, or breach in ways the user may not immediately discover.

VaultBook’s local JSON storage architecture answers both questions. It answers the cryptographic question through per-entry AES-256-GCM encryption with PBKDF2 key derivation. And it answers the custody question definitively: the JSON files that constitute the vault reside on the user’s own device, in a folder structure the user can see, in a format the user can read, accessible only through mechanisms the user controls. The custody question does not produce a conditional answer that depends on the cloud provider’s behavior. It produces an absolute answer: the data is here, on this device, in this folder, and no third party has it.

This is what local JSON storage means in practice, and this article examines in detail why the format matters, what the data structure provides, and how VaultBook’s full feature set integrates with the transparency and portability that local JSON enables.

What JSON Storage Actually Means for a Knowledge Vault

JSON - JavaScript Object Notation - is a structured data format that has become one of the most widely used formats for data exchange in software development. It represents data as hierarchically organized collections of key-value pairs, arrays, and nested objects, in plain text that is simultaneously machine-parseable and human-readable. A JSON file can be opened in any text editor on any operating system and read by any person with a basic understanding of the format. It requires no proprietary application to interpret and no vendor-provided tool to access.

This human readability is the property that makes JSON the right format for a privacy-focused local knowledge vault - not primarily because users will routinely read their vault data in raw JSON form, but because the readability makes the data format auditable, verifiable, and independent of any specific application’s continued operation.

An auditable format is one where the user can verify, independently of the application’s interface, what data the application has stored and how it has been structured. A VaultBook user who wants to verify that a deleted entry has been genuinely removed from the vault can examine the vault’s repository JSON file and, once the sixty-day recovery window for deleted entries has passed, confirm the entry’s absence. A user who wants to verify that a specific label has been correctly applied to a specific set of entries can examine the repository and confirm the label assignments. A compliance auditor who needs to verify the structure of stored data for regulatory purposes can read the JSON files directly rather than relying on the application’s interface to present the data accurately.

A verifiable format is one where the integrity of the stored data can be confirmed without trusting the application that produced it. Because the vault is a folder of ordinary files, standard file integrity verification tools - hashing utilities that compute a cryptographic hash of a file’s content - can be applied to the vault’s JSON files to verify that they have not been modified since a known state. And because the files are plain text, any difference that a hash check flags can then be inspected directly. For compliance and forensic contexts where the integrity of documentation needs to be demonstrable, this verifiability is meaningful.
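As a minimal sketch of such a check, the Python below builds a SHA-256 manifest of every file under a vault folder and compares two manifests. It is not part of VaultBook: the function names are illustrative, and the only assumption is that the vault is an ordinary folder of files.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(vault_dir: Path) -> dict[str, str]:
    """Map each file path (relative to the vault folder) to its digest."""
    return {
        path.relative_to(vault_dir).as_posix(): sha256_of_file(path)
        for path in sorted(vault_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """List paths that were added, removed, or modified since the baseline."""
    all_paths = set(baseline) | set(current)
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))
```

Saving the output of build_manifest at a known state and comparing it with a later run through changed_files gives a list of exactly which vault files differ, which can then be opened and read as plain text.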

An application-independent format is one where the data can be accessed without the specific application that created it. VaultBook’s vault data is stored in JSON and markdown files that are readable without VaultBook. A user who loses access to VaultBook for any reason - the application is discontinued, the license expires, the device changes in a way that makes VaultBook temporarily inaccessible - can read their vault’s data using any text editor, any JSON viewer, or any markdown renderer. The knowledge that the vault contains is not held hostage by VaultBook’s continued operation. The data belongs to the user in the most literal sense: they can read it, copy it, transform it, and use it without VaultBook’s involvement.

The Vault’s Folder Structure: A Transparent Architecture

VaultBook organizes its local data in a folder structure whose components each serve a specific function in the vault’s architecture, and understanding this structure clarifies exactly what local storage means at the implementation level.

The vault’s root folder contains the primary repository file - a JSON document that holds the vault’s organizational state. This repository file contains the complete record of the vault’s pages and their hierarchy, along with the metadata for each entry: its title, creation date, modification date, due date, expiry date, labels, page assignment, section structure, and attachment references. It also contains the user’s label definitions with their colors and descriptions, the vault’s settings and preferences, the AI features’ vote data for QA Actions reranking and Related Entries relevance training, the Favorites list, the search history that feeds Query Suggestions from History, and the Timetable’s scheduled event data.

The repository file is the vault’s single source of organizational truth. Every structural decision the user makes - creating a page, assigning a label, setting a due date, marking an entry as favorite, voting on a search result - is recorded in the repository file immediately. The repository’s content at any moment reflects the complete organizational state of the vault at that moment. An independent reader of the repository file - a developer building a tool to extract VaultBook data, an auditor reviewing the vault’s organizational structure, a backup system verifying that the repository reflects the expected structure - has access to the complete organizational picture without needing to access any other file.

Entry body content is stored in individual sidecar files in the vault’s details subdirectory. Each entry’s body content - the note text, the section structure with each section’s title and rich text body, the inline image data - is stored in a separate markdown file named by the entry’s unique identifier. This separation of body content from organizational metadata means that the repository file remains compact and fast to parse even as individual entries grow large, and that each entry’s content can be accessed, backed up, or verified independently of the repository.
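Because the repository and the sidecar files are separate, the two can be cross-checked from outside the application. The sketch below is one way to do that in Python. The repository filename, the top-level "entries" list, and the "id" field are hypothetical placeholders, since the exact schema is not specified here; only the details subdirectory and the one-markdown-file-per-entry layout come from the description above.

```python
import json
from pathlib import Path


def check_sidecars(vault_dir: Path, repo_name: str = "repository.json") -> dict[str, list[str]]:
    """Compare entry ids in the repository against sidecar files on disk.

    Assumes (hypothetically) a top-level "entries" list whose items carry an
    "id" field, and sidecar files stored as details/<id>.md.
    """
    repo = json.loads((vault_dir / repo_name).read_text(encoding="utf-8"))
    entry_ids = {str(entry["id"]) for entry in repo.get("entries", [])}
    sidecar_ids = {path.stem for path in (vault_dir / "details").glob("*.md")}
    return {
        # Entries listed in the repository with no body file on disk.
        "missing_sidecar": sorted(entry_ids - sidecar_ids),
        # Body files on disk that no repository entry refers to.
        "orphan_sidecar": sorted(sidecar_ids - entry_ids),
    }
```

An empty result in both lists would indicate that the repository and the details subdirectory agree about which entries exist.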

Attached files are stored in the vault’s attachments directory. Each attached file is stored with its original filename and extension, making the attachment directory human-navigable - the user can open the attachments folder in their operating system’s file manager and see exactly which files are attached to vault entries. The attachment manifest stored in the repository connects each entry’s attachment references to the files in the attachments directory, maintaining the association between entries and their files at the metadata level while storing the actual files in a directly accessible form.

For VaultBook Pro users, a versions directory stores per-entry version snapshots with a sixty-day retention period. Each snapshot is a dated copy of the entry’s body content at a specific point in time, stored in the same markdown format as the live entry body. The versions directory is browsable independently of VaultBook’s interface - a user examining the versions directory directly can see the complete version history of any entry as a series of time-stamped files.

The license file in the vault’s root records the vault’s subscription status in a form that the application can verify without any network request - a locally stored record of the entitlements associated with the vault, signed in a way that allows the application to validate the license from local data.

Portability as a Professional Capability

The portability of VaultBook’s local JSON vault - the ability to copy the entire vault to a different location, a different device, or a different storage medium and have the vault be immediately accessible from that new location - is not merely a convenience feature. For specific professional contexts, it is a capability requirement that cloud-dependent tools cannot satisfy.

The most straightforward portability scenario is device migration. A professional who is moving from one device to another - upgrading hardware, transitioning from a work-provided device to a personal device, or moving between operating systems - can copy the VaultBook vault folder to the new device and have immediate, complete access to the entire vault from the new device. No export, no import, no data transformation, and no interaction with any cloud service is required. The vault folder contains everything the application needs, and connecting VaultBook to the vault folder on the new device provides the same vault experience as on the original device.

The USB portability scenario extends this to temporary access from devices the user does not own - a library computer, a colleague’s workstation, a device in a secure facility that allows USB drives but not internet access. Carrying the vault folder on a USB drive allows the user to access the vault from any device with a modern browser, without installing any application beyond VaultBook’s single HTML file and without leaving any trace of vault content on the host device when the session ends. For professionals who work in environments where the devices available are not their own - clinical settings where shared workstations are the norm, legal environments where specific secure terminals must be used for confidential work, research facilities where secure computing resources are shared - USB portability makes VaultBook accessible in contexts where cloud-dependent tools would either require network access that is unavailable or leave data on devices that the user does not control.

The air-gap scenario represents the most demanding portability requirement - operation in an environment that is physically isolated from any network connection, where no internet access is available and where the operational security requirements of the environment prohibit internet-capable connections. Government classified computing environments, certain research facilities, corporate environments with strict air-gap policies for sensitive systems - these are contexts where cloud-dependent applications are not merely inconvenient but operationally prohibited. VaultBook’s local JSON architecture is inherently compatible with air-gap operation because it requires no network connectivity for any of its functions. The vault, the application, the AI features, the search indexing, the attachment processing, and the analytics all operate from local data without any network dependency.

The backup portability scenario addresses the data protection requirement that any serious professional knowledge base imposes. Because the vault is a folder of standard files, backing it up requires nothing more than copying the folder to a backup location - an external drive, a second device, an encrypted cloud backup, or any other backup destination the user chooses. The backup is a complete and immediately usable copy of the vault; restoring from the backup requires only connecting VaultBook to the backup copy of the vault folder. There is no proprietary backup format, no application-specific restore process, and no dependency on VaultBook’s involvement in either the backup or the restore operation.
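A folder-copy backup of this kind can be sketched with the Python standard library alone. The example below copies a vault folder and then confirms, byte for byte, that the copy matches the original. The function names and paths are illustrative, and nothing here is VaultBook-specific.

```python
import filecmp
import shutil
from pathlib import Path


def backup_vault(vault_dir: Path, backup_dir: Path) -> None:
    """Copy the whole vault folder; the destination must not exist yet."""
    shutil.copytree(vault_dir, backup_dir)


def trees_match(left: Path, right: Path) -> bool:
    """Return True when both trees hold the same files with identical bytes."""
    comparison = filecmp.dircmp(left, right)
    if comparison.left_only or comparison.right_only or comparison.funny_files:
        return False
    # shallow=False forces a content comparison instead of trusting file metadata.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, comparison.common_files, shallow=False
    )
    if mismatch or errors:
        return False
    # Recurse into every subdirectory present on both sides.
    return all(trees_match(left / name, right / name) for name in comparison.common_dirs)
```

Running trees_match immediately after backup_vault is a simple way to confirm the backup before relying on it. Note that filecmp.dircmp skips a small set of version-control directory names by default, so a stricter check may prefer a hash manifest.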

What the JSON Format Enables for Compliance Auditing

The structured, human-readable, application-independent nature of VaultBook’s JSON data format has specific implications for compliance auditing that distinguish VaultBook from cloud-hosted alternatives in contexts where regulatory compliance requires documentation of data handling practices.

A compliance audit of a knowledge management system typically requires answers to several categories of questions about the data the system holds. What data exists in the system, and how is it categorized? Where is the data stored, and who controls access to it? When was each piece of data created, modified, and accessed? What retention and disposal practices apply, and how are they enforced? Has any data been transmitted to third parties, and under what conditions?

For a cloud-hosted knowledge management system, answering these questions requires the cooperation of the cloud service provider - their reports on data location, their access logs, their retention policies and how they are implemented in their infrastructure, and their documentation of any third-party data sharing. The auditor is dependent on the provider’s representations because the raw data and the infrastructure that holds it are not directly accessible.

For VaultBook, answering these questions requires examining the local vault folder. The repository JSON file documents what data exists and how it is categorized - every entry’s metadata, including its labels, its page assignment, its creation and modification timestamps, its due date, and its expiry date, is present in the repository. The vault folder itself documents where the data is stored - it is in this folder, on this device. The repository’s timestamps document when data was created and modified. The expiry date fields and the sixty-day purge policy document what retention practices apply. The application’s architecture documents that no data is transmitted to any third party, because the application makes no network requests. Each of these questions is answerable from direct examination of the vault’s local data without any intermediary, without any report from a cloud provider, and without any dependence on third-party representations.
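To illustrate what answering these questions from the repository could look like, here is a small Python sketch that turns repository JSON into a per-entry retention report. The field names ("entries", "id", "labels", "created", "modified", "expiryDate") and the ISO date format are assumptions made for the example, not VaultBook’s documented schema.

```python
import json
from datetime import date


def retention_report(repository_json: str, today: date) -> list[dict]:
    """Summarize each entry's labels, timestamps, and expiry status.

    All field names are hypothetical; adjust them to the real repository
    layout before using this against an actual vault.
    """
    repo = json.loads(repository_json)
    report = []
    for entry in repo.get("entries", []):
        expiry = entry.get("expiryDate")  # assumed to be "YYYY-MM-DD" or absent
        report.append({
            "id": entry["id"],
            "labels": entry.get("labels", []),
            "created": entry.get("created"),
            "modified": entry.get("modified"),
            "expired": expiry is not None and date.fromisoformat(expiry) < today,
        })
    return report
```

The point of the sketch is that the auditor’s input is a local file and the auditor’s tooling is a few lines of general-purpose code, with no vendor report in between.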

This auditability is particularly valuable in healthcare settings where HIPAA audits may require documentation of where PHI resides and how it is protected, in legal settings where privilege logs may require documentation of what privileged material the attorney maintains and where it is stored, in financial settings where regulatory examinations may require documentation of how client information is held and protected, and in corporate settings where data governance audits may require documentation of where sensitive proprietary information lives and who controls access to it.

The local JSON format provides this documentation as a natural consequence of its design - the transparency that makes the format human-readable also makes it auditable, and the local storage that makes the data private also makes it directly accessible to the person or organization conducting the audit without requiring cooperation from any third party.

Combining Local Storage With System-Level Security Layers

VaultBook’s local JSON storage is the foundation of its security architecture, but it is designed to be combined with additional security layers that the user configures independently, creating a layered approach to data protection that addresses multiple threat vectors simultaneously.

The first additional layer is system-level full-disk encryption. FileVault on macOS, BitLocker on Windows, and VeraCrypt for cross-platform encrypted volumes provide encryption of all data on the device’s storage at the disk level. When the device is powered off, all data on the encrypted volume - including the vault folder with all its JSON files, attachment files, and version snapshots - is encrypted with the system-level key. An attacker who obtains the device’s storage media - by physical theft of the device, by forensic examination of a confiscated device, or by any other means of accessing the storage hardware - obtains only encrypted data that cannot be read without the system-level decryption credential.

The combination of system-level disk encryption and VaultBook’s local storage means that the vault’s data is protected at two independent levels: the disk-level encryption protects all vault data from access through the storage hardware, and VaultBook’s application-level password protects vault content from access through the application interface. An attacker who somehow bypasses the disk-level encryption still faces VaultBook’s application-level protection; an attacker who somehow bypasses the application-level protection still faces the disk-level encryption on the underlying storage.

VaultBook’s per-entry AES-256-GCM encryption adds a third layer specifically for the most sensitive entries. Keys are derived with PBKDF2 using SHA-256 at 100,000 iterations, with a random 16-byte salt and a random 12-byte initialization vector for each encryption operation. This provides cryptographic protection for individual entries that persists in the storage files themselves: when per-entry encryption is enabled, the encrypted ciphertext is what is stored in the entry’s body sidecar file. An attacker who gains access to the vault’s files - by obtaining the device after the disk-level encryption has been unlocked, by accessing the vault folder through any means during an active session, or by obtaining a copy of the vault folder through a backup or sync mechanism - encounters encrypted ciphertext for all per-entry encrypted entries, with no stored credential that would allow decryption without the entry-specific password.
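The key-derivation parameters named above can be sketched with the Python standard library. This example covers only the PBKDF2 step and the generation of a fresh salt and initialization vector; it does not perform the AES-256-GCM encryption itself, which the standard library does not provide. The function names and the returned dictionary layout are illustrative, not VaultBook’s implementation.

```python
import hashlib
import os

ITERATIONS = 100_000  # PBKDF2 iteration count stated in the article
SALT_BYTES = 16       # random salt per encryption operation
IV_BYTES = 12         # random initialization vector for AES-GCM
KEY_BYTES = 32        # 256-bit key, as required for AES-256


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 256-bit key from a password with PBKDF2-HMAC-SHA-256."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=KEY_BYTES
    )


def new_encryption_parameters(password: str) -> dict[str, bytes]:
    """Generate a fresh salt and IV, and derive the matching key."""
    salt = os.urandom(SALT_BYTES)
    iv = os.urandom(IV_BYTES)
    return {"salt": salt, "iv": iv, "key": derive_key(password, salt)}
```

The salt and initialization vector are not secrets and would be stored next to the ciphertext; the key is recomputed from the password and the stored salt whenever the entry is opened, which is why no decryption credential needs to exist on disk.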

The session caching of per-entry passwords - which holds the entered password in the browser’s working memory for the duration of the session, avoiding repeated re-entry - maintains this cryptographic protection in the stored files throughout the session. The decrypted content exists only in session memory, never in the JSON files on disk. When the session ends and the browser tab is closed, the cached passwords are discarded along with all other session state, and the vault’s JSON files contain only encrypted ciphertext for per-entry encrypted entries - the same protected form they had before the session began.

The Version History and Its JSON Architecture

VaultBook Pro’s version history system stores per-entry version snapshots in the vault’s versions directory, and the format and structure of these snapshots reflect the same design principles as the rest of the vault’s JSON-based local storage - human-readable, application-independent, and directly auditable.

Each version snapshot is a dated copy of an entry’s body content stored in markdown format, named with the entry’s identifier and the timestamp of the snapshot. The versions directory for a specific entry contains all of its snapshots within the sixty-day retention window, organized chronologically by their filename timestamps. A user who navigates to the versions directory in their operating system’s file manager can see the complete version history of any entry as a series of timestamped markdown files, each independently readable without VaultBook.

The version history modal in VaultBook’s interface presents these snapshots from newest to oldest, displaying the entry’s content at each snapshot point and allowing the user to restore any prior version. But the modal is a convenience interface for accessing data that is independently accessible in the versions directory - the version history exists in a form that the user can examine, copy, or use independently of VaultBook’s interface if they choose.

For compliance purposes, the version history in the versions directory provides a local audit trail of document development. A legal professional who needs to demonstrate the development history of a specific document - when it was created, what changes were made and when, how the current version relates to earlier drafts - can point to the versions directory’s time-stamped snapshots as a contemporaneous record of the document’s development. This documentation is created automatically as a byproduct of normal vault use, requires no deliberate archiving action, and exists in a format that is auditable without any application-specific tooling.

The sixty-day retention period for version snapshots is a design choice that balances the value of historical access against the data minimization principle that sensitive professional information should not be retained longer than its legitimate purpose requires. Snapshots older than sixty days are automatically purged, ensuring that the version history does not accumulate indefinitely into an archive of sensitive historical content that creates its own retention compliance challenges. For entries that have been deleted, the same sixty-day window applies to the entry itself - deleted entries remain recoverable for sixty days and are then permanently purged from the vault’s storage, with their version history purged at the same time.
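The retention rule can be expressed as a short calculation over snapshot filenames. The sketch below assumes a hypothetical naming pattern of the form entry-id, underscore, timestamp, with the timestamp written as YYYYMMDDTHHMMSS; the real filename format is not specified here, so treat the pattern as a placeholder.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=60)
STAMP_FORMAT = "%Y%m%dT%H%M%S"  # hypothetical timestamp layout in filenames


def snapshot_time(filename: str) -> datetime:
    """Parse the timestamp from a name like 'entry42_20240330T090000.md'."""
    stem = filename.rsplit(".", 1)[0]
    stamp = stem.rsplit("_", 1)[1]
    return datetime.strptime(stamp, STAMP_FORMAT)


def split_by_retention(
    filenames: list[str], now: datetime
) -> tuple[list[str], list[str]]:
    """Return (keep, purge) lists, each ordered newest snapshot first."""
    keep, purge = [], []
    for name in sorted(filenames, key=snapshot_time, reverse=True):
        if now - snapshot_time(name) > RETENTION:
            purge.append(name)
        else:
            keep.append(name)
    return keep, purge
```

Listing the purge candidates without deleting anything makes this usable as a read-only audit of whether the versions directory matches the stated retention window.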

The Smart Features That Operate From Local JSON Data

VaultBook’s AI and intelligent features derive their intelligence entirely from the vault’s local JSON data - the repository file, the entry body sidecar files, the attachment index, and the version history - without any dependency on external AI services, cloud-hosted models, or behavioral data transmission.

The AI Suggestions carousel’s Suggestions page builds its weekday pattern model from the access timestamps recorded in the repository’s entry metadata. When an entry is accessed, the repository records the access timestamp. The Suggestions engine reviews these timestamps across the preceding four weeks and identifies which entries have been accessed on each day of the week, producing the top three entries for the current day based on this access pattern. The entire computation happens in the browser’s JavaScript execution environment, operating on the repository data that is already loaded in memory, with no external service involvement.
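The weekday pattern described above can be approximated in a few lines. The Python sketch below counts accesses that fall on the same weekday as the current day within the preceding four weeks and returns the three most frequent entries. Ranking by access count is an assumption, as is the shape of the access log; the actual engine runs in the browser’s JavaScript and may weigh things differently.

```python
from collections import Counter
from datetime import datetime, timedelta


def weekday_suggestions(
    access_log: list[tuple[str, datetime]],
    now: datetime,
    weeks: int = 4,
    top_n: int = 3,
) -> list[str]:
    """Return the entries most often accessed on today's weekday.

    access_log is a list of (entry_id, accessed_at) pairs, a hypothetical
    stand-in for the access timestamps kept in the repository metadata.
    """
    window_start = now - timedelta(weeks=weeks)
    counts = Counter(
        entry_id
        for entry_id, accessed_at in access_log
        if window_start <= accessed_at <= now
        and accessed_at.weekday() == now.weekday()
    )
    return [entry_id for entry_id, _ in counts.most_common(top_n)]
```

Everything the function needs is already in local data, which is the property the Suggestions feature depends on: no timestamp ever has to leave the device to produce the ranking.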

The Smart Label Suggestions feature analyzes the content of the entry being edited - the text currently in the note body and section bodies, extracted from the in-memory representation of the entry being edited - and compares it against the label descriptions and the content of entries that currently carry each label, recommending labels whose existing usage is conceptually consistent with the entry being edited. This analysis operates on the in-memory representation of the current entry and the label metadata in the repository, without any external natural language processing service.

The QA natural language search with weighted relevance scoring operates on the locally maintained search index that is derived from the repository’s entry metadata, the content of entry body sidecar files, and the indexed text extracted from attached files through the local attachment indexing pipeline. The search index is a local data structure maintained in the vault’s working state, derived from the vault’s JSON and markdown files, and populated through local processing without any external search service.

The vote-based reranking in VaultBook Pro’s QA Actions and the Related Entries feature store their vote pair data in the repository JSON file. Upvotes and downvotes applied to search results or Related Entries suggestions are immediately persisted to the repository as the user applies them, updating the local data structure that the reranking algorithm uses to adjust future result ordering. The accumulated votes represent a personalized relevance model that lives entirely in the repository JSON - the behavioral data that makes the AI features more useful over time is stored in the same local file as the rest of the vault’s organizational data, accessible to the user in the same transparent, auditable form, and never transmitted to any external service.
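One simple way to picture vote-based reranking is as a score adjustment applied to locally computed results. The sketch below is purely illustrative: the nested vote dictionary keyed by query and entry id, the net-vote counts, and the vote_weight parameter are all assumptions, and the article does not describe the algorithm VaultBook actually uses.

```python
def rerank(
    results: list[tuple[str, float]],
    votes: dict[str, dict[str, int]],
    query: str,
    vote_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Reorder (entry_id, base_score) pairs using stored net votes.

    votes maps a query string to {entry_id: net_votes}, where an upvote adds
    one and a downvote subtracts one. This layout is hypothetical, chosen
    because it serializes directly to JSON.
    """
    query_votes = votes.get(query, {})

    def adjusted(item: tuple[str, float]) -> float:
        entry_id, base_score = item
        return base_score + vote_weight * query_votes.get(entry_id, 0)

    return sorted(results, key=adjusted, reverse=True)
```

Because the vote data in this sketch is plain JSON-compatible structure, a user could open the stored votes, read them, and predict how they would shift a given result list, which is the kind of inspectability the local repository is meant to provide.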

The analytics panel’s charts and metrics are computed from the repository’s entry metadata and the timestamps recorded in the vault’s activity tracking. The Last 14 Days Activity line chart in VaultBook Pro, the Month Activity bar chart, the Label utilization pie chart, and the Pages utilization pie chart are all canvas-rendered visualizations of data that exists entirely in the repository JSON file - the counts, the timestamps, and the organizational metadata from which the charts are derived are all present in the repository in their raw form, and the charts render them in a visual form that reveals patterns that the raw data alone does not immediately surface.

Real-World Applications Across Professional Domains

The practical implications of VaultBook’s local JSON architecture for specific professional domains are concrete and specific in ways that general privacy claims about “local storage” do not fully convey.

For healthcare professionals, the local JSON architecture means that PHI stored in VaultBook entries and their attached clinical documents exists in a precisely locatable, auditable, and controllable data store. A covered entity conducting a HIPAA Security Rule compliance review can examine the vault folder directly to verify the location of PHI, confirm the access controls in place, document the retention and disposal practices implemented through the expiry date and purge policy systems, and verify that no PHI has been transmitted to any external server. The local JSON format makes this compliance documentation a direct examination exercise rather than a vendor-dependent reporting exercise.

For legal professionals, the local JSON architecture means that privileged communications and attorney work product stored in VaultBook entries exist in a data store where the attorney maintains complete custody. Attorney-client privilege and work product protection both depend on privileged material being kept confidential and under the attorney’s control. Cloud storage of privileged documents complicates this because it places a copy of the privileged material in the hands of a cloud service provider, potentially raising questions about whether transmission to the provider constitutes a disclosure that could affect the privilege. Local JSON storage avoids this question because the data never leaves the attorney’s own infrastructure.

For financial professionals, the local JSON architecture means that client financial information stored in VaultBook entries is maintained in a data store where the advisor maintains complete control over who can access it and how long it is retained. Client financial information stored in cloud-hosted note applications may be subject to the cloud provider’s data handling practices, security incident exposure, and legal process served on the provider - risks that local JSON storage eliminates by keeping the data on the advisor’s own infrastructure.

For research professionals, the local JSON architecture means that unpublished research data, preliminary findings, confidential peer review materials, and proprietary intellectual property stored in VaultBook entries are maintained in a data store where the researcher holds exclusive custody. Research institutions and funding agencies increasingly have data governance requirements that specify where sensitive research data may be stored and who may have access to it - requirements that cloud-hosted storage may not satisfy but that VaultBook’s local JSON architecture naturally meets.

For individuals managing their most sensitive personal documents - identity documents, financial records, medical history, private journals, confidential personal correspondence - the local JSON architecture means complete personal custody over data that is genuinely private. The personal data stored in VaultBook exists only in the places the individual has chosen to store it, accessible only through the device and credentials the individual controls, and permanent only as long as the individual chooses to retain it.

The Transparency Promise That Cloud Cannot Make

Cloud applications cannot make the transparency promise that VaultBook’s local JSON storage enables, because cloud transparency is necessarily mediated by the provider’s reporting and auditing mechanisms rather than by direct examination of the underlying data.

A cloud provider can publish transparency reports documenting the legal process requests they have received and how they have responded. They can publish security certifications demonstrating that their infrastructure meets specific standards. They can provide data export tools that allow users to download their data in formats the application defines. They can offer APIs that provide programmatic access to stored data through the application’s own interface. Each of these mechanisms provides a form of transparency, but it is transparency about the provider’s behavior - reports of what the provider has done and certifications of the standards the provider meets - rather than direct examination of the underlying data.

VaultBook’s local JSON storage provides direct transparency without any intermediary. The data is in the vault folder. The folder is on the user’s device. The user can examine its contents directly, without requesting a report, without obtaining a certification, and without going through any application interface. The repository JSON is a complete record of the vault’s organizational state, readable in any text editor. The entry body sidecar files contain the vault’s content in standard markdown format. The attachments directory contains the vault’s attached files with their original names and formats. The versions directory contains the version history in time-stamped files.

This is the transparency that “local JSON storage” means in practice - not merely that data is stored locally rather than in the cloud, but that the storage is organized in a format that makes the data directly inspectable, independently auditable, and completely portable without any dependency on the application that created it. It is the transparency appropriate to professional knowledge that the professional genuinely owns rather than holds through the continuing good graces of a cloud service provider.

VaultBook is built for professionals for whom this transparency is not a preference but a requirement - for whom knowing exactly where their data is, in exactly what form, accessible in exactly what ways, is as important as the data itself. The local JSON setup is the architectural commitment that makes that knowing possible.

Your data. Your device. Your format. Your vault. Completely and permanently yours.