Self-Host Paperless-ngx with Greffon
Every invoice, contract, and tax form you own, scanned and searchable in one place. That archive is worth keeping on hardware you control. Here is the honest setup for Paperless-ngx on a greffer.
A document archive is the kind of thing you build once and lean on for years: invoices, contracts, warranties, tax forms, the paper trail of a whole life. That is exactly the data you do not want scattered across a provider you rent. Paperless-ngx turns a pile of scans into an indexed, searchable archive, and Greffon takes the assembly off your plate so it runs on a machine you own.
Why own the archive
Paperless-ngx ingests documents, runs OCR so the text inside scans becomes searchable, and tags and sorts them so you can find a receipt from three years ago in seconds. It is the open project many people reach for when they want to go paperless without handing their financial and legal records to a cloud account.
Self-hosting changes where those records sit. The OCR text, the original PDFs, the tags you build up over time all live on your greffer, not on someone else's storage. For documents this sensitive, that is the whole point.
Graft it from the catalog
On a greffer you do not hand-write a compose file or wire a reverse proxy. Pick Paperless-ngx from the catalog and graft it onto your greffer. Greffon issues the certificate and routes the app, so it comes up reachable over HTTPS from the first start. The Django secret key is generated for you at instance creation, so the one thing you set by hand is the admin password.
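Since the admin password is the one value you supply yourself, it is worth generating rather than inventing. The following is a minimal Python sketch using the standard `secrets` module. The function name, the 20-character default, and the letters-and-digits alphabet are assumptions for illustration, chosen to sit comfortably above the catalog's 12-character minimum; they are not Greffon requirements.

```python
import secrets
import string

# The catalog's stated minimum length for the admin password.
MIN_LENGTH = 12


def generate_admin_password(length: int = 20) -> str:
    """Return a random password of `length` characters (letters and digits)."""
    if length < MIN_LENGTH:
        raise ValueError(f"length must be at least {MIN_LENGTH}")
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from a cryptographically strong random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_admin_password())
```

Store the result in a password manager before pasting it into the catalog form, since it guards the whole archive.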
The admin password belongs to the admin superuser you will log in with. The catalog enforces a 12-character minimum. Pick a strong one: it is the front door to every document you archive. You can add more users from inside Paperless-ngx later.
Reach it from anywhere
An archive is most useful when you can pull up a document from your phone at a counter or from a laptop on the road. On the same network as your greffer that works the moment it starts. To reach it from elsewhere you have two honest options.
The simplest is tunnel mode: a greffer connects outbound to the manager's tunnel and serves its apps without opening a single inbound port, which is the answer for a box behind NAT or CGNAT with no public IP. Paperless-ngx is a plain HTTP web app, so it rides the tunnel cleanly. If you would rather expose the greffer directly, port forwarding plus dynamic DNS still works. Either way you reach the archive over HTTPS.
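Whichever route you choose, a quick check from outside your network confirms the archive really answers over HTTPS. Here is a small Python sketch using only the standard library. The function name and the example hostname are hypothetical; substitute the address Greffon assigned to your instance.

```python
import urllib.request


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` succeeds, False on any network failure."""
    try:
        # urlopen verifies the TLS certificate by default, so an invalid
        # certificate is reported as unreachable rather than silently accepted.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers refused connections, timeouts, TLS errors, and HTTP 4xx/5xx
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


# Hypothetical address, shown for illustration only:
# is_reachable("https://paperless.example.com/")
```

Run it from a device that is not on the same network as the greffer, such as a phone hotspot, so the result reflects the tunnel or port-forwarding path rather than the local one.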
Storage and memory
This is the tradeoff most walkthroughs skip. Paperless-ngx keeps both your original files and the OCR output, so storage grows with every document you feed it. A few thousand scanned pages is modest, but years of high-resolution scans add up. Put it on a greffer with room to grow, and watch the disk.
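Watching the disk can be as simple as a script run now and then, or from a scheduler. The sketch below uses Python's standard `shutil.disk_usage`. The function name, the 80% warning threshold, and the path `/` are assumptions; point it at whichever filesystem holds the Paperless-ngx data on your greffer.

```python
import shutil


def disk_report(path: str = "/", warn_fraction: float = 0.8) -> dict:
    """Summarize disk usage for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return {
        "total_gb": round(usage.total / 1e9, 1),
        "free_gb": round(usage.free / 1e9, 1),
        "used_fraction": round(used_fraction, 3),
        # True once usage crosses the (assumed) warning threshold.
        "warn": used_fraction >= warn_fraction,
    }


print(disk_report("/"))
```

A rising `used_fraction` between bulk imports is the signal to move the archive to a bigger disk before it fills.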
OCR is also the heaviest thing it does. Crunching the text out of a scan is CPU and memory work, and it leans on a small stack of helpers (a database, a task queue, a cache) that run alongside the app. It is comfortable on a modest box, but it is not as featherweight as a single-process app. A 1 GB greffer will feel tight under a bulk import.
Back it up first
For an archive, backups are the section that matters most. The whole value of going paperless is that the digital copy becomes the copy of record, often after you have shredded the paper. That makes the greffer the single place those documents live. Greffon handles TLS and routing today, and native one-click backups are coming in M2. Until then, bring your own backup tool (restic or borgbackup are the usual choices), back up the Paperless-ngx data and database on a schedule, and keep a copy off the greffer.
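As one possible shape for the bring-your-own approach, the Python sketch below wraps restic's `backup` and `forget` subcommands. The function names, the retention values, and every path and repository address are assumptions for illustration. Where Paperless-ngx data lives on a greffer, and how to produce a consistent database dump there, depend on your instance and are not covered here.

```python
import os
import subprocess


def restic_backup_commands(repo: str, paths: list[str]) -> list[list[str]]:
    """Build the restic invocations for one backup run (nothing is executed)."""
    if not paths:
        raise ValueError("give at least one path to back up")
    return [
        # Snapshot the given paths into the repository.
        ["restic", "-r", repo, "backup", *paths],
        # Thin out old snapshots; the retention values are an assumed policy.
        ["restic", "-r", repo, "forget",
         "--keep-daily", "7", "--keep-weekly", "4", "--prune"],
    ]


def run_backup(repo: str, paths: list[str], password_file: str) -> None:
    """Run the backup, reading the repository password from a file."""
    env = {**os.environ, "RESTIC_PASSWORD_FILE": password_file}
    for command in restic_backup_commands(repo, paths):
        subprocess.run(command, env=env, check=True)  # raises on failure


# Hypothetical values, shown for illustration only:
# run_backup(
#     repo="sftp:backup@offsite.example:/srv/restic/paperless",
#     paths=["/srv/paperless/data", "/srv/paperless/db-dump.sql"],
#     password_file="/root/.restic-password",
# )
```

Pointing the repository at a different machine is what satisfies the "copy off the greffer" rule. A cron entry or systemd timer can call `run_backup` on a schedule, and an occasional test restore shows whether the backups are actually usable.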
Keep it always-on
You will reach for a document at odd hours from whatever device is in hand, so the archive should be up when you are. Run it on an always-on greffer (a small VPS, a mini-PC, or a free Oracle Cloud box) rather than on a laptop that sleeps at night. The Oracle walkthrough is a good place to get a greffer running before you graft the archive onto it.