eddiesoehnel 2026-06-16 13:35:57 -06:00
parent aca27fe4d3
commit a62d1edc76

@ -0,0 +1,162 @@
Task Summary: Insights Hub Export + Gitea Push Fix
We created a Python script to support your Insights Hub / Portable Identity Document workflow.
The script:
C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\scripts\insights-hub-posts-last-6-months.py
pulls matching JSON records from:
C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\hrecords
filters for records where:
"HubTags": ["External Platform Posts"]
copies matching JSON files into:
C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\hrecords
copies referenced media files from:
C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\files
into:
C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\files
and generates a Markdown digest:
C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\insights-hub-posts-last-6-months.md
with the title:
# Insights Hub Posts - Last 6 Months
The Markdown output is sorted most recent first, covers roughly the last 6 months, and includes each post's title, summary, and referenced image/file where available.
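The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: only the "HubTags" field and the "External Platform Posts" value come from this summary. The function names, and the assumption that each file holds a single JSON object, are hypothetical.

```python
import json
from pathlib import Path

TARGET_TAG = "External Platform Posts"  # tag value taken from the summary


def has_hub_tag(record: dict, tag: str = TARGET_TAG) -> bool:
    """Return True if the record's HubTags list contains the tag."""
    tags = record.get("HubTags", [])
    return isinstance(tags, list) and tag in tags


def find_matching_records(source_dir: Path, tag: str = TARGET_TAG) -> list:
    """Return paths of JSON files in source_dir whose HubTags include tag.

    Assumes one JSON object per file (hypothetical; the real record
    layout is not shown in this summary).
    """
    matches = []
    for path in sorted(source_dir.glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        if isinstance(record, dict) and has_hub_tag(record, tag):
            matches.append(path)
    return matches
```

Copying the matches (for example with shutil.copy2) and writing the digest would build on the returned list. Date filtering and sorting are left out because the date field name is not given here.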
Problem 1: Git push failed with HTTP 413
When syncing to your Gitea repo:
https://projects.eddiesoehnel.com/adminprojects/eddie-soehnel-portable-identity-document-OPEN
Git failed with:
error: RPC failed; HTTP 413 curl 22 The requested URL returned error: 413
fatal: the remote end hung up unexpectedly
The key issue was:
HTTP 413 = Payload Too Large
Your push included a large Git pack, around:
118.25 MiB
because the new Insights Hub export included many image and PDF files under:
data/insights-hub/files/
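One way to spot an oversized push before it fails is to total the size of the media folder. The sketch below is an assumption-laden helper, not part of the original workflow. The function names and the 100 MiB threshold in the example are hypothetical, and on-disk size is only a rough proxy for the Git pack size (images and PDFs compress poorly, so the two tend to be close).

```python
from pathlib import Path

MIB = 1024 * 1024  # bytes per mebibyte


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, recursively."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def exceeds_limit(size_bytes: int, limit_mib: float) -> bool:
    """Return True if size_bytes is larger than limit_mib mebibytes."""
    return size_bytes > limit_mib * MIB


if __name__ == "__main__":
    media_dir = Path("data/insights-hub/files")  # path from the summary
    if media_dir.is_dir():
        size = directory_size_bytes(media_dir)
        print(f"{media_dir}: {size / MIB:.2f} MiB")
        # 100 is a hypothetical threshold, not a value from the server config
        if exceeds_limit(size, 100):
            print("Warning: media folder may exceed the proxy body limit")
```

Run from the repo root, this prints the folder size and a warning when the hypothetical threshold is crossed.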
Problem 2: Misleading “Everything up-to-date” message
Git also printed:
Everything up-to-date
after the failed push.
We treated that as unreliable because the same output also included:
fatal: the remote end hung up unexpectedly
The local branch remained ahead of origin, confirming the push had not actually completed.
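The "ahead of origin" check can be scripted rather than read off the terminal. The sketch below parses the output of git status --porcelain=v2 --branch, which reports ahead/behind counts on a "# branch.ab" line. The function names are hypothetical, and the counts are relative to the local remote-tracking ref, which a failed push leaves unchanged.

```python
import re
import subprocess
from typing import Optional, Tuple


def parse_ahead_behind(status_output: str) -> Optional[Tuple[int, int]]:
    """Extract (ahead, behind) from `git status --porcelain=v2 --branch` output.

    Returns None when there is no `# branch.ab` line, which happens
    when the branch has no upstream configured.
    """
    match = re.search(r"^# branch\.ab \+(\d+) -(\d+)$", status_output, re.MULTILINE)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def commits_ahead(repo_dir: str = ".") -> Optional[int]:
    """Return how many commits the current branch is ahead of its upstream."""
    result = subprocess.run(
        ["git", "status", "--porcelain=v2", "--branch"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    counts = parse_ahead_behind(result.stdout)
    return None if counts is None else counts[0]
```

A non-zero result from commits_ahead() after a push indicates the push did not complete, regardless of any "Everything up-to-date" line in the output.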
Workaround Attempt 1: Exclude media from Git
We temporarily added this to .gitignore:
data/insights-hub/files/
Then we reset and recommitted the lightweight files only:
.gitignore
scripts/insights-hub-posts-last-6-months.py
data/insights-hub/hrecords/
data/insights-hub/insights-hub-posts-last-6-months.md
That allowed Git to avoid the large media payload.
However, this created a new issue: the generated six-month Markdown file referenced local media files, but those files were not present on the Gitea/server side because Git was ignoring them.
Problem 3: Markdown referenced missing media
Since the Markdown file displays images/files from:
data/insights-hub/files/
the media files do need to exist wherever the repo is served or rendered.
So excluding the media folder solved the Git push problem but broke the completeness of the published Markdown output.
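A quick way to detect this kind of breakage is to scan the digest for local link targets and report any that do not exist on disk. The sketch below assumes the digest uses standard Markdown link and image syntax with paths relative to the Markdown file. Both are assumptions, as are the function names. Targets containing unencoded spaces are not handled.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Matches Markdown links and images: [text](target) or ![alt](target)
LINK_PATTERN = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)[^)]*\)")

# Targets with these prefixes are not local files
NON_LOCAL_PREFIXES = ("http://", "https://", "mailto:", "#")


def referenced_local_paths(markdown_text: str) -> list:
    """Return link/image targets that look like local file paths."""
    return [
        target
        for target in LINK_PATTERN.findall(markdown_text)
        if not target.startswith(NON_LOCAL_PREFIXES)
    ]


def missing_media(markdown_file: Path) -> list:
    """Return referenced local targets that do not exist next to the file."""
    text = markdown_file.read_text(encoding="utf-8")
    base = markdown_file.parent
    return [
        target
        for target in referenced_local_paths(text)
        if not (base / unquote(target)).exists()
    ]
```

Running missing_media() against the digest in a fresh clone would list every image or file that the ignored media folder left unresolved.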
Final Fix: Increase Caddy request body limit
We inspected your Caddyfile and found the Gitea reverse proxy block (see the Caddyfile for the new addition).
We added a larger request body limit to that block:
request_body {
max_size 1GB
}
Then Caddy was validated and reloaded:
sudo caddy validate --config /etc/caddy/Caddyfile
sudo systemctl reload caddy
After that, the push worked.
Cloudflare Change
You also turned off the Cloudflare proxy for:
projects.eddiesoehnel.com
and set it to DNS only.
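That was the correct move because Git pushes over HTTPS can hit Cloudflare's own request-body limit when the proxy is enabled (100 MB on the Free and Pro plans), which a 118.25 MiB pack would exceed regardless of the Caddy setting. For a self-hosted Gitea server, the cleaner current setup is: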
Cloudflare DNS only → Caddy HTTPS reverse proxy → Gitea VM
rather than:
Cloudflare proxy → Caddy → Gitea
Current Recommended State
Keep:
projects.eddiesoehnel.com = DNS only
Keep this in the Caddyfile:
request_body {
max_size 1GB
}
Allow the media files in Git if the Markdown digest depends on them being present in the repo.
Longer term, the cleaner architecture would be to separate Git-tracked content from large media deployment:
Git:
- JSON
- Markdown
- scripts
Separate media deployment:
- rsync
- SFTP
- object storage
- static media volume
- Git LFS
But for the current workflow, the practical fix is:
Track the media in Git + raise Caddy upload limit + keep Cloudflare DNS-only for Gitea.