Task Summary: Insights Hub Export + Gitea Push Fix

We created a Python script to support your Insights Hub / Portable Identity Document workflow.

The script:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\scripts\insights-hub-posts-last-6-months.py

pulls matching JSON records from:

C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\hrecords

filters for records where:

"HubTags": ["External Platform Posts"]

copies matching JSON files into:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\hrecords

copies referenced media files from:

C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\files

into:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\files

and generates a Markdown digest:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\insights-hub-posts-last-6-months.md

with the title:

# Insights Hub Posts - Last 6 Months

The Markdown output is sorted most recent first, covers roughly the last 6 months, and includes each post’s title, summary, and referenced image/file where available.

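The filter-and-copy step described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: the function names `has_hub_tag` and `copy_matching_records` are hypothetical, and it assumes each record is a single JSON object per `.json` file and that a record matches when its `HubTags` list contains the tag (the real script may require an exact match).

```python
import json
import shutil
from pathlib import Path

TAG = "External Platform Posts"  # tag value taken from the summary above


def has_hub_tag(record: dict, tag: str = TAG) -> bool:
    """Return True when the record's HubTags list contains the tag (assumed rule)."""
    tags = record.get("HubTags") or []
    return isinstance(tags, list) and tag in tags


def copy_matching_records(source_dir: Path, dest_dir: Path, tag: str = TAG) -> list[Path]:
    """Copy every *.json file in source_dir whose HubTags include the tag."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source_dir.glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue  # skip unreadable files rather than abort the whole export
        if isinstance(record, dict) and has_hub_tag(record, tag):
            target = dest_dir / path.name
            shutil.copy2(path, target)  # copy2 also preserves file timestamps
            copied.append(target)
    return copied
```

Calling `copy_matching_records(Path(source), Path(dest))` with the two hrecords folders listed above would return the list of copied files, which could then feed the digest step.
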
Problem 1: Git push failed with HTTP 413

When syncing to your Gitea repo:

https://projects.eddiesoehnel.com/adminprojects/eddie-soehnel-portable-identity-document-OPEN

Git failed with:

error: RPC failed; HTTP 413 curl 22 The requested URL returned error: 413
fatal: the remote end hung up unexpectedly

The key issue was:

HTTP 413 = Payload Too Large

Your push included a large Git pack, around:

118.25 MiB

because the new Insights Hub export included many image and PDF files under:

data/insights-hub/files/

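One way to anticipate this kind of failure is to total the size of the media folder before committing. The sketch below is only a rough proxy: Git compresses objects into a pack, so the pushed size will differ, although images and PDFs usually compress poorly. The function names and the limit value passed in are hypothetical and not taken from the actual server configuration.

```python
from pathlib import Path

MIB = 1024 * 1024


def directory_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, recursively."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def exceeds_limit(root: Path, limit_bytes: int) -> bool:
    """Return True when the folder's total size is larger than limit_bytes."""
    return directory_size_bytes(root) > limit_bytes


if __name__ == "__main__":
    media = Path("data/insights-hub/files")
    if media.is_dir():
        size = directory_size_bytes(media)
        print(f"{media}: {size / MIB:.2f} MiB")
        # 100 MiB is an illustrative threshold, not a known server setting.
        print("over 100 MiB:", exceeds_limit(media, 100 * MIB))
```
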
Problem 2: Misleading “Everything up-to-date” message

Git also printed:

Everything up-to-date

after the failed push.

We treated that as unreliable because the same output also included:

fatal: the remote end hung up unexpectedly

The local branch remained ahead of origin, confirming the push had not actually completed.

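The "still ahead of origin" check can be scripted instead of read by eye. The sketch below parses the branch header that `git status -sb` prints (for example `## main...origin/main [ahead 1]`). The helper names are hypothetical, and the count reflects the last known remote-tracking ref, which Git only advances after a successful push.

```python
import re
import subprocess

# Matches "[ahead N" in headers such as "[ahead 1]" or "[ahead 1, behind 2]".
_AHEAD = re.compile(r"\[ahead (\d+)")


def parse_ahead(status_header: str) -> int:
    """Extract the 'ahead N' count from the first line of `git status -sb`."""
    match = _AHEAD.search(status_header)
    return int(match.group(1)) if match else 0


def commits_ahead(repo_dir: str = ".") -> int:
    """Run `git status -sb` in repo_dir and return how many commits are unpushed."""
    result = subprocess.run(
        ["git", "status", "-sb"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    lines = result.stdout.splitlines()
    return parse_ahead(lines[0]) if lines else 0
```

A non-zero result from `commits_ahead()` right after a push is a sign that the push did not land, whatever the "Everything up-to-date" line says.
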
Workaround Attempt 1: Exclude media from Git

We temporarily added this to .gitignore:

data/insights-hub/files/

Then we reset and recommitted the lightweight files only:

.gitignore
scripts/insights-hub-posts-last-6-months.py
data/insights-hub/hrecords/
data/insights-hub/insights-hub-posts-last-6-months.md

That allowed Git to avoid the large media payload.

However, this created a new issue: the generated six-month Markdown file referenced local media files, but those files were not present on the Gitea/server side because Git was ignoring them.

Problem 3: Markdown referenced missing media

Since the Markdown file displays images/files from:

data/insights-hub/files/

the media files do need to exist wherever the repo is served or rendered.

So excluding the media folder solved the Git push problem but broke the completeness of the published Markdown output.

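A quick way to detect this situation is to scan the digest for link and image targets and check that each one exists on disk. The sketch below assumes the digest uses standard Markdown link syntax with relative targets such as `files/example.png`, resolved against the digest's own folder. That path layout, the `media_prefix` default, and the function name `missing_media` are assumptions, and URL-encoded targets (for example `%20`) are not decoded here.

```python
import re
from pathlib import Path

# Captures the target of Markdown links and images: [text](target) or ![alt](target)
_LINK_TARGET = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)")


def missing_media(markdown_path: Path, media_prefix: str = "files/") -> list[str]:
    """Return referenced media targets that do not exist next to the Markdown file."""
    text = markdown_path.read_text(encoding="utf-8")
    base = markdown_path.parent
    missing = []
    for target in _LINK_TARGET.findall(text):
        # Only check local media references; skip external URLs and other links.
        if target.startswith(media_prefix) and not (base / target).exists():
            missing.append(target)
    return missing
```

Running this against a fresh clone of the repo would list every reference that the `.gitignore` exclusion left dangling.
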
Final Fix: Increase Caddy request body limit

We inspected your Caddyfile and found the Gitea reverse proxy block (see the Caddyfile for the new addition).

We added a larger request body limit to that block:

request_body {
    max_size 1GB
}

Then Caddy was validated and reloaded:

sudo caddy validate --config /etc/caddy/Caddyfile
sudo systemctl reload caddy

After that, the push worked.

Cloudflare Change

You also turned off the Cloudflare proxy for:

projects.eddiesoehnel.com

and set it to DNS only.

That was the correct move because Git pushes over HTTPS can hit Cloudflare request-body limits. For a self-hosted Gitea server, the cleaner current setup is:

Cloudflare DNS only → Caddy HTTPS reverse proxy → Gitea VM

rather than:

Cloudflare proxy → Caddy → Gitea

Current Recommended State

Keep:

projects.eddiesoehnel.com = DNS only

Keep this in the Caddyfile:

request_body {
    max_size 1GB
}

Allow the media files in Git if the Markdown digest depends on them being present in the repo.

Longer term, the cleaner architecture would be to separate Git-tracked content from large media deployment:

Git:
- JSON
- Markdown
- scripts

Separate media deployment:
- rsync
- SFTP
- object storage
- static media volume
- Git LFS

But for the current workflow, the practical fix is:

Track the media in Git + raise Caddy upload limit + keep Cloudflare DNS-only for Gitea.