eddie-soehnel-portable-iden.../tasks/2026-06-16-insights-hub-publish-script-create-git-push-fail-workaround.md
eddiesoehnel a62d1edc76 added
2026-06-16 13:35:57 -06:00


Task Summary: Insights Hub Export + Gitea Push Fix

We created a Python script to support your Insights Hub / Portable Identity Document workflow.

The script:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\scripts\insights-hub-posts-last-6-months.py

pulls matching JSON records from:

C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\hrecords

filters for records where:

"HubTags": ["External Platform Posts"]

copies matching JSON files into:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\hrecords

copies referenced media files from:

C:\Users\edsoe\My Drive\Personal Organization\JSON_Database\files

into:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\files

and generates a Markdown digest:

C:\projects\ES\eddie-soehnel-portable-identity-document-OPEN\data\insights-hub\insights-hub-posts-last-6-months.md

with the title:

Insights Hub Posts - Last 6 Months

The Markdown output is sorted most recent first, covers roughly the last 6 months, and includes each post's title, summary, and referenced image/file where available.
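The selection and ordering steps above can be sketched as follows. This is a minimal illustration, not the actual contents of insights-hub-posts-last-6-months.py: only the "HubTags" field and its value come from this summary, while the "Date" key, the ISO date format without a timezone, and the 183-day window are assumptions made for the example.

```python
import json
from datetime import datetime, timedelta
from pathlib import Path

HUB_TAG = "External Platform Posts"  # tag value taken from the summary above


def load_matching_records(src_dir, tag=HUB_TAG):
    """Return (path, record) pairs whose "HubTags" list contains `tag`."""
    matches = []
    for path in sorted(Path(src_dir).glob("*.json")):
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        tags = record.get("HubTags") if isinstance(record, dict) else None
        if isinstance(tags, list) and tag in tags:
            matches.append((path, record))
    return matches


def recent_first(records, now, days=183, date_key="Date"):
    """Keep records dated within `days` of `now`, newest first.

    `date_key` and the naive ISO date format are assumptions,
    not the real hrecords schema.
    """
    cutoff = now - timedelta(days=days)
    dated = []
    for path, record in records:
        try:
            when = datetime.fromisoformat(record[date_key])
        except (KeyError, TypeError, ValueError):
            continue  # no usable date, so leave it out of the digest
        if when >= cutoff:
            dated.append((when, path, record))
    dated.sort(key=lambda item: item[0], reverse=True)
    return [(path, record) for _, path, record in dated]
```

Calling recent_first(load_matching_records(source_dir), now=datetime.now()) would yield the records in the order the digest lists them. Copying the JSON and media files could then be done with shutil.copy2 for each returned path.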

Problem 1: Git push failed with HTTP 413

When syncing to your Gitea repo:

https://projects.eddiesoehnel.com/adminprojects/eddie-soehnel-portable-identity-document-OPEN

Git failed with:

error: RPC failed; HTTP 413 curl 22 The requested URL returned error: 413
fatal: the remote end hung up unexpectedly

The key issue was:

HTTP 413 = Payload Too Large

Your push included a large Git pack, around:

118.25 MiB

because the new Insights Hub export included many image and PDF files under:

data/insights-hub/files/

Problem 2: Misleading “Everything up-to-date” message

Git also printed:

Everything up-to-date

after the failed push.

We treated that as unreliable because the same output also included:

fatal: the remote end hung up unexpectedly

The local branch remained ahead of origin, confirming the push had not actually completed.
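One way to confirm that a push really landed, instead of trusting the final status line, is to count the commits the local branch has that its upstream does not. The sketch below assumes the current branch has an upstream configured; the helper names are illustrative. It relies on the remote-tracking ref, which Git advances only after a successful push.

```python
import subprocess


def parse_ahead_behind(output):
    """Parse `git rev-list --left-right --count A...B` output into (ahead, behind)."""
    ahead, behind = output.split()
    return int(ahead), int(behind)


def commits_ahead_of_upstream(repo_dir="."):
    """Return how many local commits the upstream tracking branch lacks."""
    result = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", "HEAD...@{upstream}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    ahead, _behind = parse_ahead_behind(result.stdout)
    return ahead
```

A non-zero result after a push means the commits are still local only, whatever the console printed.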

Workaround Attempt 1: Exclude media from Git

We temporarily added this to .gitignore:

data/insights-hub/files/

Then we reset and recommitted the lightweight files only:

.gitignore
scripts/insights-hub-posts-last-6-months.py
data/insights-hub/hrecords/
data/insights-hub/insights-hub-posts-last-6-months.md

That allowed Git to avoid the large media payload.

However, this created a new issue: the generated six-month Markdown file referenced local media files, but those files were not present on the Gitea/server side because Git was ignoring them.

Problem 3: Markdown referenced missing media

Since the Markdown file displays images/files from:

data/insights-hub/files/

the media files do need to exist wherever the repo is served or rendered.

So excluding the media folder solved the Git push problem but broke the completeness of the published Markdown output.
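A quick check for this kind of breakage is to scan the digest for link and image targets and report the ones that do not exist next to it. The sketch below assumes the digest uses standard Markdown link syntax with relative paths (not HTML img tags); the function name is hypothetical.

```python
import re
from pathlib import Path
from urllib.parse import unquote

# Captures the target of Markdown links and images: [text](target) / ![alt](target)
LINK_TARGET = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)")
# Matches targets that start with a URL scheme such as https: or mailto:
URL_SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def missing_media(markdown_path):
    """Return relative link targets in the digest that do not exist on disk."""
    markdown_path = Path(markdown_path)
    text = markdown_path.read_text(encoding="utf-8")
    missing = []
    for target in LINK_TARGET.findall(text):
        if URL_SCHEME.match(target) or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        # Resolve relative to the digest's folder; decode %20-style escapes
        if not (markdown_path.parent / unquote(target)).exists():
            missing.append(target)
    return missing
```

Run against a checkout where data/insights-hub/files/ is ignored, this would list every media file the digest expects but the repo does not carry.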

Final Fix: Increase Caddy request body limit

We inspected your Caddyfile and found the Gitea reverse proxy block:

See the Caddyfile for the new addition.

We added a larger request body limit:

request_body {
    max_size 1GB
}

Then Caddy was validated and reloaded:

sudo caddy validate --config /etc/caddy/Caddyfile
sudo systemctl reload caddy

After that, the push worked.
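As a sanity check on the numbers, the pack size can be compared with the new limit. The sketch below is a hypothetical helper, not part of Caddy or Git. It assumes "GB" is read as a decimal unit (10^9 bytes) and "MiB" as a binary unit, and that the HTTP request body is roughly the size of the pack.

```python
import re

# Decimal (KB, MB, GB) and binary (KiB, MiB, GiB) multipliers, keyed in upper case.
UNITS = {
    "B": 1,
    "KB": 1000, "MB": 1000**2, "GB": 1000**3,
    "KIB": 1024, "MIB": 1024**2, "GIB": 1024**3,
}


def parse_size(text):
    """Convert a size such as '1GB' or '118.25 MiB' to a byte count."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).upper() not in UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit.upper()])


def fits(payload, limit):
    """Return True when the payload size is within the limit."""
    return parse_size(payload) <= parse_size(limit)


print(parse_size("118.25 MiB"))    # 123994112 bytes
print(fits("118.25 MiB", "1GB"))   # True
```

The 118.25 MiB pack is about 124 MB, so a 1GB limit leaves ample headroom under either reading of the unit.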

Cloudflare Change

You also turned off the Cloudflare proxy for:

projects.eddiesoehnel.com

and set it to DNS only.

That was the correct move because Cloudflare's proxy caps request body size by plan (100 MB on the Free and Pro plans), so a large Git push over HTTPS can be rejected before it ever reaches Caddy. For a self-hosted Gitea server, the cleaner setup is:

Cloudflare DNS only → Caddy HTTPS reverse proxy → Gitea VM

rather than:

Cloudflare proxy → Caddy → Gitea

Current Recommended State

Keep:

projects.eddiesoehnel.com = DNS only

Keep this in the Caddyfile:

request_body {
    max_size 1GB
}

Allow the media files in Git if the Markdown digest depends on them being present in the repo.

Longer term, the cleaner architecture would be to separate Git-tracked content from large media deployment:

Git:

  • JSON
  • Markdown
  • scripts

Separate media deployment:

  • rsync
  • SFTP
  • object storage
  • static media volume
  • Git LFS

But for the current workflow, the practical fix is:

Track the media in Git + raise Caddy upload limit + keep Cloudflare DNS-only for Gitea.