Gitea's Container Registry Just Got More Reliable: Fixing Data Races in Concurrent Blob Uploads (PR #36524 + Backport #36526)
- Part 1: How fzf Just Became Way More Memory-Efficient: The Bitmap Cache Upgrade
- Part 2: Unlocking Faster Fuzzy Finding: How a Smart Work Queue Made fzf Even Quicker
- Part 3: Speeding Up fzf: How a Tiny Change Made the World's Fastest Fuzzy Finder Even Faster
- Part 4: Fixing a Sneaky XSS Vulnerability in Hugo: Inside the Commit That Makes Markdown Rendering Safer
- Part 5: Fixing Syncthing’s REST API: Why “application/json; charset=utf-8” Was Wrong and How a Simple Commit Made It Right
- Part 6: Fixing UI Lag | Lazygit just got a major speed boost for one of its most frustrating pain points
- Part 7: Traefik PR #12880: A Tiny Fix That Squashes Unnecessary Allocations and Kills Routing Lag
- Part 8: Fixing Header Sanitization in Traefik’s PassTLSClientCert Middleware: A Deep Dive into PR #12875
- Part 9: Fixing URL Prefix Stripping in Traefik: Inside Pull Request #12863
- Part 10: Gitea Just Got More Secure: Fixing OAuth2 Authorization Code Expiry and Reuse
- Part 11: Fixing a Sneaky Security Bug in Gitea: Users Could No Longer Change Someone Else’s Primary Email
- Part 12: Securing Gitea Repository Templates: A Deep Dive into PR #36734 & #36746
- Part 13: Gitea's Container Registry Just Got More Reliable: Fixing Data Races in Concurrent Blob Uploads (PR #36524 + Backport #36526)
TL;DR
When you build multiple Docker/OCI images that share the same base layers (super common with BuildKit), Gitea’s container registry used to hit intermittent failures — 400 Bad Request errors, flaky pushes, or worse.
Two small PRs fixed it: the main change (#36524) and its backport to the v1.25 release branch (#36526).
Result: No more data races. Uploads are safe, fast, and predictable. Merged in February 2026 and already in the changelog.
The Problem (Why It Happened)
Gitea’s built-in Container Registry stores Docker/OCI images as “blobs” (compressed layers). Each blob has a unique SHA-256 hash.
Modern build tools (BuildKit, docker buildx, etc.) are highly concurrent:
- They upload the same base layer (e.g., ubuntu:24.04) to the registry multiple times in parallel when you build several images at once.
- Gitea’s code tried to save the blob to the database and storage at the same time from different goroutines.
Without protection, this caused a classic data race:
- Two goroutines read “blob doesn’t exist yet” → both try to INSERT → one fails → or worse, internal state (like file handles in BlobUploader) got corrupted.
- Real-world symptom: “400 Bad Request” from the registry during parallel builds.
Before the fix (simplified flow):
- Upload blob → NewBlobUploader()
- saveAsPackageBlob() → no lock
- GetOrInsertBlob() → direct INSERT (race window)
- SDK (e.g. MinIO) or defer Close() could clash → race!
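The race window in that flow can be reproduced in isolation. The sketch below is not Gitea code: `blobTable`, `racyUploads`, and the barrier are hypothetical names made up for illustration, and an in-memory map stands in for a database table with a unique index on the hash. It forces every upload to finish its "does the blob exist?" check before any of them inserts, which is the interleaving that made one request win and the others fail.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errDuplicate = errors.New("duplicate key")

// blobTable is a hypothetical stand-in for a table with a unique index on the hash.
type blobTable struct {
	mu   sync.Mutex
	rows map[string]bool
}

// exists plays the role of the SELECT.
func (t *blobTable) exists(hash string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.rows[hash]
}

// insert plays the role of the INSERT and rejects duplicates like a unique index would.
func (t *blobTable) insert(hash string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.rows[hash] {
		return errDuplicate
	}
	t.rows[hash] = true
	return nil
}

// racyUploads runs n uploads of the same hash with a barrier between the
// check and the insert, and returns how many uploads failed.
func racyUploads(n int, hash string) (failures int) {
	table := &blobTable{rows: map[string]bool{}}
	var checked, done sync.WaitGroup
	checked.Add(n)
	done.Add(n)
	errs := make([]error, n)
	for i := 0; i < n; i++ {
		go func(i int) {
			defer done.Done()
			found := table.exists(hash)
			checked.Done()
			checked.Wait() // barrier: everyone has checked, nobody has inserted
			if !found {
				errs[i] = table.insert(hash)
			}
		}(i)
	}
	done.Wait()
	for _, err := range errs {
		if err != nil {
			failures++
		}
	}
	return failures
}

func main() {
	fmt.Println("failed uploads:", racyUploads(3, "sha256:abc"))
}
```

With three uploads, one insert succeeds and the other two hit the duplicate error. In a real registry the interleaving depends on timing, which is why the failures looked intermittent.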
The Fix (What, How, and Why the Commits Were Made)
The PRs made three targeted changes — all tiny, all surgical, and all aimed at making concurrent blob uploads 100% safe.
1. Global per-blob locking (the main guardrail)
File changed: routers/api/packages/container/blob.go
Before:
func saveAsPackageBlob(...) { ... } // no protection
After (key addition):
// There will be concurrent uploading for the same blob,
// so it needs a global lock per blob hash
func saveAsPackageBlob(ctx context.Context, hsr packages_module.HashedSizeReader, pci *packages_service.PackageCreationInfo) (*packages_model.PackageBlob, error) {
	pb := packages_service.NewPackageBlob(hsr)
	err := globallock.LockAndDo(ctx, "container-blob:"+pb.HashSHA256, func(ctx context.Context) error {
		var err error
		pb, err = saveAsPackageBlobInternal(ctx, hsr, pci, pb)
		return err
	})
	return pb, err
}
Why this works:
Only one goroutine can process a given blob hash at a time. Others wait politely. No more races. The lock key is the SHA-256 hash — perfect granularity.
2. Race-safe database insert (defensive retry)
File changed: models/packages/package_blob.go
Before:
Straight INSERT → fail if another request won the race.
After:
if _, err = e.Insert(pb); err != nil {
	// Handle race condition: another request may have inserted
	// the same blob between our SELECT and INSERT.
	if has, _ = e.Where(hashCond).Get(existing); has {
		return existing, true, nil // return the one that already exists
	}
	return nil, false, err
}
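Why: Even with the lock, a small window between the SELECT and the INSERT can remain under heavy load. With this fallback, a request that loses the INSERT re-reads the table and returns the row the winner created, so GetOrInsertBlob behaves idempotently instead of surfacing the error.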
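The select, insert, re-select shape can be tried out without a database. The following is a sketch under stated assumptions, not the code from the PR: `memTable`, `getOrInsert`, and `concurrentGetOrInsert` are hypothetical names, and a mutex-guarded map stands in for a table with a unique index on the hash. The small harness fires many callers at once and counts how many distinct IDs come back and how many callers actually created the row.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errDuplicate = errors.New("unique constraint violated")

type blob struct {
	ID   int64
	Hash string
}

// memTable is a hypothetical table with a unique index on Hash.
type memTable struct {
	mu     sync.Mutex
	byHash map[string]*blob
	nextID int64
}

func (t *memTable) get(hash string) (*blob, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	b, ok := t.byHash[hash]
	return b, ok
}

func (t *memTable) insert(hash string) (*blob, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.byHash[hash]; ok {
		return nil, errDuplicate
	}
	t.nextID++
	b := &blob{ID: t.nextID, Hash: hash}
	t.byHash[hash] = b
	return b, nil
}

// getOrInsert returns the blob for hash; the bool reports whether it already existed.
func getOrInsert(t *memTable, hash string) (*blob, bool, error) {
	if b, ok := t.get(hash); ok {
		return b, true, nil
	}
	b, err := t.insert(hash)
	if err != nil {
		// Someone else may have inserted between our get and insert: re-read.
		if existing, ok := t.get(hash); ok {
			return existing, true, nil
		}
		return nil, false, err
	}
	return b, false, nil
}

// concurrentGetOrInsert releases n callers at once and summarizes the results.
func concurrentGetOrInsert(n int, hash string) (distinctIDs, created int) {
	table := &memTable{byHash: map[string]*blob{}}
	start := make(chan struct{})
	ids := make([]int64, n)
	existed := make([]bool, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-start
			b, ok, err := getOrInsert(table, hash)
			if err != nil {
				panic(err)
			}
			ids[i], existed[i] = b.ID, ok
		}(i)
	}
	close(start)
	wg.Wait()
	seen := map[int64]bool{}
	for i := range ids {
		seen[ids[i]] = true
		if !existed[i] {
			created++
		}
	}
	return len(seen), created
}

func main() {
	ids, created := concurrentGetOrInsert(16, "sha256:abc")
	fmt.Println("distinct IDs:", ids, "created:", created)
}
```

Every caller ends up with the same ID and exactly one of them reports having created the row, which is the same property the new concurrency test asserts against the real model code.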
3. New concurrency test (proof it works)
New file: models/packages/package_blob_test.go
func TestGetOrInsertBlobConcurrent(t *testing.T) {
	// 3 goroutines try to insert the exact same blob at the same time
	wg.Go(...) // x3
	// Assert: all get the same blob ID, only ONE created it
}
This test uses golang.org/x/sync/errgroup and runs in CI, so a future change that reintroduces the bug should be caught before it merges.
Files touched in total (both PRs):
- models/packages/package_blob.go
- models/packages/package_blob_test.go (new)
- routers/api/packages/container/blob.go
- services/packages/container/blob_uploader.go (minor cleanup)
No breaking changes. Zero user-facing API impact.
Goals of the PRs: Clearly Achieved ✅
| Goal | Status | How It Was Done |
|---|---|---|
| Eliminate data race on concurrent blob uploads | ✅ | Global lock + retry logic |
| Fix real-world 400 errors during parallel builds | ✅ | Tested by @noeljackson with 3 simultaneous package builds sharing layers |
| Keep performance high | ✅ | Lock is per-blob-hash (not global) and very short-lived |
| Prevent future regressions | ✅ | New unit test + race-detector friendly code |
| Backport to stable v1.25 | ✅ | #36526 (merged next day) |
Who Made It Happen
- Author: @noeljackson — spotted the issue in production and delivered the fix.
- Backport: GiteaBot (automated)
- Reviewers & Approvers: @wxiaoguang, @techknowlogick, @lunny, @TheFox0x7
- Merged into main (Feb 3, 2026) and release/v1.25 (Feb 4, 2026)
Bottom Line
If you use Gitea’s container registry (docker push your-gitea.example.com/org/image), especially with multi-image or multi-arch builds, this fix makes your life noticeably smoother. No more mysterious 400s when layers are shared.
Try it today — update to the latest Gitea (or v1.25.5+). Push a few images that share a base layer and watch it just… work.
Links
- Main PR: https://github.com/go-gitea/gitea/pull/36524
- Backport: https://github.com/go-gitea/gitea/pull/36526
- Changelog entry: Search “data race when uploading container blobs” in CHANGELOG.md
Happy pushing! 🐳
(And huge thanks to the Gitea contributors who keep the registry rock-solid.)