Replace usage of Fatale with context cancellation #1639
ggilder wants to merge 14 commits into github:master
This patch modifies gh-ost to use a cancellable context instead of log.Fatale() in listenOnPanicAbort. When using gh-ost as a library, this allows the calling application to recover from aborts (e.g. log the failure reason) instead of having the entire process terminate via os.Exit(). Now we store the error and cancel a context to signal all goroutines to stop gracefully.
Pull request overview
This PR replaces abort handling that previously exited the process (log.Fatale) with context cancellation + “first error wins” error propagation, improving gh-ost’s usability as a library while preserving CLI behavior.
Changes:
- Added context.Context + abort error storage to MigrationContext, and cancelled context on abort.
- Updated long-running loops and abort send sites to respect context cancellation.
- Added unit tests for abort propagation and first-error-wins behavior.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| go/base/context.go | Adds cancellable context + abort error storage APIs to MigrationContext. |
| go/base/context_test.go | Tests for first-error-wins abort error storage and thread-safety. |
| go/logic/migrator.go | Replaces Fatale with error storage + context cancellation; adds checkAbort() and integrates it into migration flow. |
| go/logic/migrator_test.go | New tests validating abort propagation and checkAbort() behavior. |
| go/logic/applier.go | Heartbeat loop now exits on context cancellation; abort send is context-aware. |
| go/logic/throttler.go | Throttle checks now exit on context cancellation; abort sends are context-aware. |
| go/logic/streamer.go | Streaming loop now checks for context cancellation. |
| go/logic/server.go | “panic” command abort send is context-aware. |
This is great!
@GGlider Looks like the migration tests are hanging for some reason. I'll take a look later if I get the chance.
@meiji163 hmm, interesting. I did see some tests hanging locally before, and I thought I had resolved that. However, I just tried running locally and it's hanging at the same spot. I'll take another look on Monday.
@meiji163 think I found the spot that was deadlocking, and I added a helper to encapsulate the non-blocking channel send pattern. I believe the replica tests should pass now.
Looks like it's still hanging somewhere 😢 I'll keep testing, maybe I can run the tests in a container to more closely approximate CI.
@meiji163 could you approve the workflow run again? I think this latest commit will fix it.
Related issue: #1635
Description
This PR replaces log.Fatale error handling in listenOnPanicAbort() with context-based cancellation and proper error propagation, enabling gh-ost to be used as a Go library while maintaining the same CLI behavior.
The previous implementation called Fatale in a separate goroutine when fatal errors occurred. This had several issues.
I've implemented a context-based cancellation system with proper error propagation.
This also makes abort scenarios more testable.
script/cibuild returns with no formatting errors, build errors or unit test errors.