ci: add automated upstream sync workflow #18

Closed
KHAEntertainment wants to merge 1 commit into main from upstream-sync-2026-Q1

KHAEntertainment (Collaborator) commented Apr 1, 2026

Twice-weekly (Mon/Thu 06:00 UTC) workflow that fetches sgl-project/sglang, attempts a merge, classifies risk based on fork-differentiated paths, and opens a PR with appropriate labels. Merges that hit conflicts create an issue instead. Low-risk clean merges get auto-merge enabled.

Motivation

Modifications

Accuracy Tests

Speed Tests and Profiling

Checklist

Review and Merge Process

  1. Ping Merge Oncalls to start the process. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • Common commands include /tag-and-rerun-ci, /tag-run-ci-label, /rerun-failed-ci
  4. After green CI and required approvals, ask Merge Oncalls or people with Write permission to merge the PR.

Summary by CodeRabbit

  • Chores
    • Implemented automated synchronization between the fork and upstream repository. The system performs regular scheduled syncs, automatically creating pull requests with detailed change summaries and risk level assessments for successful merges. When merge conflicts occur, the system generates issues documenting the conflicting files and providing recommended resolution steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 1, 2026

Walkthrough

A new GitHub Actions workflow automates periodic syncing of a fork with an upstream repository's main branch. The workflow checks for commits behind, merges upstream changes, classifies merge risk based on conflict patterns, and creates PRs or issues with appropriate labels and auto-merge settings based on risk level.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Upstream Sync Workflow: .github/workflows/upstream-sync.yml | New workflow implementing scheduled and manual fork synchronization with upstream. Includes merge conflict detection, risk classification (low/medium/high) based on sensitive path intersections, PR creation with conditional auto-merge for low-risk syncs, and issue creation for merge failures. Supports dry-run mode and generates workflow summary reporting commits behind, merge status, and risk level. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The review requires understanding of GitHub Actions syntax, Git operations, conditional workflow logic, risk classification heuristics, and PR/issue creation APIs. Multiple sequential decision branches and the need to verify merge conflict handling and risk categorization logic add complexity.

Poem

🔄 ✨ A sync-bot awakens with schedule and grace,
Merging upstream changes to keep pace,
Risk-tamed and conflicts contained with care,
PRs auto-merge when the path is clear!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description includes a clear summary of the workflow's purpose, but the required template sections (Motivation, Modifications, Accuracy Tests, Speed Tests) remain unfilled with substantive content. | Fill in the Motivation section explaining why this automation is needed, and the Modifications section detailing the workflow implementation details and files added. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title clearly and concisely describes the main change: adding an automated upstream sync workflow for CI purposes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/upstream-sync.yml:
- Around line 300-312: Add a single trailing newline at the end of the Workflow
summary step block so the file ends with a newline character; locate the "name:
Workflow summary" step (the run block that appends lines to
$GITHUB_STEP_SUMMARY) and ensure there is one final blank line/newline at EOF so
the end-of-file-fixer lint rule stops flagging the file.
- Around line 279-296: The issue creation branch currently passes labels
['upstream-sync','needs-review'] directly to github.rest.issues.create which can
422 if those labels don't exist; before calling the create block that builds the
issue (the code that references conflicts and calls github.rest.issues.create)
either call the same label bootstrap/check routine used in the PR creation path
to ensure 'upstream-sync' and 'needs-review' exist, or remove labels from the
create call and immediately call github.rest.issues.addLabels(owner, repo,
issue_number, ['upstream-sync','needs-review']) after the issue is created so
labels are added only once they exist.
- Around line 23-26: The workflow's permissions block currently grants only
contents and pull-requests write access but omits issues, causing Issues API
calls to 403; update the permissions mapping in the permissions block (the YAML
key named permissions) to include "issues: write" so the steps that use the
Issues API (label checks/creation, attaching labels, and creating a failure
issue) have the required permission; ensure you add the "issues: write" entry
alongside "contents: write" and "pull-requests: write" in the existing
permissions block.
- Around line 83-85: The branch name generation using BRANCH="${{
env.SYNC_BRANCH_PREFIX }}-$(date +%Y%m%d)" is not unique per workflow run;
change the BRANCH creation (the BRANCH variable assignment) to include the
GITHUB_RUN_ID (e.g., append -${{ github.run_id }} or ${GITHUB_RUN_ID}) so each
workflow execution produces a unique branch name and avoids non-fast-forward
push rejections when pushing the branch.
- Around line 79-96: The merge step (id: merge) fails due to missing git
committer identity before the git merge command; add commands in the same run
block, before the "git merge upstream/${{ env.UPSTREAM_BRANCH }}" invocation, to
configure git user.name and user.email (e.g., using the workflow actor/email) so
the commit/merge can succeed; ensure these git config settings are applied
locally in the created branch (used by BRANCH and pre_merge_sha) so the
subsequent conditional doesn't misclassify an identity error as a merge
conflict.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4c9d1942-8117-4937-900f-0045ea942589

📥 Commits

Reviewing files that changed from the base of the PR and between d51ed61 and 553f220.

📒 Files selected for processing (1)
  • .github/workflows/upstream-sync.yml

Comment on lines +23 to +26
permissions:
  contents: write
  pull-requests: write

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cd .github/workflows && cat -n upstream-sync.yml | head -40

Repository: KHAEntertainment/sglang-mamba

Length of output: 1762


🏁 Script executed:

cd .github/workflows && cat -n upstream-sync.yml | sed -n '210,250p'

Repository: KHAEntertainment/sglang-mamba

Length of output: 1822


🏁 Script executed:

cd .github/workflows && cat -n upstream-sync.yml | sed -n '275,300p'

Repository: KHAEntertainment/sglang-mamba

Length of output: 1716


🏁 Script executed:

cd .github/workflows && grep -n "github-script\|GITHUB_TOKEN\|token:" upstream-sync.yml | head -20

Repository: KHAEntertainment/sglang-mamba

Length of output: 258


Add issues: write to the permissions block.

The workflow uses the Issues API in three places—to check and create labels (lines 214–228), attach labels to the PR (lines 241–246), and create an issue when merges fail (lines 280–296)—but the permissions block doesn't grant issues access. In GitHub Actions, when you explicitly set permissions, any scope you don't mention gets locked down to none. This means all those Issues API calls will hit a 403 error, breaking both the happy path (PR + labels won't be created) and the error path (conflict reporting will fail silently).

🔧 Suggested fix
permissions:
  contents: write
+ issues: write
  pull-requests: write

Comment on lines +79 to +96
      - name: Create sync branch and attempt merge
        if: steps.drift.outputs.skip == 'false'
        id: merge
        run: |
          BRANCH="${{ env.SYNC_BRANCH_PREFIX }}-$(date +%Y%m%d)"
          git checkout -b "$BRANCH"
          echo "sync_branch=$BRANCH" >> "$GITHUB_OUTPUT"

          # Save pre-merge HEAD for later diff
          echo "pre_merge_sha=$(git rev-parse HEAD)" >> "$GITHUB_OUTPUT"

          # Attempt the merge
          MERGE_MSG="chore: sync with upstream sgl-project/sglang ($(date +%Y-%m-%d))"
          MERGE_MSG="${MERGE_MSG}

          Merges ${{ steps.drift.outputs.commits_behind }} commits from upstream."

          if git merge upstream/${{ env.UPSTREAM_BRANCH }} --no-edit -m "$MERGE_MSG"; then

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

actions/checkout git user.name user.email configuration

💡 Result:

The actions/checkout GitHub Action does not automatically configure git user.name or user.email. Users must set these manually in their workflow after checkout if they plan to perform git commits or other operations requiring author identity.

To configure for GitHub Actions bot commits (recommended for proper attribution and avatar):

    - name: Configure Git
      run: |
        git config user.name "github-actions[bot]"
        git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

Alternative common configurations:

  • From last commit: git config user.name "$(git log -1 --pretty=format:'%an')" && git config user.email "$(git log -1 --pretty=format:'%ae')"
  • Simple: git config user.name "github-actions" && git config user.email "github-actions@github.com"

Use --global or --local as needed (local is typical for repo-specific settings). This is documented in the official actions/checkout README example for pushing commits. Note: Older suggestions used simpler emails, but the bot format ensures correct GitHub recognition.


🌐 Web query:

git merge committer identity required github actions

💡 Result:

The error "committer identity required" or "Author identity unknown *** Please tell me who you are" occurs in GitHub Actions when performing git operations like merge, commit, or push without configuring git's user.name and user.email. Git requires these for any commit creation, including merge commits.

To fix this in your GitHub Actions workflow, add a step after actions/checkout to configure the git identity (local to the repo is sufficient):

    - name: Configure Git
      run: |
        git config user.name "github-actions[bot]"
        git config user.email "41898282+github-actions[bot]@users.noreply.github.com"

This uses the official GitHub Actions bot identity, ensuring commits are properly attributed with the bot avatar.

Alternative common configurations:

  • Copy from previous commit: git config user.name "$(git log -1 --pretty=format:'%an')" && git config user.email "$(git log -1 --pretty=format:'%ae')"
  • Generic: git config user.name "${GITHUB_ACTOR}" && git config user.email "${GITHUB_ACTOR}@users.noreply.github.com"

For merges specifically:

  • Ensure fetch-depth: 0 in actions/checkout if merging branches to avoid shallow clone issues.
  • Use git merge --no-edit to avoid editor prompts.
  • Grant contents: write permissions if pushing.

Example full workflow snippet for merging branches:

    jobs:
      merge:
        runs-on: ubuntu-latest
        permissions:
          contents: write
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0
              token: ${{ secrets.GITHUB_TOKEN }}
          - name: Configure Git
            run: |
              git config user.name "github-actions[bot]"
              git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          - name: Merge feature into main
            run: |
              git checkout main
              git pull
              git merge feature --no-ff --no-edit
              git push origin main

Note: GITHUB_TOKEN handles authentication for push if permissions are set; no separate credential setup is needed beyond config.


🏁 Script executed:

# Let's look at the workflow file to understand the context better
cd .github/workflows 2>/dev/null || cd . 
fd upstream-sync.yml

Repository: KHAEntertainment/sglang-mamba

Length of output: 90


🏁 Script executed:

cat -n .github/workflows/upstream-sync.yml | head -100

Repository: KHAEntertainment/sglang-mamba

Length of output: 4558


🏁 Script executed:

# Check if there's any git config step earlier in the workflow
cat -n .github/workflows/upstream-sync.yml | grep -i "git config\|user\.name\|user\.email"

Repository: KHAEntertainment/sglang-mamba

Length of output: 55


🏁 Script executed:

# Let's see more of the workflow to understand the complete flow
cat -n .github/workflows/upstream-sync.yml | sed -n '75,130p'

Repository: KHAEntertainment/sglang-mamba

Length of output: 3174


Configure git committer identity before the merge.

actions/checkout doesn't set user.name or user.email, so line 96's merge will fail with a "committer identity required" error. Since the merge failure gets caught by the conditional, the workflow enters the conflict-handling branch even though there's no actual conflict—just a missing configuration. This causes the sync to be misclassified as a merge conflict rather than a simple identity problem.

Think of it like trying to sign a check without providing your name: the bank doesn't care if you have funds; they need your identity on the document first.

🔧 Suggested fix
      # ── 4. Attempt merge ──────────────────────────────────────────────
+     - name: Configure git author
+       if: steps.drift.outputs.skip == 'false'
+       run: |
+         git config user.name "github-actions[bot]"
+         git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
+
       - name: Create sync branch and attempt merge
         if: steps.drift.outputs.skip == 'false'
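
Separately from configuring the identity, the merge step could check why a merge failed before treating it as a conflict. The following is a minimal shell sketch, not code from the workflow: `classify_merge_failure` is a hypothetical helper name, and it assumes that a missing `user.name` or `user.email` in git config is a good enough proxy for the identity error described above.

```shell
# Hypothetical helper (not part of upstream-sync.yml): explain a failed merge.
# Prints "identity-missing", "conflict", or "unknown".
classify_merge_failure() {
  # git config exits non-zero when the key is unset in every config scope.
  if ! git config user.name >/dev/null || ! git config user.email >/dev/null; then
    echo "identity-missing"
    return 0
  fi
  # Paths still marked unmerged (diff filter U) indicate real content conflicts.
  if [ -n "$(git diff --name-only --diff-filter=U)" ]; then
    echo "conflict"
    return 0
  fi
  echo "unknown"
}
```

In the merge step this would run in the failure branch of the `if git merge ...` test, so that only a `conflict` result leads to the conflict issue, while the other results fail the job with a clearer message.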

Comment on lines +83 to +85
BRANCH="${{ env.SYNC_BRANCH_PREFIX }}-$(date +%Y%m%d)"
git checkout -b "$BRANCH"
echo "sync_branch=$BRANCH" >> "$GITHUB_OUTPUT"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/upstream-sync.yml | head -200

Repository: KHAEntertainment/sglang-mamba

Length of output: 9638


Make the sync branch name unique per workflow run.

The branch name auto-sync/upstream-$(date +%Y%m%d) depends only on the date, which means reruns or same-day manual dispatches will try to push to the same remote branch. If an earlier run already pushed that branch with a different commit, the next push will hit a non-fast-forward rejection—like trying to rewind history on a branch that's already been shared. Adding ${GITHUB_RUN_ID} ensures each workflow execution uses its own unique branch, preventing these collisions.

🔧 Suggested fix
-          BRANCH="${{ env.SYNC_BRANCH_PREFIX }}-$(date +%Y%m%d)"
+          BRANCH="${{ env.SYNC_BRANCH_PREFIX }}-$(date +%Y%m%d)-${GITHUB_RUN_ID}"
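
As a standalone illustration of the naming scheme in the suggested fix, here is a small shell sketch. The function name `sync_branch_name` and the `local` fallback for runs outside GitHub Actions are assumptions made for this example; only the prefix, date format, and `GITHUB_RUN_ID` come from the comment above.

```shell
# Hypothetical helper (not part of upstream-sync.yml): build a per-run branch name.
# $1 = branch prefix (env.SYNC_BRANCH_PREFIX in the workflow)
# $2 = run identifier (GITHUB_RUN_ID on a GitHub Actions runner)
sync_branch_name() {
  printf '%s-%s-%s\n' "$1" "$(date +%Y%m%d)" "$2"
}

# Example call; the fallback value "local" is an assumption for runs outside Actions.
sync_branch_name "auto-sync/upstream" "${GITHUB_RUN_ID:-local}"
```

One caveat: `GITHUB_RUN_ID` does not change when an existing run is re-run, so a re-run would still target the same branch. Appending `GITHUB_RUN_ATTEMPT` as well would cover that case.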

Comment on lines +279 to +296
// Create an issue instead of a PR when merge fails
const issue = await github.rest.issues.create({
  owner: context.repo.owner,
  repo: context.repo.repo,
  title: `⚠️ Upstream sync blocked — ${behind} commits, merge conflicts`,
  body: `## Upstream Sync Failed\n\n` +
        `**Commits behind:** ${behind}\n` +
        `**Risk level:** ${risk}\n\n` +
        `### Conflicting files\n\`\`\`\n${conflicts.replace(/\|/g, '\n')}\n\`\`\`\n\n` +
        `### Next steps\n` +
        `1. Create a local branch: \`git checkout -b upstream-sync-manual\`\n` +
        `2. Merge upstream: \`git fetch upstream && git merge upstream/main\`\n` +
        `3. Resolve conflicts in the files above\n` +
        `4. Run fork test suite: \`pytest test/sglang/snapshot/ -v\`\n` +
        `5. Push and open PR\n\n` +
        `---\n*Auto-generated by upstream-sync workflow.*`,
  labels: ['upstream-sync', 'needs-review'],
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n .github/workflows/upstream-sync.yml | sed -n '200,300p'

Repository: KHAEntertainment/sglang-mamba

Length of output: 5057


🌐 Web query:

GitHub API create issue with non-existent labels 422 error

💡 Result:

When creating an issue via the GitHub REST API (POST /repos/{owner}/{repo}/issues) with non-existent labels in the "labels" array, the API returns a 422 Unprocessable Entity error with a "Validation Failed" message. The error details typically include something like {"resource":"Label","field":"name","code":"missing_field"}, indicating the specified label does not exist in the repository.

Labels must already exist in the repository to be assigned during issue creation. Users without push access have labels silently dropped, but with push access (required for label assignment), validation fails for non-existent labels.

To resolve:

  • Create the label first using POST /repos/{owner}/{repo}/labels (requires push access).
  • Omit the "labels" field from the create issue request, then add labels afterward using POST /repos/{owner}/{repo}/issues/{issue_number}/labels.
  • List existing labels with GET /repos/{owner}/{repo}/labels to verify.

Example error response:

    {
      "message": "Validation Failed",
      "errors": [{"resource": "Label", "field": "name", "code": "missing_field"}],
      "documentation_url": "https://docs.github.com/rest/issues/issues#create-an-issue"
    }

This behavior is consistent across GitHub REST API documentation and community reports as of 2026.


Add label bootstrapping to the conflict reporting path.

The PR creation path (lines 211-230) carefully ensures labels exist before using them—like checking your supplies before cooking. But the conflict reporting path skips this step. When it tries to create an issue with ['upstream-sync', 'needs-review'] labels that don't exist yet (line 295), the GitHub API returns a 422 error, and the workflow silently fails to report the very conflict it's meant to flag.

Reuse the label bootstrap logic from the PR path, or drop the labels from the create call and attach them with addLabels() after the issue is created.
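
The workflow does its label handling in JavaScript through github-script. As an alternative illustration of the same "check, then create" idea, here is a shell sketch that uses the GitHub CLI instead. The helper names `ensure_label` and `ensure_labels` are hypothetical, and the sketch assumes `gh` is installed on the runner and authenticated (for example through a `GH_TOKEN` environment variable).

```shell
# Hypothetical helper (not part of upstream-sync.yml): create a label only if missing.
# Assumes the GitHub CLI (gh) is installed and authenticated, e.g. via GH_TOKEN.
ensure_label() {
  # --search narrows the listing; grep -Fx then requires an exact name match.
  if gh label list --search "$1" --json name --jq '.[].name' | grep -Fxq -- "$1"; then
    return 0
  fi
  gh label create "$1"
}

# Bootstrap every label passed as an argument, stopping at the first failure.
ensure_labels() {
  for label in "$@"; do
    ensure_label "$label" || return 1
  done
}
```

Calling `ensure_labels upstream-sync needs-review` before the issue is created would give the conflict path the same guarantee the PR path already has.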

Comment on lines +300 to +312
      # ── 10. Summary ───────────────────────────────────────────────────
      - name: Workflow summary
        if: always()
        run: |
          echo "## Upstream Sync Summary" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "| Metric | Value |" >> $GITHUB_STEP_SUMMARY
          echo "|--------|-------|" >> $GITHUB_STEP_SUMMARY
          echo "| Commits behind | ${{ steps.drift.outputs.commits_behind }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Merge clean | ${{ steps.merge.outputs.merge_clean }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Risk level | ${{ steps.classify.outputs.risk_level }} |" >> $GITHUB_STEP_SUMMARY
          echo "| PR created | ${{ steps.create-pr.outputs.pr_url || 'N/A' }} |" >> $GITHUB_STEP_SUMMARY
          echo "| Dry run | ${{ inputs.dry_run || 'false' }} |" >> $GITHUB_STEP_SUMMARY
(No newline at end of file)

⚠️ Potential issue | 🟡 Minor

Add the trailing newline.

The lint job is already red because end-of-file-fixer is rewriting this file. One final newline at EOF clears it.
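
To check for this condition locally before pushing, a small shell test is enough. This is a sketch only: `ends_with_newline` is a hypothetical helper, not something the repository or the end-of-file-fixer hook provides, and treating an empty file as acceptable is an assumption of this example.

```shell
# Hypothetical helper (not part of the repository): succeed if a file ends with a newline.
ends_with_newline() {
  # An empty file has no last line to terminate, so this sketch accepts it.
  [ -s "$1" ] || return 0
  # Command substitution strips a trailing newline, so the captured last byte
  # is empty exactly when that byte is a newline.
  [ -z "$(tail -c 1 "$1")" ]
}
```

For example, `ends_with_newline .github/workflows/upstream-sync.yml || echo >> .github/workflows/upstream-sync.yml` appends the missing newline only when it is absent.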

🧰 Tools
🪛 GitHub Actions: Lint

[error] 309-309: CI failure due to formatting hook changes: missing newline at end of file was added by 'end-of-file-fixer'

Clarit-AI (Owner) commented

Closing as superseded. The upstream-sync.yml workflow already landed on main via commit d2ec4e076, and follow-up fixes are now tracked in #33.

Clarit-AI closed this Apr 3, 2026