Skip to content

Race condition in token refresh under high concurrency #56

@coolguy1771

Description

@coolguy1771

Issue

I noticed a potential thundering herd issue in updateTokenIfNeeded. While the mutex prevents data races, multiple goroutines can still trigger back-to-back API calls if they all observe the token nearing expiration at the same time. For enterprise environments managing thousands of runners, this results in unnecessary load and risks hitting API rate limits.

Proposed Fix

Introduce a lastTokenRefresh timestamp to enforce a minimum cooldown (e.g., 5 seconds) between refresh attempts. This ensures that once one goroutine successfully updates the token, subsequent callers within that window will skip the redundant network call.

type Client struct {
    // ... existing fields ...
    lastTokenRefresh time.Time
}

func (c *Client) updateTokenIfNeeded(ctx context.Context) error {
    // Check if refresh is actually needed
    aboutToExpire := time.Now().Add(60 * time.Second).After(c.actionsServiceAdminTokenExpiresAt)
    if !aboutToExpire && !c.actionsServiceAdminTokenExpiresAt.IsZero() {
        return nil
    }
    
    // Cooldown to prevent concurrent thundering herd
    if time.Since(c.lastTokenRefresh) < 5*time.Second {
        return nil
    }
    
    c.lastTokenRefresh = time.Now()
    
    // ... token refresh logic ...
}

Impact

  • Efficiency: Eliminates redundant API calls during expiration windows.
  • Scalability: Critical for large-scale deployments where hundreds of listeners might trigger a refresh cycle simultaneously.
  • Resilience: Reduces the likelihood of hitting GitHub API rate limits during high-concurrency events.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions