How Laravel Octane Cuts Response Times for SaaS Apps
Improve SaaS performance by running Laravel in memory with Octane, reducing request boot time and delivering faster response speeds at scale.
SaaS applications live and die by speed. A sluggish API response doesn't just frustrate users—it quietly drains conversion rates, inflates churn, and chips away at the revenue metrics your investors watch closely. For Laravel teams running high-concurrency workloads, the traditional PHP-FPM request lifecycle has a fundamental limitation: every request bootstraps the entire application from scratch. Configuration loads, service providers register, bindings resolve—then everything gets discarded the moment the response is sent.
Laravel Octane eliminates that overhead entirely. By keeping the application in memory between requests and serving traffic through high-performance servers like Swoole and RoadRunner, Octane can reduce p95 response times by 40–70% on real-world SaaS workloads—without changing a single line of your business logic.
This guide explains how Octane works, how to configure it correctly, and the architectural patterns that allow SaaS teams to sustain those performance gains under production load.
How Octane Changes the Laravel Request Lifecycle
Under a standard PHP-FPM setup, each incoming HTTP request triggers a complete application boot cycle. While Laravel's bootstrap process is efficient, that overhead accumulates quickly when handling thousands of simultaneous users.
Octane inverts this model. On server startup, it boots your application once and holds it in memory as a persistent worker process. Subsequent requests are handled by the already-initialized application, completely bypassing the bootstrap cycle. In practice, the framework overhead that previously consumed 20–50ms per request drops to near zero.
Octane supports two server backends:
- Swoole — A PHP extension written in C, offering coroutine support, non-blocking I/O, and built-in connection pooling. Best suited for high-throughput, latency-sensitive APIs.
- RoadRunner — A Go-based application server that communicates with PHP workers over a binary protocol. Easier to install than Swoole (no PHP extension required) and well-suited for containerized deployments.
Both options deliver substantial improvements over FPM for the vast majority of SaaS workloads.
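If you are unsure which backend your environment supports, a quick check before installing helps. The commands below are a sketch for a typical Linux host with PECL available:

```shell
# Is the Swoole extension already loaded?
php -m | grep -i swoole

# If not, install it via PECL (requires build tools), then enable it in php.ini
pecl install swoole

# RoadRunner needs no PHP extension; Octane downloads its binary during install
php artisan octane:install --server=roadrunner
```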
Installing and Configuring Octane
Getting Octane running is straightforward. Install the package via Composer, then select your preferred server:
```shell
composer require laravel/octane
php artisan octane:install
```
During installation, you will be prompted to choose between Swoole and RoadRunner. For production SaaS environments with high concurrency requirements, Swoole is generally the stronger choice. For teams prioritizing deployment simplicity—particularly those using Docker without the ability to install PHP extensions—RoadRunner integrates cleanly into containerized pipelines.
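Once installed, you start the server with `octane:start`. A typical invocation looks like this (the values shown are illustrative, not prescriptive):

```shell
php artisan octane:start --server=swoole --host=0.0.0.0 --port=8000 --workers=4 --max-requests=500
```

In local development, adding `--watch` reloads workers when files change (it requires the Node `chokidar` package).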
Once installed, the config/octane.php file exposes the primary tuning parameters. The two most impactful settings for SaaS workloads are workers and max_requests.
```php
// config/octane.php
'workers' => env('OCTANE_WORKERS', 4),
'max_requests' => env('OCTANE_MAX_REQUESTS', 500),
```

Workers controls how many worker processes handle requests concurrently. A practical baseline is one worker per available CPU core. For memory-intensive Laravel applications, monitor RSS usage under load before increasing this number.
Max requests instructs Octane to restart a worker after it has handled the specified number of requests. This prevents slow memory accumulation from long-lived worker processes—an important safeguard in applications that process large payloads or hold references to heavy objects across requests.
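Because both settings read from the environment, they can be tuned per deployment without touching code. A hypothetical `.env` fragment for a 4-core host:

```shell
OCTANE_WORKERS=4
OCTANE_MAX_REQUESTS=500
```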
Managing State in Persistent Workers
Octane's performance gains come directly from application persistence, and that same persistence introduces the most common source of bugs in Octane migrations: state contamination between requests.
With FPM, every request starts with a clean slate. With Octane, static properties, resolved singletons, and any data written to class-level variables persist across requests within the same worker. If a previous request writes user-specific data to a static property, the next request processed by that worker inherits it.
The three most important practices for managing this correctly are:
1. Avoid Storing Request-Scoped Data in Singletons
Services resolved from Laravel's container as singletons remain bound for the lifetime of the worker. If those services store request-specific data—authenticated user IDs, tenant identifiers, feature flag states—that data will bleed into subsequent requests.
The solution is to reset request-scoped services between requests. Octane provides a mechanism for exactly this purpose: any container binding listed in the `flush` array of `config/octane.php` is re-resolved on every request:

```php
// config/octane.php
'flush' => [
    'tenant', // request-scoped binding, re-resolved on each request
],
```

Alternatively, bind services as scoped() rather than singleton(). Scoped bindings are flushed automatically between Octane requests, making them the safer default for services that interact with authentication, authorization, or multi-tenancy logic.
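As a sketch, a scoped binding for a tenant context might look like the following (`TenantContext` and the `X-Tenant-Id` header are hypothetical, not part of Laravel):

```php
// In AppServiceProvider::register()
// scoped() bindings are re-resolved for every Octane request;
// singleton() bindings would persist for the worker's lifetime.
$this->app->scoped(TenantContext::class, function ($app) {
    return new TenantContext($app['request']->header('X-Tenant-Id'));
});
```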
2. Flush Static Properties Explicitly
Static properties on your own classes require manual flushing. Audit your codebase for any class that writes to static properties during a request and ensure those properties are reset in an octane:request-handled listener or through the flush method pattern.
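One way to wire this up is a small listener class (the `ReportBuffer` class here is hypothetical, standing in for any class that accumulates static state):

```php
// app/Listeners/FlushStaticState.php
namespace App\Listeners;

use App\Support\ReportBuffer;

class FlushStaticState
{
    public function handle(mixed $event): void
    {
        // Reset class-level state written during the previous request
        ReportBuffer::$rows = [];
    }
}
```

Register the class under `Laravel\Octane\Events\RequestTerminated` in the `listeners` array of `config/octane.php` so it runs after every request.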
3. Test with octane:status and Memory Profiling
After deploying, monitor memory growth per worker using php artisan octane:status. Pair this with Blackfire or Laravel Telescope to identify services that accumulate memory across request cycles. A worker that doubles its memory footprint over 100 requests typically has an unresolved static reference or a collection that grows without being cleared.
Integrating Octane with Laravel Horizon and Redis
Octane handles synchronous HTTP traffic. Background job processing remains the responsibility of Laravel Horizon. The two components are complementary, not competitive—and configuring them correctly together determines whether your SaaS platform handles traffic spikes gracefully or degrades under load.
A well-structured production setup separates concerns as follows:
- Octane workers serve all incoming HTTP and WebSocket connections
- Horizon supervisors process queued jobs across dedicated worker pools
- Redis serves as the shared state layer between both—handling job queues, cache, rate limiting, and session storage
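Under Supervisor, that separation of concerns might look like the following sketch (paths and worker counts are assumptions for illustration):

```ini
; /etc/supervisor/conf.d/app.conf
[program:octane]
command=php /var/www/app/artisan octane:start --server=swoole --workers=4
autostart=true
autorestart=true
stopwaitsecs=30      ; let in-flight HTTP requests drain on stop

[program:horizon]
command=php /var/www/app/artisan horizon
autostart=true
autorestart=true
stopwaitsecs=3600    ; give long-running jobs time to finish gracefully
```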
When configuring Redis connections in an Octane environment, use persistent connections where possible. Swoole's connection pooling can manage Redis handles efficiently, but this requires explicit configuration:
```php
// config/database.php
'redis' => [
    'default' => [
        'url' => env('REDIS_URL'),
        'persistent' => true,
        'persistent_id' => 'octane',
    ],
],
```

Persistent connections reduce the TCP handshake overhead on each Redis call—a meaningful saving when a single API response may execute 20–40 cache reads.
Deploying Octane in Docker and CI/CD Pipelines
Octane runs as a long-lived process rather than a request-response handler, which changes how zero-downtime deployments operate. A naive deployment that restarts the Octane process mid-traffic will drop in-flight requests. The correct approach uses process managers that support graceful termination.
In a Docker-based setup with GitHub Actions, the deployment sequence should be:
- Build the new image and push to your container registry
- Send a SIGTERM to the current Octane container (triggering graceful shutdown after active requests complete)
- Start the new container with the updated image
- Run php artisan octane:reload if using the same container with updated code
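A minimal Dockerfile sketch for this flow, assuming a Swoole-based image (Composer install and asset build steps are omitted for brevity):

```dockerfile
FROM php:8.3-cli
RUN pecl install swoole && docker-php-ext-enable swoole
WORKDIR /var/www/app
COPY . .
# Docker sends SIGTERM on `docker stop`; Octane drains in-flight requests before exiting
STOPSIGNAL SIGTERM
CMD ["php", "artisan", "octane:start", "--server=swoole", "--host=0.0.0.0", "--port=8000"]
```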
Laravel Forge and Envoyer both support Octane-aware deployment hooks. If your infrastructure runs on AWS with Laravel Vapor, Octane support is built in via the octane configuration flag in vapor.yml.
For environments using GitHub Actions, a minimal Octane reload step looks like this:
```yaml
- name: Reload Octane
  run: ssh deploy@${{ secrets.SERVER_HOST }} "cd /var/www/app && php artisan octane:reload"
```

This approach restarts workers gracefully without dropping active connections.
Measuring the Impact
Performance improvements without measurement are just assumptions. Before deploying Octane, establish a baseline using your current p50, p95, and p99 response time distributions across your most critical endpoints—typically authentication, dashboard data, and any API routes consumed by your front end.
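A load-testing tool such as `wrk` can capture that baseline. For example (the host and endpoint are illustrative):

```shell
# 4 threads, 100 concurrent connections, 60 seconds, with latency percentiles
wrk -t4 -c100 -d60s --latency https://app.example.com/api/dashboard
```

Record the p50/p95/p99 figures it reports, repeat the identical run after the Octane rollout, and compare under the same concurrency.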
After deployment, track the same metrics under equivalent load conditions. For SaaS platforms running on Laravel 10 with standard Eloquent workloads, real-world outcomes typically include:
- 40–60% reduction in average response time for read-heavy API endpoints
- Significant improvement in p95 latency under concurrent load, where FPM's worker exhaustion previously caused request queuing
- Reduced server costs, as higher throughput per worker allows the same infrastructure to handle more traffic
When combined with Redis query caching, Horizon queue offloading, and Blackfire-guided query optimization, Octane forms the foundation of a performance stack that can sustain growth without proportional infrastructure scaling.
Build on a Performance-Ready Foundation
Octane is not a configuration toggle you flip once and forget. It requires architectural awareness—particularly around state management—and a deployment pipeline capable of handling long-lived processes correctly. Teams that invest in those fundamentals gain a compounding advantage: each feature released on top of a fast, stable core compounds rather than degrades the user experience.
For SaaS teams evaluating their Laravel performance roadmap, the practical starting point is an Octane readiness audit: review singleton usage, test worker stability under load, and confirm your deployment pipeline supports graceful restarts. The performance ceiling is high. Reaching it is a matter of systematic execution.
Need help assessing your Laravel application's Octane readiness or accelerating a production migration? Book a complimentary strategy call to review your current stack and define the fastest path to measurable performance improvements.