OpenLiteSpeed vs Nginx: Troubleshooting a Web Server Nightmare
Keeping a website online through sudden traffic spikes is hard, and a site whose audience surges during severe weather needs a hosting stack that can absorb the load. Lee Hutchinson, a colleague of Eric Berger, faced exactly this problem with their Houston-area weather forecasting site, Space City Weather. Traffic is usually manageable, but during severe weather events the site can draw more than a million page views in just 12 hours. With that requirement in mind, Hutchinson decided to change the hosting stack.
The Old Stack: HAProxy, Varnish Cache, and Nginx
Hutchinson initially ran Space City Weather on a backend stack consisting of HAProxy for SSL termination, Varnish Cache for on-box caching, and Nginx as the web server application. This setup had been battle-tested and proven to handle high traffic effectively. However, it was complex and made troubleshooting issues more difficult. Hutchinson wanted to simplify the stack and reduce complexity.
Enter OpenLiteSpeed
In search of a simpler solution, Hutchinson discovered OpenLiteSpeed (OLS). While he didn’t know much about OLS at the time, he learned that it was highly regarded for its integrated caching, particularly for WordPress hosting. Intrigued by the potential speed improvements and eager for a change after years of administering the same stack, Hutchinson decided to give OLS a try.
Transitioning to OpenLiteSpeed
Transitioning from Nginx to OLS required some adjustments. OLS was configured primarily through a GUI rather than text files, which added its own layer of complexity: Hutchinson had to learn the new interface and make sure the admin console stayed secure. He also had to tune caching through the LiteSpeed Cache plugin for WordPress. After some experimentation and configuration, Hutchinson successfully migrated Space City Weather to OLS.
Teething Pains and Initial Success
Despite the successful transition, Hutchinson encountered some compatibility issues between the Cloudflare WordPress plugin and the LiteSpeed Cache plugin. These plugins were crucial for cache invalidation and ensuring the site worked normally for visitors. However, after some troubleshooting and configuration adjustments, Hutchinson managed to resolve the cache issues and achieve consistent functionality.
The Nightmare Begins
For almost two years, Space City Weather ran smoothly on OLS. Then one day, during a cold front update, the site became unresponsive. Hutchinson received a call from Eric Berger alerting him to the issue. Troubleshooting revealed that OLS was spawning an excessive number of PHP handler processes, which drove CPU usage high enough to make the site unresponsive. Despite various attempts to fix the problem, including adjusting PHP and server configurations, the issue persisted.
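One way to watch for this kind of process explosion is to count running processes by command name at regular intervals. The following is a minimal sketch, not Hutchinson's actual tooling: it assumes a Unix-like host with a `ps` that supports `-eo comm=`, and the function names are hypothetical. The handler name `lsphp` used in the note below is also an assumption and may differ between installs.

```python
import subprocess
from collections import Counter


def count_commands(ps_output: str) -> Counter:
    """Count processes per command name from `ps -eo comm=` style output."""
    names = (line.strip() for line in ps_output.splitlines())
    # Blank lines carry no process name, so they are skipped.
    return Counter(name for name in names if name)


def current_command_counts() -> Counter:
    """Run ps once and count live processes by command name."""
    out = subprocess.run(
        ["ps", "-eo", "comm="],  # one command name per line, no header
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_commands(out)
```

Calling `current_command_counts().get("lsphp", 0)` every few seconds and logging the result would show whether the handler count climbs before CPU usage does.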
Troubleshooting Frustration
Hutchinson spent two weeks trying to identify the root cause of the problem. He experimented with Cloudflare settings, audited PHP configurations, and adjusted LiteSpeed Cache plugin settings. However, he faced limitations due to inadequate access logging in OLS, which made it challenging to pinpoint the exact requests causing the high load. Frustrated by the lack of progress, Hutchinson decided to revert to Nginx, replacing OLS entirely.
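When usable access logs are available, the question of which requests are causing the load can be approached by tallying requests per client and path. The sketch below assumes log lines in the common "combined" format (client IP, timestamp, quoted request line, status). The regular expression and the `top_requests` helper are illustrative, not something from Hutchinson's setup.

```python
import re
from collections import Counter

# Assumed layout: ip - user [time] "METHOD path proto" status bytes ...
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def top_requests(lines, n=10):
    """Return the n most common (ip, method, path) triples with their counts."""
    hits = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # lines that do not fit the assumed format are skipped
            hits[(m["ip"], m["method"], m["path"])] += 1
    return hits.most_common(n)
```

Passing an open log file to `top_requests` (for example, `top_requests(open(path))`, where `path` points at the server's access log) gives a quick ranking of the clients and URLs that dominate traffic during an incident.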
Back to Nginx
Returning to Nginx provided a sense of familiarity and stability for Hutchinson. With a simplified stack and Nginx’s FastCGI cache for on-box caching, he managed to restore functionality to Space City Weather. While there is still uncertainty about whether all the underlying issues have been resolved, Hutchinson feels more confident with Nginx as the web server.
Lessons Learned
Hutchinson’s experience is a reminder that change should be driven by concrete requirements, not by novelty. OLS initially seemed like an exciting change, but it ultimately caused more problems than it solved. Hutchinson advises against embracing change for its own sake, especially when the existing system is functional and stable.
Moving Forward
As Hutchinson continues to monitor the site’s performance, he acknowledges that several unrelated issues may have contributed to the problem. The switch back to Nginx changed enough of the hosting environment at once that it amounts to a fresh start, which also makes it hard to say which change, if any, fixed the original fault. Hutchinson remains vigilant, tailing log files and monitoring CPU usage to make sure Space City Weather keeps running smoothly.
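The CPU monitoring described above can be automated with a simple load check. This is a rough sketch under stated assumptions: it uses the 1-minute load average divided by core count as a stand-in for CPU pressure, and the default threshold of 1.0 per core is an arbitrary illustrative value rather than a recommendation from the original account.

```python
import os


def load_per_core(load_1min: float, cores: int) -> float:
    """Normalize a 1-minute load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_1min / cores


def is_overloaded(load_1min: float, cores: int, threshold: float = 1.0) -> bool:
    """Return True when load per core exceeds the assumed threshold."""
    return load_per_core(load_1min, cores) > threshold


def check_now(threshold: float = 1.0) -> bool:
    """Sample the live system; os.getloadavg() is available on Unix only."""
    load_1min, _, _ = os.getloadavg()
    return is_overloaded(load_1min, os.cpu_count() or 1, threshold)
```

Running `check_now()` from a scheduled job and alerting when it returns `True` would flag sustained load before visitors notice an outage.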
In conclusion, Hutchinson’s journey from Nginx to OpenLiteSpeed and back highlights the challenges of managing high-traffic websites and the importance of carefully evaluating hosting solutions. OLS showed promise initially, but the unexplained PHP process spikes, combined with access logging too limited to diagnose them, ultimately led Hutchinson to abandon it in favor of the familiar and reliable Nginx. As he continues to fine-tune the hosting stack, he hopes to achieve a stable and high-performing environment for Space City Weather.