I recently had an initial interview for an entry-level full-stack position with Apple, and one of the questions I was not fully prepared for concerned HTTP error codes and which class of error codes is most likely to warrant a retry. Although I was able to describe the source of each HTTP status code class, I was a little stumped by the second part of the question: for which error code class does it make the most sense to retry? Because I got this part of the question wrong (I answered 4xx), I wanted to learn more and become more knowledgeable about HTTP error codes.
When accessing a web application or server, every HTTP request receives a response with an HTTP status code. There are five classes of HTTP status codes, identified by their first digit:
- 1xx — Informational
- 2xx — Success
- 3xx — Redirection
- 4xx — Client Error
- 5xx — Server Error
As should be evident in reading the above list, the error code classes are 4xx and 5xx. There are many potential causes and solutions for an HTTP error code.
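The first-digit rule in the list above can be sketched in a few lines of Python. This is only an illustration: the function names `status_class` and `is_error` are my own, not part of any library.

```python
def status_class(code: int) -> str:
    """Return the class name for an HTTP status code, based on its first digit."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    # Integer division by 100 isolates the first digit (404 // 100 == 4).
    return classes[code // 100]


def is_error(code: int) -> bool:
    """Only the 4xx and 5xx classes represent errors."""
    return status_class(code) in ("Client Error", "Server Error")
```

For example, `status_class(404)` returns `"Client Error"` and `is_error(200)` returns `False`.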
The 4xx (400–499) client error codes result from HTTP requests sent by a user client, such as a web browser or other HTTP client. Although these errors are caused by the client, it helps to understand which error is occurring in order to determine whether a server configuration adjustment could fix the issue.
Common 4xx HTTP Error Codes:
- 400 — Bad Request: The server does not understand the client request due to incorrect syntax. The client should not repeat the request without fixing the syntax issue.
- 401 — Unauthorized: The server requires authentication, and the request either supplied no credentials or supplied incorrect ones.
- 403 — Forbidden: The request was valid and understood by the server, but the server is refusing to fulfill it.
- 404 — Not Found: The server understood the request but cannot locate a resource at the requested URL. This can also occur when the website is dead. For low-level URLs (pages deep within a site), a 404 usually results from a broken link.
- 405 — Method Not Allowed: HTTP defines the methods for accessing web server resources, but a web server can be configured to allow or disallow any of them. This error indicates that the method specified in the request is not allowed for the resource at the requested URL. It is common with the POST method.
Server errors (500–599) occur when the server receives an HTTP request but is unable to process or fulfill it. **This is the class of error codes that would warrant a retry.** In addition to the status code, the server should include an explanation of the situation that caused the error and indicate whether it is a temporary or permanent condition.
Depending on the complexity of the web application, discovering an internal server error can take developers time and, in the meantime, frustrate users in the production environment. To avoid this, developers can employ a variety of site-monitoring tools, including many search-engine-optimization (SEO) tools. These tools aid developers by providing alerts and data on the health of their web application, helping to ensure an error-free experience for users.
Common 5xx HTTP Error Codes:
- 500 — Internal Server Error: Possibly the most common message encountered, the 500 error is a generic server error returned when the server cannot determine a more specific cause.
- 501 — Not Implemented: The server does not recognize the request method or does not support the functionality required to fulfill the request.
- 502 — Bad Gateway: Displayed when the server is operating as a proxy or gateway server and the upstream server returns an invalid response.
- 503 — Service Unavailable: Indicates that the server is unavailable at time of request, commonly resulting from overloading or maintenance. Generally this is a temporary issue.
- 504 — Gateway Timeout: Similar to 502, occurs when the server is functioning as a proxy or gateway server and the upstream server is unable to respond fast enough.
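The codes above are not equally likely to benefit from a retry. Here is a minimal sketch of a per-code decision, assuming a policy of my own choosing (not a standard): 502, 503, and 504 are treated as transient, 500 as possibly transient, and every other 5xx code, such as 501, as not worth retrying. The name `retry_advice` is hypothetical.

```python
# Assumed policy for illustration only; real systems tune this per service.
TRANSIENT_CODES = {502, 503, 504}  # gateway and availability problems


def retry_advice(code: int) -> str:
    """Suggest whether a retry is likely to help for a given status code."""
    if code in TRANSIENT_CODES:
        return "retry"
    if code == 500:
        # A generic error may or may not clear up on its own.
        return "retry cautiously"
    if 500 <= code <= 599:
        # e.g. 501: the server lacks the functionality, so repeating is futile.
        return "do not retry"
    return "not a server error"
```

Under this policy `retry_advice(503)` returns `"retry"`, while `retry_advice(501)` returns `"do not retry"`.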
In terms of solving or troubleshooting these common 5xx errors, the approach depends on the error code. A 500 error commonly results from a server misconfiguration or missing packages, so check your configuration files and dependencies for these issues. A 501 error means the server does not support the method or feature the request relies on, so repeating the same request will not help; check that the client is sending a supported method and that the server software is up to date.
In response to a 502 error, areas to check include:
- the health of the backend server (where HTTP requests are being forwarded);
- the configuration of the reverse proxy, including that the correct backend is specified;
- the network connection between the reverse proxy and the backend server. If the servers communicate over multiple ports, ensure that the firewall allows traffic between them on those ports;
- if the web app is configured to listen on a socket, that the socket is in the correct location with the correct permissions.
In the event of a 503 error, if the server is NOT under maintenance, the issue may be the result of the server lacking adequate CPU or memory resources to handle incoming requests, or the server may need to be configured to allow more threads, processes, or users.
Possible origins of a 504 error include a poor network connection between servers, a proxy server timeout that is too short, or a backend server that is too slow in fulfilling requests.
Looking back, the mistake I made in answering this question during my interview was likely a result of nerves and of not thinking my answer through before giving it. Lesson: allow yourself the time to calm down, think it through, and provide a thoughtful answer.
The key to deciding whether a retry strategy is viable lies in the what, why, and when of the failure, and in whether there is a high enough chance that the system will recover in the not-too-distant future. Knowing the HTTP error code definitions and their potential causes is therefore crucial in determining whether a 5xx error warrants a retry.
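To make the idea concrete, here is a deliberately simple sketch of a retry loop that waits a little longer between attempts, on the assumption that a failing server may recover shortly. Everything here is hypothetical: `send` stands in for any zero-argument function that performs one request and returns its status code, and the attempt count and delays are arbitrary defaults.

```python
import time
from typing import Callable


def send_with_retry(
    send: Callable[[], int],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-5xx status or attempts run out.

    Returns the last status code received.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if not 500 <= status <= 599:
            break  # success, redirect, or client error: retrying will not help
        # Exponential backoff: 0.5s, 1s, 2s, ... gives the server time to recover.
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

A 4xx response is returned immediately, since the client has to change the request before trying again. The `sleep` parameter exists only so the waiting can be swapped out, for example in tests.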
My next post will be on best practices and patterns for a retry.