Request Timeouts
This feature is available on all Portkey plans.
Manage unpredictable LLM latencies effectively with Portkey's Request Timeouts. This feature allows automatic termination of requests that exceed a specified duration, letting you gracefully handle errors or make another, faster request.
Enabling Request Timeouts
You can enable request timeouts while making your request or you can set them in Configs.
Request timeouts are specified in milliseconds, as an integer.
While Making a Request
Set the request timeout while instantiating your Portkey client or, if you're using the REST API, send the x-portkey-request-timeout header.
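For the REST path, the header could be assembled roughly as follows. This is a minimal sketch: the helper name `build_portkey_headers` is hypothetical, and the `x-portkey-api-key` auth header is an assumption not stated on this page; only `x-portkey-request-timeout` and the integer-millisecond unit come from the text above.

```python
def build_portkey_headers(api_key: str, timeout_ms: int) -> dict[str, str]:
    """Return HTTP headers asking Portkey to terminate the request after timeout_ms."""
    # The timeout must be a whole number of milliseconds (bool is rejected explicitly,
    # because bool is a subclass of int in Python).
    if isinstance(timeout_ms, bool) or not isinstance(timeout_ms, int) or timeout_ms <= 0:
        raise ValueError("timeout_ms must be a positive integer (milliseconds)")
    return {
        "Content-Type": "application/json",
        "x-portkey-api-key": api_key,  # assumed auth header name
        "x-portkey-request-timeout": str(timeout_ms),  # header values are strings
    }


headers = build_portkey_headers("PORTKEY_API_KEY", 3000)  # 3-second timeout
print(headers["x-portkey-request-timeout"])
```

The resulting dictionary can be passed to whichever HTTP client you already use.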
With Configs
In Configs, request timeouts are set at either (1) strategy level, or (2) target level.
For a 10-second timeout, the value will be 10000 (milliseconds).
Setting Request Timeout at Strategy Level
Here, the request timeout of 10 seconds will be applied to *all* the targets in this Config.
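A strategy-level timeout might look like the sketch below, written as a Python dictionary and printed as JSON. The key names (`strategy`, `mode`, `targets`, `virtual_key`, `request_timeout`) are assumed from the Portkey Config schema rather than spelled out on this page, and the virtual key values are placeholders.

```python
import json

# Strategy-level timeout: a single value at the top of the Config, in milliseconds.
strategy_level_config = {
    "strategy": {"mode": "fallback"},
    "request_timeout": 10000,  # 10 seconds, applied to every target below
    "targets": [
        {"virtual_key": "open-ai-xxx"},        # placeholder virtual key
        {"virtual_key": "azure-open-ai-xxx"},  # placeholder virtual key
    ],
}

print(json.dumps(strategy_level_config, indent=2))
```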
Setting Request Timeout at Target Level
Here, for the first target, a request timeout of 10s will be set, while for the second target, a request timeout of 2s will be set.
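A target-level version could be sketched like this. As before, the key names are assumed from the Portkey Config schema and the virtual keys are placeholders.

```python
import json

# Target-level timeouts: each target carries its own value, in milliseconds.
target_level_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "open-ai-xxx", "request_timeout": 10000},       # 10 seconds
        {"virtual_key": "azure-open-ai-xxx", "request_timeout": 2000},  # 2 seconds
    ],
}

print(json.dumps(target_level_config, indent=2))
```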
Nested target objects inherit the top-level timeout, with the option to override it at any level for customized control.
How timeouts work in nested Configs
We've set a global timeout of 2s at line #3.
The first target has a nested fallback strategy, with a top-level request timeout of 5s at line #7.
For the first virtual key (at line #10), the target-level timeout of 5s will be applied.
For the second virtual key (i.e. open-ai-1-2), there is a timeout override, set at 10s, which will be applied only to this target.
For the last target (i.e. virtual key azure-open-ai-1), the top strategy-level timeout of 2s will be applied.
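The inheritance rule can be illustrated with a small resolver. This is a sketch of the rule as described here, not Portkey's implementation: the function `effective_timeouts` is hypothetical, the top-level `loadbalance` mode and the first virtual key name `open-ai-1-1` are assumptions, and the Config key names are assumed from the Portkey Config schema.

```python
nested_config = {
    "strategy": {"mode": "loadbalance"},  # assumed top-level mode
    "request_timeout": 2000,              # global timeout: 2s
    "targets": [
        {
            "strategy": {"mode": "fallback"},
            "request_timeout": 5000,      # overrides the global value for this subtree
            "targets": [
                {"virtual_key": "open-ai-1-1"},  # assumed name; inherits 5s
                {"virtual_key": "open-ai-1-2", "request_timeout": 10000},  # own override
            ],
        },
        {"virtual_key": "azure-open-ai-1"},  # inherits the global 2s
    ],
}


def effective_timeouts(node, inherited=None):
    """Map each leaf virtual key to the timeout (ms) that applies to it."""
    # A value set on this node wins; otherwise use the nearest ancestor's value.
    timeout = node.get("request_timeout", inherited)
    if "targets" not in node:
        return {node["virtual_key"]: timeout}
    resolved = {}
    for target in node["targets"]:
        resolved.update(effective_timeouts(target, timeout))
    return resolved


print(effective_timeouts(nested_config))
# {'open-ai-1-1': 5000, 'open-ai-1-2': 10000, 'azure-open-ai-1': 2000}
```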
Handling Request Timeouts
Portkey issues a standard 408 error for timed-out requests. You can leverage this by setting up fallback or retry strategies through the on_status_codes parameter, ensuring robust handling of these scenarios.
Triggering Fallbacks with Request Timeouts
Here, fallback from OpenAI to Azure OpenAI will only be triggered if the first request times out after 2 seconds; otherwise, the request will fail with a 408 error code.
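A fallback Config of this kind might be sketched as follows. Placing `on_status_codes` inside `strategy` and the other key names are assumptions based on the Portkey Config schema; the virtual keys are placeholders.

```python
import json

# Fall back to the second target only when the first one returns a 408 (timeout).
fallback_on_timeout_config = {
    "strategy": {"mode": "fallback", "on_status_codes": [408]},
    "targets": [
        {"virtual_key": "open-ai-xxx", "request_timeout": 2000},  # times out after 2s
        {"virtual_key": "azure-open-ai-xxx"},                     # fallback target
    ],
}

print(json.dumps(fallback_on_timeout_config, indent=2))
```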
Triggering Retries with Request Timeouts
Here, a retry is triggered up to 3 times whenever the request takes more than 1s to return a response. After 3 unsuccessful retries, it will fail with a 408 code.
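A retry Config along these lines could look like the sketch below. The `retry` object with `attempts` and `on_status_codes` is assumed from the Portkey Config schema, and the virtual key is a placeholder.

```python
import json

# Retry up to 3 times, but only when the request fails with a 408 (timeout).
retry_on_timeout_config = {
    "request_timeout": 1000,  # 1 second
    "retry": {"attempts": 3, "on_status_codes": [408]},
    "virtual_key": "open-ai-xxx",  # placeholder virtual key
}

print(json.dumps(retry_on_timeout_config, indent=2))
```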
Here's a general guide on how to use Configs in your requests.
Caveats and Considerations
While the request timeout is a powerful feature to help you gracefully handle unruly models & their latencies, there are a few things to consider:
Ensure that you are setting reasonable timeouts - for example, models like gpt-4 often have sub-10-second response times.
Ensure that you gracefully handle 408 errors whenever a request does get timed out - you can inform the user to rerun their query and set up some neat interactions in your app.
For streaming requests, the timeout will not be triggered if at least one chunk is received before the specified duration elapses.
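On the application side, handling a 408 gracefully can be as simple as mapping the status code to a user-facing message. The function below is a hypothetical sketch and does not depend on any Portkey SDK; the message wording is illustrative only.

```python
def describe_outcome(status_code: int) -> str:
    """Turn an HTTP status code into a short message for the end user."""
    if status_code == 408:
        # The request exceeded the configured timeout.
        return "The model took too long to respond. Please try running your query again."
    if 200 <= status_code < 300:
        return "ok"
    return f"Request failed with status {status_code}."


print(describe_outcome(408))
```

In a real app, the 408 branch is where you would offer a retry button or a similar interaction.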