Hierarchical token buckets

Patent number

US11616725B1

Publication date

2023-03-28

Applicant

Amazon Technologies, Inc. (Seattle, WA, US)

Inventors

Salman Ahmad Syed; Sandeep Kumar

IPC classification

H04L47/125; H04L47/78; H04L47/215

Technical field

token, bucket, tokens, service, host, throttle, request, requests, global, key

Region: Seattle, WA

Abstract

Systems and methods are provided for efficient handling of user requests to access shared resources in a distributed system, which handling may include throttling access to resources on a per-resource basis. A distributed load-balancing system can be logically represented as a hierarchical token bucket cache, where a global cache contains token buckets corresponding to individual resources whose tokens can be dispensed to service hosts each maintaining a local cache with token buckets that limit the servicing of requests to access those resources. Local and global caches can be implemented with a variant of a lazy token bucket algorithm to enable limiting the amount of communication required to manage cache state. High granularity of resource management can thus enable increased throttle limits on user accounts without risking overutilization of individual resources.

Description

BACKGROUND

Computing devices can utilize communication networks to exchange data. Companies and organizations operate computer networks that interconnect a number of computing devices to support operations or to provide services to third parties. The computing systems can be located in a single geographic location or located in multiple, distinct geographic locations (e.g., interconnected via private or public communication networks). Specifically, data centers or data processing centers, herein generally referred to as a “data center,” may include a number of interconnected computing systems to provide computing resources to users of the data center. The data centers may be private data centers operated on behalf of an organization or public data centers operated on behalf, or for the benefit of, the general public.

To facilitate increased utilization of data center resources, virtualization technologies allow a single physical computing device to host one or more virtualized “sandboxes” that appear and operate as independent execution environments to users of a data center. For example, hardware virtualization can be used to provide a fully emulated hardware computing device (a “virtual machine”). Operating-system-level virtualization can enable a kernel of an operating system to provide multiple isolated user space instances (often called “containers”) without requiring virtualization of the kernel. With virtualization, the single physical computing device can create, maintain, delete, or otherwise manage execution environments in a dynamic manner. In turn, users can request computer resources from a data center, including containers, computing devices, or combinations thereof, and be provided with varying numbers of virtualized resources.

Claims

What is claimed is:

1. A hierarchical token bucket system for load balancing access to a network-accessible service provided by a plurality of service hosts, the system comprising:

the plurality of service hosts, each of the plurality of service hosts providing access to the network-accessible service; and

a global token bucket cache comprising a plurality of global token buckets, wherein each global token bucket corresponds to a throttle key of a plurality of throttle keys and identifies a number of available tokens for the throttle key within the global token bucket, wherein the global token bucket cache is configured to:

receive, from an individual service host of the plurality of service hosts, a request for a number of tokens, the request comprising a throttle key identifying an individual global token bucket of the plurality of global token buckets;

when the number of available tokens in the individual global token bucket is greater than zero, dispense a number of tokens up to the number requested from the individual token bucket to the individual service host; and

when the number of available tokens in the individual token bucket is zero, notify the individual service host that insufficient tokens exist within the individual global token bucket;

wherein the global token bucket cache is further configured to, at each interval of a set of intervals, refill each global token bucket with an additional number of tokens;

wherein each service host of the plurality of service hosts maintains a local token bucket cache comprising a plurality of local token buckets, wherein each local token bucket corresponds to a throttle key of the plurality of throttle keys and identifies a number of available tokens for the throttle key within the local token bucket, and wherein each service host is configured to:

receive an access request from a client requesting to access the network-accessible service;

determine a throttle key for the access request;

identify an individual local token bucket corresponding to the throttle key for the access request;

determine the number of available tokens in the individual local token bucket;

when the number of available tokens in the individual local token bucket is sufficient to satisfy the access request, process the access request using at least one available token in the individual local token bucket;

when the number of available tokens in the individual local token bucket is insufficient to satisfy the access request:

transmit a request to the global token bucket cache for additional tokens associated with the throttle key for the access request;

when the request to the global token bucket cache for additional tokens results in dispensing of a sufficient number of the additional tokens to satisfy the access request, store the additional tokens in the individual local token bucket and process the access request using at least one available token in the individual local token bucket; and

when the request to the global token bucket cache for additional tokens results in dispensing of an insufficient number of the additional tokens to satisfy the access request, throttle the access request.
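
The two-level flow recited in claim 1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class names `GlobalCache` and `ServiceHost`, the `batch_size` parameter, and the choice to keep partially dispensed tokens in the local bucket are all assumptions made for the example, and periodic refilling of the global buckets is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalCache:
    """Global token bucket cache: one token count per throttle key."""
    buckets: dict = field(default_factory=dict)

    def dispense(self, key: str, requested: int) -> int:
        """Dispense up to `requested` tokens; 0 means the bucket is empty."""
        available = self.buckets.get(key, 0)
        granted = min(available, requested)
        self.buckets[key] = available - granted
        return granted


@dataclass
class ServiceHost:
    """Service host holding a local token bucket per throttle key."""
    global_cache: GlobalCache
    batch_size: int = 5  # hypothetical: tokens requested per trip to the global cache
    local: dict = field(default_factory=dict)

    def handle(self, key: str, cost: int = 1) -> bool:
        """Return True if the request is processed, False if it is throttled."""
        if self.local.get(key, 0) < cost:
            # Local bucket is insufficient: ask the global cache for more tokens.
            granted = self.global_cache.dispense(key, max(cost, self.batch_size))
            # Assumption: tokens are stored locally even if too few were granted.
            self.local[key] = self.local.get(key, 0) + granted
        if self.local.get(key, 0) >= cost:
            self.local[key] -= cost
            return True
        return False  # still insufficient after asking the global cache: throttle


cache = GlobalCache({"account-1": 6})
host_a = ServiceHost(cache)
host_b = ServiceHost(cache)
print([host_a.handle("account-1") for _ in range(5)])  # served from one batch of 5
print(host_b.handle("account-1"), host_b.handle("account-1"))
```

Because each host only contacts the global cache when its local bucket runs dry, most requests are decided locally, which is the communication saving the abstract describes.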

2. The hierarchical token bucket system of claim 1, wherein the service host is further configured to, when the number of available tokens in the individual local token bucket is insufficient to satisfy the access request:

query a cache to determine whether the individual local token bucket is contained in the cache;

when the individual local token bucket is contained in the cache, throttle the request;

when the individual local token bucket is not contained in the cache, add the individual local token bucket to the cache.

3. The hierarchical token bucket system of claim 1, wherein throttling a request causes the service host to throttle subsequent requests until a predetermined interval has elapsed.
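
One way to read claims 2 and 3 together is as a short-lived cache of exhausted throttle keys: once a key is throttled, later requests for it are rejected locally until a hold interval elapses. The sketch below illustrates that reading only; the class name `ThrottledKeys`, the `hold_seconds` parameter, and the explicit `now` argument (used to make the example deterministic) are assumptions, not terms from the claims.

```python
import time


class ThrottledKeys:
    """Remembers throttled keys for a fixed hold interval (hypothetical helper)."""

    def __init__(self, hold_seconds: float):
        self.hold_seconds = hold_seconds
        self._until = {}  # throttle key -> time at which throttling ends

    def mark(self, key: str, now: float = None) -> None:
        """Record that `key` was throttled at time `now`."""
        now = time.monotonic() if now is None else now
        self._until[key] = now + self.hold_seconds

    def is_throttled(self, key: str, now: float = None) -> bool:
        """True while the hold interval for `key` has not yet elapsed."""
        now = time.monotonic() if now is None else now
        expiry = self._until.get(key)
        if expiry is None:
            return False
        if now >= expiry:
            del self._until[key]  # interval elapsed: forget the key
            return False
        return True


throttled = ThrottledKeys(hold_seconds=1.0)
throttled.mark("account-1", now=0.0)
print(throttled.is_throttled("account-1", now=0.5))
print(throttled.is_throttled("account-1", now=1.0))
```

Checking this cache before contacting the global cache avoids repeated round trips for a key that is already known to be out of tokens.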

4. The hierarchical token bucket system of claim 1, wherein the service host is configured to forward the access request to the network-accessible service.

5. A computer-implemented method for load balancing access to a network-accessible service provided by a plurality of service hosts, the computer-implemented method comprising:

receiving, by a service host of the plurality of service hosts, an access request from a client to access the network-accessible service;

determining a throttle key for the access request;

identifying an individual local token bucket corresponding to the throttle key for the access request;

determining a number of available tokens in the individual local token bucket;

responsive to a determination that the number of tokens in the individual local token bucket is insufficient to satisfy the access request, transmitting to a global cache a request to dispense additional tokens corresponding to the throttle key from a global token bucket for the throttle key, wherein the global cache is configured to refill the global token bucket with an additional number of tokens at each interval of a set of intervals and respond to requests to dispense additional tokens from the global token bucket by dispensing tokens to a requesting service host from the global token bucket when a number of available tokens in the global token bucket is greater than a threshold number and by notifying the requesting service host that insufficient tokens exist within the global token bucket when the number of available tokens in the global token bucket is less than the threshold number;

obtaining, from the global token bucket for the throttle key maintained at the global cache, a sufficient number of additional tokens to satisfy the access request; and

servicing the access request using at least the additional tokens.

6. The computer-implemented method of claim 5, further comprising, at the global cache, refilling the global token bucket associated with the throttle key responsive to the request to dispense additional tokens.

7. The computer-implemented method of claim 6, further comprising, prior to refilling the global token bucket, determining at the global cache that the number of tokens contained in the global token bucket is insufficient to dispense the additional tokens.

8. The computer-implemented method of claim 6, wherein the number of tokens added to the global token bucket when it is refilled is less than a maximum number of tokens the global token bucket can contain.

9. The computer-implemented method of claim 6, wherein a number of tokens added to the global token bucket during refilling is calculated by multiplying a refill rate of the global token bucket and an amount of time since the last refill, and wherein the number of tokens added cannot cause the number of tokens to exceed a predetermined maximum.
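
The on-demand refill described in claims 6 through 9 resembles a lazy token bucket: rather than running a timer, the bucket computes how many tokens it has earned only when a dispense request finds it short. The following is a minimal sketch under stated assumptions: the class name `LazyTokenBucket`, the per-second `refill_rate` unit, the explicit `now` argument, and the dropping of fractional tokens are illustrative choices, not details from the claims.

```python
import time


class LazyTokenBucket:
    """Token bucket that refills only when asked (illustrative sketch)."""

    def __init__(self, capacity: int, refill_rate: float, now: float = None):
        self.capacity = capacity        # predetermined maximum number of tokens
        self.refill_rate = refill_rate  # assumed unit: tokens per second
        self.tokens = capacity
        self.last_refill = time.monotonic() if now is None else now

    def refill(self, now: float = None) -> int:
        """Add min(rate * elapsed, capacity - tokens) tokens; return the amount added."""
        now = time.monotonic() if now is None else now
        earned = int(self.refill_rate * (now - self.last_refill))
        added = min(earned, self.capacity - self.tokens)
        self.tokens += added
        self.last_refill = now  # simplification: fractional tokens are dropped
        return added

    def dispense(self, requested: int, now: float = None) -> int:
        """Refill only if the bucket cannot cover the request, then dispense."""
        if self.tokens < requested:
            self.refill(now)
        granted = min(self.tokens, requested)
        self.tokens -= granted
        return granted


bucket = LazyTokenBucket(capacity=10, refill_rate=2, now=0.0)
print(bucket.dispense(10, now=0.0))  # bucket starts full
print(bucket.dispense(3, now=1.0))   # one second later, only 2 tokens have been earned
```

Capping the added amount at `capacity - tokens` is what keeps the bucket from exceeding its predetermined maximum after a long idle period.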

10. The computer-implemented method of claim 5, wherein the request to dispense additional tokens requests a number of additional tokens as a proportion of a maximum number of tokens that can be held in the global token bucket for the throttle key.

11. The computer-implemented method of claim 5, wherein transmitting a request to dispense additional tokens corresponding to the throttle key further comprises determining a number of tokens calculated as a weighted average of the number of requests received over a plurality of recent intervals, wherein the calculation comprises a polynomial of degree greater than or equal to one.

12. The computer-implemented method of claim 11, wherein transmitting a request to dispense additional tokens corresponding to the throttle key further comprises determining a number of tokens according to the equation:

$$\frac{(1)(tar_x) + (2)(tar_{x-1}) + \cdots + (x-1)(tar_2) + (x)(tar_1)}{1 + 2 + \cdots + (x-1) + x}$$

wherein $x$ is the number of preceding intervals and $tar_x$ is the number of requests serviced by the service host during the $x$-th interval.
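
The weighted average in claim 12 can be computed directly. The sketch below assumes that interval 1 is the most recent preceding interval, so that recent traffic carries the largest weight; the claim itself does not state the ordering. The function name `weighted_token_request` and the list-based input are hypothetical.

```python
def weighted_token_request(serviced: list) -> float:
    """Linearly weighted average of per-interval request counts.

    `serviced[0]` is tar_1 and `serviced[-1]` is tar_x, so tar_1 receives
    weight x and tar_x receives weight 1, matching the claimed equation.
    """
    x = len(serviced)
    if x == 0:
        return 0.0  # assumption: no history means no tokens are requested
    numerator = sum((x - i) * tar for i, tar in enumerate(serviced))
    denominator = x * (x + 1) // 2  # closed form of 1 + 2 + ... + x
    return numerator / denominator


# Three preceding intervals with 30, 20, and 10 serviced requests:
# (3*30 + 2*20 + 1*10) / (1 + 2 + 3) = 140 / 6
print(weighted_token_request([30, 20, 10]))
```

A host would typically round this value to a whole number of tokens before sending the request; how the rounding is done is not specified in the claims.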

13. One or more non-transitory computer-readable media comprising executable instructions for load balancing access to a network-accessible service provided by a plurality of service hosts, wherein the instructions, when executed by a distributed load-balancing system, cause the distributed load-balancing system to:

receive, by a service host of the plurality of service hosts, an access request from a client to access the network-accessible service;

determine a throttle key for the access request;

identify an individual local token bucket corresponding to the throttle key for the access request;

determine a number of available tokens in the individual local token bucket;

responsive to a determination that the number of tokens in the individual local token bucket is insufficient to satisfy the access request, transmit to a global cache a request to dispense additional tokens corresponding to the throttle key from a global token bucket for the throttle key, wherein the global cache is configured to refill the global token bucket with an additional number of tokens at each interval of a set of intervals and respond to requests to dispense additional tokens from the global token bucket by dispensing tokens to a requesting service host from the global token bucket when a number of available tokens in the global token bucket is greater than a threshold number and by notifying the requesting service host that insufficient tokens exist within the global token bucket when the number of available tokens in the global token bucket is less than the threshold number;

transmit the generated request to a global cache; and

obtain, from the global token bucket for the throttle key maintained at the global cache, a sufficient number of additional tokens to satisfy the access request; and

service the access request using at least the additional tokens.

14. The one or more non-transitory computer-readable media of claim 13, wherein the instructions cause the global cache to refill the global token bucket associated with the throttle key responsive to the request to dispense additional tokens.

15. The one or more non-transitory computer-readable media of claim 14, wherein the instructions cause the global cache to, prior to refilling the global token bucket associated with the throttle key, determine that the number of tokens in the global token bucket is insufficient to dispense the additional tokens.

16. The one or more non-transitory computer-readable media of claim 14, wherein, to refill the global token bucket associated with the throttle key, the instructions cause the global cache to add a number of tokens to the global token bucket that is less than a maximum number of tokens the global token bucket can contain.

17. The one or more non-transitory computer-readable media of claim 14, wherein the instructions cause the global cache to, prior to refilling the global token bucket associated with the throttle key, determine that the global token bucket contains no tokens.

18. The one or more non-transitory computer-readable media of claim 14, wherein, to refill the global token bucket, the instructions cause the global cache to calculate a number of tokens by multiplying a refill rate of the global token bucket with an amount of time since the last refill and add the minimum of the calculated number of tokens and an amount equal to a maximum number of tokens the global token bucket may contain minus the number of tokens the global token bucket currently contains.

19. The one or more non-transitory computer-readable media of claim 13, wherein to generate a request for additional tokens associated with the throttle key, the instructions cause the distributed load-balancing system to determine a number of tokens calculated as a weighted average of the number of requests received over a plurality of recent intervals, wherein the calculation comprises a polynomial of degree greater than or equal to one.

20. The one or more non-transitory computer-readable media of claim 19, wherein to generate a request for additional tokens associated with the throttle key, the instructions cause the distributed load-balancing system to determine a number of tokens according to the equation:

$$\frac{(1)(tar_x) + (2)(tar_{x-1}) + \cdots + (x-1)(tar_2) + (x)(tar_1)}{1 + 2 + \cdots + (x-1) + x}$$

wherein $x$ is the number of preceding intervals and $tar_x$ is the number of requests serviced by the service host during the $x$-th interval.

21. The one or more non-transitory computer-readable media of claim 13, wherein the request transmitted to the global token bucket for additional tokens associated with the throttle key comprises a request for a number of tokens based on a fraction of a maximum number of tokens the global token bucket can contain.
