Working With the "One-Second" Rule
What Is the "One-Second Rule"?
The following condition in the Amazon Web Services license agreement often causes confusion or concern:
You may make calls at any time that the Amazon Web Services are available, provided that you [...] do not exceed 1 call per second per IP address [...]
Without the "one-second rule," Amazon's servers would be overwhelmed and unable to keep up with the demand on them. The rule encourages A2S developers to consider the impact their programs have on the servers and design applications more intelligently.
What, Me Worry?
Developers often worry about what will happen if they occasionally make more than one query per second, so they design complicated systems to prevent their programs from ever making two calls less than a second apart. This usually isn't necessary. The "one call per second" limit is an average and, for sites which send enough traffic to Amazon, the rule is relaxed. Amazon has not stated the period of time over which the count is averaged or what criteria are used for relaxing the rule.
What Happens When You Exceed One Call Per Second?
What happens when you regularly exceed the "one call per second" limit? A2S's servers will begin returning a "503" error. As explained above, you probably won't see this unless your application continuously queries A2S without a delay between the queries. This is often the result of developers trying to use A2S as a "data feed."
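If your application does start receiving 503 responses, one common reaction is to pause and retry with progressively longer delays. The following is a minimal Python sketch of that idea, not something prescribed by Amazon: the function name fetch_with_backoff, its parameters, and the delay values are all assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_with_backoff(url, fetch=None, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep):
    """Fetch a URL, pausing and retrying when the server answers 503.

    All names and default values here are illustrative assumptions.
    """
    if fetch is None:
        # Default fetcher: a plain HTTP GET using the standard library.
        fetch = lambda u: urllib.request.urlopen(u).read()
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # Only a 503 is treated as throttling; re-raise anything else,
            # and give up once the attempts are used up.
            if err.code != 503 or attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2  # wait longer before each successive retry
```

The fetch and sleep parameters exist only so the function can be exercised without network access. Retrying does not replace the advice in this article: if 503 errors are frequent, the application is probably making too many requests.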
How Can I Download Everything?
Many affiliate programs provide data feeds. A data feed provides information on many products in a single file. But with millions of constantly updated products available from Amazon.com, Amazon decided a data feed was not practical. Instead, they implemented the A2S web services. Unlike a data feed, which is downloaded into a database, web services provide specific information when your application needs it.
If you're trying to find a way to "download everything" or "download all ...whatever..." using A2S, then you are probably not using it the way it was designed to be used. The best approach is to use A2S as intended: don't try to download everything at once. Download information only when it's needed for display.
Caching A2S Results
You can cache the information so it doesn't have to be downloaded as often. You may consider using a combination of a simple cache (just storing the results of specific queries) and smart caching (storing parsed information in a relational database).
An example of a simple cache is demonstrated in PHP code in the Getting REST Results with Simple Caching article. It describes a simple caching mechanism where the entire XML result of an A2S query is stored in a database table indexed by the query (the A2S request URL) itself. When the script needs to do an A2S query, it first reads the cached result from the database. If there is no cached result in the database, it queries Amazon and stores the result, with a timestamp, in the database.
If there is a result in the database, it looks at the timestamp. If it's more than a specified number of days in the past, it does a query to Amazon. If the query is successful, it updates the record in the database and uses the new information.
If the Amazon query fails, or the information in the database is not too old, it uses the cached information.
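The flow described above can be sketched in a few lines. This is not the PHP function from the article; it is a hedged Python illustration using SQLite, and the table layout, the names open_cache and cached_query, and the fetch callback are all assumptions made for the example.

```python
import sqlite3
import time

SECONDS_PER_DAY = 86400


def open_cache(path=":memory:"):
    """Create (if needed) a table keyed by the full request URL."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "url TEXT PRIMARY KEY, xml TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return db


def cached_query(db, url, fetch, max_age_days=1, now=time.time):
    """Return the XML for url, querying the service only when needed.

    fetch is a hypothetical callable that performs the real request
    and returns the response body as text.
    """
    row = db.execute(
        "SELECT xml, fetched_at FROM cache WHERE url = ?", (url,)
    ).fetchone()
    if row is not None and now() - row[1] <= max_age_days * SECONDS_PER_DAY:
        return row[0]  # cached copy is still fresh
    try:
        xml = fetch(url)
    except Exception:
        if row is not None:
            return row[0]  # query failed: fall back to the stale copy
        raise  # nothing cached, so the failure has to surface
    db.execute(
        "INSERT OR REPLACE INTO cache (url, xml, fetched_at) VALUES (?, ?, ?)",
        (url, xml, now()),
    )
    db.commit()
    return xml
```

A call such as cached_query(db, request_url, fetch) then replaces a direct request. A stale record is overwritten only after a successful query, so a failed request never destroys usable data.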
There are limits specified in the AWS license agreement on how long you are allowed to cache some information. For example, you are not allowed to cache prices for more than one hour. So, when displaying prices, you should design your application to make a specific request for prices any time you need to display a price while caching other information.
The function described in the article allows you to bypass caching by setting the cache argument to false. You might also consider modifying the function to specify the cache duration in seconds instead of days, which would allow you to cache requests for prices for up to one hour.
A smart caching scheme takes advantage of the fact that different A2S calls return similar types of information and that the AWS license agreement allows you to store different types of information for different lengths of time.
The same type of information is often available from different types of A2S calls. For example, you generally use an ItemSearch operation and specify a browse node ID in order to find products in a specific browse node. But, by including the BrowseNodes response group when doing an ItemLookup operation on a specific product (ASIN), you get a list of the browse nodes in which the product is included. A smart cache would parse the ItemLookup response and store the browse node information as well as the other product details. This could reduce the number of ItemSearch operations your application would need to make. In a similar manner, an ItemSearch operation to get a list of products in a specific browse node also returns information on the products which can be cached.
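As a rough illustration of harvesting browse node membership from a product lookup, the Python sketch below parses a response and records which ASINs belong to which browse node. The sample XML is a simplified, namespace-free stand-in written for this example, not a captured A2S response; real responses carry an XML namespace and nested ancestor nodes, which this sketch ignores. The function name index_browse_nodes and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified stand-in for an ItemLookup response (illustrative values only).
SAMPLE_RESPONSE = """
<ItemLookupResponse>
  <Items>
    <Item>
      <ASIN>B000EXAMPLE</ASIN>
      <BrowseNodes>
        <BrowseNode><BrowseNodeId>1000</BrowseNodeId><Name>Books</Name></BrowseNode>
        <BrowseNode><BrowseNodeId>2000</BrowseNodeId><Name>Cooking</Name></BrowseNode>
      </BrowseNodes>
    </Item>
  </Items>
</ItemLookupResponse>
"""


def index_browse_nodes(xml_text, node_index):
    """Add each item's ASIN to the set kept for every browse node it lists."""
    root = ET.fromstring(xml_text.strip())
    for item in root.iter("Item"):
        asin = item.findtext("ASIN")
        for node in item.iter("BrowseNode"):
            node_index[node.findtext("BrowseNodeId")].add(asin)
    return node_index


node_index = index_browse_nodes(SAMPLE_RESPONSE, defaultdict(set))
```

After enough lookups, node_index can answer "which known products are in this browse node?" from local data, which is the saving the paragraph above describes.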
Each piece of data stored in the cache should be tagged with a timestamp. When the information is about to be displayed to a user, the timestamp should be checked. If the information is older than the license agreement allows, perform an A2S query to update it.
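One way to picture the timestamp check is a cache that knows a maximum age for each kind of data. The Python sketch below is an illustration under stated assumptions: the SmartCache class, the refresh callback, and every limit except the one-hour price limit mentioned earlier are hypothetical, and the actual limits must come from the license agreement.

```python
import time

# Maximum age per kind of data, in seconds. Only the price limit comes from
# the text above; the other values are placeholders for illustration.
MAX_AGE_SECONDS = {
    "price": 60 * 60,
    "title": 24 * 60 * 60,         # hypothetical value
    "browse_nodes": 24 * 60 * 60,  # hypothetical value
}


class SmartCache:
    def __init__(self, now=time.time):
        self._now = now
        self._entries = {}  # (asin, kind) -> (value, stored_at)

    def put(self, asin, kind, value):
        """Store a value together with the time it was stored."""
        self._entries[(asin, kind)] = (value, self._now())

    def get(self, asin, kind, refresh):
        """Return a fresh value, calling refresh(asin, kind) if it is stale.

        refresh is a hypothetical callable that performs the real query.
        """
        entry = self._entries.get((asin, kind))
        if entry is not None and self._now() - entry[1] <= MAX_AGE_SECONDS[kind]:
            return entry[0]
        value = refresh(asin, kind)
        self.put(asin, kind, value)
        return value
```

Because each kind of data has its own limit, a page can reuse a day-old title while still requesting a new price, so only the data that actually expired costs a call.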
Generally, if your site is not very busy, you won't average more than one request per second, so there won't be a problem. For a busy site, the "one-second rule" won't affect you if you design your code intelligently, cache information as appropriate, and send customers to Amazon. If, on the other hand, you write a tight loop that tries to download information on every Amazon product as fast as possible, you will run into problems.