A tool recently released by the Tumblr team can find the hot keys that seriously impact the performance of server clusters using Memcached to store frequently used data in memory.
Memcached is a great tool for improving the performance of server clusters that are under heavy load. It’s used by many large websites, including Facebook, Reddit, YouTube, and Twitter. Memcached is an in-memory key-value store that can be distributed across a number of servers. It sits between the clients requesting data and the database servers, and it provides very fast access to frequently used data without the performance hit incurred by a request to an on-disk database.
Because Memcached distributes data across the RAM of multiple servers, it has excellent load-balancing potential: network and processing load is spread across multiple nodes. The node that stores a particular value is determined by a hash of its key, so the Memcached servers never need to communicate with one another. Each client hashes the key, and the result determines which Memcached server stores and serves the data associated with that key.
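The client-side selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular Memcached client: the server addresses and the `server_for` helper are hypothetical, and a simple hash-modulo scheme is assumed here, whereas many real clients use consistent hashing so that adding or removing a server remaps fewer keys.

```python
import hashlib

# Hypothetical server list; in practice this comes from client configuration.
SERVERS = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]


def server_for(key: str, servers=SERVERS) -> str:
    """Pick the server for a key using a hash of the key (assumed: MD5 modulo N)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ("user:42", "post:1001", "site:config"):
        print(key, "->", server_for(key))
```

Every client that runs the same function over the same server list arrives at the same server for a given key, which is why the servers themselves need no coordination.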
One drawback of this system is that occasionally one or more keys see much heavier use than the others. Because each key lives on only one server, every client goes to that server to retrieve the associated data. The server can become overtaxed and its network connection saturated, resulting in performance and stability problems across the cluster.
For example, in some cases clients may request a key every time any page loads across a large site, while requests for most other keys are more evenly distributed across time. That situation leads to one server in the Memcached cluster getting a great deal more traffic than the others.
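A small simulation makes the imbalance concrete. The sketch below assumes the same hash-modulo placement as a simplification, and the key names (`site:config`, `page:N`) and the `simulate` helper are invented for illustration. Each simulated page load fetches one hot key plus one page-specific key.

```python
import hashlib
from collections import Counter


def server_index(key: str, n_servers: int) -> int:
    """Map a key to a server index (assumed: MD5 of the key, modulo server count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_servers


def simulate(page_loads: int = 1000, n_servers: int = 4,
             hot_key: str = "site:config") -> Counter:
    """Count requests per server when every page load also fetches one hot key."""
    load = Counter()
    for i in range(page_loads):
        load[server_index(hot_key, n_servers)] += 1      # requested on every page
        load[server_index(f"page:{i}", n_servers)] += 1  # spread across servers
    return load


if __name__ == "__main__":
    for server, requests in sorted(simulate().items()):
        print(f"server {server}: {requests} requests")
```

The server that owns the hot key handles all of the hot-key requests in addition to its share of the ordinary traffic, so it ends up with well over its fair share of the total.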
An earlier tool, mctop, was built to expose this kind of hot key. In the words of its authors, it “passively sniffs the network traffic passing in and out of a server’s network interface and tracks the responses to memcache get commands. The output is presented on the terminal and allows sorting by total calls, requests/sec and bandwidth. This gives us an instantaneous view of our memcache get traffic.”
Unfortunately, mctop appears to be prone to dropping packets, which distorts the picture it gives of traffic to the Memcached store. Tumblr has created an open-source C++ tool dubbed memkeys, which does much the same job but drops fewer packets and so gives a more accurate picture of the current state of the cache.
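The bookkeeping behind such a display can be sketched as follows. This is not how mctop or memkeys is implemented: packet capture is omitted entirely, and the `GetTracker` class, its method names, and the sample keys are assumptions made for illustration. The sketch only shows how observed get responses could be aggregated per key and sorted by total calls, requests per second, or bandwidth.

```python
class GetTracker:
    """Aggregate observed get responses per key (hypothetical helper class)."""

    def __init__(self, start: float):
        self.start = start   # time at which observation began, in seconds
        self.stats = {}      # key -> (call count, total bytes returned)

    def record(self, key: str, value_bytes: int) -> None:
        """Record one observed get response for a key."""
        calls, total = self.stats.get(key, (0, 0))
        self.stats[key] = (calls + 1, total + value_bytes)

    def report(self, now: float, sort_by: str = "calls") -> list:
        """Return one row per key, sorted in descending order by the chosen column."""
        elapsed = max(now - self.start, 1e-9)  # avoid division by zero
        rows = [
            {
                "key": key,
                "calls": calls,
                "req_per_sec": calls / elapsed,
                "bytes_per_sec": total / elapsed,
            }
            for key, (calls, total) in self.stats.items()
        ]
        return sorted(rows, key=lambda row: row[sort_by], reverse=True)


if __name__ == "__main__":
    tracker = GetTracker(start=0.0)
    for _ in range(3):
        tracker.record("site:config", 100)
    tracker.record("post:1001", 1000)
    for row in tracker.report(now=10.0, sort_by="bytes_per_sec"):
        print(row)
```

Sorting by different columns matters because a key can be hot by request count while another is hot by bandwidth; either can saturate a server.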
Knowing what’s going on within the cache in real time is the first step toward taking action. In the example above, knowing which keys are being overused would let us change the application logic to distribute the load more evenly, improving cache performance and, with it, the overall performance and stability of the site.
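One possible change to the application logic, offered here only as an illustration and not as what Tumblr does, is to store a hot value under several suffixed keys so that the copies hash to different servers, and to read a randomly chosen copy. In the sketch below a plain dictionary stands in for the cache client, and `REPLICAS`, `replica_keys`, `set_hot`, and `get_hot` are all hypothetical names.

```python
import random

REPLICAS = 8  # assumed number of copies; tune to the size of the cluster


def replica_keys(key: str, replicas: int = REPLICAS) -> list:
    """Derive the suffixed keys under which copies of a hot value are stored."""
    return [f"{key}#{i}" for i in range(replicas)]


def set_hot(cache: dict, key: str, value, replicas: int = REPLICAS) -> None:
    """Write the value under every replica key."""
    for replica in replica_keys(key, replicas):
        cache[replica] = value


def get_hot(cache: dict, key: str, replicas: int = REPLICAS, rng=random):
    """Read one randomly chosen replica, spreading reads across the copies."""
    return cache.get(f"{key}#{rng.randrange(replicas)}")


if __name__ == "__main__":
    cache = {}  # stand-in for a real Memcached client
    set_hot(cache, "site:config", {"theme": "dark"})
    print(get_hot(cache, "site:config"))
```

The trade-off is that every update must be written to all copies, so the approach suits values that are read far more often than they change.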