
[ Dynamically add server to memcached ]

I am setting up memcached on my Elastic Beanstalk.

Amazon's ElastiCache is a good option too, but since my usage will be sparse, running a separate EC2 instance full-time just for caching seems like overkill in my case, and it is also expensive.

Since my app runs on Python, I will use a Python client to communicate with memcached.

When Elastic Beanstalk scales up/down, it adds/removes instances. I would like to dynamically add/remove memcached servers when this happens. Losing some of the cache while scaling down is totally acceptable.

How can I accomplish this?


I found a similar question here.

But the answer does not offer a solution to this issue, just a workaround: it recommends using another service that is free for small usage, but the added network latency would slow things down, so it is not a good option.

Answer 1

You can create a load-balanced EB environment and set it up in a way that YOU tell it when to scale up and down.
In the Scaling section, set the "Minimum instance count" to 1 and the "Maximum instance count" to the maximum number of cache hosts you need.
Now write a short (Python) script that checks how many EC2 instances are up and decides whether the current memcache box is still needed. Make it return HTTP 500 or another error code if the box needs to be terminated.
Then, in the "EC2 Instance Health Check" section, point the "Application health check URL" at that script, so it reports a bad code when the cache server needs to be destroyed.
This ties the number of application boxes to the number of cache servers, as you instruct.
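The decision logic behind that health-check script can be sketched as below. This is a hypothetical illustration, not code from the answer: the one-cache-host-per-two-app-instances ratio, the function names, and the rank-based retirement scheme are all assumptions; the real script would get the instance count from the EC2 API.

```python
def cache_boxes_needed(app_instance_count, apps_per_cache=2):
    """How many cache hosts we want for the given app fleet size.

    The ratio (one cache host per two app instances) is an assumption;
    tune apps_per_cache to your workload.
    """
    # Round up, but always keep at least one cache host alive.
    return max(1, (app_instance_count + apps_per_cache - 1) // apps_per_cache)


def health_status(my_rank, app_instance_count):
    """HTTP status this cache box's health-check URL should return.

    my_rank is this box's position among cache hosts (0-based, oldest
    first). Boxes beyond the needed count report 500 so the load
    balancer marks them unhealthy and EB terminates them.
    """
    if my_rank < cache_boxes_needed(app_instance_count):
        return 200  # still needed: report healthy
    return 500      # surplus: report unhealthy so this box is retired
```

With one app instance only the first cache box stays healthy; as the app fleet grows, more cache boxes start reporting 200.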

Answer 2

Ok, so here is what I ended up doing.

I added a post-deploy hook to my app. Now, whenever I deploy to Elastic Beanstalk, the post-deploy script installs memcached and starts a memcached server on the local instance.

After that, the script connects to the MySQL server on an RDS instance and registers the instance's IP by adding an entry to the memcached_servers table.
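The registration step can be sketched roughly as follows. The table schema is an assumption inferred from the description, and sqlite3 stands in for the RDS MySQL connection so the example is self-contained; the real hook would use a MySQL client pointed at the RDS endpoint.

```python
import sqlite3


def register_server(conn, ip):
    """Insert (or re-activate) this instance's IP in memcached_servers.

    Schema is assumed: one row per cache server, keyed by IP, with an
    active flag the client helper filters on.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memcached_servers ("
        "ip TEXT PRIMARY KEY, active INTEGER DEFAULT 1)"
    )
    # Re-registering the same IP after a redeploy just refreshes the row.
    conn.execute(
        "INSERT OR REPLACE INTO memcached_servers (ip, active) VALUES (?, 1)",
        (ip,),
    )
    conn.commit()


def active_servers(conn):
    """Return the IPs the client-side helper would fetch once an hour."""
    rows = conn.execute(
        "SELECT ip FROM memcached_servers WHERE active = 1"
    ).fetchall()
    return [r[0] for r in rows]
```

Scaling down would flip active to 0 (or delete the row) for the terminated instance, and clients pick up the change on their next refresh.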

Now, on the client side, we create a memcached client with pylibmc, using a helper class that fetches the IPs from the memcached_servers table once an hour and creates a new client if the servers have changed.


import time

import pylibmc


class MCClient(object):

    _mc_client = None
    _last_refresh = time.time()
    _refresh_client_in = 3600  # seconds, 1 hour
    _servers = []

    @staticmethod
    def client():
        if MCClient._mc_client is None or MCClient.client_timeout():
            MCClient._mc_client = MCClient.new_memcached_client()
        return MCClient._mc_client

    @staticmethod
    def client_timeout():
        return (time.time() - MCClient._last_refresh) > MCClient._refresh_client_in

    @staticmethod
    def fetch_memcached_servers():
        MCClient._last_refresh = time.time()
        return list(MemcachedServer.objects.filter(active=True).values_list('ip', flat=True))

    @staticmethod
    def new_memcached_client():
        servers = MCClient.fetch_memcached_servers()
        if MCClient._mc_client is not None and set(MCClient._servers) == set(servers):
            # do not bother recreating a client if the servers are still the same
            return MCClient._mc_client
        MCClient._servers = servers
        return pylibmc.Client(MCClient._servers, binary=True, behaviors={
            'tcp_nodelay': True,
            'ketama': True,
            'no_block': True,
            'num_replicas': min(len(MCClient._servers) - 1, 4),  # if a server goes down we don't lose cache
            'remove_failed': 3,
            'retry_timeout': 1,
            'dead_timeout': 60,
        })
To get a client I do mc = MCClient.client(). This way, every time Elastic Beanstalk scales up/down, the memcached server list is updated within an hour. The cache is also replicated on at most 4 servers as a safety mechanism, so we don't end up losing cache when a server goes down.