Caching in Python with Examples
In this tutorial, you will learn how to cache frequently needed data in Python to make your code faster.
Caching is a programming technique that stores frequently required data in a temporary location so it can be accessed quickly, instead of being requested from the original source every time.
Every Python programmer should be familiar with the concept of caching.
In this tutorial, we will learn how to implement caching in a Python program using the cachetools library. The cachetools module includes a number of cache classes that implement various cache algorithms. They all derive from the Cache class, which in turn derives from collections.abc.MutableMapping. Each cache exposes maxsize and currsize attributes that report the maximum and current size of the cache. When a cache is full, Cache.__setitem__() repeatedly calls self.popitem() until the new item can be inserted.
This module contains a number of memoizing collections and decorators, including variations of the @lru_cache function decorator from the Python Standard Library.
Install cachetools
pip install cachetools
cachetools.Cache
The cachetools.Cache class provides a mutable mapping that can be used as a simple cache or as a base class for other caches. Cache uses popitem() to make room when necessary. Derived classes can override popitem() to implement specific caching algorithms, and they can also override __getitem__(), __setitem__(), and __delitem__() if they need to track item access, insertion, or deletion.
from cachetools import Cache
# maxsize is the maximum size of the cache (by default, each stored item counts as 1)
cache_data = Cache(maxsize=50000)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item4), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example4.com
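As a quick illustration of the maxsize and currsize attributes and of how popitem() is called to make room, here is a minimal sketch (not part of the original example) that subclasses Cache and reports each eviction; the LoggingCache name and the tiny maxsize are just choices for this demonstration:
from cachetools import Cache

class LoggingCache(Cache):
    # A Cache subclass that reports which item popitem() evicts
    def popitem(self):
        key, value = super().popitem()
        print("Evicted:", key, "->", value)
        return key, value

# A tiny cache that can hold at most 3 items (each item counts as 1 by default)
small_cache = LoggingCache(maxsize=3)
for i in range(5):
    # Once the cache is full, __setitem__() calls popitem() to make room
    small_cache[i] = "value-" + str(i)

print("maxsize  = ", small_cache.maxsize)   # 3
print("currsize = ", small_cache.currsize)  # 3
print("keys     = ", list(small_cache))     # only 3 of the 5 keys remain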
cachetools.FIFOCache
FIFOCache is an implementation of a First In First Out cache. It discards items in the order they were added to the cache to make room when necessary, so popitem() removes the item that was inserted first.
from cachetools import FIFOCache
cache_data = FIFOCache(maxsize=50000)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
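To see the First In First Out policy in action, here is a small sketch with a deliberately tiny maxsize of 2, showing that the oldest entry is the one discarded:
from cachetools import FIFOCache

fifo = FIFOCache(maxsize=2)
fifo["first"] = 1
fifo["second"] = 2
fifo["third"] = 3   # cache is full, so "first" (the oldest entry) is evicted

print(sorted(fifo))        # ['second', 'third']
print(fifo.get("first"))   # None - it was inserted first, so it was removed first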
cachetools.LFUCache
LFUCache is an implementation of a Least Frequently Used cache. It keeps track of how often each item is retrieved and discards the items used least often to free space when necessary, so popitem() removes the least frequently used item from the cache.
from cachetools import LFUCache
cache_data = LFUCache(maxsize=50000)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
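The following sketch (again with an artificially small maxsize) shows the Least Frequently Used policy: the entry retrieved least often is the one discarded when room is needed:
from cachetools import LFUCache

lfu = LFUCache(maxsize=2)
lfu["a"] = "itemA"
lfu["b"] = "itemB"
_ = lfu["a"]        # retrieve "a" so it is used more often than "b"
lfu["c"] = "itemC"  # cache is full, so "b" (least frequently used) is evicted

print(sorted(lfu))  # ['a', 'c']
print(lfu.get("b")) # None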
cachetools.LRUCache
LRUCache is an implementation of a Least Recently Used cache. It discards the items that have gone unused for the longest time to free space when necessary, so popitem() removes the least recently used item from the cache.
from cachetools import LRUCache
cache_data = LRUCache(maxsize=50000)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
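A small sketch of the Least Recently Used policy with a tiny maxsize; touching an entry keeps it in the cache, while the entry that has gone unused the longest is evicted:
from cachetools import LRUCache

lru = LRUCache(maxsize=2)
lru["a"] = "itemA"
lru["b"] = "itemB"
_ = lru["a"]        # touch "a", so "b" becomes the least recently used entry
lru["c"] = "itemC"  # cache is full, so "b" is evicted

print(sorted(lru))  # ['a', 'c']
print(lru.get("b")) # None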
cachetools.MRUCache
MRUCache is an implementation of a Most Recently Used cache. It discards the most recently used items to free space when necessary, so popitem() removes the most recently used item from the cache.
from cachetools import MRUCache
cache_data = MRUCache(maxsize=50000)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
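And the mirror image for the Most Recently Used policy, again as a small sketch with a tiny maxsize; the entry touched last is the one evicted:
from cachetools import MRUCache

mru = MRUCache(maxsize=2)
mru["a"] = "itemA"
mru["b"] = "itemB"
_ = mru["a"]        # "a" is now the most recently used entry
mru["c"] = "itemC"  # cache is full, so "a" (most recently used) is evicted

print(sorted(mru))  # ['b', 'c']
print(mru.get("a")) # None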
cachetools.RRCache
RRCache is an implementation of a Random Replacement cache. It selects items at random and discards them to free space when necessary, so popitem() removes a randomly chosen item from the cache. The choice argument (random.choice by default) determines how the key to discard is picked.
from cachetools import RRCache
import random
cache_data = RRCache(maxsize=50000, choice=random.choice)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
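Because RRCache evicts a random entry, the only thing that can be checked deterministically is that one of the entries was discarded. A small sketch with a tiny maxsize:
from cachetools import RRCache
import random

rr = RRCache(maxsize=2, choice=random.choice)
rr["a"] = "itemA"
rr["b"] = "itemB"
rr["c"] = "itemC"   # cache is full, so a randomly chosen key is evicted

print(rr.currsize)  # 2 - one of the three entries was discarded at random
print(list(rr))     # the surviving keys vary from run to run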
cachetools.TTLCache
TTLCache is an implementation of an LRU cache with a per-item time-to-live (TTL) value. An item is no longer accessible once its time-to-live has expired, and expired items are eventually removed from the cache. When the cache is full, popitem() removes the least recently used item that has not yet expired.
from cachetools import TTLCache
from datetime import datetime, timedelta
# Creating a cache where each item will be accessible for 1 hour
cache_data = TTLCache(maxsize=50000, ttl=timedelta(hours=1), timer=datetime.now)
data_item1 = "http://example1.com"
data_item2 = "http://example2.com"
data_item3 = "http://example3.com"
data_item4 = "http://example4.com"
data_item5 = "http://example5.com"
#Storing data in cache for no longer than 1 hour
cache_data[hash(data_item1)] = data_item1
cache_data[hash(data_item2)] = data_item2
cache_data[hash(data_item3)] = data_item3
cache_data[hash(data_item4)] = data_item4
cache_data[hash(data_item5)] = data_item5
#Accessing data from cache
item = cache_data.get(hash(data_item2), None)
print("Getting from cache = ", item)
The output of the above example will look like this:
Getting from cache = http://example2.com
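To watch an item actually expire, the sketch below uses a deliberately short TTL of one second (with the default timer) instead of the one hour used above:
from cachetools import TTLCache
import time

# Each item lives for at most 1 second (the default timer is time.monotonic)
short_cache = TTLCache(maxsize=10, ttl=1)
short_cache["greeting"] = "hello"

print(short_cache.get("greeting"))  # hello - still within its time-to-live
time.sleep(1.5)                     # wait until the item has expired
print(short_cache.get("greeting"))  # None - the item is no longer accessible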
Memoizing Decorators
The cachetools module also provides decorators for memoizing function and method calls. Memoization is a technique that stores the results of expensive function calls and returns the cached result when the same inputs occur again, instead of recomputing it.
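To make the idea concrete, here is a hand-rolled sketch of memoization using a plain dictionary; the decorators below do essentially this for you, with a configurable cache:
# Manual memoization with a plain dictionary
_results = {}

def expensive_square(n):
    if n not in _results:       # compute only on the first call
        _results[n] = n * n     # stand-in for an expensive computation
    return _results[n]          # later calls return the cached result

print(expensive_square(12))  # computed
print(expensive_square(12))  # returned from the cache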
@cachetools.cached
cachetools.cached is a decorator that wraps a function with a memoizing callable and stores the results in a cache.
from cachetools import cached
import time

# Calculating Fibonacci numbers WITHOUT a cache
def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

start_time = time.time()
print("Result without cache = ", fibonacci(30))
end_time = time.time()
print("Time Taken without cache : ", end_time - start_time)

# Calculating Fibonacci numbers WITH a cache
@cached(cache={})
def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

start_time = time.time()
print("Result with cache = ", fibonacci(30))
end_time = time.time()
print("Time Taken with cache : ", end_time - start_time)
The output of the above example will look like this:
Result without cache = 832040
Time Taken without cache : 0.19904470443725586
Result with cache = 832040
Time Taken with cache : 0.0
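The cache passed to @cached does not have to be a plain dictionary; any cachetools cache works, and the decorator also accepts optional key and lock arguments. A sketch (assuming the documented key and lock parameters) that shares a bounded LRUCache and guards it with a lock for multi-threaded use:
import threading
from cachetools import cached, LRUCache
from cachetools.keys import hashkey

# A bounded cache plus a reentrant lock so it can be shared between threads
fib_cache = LRUCache(maxsize=128)
fib_lock = threading.RLock()

@cached(cache=fib_cache, key=hashkey, lock=fib_lock)
def fibonacci(n):
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

print(fibonacci(30))        # 832040
print(fib_cache.currsize)   # number of distinct arguments cached so far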
The following is another example of calling a function with and without caching the result. The cached version saves the result for no longer than 10 minutes:
from cachetools import cached, TTLCache
import urllib.request
import time

BASE_URL = "https://api.openweathermap.org/data/2.5/weather?"
# Get your API key at https://home.openweathermap.org/api_keys
API_KEY = "YOUR_API_KEY"

# WITHOUT cache: every call makes an HTTP request
def get_weather(city):
    URL = BASE_URL + "q=" + city + "&appid=" + API_KEY
    f = urllib.request.urlopen(URL)
    if f.status == 200:
        data = f.read().decode('utf-8')
        return data
    return None

start_time = time.time()
print(get_weather("Boston"))
end_time = time.time()
print("Time Taken without using cache = ", end_time - start_time)

# WITH cache: weather data is cached for no longer than ten minutes
@cached(cache=TTLCache(maxsize=1024, ttl=600))
def get_weather(city):
    URL = BASE_URL + "q=" + city + "&appid=" + API_KEY
    f = urllib.request.urlopen(URL)
    if f.status == 200:
        data = f.read().decode('utf-8')
        return data
    return None

start_time = time.time()
print(get_weather("Boston"))
end_time = time.time()
print("Time Taken using cache = ", end_time - start_time)
The output of the above example will look like this:
Time Taken without using cache = 2.791274309158325
{"coord":{"lon":-71.0598,"lat":42.3584},"weather":[{"id":801,"main":"Clouds","description":"few clouds","icon":"02n"}],"base":"stations","main":{"temp":264.65,"feels_like":258.32,"temp_min":260.9,"temp_max":268.12,"pressure":1040,"humidity":63},"visibility":10000,"wind":{"speed":4.12,"deg":220},"clouds":{"all":20},"dt":1644982968,"sys":{"type":1,"id":3486,"country":"US","sunrise":1644925256,"sunset":1644963363},"timezone":-18000,"id":4930956,"name":"Boston","cod":200}
Time Taken using cache = 0.43514037132263184
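For memoizing methods rather than plain functions, cachetools provides the @cachedmethod decorator, which takes a callable returning the cache to use for a given instance. A brief sketch (the WeatherClient class is hypothetical, standing in for the kind of request shown above):
import operator
from cachetools import cachedmethod, TTLCache

class WeatherClient:
    def __init__(self):
        # each instance gets its own 10-minute cache
        self.cache = TTLCache(maxsize=1024, ttl=600)

    @cachedmethod(operator.attrgetter('cache'))
    def get_weather(self, city):
        # stand-in for the real HTTP request shown above
        return "weather data for " + city

client = WeatherClient()
print(client.get_weather("Boston"))  # computed on the first call
print(client.get_weather("Boston"))  # served from the instance's cache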