Your concerns are valid. They suggest that what started as a quick caching solution is becoming part of your architecture, which naturally brings a new class of issues, as you describe yourself.
A good architectural solution to caching is to use annotations combined with IoC (inversion of control), which addresses several of the problems you described. For example, this approach would:
- Allow you to better control the life-cycle of your cached objects
- Allow you to replace the caching behavior easily by changing the annotations (instead of changing the implementation)
- Let you easily configure a multi-layered cache, for example one that stores in memory first and then in a disk cache
- Let you define the key (hash) for each method in the annotation itself
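To make the points above concrete, here is a minimal Python sketch, assuming decorators play the role of annotations. Every name in it (`DictCache`, `LayeredCache`, `cached`, `load_user`) is hypothetical and invented for illustration; it is not the API of Spring or of any existing library, and the second `DictCache` merely stands in for a real disk-backed layer.

```python
import functools

_MISSING = object()  # sentinel, so that a cached None still counts as a hit


class DictCache:
    """Hypothetical in-memory backend; any object with get/set would do."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class LayeredCache:
    """Checks each layer in order and back-fills the faster layers on a hit."""

    def __init__(self, *layers):
        self._layers = layers

    def get(self, key, default=None):
        for i, layer in enumerate(self._layers):
            value = layer.get(key, _MISSING)
            if value is not _MISSING:
                for faster in self._layers[:i]:
                    faster.set(key, value)
                return value
        return default

    def set(self, key, value):
        for layer in self._layers:
            layer.set(key, value)


def cached(cache, key=None):
    """Decorator factory: the backend and the key function are declared
    at the decoration site, not inside the decorated function."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if key is not None:
                k = key(*args, **kwargs)
            else:
                # Default key; assumes all arguments are hashable.
                k = (func.__qualname__, args, tuple(sorted(kwargs.items())))
            value = cache.get(k, _MISSING)
            if value is _MISSING:
                value = func(*args, **kwargs)
                cache.set(k, value)
            return value

        return wrapper

    return decorator


# Usage: a second DictCache stands in for a disk layer (assumption).
memory, disk = DictCache(), DictCache()
calls = []


@cached(LayeredCache(memory, disk), key=lambda user_id: ("user", user_id))
def load_user(user_id):
    calls.append(user_id)  # records how often the real work runs
    return {"id": user_id}


load_user(7)
load_user(7)  # served from the cache; the function body does not run again
```

Swapping the backend or the key function only touches the `@cached(...)` line, which is the property the list above describes.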
In my projects (Java or C#) I use Spring caching annotations. You can find a brief description here.
IoC is a key concept in this solution because it allows you to configure your caching system any way you want.
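As a rough illustration of why IoC matters here, the sketch below injects the cache into a service instead of letting the service create it. It uses plain constructor injection rather than a real container, and all names (`InMemoryCache`, `NullCache`, `ReportService`, `build_service`, the `"cache"` config key) are assumptions made up for this example.

```python
class InMemoryCache:
    """Hypothetical backend that keeps values in a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


class NullCache:
    """Caches nothing; handy for tests or for switching caching off."""

    def get(self, key, default=None):
        return default

    def set(self, key, value):
        pass


class ReportService:
    """Depends only on the get/set interface, not on a concrete cache."""

    def __init__(self, cache):
        self._cache = cache  # injected from outside
        self.computations = 0

    def total(self, month):
        hit = self._cache.get(month)
        if hit is not None:
            return hit
        self.computations += 1
        result = len(month) * 100  # placeholder for an expensive computation
        self._cache.set(month, result)
        return result


def build_service(config):
    """Composition root: the only place that knows which cache is in use."""
    backends = {"memory": InMemoryCache, "none": NullCache}
    return ReportService(cache=backends[config["cache"]]())
```

With this arrangement, changing `{"cache": "memory"}` to `{"cache": "none"}` changes the caching behavior without touching `ReportService`. A real IoC container would automate this wiring.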
To implement a similar solution in Python, you need to find out how to use annotations (decorators are the closest Python equivalent) and look for an IoC container that lets you build proxies. Proxies are what make the annotations work: they intercept every method call, which is what allows caching to be applied around it.
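The proxy idea can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not how Spring or any particular container implements it: `cacheable`, `CachingProxy`, and `PriceService` are hypothetical names, and the cache is a plain dict.

```python
import functools


def cacheable(func):
    """Marker decorator: it only tags the method; the proxy does the caching."""
    func._cacheable = True
    return func


class CachingProxy:
    """Wraps a target object and intercepts calls to methods tagged @cacheable."""

    def __init__(self, target, cache):
        self._target = target
        self._cache = cache

    def __getattr__(self, name):
        # Called only for attributes not found on the proxy itself,
        # so every access to the target's attributes passes through here.
        attr = getattr(self._target, name)
        if not (callable(attr) and getattr(attr, "_cacheable", False)):
            return attr  # untagged attributes pass straight through

        @functools.wraps(attr)
        def intercepted(*args, **kwargs):
            # Assumes all arguments are hashable.
            key = (name, args, tuple(sorted(kwargs.items())))
            if key not in self._cache:
                self._cache[key] = attr(*args, **kwargs)
            return self._cache[key]

        return intercepted


class PriceService:
    def __init__(self):
        self.calls = 0

    @cacheable
    def price(self, sku):
        self.calls += 1
        return len(sku) * 10  # placeholder for an expensive lookup

    def ping(self):
        self.calls += 1
        return "pong"


# Callers receive the proxy instead of the real object.
service = CachingProxy(PriceService(), cache={})
service.price("abc")
service.price("abc")  # answered from the cache; PriceService.price is not called
```

An IoC container fits in at the last step: it is the component that decides to hand out the proxy rather than the raw `PriceService`, so calling code never needs to know caching is involved.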