As I’m sure you all know, one of the key technologies VMware has offered for a long time is memory ballooning, which frees memory from idle guest OSes in order to return it to the pool.
My own real world experience managing hundreds of VMs in VMware has really made me want to do one thing more than anything else:
Manually inflate that damn memory balloon
I don’t want to have to wait until there is real memory pressure on the system to reclaim that memory. I don’t use Windows so I can’t speak for it there, but Linux is very memory greedy: it will use all the memory it can for disk cache and the like.
What I’d love to see is a daemon (maybe even part of vmware-tools) running on the system, monitoring system load as well as how much memory is actually in use. Many Linux newbies do not know how to calculate real usage; going by the amount of memory reported as available by the “free” command or the “top” command is wrong. True memory usage on Linux is best calculated:
- [Total Memory] – [Free Memory] – [Buffers] – [Cache] = Used memory
I really wish there was an easy way to display that particular stat, because the numbers returned by the stock tools are so misleading. I can’t tell you how many times I’ve had to explain to newbies that even though ‘free’ says there is only 10MB available, there is PLENTY of RAM on the box, because 10 gigs of memory is sitting in cache. They say, “oh no, we’re out of memory, we’ll start swapping soon!” Wrong answer.
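For what it’s worth, here is a quick way to compute that stat straight out of /proc/meminfo (a minimal sketch; field names are as they appear on a stock kernel, values in kB):

```sh
# True used memory = MemTotal - MemFree - Buffers - Cached
awk '/^MemTotal:/ {t=$2} /^MemFree:/ {f=$2} /^Buffers:/ {b=$2} /^Cached:/ {c=$2}
     END {printf "true used: %d MB\n", (t-f-b-c)/1024}' /proc/meminfo
```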
So back to my request. I want a daemon that runs on the system, watches system load, watches true memory usage, and dynamically inflates that balloon to return memory to the free pool before the host runs low on memory. So often idle VMs really aren’t doing anything, and when you’re running on high grade enterprise storage, well, you know there is a lot of fancy caching and wide striping going on there, the storage is really fast! Well, it should be. Since the memory is not being used (it is just sitting in cache that nothing is touching), inflate that balloon and return it.
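To make the idea concrete, here is a rough sketch of the loop I have in mind, in plain shell. The thresholds are arbitrary, and the inflate_balloon hook is hypothetical; as far as I know there is no guest-side command to do this today, which is exactly my complaint:

```sh
#!/bin/sh
# Hypothetical balloon daemon sketch. inflate_balloon is made up:
# no such guest-side interface exists today. That is the feature request.
LOAD_MAX="0.5"      # only reclaim when the box is idle
CACHE_MIN_MB=1024   # only bother if at least 1GB is sitting in cache

while sleep 60; do
    load=$(cut -d' ' -f1 /proc/loadavg)
    cache_mb=$(awk '/^Cached:/ {print int($2/1024)}' /proc/meminfo)
    # Idle and sitting on a pile of unused cache? Hand it back to the host.
    if [ "$(echo "$load < $LOAD_MAX" | bc)" -eq 1 ] && \
       [ "$cache_mb" -gt "$CACHE_MIN_MB" ]; then
        inflate_balloon "${cache_mb}M"   # hypothetical hook
    fi
done
```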
There really should be no performance hit. 99% of the time the cache is a read cache, not a write cache, so when you free it the data is simply dropped; it doesn’t have to be flushed to disk. (You can run the ‘sync’ command to force a flush of anything dirty and see what I mean: typically the command returns instantaneously because there is so little to write.)
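Easy to demonstrate on an otherwise idle box (the timings below are illustrative, not from a real run):

```sh
$ time sync
real    0m0.04s
user    0m0.00s
sys     0m0.01s
```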
What I’d like even more than that, though, is to be able to better control how the Linux kernel allocates cache, and how frequently it frees it. I haven’t checked in a little while, but last I checked there wasn’t much to control here.
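To be fair, there are a couple of knobs under /proc/sys/vm, but they are blunt instruments compared to what I am describing (both need root):

```sh
# Make the kernel more eager to reclaim cached dentries and inodes
# (default is 100; higher means reclaim sooner)
sysctl -w vm.vfs_cache_pressure=200

# Drop the caches outright: 1 = page cache, 2 = dentries+inodes, 3 = both
sync && echo 3 > /proc/sys/vm/drop_caches
```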
I suppose that may be the next step in the evolution of virtualization – more intelligent operating systems that can be better aware they are operating in a shared environment, and return resources to the pool so others can play with them.
One approach might be to offload all of the storage I/O caching to the hypervisor. I suppose this could be similar to using raw devices (which bypass several file system functions). Aggregating that caching at the hypervisor level would be more efficient.
So totally agree with everything in this post. I’d like it… but that’s too facebook for this blog. 🙂
Comment by Justin — October 8, 2010 @ 12:35 am
Oh… and Windows sucks at memory management…. they page just because they can…. you can have a box with 32GB of RAM… 16GB free… and you’re still paging…. worse, Windows pages to the page file even when it isn’t running any applications….
Comment by Justin — October 8, 2010 @ 12:36 am
[…] Oh and I won’t forget – give us an ability to manually control the memory balloon. […]
Pingback by Cluster vMotion « TechOpsGuys.com — July 20, 2011 @ 12:05 am
[…] YAWA with regards to compression would be to provide me with compression ratios – how effective is the compression when it's in use? Recommend to me VMs that have low utilization that I could pro-actively reclaim memory by compressing these, or maybe only portions of the memory are worth compressing? The Hypervisor with the assistance of the vmware tools has the ability to see what is really going on in the guest by nature of having an agent there. The actual capability doesn't appear to exist now but I can't imagine it being too difficult to implement. Sort of along the lines of pro-actively inflating the memory balloon. […]
Pingback by Interesting discussion on vSphere vs Hyper-V « TechOpsGuys.com — April 7, 2012 @ 3:04 pm