Large or small memory size for my app?
When is it best to run your application with a few instances and a large memory size, or with a lot of instances and a small memory size? In this article, Ram Lakshmanan discusses the differences and pricing of each approach and tells the stories of two multi-billion-dollar enterprises.
Should I be running my application with few instances (i.e. machines) with large memory size or a lot of instances with small memory size? Which strategy is optimal? This question comes up often. After building applications for two decades, and after building JVM performance engineering/troubleshooting tools (GCeasy, FastThread, HeapHero), I still don’t know the right answer to this question. I also believe there is no binary answer to it. In this article, I would like to share my observations and experiences on this topic.
A tale of two multi-billion-dollar enterprises
Since our JVM performance engineering/troubleshooting tools are widely used in major enterprises, I have had the opportunity to see world-class enterprise application implementations in action. Recently I had a chance to see two hyper-growth technology companies (if I said their names, everyone reading this article would know them). Both companies are headquartered in Silicon Valley. Their business is technology, so they know what they are doing when it comes to engineering. They are Wall Street darlings, enjoying great valuations. Their market caps are on the order of several billion dollars. They are the poster children of modern thriving enterprises. For our conversation, let’s call these two enterprises Company-A and Company-B.
It surprised me immensely to see that the two enterprises have adopted opposite extremes when it comes to memory size. Company-A has set its heap size (i.e. -Xmx) to 250 GB, whereas Company-B has set its heap size to 2 GB. Company-A’s heap size is 125 times larger than Company-B’s heap size. Both enterprises are confident in their memory size settings. As they say: ‘The proof is in the pudding’; both enterprises are scaling and handling billions of business-critical transactions.
It is remarkable to see two companies in the same business, with more or less the same revenue and market cap, located in the same geographical region, at the same point in time, adopting two extremes when it comes to memory size. Given this real-life experience, what is the right answer? Large or small memory size? My conclusion is: You can succeed with either strategy if you have a good team in place.
Large memory size can be expensive
A large memory size with a few instances (i.e. machines) tends to be more expensive than a small memory size with a greater number of instances. Here is some simple math, based on the cost of AWS EC2 instances in the US East (N. Virginia) region:
m4.16xlarge – 256GB RAM – Linux on-demand instance cost: $3.2/hour
t3a.small – 2GB RAM – Linux on-demand instance cost: $0.0188/hour
So, to have a capacity of 256GB RAM, we would have to get 128 ‘t3a.small’ instances (i.e. 128 instances x 2GB = 256GB).
128 x t3a.small – 2GB RAM – Linux on-demand instance cost: $2.4064/hour (i.e. 128 x $0.0188/hour)
A large memory size with few instances costs $0.7936/hour (i.e. $3.2 – $2.4064) more than a small memory size with a lot of instances. In other words, the ‘large memory size with few instances’ strategy is about 33% more expensive.
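The arithmetic above can be reproduced with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the prices are the on-demand figures quoted in this article, which may no longer match current AWS pricing.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class InstanceCostComparison {
    // Prices quoted in the article (US East, Linux on-demand); treat them as assumptions.
    static final BigDecimal LARGE_PRICE_PER_HOUR = new BigDecimal("3.2");    // m4.16xlarge
    static final BigDecimal SMALL_PRICE_PER_HOUR = new BigDecimal("0.0188"); // t3a.small
    static final int LARGE_RAM_GB = 256;
    static final int SMALL_RAM_GB = 2;

    /** Number of small instances needed to match the RAM of one large instance (rounded up). */
    static int smallInstancesNeeded() {
        return (LARGE_RAM_GB + SMALL_RAM_GB - 1) / SMALL_RAM_GB;
    }

    /** Hourly price of the whole fleet of small instances. */
    static BigDecimal smallFleetPricePerHour() {
        return SMALL_PRICE_PER_HOUR.multiply(BigDecimal.valueOf(smallInstancesNeeded()));
    }

    /** How much more the single large instance costs per hour. */
    static BigDecimal extraCostPerHour() {
        return LARGE_PRICE_PER_HOUR.subtract(smallFleetPricePerHour());
    }

    /** Extra cost of the large instance as a percentage of the small fleet's price. */
    static BigDecimal premiumPercent() {
        return extraCostPerHour()
                .multiply(BigDecimal.valueOf(100))
                .divide(smallFleetPricePerHour(), 1, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println("Small instances needed: " + smallInstancesNeeded());
        System.out.println("Small fleet price/hour: $" + smallFleetPricePerHour());
        System.out.println("Extra cost/hour:        $" + extraCostPerHour());
        System.out.println("Premium:                " + premiumPercent() + "%");
    }
}
```

BigDecimal is used instead of double so that the dollar amounts come out exactly as written in the text rather than with floating-point noise.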
Of course, there is a counter-argument: You might need fewer engineers, less electricity, and less real estate if you have a smaller number of machines. Patching and upgrading servers might be easier to do as well.
In some cases, the nature of your business itself dictates the memory size of your application. Here is a real-life incident that we faced: When we built HeapHero (a heap dump analysis tool), our tool’s memory size had to be larger than the heap dump file it parses. Suppose a heap dump file is 100 GB in size; then the HeapHero tool’s memory size must be more than 100 GB. There is no choice.
Suppose you are caching a large amount of data (say 200 GB) to maximize your application’s performance; then your heap size must be more than 200 GB. You will not have a choice. Thus, in some cases, the business requirements will dictate your memory size.
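A requirement like this can be checked at startup. The sketch below compares the JVM’s maximum heap (which roughly reflects -Xmx) against the size of the data you plan to hold in memory. The class name, the helper method, and the 20% headroom figure are assumptions made for illustration; they do not come from any of the tools mentioned here.

```java
public class HeapSizeCheck {
    static final long GIB = 1024L * 1024 * 1024;

    /**
     * Returns true when the max heap is strictly larger than the data size
     * plus a headroom fraction (e.g. 0.2 means 20% extra for everything else).
     */
    static boolean heapFits(long maxHeapBytes, long dataBytes, double headroomFraction) {
        double requiredBytes = dataBytes * (1.0 + headroomFraction);
        return maxHeapBytes > requiredBytes;
    }

    public static void main(String[] args) {
        // maxMemory() reports the most heap the JVM will attempt to use.
        long maxHeapBytes = Runtime.getRuntime().maxMemory();
        long cacheBytes = 200 * GIB; // the 200 GB cache from the example above
        double headroom = 0.2;       // assumed headroom, tune for your application

        System.out.printf("Max heap: %.1f GiB%n", maxHeapBytes / (double) GIB);
        if (!heapFits(maxHeapBytes, cacheBytes, headroom)) {
            System.out.println("Heap is too small for the cache; raise -Xmx or shrink the cache.");
        } else {
            System.out.println("Heap is large enough for the cache.");
        }
    }
}
```

Run with Company-B’s setting (`java -Xmx2g HeapSizeCheck`) and the check fails immediately, which is the point: a 200 GB cache simply rules out the small-heap strategy.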
Performance & troubleshooting
If your memory size is large, then Garbage Collection pause times will typically also be high. Garbage collection is a process that runs in your application to clean up unreferenced objects in memory, and while it runs, it pauses your application. If your memory size is large, then the amount of garbage in memory will tend to be large as well. Thus, the amount of time taken to clean up that garbage will also tend to be high.
But there are solutions to this problem:
- You can use a pauseless JVM (like ‘Azul’)
- You can do proper GC tuning to reduce pause times
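Before tuning anything, it helps to see how much time your application is actually spending in garbage collection. The sketch below reads the standard `GarbageCollectorMXBean` counters from inside the running JVM. The class name is made up, and note that the reported collection time is an approximate accumulated figure, so treat it as a rough proxy for pause time rather than an exact measurement. A GC log gives a far more detailed picture.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeReport {
    /** Sum of collection times (ms) over all collectors; undefined (-1) values are skipped. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Sum of collection counts over all collectors; undefined (-1) values are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Create some short-lived objects so the collectors have work to do.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[10_000];
            checksum += junk.length;
        }
        System.out.println("Allocated bytes: " + checksum);

        // Per-collector breakdown, then the totals.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("Total: %d collections, %d ms%n", totalGcCount(), totalGcTimeMillis());
    }
}
```

Running the same program with a small and a large -Xmx value is a simple way to observe how heap size changes the number and duration of collections on your own hardware.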
Similarly, if you need to troubleshoot any memory problem, you will have to capture heap dumps from the application. A heap dump is basically a file that contains all the information about your application’s memory, such as which objects were present, what they reference, and how much memory each object occupies. Heap dumps of applications with a large memory size also tend to be very large, and analyzing large heap dumps is difficult. Even the world’s best heap dump tools like Eclipse MAT and HeapHero have challenges in parsing heap dumps that are more than 100 GB. Reproducing these problems in a test lab, storing these heap dump files, and sharing these heap dump files are all challenges.
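For reference, here is one way a heap dump can be captured from inside the application, using the `HotSpotDiagnosticMXBean` that ships with HotSpot-based JDKs. This is a minimal sketch: the class name, the helper method, and the temp-directory location are assumptions for illustration, and other JVMs may not provide this bean at all.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeapDumpCapture {
    /**
     * Writes a heap dump into a fresh temporary directory and returns its path.
     * When liveObjectsOnly is true, only reachable objects are dumped.
     */
    static Path dumpToTempFile(boolean liveObjectsOnly) {
        try {
            Path dir = Files.createTempDirectory("heapdump-demo");
            // The target file must not exist yet; recent JDKs also expect a .hprof suffix.
            Path file = dir.resolve("app.hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(file.toString(), liveObjectsOnly);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempFile(true);
        long sizeMb = file.toFile().length() / (1024 * 1024);
        System.out.printf("Heap dump written to %s (%d MB)%n", file, sizeMb);
    }
}
```

The printed file size makes the problem described above concrete: the dump grows with the amount of heap in use, so a JVM holding hundreds of gigabytes produces a file that is correspondingly hard to store, move, and parse.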
Emotions come first, rationale next
After reading books like How We Decide by Jonah Lehrer, I am fairly convinced that your prior experience and emotions play a key role in deciding your application’s memory size. I used to work for a major financial institution. The chief architect of this financial institution suggested we run our JVMs with a very large memory size. The rationale he gave was: “We used to run mainframes with very large memory size” 😊.
If you are working for a very large corporation, then there is a 99.99% chance that you will not have a say in what the memory size of your application should be. That decision has already been made by the elites/demi-gods sitting in the ivory tower 😊. It might be too hard to reverse or change that decision.
But if you do have a choice, your decision on memory size will most likely be influenced by your prior experience and emotions. Either way, you can’t go wrong (whether you go with few instances with large memory size or a lot of instances with small memory size), provided you have the right team in place.