Scale and Resilience. Are they the same?
No, they aren't. They are different paradigms altogether.
But then why ask this question?
Because there is a tendency to think they are the same, owing to when they show up and the solution they sometimes share.
Scale challenges and resiliency problems almost always show up when systems are under load. Systems at their limits tend to expose points of failure. Additional capacity can mitigate the symptoms for some time, but not always.
What they really are
In general, scale is for "more", and resilience is for "worse".
Consider a common restaurant example. Let's say there is a relatively new place that has become popular, and on a specific festival night a large crowd is expected. The owner anticipated this and hired a few temporary kitchen staff for the night. That is scaling. And they have got by with that more than a few times.
In their supply chain, though, they had a single dairy provider, and today its truck broke down on the road. It had never happened before: the partner was wildly punctual and had standby trucks. But during the festival days the supplier was itself running above its limit. Eventually the restaurant couldn't serve its most in-demand shakes and cheese sandwiches. That is a failure to be resilient. They didn't have an alternate. And it showed up more because the system was under full load. You could say the whole region was under more load, which created a ripple effect.
Life requires balancing Scaling and Resiliency
Scaling, by definition, allows you to do more, while resiliency allows you to keep going, even at some minimum pace. In the example above, if the restaurant had some redundancy, such as a backup supplier or additional stock that kicked in at such times, they could have kept the restaurant running, even at slightly lower capacity, until the main supplier fixed the truck and delivered the delayed shipment.
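To make the distinction concrete, here is a minimal Python sketch of the restaurant example. Everything in it is a hypothetical illustration: the names `Supplier`, `orders_per_hour`, and `get_dairy`, and all the numbers, are assumptions and not taken from any real system. Scaling is modeled as adding cooks, and resilience as falling back to a backup supplier or on-site reserve stock when the main supplier fails.

```python
from dataclasses import dataclass


class SupplierDown(Exception):
    """Raised when a supplier cannot deliver."""


@dataclass
class Supplier:
    name: str
    available: bool = True

    def deliver(self, litres: int) -> int:
        if not self.available:
            raise SupplierDown(self.name)
        return litres


def orders_per_hour(cooks: int, per_cook: int = 10) -> int:
    # Scaling: more cooks means more orders served ("more").
    return cooks * per_cook


def get_dairy(litres: int, suppliers: list[Supplier], reserve: int) -> tuple[int, str]:
    # Resilience: try each supplier in turn, then fall back to
    # whatever reserve stock is on site ("worse").
    for supplier in suppliers:
        try:
            return supplier.deliver(litres), supplier.name
        except SupplierDown:
            continue
    return min(litres, reserve), "reserve stock"


main = Supplier("main dairy", available=False)  # the truck broke down
backup = Supplier("backup dairy")

print(orders_per_hour(cooks=4))                   # 40, a regular night
print(orders_per_hour(cooks=7))                   # 70, festival night with temporary staff
print(get_dairy(50, [main], reserve=0))           # (0, 'reserve stock'): no shakes tonight
print(get_dairy(50, [main, backup], reserve=20))  # (50, 'backup dairy')
print(get_dairy(50, [main], reserve=20))          # (20, 'reserve stock'): reduced, but still serving
```

Note that adding cooks does nothing for the dairy shortage. In this sketch the extra capacity and the fallback path are separate levers, which is the point of the example.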
For me, one of the best ways to decide which to favor is to see which systems are more critical than others. If a system is critical, stability is more important than scale. We would love to have redundancy and scale at the same time, but we always weigh the return we get from it.
This is just as true for us humans. Life has many facets, and we decide what is critical and what is not. We need to prioritize and decide where scale is important, and what minimum SLA we need for each facet. Unlike machines, there is a limit to the redundancy and scale humans can provide.
Pick where you should be resilient