Istio cluster under high load every time

I spin up the Istio cluster for testing and install the example projects from the Istio website to start my tests. As soon as that is done, the load on the cluster shoots up and never comes back down. It’s always a failure. Has anyone else noticed this?

top - 16:22:19 up 4:33, 0 users, load average: 89.01, 68.04, 37.22
Tasks: 183 total, 16 running, 117 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.8 us, 97.2 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2040616 total, 53412 free, 1833896 used, 153308 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 46456 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
41 root 20 0 0 0 0 R 81.9 0.0 7:23.88 [kswapd0]

For reference, I deployed the applications from Istio / Getting Started.

I also see that istio-ingressgateway is missing from the cluster every time it’s freshly spun up. It should be available if this is an Istio testing platform.

What kind of cluster are you using, and how much memory do you have allocated to it? Istio is a hungry beast and can misbehave if memory-starved.
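The top output above already hints at memory starvation: kswapd0 (the kernel’s page-reclaim thread) is at 81.9% CPU, 97.2% of CPU time is system time, and only 46456 KiB of 2040616 KiB is available. As a rough sketch, the two memory figures can be turned into a percentage like this. The helper name mem_avail_pct is made up for illustration and is not part of Istio or the playground.

```shell
#!/bin/sh
# Hypothetical helper (not part of Istio or the playground):
# print available memory as a percentage of total memory.
# Arguments: total KiB, available KiB.
mem_avail_pct() {
    awk -v total="$1" -v avail="$2" \
        'BEGIN { printf "%.1f\n", 100 * avail / total }'
}

# Figures copied from the top output in the first post:
# "KiB Mem : 2040616 total" and "46456 avail Mem".
mem_avail_pct 2040616 46456
```

With those figures it prints 2.3, i.e. about 2.3% of memory is still available. On a live Linux node, the same two numbers are the MemTotal and MemAvailable fields of /proc/meminfo. A value in the low single digits, together with kswapd0 at the top of the process list, suggests the node is short on memory rather than CPU.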

I use the Istio playground cluster. I haven’t allocated memory for the workloads I deployed, as they were test manifests from Istio’s official documentation.

I did some more testing. The cluster setup is missing metrics-server; it needs to be added so resource usage can be observed. Also, the cluster can’t run this with such bare-minimum resources. I had to cut some of the apps to keep memory consumption within the cluster’s limits. At the very least, you should consider raising the memory limits to accommodate Istio’s official example apps, so that newbies like us have a better chance of testing Istio features. Otherwise the purpose of this playground is defeated.

So you’re running the demo profile? Or if not that, which profile? There will be resource limits on the playground, so I’d guess it’s possible to install profiles that stretch beyond what the playground is intended to support.

I am using Istio’s demo profile. What do you recommend other than the demo profile? I had to remove some apps to reduce memory consumption.

I think you have a point here. demo is the “entry point” for testing out Istio. I’m running it right now on a minikube cluster with about 8 GB of memory dedicated, so the playground likely needs to run on a slightly larger instance. I’ve reported this to our labs team; hopefully they’ll be able to fix this soon.

@rob_kodekloud @mmumshad - I appreciate your help here. Can we please have this expedited? It makes no sense to pay for these services and see no resolution in the near future. It’s not just this; there are several other issues with the entire platform. I know every learning lab environment has its own problems, and they are resolved as and when reported, but KodeKloud is far from being a basic working environment.

I have asked the team about this today; if I hear anything, I’ll pass it on to you.

@mmumshad I don’t want to keep tagging Mumshad for everything, but it is really concerning that even a valid Istio issue like this takes forever. Why can’t a simple problem like this be addressed? It is not just this; every project I take on in GCP/AWS has its own fair share of problems. I seriously think this is my worst choice of platform for learning.

I am told by the lab team today that they’ve made a fix here. Are you still having problems with this playground?

I wish they had tested it after they implemented a solution. I tested it today and the load kept shooting up. In my tests, it didn’t work with the Istio demo profile.

Sorry about that. I’ve told them that it is not fixed. On the assumption that the engineers in question aren’t that familiar with Istio, how are you testing this? I can pass this on to them as well, which will hopefully get things resolved for using the demo profile.

Rob, just have them follow the steps in the Istio documentation: Istio / Getting Started

Has this been addressed at all, please? You can’t charge a premium only to leave issues like these open for a year.

It was addressed a while ago. Part of the problem here is that the engineers needed a better sense of what you were doing that did not work – no procedure, nothing for them to test. I’ve just brought the playground up, and run the bookinfo demo:

root@controlplane ~/istio-1.23.2/samples ➜  k apply -f bookinfo/platform/ku

This does work. I don’t see signs of excessive load. I exposed the main page using the “View Port” feature of the lab environment, and Kiali shows traffic on that page. So some things work in the environment. I’d guess some things won’t, due to load. But without more info, there isn’t much for engineering to go by.

If you’ll please give a repeatable procedure for demonstrating what you’re seeing, I can pass it on to engineering. I believe they did increase the resources in the playground. But right now, they just don’t have enough to go on. Please be more specific: if they have something they can test for themselves, they can make changes.