GCH Congestion on GPU (Alcatel)

This topic has 9 replies, 1 voice, and was last updated 10 years, 6 months ago by Rex.

Viewing 10 posts - 1 through 10 (of 10 total)

Author

Posts
29th December 2013 at 12:12 #69422

Ian
Guest

We are facing high GCH congestion on some BSCs (at GPU level), and it seems to be due to uneven distribution of traffic over the GPUs. We have reshuffled the GPUs many times to balance traffic across all of them, but the distribution is still uneven and results in GCH congestion.

Can anyone help us move forward?

29th December 2013 at 17:12 #69423

Pix
Guest

Hi Ian,

Do you know the way the cells are reshuffled among the GP boards?

It is not based on peak traffic, but on other inputs. Maybe it is based on MIN PDCH, or MAX PDCH, or the number of TRX per cell.

I would first analyse the reshuffling algo, and try to force it to spread the cells in a more harmonious way. However, it may be easier to either add more GP boards, or globally reduce the PDCH consumption of each cell.

Could you remind me of your ALU release, and BSC generation ?

Regards
pix
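
To illustrate what a more even spread could look like, here is a minimal sketch of a greedy balancing pass that assigns cells to GP boards by busy-hour PDCH demand. It is not the ALU reshuffling algorithm, whose inputs are not documented in this thread. The function `balance_cells`, the cell names, and the demand figures are all hypothetical.

```python
import heapq

def balance_cells(cell_demand, n_boards):
    """Greedy assignment of cells to GP boards, heaviest cell first.

    cell_demand: dict of cell name -> busy-hour PDCH demand (hypothetical input).
    Returns one (total_load, [cells]) tuple per board.
    """
    # Min-heap of (current load, board index): the least-loaded board pops first.
    heap = [(0, i) for i in range(n_boards)]
    heapq.heapify(heap)
    boards = [[] for _ in range(n_boards)]
    loads = [0] * n_boards
    # Sort by descending demand (name breaks ties) and place each cell
    # on whichever board currently carries the least load.
    for cell, demand in sorted(cell_demand.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        boards[idx].append(cell)
        loads[idx] = load + demand
        heapq.heappush(heap, (loads[idx], idx))
    return [(loads[i], boards[i]) for i in range(n_boards)]

# Illustrative figures only, not measurements from any network.
demand = {"cell_A": 8, "cell_B": 7, "cell_C": 6, "cell_D": 5, "cell_E": 4, "cell_F": 2}
result = balance_cells(demand, 2)
for load, cells in result:
    print(load, cells)
```

Comparing the per-board totals from such a pass with the actual per-GP load may show how far the real mapping is from a traffic-balanced one.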

30th December 2013 at 07:21 #69424

Ian
Guest

Hi Pix,
Thanks a lot for your feedback. I searched for the reshuffling algo but unfortunately I couldn’t find it. I checked with my colleagues as well, but they don’t know exactly how the GPU reshuffle works.
Actually, we don’t really need a GPU expansion, as we have GCH congestion on certain GPUs of one BSC while the other GPUs of the same BSC are not heavily loaded.

We are on B12.1, and the BSC and MFS are A9130.

30th December 2013 at 09:59 #69425

Ian
Guest

Hi Pix,

Just to share that counter “P105l” is incremented:

P105l – NB_UL_TBF_EST_FAIL_TOO_MANY_TBF

I am unable to find the cause. All the other counters (P105c, d, e, f, g, h) are fine.

17th January 2014 at 23:09 #69426

Rex
Guest

Hi,
Can anyone help with how to handle PMU CPU congestion? It is very high in our case. Can we reduce the congestion by changing a parameter, or do we have to add another GP board? The vendor is ALU.
BR,
Rex

18th January 2014 at 10:06 #69427

pix
Guest

hi,

if you don’t want to add more GP, then the workaround is to reduce the overall amount of simultaneous data in your BSC.
1/ reduce the number of PDCH (reduce MIN PDCH / MAX PDCH)
2/ reduce max MCS
3/ reduce max CS
4/ reduce inactivity timers, to release unused PDCH faster
5/ disable tbf fast establishment (which can be used when min pdch > 0)

as a result, the GP should be less loaded, but you will face TBF congestion (lack of radio resources) and lower throughput. All in all, I’m not sure it’s a good workaround 😀

regards

18th January 2014 at 14:46 #69428

Rex
Guest

Hi Pix,
thanks a lot. By tbf fast establishment, did you mean fast_initial_gprs_access?
What do you think: if we increase the MFS parameter N_DATA_BYTES_MAX_TRANS_PERIODIC from 100 bytes to 1000 bytes, will the CPU load be lower?

The definition of this parameter is: Number of bytes above which a transition from “short data” to “long data” MS transfer shall be periodically reattempted (only useful in the rare cases where such a transition previously failed).

And the recommendation is: A transition from “short data” to “long data” MS transfer can only fail in the rare cases where there is a too high number of TBFs on the TRX / in some very specific reallocation failure scenarios. A low value of N_DATA_BYTES_MAX_TRANS_PERIODIC should be avoided because it will tend to increase the CPU load of the MFS for a defence case which should be rare (if TRX and Abis/Ater transmission resources are properly dimensioned). The default value is 10000 bytes.
Best regards,
Rex

19th January 2014 at 11:03 #69429

pix
Guest

yes, i meant the fast initial gprs access.

I’ve never tuned this parameter. It seems safe to put it back to the default value! The default is 10,000 bytes, and the current setting is 100 bytes?! It will strongly reduce throughput for short transmissions, especially the UL TCP ACK, which will in turn reduce the DL throughput at the application (TCP) level.

Anyway, you understand what you need to do: degrade the QoS of users in order to decrease congestion. Altogether, I doubt that the end user experience will be different. The subscribers will perceive your network as very “poor”, whether their TBF is dropped because of CPU congestion, or their TBF is stuck on only one PDCH, or stuck on MCS5, or it is rejected because there are not enough PDCH in the cell. The end result is similar for the subscriber.
Actually, if the CPU is congested, it will only affect end users during the busy hours (let’s say 8 hours/day). The rest of the time, users will enjoy good QoS.
If you start tuning your network with “hard” parameters, they will affect users 24 hours/day. That’s something you have to keep in mind. The “transition” parameter you mentioned is interesting because it only comes into play when there is a failure, and failures will happen only during the busy hours. So that’s definitely a parameter worth trying. Please let us know about the effect! 😉

Regards,
pix

22nd January 2014 at 22:24 #69430

Rex
Guest

Hi Pix,
thanks again for the answer.
We changed the parameter from 100 to 1000, as the MFS did not accept 10000 (the default). After the change, all GP boards had to be reset, and once all cells were back to normal we noticed an improvement in throughput, GPRS attach was OK, etc. But this lasted only a few hours. The problems reappeared, and we had to add a GP board and extend the number of AterMux to the maximum. (We didn’t want to experiment with the parameter any more; I would like to know what value N_DATA_BYTES_MAX_TRANS_PERIODIC has in other Alcatel-Lucent networks.) After that, there were no more problems and no more CPU congestion.
So, in such cases I think changing parameters does not help much in reducing CPU congestion (it was over 60%).
Best regards,
Rex

22nd January 2014 at 22:47 #69431

Rex
Guest

Hi Ian,
for traffic to be distributed evenly among the GPs, you should have the same number of AterMux-es between each GP and the BSC. Recently, when we added a new GP, the first one had 12 AterMux-es and the new one initially had only 6. After reshuffling, the number of cells on the first one was twice that on the new one. After adding 6 more Ater-s to the second one, the number of cells was almost the same.
Best Regards,
Rex
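
The observation above is consistent with cells being shared among GP boards roughly in proportion to each board's AterMux count. Here is a minimal sketch of that arithmetic, assuming a purely proportional split (an assumption for illustration, not documented MFS behaviour). The function `expected_cell_share`, the board names, and the total of 90 cells are hypothetical.

```python
def expected_cell_share(atermux_per_gp, total_cells):
    """Split total_cells among GP boards in proportion to their AterMux counts.

    Assumes a purely proportional mapping, which is a simplification.
    """
    total_links = sum(atermux_per_gp.values())
    return {gp: total_cells * links / total_links
            for gp, links in atermux_per_gp.items()}

# 12 vs 6 AterMux: the first board is expected to carry twice as many cells.
before = expected_cell_share({"GP1": 12, "GP2": 6}, 90)
# 12 vs 12 AterMux: the expected shares are equal.
after = expected_cell_share({"GP1": 12, "GP2": 12}, 90)
print(before, after)
```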

The forum ‘Telecom Design’ is closed to new topics and replies.
