[users at i-scream] libstatgrab on AIX: data mismatch between saidar and topas

Jens Rehsack rehsack at gmail.com
Tue May 24 15:41:50 BST 2016


Hi Anderson,

no worries, you're welcome.

Best regards

> On 24.05.2016 at 16:40, Anderson Carlos Trindade <anderson.trindade at optimode.com.br> wrote:
> 
> Hi Jens,
> 
> Thank you for your explanations, and sorry for all my questions. I’m just starting with AIX and I will study this partitioning stuff a little more.
> 
> Thanks,
> 
> Anderson
> 
> 
> 
> 
>> On 24 May 2016, at 10:10, Jens Rehsack <rehsack at gmail.com> wrote:
>> 
>> Hi Anderson,
>> 
>> No, this is not correct. The LPAR technology allows (and prefers) dedicated resources per partition; for shared resources, WPARs are recommended. I have never tried it, but I assume that shared resources for an LPAR mean that a minimum and a maximum of resources can be reserved on demand.
>> 
>> The difference you see has nothing to do with physical machine view vs. partition / logical view.
>> 
>> Sent from my iPhone
>> 
>>> On 24 May 2016 at 14:51, Anderson Carlos Trindade <anderson.trindade at optimode.com.br> wrote:
>>> 
>>> Hi Jens,
>>> 
>>> So, the point is:
>>> 
>>>  - libstatgrab is reporting the physical CPU usage. If libstatgrab shows around 65% idle, it means that 65% of all physical resources are idle.
>>> 
>>>  - on the other hand, the sample code is reporting the LPAR usage. If the sample code shows around 20% idle, it means that just 20% of the CPU dedicated to the LPAR is still available for the LPAR to use.
>>> 
>>> Is this understanding correct?
>>> 
>>> Suppose I have one application running inside an LPAR and this application is consuming almost all of the CPU dedicated to the LPAR (around 80%), while the physical host is using just 35% of its CPU. If libstatgrab returns the physical usage, I can’t see from the libstatgrab perspective that the LPAR is at almost 100% CPU usage. Is that correct?
>>> 
>>> 
>>> 
>>> 
>>>> On 24 May 2016, at 06:11, Jens Rehsack <rehsack at gmail.com> wrote:
>>>> 
>>>> Hi Anderson,
>>>> 
>>>> the example is very explicit about the measurement: it normalizes the values when lparstats.type.b.shared_enabled is set; libstatgrab doesn't.
>>>> libstatgrab reports the physical CPU measure, which can lead to misinterpretation for shared resources (which, to our knowledge, is always the case when physical resources are shared on a best-effort basis). So we decided against that (similarly for zones (Solaris), jails (BSD) and containers (Linux)) until we find the time to analyze all available technologies and a reasonable way to deal with them.
>>>> 
>>>> Thanks for reminding me :)
>>>> 
>>>> Best regards,
>>>> Jens
>>>> 
>>>>> On 23.05.2016 at 18:54, Anderson Carlos Trindade <anderson.trindade at optimode.com.br> wrote:
>>>>> 
>>>>> Hi Jens,
>>>>> 
>>>>> Thank you for the reply!
>>>>> 
>>>>> As far as I know, topas is an AIX utility (https://www.ibm.com/support/knowledgecenter/#!/ssw_aix_71/com.ibm.aix.cmds5/topas.htm), but I can’t tell you where the data displayed by topas comes from.
>>>>> 
>>>>> But let’s forget topas for a moment.
>>>>> 
>>>>> I got some sample code from the IBM site (https://www.ibm.com/support/knowledgecenter/#!/ssw_aix_53/com.ibm.aix.prftools/doc/prftools/prftools07.htm%23wq407), which uses perfstat to report CPU usage statistics.
>>>>> 
>>>>> Then I compiled this sample code and ran it in parallel with saidar, each in a separate SSH session. While saidar reports around 80% idle time and 10% user time, the sample code above (based on perfstat) reports around 35% idle time and 60% user-mode usage. I recorded a screenshot and can share it with you if you prefer.
>>>>> 
>>>>> Considering that saidar and the sample code above are getting data from the same source, why are these statistics so different?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 23 May 2016, at 13:02, Jens Rehsack <rehsack at gmail.com> wrote:
>>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>>> On 23.05.2016 at 17:02, Anderson Carlos Trindade <anderson.trindade at optimode.com.br> wrote:
>>>>>>> 
>>>>>>> Hello List,
>>>>>>> 
>>>>>>> I’m trying to understand some differences between the data reported by the AIX utility topas and by saidar.
>>>>>>> 
>>>>>>> At a given moment (almost in the same second), saidar reports the following CPU usage:
>>>>>>> 
>>>>>>> CPU Idle: 88,28%
>>>>>>> CPU system: 4,72%
>>>>>>> CPU User: 7,00%
>>>>>>> 
>>>>>>> but topas reports the following usage:
>>>>>>> 
>>>>>>> %Idle 35,8%
>>>>>>> %Kern 3,5%
>>>>>>> %User 60,5%
>>>>>>> %Wait 0,2%
>>>>>>> 
>>>>>>> It seems that both utilities are using different sources of data, since the usage reported is very different.
>>>>>>> Please, could you help me to understand where these differences are coming from?
>>>>>> 
>>>>>> Well, I don't know where topas is fetching its data from, or where your topas comes from (AIX Linux Tools? 3rd party repo?) ...
>>>>>> 
>>>>>> As you can see here https://github.com/i-scream/libstatgrab/blob/master/src/libstatgrab/cpu_stats.c#L162, libstatgrab is using perfstat - the IBM recommendation and the same source used by nmon.
>>>>>> See https://www.ibm.com/support/knowledgecenter/ssw_aix_53/com.ibm.aix.prftools/doc/prftools/prftools07.htm%23wq407 for more details about libperfstat.
>>>>>> 
>>>>>> The reason for enhancing libstatgrab by a former customer was the poor data quality of GNU tools on Unices (HP-UX, AIX, Solaris).
>>>>>> When I'm in doubt, I trust libstatgrab more than all GNU tools together >:-)
>>>>>> 
>>>>>>> My apologies in advance, because I'm very new to the AIX world.
>>>>>>> 
>>>>>>> This is an LPAR with 4 CPUs.
>>>>>> 
>>>>>> Best regards
>>>>>> --
>>>>>> Jens Rehsack - rehsack at gmail.com
>>>> 
>>>> --
>>>> Jens Rehsack - rehsack at gmail.com
>>> 
> 

--
Jens Rehsack - rehsack at gmail.com
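
The effect of normalizing CPU values against a partition's entitlement, which according to the thread the IBM sample does for shared partitions and libstatgrab does not, can be illustrated with a small sketch. It is written in Python for brevity and does not call perfstat or libstatgrab. The names CpuTicks, raw_percentages, entitlement_percentages and the entitled_ticks parameter are hypothetical, and counting all unused entitlement as idle is a simplifying assumption, not necessarily what the IBM sample or topas does.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CpuTicks:
    """Cumulative tick counters per CPU mode (hypothetical container)."""
    user: int
    sys: int
    idle: int
    wait: int


def _deltas(prev, cur):
    # Counters are cumulative, so usage over an interval is the
    # difference between two samples.
    return {f.name: getattr(cur, f.name) - getattr(prev, f.name)
            for f in fields(CpuTicks)}


def raw_percentages(prev, cur):
    """Share of each mode in the ticks accounted during the interval."""
    delta = _deltas(prev, cur)
    total = sum(delta.values())
    if total <= 0:
        return {mode: 0.0 for mode in delta}
    return {mode: 100.0 * ticks / total for mode, ticks in delta.items()}


def entitlement_percentages(prev, cur, entitled_ticks):
    """Share of each mode relative to the partition's entitled capacity.

    Assumption: entitlement not consumed in the interval counts as idle.
    """
    delta = _deltas(prev, cur)
    consumed = sum(delta.values())
    # A partition may run above its entitlement, so never divide by less
    # than what was actually consumed.
    capacity = max(entitled_ticks, consumed)
    if capacity <= 0:
        return {mode: 0.0 for mode in delta}
    delta["idle"] += capacity - consumed
    return {mode: 100.0 * ticks / capacity for mode, ticks in delta.items()}


if __name__ == "__main__":
    # Made-up sample values, not measurements from the thread.
    before = CpuTicks(user=0, sys=0, idle=0, wait=0)
    after = CpuTicks(user=60, sys=4, idle=35, wait=1)
    print(raw_percentages(before, after))
    print(entitlement_percentages(before, after, entitled_ticks=200))
```

With these made-up numbers, the same interval shows 35% idle in the raw view and 67.5% idle in the normalized view, so two tools reading the same counters can legitimately print very different percentages.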
