Right Sizing VMware vCOPs – VCDX Way

As we march towards our VCDX defense, I am sure many of you have worked with VMware vCenter Operations Manager and added it as a component to monitor your VMware infrastructure. If you are running vCOPs and covering it in the monitoring section of your VCDX design defense, then you must account for the physical resources vCOPs itself consumes: CPU, memory, storage and networking. In general, networking is not the hard part to accommodate in your design; the other three are the major components to consider.

Consider the vCOPs vApp as two VMs (the UI VM and the Analytics VM) and add them to your overall VM design strategy, including disk size, CPU and memory. The vCenter Operations Manager vApp consists of two virtual machines with the same basic layout. Each has an 8GB system disk, which should not be resized, and on initial deployment each also creates a 120GB data disk within a logical volume managed by a logical volume manager.

Sizing a vCenter Operations Manager environment can be challenging. Throughout this post we will make some basic assumptions: the core CPU speed of the physical host is 2GHz, each VM produces 250-300 metrics, and each physical host brings in 1,200-1,500 metrics. The variation exists because the number of metrics depends not only on the virtual machine itself but also on soft values such as the number of virtual CPUs, memory, network adapters, and virtual disks. Each of these components usually contributes at least one metric, and can contribute several.

You also need to consider the consolidation ratio of your environment when sizing vCOPs. If you use vCenter Operations Manager with vSphere only and no additional adapters, you can use the formula shown below. The formula only estimates the expected number of metrics, so that you can size the vApp correctly.

Environment size: metrics = (1350/10 x consolidation_ratio) x hosts + 275 x VMs

Data retention: storage[GB] = ((metrics x 16)/(1024 x 1024 x 1024)) x (60/interval) x 168 x 4.33 x (months + 1) x 1.2

Additional adapters: Metrics per resource

Now you must be wondering what the 1350 and the 275 in these formulas are all about. As noted in the earlier paragraph, a host generates about 1,200-1,500 metrics; we take the midpoint of that range, which is 1,350. Every VM generates 250-300 metrics, and again we take the midpoint, which is 275.

For example, an infrastructure with 60 hosts and 1,800 virtual machines gives us a consolidation ratio of 30:1. If you insert these values into the formula, you get an estimate of 738,000 metrics. This is a bit larger than the small deployment scenario, which covers 600,000 metrics, but definitely smaller than the medium scenario, which covers 1,500,000 metrics. Based on this estimate you can start from the small scenario, add 2 vCPUs per virtual machine, and add only a little more memory and storage. The medium-sized deployment would be too big for this infrastructure and would lead to oversized virtual machines in the vApp.
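As a quick sanity check, the metric-estimation formula and the worked example above can be sketched in Python (the constant and function names here are my own, not part of the product):

```python
# Midpoints of the per-host and per-VM metric ranges quoted above.
HOST_METRICS_MID = 1350   # midpoint of 1,200-1,500 metrics per host
VM_METRICS_MID = 275      # midpoint of 250-300 metrics per VM

def estimate_metrics(hosts, vms):
    """Estimate total metrics for a vSphere-only vCOps deployment."""
    consolidation_ratio = vms / hosts
    return (HOST_METRICS_MID / 10 * consolidation_ratio) * hosts + VM_METRICS_MID * vms

# Worked example from the text: 60 hosts, 1,800 VMs (a 30:1 ratio).
print(estimate_metrics(60, 1800))  # 738000.0
```

This lands between the small (600,000) and medium (1,500,000) scenario thresholds, matching the reasoning in the example.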

Calculating storage space needed:

This formula estimates the storage space needed for the Analytics VM. Every metric needs 16 bytes in the FSDB, so multiplying the number of metrics by 16 gives the number of bytes written per collection cycle. Dividing that number three times by 1,024 converts the unit of measure from bytes to gigabytes. Most stats are collected every five minutes, which works out to 12 collections per hour (60 divided by the interval), so multiplying by 12 gives the gigabytes per hour. Multiply that by 168 to get the gigabytes per week.

Because a month has an average of 4.33 weeks, multiplying the gigabytes per week by 4.33 gives you the gigabytes per month. Finally, you multiply by the number of months of data retention (plus one, as the formula shows). Again, this is only an estimation of the space needed; you also have to leave some wiggle room in the file system, and a rough estimation would be around 20%. If you take the number of metrics from this scenario, fill out the formula and add the 20% headroom (as the formula does), you get around 690 additional gigabytes. Including the 120GB from the default configuration, this estimate leads us to roughly 810 gigabytes in total, with a little room for growth.
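The storage walkthrough above can be sketched the same way (function name and defaults are my own; it assumes 16 bytes per metric sample and 60/interval samples per hour, i.e. 12 at the default 5-minute interval):

```python
def storage_gb(metrics, interval_min=5, months=6):
    """Estimate Analytics VM FSDB storage in GB, 20% headroom included."""
    gb_per_sample = metrics * 16 / 1024**3     # bytes -> GB per collection cycle
    samples_per_hour = 60 / interval_min       # 12 samples/hour for 5-minute stats
    gb_per_month = gb_per_sample * samples_per_hour * 168 * 4.33
    return gb_per_month * (months + 1) * 1.2   # retention + 1 month, +20% headroom

total = storage_gb(738000)
# ~806 GB total; ~686 GB beyond the default 120GB disk (the "around 690" above)
print(round(total), round(total - 120))
```

Note how doubling `months` or halving `interval_min` roughly doubles the result, which is exactly the warning about retention and collection frequency later in the post.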

Additional adapters deliver different numbers of metrics per resource, and this varies from adapter to adapter. The SCOM adapter, for example, creates roughly 250 metrics per resource on average. You can plug such figures into the formulas above; just be aware that each adapter creates a different number of metrics per resource.

My friend Sunny Dua wrote a nice article some time back on the sizing guidelines, so I will not reinvent the wheel here. However, just to make everyone aware: the example he shows is based on disk storage calculations and rules for a vCenter Server-centric approach, using the vCenter Server adapter only. Additional adapters add to the amount of data and IOPS necessary. The table in that article assumes the vCenter Server collection settings have not been modified: a 5-minute data collection interval with a 6-month retention period. Doubling the retention period, or collecting twice as often, immediately doubles the amount of data collected.

vCOPs Sizing

So you can see that for a small environment you need at least 1,500 IOPS in addition to what you have calculated for all your workload VMs; add that to the total IOPS required from your back-end storage. For medium and large deployments, add 3,000 or 6,000 IOPS respectively. This is not a small thing to overlook. If you are submitting your design for VCDX, remember that the panelists are very detail-oriented and experts in their domains; forgetting these additional considerations can cost you for sure. If your capacity planner shows a total requirement of 55,000 IOPS, remember to add the vCOPs IOPS on top of it. Also add the additional GBs/TBs required for your total estate, including the vCOPs vApp.
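A minimal sketch of that bookkeeping, using the example figures from the text (variable names are my own, and the 1,500-IOPS figure is the small-deployment number quoted above):

```python
# vCOps overhead must be added on top of the workload requirement,
# not absorbed into it.
workload_iops = 55000      # capacity-planner total from the example above
vcops_iops = 1500          # small-deployment vCOps figure
total_iops = workload_iops + vcops_iops
print(total_iops)          # 56500 IOPS required from back-end storage
```

The same addition applies to capacity: the ~810GB vApp estimate goes on top of the workload GBs/TBs, and grows with them.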

Also consider headroom for future growth. Get the figure from the customer and add it to your scalability section, where you need to budget additional GBs/TBs and IOPS for future expansion. Future expansion of workload VMs means more storage IOPS and GBs/TBs for your vCOPs as well. Think about that and add it to your future-growth scalability calculations.

Good luck to all VCDX aspirants.


4 thoughts on “Right Sizing VMware vCOPs – VCDX Way”

  1. First of all, really good post. The only thing I would suggest adding is the IOPS figures; it seems pretty clear that there is a 1:1 relationship between the number of VMs being monitored and the IOPS required by vCOPs.

    1,500 VM’s = 1,500 IOPS
    3,000 VM’s = 3,000 IOPS

    For example, I wouldn’t suggest to a client who has 50 VMs they want monitored that they need to provide 1,500 IOPS.

    Using vCOPs standard (no additional plugin) I have always found that under 1 IOPS is required per VM being monitored. Real-world statistics across different client sites with various environments average around 0.6.

    Interestingly enough, vCOPs always seems to report itself as oversized in terms of RAM and CPU, but if this were changed it would be unsupported :)

  2. Pingback: Choosing VMware SIOC with Storage Auto Tiering - VCDX Way | Stretch Cloud - Technology Undressed

  3. Thanks for the mention, Prasenjit. The article I wrote back in 2012 also talks about sizing for adapters other than VMware vCenter. The sizing there is based on the number of metrics being collected from 3rd-party sources :-)

    Quoting myself from the article – http://vxpresss.blogspot.in/2012/11/right-sizing-vcenter-operations-manager.html

    “In case you are using a collector to collect metrics data from Non-vSphere environment, then you would need to calculate and add CPU, Memory & Disk resources on the basis of the recommended numbers below:-”

  4. Pingback: vCOPs Backup & Disaster Recovery | Stretch Cloud - Technology Undressed
