It's been a really long time (close to a month) since I last blogged. A hectic and frantic schedule took its toll on me, and even though I wanted to talk about many things, I could not do so.
However, now that I am back, I am back with a couple of interesting ideas, and I would like to see how your thoughts line up around them.
The first one I want to talk about is Self Healing of your Cloud. Needless to say, I am focusing on the Service Provider (SP) Cloud.
Now let me define what I mean by Self Healing of your SP Cloud.
By definition, this is a system that can determine that something is not operating correctly or is not configured properly, and make the appropriate corrections to restore it to its best condition.
In a nutshell, this module can be integrated into any automation tool, compare the "what should be" condition with the "what is" condition, and apply automation to put things back.
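To make the "what should be" versus "what is" idea concrete, here is a minimal sketch in Python. The setting names and the `apply_fix` callback are hypothetical placeholders of mine; in a real deployment the callback would be whatever your automation tool exposes.

```python
from typing import Callable, Dict, List


def find_drift(desired: Dict[str, str], actual: Dict[str, str]) -> Dict[str, str]:
    """Return every setting whose actual value differs from the desired one."""
    return {key: want for key, want in desired.items() if actual.get(key) != want}


def heal(
    desired: Dict[str, str],
    actual: Dict[str, str],
    apply_fix: Callable[[str, str], None],
) -> List[str]:
    """Call apply_fix for each drifted setting and return the corrected keys."""
    drift = find_drift(desired, actual)
    for key, want in drift.items():
        # apply_fix stands in for the automation tool (hypothetical hook).
        apply_fix(key, want)
    return sorted(drift)


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    desired = {"ntp_server": "10.0.0.1", "mtu": "9000"}
    actual = {"ntp_server": "10.0.0.1", "mtu": "1500"}
    fixed = heal(desired, actual, lambda k, v: print(f"setting {k} back to {v}"))
    print("healed:", fixed)
```

The comparison and the correction are kept separate on purpose, so the same drift check can feed a report, an alert, or an automated fix.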
Now you may ask me how to implement it, or how to hook this module into an automation tool. Well, there could be many ways to do this. I am going to talk about three of them here.
To make it happen, we can use an SNMP and API based solution. For example, we can use either an API based module that proactively checks the health, parameters, and SLA conditions of the Cloud, or any standard monitoring solution. Let us look at the first solution, which uses an API based proactive check module along with the Self Healing module.
Using API calls, SNMP, or another configuration gathering method, the status of a subsystem can be compared to what is considered normal as defined by the SLA for that subsystem. If it is not in compliance, the automation and orchestration components can be invoked to put things back the way they should be. This might involve redeploying a configuration file, reconfiguring a network interface, vacating a hypervisor host of all the VMs it contains and placing them on another host, or just about any other task that can be automated or orchestrated.
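The proactive check described above could be sketched roughly like this. I am assuming the readings have already been gathered (through API calls or SNMP) into a plain dictionary; the metric names, thresholds, and remediation names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SlaRule:
    metric: str        # name of the reading to check (hypothetical)
    max_value: float   # highest value the SLA considers normal
    remediation: str   # name of the automation task to invoke on a breach


def check_and_heal(
    readings: Dict[str, float],
    rules: List[SlaRule],
    actions: Dict[str, Callable[[str, float], None]],
) -> List[str]:
    """Compare readings to SLA limits and invoke the remediation for each breach."""
    invoked = []
    for rule in rules:
        value = readings.get(rule.metric)
        if value is None:
            continue  # no reading available; this sketch does not treat it as a breach
        if value > rule.max_value:
            actions[rule.remediation](rule.metric, value)
            invoked.append(rule.remediation)
    return invoked


if __name__ == "__main__":
    rules = [
        SlaRule("cpu_percent", 80.0, "evacuate_host"),
        SlaRule("packet_loss_percent", 1.0, "reconfigure_interface"),
    ]
    actions = {
        "evacuate_host": lambda m, v: print(f"{m}={v}: moving VMs to another host"),
        "reconfigure_interface": lambda m, v: print(f"{m}={v}: resetting interface"),
    }
    print(check_and_heal({"cpu_percent": 95.0, "packet_loss_percent": 0.1}, rules, actions))
```

Run on a schedule, a loop like this is what makes the check proactive: it does not wait for anyone to raise an alarm.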
For monitoring, you can use Hyperic, EMC Smarts, or the CA Spectrum console, and Puppet, an open source package, can be used to facilitate the self-healing tasks.
- Hyperic is a monitoring tool for hosts and services, and so is CA Spectrum.
- Puppet is an automated configuration management tool. Used together, they can monitor and correct any configuration issue on any platform that is supported by both tools.
Now look at the monitoring based solution, where this module integrates with any standard monitoring service and reactively solves many issues (wherever possible).
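A reactive hook of this kind might look like the sketch below. The alert format (a dictionary with a `type` field), the playbook names, and the `escalate` callback are my own assumptions, not the format of any particular monitoring product.

```python
from typing import Callable, Dict

Alert = Dict[str, str]


def handle_alert(
    alert: Alert,
    playbooks: Dict[str, Callable[[Alert], None]],
    escalate: Callable[[Alert], None],
) -> str:
    """Run the matching playbook for an alert, or escalate when none exists."""
    fix = playbooks.get(alert.get("type", ""))
    if fix is None:
        # Not every issue can be healed automatically; hand it to a human.
        escalate(alert)
        return "escalated"
    fix(alert)
    return "healed"


if __name__ == "__main__":
    # Hypothetical alert types and fixes, for illustration only.
    playbooks = {
        "config_drift": lambda a: print(f"redeploying config on {a['host']}"),
        "interface_down": lambda a: print(f"resetting interface on {a['host']}"),
    }
    notify = lambda a: print(f"paging operations about {a}")
    print(handle_alert({"type": "config_drift", "host": "web01"}, playbooks, notify))
    print(handle_alert({"type": "disk_failure", "host": "db01"}, playbooks, notify))
```

The escalation path matters here: the module heals what it knows how to heal and leaves the rest to the operations team.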
Another solution could be to use a CMDB and an API. Here we look in the DB for errors and change the configuration as part of automation and orchestration. To me, this is a lengthy process to start with. However, I am open to comments.
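For the CMDB route, a rough sketch could look like this. I am using an in-memory SQLite database as a stand-in for the CMDB, and the `config_items` table, its columns, and the `apply_config` callback are all hypothetical; a real CMDB will have its own schema and API.

```python
import sqlite3
from typing import Callable, List


def build_demo_cmdb() -> sqlite3.Connection:
    """Create an in-memory stand-in for a CMDB (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE config_items (name TEXT PRIMARY KEY, desired_config TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO config_items VALUES (?, ?, ?)",
        [
            ("fw01", "ruleset-v7", "ok"),
            ("lb01", "pool-a", "error"),
            ("sw02", "vlan-120", "error"),
        ],
    )
    conn.commit()
    return conn


def heal_from_cmdb(
    conn: sqlite3.Connection, apply_config: Callable[[str, str], None]
) -> List[str]:
    """Find items flagged as errors, reapply their desired config, mark them ok."""
    rows = conn.execute(
        "SELECT name, desired_config FROM config_items "
        "WHERE status = 'error' ORDER BY name"
    ).fetchall()
    healed = []
    for name, desired in rows:
        # apply_config stands in for the automation and orchestration layer.
        apply_config(name, desired)
        conn.execute("UPDATE config_items SET status = 'ok' WHERE name = ?", (name,))
        healed.append(name)
    conn.commit()
    return healed


if __name__ == "__main__":
    cmdb = build_demo_cmdb()
    print(heal_from_cmdb(cmdb, lambda n, c: print(f"pushing {c} to {n}")))
```

Even in this toy form you can see why I call it lengthy: the DB has to be kept accurate before anything read from it can be trusted to drive a fix.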
Think about a situation where something goes wrong while your operations team is having a glass of beer, and in the background your self healing module is doing the work. How cool would that be 🙂
I am all ears for more ideas around this, and I look forward to your esteemed feedback.