AIOps, the use of AI to drive IT operations, is fast becoming the next big buzzword. Early uses of machine learning center on observability, while some vendors, such as Swim.ai, use deep learning to offer intelligence at the edge. We expect to see more vendors using machine learning or deep learning in their products and services in the next two years. As AIOps gains traction in the market, there is quite a bit of confusion about the term, leading to sub-optimal use of these products or to failed projects. In this post, we will highlight some of the factors to consider before rolling out AIOps in your organization.
Factors impacting successful AIOps rollout
For any AIOps platform to offer the kind of efficiencies we expect from ML/AI, enterprise decision-makers should consider the following factors as they evaluate the options:
- Training data: The most critical part of any machine learning system is the data on which the model is trained and optimized. Large and diverse datasets are needed to make AIOps useful to organizations, except in certain use cases where streaming data and its short-term relevancy can make "less data" sufficient (especially in edge computing and IoT scenarios). Every organization has unique needs, and models trained on limited data will not help most enterprises. In fact, it is Rishidot Research's strong belief that models trained on data from web-scale companies will be necessary for AIOps to go beyond the low-hanging fruit. We hope that one of these web-scale companies will take the lead in open-sourcing either their training data or models trained on that data. Think of an "AIOps service" from these web-scale cloud providers. One reason to use training data from web-scale companies, or a service trained on their data, is to ensure that the model used in the enterprise can foresee grey failures that it might miss with limited data.
- ML Models: Another important factor is the models used in the product or service. In some of the early-stage ML products, we see vendors using simple parametric fitting with basic regression models as an easy way to extrapolate data and make predictions. This is not just useless; it can give a false sense of comfort, leading to disastrous consequences. Most AI engines are black boxes by definition, but it is important to have some basic understanding of the underlying models before trusting their predictions. We strongly urge you to talk to the product teams and understand how they have implemented the learning algorithms.
- Train on your own data and verify: Even with pre-trained models, it is important to train the ML service on your own dataset before putting it into production. Any ML system needs to learn from your data before it can give insights relevant to your organization. Train and test extensively on your data before going to production.
- Don’t reinvent the wheel: If you are building a custom system, there are many AI cloud services springing up from cloud providers and leading enterprise vendors. Take advantage of these services instead of building from scratch. In most cases, reinventing the wheel offers very little advantage.
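To make the regression concern above concrete, here is a minimal sketch (with hypothetical data, not from any vendor's product) of how a straight-line fit over a metric that is actually growing exponentially, such as queue depth before an outage, produces a forecast that badly underestimates the real value:

```python
# Sketch: naive linear extrapolation on hypothetical, nonlinear metric data.
import numpy as np

t = np.arange(10)          # observed time steps
metric = 2.0 ** t          # exponential growth: 1, 2, 4, ..., 512

# Least-squares straight-line fit over the observed window
slope, intercept = np.polyfit(t, metric, 1)

t_next = 12
linear_forecast = slope * t_next + intercept   # extrapolate two steps out
actual = 2.0 ** t_next                         # what the metric really does

print(f"linear forecast at t=12: {linear_forecast:.0f}")
print(f"actual value at t=12:    {actual:.0f}")
# The straight-line forecast understates the true value by roughly 10x,
# which is exactly the false sense of comfort described above.
```

A dashboard built on such a forecast would report healthy headroom right up until the metric blows past it, which is why understanding the model class matters more than the prediction it emits.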
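The "train on your own data and verify" advice can be sketched as a time-ordered holdout check. The data, the 3-sigma alert threshold, and the acceptable alert rate below are all illustrative assumptions, not a recommended production setup:

```python
# Sketch: verify a simple model on a held-out slice of your own data
# before production (hypothetical latency readings).
import numpy as np

rng = np.random.default_rng(0)
latency = 100 + 10 * rng.standard_normal(200)  # 200 hourly readings

# Time-ordered split: train on the first 150, hold out the last 50
train, holdout = latency[:150], latency[150:]

# Learn a simple anomaly threshold from the training slice only
mean, std = train.mean(), train.std()
threshold = mean + 3 * std

# Verify on the held-out "future" data: with no anomalies injected,
# the alert rate should be near zero; a high rate means the model is
# miscalibrated for this environment and should not ship.
alert_rate = (holdout > threshold).mean()
print(f"holdout alert rate: {alert_rate:.1%}")
```

The same pattern applies to vendor models: feed them a held-out slice of your own history and check the alert or prediction quality before wiring them into production workflows.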
The marketing pitch around the use of ML/AI in operations is going to increase manyfold in the next two years, and AIOps is going to mature into a powerful technology that helps modern enterprises scale efficiently. As a decision-maker, it is important for you to understand what is at stake and to ensure that the investments produce optimal ROI.
Disclosure: Swim.ai and Cloudfabrix are sponsors of StackSense, and both are sometimes bucketed in the AIOps space.