Machine Learning (ML) is changing the way we perceive and interact with data. Conceptualizing and implementing an ML model is a complex journey that passes through several distinct phases. The following table describes the stages of the ML lifecycle.
ML Phase
Description
Planning
In the planning phase, the scope, success metrics, cost, and feasibility of the application are defined. This includes a cost-benefit analysis and a clear, measurable definition of the business success metrics, which should include model performance measures such as F1 score or AUC.
Once the scope and other parameters are defined, the availability of data, its sources, and the legal implications of using it are assessed. The scalability and robustness of the application are also estimated for a defined period of time.
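To make "clear and measurable" concrete, the sketch below computes an F1 score from raw labels and compares it with a target agreed during planning. The function names, the sample labels, and the 0.70 target are illustrative assumptions, not values from any real project.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """Compute the F1 score for a binary problem from raw labels."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_pos = sum(1 for t, p in pairs if t != positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def meets_target(score, target):
    """Return True when the measured score reaches the planned target."""
    return score >= target


# Illustrative labels only (assumption): 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
score = f1_score(y_true, y_pred)
print(score, meets_target(score, target=0.70))
```

Writing the target down as a number at this stage gives the later evaluation phase an unambiguous pass or fail check.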
Data Preparation & Feature Engineering
Once the scope is determined, the data sources are identified; data can be extracted from internal as well as verified external sources. The extracted data is then cleaned by filling in missing values and standardizing formats. Once cleaned, the data is verified from a quality perspective.
The cleaned data is then transformed into a form suitable for machine learning models (feature engineering), and data augmentation and normalization are applied.
On completion of the above steps, data storage, metadata storage, and data versioning are set up. An ETL pipeline is created to ensure a constant stream of data for training the model.
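The cleaning and normalization steps described above can be sketched with the standard library alone. Mean imputation and min-max scaling are assumptions chosen for illustration; the right choices depend on the data, and the `raw_ages` column is hypothetical.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill_value = mean(observed)
    return [fill_value if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical raw column with one missing entry.
raw_ages = [20, None, 40, 60]
cleaned = fill_missing(raw_ages)
normalized = min_max_normalize(cleaned)
print(cleaned)
print(normalized)
```

In a real ETL pipeline, the fill value and the min/max would be computed on the training data only and stored alongside the data version, so that the same transformation can be reapplied to new data.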
Model Engineering
Once the data pipeline is in place, an appropriate algorithm is selected based on the approach to machine learning (supervised or unsupervised learning). The model is then developed, and development-level testing is carried out to check its results.
Model Evaluation
The developed model is trained, with hyperparameter tuning as part of the training activity. The trained model is then tested against a backtest data set or real-world data to ensure it meets the business success criteria before it is signed off for production. The evaluation results are recorded and versioned to maintain reproducibility. Once the business user signs off on the model, it is packaged.
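As a rough illustration of hyperparameter tuning with recorded results, the sketch below runs a small grid search over the penalty of a one-feature ridge regression and keeps every trial so the evaluation can be versioned. The model, the data, and the penalty grid are all assumptions made for the example.

```python
def fit_ridge_slope(xs, ys, penalty):
    """Closed-form slope for y ~ w * x with an L2 penalty on w."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + penalty
    return numerator / denominator


def mean_squared_error(xs, ys, slope):
    """Average squared error of the predictions slope * x."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, validation, penalties):
    """Fit one model per penalty and score each on held-out data."""
    results = []
    for penalty in penalties:
        slope = fit_ridge_slope(train[0], train[1], penalty)
        error = mean_squared_error(validation[0], validation[1], slope)
        # Keep every trial so the evaluation can be stored and reproduced.
        results.append({"penalty": penalty, "slope": slope, "val_mse": error})
    best = min(results, key=lambda trial: trial["val_mse"])
    return best, results


# Illustrative data (assumption): roughly y = 2x with a little noise.
train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
validation = ([5, 6], [10.0, 12.1])
best, results = grid_search(train, validation, penalties=[0.0, 1.0, 10.0])
print(best)
```

The held-out validation set plays the role of the backtest data: the chosen setting is the one that performs best on data the model did not see during fitting.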
Model Deployment
A deployment strategy (container-based, in-house application, or Jupyter Notebook with AWS SageMaker) is devised, and the packaged model is deployed either on cloud infrastructure with edge locations or on local infrastructure. APIs are used to access the model's predictions. The performance of the model is evaluated in the production environment. The infrastructure must have enough RAM, computing power, and storage to allow the model to scale.
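The idea of exposing predictions through an API can be sketched as a request handler that accepts a JSON body and returns a status code with a JSON response. The model here is a hypothetical fixed linear scorer, and the request shape (`{"features": {...}}`) is an assumption; in practice this function would sit behind whichever web framework or managed endpoint the deployment strategy uses.

```python
import json

# Hypothetical packaged model: a fixed linear scorer standing in for a real artifact.
MODEL_WEIGHTS = {"age": 0.5, "income": 0.25}
MODEL_BIAS = 1.0


def predict(features):
    """Score one record; raises KeyError if a required feature is missing."""
    return MODEL_BIAS + sum(
        weight * features[name] for name, weight in MODEL_WEIGHTS.items()
    )


def handle_predict_request(body):
    """Turn a raw JSON request body into (status_code, json_response)."""
    try:
        payload = json.loads(body)
        prediction = predict(payload["features"])
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed JSON, missing fields, or non-numeric values.
        error = {"error": "body must be JSON with a numeric 'features' object"}
        return 400, json.dumps(error)
    return 200, json.dumps({"prediction": prediction})


status, response = handle_predict_request('{"features": {"age": 2, "income": 4}}')
print(status, response)
```

Keeping the handler free of framework code makes it easy to test in isolation before the model is placed behind a real endpoint.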
Model Monitoring
Once the model is deployed, it is continuously monitored for performance and accuracy over time. The data flowing into the model changes constantly, which can cause the model's performance to degrade. When degradation is detected, the model has to be retrained on a new set of data and redeployed.
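One simple way to detect degradation is to track accuracy over a rolling window of recent predictions and flag the model when it falls below a threshold. The sketch below assumes that true outcomes eventually become available for comparison; the class name, window size, and threshold are illustrative.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions."""

    def __init__(self, window_size, min_accuracy):
        # A bounded deque drops the oldest outcome automatically.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, prediction, actual):
        """Store whether a prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag degradation only once the window is full."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)
print(monitor.accuracy(), monitor.needs_retraining())
```

A flag from a monitor like this would typically trigger the retraining and redeployment loop rather than retrain automatically, so that a person can confirm the new data is sound.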
Conclusion
The Machine Learning lifecycle is a journey that transforms data into actionable insights, with the ML model as its end result. It demands a deep understanding of both the domain and the data involved. The true potential of ML can be harnessed by mastering the ML lifecycle, the business, and the data, which paves the way for innovation and growth.