It’s a fact that a model is only as good as its training data. In the world of data science and machine learning, the Python programming language sets the rules. Microsoft ML.NET is an excellent cross-platform, open-source framework, but support for data visualization has been its Achilles’ heel. Until recently, ML.NET did not have a good REPL (read–eval–print loop) tool for experimenting with data, but now .NET developers can run on-premises interactive machine learning scenarios in Jupyter Notebooks using C#, F#, or PowerShell scripts in a web browser.
According to Forbes, data scientists spend 80% of their time on data preparation. Let’s see how you can accelerate your data preparation using box plot segmentation, a correlation matrix, permutation feature importance, a confusion matrix, and other tools.