Resolved: Which ML algorithm should I use for this dataset



I have a dataset with inputs, let's say data1, data2, data3, … The output to predict should be the name of a person, based on the given data. I have a training dataset but am not sure which ML algorithm to use. The list of people's names does not change.

Best Answer:

It sounds like you are doing a classification task, so you should use a classification algorithm. Which one to use depends on the quality and structure of your data and on the shape of its decision boundaries. Before embarking on a classification task, you should check your data for outliers, noise, class imbalance, missing values and other data quality issues. From there, you can select a model that best suits what you find.
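The data quality check described above can be sketched in a few lines of plain Python. This is only a minimal illustration: the `audit` and `is_missing` helpers, the toy rows and the names used as labels are all hypothetical and stand in for the real data1, data2, data3 columns.

```python
from collections import Counter
import math


def is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def audit(rows, labels):
    """Summarise class balance and missing values per feature column."""
    class_counts = Counter(labels)
    # Largest class size divided by smallest; 1.0 means perfectly balanced.
    imbalance_ratio = max(class_counts.values()) / min(class_counts.values())
    missing_per_feature = [
        sum(is_missing(row[j]) for row in rows) for j in range(len(rows[0]))
    ]
    return {
        "class_counts": dict(class_counts),
        "imbalance_ratio": imbalance_ratio,
        "missing_per_feature": missing_per_feature,
    }


# Hypothetical toy data: columns are data1, data2, data3; labels are names.
rows = [
    [5.1, 3.5, 1.4],
    [4.9, None, 1.3],
    [6.2, 2.9, 4.3],
    [5.9, 3.0, float("nan")],
    [6.0, 2.7, 5.1],
    [5.0, 3.4, 1.5],
]
labels = ["alice", "alice", "bob", "bob", "carol", "alice"]

report = audit(rows, labels)
print(report)
```

A high imbalance ratio or many missing entries in a column is a hint about which models and preprocessing steps are worth trying first.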
For example, if your data contains lots of outliers and missing values, a decision tree might be preferable. However, if you have a large class imbalance, anomaly detection may be better suited. If your decision boundary is linear, you could make use of a linear support vector machine. If you have non-linear decision boundaries, you’ll need to look into more complex models such as Gaussian discriminative models, self-organizing maps, or neural networks.
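One practical way to choose between such candidates is to cross-validate a few of them on the same data and compare the scores. The sketch below assumes scikit-learn is installed and uses a synthetic dataset from `make_classification` as a stand-in for the real one, with three classes playing the role of three names. The choice of candidates and their settings is illustrative, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption): 3 classes, 6 features.
X, y = make_classification(
    n_samples=300,
    n_features=6,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    random_state=0,
)

# Candidate models; the SVM and the neural network get scaled inputs.
candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "linear_svm": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "neural_net": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    ),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

If the linear SVM scores about as well as the more complex models, the decision boundary is probably close to linear and the simpler model is a reasonable choice.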
In summary, it is entirely dependent on your data.
