Simplest Classifier

My journey to automatically generate fuzzy rules for a fuzzy engine led me to frequency-based algorithms like Apriori and FP-Growth.
As I tried to understand how they work, I began to wonder if it would be possible to write a simple classifier based purely on support values. When all features in a dataset are one-hot encoded, it’s possible to calculate the support value of each feature for a given classification target.

The steps are as follows:

  1. Get the unique target values.
  2. Generate a one-hot encoded dataframe.
  3. For each target value:
    • Filter the dataframe to the rows with that target value.
    • Calculate the support value for each feature = (count of ones) / (total count).
  4. For prediction:
    • One-hot encode the test dataset.
    • For each row, take the feature values and compute the difference between them and each target's support values.
    • Sum up the differences per target and choose the target with the lowest total difference.
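
The steps above can be sketched in a few lines of pandas. This is a minimal illustration rather than the original implementation: the function names fit_support and predict, the toy animal data, and the use of the absolute difference are all assumptions made for the example. It also reads "total count" as the number of rows for the given target.

```python
import pandas as pd


def fit_support(X: pd.DataFrame, y: pd.Series) -> pd.DataFrame:
    """Return one row of per-feature support values for each target value."""
    # One-hot encode every feature column (hypothetical helper, not from the post).
    X_hot = pd.get_dummies(X, columns=list(X.columns), dtype=float)
    # Grouping by the target covers "unique targets" and "filter" in one step.
    # The mean of a 0/1 column is (count of ones) / (rows for that target),
    # which is how "total count" is interpreted here (an assumption).
    return X_hot.groupby(y.to_numpy()).mean()


def predict(support: pd.DataFrame, X: pd.DataFrame) -> list:
    """Pick, for each row, the target whose support values are closest."""
    X_hot = pd.get_dummies(X, columns=list(X.columns), dtype=float)
    # Align with the training columns: categories unseen in training are
    # dropped, categories missing from the test data become 0.
    X_hot = X_hot.reindex(columns=support.columns, fill_value=0.0)
    predictions = []
    for _, row in X_hot.iterrows():
        # Assumption: "difference" means absolute difference per feature.
        totals = (support - row).abs().sum(axis=1)
        predictions.append(totals.idxmin())
    return predictions


# Toy data invented for illustration only (not the Zoo Dataset).
train = pd.DataFrame({
    "hair": ["yes", "yes", "no", "no"],
    "eggs": ["no", "no", "yes", "yes"],
})
target = pd.Series(["mammal", "mammal", "bird", "bird"])

support = fit_support(train, target)
test = pd.DataFrame({"hair": ["yes", "no"], "eggs": ["no", "yes"]})
predictions = predict(support, test)
print(predictions)
```

Because the support table has one row per target and one column per one-hot feature, prediction is just a row-wise distance to each target's support profile, which is why the approach struggles once features interact.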

For small datasets where features are not heavily interconnected, the results are quite good, like with the Zoo Dataset.
However, for more complex classification tasks, the classifier performs poorly. Still, it was a fun experiment and I learned a lot.