Comparing the Evaluation Results of Processed and Reduced Datasets Using Excel

- February 05, 2025

Since your dataset was reduced from all attributes to only two (Status & Target), you should compare the evaluation metrics (e.g., accuracy, precision, recall, F1-score) of the classification models before and after reduction.

1. Collect Evaluation Results from Weka

After running classification models on both datasets (full dataset and reduced dataset), record the following metrics for every test option you ran:

Accuracy (%)
Precision (%)
Recall (%)
F1-Score (%)

Test options to record them under:

Cross-validation (k=10 and k=20 folds)
Percentage split (70:30 and 80:20 train/test)
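
If you want to double-check a figure before typing it into Excel, the four metrics can be recomputed from the counts in a confusion matrix. The sketch below assumes a two-class problem with one class treated as "positive"; the counts are made up for illustration and are not from any real Weka run. Note that Weka also reports per-class and weighted-average values, which this simple version does not reproduce.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 as percentages.

    tp, fp, fn, tn are the confusion-matrix counts for the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": round(accuracy * 100, 2),
        "precision": round(precision * 100, 2),
        "recall": round(recall * 100, 2),
        "f1": round(f1 * 100, 2),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

If the numbers you compute this way differ from what you copied out of Weka, check whether you recorded the per-class row or the weighted-average row.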

2. Organizing Data in Excel

Step 1: Create a Table in Excel

Format your results into a table like the following example:

Dataset	Model	Accuracy (%)	Precision (%)	Recall (%)	F1-Score (%)
Full (Before Reduction)	J48	85	84	83	84
Full (Before Reduction)	Random Forest	88	87	86	87
Reduced (After Reduction)	J48	75	74	73	74
Reduced (After Reduction)	Random Forest	78	77	76	77
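
If you have many runs to record, typing them by hand gets tedious. As one possible shortcut, the sketch below builds the same table as CSV text, which Excel can open directly. The rows reuse the example values from the table above; the file name in the final comment is only a placeholder.

```python
import csv
import io

HEADER = ["Dataset", "Model", "Accuracy (%)", "Precision (%)",
          "Recall (%)", "F1-Score (%)"]

# Example values copied from the table above; replace with your own results.
ROWS = [
    ["Full (Before Reduction)", "J48", 85, 84, 83, 84],
    ["Full (Before Reduction)", "Random Forest", 88, 87, 86, 87],
    ["Reduced (After Reduction)", "J48", 75, 74, 73, 74],
    ["Reduced (After Reduction)", "Random Forest", 78, 77, 76, 77],
]


def results_to_csv(rows, header=HEADER):
    """Return the results table as CSV text, one line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = results_to_csv(ROWS)
print(csv_text)

# To save it for Excel (hypothetical file name):
# with open("weka_results.csv", "w", newline="") as f:
#     f.write(csv_text)
```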

Step 2: Insert a Bar Chart

Select the table data.
Go to Insert → Charts → Insert Column or Bar Chart.
Choose Clustered Column (best for side-by-side comparison).
Add Data Labels:
- Click on a bar to select its series → Right-click → Add Data Labels. Repeat for each series.

Step 3: Customize the Chart

Legend: Differentiate between Before Reduction and After Reduction.
Axis Labels:
- X-axis: Dataset & Model
- Y-axis: Accuracy/Precision/Recall/F1-Score (%)
Title: Comparison of Classification Performance Before and After Feature Reduction.
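
If you would rather script the chart than build it by hand, a rough equivalent of the clustered column chart can be drawn with matplotlib. This is only a sketch: it assumes matplotlib 3.4 or newer is installed (needed for `bar_label`), reuses the example values from the table in Step 1, and the names `bar_offsets`, `plot_comparison` and `comparison.png` are made up for this example. Here each dataset/model pair is a group on the X-axis and the legend lists the four metrics.

```python
METRICS = ["Accuracy", "Precision", "Recall", "F1-Score"]

# Example values from the table in Step 1; replace with your own results.
RESULTS = {
    "Full / J48": [85, 84, 83, 84],
    "Full / Random Forest": [88, 87, 86, 87],
    "Reduced / J48": [75, 74, 73, 74],
    "Reduced / Random Forest": [78, 77, 76, 77],
}


def bar_offsets(n_series, width):
    """Offsets that center n_series bars of the given width on each group."""
    return [(i - (n_series - 1) / 2) * width for i in range(n_series)]


def plot_comparison(results, metrics, path="comparison.png"):
    """Draw a clustered column chart and save it to path."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    groups = list(results)
    width = 0.8 / len(metrics)
    offsets = bar_offsets(len(metrics), width)

    fig, ax = plt.subplots(figsize=(9, 5))
    for j, metric in enumerate(metrics):
        xs = [g + offsets[j] for g in range(len(groups))]
        heights = [results[name][j] for name in groups]
        bars = ax.bar(xs, heights, width, label=metric)
        ax.bar_label(bars)  # data labels on top of each bar
    ax.set_xticks(range(len(groups)))
    ax.set_xticklabels(groups)
    ax.set_xlabel("Dataset & Model")
    ax.set_ylabel("Score (%)")
    ax.set_title("Comparison of Classification Performance "
                 "Before and After Feature Reduction")
    ax.legend()
    fig.tight_layout()
    fig.savefig(path)
    return path


# Uncomment to draw the chart (requires matplotlib):
# plot_comparison(RESULTS, METRICS)
```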

Step 4: Analyze the Graph

Compare Accuracy: Did it drop after reducing attributes?
Compare Precision & Recall: Was there a significant performance loss?
Discuss trade-offs:
- If accuracy dropped only slightly but computation speed improved, the reduction might be acceptable.
- If performance dropped significantly, the feature reduction was probably too aggressive.
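
To make "slightly" and "significantly" concrete, you can compute the drop in percentage points for each model and metric and compare it with a cutoff you choose. The sketch below uses the example values from the table in Step 1. The 5-point threshold is an arbitrary assumption for illustration, not a standard; pick one that suits your task.

```python
# Example values from the table in Step 1; replace with your own results.
FULL = {
    "J48": {"Accuracy": 85, "Precision": 84, "Recall": 83, "F1-Score": 84},
    "Random Forest": {"Accuracy": 88, "Precision": 87, "Recall": 86, "F1-Score": 87},
}
REDUCED = {
    "J48": {"Accuracy": 75, "Precision": 74, "Recall": 73, "F1-Score": 74},
    "Random Forest": {"Accuracy": 78, "Precision": 77, "Recall": 76, "F1-Score": 77},
}

THRESHOLD = 5.0  # hypothetical cutoff, in percentage points


def metric_drops(full, reduced):
    """Return full minus reduced for every model and metric."""
    return {
        model: {metric: full[model][metric] - reduced[model][metric]
                for metric in full[model]}
        for model in full
    }


def verdict(drop, threshold=THRESHOLD):
    """Label a drop as significant or acceptable relative to the cutoff."""
    return "significant drop" if drop > threshold else "acceptable drop"


drops = metric_drops(FULL, REDUCED)
for model, per_metric in drops.items():
    for metric, drop in per_metric.items():
        print(f"{model:<14} {metric:<10} drop of {drop} points: {verdict(drop)}")
```

With the example values every metric falls by 10 points for both models, so each one is flagged as a significant drop under this cutoff.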

Example Interpretation of Graph

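Before reduction: Accuracy is higher (about 85-88% in the example table), which is consistent with the models having more features to learn from.
After reduction: Accuracy drops to about 75-78%, roughly 10 percentage points lower, because fewer features are available for classification.
Trade-off: The model is now simpler and likely faster to train, but less accurate.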

Conclusion

Your graph should visually show how dataset reduction impacts performance. If the drop is too large, you might reconsider which attributes to retain.

Nurah.lee