Evaluation
Each participating team will initially have access only to the training data. The unlabelled test data will be released later, and the test labels will be released once the assessment is complete. Systems will be evaluated with standard metrics.
More specifically, since the dataset is balanced with respect to Task A, we will use accuracy to evaluate Misogyny Identification.
For Task B, because the classes are imbalanced in both the Misogynistic Category Classification and the Target Classification, we will evaluate performance with the macro-averaged F-measure.
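As a rough illustration of the two metrics named above, the following sketch computes accuracy and a macro-averaged F-measure in plain Python. It is not the official scoring script: the function names, the toy labels ("active", "passive") and the choice to average over every class seen in either the gold or the predicted labels are assumptions made for this example.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores.

    Assumption: the classes are all labels that occur in gold or pred.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was missed
    labels = sorted(set(gold) | set(pred))
    # F1 = 2TP / (2TP + FP + FN); the denominator is positive for every
    # label that appears at least once in gold or pred.
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]
    return sum(scores) / len(scores)


# Toy, made-up labels: an imbalanced set and a majority-class guesser.
gold_target = ["active", "active", "active", "passive"]
majority_pred = ["active"] * 4

acc = accuracy(gold_target, majority_pred)
mf1 = macro_f1(gold_target, majority_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

On this toy data the majority-class guesser reaches an accuracy of 0.75 but a macro F-measure of only about 0.43, because the missed minority class contributes an F1 of zero with the same weight as the majority class. That is the reason accuracy suits the balanced Task A while the macro F-measure suits the imbalanced Task B labels.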
Results
ENGLISH
- Ranking SubTask A: Misogyny Identification
- Ranking SubTask B: Misogynistic Behaviour and Target Classification
- Detailed Results (category & target)
SPANISH
- Ranking SubTask A: Misogyny Identification
- Ranking SubTask B: Misogynistic Behaviour and Target Classification
- Detailed Results (category & target)