Open-Source Software
Docta
- docta.ai is online. This is a library to help you understand and curate your data.
Key usages:
Find and fix label errors in LLM Alignment Data, image data, tabular data, etc.
Detect rare patterns/features in your data and apply special treatment.
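To make the first usage concrete, here is a minimal sketch of one generic way to flag possible label errors: mark a sample as suspect when its label disagrees with the majority label of its k nearest neighbours in feature space. This is not Docta's API or necessarily its method; the function name `flag_suspect_labels` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def flag_suspect_labels(features, labels, k=5):
    """Return indices whose label differs from the majority label of their
    k nearest neighbours (illustrative heuristic, not Docta's implementation)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances between all samples.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.einsum("ijk,ijk->ij", diffs, diffs)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour

    # Indices of the k closest samples for every row.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    suspects = []
    for i, nbrs in enumerate(neighbours):
        values, counts = np.unique(labels[nbrs], return_counts=True)
        majority = values[np.argmax(counts)]
        if majority != labels[i]:
            suspects.append(i)
    return suspects
```

The pairwise distance matrix makes this suitable only for small datasets; flagged samples are candidates for review, not confirmed errors.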
Data
CIFAR-N
We provide CIFAR-10 and CIFAR-100 training images with human annotations obtained from Amazon Mechanical Turk.
The dataset and leaderboard are publicly available here!
We look forward to contributions (benchmarking efforts) to the leaderboard of the CIFAR-N datasets.
1st Learning and Mining with Noisy Labels Challenge at IJCAI-ECAI 2022 [link].
Code
Weakly-Supervised Learning
We maintain a curated list of the most recent papers and code on Learning with Noisy Labels. Check here!
CIFAR-N dataset: a dataloader and starter code for learning from the real-world, noisily labeled CIFAR datasets [Code]. The dataset website is available here.
IJCAI-2023 Tutorial: A Hands-on Tutorial for Learning with Noisy Labels [Code].
Implementation of the paper Negative-Label-Smoothing [Code].
Implementation of the paper Robust-f-Divergence [Code].
Incentives in Machine Learning
- Implementation of the paper Sample-Elicitation [Code].