The modern way of labeling
If there is a "thank you" in a sentence, you would most probably say that its sentiment is positive.
And if the term “ASAP” occurs, the message is probably urgent.
This is the idea behind Weak Supervision. If you can define some imperfect heuristics, chances are high that Weak Supervision can significantly accelerate your labeling process. Instead of requiring you to label everything by hand, the approach combines imperfect heuristics by analyzing their correlations and conflicts, resulting in a single program that assigns probabilistic labels to your data.
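To make the idea concrete, here is a minimal sketch of how a few imperfect heuristics could be combined into a probabilistic label. It is an illustration under stated assumptions, not the platform's implementation: the heuristic functions, the keyword list, and the accuracy values are all hypothetical, and the combination step is a simple log-odds vote with fixed accuracies in place of a real analysis of correlations and conflicts.

```python
import math

ABSTAIN = None  # a heuristic returns this when it has no opinion
LABELS = ("positive", "negative")


def lf_thank_you(text):
    # Heuristic from the text: "thank you" suggests positive sentiment.
    return "positive" if "thank you" in text.lower() else ABSTAIN


def lf_complaint(text):
    # Hypothetical heuristic: complaint words suggest negative sentiment.
    keywords = ("refund", "broken", "disappointed")
    return "negative" if any(k in text.lower() for k in keywords) else ABSTAIN


# (heuristic, assumed accuracy) pairs. The accuracies are made-up
# illustration values; a real system would estimate them from the data.
HEURISTICS = [(lf_thank_you, 0.9), (lf_complaint, 0.8)]


def probabilistic_label(text):
    """Combine heuristic votes into a probability for each label."""
    scores = {label: 0.0 for label in LABELS}
    for heuristic, accuracy in HEURISTICS:
        vote = heuristic(text)
        if vote is not ABSTAIN:
            # A more accurate heuristic contributes a larger log-odds weight.
            scores[vote] += math.log(accuracy / (1.0 - accuracy))
    total = sum(math.exp(s) for s in scores.values())
    return {label: math.exp(s) / total for label, s in scores.items()}


conflicting = probabilistic_label("Thank you, but the item arrived broken.")
no_votes = probabilistic_label("Hello")
```

When both heuristics fire and disagree, the more trusted one wins but the label stays uncertain. When no heuristic fires, every label gets the same probability, which marks the record as a candidate for manual labeling.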
We use this concept and go one step further.
We combine Weak Supervision with Active Learning.
As data is being labeled, a Machine Learning model is trained in the background and helps prioritize the unlabeled data. This ensures that your labeling efforts aren’t wasted on the easy examples but focus on the hard ones that need manual labeling. This is Active Learning.
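One common way to do this prioritization is uncertainty sampling: records on which the model is least sure are shown first. The sketch below assumes entropy as the uncertainty measure and uses hypothetical record ids and model outputs; the strategy the platform actually uses may differ.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a class-probability list; higher means less certain."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


def prioritize(unlabeled):
    """Return record ids ordered from most to least uncertain."""
    return sorted(unlabeled, key=lambda rid: entropy(unlabeled[rid]), reverse=True)


# Hypothetical model outputs: record id -> [P(positive), P(negative)]
predictions = {
    "msg-1": [0.98, 0.02],  # model is confident, low labeling priority
    "msg-2": [0.55, 0.45],  # model is unsure, high labeling priority
    "msg-3": [0.80, 0.20],
}

queue = prioritize(predictions)
```

Here `queue` starts with `msg-2`, the record the model is least certain about, and ends with `msg-1`.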
Weak Supervision and Active Learning complement each other. Domain knowledge can be expressed as heuristics, which are then combined to automatically generate probabilistic labels. Active Learning trains classifiers in the background that pick out the hardest data points. One major advantage of this combination is that by labeling the hard examples that cannot easily be labeled with heuristics, the classifiers learn implicit knowledge that further contributes to the automated labeling with Weak Supervision.
Apply any classification model to our labels.
Our platform has two restrictions. First, your task must be a classification task. Second, your data must be representable as JSON, e.g. simple text, tabular data or parsed documents. This covers common formats such as spreadsheets, CSV files and many more. Other than that, there are no restrictions.
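As an illustration of what fitting into a JSON format can mean for tabular data, the following sketch turns CSV rows into one JSON object per row using only the Python standard library. The column names and sample rows are hypothetical, and the exact schema the platform expects is not specified here.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of JSON-serializable dicts, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample data; the header row supplies the attribute names.
sample = (
    "id,subject,body\n"
    "1,Invoice,Please pay ASAP\n"
    "2,Greetings,Thank you for your help\n"
)

records = csv_to_records(sample)
payload = json.dumps(records, indent=2)
```

Every value is read as a string, so numeric columns would need an explicit conversion before export if their type matters.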
You can use the labeled data with any Machine Learning infrastructure. This can be a custom Machine Learning model you implemented in a Jupyter Notebook or an ML model built using your favorite AutoML provider.
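A typical first step before training your own model is to turn probabilistic labels into a plain training set. The sketch below keeps only records whose most likely label exceeds a confidence threshold. The record layout (`text`, `label_probs`) and the threshold of 0.8 are assumptions made for illustration, not the platform's export format.

```python
def to_training_set(records, min_confidence=0.8):
    """Keep confidently labeled records and return (texts, labels) lists."""
    texts, labels = [], []
    for record in records:
        # Pick the label with the highest probability for this record.
        label, confidence = max(record["label_probs"].items(), key=lambda kv: kv[1])
        if confidence >= min_confidence:
            texts.append(record["text"])
            labels.append(label)
    return texts, labels


# Hypothetical exported records with probabilistic labels.
exported = [
    {"text": "Thank you so much!", "label_probs": {"positive": 0.95, "negative": 0.05}},
    {"text": "It is fine, I guess.", "label_probs": {"positive": 0.55, "negative": 0.45}},
    {"text": "This arrived broken.", "label_probs": {"positive": 0.10, "negative": 0.90}},
]

texts, labels = to_training_set(exported)
```

The resulting `texts` and `labels` lists can then be vectorized and passed to whichever classifier you prefer. The uncertain second record is dropped instead of being forced into a class.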