The modern way of labeling

If a sentence contains a “thank you”, you would most probably say that its sentiment is positive.
And if the term “ASAP” occurs, the message is probably urgent.

This is the idea behind Weak Supervision. If you can define some imperfect heuristics, chances are high that Weak Supervision can significantly accelerate your labeling process. Instead of having you label everything by hand, the approach combines these heuristics by analyzing their correlations and conflicts, resulting in one program that assigns probabilistic labels to your data.
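To make the idea concrete, here is a minimal sketch of how several imperfect heuristics could be merged into one probabilistic label. It is an illustration only: the heuristic names and the `probabilistic_label` helper are hypothetical, and a plain vote share stands in for the correlation and conflict analysis described above.

```python
from collections import Counter

ABSTAIN = None  # a heuristic returns this when it has no opinion


def thanks_heuristic(text):
    return "positive" if "thank you" in text.lower() else ABSTAIN


def complaint_heuristic(text):
    return "negative" if "disappointed" in text.lower() else ABSTAIN


def exclamation_heuristic(text):
    return "positive" if text.endswith("!") else ABSTAIN


def probabilistic_label(text, heuristics):
    """Turn the non-abstaining votes into a label -> probability mapping.

    Assumption: simple vote share. A real Weak Supervision model would
    weight heuristics by their estimated accuracy and correlations.
    """
    votes = [h(text) for h in heuristics]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return {}  # no heuristic fired, so the record stays unlabeled
    counts = Counter(votes)
    return {label: n / len(votes) for label, n in counts.items()}


heuristics = [thanks_heuristic, complaint_heuristic, exclamation_heuristic]
agreeing = probabilistic_label("Thank you, that was quick!", heuristics)
conflicting = probabilistic_label("Thank you, but I am disappointed.", heuristics)
```

When the heuristics agree, the label comes out with full confidence. When they conflict, the probability is split, which marks the record as a candidate for a closer look.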

We use this concept and go one step further.
We combine Weak Supervision with Active Learning.

As data is being labeled, a Machine Learning model is trained, which helps prioritize the unlabeled data. This ensures that your labeling efforts aren’t wasted on the easy examples but rather focus on the hard ones that need manual labeling. This is Active Learning.
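One common way to prioritize unlabeled data is uncertainty sampling: the records the model is least sure about go to the front of the queue. The sketch below assumes the model's predictions are already available as probabilities per record. The record ids, the numbers and the `prioritize` helper are made up for illustration and do not describe how onetask ranks records internally.

```python
def uncertainty(probabilities):
    """Least-confidence score: 1 minus the probability of the top class."""
    return 1.0 - max(probabilities.values())


def prioritize(predictions):
    """Return record ids ordered from most to least uncertain."""
    return sorted(
        predictions,
        key=lambda record_id: uncertainty(predictions[record_id]),
        reverse=True,
    )


# Hypothetical model output for three unlabeled records.
predictions = {
    "rec-1": {"urgent": 0.97, "not urgent": 0.03},
    "rec-2": {"urgent": 0.52, "not urgent": 0.48},
    "rec-3": {"urgent": 0.20, "not urgent": 0.80},
}

queue = prioritize(predictions)
```

Here `rec-2` comes first because the model is close to a coin flip on it, while `rec-1` comes last because the model is already confident.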

Weak Supervision and Active Learning complement each other. Domain knowledge can be expressed as heuristics, which are then combined to automatically generate probabilistic labels. Active Learning trains classifiers in the background that pick the most complicated data points. One major advantage of this combination is that by labeling the hard examples that cannot be easily labeled with heuristics, the classifiers learn implicit knowledge that further contributes to the automated labeling with Weak Supervision.

Apply any classification model to our labels.

We have two restrictions when it comes to our platform. First, your task must be a classification task. Second, it must be possible to represent your data as JSON, e.g. simple text, tabular data or parsed documents. This works for common formats such as spreadsheets, CSV files and many more. Other than that, there are no restrictions.
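As a rough illustration of what fitting into a JSON format can look like, the sketch below turns a small CSV snippet into a list of JSON records using only the Python standard library. The column names and the sample rows are invented, and the exact import format onetask expects is not specified here.

```python
import csv
import io
import json

# Invented sample data; in practice this would come from a CSV file.
csv_text = "id,text\n1,Thank you for the quick reply\n2,Please fix this ASAP\n"

# Each row becomes one dictionary keyed by the column names.
records = list(csv.DictReader(io.StringIO(csv_text)))

# The list of dictionaries serializes directly to JSON.
as_json = json.dumps(records, indent=2)
```

Note that `csv.DictReader` reads every value as a string, so numeric columns would need an explicit conversion if their type matters.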

You can use the labeled data with any Machine Learning infrastructure. This can be a custom Machine Learning model you implemented in a Jupyter Notebook or an ML model built using your favorite AutoML provider.

How it works

Define Labeling Functions

Write your first labeling functions using Python. These functions can be as simple as two lines of code.
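A labeling function of that size might look like the sketch below. It assumes that a record is a dictionary with a `text` field and that returning `None` means the function abstains. Both conventions are assumptions for illustration, not the documented onetask interface.

```python
def lf_contains_asap(record):
    # Vote "urgent" if the text mentions ASAP, otherwise abstain.
    return "urgent" if "asap" in record["text"].lower() else None


def lf_says_thank_you(record):
    # Vote "positive" if the text contains a thank you, otherwise abstain.
    return "positive" if "thank you" in record["text"].lower() else None


example = {"text": "Thank you! Could you send the report ASAP?"}
votes = [lf_contains_asap(example), lf_says_thank_you(example)]
```

Such functions are deliberately imperfect. A plain substring check will misfire now and then, which is acceptable because the individual votes are combined afterwards.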

Label using Active Learning

Machine Learning models are trained as you go, helping you select the records that need to be labeled manually.

Monitor quality and quantity

As your labels are created, you can validate their distributions and confidence.
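The kind of check meant here can be sketched in a few lines: how often does each label occur, and how confident are the labels on average? The `(label, confidence)` pairs below are invented sample values, and the helper names are hypothetical.

```python
from collections import Counter
from statistics import mean

# Invented sample: (label name, confidence) for four labeled records.
labels = [
    ("positive", 0.9),
    ("positive", 0.7),
    ("negative", 0.6),
    ("positive", 0.8),
]


def label_distribution(labels):
    """Share of records per label name."""
    counts = Counter(name for name, _ in labels)
    return {name: n / len(labels) for name, n in counts.items()}


def mean_confidence(labels):
    """Average confidence per label name."""
    by_label = {}
    for name, confidence in labels:
        by_label.setdefault(name, []).append(confidence)
    return {name: mean(values) for name, values in by_label.items()}


distribution = label_distribution(labels)
confidence = mean_confidence(labels)
```

A heavily skewed distribution or a label with low average confidence is a hint that a heuristic needs refinement or that more manual labels are needed for that class.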

Your benefits of using onetask

Scalability

onetask is designed to work on both 10,000 and 10,000,000 records.

Transparency

Every created label has information attached that you can use for quality assessment.

Documentation

Labeling Functions are auditable and can be communicated as docs.

Privacy

If needed, you can keep all your data in-house using our on-premises app.

Flexibility

Requirements can - and will - change. We enable flexible labeling.

Cooperation

Our platform is designed to profit from collaboration.

Become a labeling pioneer now!

Join our Closed Beta