The data training startup Labelbox announced a $40 million Series C round of funding on Thursday led by B Capital Group, with participation from Andreessen Horowitz, First Round Capital, Kleiner Perkins, and others.
Labelbox’s platform manages and annotates data to prepare it for use in machine learning applications. Labeling data, such as indicating whether a photo contains a traffic light or a plant, helps machines find patterns and know what to look for. Once trained, the system can find similar patterns in new data sets. Building useful algorithms like this requires vast amounts of training data, and Labelbox’s platform helps data science teams label that data quickly while ensuring that it is accurate and high-quality.
The raise comes about a year after Labelbox’s $25 million Series B, which valued the company at $135 million, according to PitchBook. The latest round, for which the firm declined to share a valuation, brings its total funding to $79 million.
Cofounder and CEO Manu Sharma told Insider that the San Francisco-based startup, which was founded in 2018, was getting cold emails from interested investors “every day” in mid-2020. Even though the startup still had at least two years of runway left at the time, the team began having conversations and was very impressed with the insights of B Capital partner Rashmi Gopinath. Ultimately, that relationship bloomed into B Capital leading the oversubscribed round.
Data labeling and management can be labor-intensive for companies, Gopinath said, citing a 2019 Cognilytica report that found enterprises adopting machine learning spend over 80% of their time on those activities alone.
“If we believe that data is the new oil, then labeling and annotation becomes like the essential parts of the refinery,” she told Insider. “Saving the amount of time that data scientists and engineers are spending labeling that data becomes additional time that they could spend on actually building the most predictive models.”
According to the same Cognilytica report, the market for third-party data labeling solutions was $150 million in 2018 and is projected to grow more than sixfold, to over $1 billion, by 2023.
The firm is taking on at least one aggressive competitor, though: Amazon’s SageMaker platform. SageMaker is an admittedly obvious option for many businesses because their applications are already built on the AWS cloud platform, Gopinath said. Still, Labelbox maintains that it has several advantages, including better automated labeling, and that its customers spend less time curating data sets on its platform than on Amazon’s.
“[Labelbox has] hundreds of customers today on the platform, including some of the Fortune 500 enterprises, federal customers, many of who run on AWS or Azure today,” Gopinath said. “But they’re really looking for that best-in-class product and have mentioned that Labelbox is several years ahead of where the cloud native tool capabilities are at today.”