The value of labeled data sets

From self-driving cars to speech recognition and computer vision, machine learning technologies have been behind many of the latest advances in computing. Used for training and evaluation, data sets, both labeled and unlabeled, are the foundation of these powerful algorithms. Labeled data is annotated with information that is easy for humans to intuit but difficult for computers to derive. For example, consider the road imagery below from one of Vizzion’s data sets. A person can readily tell wet, snowy, and slushy conditions apart, but that distinction is far harder for a computer to make. Labeled data is particularly valuable because it supplies the known answers used both to train an algorithm and to evaluate its efficacy. Vizzion has a wide variety of traffic and weather data that it uses internally and offers to partners as a way of improving their own analytics.

Quickly changing weather in Montreal

Road conditions, which are crucial to driver safety, are naturally easy for humans to detect.

As the world’s largest combined roadside and on-vehicle camera data provider, Vizzion’s cameras are a powerful tool for collecting all types of traffic and weather data sets. Our database, which consists of over 65,000 roadside cameras across the world and tens of thousands of on-vehicle cameras, is curated and maintained by full-time staff members, ensuring that all data collected and returned is of high quality. Each camera image is timestamped and comes with metadata such as the camera’s current viewpoint and a latitude and longitude accurate to four decimal places.
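To give a sense of what four decimal places means on the ground, the short sketch below converts a step of 0.0001 degrees into meters. It relies on the common approximation of about 111.32 km per degree of latitude, and the latitude used (roughly that of Montreal) is chosen only for illustration. The function name is hypothetical and not part of any Vizzion API.

```python
import math

# Approximate length of one degree of latitude, in meters.
METERS_PER_DEGREE_LAT = 111_320


def four_decimal_step_m(lat_deg):
    """Return the (north-south, east-west) size in meters of a 0.0001 degree step."""
    step_deg = 1e-4
    north_south = step_deg * METERS_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = four_decimal_step_m(45.5017)  # illustrative latitude near Montreal
print(f"north-south: {ns:.1f} m, east-west: {ew:.1f} m")
```

Under these assumptions, a coordinate given to four decimal places locates a camera to within roughly ten meters.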

The labeled data sets we collect and annotate internally are crucial to the development of our own analytics algorithms. The majority of our labeled data uses binary classification, meaning the images either show something or they don’t, and Vizzion employs many such simple labeled data sets for algorithm training and evaluation. Our visibility data set, which we used to create and refine our low visibility detection algorithm, is one example, containing images with labels denoting low or clear visibility.

Four pairs of camera images, each pair showing one image with clear visibility and one with low visibility.

Each low-visibility image is paired with an image of the same scene in clear visibility, which highlights the difference between the two conditions.
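As a rough illustration of how binary labels support evaluation, the sketch below compares a detector's output against hand-assigned low/clear visibility labels and reports precision and recall. The `LabeledImage` record, the image identifiers, and the predictions are all made up for this example; they do not describe Vizzion's actual data format or algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledImage:
    image_id: str         # hypothetical identifier
    low_visibility: bool  # hand-assigned binary label


def evaluate(labels, predictions):
    """Compare predictions (image_id -> bool) against hand-assigned labels."""
    tp = fp = tn = fn = 0
    for item in labels:
        predicted = predictions[item.image_id]
        if predicted and item.low_visibility:
            tp += 1
        elif predicted and not item.low_visibility:
            fp += 1
        elif not predicted and item.low_visibility:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall}


# Two paired scenes, each with a low-visibility and a clear image.
labels = [
    LabeledImage("cam1_fog", True),
    LabeledImage("cam1_clear", False),
    LabeledImage("cam2_fog", True),
    LabeledImage("cam2_clear", False),
]
# Made-up detector output: correct on camera 1, wrong on both camera 2 images.
predictions = {"cam1_fog": True, "cam1_clear": False,
               "cam2_fog": False, "cam2_clear": True}

metrics = evaluate(labels, predictions)
print(metrics)
```

Because every image carries a known answer, the same labeled set can be reused to compare successive versions of a detector.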

Other labeled data sets Vizzion can provide include images showing construction and work zone equipment such as cones, diamond signs, or drums.

Four images showing construction.

Images with construction equipment are labeled as such, which can aid in the development of construction recognition algorithms.

Weather imagery such as wet, dry, and snowy road surfaces or raindrops accumulated on camera lenses is also collected and stored in a variety of data sets.

An image showing a dry road. An image showing a wet road. An image showing a snowy road. An image showing raindrops on a camera lens.

Our data sets encompass all types of weather data, from rain and snow to clear, dry conditions.

While binary-classified data makes up the majority of our data offering, we also create more complex sets with labels such as the number of vehicles in a given image or the lane boundaries on a stretch of road. All of our labeled data is sorted and labeled by hand, and our vehicle and lane data is particularly time-intensive to produce, which makes Vizzion an ideal supplier for partners developing roadway-related machine learning algorithms. We are able to produce labeled data efficiently through the efforts of our in-house content engineers, experts in road imagery who use proprietary tools to process large numbers of camera feeds quickly and accurately. Images to be labeled can be collected manually or through API calls based on external data. For example, we can input a location- and time-stamped list of events and pull the images closest to each event for subsequent labeling. This process gives us confidence in the quality of our data, and our customers rely on us to deliver accurate image feeds and data.
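The event-matching step can be sketched as follows. This is a minimal illustration, not Vizzion's API: the `Event` and `CameraImage` records, the 15-minute time window, and the rule of choosing the spatially nearest image within that window are all assumptions made for the example. Distance is computed with the standard haversine formula.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:            # hypothetical record for a location- and time-stamped event
    name: str
    lat: float
    lon: float
    time: datetime


@dataclass(frozen=True)
class CameraImage:      # hypothetical record for a stored camera image
    image_id: str
    lat: float
    lon: float
    time: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def closest_image(event, images, max_gap=timedelta(minutes=15)):
    """Return the nearest image taken within max_gap of the event, or None."""
    candidates = [img for img in images if abs(img.time - event.time) <= max_gap]
    if not candidates:
        return None
    # Prefer the smallest distance; break ties with the smallest time difference.
    return min(
        candidates,
        key=lambda img: (haversine_km(event.lat, event.lon, img.lat, img.lon),
                         abs(img.time - event.time)),
    )


images = [
    CameraImage("A", 45.5020, -73.5670, datetime(2021, 1, 1, 10, 5)),
    CameraImage("B", 45.6000, -73.6000, datetime(2021, 1, 1, 10, 1)),
    CameraImage("C", 45.5017, -73.5673, datetime(2021, 1, 1, 12, 0)),
]
event = Event("snow report", 45.5017, -73.5673, datetime(2021, 1, 1, 10, 0))
late_event = Event("evening report", 45.5017, -73.5673, datetime(2021, 1, 1, 20, 0))

match = closest_image(event, images)
print(match.image_id if match else None)
print(closest_image(late_event, images))
```

Filtering by time first and then ranking by distance is one of several reasonable definitions of "closest"; a combined space-time score would work equally well where events move quickly.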

An image of a highway, marked up with labels.

Complex data such as road boundaries and vehicle locations are stored as metadata, which can be visualized as shown.
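One way such metadata might be laid out is sketched below. The JSON schema (field names, pixel-coordinate bounding boxes, lane boundaries as point lists) is purely hypothetical and is not Vizzion's actual format. The example checks that annotations fall inside the image and derives the vehicle count, one of the labels mentioned above.

```python
import json

# Hypothetical annotation for one image; all values are invented.
annotation_json = """
{
  "image_id": "example-001",
  "width": 640,
  "height": 480,
  "vehicles": [
    {"x": 100, "y": 220, "w": 80, "h": 60},
    {"x": 330, "y": 200, "w": 60, "h": 45}
  ],
  "lane_boundaries": [
    [[40, 480], [250, 180]],
    [[600, 480], [390, 180]]
  ]
}
"""


def summarize(annotation):
    """Validate that annotations lie within the image and count them."""
    width, height = annotation["width"], annotation["height"]
    for box in annotation["vehicles"]:
        inside = (0 <= box["x"] and box["x"] + box["w"] <= width
                  and 0 <= box["y"] and box["y"] + box["h"] <= height)
        if not inside:
            raise ValueError(f"vehicle box outside image: {box}")
    for boundary in annotation["lane_boundaries"]:
        for x, y in boundary:
            if not (0 <= x <= width and 0 <= y <= height):
                raise ValueError(f"lane point outside image: {(x, y)}")
    return {
        "vehicle_count": len(annotation["vehicles"]),
        "lane_boundary_count": len(annotation["lane_boundaries"]),
    }


annotation = json.loads(annotation_json)
summary = summarize(annotation)
print(summary)
```

Keeping annotations as metadata separate from the pixels means the same image can be rendered with or without its overlay.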

The applications for machine learning are numerous and steadily expanding, and many of them require reliable labeled data sources to train and evaluate algorithms. Vizzion, as the leading road camera data provider in terms of both technology and coverage, is well prepared to supply data sets such as out-of-service/in-service cameras, low and clear visibility, images showing construction sites, dry/wet/snowy roads, road boundary and vehicle locations, and more. Our own algorithms are trained on our labeled data sets, and we encourage others to train algorithms on the images we collect. We also offer custom development work and can collect and label an image set to a customer’s requirements. Spend the time you would have spent assembling a data set on algorithm development instead, and let us handle your labeled image data needs.

About Vizzion

Vizzion is the leading provider of road imagery for traffic, weather, road condition, and safety operations and applications. Through partnerships with over 200 different transport agencies and on-vehicle camera providers, Vizzion offers live feeds from over 100,000 cameras in 40 countries across North America, Europe, Asia, Australasia, and key markets in South America and Africa. Both on-vehicle and roadside traffic camera services are available through Vizzion’s flexible API and turnkey Video Wall application. Vizzion’s content is trusted by major apps, map providers, broadcasters, fleets, and automotive organizations. Contact busdev@vizzion.com for more information.
