Testing autonomous vehicles is one of the key elements in the development of this technology. A consortium established by the Motor Transport Institute (Instytut Transportu Samochodowego, ITS) and Warsaw University of Technology is working on a database of road scenarios specific to Polish conditions, which will be used to train AI models. These scenarios will be created using the DARTS-PL measurement platform, a system of sensors collecting data on infrastructure, traffic signs, and driver behavior in selected locations across Poland.
“The DARTS-PL project aims to build a Polish database of road scenarios for testing automated and highly automated vehicles – in other words, autonomous vehicles. Our goal is to record 840 scenarios that reflect the conditions found on Polish roads,”
says Aleksandra Rodak, Senior Research and Technical Specialist at the Motor Transport Institute, in an interview with Newseria.
“We want this to reflect the reality of Polish conditions, since our infrastructure, traffic signs, and driver behavior are quite different from other countries. That is why we pushed so hard to create a national database. We are now at the stage where we can start driving on the roads and recording data.”
The project website, darts-database.com, includes a map of selected locations (eventually more than 100). Each site – including its infrastructure, traffic signs, and other drivers’ behavior – will be recorded multiple times, in different conditions: daytime and nighttime, as well as in different seasons. As Rodak explains, the chosen locations are either problematic for drivers, where accidents frequently occur, or otherwise characteristic of Polish regions.
“With our technology partners, we have built a research vehicle that goes out onto the roads to capture data in these areas. Then the Warsaw University of Technology, our consortium partner, will be responsible for applying annotations – small rectangles marking what type of object is in the recording, whether it is a pedestrian, a car, or a building. Mainly moving objects will be annotated,”
adds Rodak.
Annotation is the process of identifying and labeling each object in a dataset, for example in a single video frame. Annotated datasets are critical for training AI models to detect objects, understand the road environment, and make real-time driving decisions. Automated annotation tools, powered by machine learning algorithms, speed up this process by detecting and labeling pedestrians, vehicles, traffic signs, and lane markings. Automation accelerates data processing and can also improve labeling accuracy and scalability.
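To make the idea of a rectangle annotation concrete, here is a minimal Python sketch of how one labeled box per object in a frame might be represented and filtered down to moving objects. The field names, the `BoxAnnotation` class, and the `MOVING_LABELS` set are assumptions made for illustration; the article does not describe the actual DARTS-PL annotation format.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical label set for illustration only; the real DARTS-PL
# taxonomy is not described in the article.
MOVING_LABELS = {"pedestrian", "car"}


@dataclass(frozen=True)
class BoxAnnotation:
    """One labeled rectangle in a single video frame (pixel coordinates)."""
    frame_id: int
    label: str
    x: float       # left edge of the rectangle
    y: float       # top edge of the rectangle
    width: float
    height: float

    def area(self) -> float:
        """Area of the rectangle in square pixels."""
        return self.width * self.height


def moving_objects(annotations):
    """Keep only annotations whose label is in the moving-object set."""
    return [a for a in annotations if a.label in MOVING_LABELS]


def label_counts(annotations):
    """Count how many boxes carry each label."""
    return Counter(a.label for a in annotations)


# Made-up example values, not real DARTS-PL data.
annotations = [
    BoxAnnotation(frame_id=0, label="pedestrian", x=120, y=200, width=40, height=110),
    BoxAnnotation(frame_id=0, label="car", x=300, y=220, width=180, height=90),
    BoxAnnotation(frame_id=0, label="building", x=0, y=0, width=640, height=180),
]

print(label_counts(moving_objects(annotations)))
```

In this sketch the building is dropped by the filter, mirroring the remark that mainly moving objects will be annotated.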
The DARTS-PL mobile measurement platform is equipped with advanced sensors, including four LiDARs, seven 360° cameras, six radars, and a thermal imaging camera. Based on these measurements, the database will cover 840 scenarios and approximately 168,000 data frames, featuring 15 types of road users (such as pedestrians, cyclists, and agricultural machinery), over 100 different road signs, 27 types of special events, and a variety of infrastructure (such as railway crossings and all road classes).
“Autonomous vehicles rely on artificial intelligence algorithms – specifically machine learning. To properly design a control algorithm, we need training data,”
explains the ITS expert.
“Testing is extremely important, as it directly impacts the quality, accuracy, behavior, and safety of autonomous vehicles in the future. If the control algorithm is well-trained, with a high-quality training database and correct annotations, the vehicle will correctly recognize all road users and respond appropriately to the conditions.”
According to the project’s creators, the road scenarios will be broadly accessible for both scientific and commercial use.
“We expect that developers of autonomous driving algorithms will use our database to train their systems, so that vehicles eventually operating on Polish roads will be able to react and adapt to local conditions,”
emphasizes Rodak.
ITS also highlights that AI algorithms require massive amounts of high-quality training data. While there are around 200 public databases worldwide for autonomous vehicles, only 90 are directly applicable to training autonomous vehicle (AV) systems. Many of them rely on limited sensor setups or low-quality data. Location matters as well: algorithms trained on U.S. data may not perform correctly in Europe due to differences in infrastructure and traffic behavior.
“Globally, there are about 200 such databases. Unfortunately, not all are open for commercial or even research use – many are restricted to private companies and developers. We, however, will provide our database on a non-profit basis. Anyone who approaches us will be able to access and test it. We hope that both research institutions and service providers worldwide will reach out to us,”
says Aleksandra Rodak.