AAVI recently spoke to Beamr’s Dani Megrelishvili about the company’s content-adaptive compression for petabyte-scale video data – you can also read this interview in the January 2026 issue of ADAS & Autonomous Vehicle International magazine
The autonomous vehicle industry faces an escalating challenge: the video data essential for ML development is creating infrastructure bottlenecks that threaten progress. The scale is staggering: as fleets expand, AV and ADAS companies are managing hundreds of petabytes of video data with cascading impacts on costs, processing speed and iteration cycles.
Across the industry, there is growing recognition that costs must be reduced and processes accelerated. However, any compression method must be rigorously validated before deployment at scale, or it risks the integrity of the entire program.
Consider a 150-vehicle fleet, with each vehicle producing 1TB of data daily – a conservative estimate. That’s 150TB per day, or 55PB per year. Larger companies managing hundreds of petabytes of real-world and synthetic video data already face annual costs of millions to tens of millions of dollars for their storage.
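The fleet arithmetic above can be checked, and extended into a rough cost estimate, with a few lines of Python. The dollars-per-terabyte rate below is a hypothetical blended cloud storage price chosen for illustration, not a figure from the article:

```python
# Back-of-the-envelope fleet data arithmetic using the figures cited above.
FLEET_SIZE = 150             # vehicles
TB_PER_VEHICLE_PER_DAY = 1   # conservative estimate

daily_tb = FLEET_SIZE * TB_PER_VEHICLE_PER_DAY   # 150 TB/day
annual_pb = daily_tb * 365 / 1000                # ~54.75 PB/year, i.e. ~55PB

# Hypothetical blended cloud storage rate (assumption): $20 per TB per month.
# Steady-state cost of holding one year's worth of data for a full year:
COST_PER_TB_MONTH = 20
annual_storage_cost = annual_pb * 1000 * COST_PER_TB_MONTH * 12

print(f"{daily_tb} TB/day, {annual_pb:.2f} PB/year, "
      f"~${annual_storage_cost / 1e6:.1f}M/year in storage alone")
```

Even at this illustrative rate, a single year's raw footage lands in the low tens of millions of dollars per year, consistent with the range quoted above.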
The challenge extends beyond storage capacity. Managing massive datasets requires data factories for training, validation, deployment and iteration. This creates cascading operational constraints, including network transfer limitations and infrastructure scaling pressure.
The economics are stark: cloud storage costs, egress fees and compute overhead for petabyte-scale datasets create compounding expenses that strain even well-funded programs.
The compression dilemma
Video data compression is a critical strategy to address these issues. However, reducing data size while preserving model performance is difficult to achieve. Even with careful configuration, standard tools offer limited control over the size-accuracy trade-offs.
For example, applying uniform compression parameters to a busy intersection and a highway drive can degrade ML model accuracy or miss efficiency opportunities.
This is also a safety issue, as autonomous driving models need to detect street signs at considerable distances, recognize objects and track vehicles through complex lighting.
Industry experience with compression has established clear validation requirements. Early implementations revealed that standard tools often lacked the precision needed for ML workloads. Some businesses implemented aggressive recompression early in development, only to discover downstream impacts on model accuracy that forced reversions to less compressed formats. Others found that certain methods failed to preserve the specific visual characteristics that their ML models required.
Despite these concerns, the AV professionals that AAVI has spoken to signal that even a 20-30% reduction in file size would justify investigation. The potential payback is considerable: reduced storage costs, faster pipeline throughput and decreased egress expenses. The question isn’t whether compression would be valuable, but whether it can be achieved without compromising data integrity. This creates a clear market requirement: substantial savings with validated model preservation.

Meeting the validation standard
Solutions that can demonstrate significant compression improvements and validated model preservation are gaining adoption as AV programs mature, reports Beamr’s chief product officer, Dani Megrelishvili. He says his company’s patented Content-Adaptive Bitrate (CABR) technology takes a unique approach, as it analyzes the complexity of each frame and adjusts accordingly.
The Emmy Award-winning technology uses metrics originally developed for high-quality broadcast that are now adapted for machine vision requirements. The compressed video data preserves spatial and temporal information critical for ML tasks.
“Recent validation testing and proof-of-concept (PoC) runs with real-world AV footage demonstrated a 23-50% compression improvement over existing workflows,” says Megrelishvili. “When combined with modern codecs (such as HEVC or AV1), total improvements reached 40-50% while maintaining quality metrics, indicating no ML model degradation.
“We’re seeing a gap emerge between AV programs that treat data infrastructure as strategic and those that treat it as operational. The strategic ones compress early, iterate faster and free up millions for R&D instead of burning it on storage and data transfer costs.”
As data volumes reach hundreds of petabytes and continue to double annually, infrastructure choices become competitive advantages, according to Megrelishvili. “AV teams addressing their data bottlenecks now gain both cost advantages and development velocity,” he says. “They free capital and engineering resources that competitors must dedicate to storage management. With validation frameworks established and proven solutions available, the strategic question is shifting from whether to optimize to how quickly to act.”
How frame-level compression works for AVs
Compressing AV video data without careful attention to ML accuracy requirements can degrade model performance. Standard tools compress every frame in the same way, but AV footage varies dramatically – highway drives versus dense intersections, bright sun versus heavy rain, day versus night. Treating them all identically either leaves savings on the table or risks degrading model precision.
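The idea of treating frames differently can be illustrated with a toy sketch. This is not Beamr's CABR algorithm, whose quality metrics are proprietary; it simply shows the general pattern of estimating a per-frame complexity score and mapping it to an encoder quality parameter such as CRF, so that flat highway frames are compressed harder than busy intersections. The CRF range and complexity threshold are arbitrary illustration values:

```python
import random

def frame_complexity(frame):
    """Crude spatial-complexity proxy: mean absolute horizontal
    gradient of the luma plane (a list of pixel rows)."""
    diffs = [abs(row[i + 1] - row[i])
             for row in frame for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def pick_crf(complexity, lo_crf=18, hi_crf=30, max_complexity=40.0):
    """Map complexity to an H.264/HEVC-style CRF: busy frames get a
    lower (higher-quality) CRF, flat frames are compressed harder."""
    t = min(complexity / max_complexity, 1.0)
    return round(hi_crf - t * (hi_crf - lo_crf))

rng = random.Random(0)
flat = [[128] * 64 for _ in range(64)]                               # highway-like frame
busy = [[rng.randrange(256) for _ in range(64)] for _ in range(64)]  # cluttered frame

print(pick_crf(frame_complexity(flat)))   # -> 30 (aggressive compression)
print(pick_crf(frame_complexity(busy)))   # -> 18 (conservative compression)
```

A production system would of course use far richer perceptual and temporal measures, but the principle is the same: the encoder's quality budget follows the content rather than a fixed setting.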
In the search for an optimal bitrate-accuracy balance, engineers emphasize the need for rigorous validation. Pixel-level metrics provide one measure but don't capture the full picture. Teams need evidence that compression doesn't degrade performance on specific workloads: object detection accuracy, tracking consistency and depth estimation precision.

A content-adaptive approach addresses these challenges by analyzing each frame individually rather than imposing uniform constraints, according to Beamr's Dani Megrelishvili. “Our GPU-accelerated video pipelines can handle the high throughput demands of AV data at high speed and low cost, enabling rapid and scalable compression,” he explains. “Each frame is compressed as aggressively as possible while necessary details are perfectly preserved.”
Megrelishvili says this approach addresses the limitations of alternative methods: “Lossless or near-lossless compression attempts to avoid any degradation but results in unmanageable file sizes for petabyte-scale datasets. Standard recipes with default settings may not preserve the details required to ensure fidelity.
“Beamr’s proprietary Content-Adaptive Bitrate (CABR) technology demonstrates the efficient and ML-safe path for video data. Recent validation testing and PoC runs with real-world AV footage demonstrated up to 50% improvement over existing workflows while maintaining quality metrics, indicating no ML model degradation.”
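The workload-level validation described above can be sketched in miniature: run a detector on original and compressed frames, then measure how many reference detections survive compression, using intersection-over-union (IoU) matching. The boxes below are made-up placeholder values; a real validation suite would compare full detection, tracking and depth-estimation metrics across large test sets:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def detection_agreement(ref_boxes, test_boxes, thresh=0.5):
    """Fraction of reference detections (original footage) matched at
    IoU >= thresh by a detection on the compressed footage."""
    matched = 0
    remaining = list(test_boxes)
    for r in ref_boxes:
        best = max(remaining, key=lambda t: iou(r, t), default=None)
        if best is not None and iou(r, best) >= thresh:
            matched += 1
            remaining.remove(best)
    return matched / len(ref_boxes) if ref_boxes else 1.0

# Hypothetical detections on original vs. compressed versions of one frame
orig = [(10, 10, 50, 50), (60, 20, 90, 80)]
comp = [(12, 11, 51, 52), (61, 22, 92, 79)]
print(detection_agreement(orig, comp))  # -> 1.0: both detections preserved
```

An agreement score that stays near 1.0 across representative scenes (night, rain, dense traffic) is the kind of workload-specific evidence teams look for before committing to a compression setting at fleet scale.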

