Benefits of managing large objects within database systems
Modern applications handle ever-growing volumes of large binary data – medical scans, video assets, ML embeddings, document archives. In practice, developers store these large objects (often called big values in key-value stores) on the filesystem and keep only a file path in the database, because database systems have historically handled large objects poorly. But this split comes at a real cost: a crash between the filesystem write and the database commit can leave the application in an inconsistent state [3], and it forces developers to rebuild, as custom infrastructure, guarantees a typical DBMS would provide for free. As ML pipelines and media workloads go mainstream, the case for managing large objects with the same transactional guarantees as structured data grows stronger – if only the DBMS could handle them efficiently.
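The crash window in the file-plus-path pattern is easy to reproduce. The following is a minimal sketch (all schema and function names are hypothetical) that writes the file first, then simulates a crash before the database commit, leaving an orphaned file the database knows nothing about:

```python
import os
import sqlite3
import tempfile

def store_object(db, blob_dir, key, payload, crash_before_commit=False):
    # Step 1: write the large object to the filesystem.
    path = os.path.join(blob_dir, f"{key}.bin")
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    if crash_before_commit:  # simulate a crash between the two steps
        return
    # Step 2: commit only the path to the database.
    db.execute("INSERT INTO objects (key, path) VALUES (?, ?)", (key, path))
    db.commit()

blob_dir = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE objects (key TEXT PRIMARY KEY, path TEXT)")

store_object(db, blob_dir, "scan-1", b"...", crash_before_commit=True)

# After the "crash": the file exists, but no database row points to it,
# so only custom cleanup code can ever find and reclaim the orphan.
on_disk = set(os.listdir(blob_dir))
in_db = {row[0] for row in db.execute("SELECT key FROM objects")}
print(on_disk)  # {'scan-1.bin'}
print(in_db)    # set()
```

Reversing the order merely flips the failure mode: a committed row pointing at a file that was never written. Without a transactional coupling of the two writes, some window always remains.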
Recent work follows the key-value separation design, storing large objects in the write-ahead log while maintaining only a small key and object ID in the primary data structure [1][2]. This avoids the write amplification that occurs when large objects are treated like any other record – repeatedly rewritten during page reorganization or background maintenance, regardless of whether the underlying engine is log-structured or B-tree-based. However, placing object bodies in the WAL introduces a fundamental reclamation problem: unlike normal log records, which can be discarded once checkpointed, large object payloads must persist indefinitely, interleaved with temporary log data that will eventually be truncated.
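The essence of key-value separation can be shown in a few lines. This is a deliberately simplified sketch in the spirit of [1][2] (the class and field names are my own, not from either paper): the value body is appended to the log exactly once, and the primary structure keeps only the key and a fixed-size pointer, so rewriting the index during reorganization never touches the payload.

```python
class KVSeparatedStore:
    """Toy key-value separated store: values live in an append-only
    log; the index holds only small (offset, length) pointers."""

    def __init__(self):
        self.wal = bytearray()  # append-only log holding value bodies
        self.index = {}         # key -> (wal_offset, length)

    def put(self, key, value):
        offset = len(self.wal)
        self.wal += value       # value written exactly once, to the log
        self.index[key] = (offset, len(value))  # tiny fixed-size entry

    def get(self, key):
        offset, length = self.index[key]
        return bytes(self.wal[offset:offset + length])

store = KVSeparatedStore()
big = b"x" * 1_000_000
store.put("video-42", big)

assert store.get("video-42") == big
# The index entry stays a small tuple regardless of the 1 MB payload,
# so compacting or rebuilding the index is cheap.
print(store.index["video-42"])  # (0, 1000000)
```

The reclamation problem is also visible here: truncating `wal` to bound log growth would destroy the value bodies, which is exactly the tension the next design resolves.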
This is the problem that Why Files If You Have a DBMS? [3] attacks directly. The key insight is to asynchronously write every large object exactly once to the main storage – never to the WAL – and record only a tiny metadata footprint – the Blob State – in the log. Because the WAL never carries large object bodies, checkpointing remains cheap and simple, and log growth stays bounded regardless of object size or the underlying storage engine. Furthermore, managing large objects properly inside a DBMS requires addressing the full stack of challenges: an efficient physical storage format that avoids deep indirection chains on reads, full content indexing without duplicating object data, and interoperability with external programs that expect filesystem paths. The paper tackles all of these – proposing an extent-sequence storage layout, a single-layer Blob State indirection, BLOB indexing with no extra copies, and a FUSE interface that exposes database-managed objects as read-only virtual files to unmodified applications.
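A rough sketch of the resulting write path helps make the separation concrete. The structures below are hypothetical simplifications of the paper's design [3], not its actual implementation: the object body goes once into main storage as a sequence of extents, while the WAL records only a small Blob State entry, so checkpoint-time log truncation never has to preserve payloads.

```python
EXTENT_SIZE = 4096  # illustrative fixed extent size

class BlobStore:
    def __init__(self):
        self.main_storage = []  # fixed-size extents holding object bodies
        self.wal = []           # log of small metadata records only

    def write_blob(self, blob_id, payload):
        # Object body written exactly once, to main storage.
        extents = []
        for off in range(0, len(payload), EXTENT_SIZE):
            extents.append(len(self.main_storage))
            self.main_storage.append(payload[off:off + EXTENT_SIZE])
        # Blob State: a tiny metadata footprint in the log, whose size
        # does not grow with the payload.
        self.wal.append({"blob": blob_id, "extents": extents,
                         "len": len(payload)})

    def checkpoint(self):
        # Log records can be discarded after a checkpoint; the payload
        # itself lives in main storage and survives truncation.
        self.wal.clear()

    def read_blob(self, extents, length):
        data = b"".join(self.main_storage[e] for e in extents)
        return data[:length]

store = BlobStore()
payload = b"y" * 10_000
store.write_blob("blob-7", payload)
state = store.wal[-1]

store.checkpoint()  # WAL truncated; the blob is unaffected
assert store.read_blob(state["extents"], state["len"]) == payload
```

Reads go through a single layer of indirection (the extent list), which mirrors why the paper's single-layer Blob State avoids the deep indirection chains of traditional LOB formats.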
The result is a storage design that finally makes the DBMS competitive with – and in experiments, faster than – the filesystem for large objects, without giving up transactions, indexing, or crash safety. By enabling all data to be stored and processed efficiently within a single system, it naturally supports LOB-intensive workloads such as computer vision, video and media management, and ML embeddings. As these workloads become standard, efficient large object management is no longer a niche concern; it is infrastructure that every serious storage engine needs to get right.
References
[1] Li et al. BVLSM: Write-Efficient LSM-Tree Storage via WAL-Time Key-Value Separation. arXiv:2506.04678, 2025. https://arxiv.org/abs/2506.04678
[2] Chursin et al. Tidehunter: Large-Value Storage With Minimal Data Relocation. arXiv:2602.01873, 2025. https://arxiv.org/abs/2602.01873v2
[3] Nguyen and Leis. Why Files If You Have a DBMS? In IEEE 40th International Conference on Data Engineering (ICDE), 2024, pp. 3878–3892. https://www.cs.cit.tum.de/fileadmin/w00cfj/dis/papers/blob.pdf