Benchmarking – DOM-NET

(last edited March 18th, 2026)

Main trends in benchmarking deformable-object manipulation

Benchmarking of robotic manipulation of textiles remains fragmented and evolving. Early work focused on physical task benchmarks that define standardized manipulation tasks (e.g., towel folding or cloth spreading) and evaluation metrics. More recent efforts have expanded benchmarking along three complementary directions.

First, several works emphasize object standardization, proposing shared cloth object sets [2] and material characterization protocols [3] so that experiments can be reproduced across laboratories. This reflects the recognition that textile manipulation results are strongly influenced by fabric properties, which are often poorly reported.

Second, the rapid growth of learning-based approaches has led to the emergence of simulation-based benchmarks such as SoftGym [4], DaXBench [5], GarmentLab [7], and DexGarmentLab [13]. These platforms provide reproducible environments and datasets for comparing reinforcement learning or planning algorithms at scale, but they also highlight a persistent sim-to-real gap in cloth dynamics [6].

Third, the field is beginning to adopt dataset-driven and competition-style benchmarks, such as the cloth unfolding benchmark built on the ICRA 2024 competition [8], where shared datasets and defined tasks enable head-to-head comparisons between methods.

In contrast to rigid-object manipulation, where the YCB object set and its associated benchmarks serve as a common reference, textile manipulation still lacks a widely adopted community-standard benchmark with common datasets, tasks, and leaderboards.

Ref	Work	Benchmark type	Scope	Tasks defined	Standardized objects	Metrics defined	Dataset shared	Data type	Simulation environment	Main contribution	Main limitation
[1]	Bimanual Cloth Manipulation	Physical benchmark	Cloth	Yes	Yes	Yes	Partial	Real	No	One of the first structured benchmarks for cloth manipulation tasks.	Limited dataset and task diversity.
[2]	Household Cloth Object Set	Object standardization	Cloth	Yes	Yes	Partial	Yes	Real	No	Standardized cloth object set enabling reproducible experiments across labs.	No large dataset or leaderboard.
[3]	Standardization of Cloth Objects	Material characterization	Cloth	No	Yes	Partial	Partial	Real	No	Introduces material descriptors to enable comparable experiments.	Focuses on cloth characterization rather than manipulation tasks.
[4]	SoftGym	Simulation benchmark	Deformable objects	Yes	Yes	Yes	Yes	Simulation	Yes	Widely adopted RL benchmark with cloth manipulation tasks.	Limited realism of cloth physics.
[5]	DaXBench	Simulation benchmark	Deformable objects	Yes	Yes	Yes	Yes	Simulation	Yes	Differentiable physics benchmark for learning deformable manipulation.	Mostly simulation-based evaluation.
[6]	Sim-to-Real Gap	Sim-to-real benchmark	Cloth	Yes	Yes	Yes	Partial	Real + Simulation	Yes	Benchmarks fidelity of cloth simulators compared to real experiments.	Narrow focus on simulator evaluation.
[7]	GarmentLab	Simulation benchmark	Garments	Yes	Yes	Yes	Yes	Simulation	Yes	Large-scale garment manipulation environment with many tasks and garments.	Still emerging; limited real-world validation.
[8]	Cloth Unfolding Benchmark (ICRA competition dataset)	Dataset + competition	Cloth	Yes	Yes	Yes	Yes	Real	No	Public dataset and benchmark for cloth unfolding grasp prediction.	Focused on a single subtask.
[9]	NIST Deformable Object Benchmark	Methodology benchmark	Deformable objects	Yes	Partial	Yes	No	—	No	Standardized metrics for evaluating deformable manipulation tasks.	Not textile-specific.
[10]	Flat’n’Fold	Dataset + benchmark	Garments	Yes	Yes	Yes	Yes	Real	No	Large multimodal dataset with ~2,000 demonstrations across 44 garments, capturing full manipulation sequences from crumpled to folded states.	Primarily perception/learning benchmark rather than full robot manipulation benchmark.
[11]	Cloth Folding Dataset (UGent)	Demonstration dataset	Cloth	Yes	Yes	Partial	Yes	Real	No	Dataset of ~8.5 hours of folding demonstrations (~1000 samples) for learning cloth folding policies.	Limited garment diversity and task scope.
[12]	Cloth Tracking Dataset	Dynamic cloth dataset	Cloth	No	Yes	No	Yes	Real	No	Motion capture dataset capturing cloth deformation dynamics across different fabrics.	Focused on cloth tracking for simulator system identification rather than manipulation tasks.
[13]	DexGarmentLab	Simulation environment	Garments	Yes	Yes	Yes	Yes	Simulation	Yes	Environment for dexterous bimanual garment manipulation with diverse garment assets and tasks.	Mostly simulation; limited physical validation.
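
The comparison table can also be queried programmatically, for instance to shortlist benchmarks that match a given experimental setup. The following is a minimal Python sketch and not part of any cited work: the `Entry` class, the `refs_where` helper, and the choice of three columns are illustrative assumptions, and the values are transcribed by hand from the table above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Entry:
    """One row of the comparison table (hypothetical structure, subset of columns)."""
    ref: int
    work: str
    data_type: Optional[str]  # "Real", "Simulation", "Real + Simulation"; None if not listed
    dataset_shared: str       # "Yes", "Partial", or "No"
    simulation: bool          # value of the "Simulation environment" column


# Values copied from the table above; only three of its columns are kept.
ENTRIES = [
    Entry(1, "Bimanual Cloth Manipulation", "Real", "Partial", False),
    Entry(2, "Household Cloth Object Set", "Real", "Yes", False),
    Entry(3, "Standardization of Cloth Objects", "Real", "Partial", False),
    Entry(4, "SoftGym", "Simulation", "Yes", True),
    Entry(5, "DaXBench", "Simulation", "Yes", True),
    Entry(6, "Sim-to-Real Gap", "Real + Simulation", "Partial", True),
    Entry(7, "GarmentLab", "Simulation", "Yes", True),
    Entry(8, "Cloth Unfolding Benchmark", "Real", "Yes", False),
    Entry(9, "NIST Deformable Object Benchmark", None, "No", False),
    Entry(10, "Flat'n'Fold", "Real", "Yes", False),
    Entry(11, "Cloth Folding Dataset (UGent)", "Real", "Yes", False),
    Entry(12, "Cloth Tracking Dataset", "Real", "Yes", False),
    Entry(13, "DexGarmentLab", "Simulation", "Yes", True),
]


def refs_where(predicate: Callable[[Entry], bool]) -> List[int]:
    """Return the reference numbers of all entries that satisfy the predicate."""
    return [e.ref for e in ENTRIES if predicate(e)]


# Works with real-world data only and a fully shared dataset.
real_shared = refs_where(lambda e: e.data_type == "Real" and e.dataset_shared == "Yes")

# Works that provide a simulation environment.
simulated = refs_where(lambda e: e.simulation)

print(real_shared)  # [2, 8, 10, 11, 12]
print(simulated)    # [4, 5, 6, 7, 13]
```

The two queries mirror the split discussed in the text: fully shared real-world datasets on one side and simulation platforms on the other. The only entry combining real and simulated data is the sim-to-real benchmark [6].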

References

[1] Garcia-Camacho et al., Benchmarking Bimanual Cloth Manipulation, RA-L 2020.
[2] Garcia-Camacho et al., Household Cloth Object Set, RA-L 2022.
[3] Garcia-Camacho et al., Standardization of Cloth Objects and its Relevance in Robotic Manipulation, ICRA 2024.
[4] Lin et al., SoftGym: Benchmarking Deep Reinforcement Learning for Deformable Object Manipulation, CoRL 2020.
[5] Chen et al., DaXBench, ICLR 2023.
[6] Blanco-Mulero et al., Benchmarking the Sim-to-Real Gap in Cloth Manipulation, RA-L 2024.
[7] Lu et al., GarmentLab, NeurIPS 2024.
[8] De Gusseme et al., Cloth Unfolding Benchmark from ICRA 2024 competition, IJRR 2025.
[9] Kimble et al., Performance Measures to Benchmark Deformable Object Manipulation, Frontiers in Robotics and AI 2022.
[10] Zhuang et al., Flat’n’Fold: A Diverse Multi-Modal Dataset for Garment Perception and Manipulation, ICRA 2025.
[11] Verleysen et al., Human Demonstrations of Cloth Folding Dataset, IJRR 2020.
[12] Coltraro et al., Cloth Tracking Dataset, IJRR 2025.
[13] Wang et al., DexGarmentLab, 2025.