CV Arxiv Daily

Updated on 2026.04.03

3D Segmentation

Publish Date	Title	Authors	PDF	Code
2025-07-21	A Study of Anatomical Priors for Deep Learning-Based Segmentation of Pheochromocytoma in Abdominal CT	Tanjin Taher Toma et.al.	2507.15193	null
2025-07-20	TriCLIP-3D: A Unified Parameter-Efficient Framework for Tri-Modal 3D Visual Grounding based on CLIP	Fan Li et.al.	2507.14904	null
2025-07-18	TexGS-VolVis: Expressive Scene Editing for Volume Visualization via Textured Gaussian Splatting	Kaiyuan Tang et.al.	2507.13586	null
2025-07-15	ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition	Ronggang Huang et.al.	2507.11261	null
2025-07-15	Acquiring and Adapting Priors for Novel Tasks via Neural Meta-Architectures	Sudarshan Babu et.al.	2507.10446	null
2025-07-08	OTAS: Open-vocabulary Token Alignment for Outdoor Segmentation	Simon Schwaiger et.al.	2507.08851	null
2025-07-10	SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes	Jiaxin Huang et.al.	2507.07781	null
2025-07-10	MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation	Bangning Wei et.al.	2507.07519	null
2025-07-09	A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding	Zhenyang Liu et.al.	2507.06719	null
2025-07-08	DreamArt: Generating Interactable Articulated Objects from a Single Image	Ruijie Lu et.al.	2507.05763	null
2025-07-07	All in One: Visual-Description-Guided Unified Point Cloud Segmentation	Zongyan Han et.al.	2507.05211	null
2025-07-05	PLUS: Plug-and-Play Enhanced Liver Lesion Diagnosis Model on Non-Contrast CT Scans	Jiacheng Hao et.al.	2507.03872	null
2025-07-01	Audio-3DVG: Unified Audio - Point Cloud Fusion for 3D Visual Grounding	Duc Cao-Dinh et.al.	2507.00669	null
2025-06-30	PGOV3D: Open-Vocabulary 3D Semantic Segmentation with Partial-to-Global Curriculum	Shiqi Zhang et.al.	2506.23607	null
2025-06-27	SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding	Zhao Jin et.al.	2506.21924	null
2025-06-26	SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark	Alex Costanzino et.al.	2506.21549	null
2025-06-26	GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding	Zijun Lin et.al.	2506.21188	null
2025-06-24	ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation	Ahmad Mustafa et.al.	2506.19687	null
2025-06-22	Auto-Regressive Surface Cutting	Yang Li et.al.	2506.18017	null
2025-06-17	I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs	Yu Qi et.al.	2506.14495	null
2025-06-20	Cross-Modal Geometric Hierarchy Fusion: An Implicit-Submap Driven Framework for Resilient 3D Place Recognition	Xiaohui Jiang et.al.	2506.14243	link
2025-06-17	Unified Representation Space for 3D Visual Grounding	Yinuo Zheng et.al.	2506.14238	null
2025-06-09	PIG: Physically-based Multi-Material Interaction with 3D Gaussians	Zeyu Xiao et.al.	2506.07657	null
2025-06-06	NeurNCD: Novel Class Discovery via Implicit Neural Representation	Junming Wang et.al.	2506.06412	null
2025-06-05	From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes	Tianxu Wang et.al.	2506.04897	null
2025-06-05	Midplane based 3D single pass unbiased segment-to-segment contact interaction using penalty method	Indrajeet Sahu et.al.	2506.04841	null
2025-06-05	OpenMaskDINO3D : Reasoning 3D Segmentation via Large Language Model	Kunshen Zhang et.al.	2506.04837	link
2025-05-28	Zero-Shot 3D Visual Grounding from Vision-Language Models	Rong Li et.al.	2505.22429	null
2025-05-26	Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging	Ho Hin Lee et.al.	2505.19603	null
2025-05-23	SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding	Xuerui Qiu et.al.	2505.17674	null
2025-06-03	A Unified Multi-Scale Attention-Based Network for Automatic 3D Segmentation of Lung Parenchyma & Nodules In Thoracic CT Images	Muhammad Abdullah et.al.	2505.17602	link
2025-05-23	From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation	Mahmoud Chick Zaouali et.al.	2505.17402	null
2025-05-18	Attention-Enhanced U-Net for Accurate Segmentation of COVID-19 Infected Lung Regions in CT Scans	Amal Lahchim et.al.	2505.12298	null
2025-05-17	iSegMan: Interactive Segment-and-Manipulate 3D Gaussians	Yian Zhao et.al.	2505.11934	null
2025-05-15	MOSAIC: A Multi-View 2.5D Organ Slice Selector with Cross-Attentional Reasoning for Anatomically-Aware CT Localization in Medical Organ Segmentation	Hania Ghouse et.al.	2505.10672	null
2025-05-27	HWA-UNETR: Hierarchical Window Aggregate UNETR for 3D Multimodal Gastric Lesion Segmentation	Jiaming Liang et.al.	2505.10464	link
2025-05-13	Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving	Zongchuang Zhao et.al.	2505.08725	link
2025-05-08	DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding	Henry Zheng et.al.	2505.04965	null
2025-05-20	AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding	Feng Xiao et.al.	2505.04058	link
2025-05-04	Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving	Alexey Nekrasov et.al.	2505.02148	null
2025-05-03	Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes	Jie Liu et.al.	2505.01726	null
2025-04-30	SAM4EM: Efficient memory-based two stage prompt-free segment anything model adapter for complex 3D neuroscience electron microscopy stacks	Uzair Shah et.al.	2504.21544	link
2025-05-04	Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation	Victoria Yue Chen et.al.	2504.19718	null
2025-04-24	OmniMamba4D: Spatio-temporal Mamba for longitudinal CT lesion segmentation	Justin Namuk Kim et.al.	2504.09655	null
2025-04-13	Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding	Atharv Mahesh Mane et.al.	2504.09623	link
2025-04-11	DSM: Building A Diverse Semantic Map for 3D Visual Grounding	Qinghongbing Xie et.al.	2504.08307	null
2025-04-09	MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs	Jiawei Mao et.al.	2504.06897	null
2025-04-08	InvNeRF-Seg: Fine-Tuning a Pre-Trained NeRF for 3D Object Segmentation	Jiangsan Zhao et.al.	2504.05751	null
2025-04-01	Deconver: A Deconvolutional Network for Medical Image Segmentation	Pooya Ashtari et.al.	2504.00302	link
2025-03-30	ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning	Zhenyang Liu et.al.	2503.23297	null
2025-03-28	TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting	Boyang et.al.	2503.22676	null
2025-03-28	NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving	Fuhao Li et.al.	2503.22436	null
2025-03-28	Segment then Splat: A Unified Approach for 3D Open-Vocabulary Segmentation based on Gaussian Splatting	Yiren Lu et.al.	2503.22204	null
2025-03-26	COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting	Jiaxin Zhang et.al.	2503.19443	link
2025-03-24	DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation	Karim Abou Zeid et.al.	2503.18944	link
2025-03-24	ZECO: ZeroFusion Guided 3D MRI Conditional Generation	Feiran Wang et.al.	2503.18246	link
2025-03-23	PanopticSplatting: End-to-End Panoptic Gaussian Splatting	Yuxuan Xie et.al.	2503.18073	null
2025-03-19	SPNeRF: Open Vocabulary 3D Neural Scene Segmentation with Superpoints	Weiwen Hu et.al.	2503.15712	null
2025-03-19	Federated Continual 3D Segmentation With Single-round Communication	Can Peng et.al.	2503.15414	null
2025-03-18	Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting	Runsong Zhu et.al.	2503.14029	link
2025-03-17	Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation	Yanlin Xiang et.al.	2503.12853	null
2025-03-12	QuickDraw: Fast Visualization, Analysis and Active Learning for Medical Image Segmentation	Daniel Syomichev et.al.	2503.09885	link
2025-03-17	WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images	Yansong Guo et.al.	2503.08407	null
2025-03-11	nnInteractive: Redefining 3D Promptable Segmentation	Fabian Isensee et.al.	2503.08373	link
2025-03-11	Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving	Runwei Guan et.al.	2503.08336	null
2025-03-10	SegResMamba: An Efficient Architecture for 3D Medical Image Segmentation	Badhan Kumar Das et.al.	2503.07766	null
2025-03-07	HexPlane Representation for 3D Semantic Scene Understanding	Zeren Chen et.al.	2503.05127	null
2025-03-03	OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging	Yijie Tang et.al.	2503.01309	null
2025-02-27	Open-Vocabulary Semantic Part Segmentation of 3D Human	Keito Suzuki et.al.	2502.19782	null
2025-02-27	Deep Learning-Based Approach for Automatic 2D and 3D MRI Segmentation of Gliomas	Kiranmayee Janardhan et.al.	2502.19760	null
2025-02-27	ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding	Qihang Peng et.al.	2502.19247	null
2025-02-26	Subclass Classification of Gliomas Using MRI Fusion Technique	Kiranmayee Janardhan et.al.	2502.18775	null
2025-02-22	Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field	Wenhao Hu et.al.	2502.16303	null
2025-02-20	Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing	Yoel Levy et.al.	2502.14789	null
2025-02-19	Pericoronary adipose tissue attenuation as a predictor of functional severity of coronary stenosis	Marta Pillitteri et.al.	2502.13649	null
2025-02-18	Learning Wall Segmentation in 3D Vessel Trees using Sparse Annotations	Hinrich Rahlfs et.al.	2502.12801	null
2025-02-14	Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding	Wenxuan Guo et.al.	2502.10392	link
2025-02-04	Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation	Junha Lee et.al.	2502.02548	null
2025-02-20	Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection	Boyu Mi et.al.	2502.01401	link
2025-02-01	Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings	Zachary Huemann et.al.	2502.00528	null
2025-01-31	Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields	Xingyu Miao et.al.	2501.19084	link
2025-01-30	Full-Head Segmentation of MRI with Abnormal Brain Anatomy: Model and Data Release	Andrew M Birnbaum et.al.	2501.18716	link
2025-01-29	3DSES: an indoor Lidar point cloud segmentation dataset with real and pseudo-labels from a 3D model	Maxime Mérizette et.al.	2501.17534	null
2025-01-27	CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation	Xiaochuan Ma et.al.	2501.16246	null
2025-01-18	No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling	Young Seok Jeon et.al.	2501.10814	null
2025-01-16	AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring	Xinyi Wang et.al.	2501.09428	null
2025-01-17	Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding	Kohei Torimi et.al.	2501.09278	null
2025-01-12	3DCoMPaT200: Language-Grounded Compositional Understanding of Parts and Materials of 3D Shapes	Mahmoud Ahmed et.al.	2501.06785	link
2025-01-10	Swin-X2S: Reconstructing 3D Shape from 2D Biplanar X-ray with Swin Transformers	Kuan Liu et.al.	2501.05961	null
2025-01-07	Self-adaptive vision-language model for 3D segmentation of pulmonary artery and vein	Xiaotong Guo et.al.	2501.03722	null
2025-01-09	GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models	Zhangyang Qi et.al.	2501.01428	link
2025-01-02	ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding	Austin T. Wang et.al.	2501.01366	null
2024-12-31	OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies	Runnan Chen et.al.	2501.00326	null
2024-12-28	Advances in Additive Manufacturing of 3D-segmented Plastic Scintillator Detectors for Particle Tracking and Calorimetry	Umut Kose et.al.	2412.20267	null
2024-12-24	LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding	Hao Li et.al.	2412.17635	null
2024-12-22	GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs	Xingrui Wang et.al.	2412.16932	link
2024-12-18	MobiFuse: A High-Precision On-device Depth Perception System with Multi-Data Fusion	Jinrui Zhang et.al.	2412.13848	null
2024-12-14	DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting	Luis Wiedmann et.al.	2412.10972	link

Reasoning Segmentation

Publish Date	Title	Authors	PDF	Code
2025-07-22	Temporally-Constrained Video Reasoning Segmentation and Automated Benchmark Construction	Yiqing Shen et.al.	2507.16718	null
2025-07-17	HRSeg: High-Resolution Visual Perception and Enhancement for Reasoning Segmentation	Weihuang Lin et.al.	2507.12883	null
2025-07-10	SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes	Jiaxin Huang et.al.	2507.07781	null
2025-07-04	Controlling Thinking Speed in Reasoning Models	Zhengkai Lin et.al.	2507.03704	null
2025-06-29	Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation	Zhenhua Ning et.al.	2506.23120	null
2025-06-27	Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning	Zuyao You et.al.	2506.22624	null
2025-06-12	MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models	Yu Huang et.al.	2506.10465	null
2025-06-11	Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations	Yizhen Li et.al.	2506.07943	null
2025-06-05	OpenMaskDINO3D : Reasoning 3D Segmentation via Large Language Model	Kunshen Zhang et.al.	2506.04837	link
2025-06-04	RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought	Yi Lu et.al.	2506.04277	null
2025-05-29	PixelThink: Towards Efficient Chain-of-Pixel Reasoning	Song Wang et.al.	2505.23727	null
2025-06-15	PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding	Ansel Blume et.al.	2505.20759	null
2025-05-24	Reasoning Segmentation for Images and Videos: A Survey	Yiqing Shen et.al.	2505.18816	null
2025-05-22	PRS-Med: Position Reasoning Segmentation with Vision-Language Model in Medical Imaging	Quoc-Huy Trinh et.al.	2505.11872	null
2025-05-17	RVTBench: A Benchmark for Visual Reasoning Tasks	Yiqing Shen et.al.	2505.11838	link
2025-05-05	LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery	Jerome Quenum et.al.	2505.02829	null
2025-04-17	SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding	Qianqian Sun et.al.	2504.12704	null
2025-04-23	MediSee: Reasoning-based Pixel-level Perception in Medical Images	Qinyue Tong et.al.	2504.11008	null
2025-04-15	LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation	Hanning Chen et.al.	2504.10854	null
2025-04-01	POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation	Lanyun Zhu et.al.	2504.00640	null
2025-03-27	Online Reasoning Video Segmentation with Just-in-Time Digital Twins	Yiqing Shen et.al.	2503.21056	null
2025-03-26	Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins	Yiqing Shen et.al.	2503.21054	null
2025-03-23	MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation	Jiaxin Huang et.al.	2503.18135	null
2025-03-19	VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation	Shoubin Yu et.al.	2503.14350	null
2025-03-18	MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation	Donggon Jang et.al.	2503.13881	link
2025-03-13	Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA	Zhixuan Li et.al.	2503.10225	null
2025-03-11	TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement	Miao Zhang et.al.	2503.08168	null
2025-03-25	Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts	Shiu-hong Kao et.al.	2503.07503	null
2025-03-13	InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models	Yuchen Yan et.al.	2503.06692	null
2025-03-09	Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement	Yuqi Liu et.al.	2503.06520	link
2025-03-04	UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface	Hao Tang et.al.	2503.01342	link
2025-02-13	Pixel-Level Reasoning Segmentation via Multi-turn Conversations	Dexian Cai et.al.	2502.09447	link
2025-01-15	The Devil is in Temporal Token: High Quality Video Reasoning Segmentation	Sitong Gong et.al.	2501.08549	link
2024-12-19	PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation	Muntasir Wahed et.al.	2412.15209	null
2024-12-18	InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models	Cong Wei et.al.	2412.14006	link
2024-12-02	HyperSeg: Towards Universal Visual Segmentation with Large Language Model	Cong Wei et.al.	2411.17606	link
2024-11-21	Multimodal 3D Reasoning Segmentation with Complex Scenes	Xueying Jiang et.al.	2411.13927	null
2024-11-15	Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level	Andong Deng et.al.	2411.09921	null
2024-10-31	SegLLM: Multi-round Reasoning Segmentation	XuDong Wang et.al.	2410.18923	null
2024-09-29	One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos	Zechen Bai et.al.	2409.19603	link
2024-09-20	Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model	Li Zhou et.al.	2409.13407	link
2025-02-10	Visual Agents as Fast and Slow Thinkers	Guangyan Sun et.al.	2408.08862	link

3D Generative

Publish Date	Title	Authors	PDF	Code
2025-07-23	Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention	Yiwen Chen et.al.	2507.17745	null
2025-07-23	EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion	Shang Liu et.al.	2507.16535	null
2025-07-20	Towards Geometric and Textural Consistency 3D Scene Generation via Single Image-guided Model Generation and Layout Optimization	Xiang Tang et.al.	2507.14841	null
2025-07-19	AutoPartGen: Autogressive 3D Part Generation and Discovery	Minghao Chen et.al.	2507.13346	null
2025-07-20	PhysX-3D: Physical-Grounded 3D Asset Generation	Ziang Cao et.al.	2507.12465	null
2025-07-15	Acquiring and Adapting Priors for Novel Tasks via Neural Meta-Architectures	Sudarshan Babu et.al.	2507.10446	null
2025-07-13	Advancing Text-to-3D Generation with Linearized Lookahead Variational Score Distillation	Yu Lei et.al.	2507.09748	null
2025-07-11	From One to More: Contextual Part Latents for 3D Generation	Shaocong Dong et.al.	2507.08772	null
2025-07-21	InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes	Zesong Yang et.al.	2507.08416	null
2025-07-08	OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion	Yunhan Yang et.al.	2507.06165	null
2025-07-08	DreamArt: Generating Interactable Articulated Objects from a Single Image	Ruijie Lu et.al.	2507.05763	null
2025-07-07	Modelling the 3D atmospheric structure of the cold Jupiter WD1856+534b orbiting a white dwarf	Pascal A. Noti et.al.	2507.05422	null
2025-07-07	SegmentDreamer: Towards High-fidelity Text-to-3D Synthesis with Segmented Consistency Trajectory Distillation	Jiahao Zhu et.al.	2507.05256	null
2025-07-03	Three-dimensional Transport-induced Chemistry on Temperate sub-Neptune K2-18b, Part I: the Effects of Atmospheric Dynamics	Jiachen Liu et.al.	2506.23891	null
2025-06-30	Refine Any Object in Any Scene	Ziwei Chen et.al.	2506.23835	null
2025-06-29	AlignCVC: Aligning Cross-View Consistency for Single-Image-to-3D Generation	Xinyue Liang et.al.	2506.23150	null
2025-06-26	Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image	Pufan Li et.al.	2506.21152	null
2025-06-25	WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration	Chaojun Ni et.al.	2506.20590	null
2025-06-23	3D Arena: An Open Platform for Generative 3D Evaluation	Dylan Ebert et.al.	2506.18787	null
2025-06-23	Geometry-Aware Preference Learning for 3D Texture Generation	AmirHossein Zamani et.al.	2506.18331	null
2025-06-13	VEIGAR: View-consistent Explicit Inpainting and Geometry Alignment for 3D object Removal	Pham Khai Nguyen Do et.al.	2506.15821	null
2025-06-18	Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards	Qingming Liu et.al.	2506.15684	null
2025-06-18	Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material	Team Hunyuan3D et.al.	2506.15442	link
2025-06-17	RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills	Chunru Lin et.al.	2506.14763	null
2025-06-16	Disentangling 3D from Large Vision-Language Models for Controlled Portrait Generation	Nick Yiwen Huang et.al.	2506.14015	null
2025-06-16	Dive3D: Diverse Distillation-based Text-to-3D Generation via Score Implicit Matching	Weimin Bai et.al.	2506.13594	null
2025-06-11	DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision	Xiandong Zou et.al.	2506.09814	null
2025-06-10	Orientation Matters: Making 3D Generative Models Orientation-Aligned	Yichong Lu et.al.	2506.08640	null
2025-06-09	Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor	Rishit Dagli et.al.	2506.07932	null
2025-06-09	R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation	William Ljungbergh et.al.	2506.07826	null
2025-06-09	NOVA3D: Normal Aligned Video Diffusion Model for Single Image to 3D Generation	Yuxiao Yang et.al.	2506.07698	null
2025-06-05	PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers	Yuchen Lin et.al.	2506.05573	null
2025-06-02	ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding	Junliang Ye et.al.	2506.01853	link
2025-05-31	ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary	Zeqi Gu et.al.	2506.00742	null
2025-05-30	LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework	Xin Kang et.al.	2505.24245	null
2025-05-29	Universal Radial Scaling of Large-Scale Black Hole Accretion for Magnetically Arrested And Rocking Accretion Disks	Aretaios Lalakos et.al.	2505.23888	null
2025-05-28	Advancing high-fidelity 3D and Texture Generation with 2.5D latents	Xin Yang et.al.	2505.21050	null
2025-05-27	Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction	Yifei Wang et.al.	2505.20755	null
2025-05-30	ART-DECO: Arbitrary Text Guidance for 3D Detailizer Construction	Qimin Chen et.al.	2505.20431	null
2025-05-26	Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling	Junhong Lee et.al.	2505.19868	null
2025-05-26	Global stability for the compressible isentropic magnetohydrodynamic equations in 3D bounded domains with Navier-slip boundary conditions	Yang Liu et.al.	2505.19749	null
2025-05-23	SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation	Dekai Zhu et.al.	2505.17721	null
2025-05-26	Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention	Shuang Wu et.al.	2505.17412	null
2025-05-22	MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM	Siwei Meng et.al.	2505.16456	null
2025-05-21	Constructing a 3D Town from a Single Image	Kaizhi Zheng et.al.	2505.15765	null
2025-05-20	Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image	Yuxuan Wang et.al.	2505.14537	null
2025-05-21	Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling	Zhihao Li et.al.	2505.14521	null
2025-05-19	Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction	Yuanbo Wang et.al.	2505.13091	null
2025-05-15	Pharmacophore-Conditioned Diffusion Model for Ligand-Based De Novo Drug Design	Amira Alakhdar et.al.	2505.10545	null
2025-05-13	Long timescale numerical simulations of large, super-critical accretion discs	P. Chris Fragile et.al.	2505.08859	null
2025-05-12	Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets	Weiyu Li et.al.	2505.07747	null
2025-05-11	CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation	Peng Li et.al.	2505.07003	null
2025-05-07	Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation	Yiming Qin et.al.	2505.05505	link
2025-05-07	Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond	Jessie Richter-Powell et.al.	2505.04621	null
2025-05-07	Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting	Feng Yang et.al.	2505.04262	null
2025-05-07	S3D: Sketch-Driven 3D Model Generation	Hail Song et.al.	2505.04185	link
2025-05-06	Effects of transient stellar emissions on planetary climates of tidally-locked exo-earths	Howard Chen et.al.	2505.03723	null
2025-05-03	Rethinking Score Distilling Sampling for 3D Editing and Generation	Xingyu Miao et.al.	2505.01888	null
2025-04-30	3D Stylization via Large Reconstruction Model	Ipek Oztas et.al.	2504.21836	null
2025-04-29	A 3D pocket-aware and affinity-guided diffusion model for lead optimization	Anjie Qiao et.al.	2504.21065	null
2025-04-28	CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback	Chenhan Jiang et.al.	2504.19860	null
2025-04-27	Making Physical Objects with Generative AI and Robotic Assembly: Considering Fabrication Constraints, Sustainability, Time, Functionality, and Accessibility	Alexander Htet Kyaw et.al.	2504.19131	null
2025-04-25	Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation	Shivam Duggal et.al.	2504.18509	null
2025-04-24	DiMeR: Disentangled Mesh Reconstruction Model	Lutao Jiang et.al.	2504.17670	link
2025-04-23	Global stability for compressible isentropic Navier-Stokes equations in 3D bounded domains with Navier-slip boundary conditions	Yang Liu et.al.	2504.17136	null
2025-04-22	Text-based Animatable 3D Avatars with Morphable Model Alignment	Yiqian Wu et.al.	2504.15835	link
2025-04-21	Cyc3D: Fine-grained Controllable 3D Generation via Cycle Consistency Regularization	Hongbin Xu et.al.	2504.14975	null
2025-04-17	HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation	Wenqi Dong et.al.	2504.13072	null
2025-04-17	RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins	Yao Mu et.al.	2504.13059	null
2025-04-17	SOPHY: Generating Simulation-Ready Objects with Physical Materials	Junyi Cao et.al.	2504.12684	null
2025-04-16	Recent Advance in 3D Object and Scene Generation: A Survey	Xiang Tang et.al.	2504.11734	null
2025-04-15	3D full-GR simulations of magnetorotational core-collapse supernovae on GPUs: A systematic study of rotation rates and magnetic fields	Swapnil Shankar et.al.	2504.11537	null
2025-04-14	Art3D: Training-Free 3D Generation from Flat-Colored Illustration	Xiaoyan Cong et.al.	2504.10466	null
2025-04-14	ESCT3D: Efficient and Selectively Controllable Text-Driven 3D Content Generation with Gaussian Splatting	Huiqi Wu et.al.	2504.10316	null
2025-04-16	GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting	Junlin Hao et.al.	2504.10001	null
2025-04-11	GeoTexBuild: 3D Building Model Generation from Map Footprints	Ruizhe Wang et.al.	2504.08419	null
2025-04-11	Generative AI for Film Creation: A Survey of Recent Advances	Ruihan Zhang et.al.	2504.08296	null
2025-04-10	Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects	Shalini Maiti et.al.	2504.08125	null
2025-04-10	ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian Splatting	Junbang Liu et.al.	2504.08100	link
2025-04-11	Objaverse++: Curated 3D Object Dataset with Quality Annotations	Chendi Lin et.al.	2504.07334	link
2025-04-10	Stochastic Ray Tracing of 3D Transparent Gaussians	Xin Sun et.al.	2504.06598	null
2025-04-05	Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization	Yikai Wang et.al.	2504.04153	link
2025-04-04	D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations	Antoine Dumoulin et.al.	2504.03468	null
2025-04-03	Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization	Kangle Deng et.al.	2504.02817	null
2025-04-03	ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation	Yuan Zhou et.al.	2504.02316	link
2025-04-03	WonderTurbo: Generating Interactive 3D World in 0.72 Seconds	Chaojun Ni et.al.	2504.02261	null
2025-04-02	WorldPrompter: Traversable Text-to-Scene Generation	Zhaoyang Zhang et.al.	2504.02045	null
2025-04-02	3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting	Hao Wu et.al.	2504.01619	null
2025-04-02	High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model	Yiyang Shen et.al.	2504.01512	null
2025-04-03	Distilling Multi-view Diffusion Models into 3D Generators	Hao Qin et.al.	2504.00457	null
2025-03-31	Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes	Daichi Otsuka et.al.	2503.24229	null
2025-03-28	DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness	Ruining Li et.al.	2503.22677	null
2025-03-28	Clouds and Hazes in GJ 1214b’s Metal-Rich Atmosphere	Isaac Malsky et.al.	2503.22608	null
2025-03-28	CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving	Yishen Ji et.al.	2503.22231	null
2025-03-27	3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models	Yuhan Zhang et.al.	2503.21745	null
2025-03-27	Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data	Zhiyuan Ma et.al.	2503.21694	link
2025-03-29	GenFusion: Closing the Loop between Reconstruction and Generation via Videos	Sibo Wu et.al.	2503.21219	null
2025-03-26	FB-4D: Spatial-Temporal Coherent Dynamic 3D Content Generation with Feature Banks	Jinwei Li et.al.	2503.20784	link
2025-03-27	MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation	Jinnan Chen et.al.	2503.20519	null
2025-03-24	MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing	Lingting Zhu et.al.	2503.18461	null
2025-03-23	Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook	Xu Zheng et.al.	2503.18016	null
2025-03-20	SynCity: Training-Free Generation of 3D Worlds	Paul Engstler et.al.	2503.16420	null
2025-03-26	Unleashing Vecset Diffusion Model for Fast Shape Generation	Zeqiang Lai et.al.	2503.16302	link
2025-03-21	Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens	Shuqi Lu et.al.	2503.16278	link
2025-03-20	Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation	Tiange Xiang et.al.	2503.15877	null
2025-03-19	Shap-MeD	Nicolás Laverde et.al.	2503.15562	null
2025-03-18	MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling	Damian Boborzi et.al.	2503.14002	link
2025-03-17	Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images	Tianhao Wu et.al.	2503.13439	null
2025-03-16	VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting	Songen Gu et.al.	2503.12383	null
2025-03-15	DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting	Utkarsh Nath et.al.	2503.11981	null
2025-03-14	PBR3DGen: A VLM-guided Mesh Generation with High-quality PBR Texture	Xiaokang Wei et.al.	2503.11368	null
2025-03-08	Text-to-3D Generation using Jensen-Shannon Score Distillation	Khoi Do et.al.	2503.10660	null
2025-03-13	Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders	Jingyu Guo et.al.	2503.10403	null
2025-03-13	RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling	Itay Chachy et.al.	2503.09601	link
2025-03-11	MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention	Yuhan Wang et.al.	2503.08664	link
2025-03-12	CDI3D: Cross-guided Dense-view Interpolation for 3D Reconstruction	Zhiyuan Wu et.al.	2503.08005	null
2025-03-10	DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation	Xiaoliang Ju et.al.	2503.06900	null
2025-03-09	A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation	Jiajie Fan et.al.	2503.06485	null
2025-03-08	GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation	Ye Tao et.al.	2503.06136	null
2025-03-07	Decay of solutions of nonlinear Dirac equations	Sebastian Herr et.al.	2503.05410	null
2025-03-06	Simulating the Real World: A Unified Survey of Multimodal Generative Models	Yuqi Hu et.al.	2503.04641	link
2025-03-03	On the behavior of the Generalized Alignment Index (GALI) method for dissipative systems	Henok Tenaw Moges et.al.	2503.01784	null
2025-03-03	The Interplay between Dust Dynamics and Turbulence Induced by the Vertical Shear Instability	Pinghui Huang et.al.	2503.01656	null
2025-03-03	Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation	Jiantao Lin et.al.	2503.01370	link
2025-03-02	DreamPrinting: Volumetric Printing Primitives for High-Fidelity 3D Printing	Youjia Wang et.al.	2503.00887	null
2025-03-01	GenVDM: Generating Vector Displacement Maps From a Single Image	Yuezhi Yang et.al.	2503.00605	null
2025-02-28	CADDreamer: CAD object Generation from Single-view Images	Yuan Li et.al.	2502.20732	null
2025-02-27	Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting	Hengyu Meng et.al.	2502.20045	null
2025-02-27	GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors	An Li et.al.	2502.19896	null
2025-02-24	Evidence for Low Universal Equilibrium Black Hole Spin in Luminous Magnetically Arrested Disks	Beverly Lowell et.al.	2502.17559	null
2025-02-24	RELICT: A Replica Detection Framework for Medical Image Generation	Orhun Utku Aydin et.al.	2502.17360	link
2025-02-25	Evolution 6.0: Evolving Robotic Capabilities Through Generative Design	Muhammad Haris Khan et.al.	2502.17034	null
2025-02-23	Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control	Jinbo Yan et.al.	2502.16475	null
2025-02-21	Generative AI Framework for 3D Object Generation in Augmented Reality	Majid Behravan et.al.	2502.15869	null
2025-02-28	WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents	Xinhang Liu et.al.	2502.15601	null
2025-02-20	Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting	Boying Li et.al.	2502.14931	null
2025-02-18	CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image	Kaixin Yao et.al.	2502.12894	null
2025-02-18	RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation	Chenxi Zheng et.al.	2502.12640	null
2025-02-18	NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation	Zhiyuan Liu et.al.	2502.12638	link
2025-02-18	Not-So-Optimal Transport Flows for 3D Point Cloud Generation	Ka-Hei Hui et.al.	2502.12456	null
2025-02-17	A new convection scheme for GCMs of temperate sub-Neptunes	Edouard F. L. Barrier et.al.	2502.12234	null
2025-02-17	GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text	Gyumin Shim et.al.	2502.11642	null
2025-02-13	X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks	Zihang Cheng et.al.	2502.10475	null
2025-02-13	Latent Radiance Fields with 3D-aware 2D Representations	Chaoyi Zhou et.al.	2502.09613	null
2025-02-17	ConsistentDreamer: View-Consistent Meshes Through Balanced Multi-View Gaussian Optimization	Onat Şahin et.al.	2502.09278	null
2025-02-10	Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC	Siwei Meng et.al.	2502.07007	null
2025-02-10	Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene	Tai-Yu Pan et.al.	2502.06682	null
2025-02-10	TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models	Yangguang Li et.al.	2502.06608	link
2025-02-10	Relativistic Gas Accretion onto Supermassive Black Hole Binaries from Inspiral through Merger	Lorenzo Ennoggi et.al.	2502.06389	null
2025-02-05	DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization	Zhenglin Zhou et.al.	2502.04370	null
2025-02-04	ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion	Nissim Maruani et.al.	2502.02187	null
2025-01-31	TRAPPIST-1 d: Exo-Venus, Exo-Earth or Exo-Dead?	M. J. Way et.al.	2502.00132	null
2025-01-29	Towards Training-Free Open-World Classification with 3D Generative Models	Xinzhe Xia et.al.	2501.17547	null
2025-01-28	CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation	Nikolai Kalischek et.al.	2501.17162	null
2025-01-28	DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation	Chenguo Lin et.al.	2501.16764	null
2025-01-27	BAG: Body-Aligned 3D Wearable Asset Generation	Zhongjin Luo et.al.	2501.16177	null
2025-01-26	Comparative clinical evaluation of “memory-efficient” synthetic 3d generative adversarial networks (gan) head-to-head to state of art: results on computed tomography of the chest	Mahshid Shiri et.al.	2501.15572	null
2025-01-22	InsTex: Indoor Scenes Stylized Texture Synthesis	Yunfan Zhang et.al.	2501.13969	null
2025-01-22	Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation	Akshay Krishnan et.al.	2501.13087	null
2025-01-17	Mitigating Hallucinations on Object Attributes using Multiview Images and Negative Instructions	Zhijie Tan et.al.	2501.10011	null
2025-01-16	CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation	Hwan Heo et.al.	2501.09433	link
2025-01-13	UnCommon Objects in 3D	Xingchen Liu et.al.	2501.07574	link
2025-01-12	Synthetic Prior for Few-Shot Drivable Head Avatar Inversion	Wojciech Zielonka et.al.	2501.06903	null
2025-01-09	Consistent Flow Distillation for Text-to-3D Generation	Runjie Yan et.al.	2501.05445	null
2025-01-09	Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation	Xuyi Meng et.al.	2501.05427	null
2025-01-07	Chirpy3D: Continuous Part Latents for Creative 3D Bird Generation	Kam Woh Ng et.al.	2501.04144	link
2025-01-04	Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models	Suttisak Wizadwongsa et.al.	2501.00651	null
2024-12-30	PERSE: Personalized 3D Generative Avatars from A Single Portrait	Hyunsoo Cha et.al.	2412.21206	null
2025-01-02	Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation	Yuanbo Yang et.al.	2412.21117	null
2024-12-29	Toward Scene Graph and Layout Guided Complex 3D Scene Generation	Yu-Hsiang Huang et.al.	2412.20473	null
2024-12-26	Habitability in 4-D: Predicting the Climates of Earth Analogs across Rotation and Orbital Configurations	Arthur D. Adams et.al.	2412.19357	link
2024-12-29	PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models	Minghao Chen et.al.	2412.18608	null
2024-12-23	ArchComplete: Autoregressive 3D Architectural Design Generation with Hierarchical Diffusion-Based Upsampling	S. Rasoulzadeh et.al.	2412.17957	link
2024-12-21	GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space	Souhaib Attaiki et.al.	2412.16717	null
2024-12-18	AdvIRL: Reinforcement Learning-Based Adversarial Attacks on 3D NeRF Models	Tommy Nguyen et.al.	2412.16213	link
2024-12-20	GCA-3D: Towards Generalized and Consistent Domain Adaptation of 3D Generators	Hengjia Li et.al.	2412.15491	null
2024-12-18	DreaMark: Rooting Watermark in Score Distillation Sampling Generated Neural Radiance Fields	Xingyu Zhu et.al.	2412.15278	null
2024-12-19	DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation	Wang Zhao et.al.	2412.15200	null
2024-12-19	LiftRefine: Progressively Refined View Synthesis from 3D Lifting with Volume-Triplane Representations	Tung Do et.al.	2412.14464	null
2024-12-18	GraphicsDreamer: Image to 3D Generation with Physical Consistency	Pei Chen et.al.	2412.14214	null
2024-12-15	Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation	Yujie Zhang et.al.	2412.11170	null
2024-12-17	Virtual Trial Room with Computer Vision and Machine Learning	Tulashi Prasad Joshi et.al.	2412.10710	null
2024-12-13	GT23D-Bench: A Comprehensive General Text-to-3D Generation Benchmark	Sitong Su et.al.	2412.09997	null
2024-12-11	DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models	Kevin Miao et.al.	2412.09648	null
2024-12-19	SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing	Xueting Li et.al.	2412.09545	null
2024-12-09	Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation	Ruihan Gao et.al.	2412.06785	link
2024-12-09	Diverse Score Distillation	Yanbo Xu et.al.	2412.06780	null
2024-12-14	You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale	Baorui Ma et.al.	2412.06699	link
2024-12-09	Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy	Yuxuan Xue et.al.	2412.06698	null
2024-12-08	Enhanced 3D Generation by 2D Editing	Haoran Li et.al.	2412.05929	null
2024-12-07	Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation	Wenqing Wang et.al.	2412.05560	null
2024-12-06	DNF: Unconditional 4D Generation with Dictionary-based Neural Fields	Xinyi Zhang et.al.	2412.05161	null
2024-12-05	PaintScene4D: Consistent 4D Scene Generation from Text Prompts	Vinayak Gupta et.al.	2412.04471	null
2024-12-05	Turbo3D: Ultra-fast Text-to-3D Generation	Hanzhe Hu et.al.	2412.04470	null
2024-12-05	InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models	Yifan Lu et.al.	2412.03934	null
2024-12-04	MV-Adapter: Multi-view Consistent Image Generation Made Easy	Zehuan Huang et.al.	2412.03632	null
2024-12-04	MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation	Zehuan Huang et.al.	2412.03558	null
2024-12-04	CLAS: A Machine Learning Enhanced Framework for Exploring Large 3D Design Datasets	XiuYu Zhang et.al.	2412.02996	null
2024-12-03	Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation	Yiftach Edelstein et.al.	2412.02631	null
2024-12-03	Continual Learning of Personalized Generative Face Models with Experience Replay	Annie N. Wang et.al.	2412.02627	null
2024-12-03	HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset	Zedong Chu et.al.	2412.02317	link
2024-12-03	Viewpoint Consistency in 3D Generation via Attention and CLIP Guidance	Qing Zhang et.al.	2412.02287	null
2024-12-03	3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation	Jinzhi Zhang et.al.	2412.02202	null
2024-12-03	CLERF: Contrastive LEaRning for Full Range Head Pose Estimation	Ting-Ruen Wei et.al.	2412.02066	null
2024-12-02	World-consistent Video Diffusion with Explicit 3D Modeling	Qihang Zhang et.al.	2412.01821	null
2024-12-02	Structured 3D Latents for Scalable and Versatile 3D Generation	Jianfeng Xiang et.al.	2412.01506	link
2024-11-30	Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects	Amir Barda et.al.	2412.00518	null
2024-11-28	3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes	Tejaswini Medi et.al.	2411.19037	null
2024-11-28	RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning	Jiacheng Wang et.al.	2411.18866	null
2024-11-27	G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation	Tianxing Chen et.al.	2411.18369	null
2024-11-27	ModeDreamer: Mode Guiding Score Distillation for Text-to-3D Generation using Reference Image Prompts	Uy Dieu Tran et.al.	2411.18135	null
2024-11-26	Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation	Xiang Li et.al.	2411.17763	null
2024-11-27	SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE	Yongwei Chen et.al.	2411.16856	null
2024-11-27	DetailGen3D: Generative 3D Geometry Enhancement via Data-Dependent Flow	Ken Deng et.al.	2411.16820	null
2024-11-25	SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis	Hyojun Go et.al.	2411.16443	link
2024-11-24	Fixing the Perspective: A Critical Examination of Zero-1-to-3	Jack Yu et.al.	2411.15706	null
2024-11-26	Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction	Huiwon Jang et.al.	2411.14762	null
2024-11-22	Any-to-3D Generation via Hybrid Diffusion Supervision	Yijun Fan et.al.	2411.14715	null
2024-11-26	Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation	Yuanhao Cai et.al.	2411.14384	null
2024-11-19	Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting	Haoyu Zhao et.al.	2411.12789	null
2024-11-21	FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting	Fangyu Wu et.al.	2411.12089	null
2024-11-18	sMoRe: Enhancing Object Manipulation and Organization in Mixed Reality Spaces with LLMs and Generative AI	Yunhao Xing et.al.	2411.11752	null
2024-11-18	MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion	Dongseok Shim et.al.	2411.11475	null
2024-11-18	Thickness-dependent Topological Phases and Flat Bands in Rhombohedral Multilayer Graphene	H. B. Xiao et.al.	2411.11359	null
2024-11-17	Direct and Explicit 3D Generation from a Single Image	Haoyu Wu et.al.	2411.10947	null
2024-11-16	ARM: Appearance Reconstruction Model for Relightable 3D Generation	Xiang Feng et.al.	2411.10825	null
2024-11-14	LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models	Zhengyi Wang et.al.	2411.09595	null
2024-11-16	A Survey on Vision Autoregressive Model	Kai Jiang et.al.	2411.08666	null
2024-11-12	GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation	Yushi Lan et.al.	2411.08033	null
2024-11-12	Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings	Aditya Sanghi et.al.	2411.08017	link
2024-11-16	SAMPart3D: Segment Any Part in 3D Objects	Yunhan Yang et.al.	2411.07184	link
2024-11-09	AI-Driven Stylization of 3D Environments	Yuanbo Chen et.al.	2411.06067	null
2024-11-08	Autoregressive Models in Vision: A Survey	Jing Xiong et.al.	2411.05902	link
2024-11-07	DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion	Wenqiang Sun et.al.	2411.04928	null
2024-11-05	Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation	Xianghui Yang et.al.	2411.02293	null
2024-11-03	DreamPolish: Domain Score Distillation With Progressive Geometry Generation	Yean Cheng et.al.	2411.01602	null
2024-10-31	Manipulating Vehicle 3D Shapes through Latent Space Editing	JiangDong Miao et.al.	2410.23931	null
2024-11-01	Fast Transients from Magnetic Disks Around Non-Spinning Collapsar Black Holes	Justin Bopp et.al.	2410.22401	null
2024-10-16	TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt	Jiahui Yang et.al.	2410.21299	null
2024-10-28	CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians	Chongjian Ge et.al.	2410.20723	null
2024-10-30	DiffGS: Functional Gaussian Splatting Diffusion	Junsheng Zhou et.al.	2410.19657	null
2024-10-24	3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation	Hansheng Chen et.al.	2410.18974	link
2024-10-23	GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation	Ruowei Wang et.al.	2410.17802	link
2024-10-23	Under the magnifying glass: A combined 3D model applied to cloudy warm Saturn type exoplanets around M-dwarfs	Sven Kiefer et.al.	2410.17716	null
2024-10-21	MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors	Honghua Chen et.al.	2410.16272	null
2024-10-22	LucidFusion: Generating 3D Gaussians with Arbitrary Unposed Images	Hao He et.al.	2410.15636	null
2024-10-20	Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint	Junwei Zhou et.al.	2410.15391	null
2024-10-16	DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model	Jingxiang Sun et.al.	2410.12928	null
2024-10-15	Robotic Arm Platform for Multi-View Image Acquisition and 3D Reconstruction in Minimally Invasive Surgery	Alexander Saikia et.al.	2410.11703	null
2024-10-15	Evolutionary Retrofitting	Mathurin Videau et.al.	2410.11330	null
2024-10-13	GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation	Dingdong Yang et.al.	2410.10037	null
2024-10-12	ControLRM: Fast and Controllable 3D Generation via Large Reconstruction Model	Hongbin Xu et.al.	2410.09592	null
2024-10-12	Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors	Hritam Basak et.al.	2410.09467	null
2024-10-11	SceneCraft: Layout-Guided 3D Scene Generation	Xiuyu Yang et.al.	2410.09049	link
2024-10-11	Semantic Score Distillation Sampling for Compositional Text-to-3D Generation	Ling Yang et.al.	2410.09009	link
2024-10-11	One-shot Generative Domain Adaptation in 3D GANs	Ziqiang Li et.al.	2410.08824	link
2024-10-10	RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image	Xiaoxue Chen et.al.	2410.08181	null
2024-10-10	SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors	Xiao Cai et.al.	2410.07658	null
2024-10-09	DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation	Zhiqi Li et.al.	2410.06756	null
2024-10-02	OCC-MLLM-Alpha: Empowering Multi-modal Large Language Model for the Understanding of Occluded Objects with Self-Supervised Test-Time Learning	Shuxin Yang et.al.	2410.01861	null
2024-10-02	Towards Native Generative Model for 3D Head Avatar	Yiyu Zhuang et.al.	2410.01226	null
2024-10-01	Extreme scale height variations and nozzle shocks in warped disks	Nicholas Kaaz et.al.	2410.00961	null
2024-10-02	Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation	Junlin Han et.al.	2410.00890	null
2024-09-29	Global well-posedness of the fractional dissipative system in the framework of variable Fourier–Besov spaces	Gastón Vergara-Hermosilla et.al.	2410.00060	null
2024-09-30	Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images	Bahri Batuhan Bilecen et.al.	2409.20530	null
2024-09-27	Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly	Alexander Htet Kyaw et.al.	2409.18390	null
2024-09-26	Long-lived neutron-star remnants from asymmetric binary neutron star mergers: element formation, kilonova signals and gravitational waves	Sebastiano Bernuzzi et.al.	2409.18185	null
2024-09-25	Disco4D: Disentangled 4D Human Generation and Animation from a Single Image	Hui En Pang et.al.	2409.17280	null
2024-09-19	3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion	Zhaoxi Chen et.al.	2409.12957	link
2024-09-18	Vista3D: Unravel the 3D Darkside of a Single Image	Qiuhong Shen et.al.	2409.12193	link
2024-09-17	Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion	Zhenwei Wang et.al.	2409.11406	null
2024-09-16	The Spin Zone: Synchronously and Asynchronously Rotating Exoplanets Have Spectral Differences in Transmission	Nicholas Scarsdale et.al.	2409.10752	null
2024-09-11	DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation	Haibo Yang et.al.	2409.07454	null
2024-09-11	Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models	Haibo Yang et.al.	2409.07452	link
2024-09-11	Some effects of limited wall-sensor availability on flow estimation with 3D-GANs	Antonio Cuéllar et.al.	2409.07348	null
2024-09-11	Detectability Simulations of a NIR Surface Biosignature on Proxima Centauri b with Future Space Observatories	Connor O. Metz et.al.	2409.07289	null
2024-09-12	3DGCQA: A Quality Assessment Database for 3D AI-Generated Contents	Yingjie Zhou et.al.	2409.07236	link
2024-09-10	G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer	Jinzhi Zhang et.al.	2409.06322	null
2024-09-19	DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping	Zeyu Cai et.al.	2409.05099	null
2024-09-04	Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models	Zhibin Liu et.al.	2409.02851	link
2024-09-03	ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis	Wangbo Yu et.al.	2409.02048	null
2024-08-27	OctFusion: Octree-based Diffusion Models for 3D Shape Generation	Bojun Xiong et.al.	2408.14732	link
2024-08-28	PhysPart: Physically Plausible Part Completion for Interactable Objects	Rundong Luo et.al.	2408.13724	null
2024-08-26	Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation	Bonan Li et.al.	2408.13149	null
2024-08-23	Atlas Gaussians Diffusion for 3D Generation with Infinite Number of Points	Haitao Yang et.al.	2408.13055	null
2024-08-22	Multimodal Foundational Models for Unsupervised 3D General Obstacle Detection	Tamás Matuszka et.al.	2408.12322	null
2024-08-27	Pano2Room: Novel View Synthesis from a Single Indoor Panorama	Guo Pu et.al.	2408.11413	link
2024-08-20	Large Point-to-Gaussian Model for Image-to-3D Generation	Longfei Lu et.al.	2408.10935	null
2024-08-19	SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views	Chao Xu et.al.	2408.10195	null
2024-08-15	Single-image coherent reconstruction of objects and humans	Sarthak Batra et.al.	2408.08086	null
2024-08-15	MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing	Chenjie Cao et.al.	2408.08000	null
2024-08-12	Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models	Ioannis Romanelis et.al.	2408.06145	link
2024-08-12	Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation	Utkarsh Nath et.al.	2408.05938	null
2024-08-09	DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow	Hangyu Li et.al.	2408.05008	null
2024-08-06	An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion	Xingguang Yan et.al.	2408.03178	null
2024-08-09	DreamLCM: Towards High-Quality Text-to-3D Generation via Latent Consistency Model	Yiming Zhong et.al.	2408.02993	link
2024-08-05	SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrangements	Hou In Ivan Tam et.al.	2408.02211	null
2024-08-02	A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness	Lutao Jiang et.al.	2408.01269	null
2024-07-30	Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering	Yanpeng Zhao et.al.	2407.20908	link
2024-07-28	Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle	Zhenyu Tang et.al.	2407.19548	null
2024-07-25	Signatures of Low Mass Black Hole-Neutron Star Mergers	Rahime Matur et.al.	2407.18045	null
2024-07-23	She’s Got Her Mother’s Hair: End-to-End Collapsar Simulations Unveil the Origin of Black Holes’ Magnetic Field	Ore Gottlieb et.al.	2407.16745	null
2024-07-23	DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors	Zizheng Yan et.al.	2407.16260	null
2024-07-19	HOTS3D: Hyper-Spherical Optimal Transport for Semantic Alignment of Text-to-3D Generation	Zezeng Li et.al.	2407.14419	null
2024-07-19	PlacidDreamer: Advancing Harmony in Text-to-3D Generation	Shuo Huang et.al.	2407.13976	link
2024-07-20	Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation	Zongrui Li et.al.	2407.13584	link
2024-07-17	4Dynamic: Text-to-4D Generation with Hybrid Priors	Yu-Jie Yuan et.al.	2407.12684	null
2024-07-17	JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation	Chenhan Jiang et.al.	2407.12291	null
2024-07-16	Superintegrable families of magnetic monopoles with non-radial potential in curved background	Antonella Marchesiello et.al.	2407.11709	null
2024-07-17	VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation	Wenjie Zhuo et.al.	2407.09822	null
2024-07-08	Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images	Zhangyang Qi et.al.	2407.06191	null
2024-07-08	On a new 3D generalized Hunter-Saxton equation	Sergei Sakovich et.al.	2407.05723	null
2024-07-05	Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters	Benoit Baillif et.al.	2407.04424	link
2024-07-05	Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos	Leonhard Sommer et.al.	2407.04384	link
2024-07-03	NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries	Ewa M. Nowara et.al.	2407.03428	link
2024-07-02	Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials	Yawar Siddiqui et.al.	2407.02445	null
2024-07-02	ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation	Zhiyuan Ma et.al.	2407.02040	link
2024-07-01	fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence	Francis Williams et.al.	2407.01781	null
2024-07-01	VolETA: One- and Few-shot Food Volume Estimation	Ahmad AlMughrabi et.al.	2407.01717	link
2024-07-01	GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting	Chenxin Li et.al.	2407.01301	null
2024-06-27	From Efficient Multimodal Models to World Models: A Survey	Xinji Mai et.al.	2407.00118	null
2024-06-27	In LIGO’s Sight? Vigorous Coherent Gravitational Waves from Cooled Collapsar Disks	Ore Gottlieb et.al.	2406.19452	null
2024-06-26	Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling	Abril Corona-Figueroa et.al.	2406.18422	link
2024-06-25	Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text	Xinyang Li et.al.	2406.17601	link
2024-06-25	Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds	Hongliang Zeng et.al.	2406.17342	null
2024-07-01	Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling	Min-Seop Kwak et.al.	2406.16695	null
2024-06-24	YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals	Sandeep Mishra et.al.	2406.16273	null
2024-06-21	GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation	Chubin Zhang et.al.	2406.15333	link
2024-06-21	A3D: Does Diffusion Dream about 3D Alignment?	Savva Ignatyev et.al.	2406.15020	null
2024-06-21	VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation	Zixuan Chen et.al.	2406.14964	null
2024-06-14	OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control	Yuzhong Huang et.al.	2406.10000	null
2024-06-14	GradeADreamer: Enhanced Text-to-3D Generation Using Gaussian Splatting and Multi-View Diffusion	Trapoom Ukarapol et.al.	2406.09850	link
2024-06-15	2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction	Tianqi Chen et.al.	2406.08374	null
2024-06-12	Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata	Dongsu Zhang et.al.	2406.08292	null
2024-06-12	SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models	Abhay Rawat et.al.	2406.07840	null
2024-06-11	C3DAG: Controlled 3D Animal Generation using 3D pose guidance	Sandeep Mishra et.al.	2406.07742	null
2024-06-11	Instant 3D Human Avatar Generation using Image Diffusion Models	Nikos Kolotouros et.al.	2406.07516	null
2024-06-11	4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models	Heng Yu et.al.	2406.07472	null
2024-06-11	Efficient 3D Molecular Generation with Flow Matching and Scale Optimal Transport	Ross Irwin et.al.	2406.07266	null
2024-06-10	PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation	Zhenyu Li et.al.	2406.06679	null
2024-06-10	GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation	Haozhe Xie et.al.	2406.06526	link
2024-06-10	MVGamba: Unify 3D Content Generation as State Space Sequence Modeling	Xuanyu Yi et.al.	2406.06367	link
2024-06-09	GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement	Peiye Zhuang et.al.	2406.05649	null
2024-06-11	Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion	Fangfu Liu et.al.	2406.04338	null
2024-06-07	DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data	Qihao Liu et.al.	2406.04322	link
2024-06-07	GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions	Salvatore Esposito et.al.	2406.04254	null
2024-06-05	Text-to-Image Rectified Flow as Plug-and-Play Priors	Xiaofeng Yang et.al.	2406.03293	link
2024-06-05	Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion	Hao Wen et.al.	2406.03184	link
2024-06-05	Adversarial Generation of Hierarchical Gaussians for 3D Generative Model	Sangeek Hyun et.al.	2406.02968	link
2024-06-03	TAGMol: Target-Aware Gradient-guided Molecule Generation	Vineeth Dorna et.al.	2406.01650	link
2024-06-03	Tetrahedron Splatting for 3D Generation	Chun Gu et.al.	2406.01579	link
2024-06-04	Towards Practical Single-shot Motion Synthesis	Konstantinos Roditakis et.al.	2406.01136	null
2024-06-02	Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models	Wenqiang Sun et.al.	2406.00750	null
2024-06-04	Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors	Ohad Rahamim et.al.	2406.00687	link
2024-05-31	Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation	Shuzhou Yang et.al.	2405.20669	link
2024-05-30	What makes a cosmic filament? The dynamical origin and identity of filaments I. fundamentals in 2D	Job Feldbrugge et.al.	2405.20475	null
2024-05-30	GECO: Generative Image-to-3D within a SECOnd	Chen Wang et.al.	2405.20327	null
2024-06-05	PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting	Qiaowei Miao et.al.	2405.19957	link
2024-05-28	Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication	Yunuo Chen et.al.	2405.18515	null
2024-05-28	SubDLe: identification of substructures in cosmological simulations with deep learning	Michela Esposito et.al.	2405.18257	null
2024-05-27	PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance	Haohan Weng et.al.	2405.16890	null
2024-05-27	Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation	Zhoujie Fu et.al.	2405.16849	null
2024-05-24	ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching	Yumin Zhang et.al.	2405.15914	link
2024-05-24	Score Distillation via Reparametrized DDIM	Artem Lukoianov et.al.	2405.15891	link
2024-05-24	Automating the Diagnosis of Human Vision Disorders by Cross-modal 3D Generation	Li Zhang et.al.	2405.15239	link
2024-05-23	CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner	Weiyu Li et.al.	2405.14979	link
2024-05-23	Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer	Shuang Wu et.al.	2405.14832	null
2024-05-23	MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes	Ruiyuan Gao et.al.	2405.14475	null
2024-05-22	Multi-Zone Modeling of Black Hole Accretion and Feedback in 3D GRMHD: Bridging Vast Spatial and Temporal Scales	Hyerin Cho et.al.	2405.13887	null
2024-05-22	Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children’s Engagement in Storytelling	Yibo Wang et.al.	2405.13701	null
2024-05-18	Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching	Xingyu Miao et.al.	2405.11252	link
2024-05-16	Flow Score Distillation for Diverse Text-to-3D Generation	Runjie Yan et.al.	2405.10988	null
2024-05-23	Describing heat dissipation in the resistive state of three-dimensional superconductors	Leonardo Rodrigues Cadorim et.al.	2405.10415	null
2024-05-16	Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion	Xinyang Li et.al.	2405.09874	null
2024-05-16	The metallicity and carbon-to-oxygen ratio of the ultra-hot Jupiter WASP-76b from Gemini-S/IGRINS	Megan Weiner Mansfield et.al.	2405.09769	null
2024-05-15	A Survey On Text-to-3D Contents Generation In The Wild	Chenhan Jiang et.al.	2405.09431	null
2024-05-15	3D Shape Augmentation with Content-Aware Shape Resizing	Mingxiang Chen et.al.	2405.09050	null
2024-05-13	DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation	Ziang Cao et.al.	2405.08055	link
2024-05-13	Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning	Wenqi Dong et.al.	2405.08054	null
2024-05-14	SketchDream: Sketch-based Text-to-3D Generation and Editing	Feng-Lin Liu et.al.	2405.06461	null
2024-04-30	GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting	Kai Zhang et.al.	2404.19702	null
2024-04-30	MicroDreamer: Zero-shot 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction	Luxi Chen et.al.	2404.19525	link
2024-04-26	Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation	Seungwook Kim et.al.	2404.17419	null
2024-04-25	Interactive3D: Create What You Want by Interactive 3D Generation	Shaocong Dong et.al.	2404.16510	null
2024-04-22	X-Ray: A Sequential 3D Representation for Generation	Tao Hu et.al.	2404.14329	link
2024-04-18	MeshLRM: Large Reconstruction Model for High-Quality Mesh	Xinyue Wei et.al.	2404.12385	null
2024-04-17	Shaping Realities: Enhancing 3D Generative AI with Fabrication Constraints	Faraz Faruqi et.al.	2404.10142	null
2024-04-14	InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models	Jiale Xu et.al.	2404.07191	link
2024-04-10	Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior	Fan Lu et.al.	2404.06780	null
2024-04-09	Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion	Fan Yang et.al.	2404.06429	link
2024-04-09	DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation	Junkai Yan et.al.	2404.06119	link
2024-04-09	Hash3D: Training-free Acceleration for 3D Generation	Xingyi Yang et.al.	2404.06091	link
2024-04-08	StylizedGS: Controllable Stylization for 3D Gaussian Splatting	Dingxi Zhang et.al.	2404.05220	null
2024-04-11	Diffusion Time-step Curriculum for One Image to 3D Generation	Xuanyu Yi et.al.	2404.04562	link
2024-04-03	Design2Cloth: 3D Cloth Generation from 2D Masks	Jiali Zheng et.al.	2404.02686	null
2024-04-02	Towards Robust 3D Pose Transfer with Adversarial Learning	Haoyu Chen et.al.	2404.02242	null
2024-04-02	Black Hole-Disk Interactions in Magnetically Arrested Active Galactic Nuclei: General Relativistic Magnetohydrodynamic Simulations Using A Time-Dependent, Binary Metric	Sean M. Ressler et.al.	2404.02193	null
2024-04-02	Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models	Zeyu Yang et.al.	2404.02148	link
2024-04-07	Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation	Wangguandong Zheng et.al.	2404.01843	null
2024-04-01	FlexiDreamer: Single Image-to-3D Generation with FlexiCubes	Ruowen Zhao et.al.	2404.00987	link
2024-03-29	Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior	Jaehoon Ko et.al.	2403.20153	link
2024-04-05	GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling	Bowen Zhang et.al.	2403.19655	null
2024-03-28	Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation	Yujin Chen et.al.	2403.19319	null
2024-03-29	Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction	Qiuhong Shen et.al.	2403.18795	link
2024-03-25	DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion	Yuanze Lin et.al.	2403.17237	null
2024-03-25	VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation	Yang Chen et.al.	2403.17001	null
2024-03-25	Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning	Sicong Pan et.al.	2403.16803	link
2024-03-22	InterFusion: Text-Driven Generation of 3D Human-Object Interaction	Sisi Dai et.al.	2403.15612	link
2024-03-22	LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis	Kevin Xie et.al.	2403.15385	null
2024-03-22	ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars	Zhenwei Wang et.al.	2403.15383	link
2024-03-22	DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow	Kyungmin Lee et.al.	2403.14966	null
2024-03-22	STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians	Yifei Zeng et.al.	2403.14939	null
2024-03-21	DreamReward: Text-to-3D Generation with Human Preference	Junliang Ye et.al.	2403.14613	null
2024-03-20	Compress3D: a Compressed Latent Space for 3D Generation from a Single Image	Bowen Zhang et.al.	2403.13524	null
2024-03-17	General Line Coordinates in 3D	Joshua Martinez et.al.	2403.13014	null
2024-03-19	GVGEN: Text-to-3D Generation with Volumetric Representation	Xianglong He et.al.	2403.12957	null
2024-03-19	Precise-Physics Driven Text-to-3D Generation	Qingshan Xu et.al.	2403.12438	null
2024-03-19	ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance	Yongwei Chen et.al.	2403.12409	null
2024-03-18	VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models	Junlin Han et.al.	2403.12034	null
2024-03-19	Generic 3D Diffusion Adapter Using Controlled Multi-View Editing	Hansheng Chen et.al.	2403.12032	link
2024-03-18	LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation	Yushi Lan et.al.	2403.12019	link
2024-03-18	SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion	Vikram Voleti et.al.	2403.12008	null
2024-03-17	BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis	Lutao Jiang et.al.	2403.11273	link
2024-03-15	Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding	Pengkun Liu et.al.	2403.10395	link
2024-03-19	Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting	Zhiqi Li et.al.	2403.09981	link
2024-03-14	Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation	Fangfu Liu et.al.	2403.09625	null
2024-03-14	Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph	Donglin Di et.al.	2403.09236	link
2024-03-14	Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior	Cheng Chen et.al.	2403.09140	null
2024-03-13	UniLiDAR: Bridge the domain gap among different LiDARs for continual learning	Zikun Xu et.al.	2403.08512	null
2024-03-11	3D simulations of TRAPPIST-1e with varying CO2, CH4 and haze profiles	Mei Ting Mak et.al.	2403.06928	null
2024-03-11	ExoCubed: A Riemann-Solver based Cubed-Sphere Dynamic Core for Planetary Atmospheres	Sihe Chen et.al.	2403.06844	link
2024-03-11	V3D: Video Diffusion Models are Effective 3D Generators	Zilong Chen et.al.	2403.06738	link
2024-03-11	3D-aware Image Generation and Editing with Multi-modal Conditions	Bo Li et.al.	2403.06470	null
2024-03-08	CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model	Zhengyi Wang et.al.	2403.05034	null
2024-03-04	3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors	Fangzhou Hong et.al.	2403.02234	link
2024-03-04	TripoSR: Fast 3D Object Reconstruction from a Single Image	Dmitry Tochilkin et.al.	2403.02151	link
2024-03-08	G3DR: Generative 3D Reconstruction in ImageNet	Pradyumna Reddy et.al.	2403.00939	link
2024-02-28	The VOROS: Lifting ROC curves to 3D	Christopher Ratigan et.al.	2402.18689	link
2024-02-27	DivAvatar: Diverse 3D Avatar Generation with a Single Prompt	Weijing Tao et.al.	2402.17292	null
2024-02-22	Place Anything into Any Video	Ziling Liu et.al.	2402.14316	null
2024-02-22	MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion	Xin-Yang Zheng et.al.	2402.14253	null
2024-02-20	MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction	Shitao Tang et.al.	2402.12712	null
2024-02-19	Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability	Xuelin Qian et.al.	2402.12225	null
2024-02-13	IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation	Luke Melas-Kyriazi et.al.	2402.08682	null
2024-02-11	GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting	Xiaoyu Zhou et.al.	2402.07207	null
2024-02-08	AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning	Wamiq Reyaz Para et.al.	2402.05803	null
2024-02-07	SPAD: Spatially Aware Multiview Diffusers	Yash Kant et.al.	2402.05235	null
2024-02-05	Retrieval-Augmented Score Distillation for Text-to-3D Generation	Junyoung Seo et.al.	2402.02972	link
2024-02-02	A Comprehensive Survey on 3D Content Generation	Jian Liu et.al.	2402.01166	link

3D Gaussian Splatting

Publish Date	Title	Authors	PDF	Code
2025-07-23	Temporal Smoothness-Aware Rate-Distortion Optimized 4D Gaussian Splatting	Hyeongmin Lee et.al.	2507.17336	null
2025-07-22	StreamME: Simplify 3D Gaussian Avatar within Live Stream	Luchuan Song et.al.	2507.17029	null
2025-07-22	Sparse-View 3D Reconstruction: Recent Advances and Open Challenges	Tanveer Younis et.al.	2507.16406	null
2025-07-22	LongSplat: Online Generalizable 3D Gaussian Splatting from Long Sequence Images	Guichen Huang et.al.	2507.16144	null
2025-07-21	Appearance Harmonization via Bilateral Grid Prediction with Transformers for 3DGS	Jisu Shin et.al.	2507.15748	null
2025-07-21	DWTGS: Rethinking Frequency Regularization for Sparse-view 3D Gaussian Splatting	Hung Nguyen et.al.	2507.15690	null
2025-07-21	Hi^2-GSLoc: Dual-Hierarchical Gaussian-Specific Visual Relocalization for Remote Sensing	Boni Hu et.al.	2507.15683	null
2025-07-21	Gaussian Splatting with Discretized SDF for Relightable Assets	Zuo-Liang Zhu et.al.	2507.15629	null
2025-07-21	SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting	Zihui Gao et.al.	2507.15602	null
2025-07-21	ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting	Ruijie Zhu et.al.	2507.15454	null
2025-07-22	GCC: A 3DGS Inference Architecture with Gaussian-Wise and Cross-Stage Conditional Processing	Minnan Pei et.al.	2507.15300	null
2025-07-20	Stereo-GS: Multi-View Stereo Vision Model for Generalizable 3D Gaussian Splatting Reconstruction	Xiufeng Huang et.al.	2507.14921	null
2025-07-19	Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey	Jiahui Zhang et.al.	2507.14501	null
2025-07-19	Adaptive 3D Gaussian Splatting Video Streaming: Visual Saliency-Aware Tiling and Meta-Learning-Based Bitrate Adaptation	Han Gong et.al.	2507.14454	null
2025-07-18	Neural-GASh: A CGA-based neural radiance prediction pipeline for real-time shading	Efstratios Geronikolakis et.al.	2507.13917	null
2025-07-21	PCR-GS: COLMAP-Free 3D Gaussian Splatting via Pose Co-Regularizations	Yu Wei et.al.	2507.13891	null
2025-07-16	NLI4VolVis: Natural Language Interaction for Volume Visualization via LLM Multi-Agents and Editable 3D Gaussian Splatting	Kuangshi Ai et.al.	2507.12621	null
2025-07-21	Wavelet-GS: 3D Gaussian Splatting with Wavelet Decomposition	Beizhen Zhao et.al.	2507.12498	null
2025-07-16	SGLoc: Semantic Localization System for Camera Pose Estimation from 3D Gaussian Splatting Representation	Beining Xu et.al.	2507.12027	null
2025-07-16	Dark-EvGS: Event Camera as an Eye for Radiance Field in the Dark	Jingqian Wu et.al.	2507.11931	null
2025-07-21	Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling	Hayeon Kim et.al.	2507.11061	null
2025-07-14	ScaffoldAvatar: High-Fidelity Gaussian Avatars with Patch Expressions	Shivangi Aneja et.al.	2507.10542	null
2025-07-19	3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving	Yixun Zhang et.al.	2507.09993	null
2025-07-11	RePaintGS: Reference-Guided Gaussian Splatting for Realistic and View-Consistent 3D Scene Inpainting	Ji Hyun Seo et.al.	2507.08434	null
2025-07-10	Temporally Consistent Amodal Completion for 3D Human-Object Interaction Reconstruction	Hyungjun Doh et.al.	2507.08137	null
2025-07-10	RegGS: Unposed Sparse Views Gaussian Splatting with 3DGS Registration	Chong Cheng et.al.	2507.08136	null
2025-07-10	RTR-GS: 3D Gaussian Splatting for Inverse Rendering with Radiance Transfer and Reflection	Yongyang Zhou et.al.	2507.07733	null
2025-07-10	MUVOD: A Novel Multi-view Video Object Segmentation Dataset and A Benchmark for 3D Segmentation	Bangning Wei et.al.	2507.07519	null
2025-07-10	Seg-Wild: Interactive Segmentation based on 3D Gaussian Splatting for Unconstrained Image Collections	Yongtang Bao et.al.	2507.07395	null
2025-07-09	Enhancing non-Rigid 3D Model Deformations Using Mesh-based Gaussian Splatting	Wijayathunga W. M. R. D. B et.al.	2507.07000	null
2025-07-09	FlexGaussian: Flexible and Cost-Effective Training-Free Compression for 3D Gaussian Splatting	Boyuan Tian et.al.	2507.06671	null
2025-07-08	A Probabilistic Approach to Uncertainty Quantification Leveraging 3D Geometry	Rushil Desai et.al.	2507.06269	null
2025-07-08	LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures	Seungoh Han et.al.	2507.06109	null
2025-07-08	Reflections Unlock: Geometry-Aware Reflection Disentanglement in 3D Gaussian Splatting for Photorealistic Scenes Rendering	Jiayi Song et.al.	2507.06103	null
2025-07-08	VisualSpeaker: Visually-Guided 3D Avatar Lip Synthesis	Alexandre Symeonidis-Herzig et.al.	2507.06060	null
2025-07-08	D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos	Wenkang Zhang et.al.	2507.05859	null
2025-07-08	3DGS_LSR:Large_Scale Relocation for Autonomous Driving Based on 3D Gaussian Splatting	Haitao Lu et.al.	2507.05661	null
2025-07-07	Mastering Regional 3DGS: Locating, Initializing, and Editing with Diverse 2D Priors	Lanqing Guo et.al.	2507.05426	null
2025-07-07	SegmentDreamer: Towards High-fidelity Text-to-3D Synthesis with Segmented Consistency Trajectory Distillation	Jiahao Zhu et.al.	2507.05256	null
2025-07-07	InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior	Minghao Wen et.al.	2507.04961	null
2025-07-05	A3FR: Agile 3D Gaussian Splatting with Incremental Gaze Tracked Foveated Rendering in Virtual Reality	Shuo Xin et.al.	2507.04147	null
2025-07-09	Gaussian-LIC2: LiDAR-Inertial-Camera Gaussian Splatting SLAM	Xiaolei Lang et.al.	2507.04004	null
2025-07-05	ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments	Guile Wu et.al.	2507.03886	null
2025-07-04	Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps	Chong Cheng et.al.	2507.03737	null
2025-07-08	HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars	Gent Serifi et.al.	2507.02803	null
2025-07-03	ArtGS: 3D Gaussian Splatting for Interactive Visual-Physical Modeling and Manipulation of Articulated Objects	Qiaojun Yu et.al.	2507.02600	null
2025-07-03	LocalDyGS: Multi-view Global Dynamic Scene Modeling via Adaptive Local Implicit Feature Decoupling	Jiahao Wu et.al.	2507.02363	null
2025-07-03	Gbake: Baking 3D Gaussian Splats into Reflection Probes	Stephen Pasch et.al.	2507.02257	null
2025-07-02	3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation	Tianrui Lou et.al.	2507.01367	null
2025-07-01	VISTA: Open-Vocabulary, Task-Relevant Robot Exploration with Online Semantic Gaussian Splatting	Keiko Nagami et.al.	2507.01125	null
2025-07-01	Masks make discriminative models great again!	Tianshi Cao et.al.	2507.00916	null
2025-07-01	GaussianVLM: Scene-centric 3D Vision-Language Models using Language-aligned Gaussian Splats for Embodied Reasoning and Beyond	Anna-Maria Halacheva et.al.	2507.00886	null
2025-07-06	LOD-GS: Level-of-Detail-Sensitive 3D Gaussian Splatting for Detail Conserved Anti-Aliasing	Zhenya Yang et.al.	2507.00554	null
2025-07-01	GDGS: 3D Gaussian Splatting Via Geometry-Guided Initialization And Dynamic Density Control	Xingjun Wang et.al.	2507.00363	null
2025-06-30	AttentionGS: Towards Initialization-Free 3D Gaussian Splatting via Structural Attention	Ziao Liu et.al.	2506.23611	null
2025-06-29	Endo-4DGX: Robust Endoscopic Scene Reconstruction and Illumination Correction with Gaussian Splatting	Yiming Huang et.al.	2506.23308	null
2025-06-29	TVG-SLAM: Robust Gaussian Splatting SLAM with Tri-view Geometric Constraints	Zhen Tan et.al.	2506.23207	null
2025-06-29	From Coarse to Fine: Learnable Discrete Wavelet Transforms for Efficient 3D Gaussian Splatting	Hung Nguyen et.al.	2506.23042	null
2025-06-28	Confident Splatting: Confidence-Based Compression of 3D Gaussian Splatting via Learnable Beta Distributions	AmirHossein Naghi Razlighi et.al.	2506.22973	null
2025-06-28	RGE-GS: Reward-Guided Expansive Driving Scene Reconstruction via Diffusion Priors	Sicong Du et.al.	2506.22800	null
2025-06-28	VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding	Minchao Jiang et.al.	2506.22799	null
2025-06-28	RoboPearls: Editable Video Simulation for Robot Manipulation	Tao Tang et.al.	2506.22756	null
2025-06-25	SAR-GS: 3D Gaussian Splatting for Synthetic Aperture Radar Target Reconstruction	Aobo Li et.al.	2506.21633	null
2025-06-24	ICP-3DGS: SfM-free 3D Gaussian Splatting for Large-scale Unbounded Scenes	Chenhao Zhang et.al.	2506.21629	null
2025-06-26	MADrive: Memory-Augmented Driving Scene Modeling	Polina Karpikova et.al.	2506.21520	null
2025-06-26	EndoFlow-SLAM: Real-Time Endoscopic SLAM with Flow-Constrained Gaussian Splatting	Taoyu Wu et.al.	2506.21420	null
2025-06-26	Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image	Pufan Li et.al.	2506.21152	null
2025-06-26	User-in-the-Loop View Sampling with Error Peaking Visualization	Ayaka Yasunaga et.al.	2506.21009	null
2025-06-25	3DGH: 3D Head Generation with Composable Hair and Face	Chengan He et.al.	2506.20875	null
2025-06-24	Virtual Memory for 3D Gaussian Splatting	Jonathan Haberl et.al.	2506.19415	null
2025-06-23	GRAND-SLAM: Local Optimization for Globally Consistent Large-Scale Multi-Agent Gaussian SLAM	Annika Thomas et.al.	2506.18885	null
2025-06-23	Reconstructing Tornadoes in 3D with Gaussian Splatting	Adam Yang et.al.	2506.18677	null
2025-06-21	3D Gaussian Splatting for Fine-Detailed Surface Reconstruction in Large-Scale Scene	Shihan Chen et.al.	2506.17636	null
2025-06-20	Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting	Tianjiao Yu et.al.	2506.17212	null
2025-06-23	R3eVision: A Survey on Robust Rendering, Restoration, and Enhancement for 3D Low-Level Vision	Weeyoung Kwon et.al.	2506.16262	link
2025-06-24	RA-NeRF: Robust Neural Radiance Field Reconstruction with Accurate Camera Pose Estimation under Complex Trajectories	Qingsong Yan et.al.	2506.15242	null
2025-06-17	Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction	Zhengquan Zhang et.al.	2506.14856	null
2025-06-17	3DGS-IEval-15K: A Large-scale Image Quality Evaluation Database for 3D Gaussian-Splatting	Yuke Xing et.al.	2506.14642	link
2025-06-17	HRGS: Hierarchical Gaussian Splatting for Memory-Efficient High-Resolution 3D Reconstruction	Changbai Li et.al.	2506.14229	null
2025-06-23	GAF: Gaussian Action Field as a Dynamic World Model for Robotic Manipulation	Ying Chai et.al.	2506.14135	null
2025-06-16	GRaD-Nav++: Vision-Language Model Enabled Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics	Qianzhong Chen et.al.	2506.14009	null
2025-06-16	PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images	Lingteng Qiu et.al.	2506.13766	null
2025-06-16	Multiview Geometric Regularization of Gaussian Splatting for Accurate Radiance Fields	Jungeon Kim et.al.	2506.13508	null
2025-06-16	GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction	Jinguang Tong et.al.	2506.13110	null
2025-06-15	Metropolis-Hastings Sampling for 3D Gaussian Reconstruction	Hyunjin Kim et.al.	2506.12945	null
2025-06-17	Efficient multi-view training for 3D Gaussian Splatting	Minhyuk Choi et.al.	2506.12727	null
2025-06-14	Perceptual-GS: Scene-adaptive Perceptual Densification for Gaussian Splatting	Hongbi Zhou et.al.	2506.12400	link
2025-06-12	PointGS: Point Attention-Aware Sparse View Synthesis with Gaussian Splatting	Lintao Xiang et.al.	2506.10335	null
2025-06-11	DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos	Chieh Hubert Lin et.al.	2506.09997	null
2025-06-11	Gaussian Herding across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS	Tao Wang et.al.	2506.09534	null
2025-06-11	HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene	Jianing Chen et.al.	2506.09518	null
2025-06-11	TinySplat: Feedforward Approach for Generating Compact 3D Scene Representation	Zetian Song et.al.	2506.09479	null
2025-06-12	ODG: Occupancy Prediction Using Dual Gaussians	Yunxiao Shi et.al.	2506.09417	null
2025-06-10	StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams	Zike Wu et.al.	2506.08862	link
2025-06-11	Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting	Keyi Liu et.al.	2506.08777	null
2025-06-10	SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting	Mengjiao Ma et.al.	2506.08710	null
2025-06-10	Complex-Valued Holographic Radiance Fields	Yicheng Zhan et.al.	2506.08350	null
2025-06-09	Speedy Deformable 3D Gaussian Splatting: Fast Rendering and Compression of Dynamic Scenes	Allen Tu et.al.	2506.07917	link
2025-06-09	GaussianVAE: Adaptive Learning Dynamics of 3D Gaussians for High-Fidelity Super-Resolution	Shuja Khalid et.al.	2506.07897	null
2025-06-09	R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation	William Ljungbergh et.al.	2506.07826	null
2025-06-09	OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting	Jens Piekenbrinck et.al.	2506.07697	null
2025-06-09	ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views	Xiaohan Lu et.al.	2506.07670	null
2025-06-09	PIG: Physically-based Multi-Material Interaction with 3D Gaussians	Zeyu Xiao et.al.	2506.07657	null
2025-06-09	Hierarchical Scoring with 3D Gaussian Splatting for Instance Image-Goal Navigation	Yijie Deng et.al.	2506.07338	null
2025-06-08	Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization	Zhican Wang et.al.	2506.07069	null
2025-06-08	Hybrid Mesh-Gaussian Representation for Efficient Indoor Scene Reconstruction	Binxiao Huang et.al.	2506.06988	null
2025-06-07	Gaussian Mapping for Evolving Scenes	Vladimir Yugay et.al.	2506.06909	null
2025-06-06	Dy3DGS-SLAM: Monocular 3D Gaussian Splatting SLAM for Dynamic Environments	Mingrui Li et.al.	2506.05965	null
2025-06-06	SurGSplat: Progressive Geometry-Constrained Gaussian Splatting for Surgical Scene Reconstruction	Yuchao Zheng et.al.	2506.05935	null
2025-06-06	Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy	Yu Feng et.al.	2506.05682	null
2025-06-05	VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction	Ziyue Zhu et.al.	2506.05563	null
2025-06-05	On-the-fly Reconstruction for Large-Scale Novel View Synthesis from Unposed Images	Andreas Meuleman et.al.	2506.05558	null
2025-06-05	ODE-GS: Latent ODEs for Dynamic Scene Extrapolation with 3D Gaussian Splatting	Daniel Wang et.al.	2506.05480	null
2025-06-05	Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting	Duochao Shi et.al.	2506.05327	null
2025-06-05	Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training	Aneesh Deogan et.al.	2506.05092	null
2025-06-05	Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting	Alfred T. Christiansen et.al.	2506.05009	null
2025-06-05	Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer	Filip Slezak et.al.	2506.04908	null
2025-06-05	Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations	Gaia Di Lorenzo et.al.	2506.04789	null
2025-06-04	Pseudo-Simulation for Autonomous Driving	Wei Cao et.al.	2506.04218	link
2025-06-04	FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting	Hengyu Liu et.al.	2506.04174	null
2025-06-04	Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data	Ben Moran et.al.	2506.04120	null
2025-06-04	SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting	Shengjie Lin et.al.	2506.03594	link
2025-06-04	Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting	Chengqi Li et.al.	2506.03538	null
2025-06-03	Multi-Spectral Gaussian Splatting with Neural Color Representation	Lukas Meyer et.al.	2506.03407	null
2025-06-03	Large Processor Chip Model	Kaiyan Chang et.al.	2506.02929	null
2025-06-04	Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone	Zheng Liu et.al.	2506.02774	null
2025-06-03	RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS	Chuanyu Fu et.al.	2506.02751	null
2025-06-03	EyeNavGS: A 6-DoF Navigation Dataset and Record-n-Replay Software for Real-World 3DGS Scenes in VR	Zihao Ding et.al.	2506.02380	link
2025-06-02	GSCodec Studio: A Modular Framework for Gaussian Splat Compression	Sicheng Li et.al.	2506.01822	link
2025-06-02	WorldExplorer: Towards Generating Fully Navigable 3D Scenes	Manuel-Andreas Schneider et.al.	2506.01799	null
2025-06-01	Globally Consistent RGB-D SLAM with 2D Gaussian Splatting	Xingguang Zhong et.al.	2506.00970	link
2025-05-30	3D Gaussian Splat Vulnerabilities	Matthew Hull et.al.	2506.00280	link
2025-05-30	Adaptive Voxelization for Transform coding of 3D Gaussian splatting data	Chenjunjie Wang et.al.	2506.00271	null
2025-05-30	Understanding while Exploring: Semantics-driven Active Mapping	Liyan Chen et.al.	2506.00225	null
2025-05-30	AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion	Yangyi Huang et.al.	2505.24877	null
2025-05-30	TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores	Zimu Liao et.al.	2505.24796	link
2025-05-30	Tackling View-Dependent Semantics in 3D Language Gaussian Splatting	Jiazhong Cen et.al.	2505.24746	link
2025-05-30	LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework	Xin Kang et.al.	2505.24245	null
2025-05-29	3DGEER: Exact and Efficient Volumetric Rendering with 3D Gaussians	Zixun Huang et.al.	2505.24053	link
2025-05-30	ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS	Weijie Wang et.al.	2505.23734	link
2025-05-29	AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views	Lihan Jiang et.al.	2505.23716	null
2025-05-29	Mobi-$π$: Mobilizing Your Robot Learning Policy	Jingyun Yang et.al.	2505.23692	null
2025-05-29	Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting	Chuandong Liu et.al.	2505.23280	link
2025-05-29	LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering	Jonas Kulhanek et.al.	2505.23158	null
2025-05-29	Pose-free 3D Gaussian splatting via shape-ray estimation	Youngju Na et.al.	2505.22978	null
2025-05-28	3DGS Compression with Sparsity-guided Hierarchical Transform Coding	Hao Xu et.al.	2505.22908	null
2025-05-28	STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering	Zehao Li et.al.	2505.22400	null
2025-05-28	UP-SLAM: Adaptively Structured Gaussian SLAM with Uncertainty Prediction in Dynamic Environments	Wancai Zheng et.al.	2505.22335	null
2025-05-28	Learning Fine-Grained Geometry for Sparse-View Splatting via Cascade Depth Loss	Wenjun Lu et.al.	2505.22279	null
2025-05-28	Hyperspectral Gaussian Splatting	Sunil Kumar Narayanan et.al.	2505.21890	null
2025-05-27	Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility	Yidi Li et.al.	2505.21377	link
2025-05-27	Structure from Collision	Takuhiro Kaneko et.al.	2505.21335	null
2025-05-29	3D-UIR: 3D Gaussian for Underwater 3D Scene Reconstruction via Physics Based Appearance-Medium Decoupling	Jieyu Yuan et.al.	2505.21238	null
2025-05-28	CityGo: Lightweight Urban Modeling and Rendering with Proxy Buildings and Residual Gaussians	Weihang Liu et.al.	2505.21041	null
2025-05-27	Intern-GS: Vision Model Guided Sparse-View 3D Gaussian Splatting	Xiangyu Sun et.al.	2505.20729	null
2025-05-27	Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting	Zechen Li et.al.	2505.20714	link
2025-05-26	ParticleGS: Particle-Based Dynamics Modeling of 3D Gaussians for Prior-free Motion Extrapolation	Jinsheng Quan et.al.	2505.20270	link
2025-05-26	OB3D: A New Dataset for Benchmarking Omnidirectional 3D Reconstruction Using Blender	Shintaro Ito et.al.	2505.20126	link
2025-05-26	K-Buffers: A Plug-in Method for Enhancing Neural Fields with Multiple Buffers	Haofan Ren et.al.	2505.19564	link
2025-05-25	Improving Novel view synthesis of 360$^\circ$ Scenes in Extremely Sparse Views by Jointly Training Hemisphere Sampled Synthetic Images	Guangan Chen et.al.	2505.19264	link
2025-05-25	Triangle Splatting for Real-Time Radiance Field Rendering	Jan Held et.al.	2505.19175	null
2025-05-25	FHGS: Feature-Homogenized Gaussian Splatting	Q. G. Duan et.al.	2505.19154	null
2025-05-25	Veta-GS: View-dependent deformable 3D Gaussian Splatting for thermal infrared Novel-view Synthesis	Myeongseok Nam et.al.	2505.19138	null
2025-05-25	VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes	Tianchen Deng et.al.	2505.18992	link
2025-05-24	Efficient Differentiable Hardware Rasterization for 3D Gaussian Splatting	Yitian Yuan et.al.	2505.18764	null
2025-05-24	SuperGS: Consistent and Detailed 3D Super-Resolution Scene Reconstruction via Gaussian Splatting	Shiyun Xie et.al.	2505.18649	null
2025-05-23	Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance	Jack Goffinet et.al.	2505.18342	null
2025-05-23	CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis	Florian Barthel et.al.	2505.17590	link
2025-05-23	From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation	Mahmoud Chick Zaouali et.al.	2505.17402	null
2025-05-22	Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction	Jiacong Chen et.al.	2505.16533	null
2025-05-21	RUSplatting: Robust 3D Gaussian Splatting for Sparse-View Underwater Scene Reconstruction	Zhuodong Jiang et.al.	2505.15737	null
2025-05-21	PlantDreamer: Achieving Realistic 3D Plant Models with Diffusion-Guided Gaussian Splatting	Zane K J Hartley et.al.	2505.15528	null
2025-05-21	GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation	Yuchen Li et.al.	2505.15287	null
2025-05-21	MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models	Yifan Liu et.al.	2505.15185	link
2025-05-20	Scan, Materialize, Simulate: A Generalizable Framework for Physically Grounded Robot Planning	Amine Elhafsi et.al.	2505.14938	null
2025-05-20	Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image	Yuxuan Wang et.al.	2505.14537	null
2025-05-20	MGStream: Motion-aware 3D Gaussian for Streamable Dynamic Scene Reconstruction	Zhenyu Bao et.al.	2505.13839	link
2025-05-19	3D Gaussian Adaptive Reconstruction for Fourier Light-Field Microscopy	Chenyu Xu et.al.	2505.12875	null
2025-05-19	TACOcc: Target-Adaptive Cross-Modal Fusion with Volume Rendering for 3D Semantic Occupancy	Luyao Lei et.al.	2505.12693	null
2025-05-18	Is Semantic SLAM Ready for Embedded Systems? A Comparative Survey	Calvin Galagain et.al.	2505.12384	null
2025-05-17	GTR: Gaussian Splatting Tracking and Reconstruction of Unknown Objects Based on Appearance and Geometric Complexity	Takuya Ikeda et.al.	2505.11905	null
2025-05-16	GrowSplat: Constructing Temporal Digital Twins of Plants with Gaussian Splats	Simeon Adebola et.al.	2505.10923	null
2025-05-16	EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes	Jianlin Guo et.al.	2505.10787	link
2025-05-14	ExploreGS: a vision-based low overhead framework for 3D scene reconstruction	Yunji Feng et.al.	2505.10578	null
2025-05-15	Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting	Fengdi Zhang et.al.	2505.10473	link
2025-05-15	VRSplat: Fast and Robust Gaussian Splatting for Virtual Reality	Xuechang Tu et.al.	2505.10144	link
2025-05-15	Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field	Jinlong Fan et.al.	2505.10049	link
2025-05-15	Large-Scale Gaussian Splatting SLAM	Zhe Xin et.al.	2505.09915	null
2025-05-14	Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware	Justin Yu et.al.	2505.09601	null
2025-05-13	DLO-Splatting: Tracking Deformable Linear Objects Using 3D Gaussian Splatting	Holly Dinkel et.al.	2505.08644	null
2025-05-13	FOCI: Trajectory Optimization on Gaussian Splats	Mario Gomez Andreu et.al.	2505.08510	null
2025-05-13	A Survey of 3D Reconstruction with Event Cameras: From Event-based Geometry to Neural 3D Rendering	Chuanzhi Xu et.al.	2505.08438	null
2025-05-10	Virtualized 3D Gaussians: Flexible Cluster-based Level-of-Detail System for Real-Time Rendering of Composed Scenes	Xijie Yang et.al.	2505.06523	null
2025-05-08	TeGA: Texture Space Gaussian Avatars for High-Resolution Dynamic Head Modeling	Gengyan Li et.al.	2505.05672	null
2025-05-08	Steepest Descent Density Control for Compact 3D Gaussian Splatting	Peihao Wang et.al.	2505.05587	null
2025-05-08	SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation	Yonwoo Choi et.al.	2505.05475	link
2025-05-08	Time of the Flight of the Gaussians: Optimizing Depth Indirectly in Dynamic Radiance Fields	Runfeng Li et.al.	2505.05356	null
2025-05-07	SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction	Xinran Yang et.al.	2505.04668	link
2025-05-07	Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting	Feng Yang et.al.	2505.04262	null
2025-05-06	3D Gaussian Splatting Data Compression with Mixture of Priors	Lei Liu et.al.	2505.03310	null
2025-05-04	SparSplat: Fast Multi-View Reconstruction with Generalizable 2D Gaussian Splatting	Shubhendu Jena et.al.	2505.02175	null
2025-05-04	GarmentGS: Point-Cloud Guided Gaussian Splatting for High-Fidelity Non-Watertight 3D Garment Reconstruction	Zhihao Tang et.al.	2505.02126	null
2025-05-03	HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder	Qi Yang et.al.	2505.01938	link
2025-05-03	GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting	Anushka Agarwal et.al.	2505.01928	null
2025-05-03	Visual enhancement and 3D representation for underwater scenes: a review	Guoxi Huang et.al.	2505.01869	null
2025-05-03	AquaGS: Fast Underwater Scene Reconstruction with SfM-Free Gaussian Splatting	Junhao Shi et.al.	2505.01799	null
2025-05-02	FalconWing: An Open-Source Platform for Ultra-Light Fixed-Wing Aircraft Research	Yan Miao et.al.	2505.01383	null
2025-05-02	Compensating Spatiotemporally Inconsistent Observations for Online Dynamic 3D Gaussian Splatting	Youngsik Yun et.al.	2505.01235	null
2025-04-30	A Survey on 3D Reconstruction Techniques in Plant Phenotyping: From Classical Methods to Neural Radiance Fields (NeRF), 3D Gaussian Splatting (3DGS), and Beyond	Jiajia Li et.al.	2505.00737	link
2025-04-29	GauSS-MI: Gaussian Splatting Shannon Mutual Information for Active 3D Reconstruction	Yuhan Xie et.al.	2504.21067	link
2025-04-29	GaussTrap: Stealthy Poisoning Attacks on 3D Gaussian Splatting for Targeted Scene Confusion	Jiaxin Hong et.al.	2504.20829	null
2025-04-29	EfficientHuman: Efficient Training and Reconstruction of Moving Human using Articulated 2D Gaussian	Hao Tian et.al.	2504.20607	null
2025-04-29	Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting	Hanxi Liu et.al.	2504.20403	null
2025-05-01	GSFeatLoc: Visual Localization Using Feature Correspondence on 3D Gaussian Splatting	Jongwon Lee et.al.	2504.20379	null
2025-04-28	Mesh-Learner: Texturing Mesh with Spherical Harmonics	Yunfei Wan et.al.	2504.19938	link
2025-04-28	CE-NPBG: Connectivity Enhanced Neural Point-Based Graphics for Novel View Synthesis in Autonomous Driving Scenes	Mohammad Altillawi et.al.	2504.19557	null
2025-04-28	GSFF-SLAM: 3D Semantic Gaussian Splatting SLAM via Feature Field	Zuxing Lu et.al.	2504.19409	null
2025-04-30	4DGS-CC: A Contextual Coding Framework for 4D Gaussian Splatting Data Compression	Zicong Chen et.al.	2504.18925	null
2025-05-01	TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians	Letian Huang et.al.	2504.18768	null
2025-04-28	RGS-DR: Reflective Gaussian Surfels with Deferred Rendering for Shiny Objects	Georgios Kouros et.al.	2504.18468	null
2025-04-25	PerfCam: Digital Twinning for Production Lines Using 3D Gaussian Splatting and Vision Models	Michel Gokan Khan et.al.	2504.18165	link
2025-04-24	iVR-GS: Inverse Volume Rendering for Explorable Visualization via Editable 3D Gaussian Splatting	Kaiyuan Tang et.al.	2504.17954	link
2025-04-23	Visibility-Uncertainty-guided 3D Gaussian Inpainting via Scene Conceptional Learning	Mingxuan Cui et.al.	2504.17815	link
2025-04-24	CasualHDRSplat: Robust High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos	Shucheng Gong et.al.	2504.17728	link
2025-04-23	HUG: Hierarchical Urban Gaussian Splatting with Block-Based Reconstruction	Zhongtao Wang et.al.	2504.16606	null
2025-04-23	ToF-Splatting: Dense SLAM using Sparse Time-of-Flight Depth and Multi-Frame Integration	Andrea Conti et.al.	2504.16545	null
2025-04-21	StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians	Cailin Zhuang et.al.	2504.15281	null
2025-04-21	MoBGS: Motion Deblurring Dynamic 3D Gaussian Splatting for Blurry Monocular Video	Minh-Quan Viet Bui et.al.	2504.15122	null
2025-04-20	NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation	Junyuan Fang et.al.	2504.14638	null
2025-04-20	VGNC: Reducing the Overfitting of Sparse-view 3DGS via Validation-guided Gaussian Number Control	Lifeng Lin et.al.	2504.14548	null
2025-04-20	Metamon-GS: Enhancing Representability with Variance-Guided Densification and Light Encoding	Junyan Su et.al.	2504.14460	null
2025-04-23	SEGA: Drivable 3D Gaussian Head Avatar from a Single Image	Chen Guo et.al.	2504.14373	null
2025-04-18	EG-Gaussian: Epipolar Geometry and Graph Network Enhanced 3D Gaussian Splatting	Beizhen Zhao et.al.	2504.13540	null
2025-04-17	Volume Encoding Gaussians: Transfer Function-Agnostic 3D Gaussians for Volume Rendering	Landon Dyken et.al.	2504.13339	null
2025-04-17	Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation	Sizhe Yang et.al.	2504.13175	null
2025-04-18	ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos	Zetong Zhang et.al.	2504.13167	null
2025-04-17	Digital Twin Generation from Visual Data: A Survey	Andrew Melnik et.al.	2504.13159	link
2025-04-17	Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs	Shaohui Dai et.al.	2504.13153	link
2025-04-17	GSAC: Leveraging Gaussian Splatting for Photorealistic Avatar Creation with Unity Integration	Rendong Zhang et.al.	2504.12999	link
2025-04-17	Second-order Optimization of Gaussian Splats with Importance Sampling	Hamza Pehlivan et.al.	2504.12905	null
2025-04-17	AAA-Gaussians: Anti-Aliased and Artifact-Free 3D Gaussian Rendering	Michael Steiner et.al.	2504.12811	null
2025-04-17	CAGE-GS: High-fidelity Cage Based 3D Gaussian Splatting Deformation	Yifei Tong et.al.	2504.12800	null
2025-04-17	TSGS: Improving Gaussian Splatting for Transparent Surface Reconstruction via Normal and De-lighting Priors	Mingwei Li et.al.	2504.12799	null
2025-04-17	ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior	Xiao Han et.al.	2504.12788	null
2025-04-16	CAGS: Open-Vocabulary 3D Scene Understanding with Context-Aware Gaussian Splatting	Wei Sun et.al.	2504.11893	null
2025-04-16	3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians	Zeming Wei et.al.	2504.11218	link
2025-04-15	3D Gabor Splatting: Reconstruction of High-frequency Surface Texture using Gabor Noise	Haato Watanabe et.al.	2504.11003	null
2025-04-15	LL-Gaussian: Low-Light Scene Reconstruction and Enhancement via Gaussian Splatting for Novel View Synthesis	Hao Sun et.al.	2504.10331	null
2025-04-14	EBAD-Gaussian: Event-driven Bundle Adjusted Deblur Gaussian Splatting	Yufei Deng et.al.	2504.10012	null
2025-04-16	GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting	Junlin Hao et.al.	2504.10001	null
2025-04-13	DropoutGS: Dropping Out Gaussians for Better Sparse-view Rendering	Yexing Xu et.al.	2504.09491	null
2025-04-12	A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds	Jizong Peng et.al.	2504.09129	null
2025-04-12	BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting	Jeongwan On et.al.	2504.09097	null
2025-04-12	You Need a Transition Plane: Bridging Continuous Panoramic 3D Reconstruction with Perspective Gaussian Splatting	Zhijie Shen et.al.	2504.09062	null
2025-04-15	BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting	Yongchang Wu et.al.	2504.09048	link
2025-04-11	FMLGS: Fast Multilevel Language Embedded Gaussians for Part-level Interactive Agents	Xin Tan et.al.	2504.08581	null
2025-04-10	InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians	Kefan Chen et.al.	2504.07949	null
2025-04-10	View-Dependent Uncertainty Estimation of 3D Gaussian Splatting	Chenyu Han et.al.	2504.07370	null
2025-04-09	Wheat3DGS: In-field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting	Daiwei Zhang et.al.	2504.06978	null
2025-04-09	IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments	Can Zhang et.al.	2504.06827	null
2025-04-09	SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering	Hanxiao Sun et.al.	2504.06815	link
2025-04-10	Stochastic Ray Tracing of 3D Transparent Gaussians	Xin Sun et.al.	2504.06598	null
2025-04-08	Micro-splatting: Maximizing Isotropic Constraints for Refined Optimization in 3D Gaussian Splatting	Jee Won Lee et.al.	2504.05740	null
2025-04-07	View-Dependent Deformation Fields for 2D Editing of 3D Models	Martin El Mqirmi et.al.	2504.05544	null
2025-04-07	L3GS: Layered 3D Gaussian Splats for Efficient 3D Scene Delivery	Yi-Zhen Tsai et.al.	2504.05517	link
2025-04-07	Let it Snow! Animating Static Gaussian Scenes With Dynamic Weather Effects	Gal Fiebelman et.al.	2504.05296	null
2025-04-07	PanoDreamer: Consistent Text to 360-Degree Scene Generation	Zhexiao Xiong et.al.	2504.05152	null
2025-04-07	Embracing Dynamics: Dynamics-aware 4D Gaussian Splatting SLAM	Zhicong Sun et.al.	2504.04844	link
2025-04-07	DeclutterNeRF: Generative-Free 3D Scene Recovery for Occlusion Removal	Wanzhou Liu et.al.	2504.04679	null
2025-04-05	3R-GS: Best Practice in Optimizing Camera Poses Along with 3DGS	Zhisheng Huang et.al.	2504.04294	null
2025-04-05	Interpretable Single-View 3D Gaussian Splatting using Unsupervised Hierarchical Disentangled Representation Learning	Yuyang Zhang et.al.	2504.04190	null
2025-04-04	HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration	Boyuan Wang et.al.	2504.03536	null
2025-04-03	Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization	Haishan Wang et.al.	2504.03059	link
2025-04-03	MonoGS++: Fast and Accurate Monocular RGB Gaussian SLAM	Renwu Li et.al.	2504.02437	null
2025-04-03	ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation	Yuan Zhou et.al.	2504.02316	link
2025-04-02	UAVTwin: Neural Digital Twins for UAVs using Gaussian Splatting	Jaehoon Choi et.al.	2504.02158	null
2025-04-02	Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis	Niluthpol Chowdhury Mithun et.al.	2504.01960	null
2025-04-02	BOGausS: Better Optimized Gaussian Splatting	Stéphane Pateux et.al.	2504.01844	null
2025-04-02	FlowR: Flowing from Sparse to Dense 3D Reconstructions	Tobias Fischer et.al.	2504.01647	null
2025-04-02	3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting	Hao Wu et.al.	2504.01619	null
2025-04-02	RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars	Yahui Li et.al.	2504.01559	null
2025-04-02	Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment	Ziteng Cui et.al.	2504.01503	link
2025-04-02	3D Gaussian Inverse Rendering with Approximated Global Illumination	Zirui Wu et.al.	2504.01358	null
2025-04-01	DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting	Hyunwoo Park et.al.	2504.00773	null
2025-04-01	UnIRe: Unsupervised Instance Decomposition for Dynamic Urban Scene Reconstruction	Yunxuan Mao et.al.	2504.00763	null
2025-04-01	Monocular and Generalizable Gaussian Talking Head Animation	Shengjie Gong et.al.	2504.00665	null
2025-03-31	StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting	Shakiba Kheradmand et.al.	2503.24366	null
2025-04-01	Visual Acoustic Fields	Yuelei Li et.al.	2503.24270	null
2025-03-31	DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting	Seungjun Lee et.al.	2503.24210	null
2025-03-31	Learning 3D-Gaussian Simulators from RGB Videos	Mikel Zhobro et.al.	2503.24009	null
2025-03-31	ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image	Tianyi Gong et.al.	2503.23881	null
2025-03-30	Gaussian Blending Unit: An Edge GPU Plug-in for Real-Time Gaussian-Based Rendering in AR/VR	Zhifan Ye et.al.	2503.23625	null
2025-03-30	Enhancing 3D Gaussian Splatting Compression via Spatial Condition-based Prediction	Jingui Ma et.al.	2503.23337	null
2025-03-30	ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning	Zhenyang Liu et.al.	2503.23297	null
2025-03-29	NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations	Zhenyu Tang et.al.	2503.23162	null
2025-03-29	CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction	Yuanyuan Gao et.al.	2503.23044	null
2025-03-28	TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting	Boyang et.al.	2503.22676	null
2025-03-28	AH-GS: Augmented 3D Gaussian Splatting for High-Frequency Detail Representation	Chenyang Xu et.al.	2503.22324	null
2025-03-28	Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance	Haijie Yang et.al.	2503.22225	null
2025-03-28	ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian Splatting	Wenjie Liu et.al.	2503.22218	null
2025-03-31	Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering	Hao Feng et.al.	2503.22159	null
2025-03-27	Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying	Hairong Yin et.al.	2503.21767	null
2025-03-28	LandMarkSystem Technical Report	Zhenxiang Ma et.al.	2503.21364	link
2025-03-27	Frequency-Aware Gaussian Splatting Decomposition	Yishai Lavi et.al.	2503.21226	null
2025-03-26	PGC: Physics-Based Gaussian Cloth from a Single Pose	Michelle Guo et.al.	2503.20779	null
2025-03-26	TC-GS: Tri-plane based compression for 3D Gaussian Splatting	Taorui Wang et.al.	2503.20221	link
2025-03-26	EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis	Sheng Miao et.al.	2503.20168	null
2025-03-25	Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields	Navami Kairanda et.al.	2503.19976	null
2025-03-26	A Survey on Event-driven 3D Reconstruction: Development under Different Categories	Chuanzhi Xu et.al.	2503.19753	null
2025-03-28	GaussianUDF: Inferring Unsigned Distance Functions through 3D Gaussian Splatting	Shujuan Li et.al.	2503.19458	null
2025-03-25	SparseGS-W: Sparse-View 3D Gaussian Splatting in the Wild with Generative Priors	Yiqing Li et.al.	2503.19452	null
2025-03-26	COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting	Jiaxin Zhang et.al.	2503.19443	link
2025-03-25	MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection	Jee Won Lee et.al.	2503.19330	null
2025-03-25	HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting	Xinpeng Liu et.al.	2503.19232	link
2025-03-24	NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting	Yulong Zheng et.al.	2503.18794	null
2025-03-24	GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting	Lijiang Li et.al.	2503.18718	null
2025-03-24	Hardware-Rasterized Ray-Based Gaussian Splatting	Samuel Rota Bulò et.al.	2503.18682	null
2025-03-24	LLGS: Unsupervised Gaussian Splatting for Image Enhancement and Reconstruction in Pure Dark Environment	Haoran Wang et.al.	2503.18640	null
2025-03-25	StableGS: A Floater-Free Framework for 3D Gaussian Splatting	Luchao Wang et.al.	2503.18458	null
2025-03-24	4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video	Qiang Hu et.al.	2503.18421	null
2025-03-24	DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds	Youyu Chen et.al.	2503.18402	null
2025-03-24	GI-SLAM: Gaussian-Inertial SLAM	Xulang Liu et.al.	2503.18275	null
2025-03-23	Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving	Junhao Ge et.al.	2503.18108	link
2025-03-23	PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding	Hongjia Zhai et.al.	2503.18107	null
2025-03-21	TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting	Jianchuan Chen et.al.	2503.17032	null
2025-03-21	DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery	Jiadong Tang et.al.	2503.16964	null
2025-03-21	Optimized Minimal 3D Gaussian Splatting	Joo Chan Lee et.al.	2503.16924	null
2025-03-20	SAGE: Semantic-Driven Adaptive Gaussian Splatting in Extended Reality	Chiara Schiavo et.al.	2503.16747	null
2025-03-20	GauRast: Enhancing GPU Triangle Rasterizers to Accelerate 3D Gaussian Splatting	Sixu Li et.al.	2503.16681	null
2025-03-20	M3: 3D-Spatial MultiModal Memory	Xueyan Zou et.al.	2503.16413	link
2025-03-20	Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images	Shengjun Zhang et.al.	2503.16338	null
2025-03-20	OccluGaussian: Occlusion-Aware Gaussian Splatting for Large Scene Reconstruction and Rendering	Shiyong Liu et.al.	2503.16177	null
2025-03-20	Enhancing Close-up Novel View Synthesis via Pseudo-labeling	Jiatong Xia et.al.	2503.15908	link
2025-03-20	VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling	Hyojun Go et.al.	2503.15855	null
2025-03-20	BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting	Yiren Lu et.al.	2503.15835	null
2025-03-18	HandSplat: Embedding-Driven Gaussian Splatting for High-Fidelity Hand Rendering	Yilan Dong et.al.	2503.14736	null
2025-03-18	Optimized 3D Gaussian Splatting using Coarse-to-Fine Image Frequency Modulation	Umar Farooq et.al.	2503.14475	null
2025-03-18	Improving Adaptive Density Control for 3D Gaussian Splatting	Glenn Grubert et.al.	2503.14274	link
2025-03-18	Lightweight Gradient-Aware Upscaling of 3D Gaussian Splatting Images	Simon Niedermayr et.al.	2503.14171	null
2025-03-18	Light4GS: Lightweight Compact 4D Gaussian Splatting Generation via Context Model	Mufan Liu et.al.	2503.13948	null
2025-03-17	Gaussian On-the-Fly Splatting: A Progressive Framework for Robust Near Real-Time 3DGS Optimization	Yiwei Xu et.al.	2503.13086	null
2025-03-17	CAT-3DGS Pro: A New Benchmark for Efficient 3DGS Compression	Yu-Ting Zhan et.al.	2503.12862	null
2025-03-17	CompMarkGS: Robust Watermarking for Compression 3D Gaussian Splatting	Sumin In et.al.	2503.12836	null
2025-03-17	AV-Surf: Surface-Enhanced Geometry-Aware Novel-View Acoustic Synthesis	Hadam Baek et.al.	2503.12806	null
2025-03-16	SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs	Guibiao Liao et.al.	2503.12535	null
2025-03-16	VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting	Songen Gu et.al.	2503.12383	null
2025-03-18	GS-I$^{3}$: Gaussian Splatting for Surface Reconstruction from Illumination-Inconsistent Images	Tengfei Wang et.al.	2503.12335	link
2025-03-16	Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene	Jiahao Wu et.al.	2503.12307	null
2025-03-18	3D Gaussian Splatting against Moving Objects for High-Fidelity Street Scene Reconstruction	Peizhen Zheng et.al.	2503.12001	link
2025-03-15	DynaGSLAM: Real-Time Gaussian-Splatting SLAM for Online Rendering, Tracking, Motion Predictions of Moving Objects in Dynamic Scenes	Runfa Blark Li et.al.	2503.11979	null
2025-03-14	Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information	Xuanqi Zhang et.al.	2503.11601	null
2025-03-14	EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting	Di Li et.al.	2503.11345	null
2025-03-14	Uncertainty-Aware Normal-Guided Gaussian Splatting for Surface Reconstruction from Sparse Image Sequences	Zhen Tan et.al.	2503.11172	null
2025-03-13	LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds	Lingteng Qiu et.al.	2503.10625	link
2025-03-13	VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames	Zhiqi Li et.al.	2503.10286	null
2025-03-13	ROODI: Reconstructing Occluded Objects with Denoising Inpainters	Yeonjin Chang et.al.	2503.10256	null
2025-03-15	3D Student Splatting and Scooping	Jialin Zhu et.al.	2503.10148	link
2025-03-13	GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping	Jinfeng Liu et.al.	2503.10143	null
2025-03-12	Physics-Aware Human-Object Rendering from Sparse Views via 3D Gaussian Splatting	Weiquan Wang et.al.	2503.09640	null
2025-03-12	Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation	Máté Tóth et.al.	2503.09464	null
2025-03-12	Online Language Splatting	Saimouli Katragadda et.al.	2503.09447	null
2025-03-12	Close-up-GS: Enhancing Close-Up View Synthesis in 3D Gaussian Splatting with Progressive Self-Training	Jiatong Xia et.al.	2503.09396	null
2025-03-11	PCGS: Progressive Compression of 3D Gaussian Splatting	Yihang Chen et.al.	2503.08511	link
2025-03-11	HRAvatar: High-Quality and Relightable Gaussian Head Avatar	Dongbin Zhang et.al.	2503.08224	null
2025-03-11	S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction	Guangting Zheng et.al.	2503.08217	null
2025-03-11	Dynamic Scene Reconstruction: Recent Advance in Real-time Rendering and Streaming	Jiaxuan Zhu et.al.	2503.08166	null
2025-03-11	ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting	Junfu Guo et.al.	2503.08135	null
2025-03-13	MVGSR: Multi-View Consistency Gaussian Splatting for Robust Surface Reconstruction	Chenfeng Hou et.al.	2503.08093	null
2025-03-11	GigaSLAM: Large-Scale Monocular SLAM with Hierachical Gaussian Splats	Kai Deng et.al.	2503.08071	link
2025-03-11	7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting	Zhongpai Gao et.al.	2503.07946	null
2025-03-10	POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality	Joey Wilson et.al.	2503.07819	null
2025-03-10	SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting	Jiahui Zhang et.al.	2503.07476	null
2025-03-10	EigenGS Representation: From Eigenspace to Gaussian Image Space	Lo-Wei Tai et.al.	2503.07446	null
2025-03-10	All That Glitters Is Not Gold: Key-Secured 3D Secrets within 3D Gaussian Splatting	Yan Ren et.al.	2503.07191	link
2025-03-10	Frequency-Aware Density Control via Reparameterization for High-Quality Rendering of 3D Gaussian Splatting	Zhaojie Zeng et.al.	2503.07000	link
2025-03-09	REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints	Di Wu et.al.	2503.06677	null
2025-03-09	StructGS: Adaptive Spherical Harmonics and Rendering Enhancements for Superior 3D Gaussian Splatting	Zexu Huang et.al.	2503.06462	null
2025-03-08	SplatTalk: 3D VQA with Gaussian Splatting	Anh Thai et.al.	2503.06271	null
2025-03-08	StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams	Yang LI et.al.	2503.06235	null
2025-03-08	ForestSplats: Deformable transient field for Gaussian Splatting in the Wild	Wongi Park et.al.	2503.06179	null
2025-03-08	Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction	Kai Li et.al.	2503.06161	null
2025-03-07	Free Your Hands: Lightweight Relightable Turntable Capture Pipeline	Jiahui Fan et.al.	2503.05511	null
2025-03-07	LiDAR-enhanced 3D Gaussian Splatting Mapping	Jian Shen et.al.	2503.05425	null
2025-03-07	Self-Modeling Robots by Photographing	Kejun Hu et.al.	2503.05398	null
2025-03-07	CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images	Jungho Lee et.al.	2503.05332	link
2025-03-07	STGA: Selective-Training Gaussian Head Avatars	Hanzhi Guo et.al.	2503.05196	null
2025-03-07	MGSR: 2D/3D Mutual-boosted Gaussian Splatting for High-fidelity Surface Reconstruction under Various Light Conditions	Qingyuan Zhou et.al.	2503.05182	null
2025-03-07	SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting	Linqi Yang et.al.	2503.05174	null
2025-03-07	SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting	Xiaotong Huang et.al.	2503.05168	null
2025-03-07	EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation	Chao Zhang et.al.	2503.05162	null
2025-03-07	GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting	Zheng Zhou et.al.	2503.05161	null
2025-03-06	S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting	Yecong Wan et.al.	2503.04314	null
2025-03-06	Instrument-Splatting: Controllable Photorealistic Reconstruction of Surgical Instruments Using Gaussian Splatting	Shuojue Yang et.al.	2503.04082	null
2025-03-06	Beyond Existance: Fulfill 3D Reconstructed Scenes with Pseudo Details	Yifei Gao et.al.	2503.04037	null
2025-03-06	GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding	Xihan Wang et.al.	2503.04034	null
2025-03-06	GRaD-Nav: Efficiently Learning Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics	Qianzhong Chen et.al.	2503.03984	null
2025-03-04	2DGS-Avatar: Animatable High-fidelity Clothed Avatar via 2D Gaussian Splatting	Qipeng Yan et.al.	2503.02452	null
2025-03-04	DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting	Haoyuan Li et.al.	2503.02223	link
2025-03-03	Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization	Jamie Wynn et.al.	2503.02009	null
2025-03-03	Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models	Jay Zhangjie Wu et.al.	2503.01774	null
2025-03-03	OpenGS-SLAM: Open-Set Dense Semantic SLAM with 3D Gaussian Splatting for Object-Level Scene Understanding	Dianyi Yang et.al.	2503.01646	null
2025-03-03	FGS-SLAM: Fourier-based Gaussian Splatting for Real-time SLAM with Sparse and Dense Map Fusion	Yansong Xu et.al.	2503.01109	null
2025-03-02	Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization	You Shen et.al.	2503.00881	null
2025-03-02	Vid2Fluid: 3D Dynamic Fluid Assets from Single-View Videos with Generative Gaussian Splatting	Zhiwei Zhao et.al.	2503.00868	null
2025-03-02	PSRGS: Progressive Spectral Residual of 3D Gaussian for High-Frequency Recovery	BoCheng Li et.al.	2503.00848	null
2025-03-02	DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting	Liao Shen et.al.	2503.00746	null
2025-03-03	FlexDrive: Toward Trajectory Flexibility in Driving Scene Reconstruction and Rendering	Jingqiu Zhou et.al.	2502.21093	null
2025-02-28	EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering	John J. Han et.al.	2502.20669	null
2025-02-27	No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics	Dongbo Shi et.al.	2502.19800	null
2025-02-27	Open-Vocabulary Semantic Part Segmentation of 3D Human	Keito Suzuki et.al.	2502.19782	null
2025-02-26	Compression in 3D Gaussian Splatting: A Survey of Methods, Trends, and Future Directions	Muhammad Salman Ali et.al.	2502.19457	null
2025-02-26	Does 3D Gaussian Splatting Need Accurate Volumetric Rendering?	Adam Celarek et.al.	2502.19318	link
2025-02-28	OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation	Yunpeng Gao et.al.	2502.18041	null
2025-02-27	UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting	Haoyuan Li et.al.	2502.17860	null
2025-02-24	Laplace-Beltrami Operator for Gaussian Splatting	Hongyu Zhou et.al.	2502.17531	null
2025-02-24	Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting	Chong Cheng et.al.	2502.17377	null
2025-02-24	VR-Pipe: Streamlining Hardware Graphics Pipeline for Volume Rendering	Junseo Lee et.al.	2502.17078	null
2025-02-23	Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration	Kim Jun-Seong et.al.	2502.16652	null
2025-02-23	Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control	Jinbo Yan et.al.	2502.16475	null
2025-02-21	RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes	Sicheng Yu et.al.	2502.15633	null
2025-02-20	GS-Cache: A GS-Cache Inference Framework for Large-scale Gaussian Splatting Models	Miao Tao et.al.	2502.14938	null
2025-02-20	Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting	Boying Li et.al.	2502.14931	null
2025-02-20	CDGS: Confidence-Aware Depth Regularization for 3D Gaussian Splatting	Qilin Zhang et.al.	2502.14684	link
2025-02-20	OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving	Yedong Shen et.al.	2502.14235	null
2025-02-19	GlossGau: Efficient Inverse Rendering for Glossy Surface with Anisotropic Spherical Gaussian	Bang Du et.al.	2502.14129	null
2025-02-19	3D Gaussian Splatting aided Localization for Large and Complex Indoor-Environments	Vincent Ress et.al.	2502.13803	null
2025-02-18	RadSplatter: Extending 3D Gaussian Splatting to Radio Frequencies for Wireless Radiomap Extrapolation	Yiheng Wang et.al.	2502.12686	null
2025-02-17	3D Gaussian Inpainting with Depth-Guided Cross-View Consistency	Sheng-Yu Huang et.al.	2502.11801	null
2025-02-17	Exploring the Versal AI Engine for 3D Gaussian Splatting	Kotaro Shimamura et.al.	2502.11782	null
2025-02-17	GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text	Gyumin Shim et.al.	2502.11642	null
2025-02-16	OMG: Opacity Matters in Material Modeling with Gaussian Splatting	Silong Yong et.al.	2502.10988	null
2025-02-16	GS-GVINS: A Tightly-integrated GNSS-Visual-Inertial Navigation System Augmented by 3D Gaussian Splatting	Zelin Zhou et.al.	2502.10975	null
2025-02-15	E-3DGS: Event-Based Novel View Rendering of Large-Scale Scenes Using 3D Gaussian Splatting	Sohaib Zahid et.al.	2502.10827	null
2025-02-13	X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks	Zihang Cheng et.al.	2502.10475	null
2025-02-12	Interactive Holographic Visualization for 3D Facial Avatar	Tri Tung Nguyen Nguyen et.al.	2502.08085	null
2025-02-11	TranSplat: Surface Embedding-guided 3D Gaussian Splatting for Transparent Object Manipulation	Jeongyun Kim et.al.	2502.07840	link
2025-02-11	Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors	Lin-Zhuo Chen et.al.	2502.07615	null
2025-02-05	GARAD-SLAM: 3D GAussian splatting for Real-time Anti Dynamic SLAM	Mingrui Li et.al.	2502.03228	null
2025-02-05	GP-GS: Gaussian Processes for Enhanced Gaussian Splatting	Zhihao Guo et.al.	2502.02283	link
2025-02-04	LAYOUTDREAMER: Physics-guided Layout for Text-to-3D Compositional Scene Generation	Yang Zhou et.al.	2502.01949	null
2025-02-11	UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping	Aashish Rai et.al.	2502.01846	null
2025-02-03	Scalable 3D Gaussian Splatting-Based RF Signal Spatial Propagation Modeling	Kang Yang et.al.	2502.01826	null
2025-02-03	VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion	Shaoting Zhu et.al.	2502.01536	null
2025-02-02	EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis	Junuk Cha et.al.	2502.00654	null
2025-01-31	Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation	Rohan Chacko et.al.	2502.00173	null
2025-01-31	Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping	Yiming Huang et.al.	2501.19319	link
2025-01-31	RaySplats: Ray Tracing based Gaussian Splatting	Krzysztof Byrski et.al.	2501.19196	link
2025-01-31	JGHand: Joint-Driven Animatable Hand Avater via 3D Gaussian Splatting	Zhoutao Sun et.al.	2501.19088	null
2025-01-30	Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting	Yansong Qu et.al.	2501.18672	null
2025-01-29	3D Reconstruction of Shoes for Augmented Reality	Pratik Shrestha et.al.	2501.18643	null
2025-01-31	VoD-3DGS: View-opacity-Dependent 3D Gaussian Splatting	Mateusz Nowak et.al.	2501.17978	null
2025-01-29	CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering	Xiaohan Sun et.al.	2501.17792	link
2025-01-29	FeatureGS: Eigenvalue-Feature Optimization in 3D Gaussian Splatting for Geometrically Accurate and Artifact-Reduced Reconstruction	Miriam Jäger et.al.	2501.17655	null
2025-01-28	Evaluating CrowdSplat: Perceived Level of Detail for Gaussian Crowds	Xiaohan Sun et.al.	2501.17085	null
2025-01-28	DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation	Chenguo Lin et.al.	2501.16764	null
2025-01-25	Towards Better Robustness: Progressively Joint Pose-3DGS Learning for Arbitrarily Long Videos	Zhen-Hui Dong et.al.	2501.15096	null
2025-01-25	HuGDiffusion: Generalizable Single-Image Human Rendering via 3D Gaussian Diffusion	Yingzhi Tang et.al.	2501.15008	null
2025-01-24	HAMMER: Heterogeneous, Multi-Robot Semantic Gaussian Splatting	Javier Yu et.al.	2501.14147	null
2025-01-27	3DGS$^2$: Near Second-order Converging 3D Gaussian Splatting	Lei Lan et.al.	2501.13975	null
2025-01-23	GoDe: Gaussians on Demand for Progressive Level of Detail and Scalable Compression	Francesco Di Sario et.al.	2501.13558	null
2025-01-23	MultiDreamer3D: Multi-concept 3D Customization with Concept-Aware Diffusion Guidance	Wooseok Song et.al.	2501.13449	null
2025-01-23	GeomGS: LiDAR-Guided Geometry-Aware Gaussian Splatting for Robot Localization	Jaewon Lee et.al.	2501.13417	null
2025-01-23	VIGS SLAM: IMU-based Large-Scale 3D Gaussian Splatting SLAM	Gyuhyeon Pak et.al.	2501.13402	null
2025-01-23	Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos	Xianrui Luo et.al.	2501.13335	null
2025-01-22	Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made Scenes	Yuang Shi et.al.	2501.13045	null
2025-01-21	DARB-Splatting: Generalizing Splatting with Decaying Anisotropic Radial Basis Functions	Vishagar Arunan et.al.	2501.12369	null
2025-01-22	HAC++: Towards 100X Compression of 3D Gaussian Splatting	Yihang Chen et.al.	2501.12255	link
2025-01-22	GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting	Longan Wang et.al.	2501.12060	null
2025-01-20	See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization	Zongqi He et.al.	2501.11508	null
2025-01-19	RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering	Chenlu Zhan et.al.	2501.11102	null
2025-01-15	BloomScene: Lightweight Structured 3D Gaussian Splatting for Crossmodal Scene Generation	Xiaolu Hou et.al.	2501.10462	link
2025-01-20	GSTAR: Gaussian Surface Tracking and Reconstruction	Chengwei Zheng et.al.	2501.10283	null
2025-01-16	Creating Virtual Environments with 3D Gaussian Splatting: A Comparative Study	Shi Qiu et.al.	2501.09302	null
2025-01-15	CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation	Qi Ma et.al.	2501.08982	null
2025-01-15	GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping	Sheng Hong et.al.	2501.08672	null
2025-01-14	3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering	Meenakshi Krishnan et.al.	2501.08370	null
2025-01-13	UnCommon Objects in 3D	Xingchen Liu et.al.	2501.07574	link
2025-01-13	3DGS-to-PC: Convert a 3D Gaussian Splatting Scene into a Dense Point Cloud or Mesh	Lewis A G Stuart et.al.	2501.07478	link
2025-01-14	SplatMAP: Online Dense Monocular SLAM with 3D Gaussian Splatting	Yue Hu et.al.	2501.07015	null
2025-01-12	Synthetic Prior for Few-Shot Drivable Head Avatar Inversion	Wojciech Zielonka et.al.	2501.06903	null
2025-01-12	ActiveGAMER: Active GAussian Mapping through Efficient Rendering	Liyan Chen et.al.	2501.06897	null
2025-01-11	NVS-SQA: Exploring Self-Supervised Quality Representation Learning for Neurally Synthesized Scenes without References	Qiang Qu et.al.	2501.06488	link
2025-01-10	Locality-aware Gaussian Compression for Fast and High-quality Rendering	Seungjoo Shin et.al.	2501.05757	null
2025-01-13	Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance	Dimitrios Gerogiannis et.al.	2501.05379	null
2025-01-09	Scaffold-SLAM: Structured 3D Gaussians for Simultaneous Localization and Photorealistic Mapping	Wen Tianci et.al.	2501.05242	null
2025-01-08	GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting	Andrew Bond et.al.	2501.04782	null
2025-01-07	MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting	Sangwoon Kwak et.al.	2501.03714	null
2025-01-07	DehazeGS: Seeing Through Fog with 3D Gaussian Splatting	Jinze Yu et.al.	2501.03659	null
2025-01-07	ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting	Yifeng Yang et.al.	2501.03605	link
2025-01-06	Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs	Soonbin Lee et.al.	2501.03399	null
2025-01-06	HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation	Wentian Qu et.al.	2501.02845	null
2025-01-03	Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision	Alberta Longhini et.al.	2501.01715	null
2025-01-03	CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction	Chenhao Zhang et.al.	2501.01695	null
2025-01-03	PG-SAG: Parallel Gaussian Splatting for Fine-Grained Large-Scale Urban Buildings Reconstruction via Semantic-Aware Grouping	Tengfei Wang et.al.	2501.01677	link
2025-01-02	Deformable Gaussian Splatting for Efficient and High-Fidelity Reconstruction of Surgical Scenes	Jiwei Shan et.al.	2501.01101	null
2025-01-02	EasySplat: View-Adaptive Learning makes 3D Gaussian Splatting Easy	Ao Gao et.al.	2501.01003	null
2024-12-31	PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM	Runnan Chen et.al.	2501.00352	null
2024-12-31	SG-Splatting: Accelerating 3D Gaussian Splatting with Spherical Gaussians	Yiwen Wang et.al.	2501.00342	null
2024-12-30	PERSE: Personalized 3D Generative Avatars from A Single Portrait	Hyunsoo Cha et.al.	2412.21206	null
2024-12-30	KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences	Keng-Wei Chang et.al.	2412.20767	null
2024-12-29	MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks	Yifei Liu et.al.	2412.20522	link
2024-12-28	DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis	Kaijun Deng et.al.	2412.20148	link
2024-12-28	GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting	Atticus J. Zeller et.al.	2412.20056	link
2024-12-27	Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images	Xudong Cai et.al.	2412.19518	null
2024-12-27	Learning Radiance Fields from a Single Snapshot Compressive Image	Yunhao Li et.al.	2412.19483	null
2024-12-26	BeSplat – Gaussian Splatting from a Single Blurry Image and Event Stream	Gopi Raju Matta et.al.	2412.19370	link
2024-12-26	Generating Editable Head Avatars with 3D Gaussian GANs	Guohao Li et.al.	2412.19149	link
2024-12-26	CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting	Siyu Jiao et.al.	2412.19142	null
2024-12-26	MVS-GS: High-Quality 3D Gaussian Splatting Mapping via Online Multi-View Stereo	Byeonggwon Lee et.al.	2412.19130	null
2024-12-25	WeatherGS: 3D Scene Reconstruction in Adverse Weather Conditions via Gaussian Splatting	Chenghao Qian et.al.	2412.18862	link
2024-12-25	GSAVS: Gaussian Splatting-based Autonomous Vehicle Simulator	Rami Wilson et.al.	2412.18816	null
2024-12-25	ArtNVG: Content-Style Separated Artistic Neighboring-View Gaussian Stylization	Zixiao Gu et.al.	2412.18783	null
2024-12-24	RSGaussian: 3D Gaussian Splatting with LiDAR for Aerial Remote Sensing Novel View Synthesis	Yiling Yao et.al.	2412.18380	null
2024-12-23	GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance	Jingqiu Zhou et.al.	2412.17715	null
2024-12-23	CoSurfGS: Collaborative 3D Surface Gaussian Splatting with Distributed Learning for Large Scene Reconstruction	Yuanyuan Gao et.al.	2412.17612	null
2024-12-23	Balanced 3DGS: Gaussian-wise Parallelism Rendering with Fine-Grained Tiling	Hao Gui et.al.	2412.17378	null
2024-12-22	GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs	Xingrui Wang et.al.	2412.16932	link
2024-12-22	GeoTexDensifier: Geometry-Texture-Aware Densification for High-Quality Photorealistic 3D Gaussian Splatting	Hanqing Jiang et.al.	2412.16809	null
2024-12-21	Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity	Tianqi Shen et.al.	2412.16619	link
2024-12-21	OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities	Suyoung Lee et.al.	2412.16604	null
2024-12-20	Interactive Scene Authoring with Specialized Generative Primitives	Clément Jambon et.al.	2412.16253	null
2024-12-20	CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images	Jungho Lee et.al.	2412.16028	null
2024-12-20	AvatarPerfect: User-Assisted 3D Gaussian Splatting Avatar Refinement with Automatic Pose Suggestion	Jotaro Sakamiya et.al.	2412.15609	null
2024-12-20	EGSRAL: An Enhanced 3D Gaussian Splatting based Renderer with Automated Labeling for Large-Scale Driving Scene	Yixiong Huo et.al.	2412.15550	link
2024-12-19	GSRender: Deduplicated Occupancy Prediction via Weakly Supervised 3D Gaussian Splatting	Qianpu Sun et.al.	2412.14579	null
2024-12-19	Improving Geometry in Sparse-View 3DGS via Reprojection-based DoF Separation	Yongsung Kim et.al.	2412.14568	null
2024-12-18	GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians	Xiaobao Wei et.al.	2412.13983	link
2024-12-18	GAGS: Granularity-Aware Feature Distillation for Language Gaussian Splatting	Yuning Peng et.al.	2412.13654	null
2024-12-18	4D Radar-Inertial Odometry based on Gaussian Modeling and Multi-Hypothesis Scan Matching	Fernando Amodeo et.al.	2412.13639	link
2024-12-18	Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields	Tao Lu et.al.	2412.13547	null
2024-12-18	Vivar: A Generative AR System for Intuitive Multi-Modal Sensor Data Presentation	Yunqi Guo et.al.	2412.13509	null
2024-12-17	CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image	Wonseok Roh et.al.	2412.12906	null
2024-12-17	HyperGS: Hyperspectral 3D Gaussian Splatting	Christopher Thirgood et.al.	2412.12849	null
2024-12-17	3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting	Qi Wu et.al.	2412.12507	link
2024-12-16	Wonderland: Navigating 3D Scenes from a Single Image	Hanwen Liang et.al.	2412.12091	null
2024-12-16	SweepEvGS: Event-Based 3D Gaussian Splatting for Macro and Micro Radiance Field Rendering from a Single Sweep	Jingqian Wu et.al.	2412.11579	null
2024-12-16	EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting	Dong In Lee et.al.	2412.11520	null
2024-12-14	DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting	Luis Wiedmann et.al.	2412.10972	link
2024-12-13	SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians	Siyun Liang et.al.	2412.10231	null
2024-12-18	SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video	Jongmin Park et.al.	2412.09982	null
2024-12-13	RP-SLAM: Real-time Photorealistic SLAM with Efficient 3D Gaussian Splatting	Lizhi Bai et.al.	2412.09868	null
2024-12-12	PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields	Sean Wu et.al.	2412.09680	link
2024-12-12	LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors	Yabo Chen et.al.	2412.09597	null
2024-12-12	LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting	Haotian Mao et.al.	2412.09176	null
2024-12-10	Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians	Yixuan Li et.al.	2412.07660	null
2024-12-10	Faster and Better 3D Splatting via Group Training	Chengbo Wang et.al.	2412.07608	null
2024-12-10	ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery	Yanzhe Lyu et.al.	2412.07494	null
2024-12-10	EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering	Toshiya Yura et.al.	2412.07293	null
2024-12-09	Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video	Renlong Wu et.al.	2412.06424	link
2024-12-09	4D Gaussian Splatting with Scale-aware Residual Field and Adaptive Optimization for Real-time Rendering of Temporally Complex Dynamic Scenes	Jinbo Yan et.al.	2412.06299	null
2024-12-12	Advancing Extended Reality with 3D Gaussian Splatting: Innovations and Prospects	Shi Qiu et.al.	2412.06257	null
2024-12-09	Splatter-360: Generalizable 360$^{\circ}$ Gaussian Splatting for Wide-baseline Panoramic Images	Zheng Chen et.al.	2412.06250	link
2024-12-09	Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction	Seungtae Nam et.al.	2412.06234	null
2024-12-07	Temporally Compressed 3D Gaussian Splatting for Dynamic Scenes	Saqib Javed et.al.	2412.05700	null
2024-12-07	WATER-GS: Toward Copyright Protection for 3D Gaussian Splatting via Universal Watermarking	Yuqi Tan et.al.	2412.05695	null
2024-12-07	Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis	Diwen Wan et.al.	2412.05570	null
2024-12-07	Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation	Wenqing Wang et.al.	2412.05560	null
2024-12-07	Radiant: Large-scale 3D Gaussian Rendering based on Hierarchical Framework	Haosong Peng et.al.	2412.05546	null
2024-12-06	Extrapolated Urban View Synthesis Benchmark	Xiangyu Han et.al.	2412.05256	link
2024-12-06	MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting	Peng Chen et.al.	2412.04955	link
2024-12-06	Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction	Jixuan Fan et.al.	2412.04887	link
2024-12-06	WRF-GS: Wireless Radiation Field Reconstruction with 3D Gaussian Splatting	Chaozheng Wen et.al.	2412.04832	link
2024-12-06	Pushing Rendering Boundaries: Hard Gaussian Splatting	Qingshan Xu et.al.	2412.04826	null
2024-12-05	QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos	Sharath Girish et.al.	2412.04469	null
2024-12-06	PBDyG: Position Based Dynamic Gaussians for Motion-Aware Clothed Human Avatars	Shota Sasaki et.al.	2412.04433	null
2024-12-05	Multi-View Pose-Agnostic Change Localization with Zero Labels	Chamuditha Jayanga Galappaththige et.al.	2412.03911	link
2024-12-05	HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting	Jingyu Lin et.al.	2412.03844	link
2024-12-04	Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos	Hanxue Liang et.al.	2412.03526	null
2024-12-04	2DGS-Room: Seed-Guided 2D Gaussian Splatting with Geometric Constrains for High-Fidelity Indoor Scene Reconstruction	Wanting Zhang et.al.	2412.03428	null
2024-12-04	Volumetrically Consistent 3D Gaussian Rasterization	Chinmay Talegaonkar et.al.	2412.03378	link
2024-12-04	SGSST: Scaling Gaussian Splatting StyleTransfer	Bruno Galerne et.al.	2412.03371	link
2024-12-04	Splats in Splats: Embedding Invisible 3D Watermark within Gaussian Splatting	Yijia Guo et.al.	2412.03121	null
2024-12-03	Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects	Abdurrahman Zeybey et.al.	2412.02803	null
2024-12-03	RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians	Qiankun Gao et.al.	2412.02493	link
2024-12-03	GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos	Zhiyuan Chen et.al.	2412.02267	null
2024-12-03	Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance	Jing Zeng et.al.	2412.02249	null
2024-12-03	How to Use Diffusion Priors under Sparse Views?	Qisen Wang et.al.	2412.02225	link
2024-12-03	SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images	Junqiu Yu et.al.	2412.02140	null
2024-12-03	Gaussian Object Carver: Object-Compositional Gaussian Splatting with surfaces completion	Liu Liu et.al.	2412.02075	link
2024-12-02	Occam’s LGS: A Simple Approach for Language Gaussian Splatting	Jiahuan Cheng et.al.	2412.01807	null
2024-12-02	CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion	Kai He et.al.	2412.01792	null
2024-12-02	Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes	Lihan Jiang et.al.	2412.01745	null
2024-12-02	HUGSIM: A Real-Time, Photo-Realistic and Closed-Loop Simulator for Autonomous Driving	Hongyu Zhou et.al.	2412.01718	null
2024-12-02	GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting	Zixuan Chen et.al.	2411.19895	link
2024-11-29	TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting	Bojun Xiong et.al.	2411.19654	link
2024-11-29	Tortho-Gaussian: Splatting True Digital Orthophoto Maps	Xin Wang et.al.	2411.19594	null
2024-11-29	Gaussian Splashing: Direct Volumetric Rendering Underwater	Nir Mualem et.al.	2411.19588	null
2024-11-29	Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding	Wenbo Zhang et.al.	2411.19551	link
2024-12-02	GausSurf: Geometry-Guided 3D Gaussian Splatting for Surface Reconstruction	Jiepeng Wang et.al.	2411.19454	null
2024-11-29	RF-3DGS: Wireless Channel Modeling with Radio Radiance Field and 3D Gaussian Splatting	Lihao Zhang et.al.	2411.19420	link
2024-11-28	InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception	Haijie Li et.al.	2411.19235	null
2024-11-28	Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes	Thomas Wimmer et.al.	2411.19233	link
2024-11-28	RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning	Jiacheng Wang et.al.	2411.18866	null
2024-11-27	Textured Gaussians for Enhanced 3D Scene Appearance Modeling	Brian Chao et.al.	2411.18625	null
2024-11-27	PhyCAGE: Physically Plausible Compositional 3D Asset Generation from a Single Image	Han Yan et.al.	2411.18548	null
2024-11-27	HEMGS: A Hybrid Entropy Model for 3D Gaussian Splatting Data Compression	Lei Liu et.al.	2411.18473	null
2024-11-27	Neural Surface Priors for Editable Gaussian Splatting	Jakub Szymkowiak et.al.	2411.18311	link
2024-11-27	Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters	Zhiyang Guo et.al.	2411.18197	null
2024-11-27	GLS: Geometry-aware 3D Language Gaussian Splatting	Jiaxiong Qiu et.al.	2411.18066	link
2024-11-27	HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction	Wei Zhang et.al.	2411.17982	link
2024-11-26	DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting	Christian Homeyer et.al.	2411.17660	link
2024-11-26	Distractor-free Generalizable 3D Gaussian Splatting	Yanqi Bao et.al.	2411.17605	link
2024-11-28	SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting	Gyeongjin Kang et.al.	2411.17190	null
2024-11-25	G2SDF: Surface Reconstruction from Explicit Gaussians with Implicit SDFs	Kunyi Li et.al.	2411.16898	null
2024-11-25	PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence	Zequn Chen et.al.	2411.16877	null
2024-11-25	SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving	Georg Hess et.al.	2411.16816	link
2024-11-25	SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis	Hyojun Go et.al.	2411.16443	link
2024-11-25	Quadratic Gaussian Splatting for Efficient and Detailed Surface Reconstruction	Ziyu Zhang et.al.	2411.16392	null
2024-11-25	Event-boosted Deformable 3D Gaussians for Fast Dynamic Scene Reconstruction	Wenhao Xu et.al.	2411.16180	null
2024-11-24	ZeroGS: Training 3D Gaussian Splatting from Unposed Images	Yu Chen et.al.	2411.15779	null
2024-11-24	GSurf: 3D Reconstruction via Signed Distance Fields with Direct Gaussian Supervision	Xu Baixin et.al.	2411.15723	link
2024-11-23	Gassidy: Gaussian Splatting SLAM in Dynamic Environments	Long Wen et.al.	2411.15476	null
2024-11-23	SplatSDF: Boosting Neural Implicit SDF via Gaussian Splatting Fusion	Runfa Blark Li et.al.	2411.15468	null
2024-11-22	UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations	Yuan Ren et.al.	2411.15355	null
2024-11-22	3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes	Jan Held et.al.	2411.14974	link
2024-11-22	Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly Training for 4D Reconstruction	Zhening Liu et.al.	2411.14847	null
2024-11-22	VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving	Haiming Zhang et.al.	2411.14716	null
2024-11-21	NexusSplats: Efficient 3D Gaussian Splatting in the Wild	Yuzhou Tang et.al.	2411.14514	null
2024-11-21	Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation	Zhuoman Liu et.al.	2411.14423	null
2024-11-21	SplatR: Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching	Arjun P S et.al.	2411.14322	link
2024-11-20	Generating 3D-Consistent Videos from Unposed Internet Photos	Gene Chou et.al.	2411.13549	null
2024-11-20	GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting	Xiaobao Wei et.al.	2411.12981	null
2024-11-19	PR-ENDO: Physically Based Relightable Gaussian Splatting for Endoscopy	Joanna Kaleta et.al.	2411.12510	link
2024-11-19	SCIGS: 3D Gaussians Splatting from a Snapshot Compressive Image	Zixu Wang et.al.	2411.12471	null
2024-11-20	Beyond Gaussians: Fast and High-Fidelity 3D Splatting with Linear Kernels	Haodong Chen et.al.	2411.12440	null
2024-11-19	LiV-GS: LiDAR-Vision Integration for 3D Gaussian Splatting SLAM in Outdoor Environments	Renxiang Xiao et.al.	2411.12185	null
2024-11-19	Sketch-guided Cage-based 3D Gaussian Splatting Deformation	Tianhao Xie et.al.	2411.12168	null
2024-11-21	FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting	Fangyu Wu et.al.	2411.12089	null
2024-11-18	TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction	DaDong Jiang et.al.	2411.11941	null
2024-11-18	DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes	Chensheng Peng et.al.	2411.11921	link
2024-11-18	RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator	Xinhai Li et.al.	2411.11839	null
2024-11-18	GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views	Boyao Zhou et.al.	2411.11363	null
2024-11-17	VeGaS: Video Gaussian Splatting	Weronika Smolak-Dyżewska et.al.	2411.11024	link
2024-11-15	The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods	Yifu Tao et.al.	2411.10546	null
2024-11-15	USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting	Kang Chen et.al.	2411.10504	link
2024-11-15	Efficient Density Control for 3D Gaussian Splatting	Xiaobin Deng et.al.	2411.10133	link
2024-11-15	GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization	Yanhao Sun et.al.	2411.10033	null
2024-11-15	GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video	Jingxuan Chen et.al.	2411.09952	link
2024-11-14	Adversarial Attacks Using Differentiable Rendering: A Survey	Matthew Hull et.al.	2411.09749	null
2024-11-14	DyGASR: Dynamic Generalized Exponential Splatting with Surface Alignment for Accelerated 3D Mesh Reconstruction	Shengchao Zhao et.al.	2411.09156	null
2024-11-13	Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models	Chengdong Dong et.al.	2411.08642	null
2024-11-13	Biomass phenotyping of oilseed rape through UAV multi-view oblique imaging with 3DGS and SAM model	Yutao Shen et.al.	2411.08453	null
2024-11-13	MBA-SLAM: Motion Blur Aware Dense Visual SLAM with Radiance Fields Representation	Peng Wang et.al.	2411.08279	link
2024-11-14	Projecting Gaussian Ellipsoids While Avoiding Affine Projection Approximation	Han Qi et.al.	2411.07579	null
2024-11-12	GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting	Umangi Jain et.al.	2411.07555	null
2024-11-12	HiCoM: Hierarchical Coherent Motion for Streamable Dynamic Scene with 3D Gaussian Splatting	Qiankun Gao et.al.	2411.07541	link
2024-11-12	GUS-IR: Gaussian Splatting with Unified Shading for Inverse Rendering	Zhihao Liang et.al.	2411.07478	null
2024-11-11	A Hierarchical Compression Technique for 3D Gaussian Splatting Compression	He Huang et.al.	2411.06976	null
2024-11-10	Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction	Decai Chen et.al.	2411.06602	null
2024-11-12	SplatFormer: Point Transformer for Robust 3D Gaussian Splatting	Yutong Chen et.al.	2411.06390	link
2024-11-10	Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field	Liuyue Xie et.al.	2411.06365	null
2024-11-09	AI-Driven Stylization of 3D Environments	Yuanbo Chen et.al.	2411.06067	null
2024-11-09	GaussianSpa: An “Optimizing-Sparsifying” Simplification Framework for Compact and High-Quality 3D Gaussian Splatting	Yangming Zhang et.al.	2411.06019	null
2024-11-07	ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing	Jun-Kun Chen et.al.	2411.05006	null
2024-11-07	MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views	Yuedong Chen et.al.	2411.04924	link
2024-11-08	GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting	Jilan Mei et.al.	2411.03807	null
2024-11-06	3DGS-CD: 3D Gaussian Splatting-based Change Detection for Physical Object Rearrangement	Ziqi Lu et.al.	2411.03706	link
2024-11-06	Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis	Rui Peng et.al.	2411.03637	link
2024-11-05	Object and Contact Point Tracking in Demonstrations Using 3D Gaussian Splatting	Michael Büttner et.al.	2411.03555	null
2024-11-05	HFGaussian: Learning Generalizable Gaussian Human with Integrated Human Features	Arnab Dey et.al.	2411.03086	null
2024-11-05	LVI-GS: Tightly-coupled LiDAR-Visual-Inertial SLAM using 3D Gaussian Splatting	Huibin Zhao et.al.	2411.02703	null
2024-11-04	Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting	Joey Wilson et.al.	2411.02547	null
2024-11-06	SplatOverflow: Asynchronous Hardware Troubleshooting	Amritansh Kwatra et.al.	2411.02332	null
2024-11-05	FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training	Ruihong Yin et.al.	2411.02229	null
2024-11-06	GVKF: Gaussian Voxel Kernel Functions for Highly Efficient Surface Reconstruction in Open Scenes	Gaochao Song et.al.	2411.01853	null
2024-11-01	CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes	Yang Liu et.al.	2411.00771	null
2024-10-31	Aquatic-GS: A Hybrid 3D Representation for Underwater Scenes	Shaohua Liu et.al.	2411.00239	null
2024-10-31	Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis	Chen Zhao et.al.	2411.00144	link
2024-10-31	No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images	Botao Ye et.al.	2410.24207	link
2024-11-01	GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering	Kai Ye et.al.	2410.24204	null
2024-10-31	GaussianMarker: Uncertainty-Aware Copyright Protection of 3D Gaussian Splatting	Xiufeng Huang et.al.	2410.23718	null
2024-10-31	GS-Blur: A 3D Scene-Based Dataset for Realistic Image Deblurring	Dongwoo Lee et.al.	2410.23658	link
2024-10-30	ELMGS: Enhancing memory and computation scaLability through coMpression for 3D Gaussian Splatting	Muhammad Salman Ali et.al.	2410.23213	null
2024-10-31	Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis	Zhiyuan Min et.al.	2410.22817	null
2024-10-29	PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting	Sunghwan Hong et.al.	2410.22128	link
2024-10-29	FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives	Qizhi Chen et.al.	2410.22070	null
2024-10-28	CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians	Chongjian Ge et.al.	2410.20723	null
2024-10-28	ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings	Suyoung Lee et.al.	2410.20686	link
2024-10-27	Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering	Meng Wei et.al.	2410.20593	null
2024-10-30	DiffGS: Functional Gaussian Splatting Diffusion	Junsheng Zhou et.al.	2410.19657	null
2024-10-25	Robotic Learning in your Backyard: A Neural Simulator from Open Source Components	Liyou Zhou et.al.	2410.19564	link
2024-10-25	Content-Aware Radiance Fields: Aligning Model Complexity with Scene Intricacy Through Learned Bitwidth Quantization	Weihang Liu et.al.	2410.19483	link
2024-10-24	Sort-free Gaussian Splatting via Weighted Sum Rendering	Qiqi Hou et.al.	2410.18931	null
2024-10-24	Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling	Mingtong Zhang et.al.	2410.18912	null
2024-10-27	Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis	Liang Han et.al.	2410.18822	null
2024-10-23	VR-Splatting: Foveated Radiance Field Rendering via 3D Gaussian Splatting and Neural Points	Linus Franke et.al.	2410.17932	null
2024-10-23	PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting	Yu Wang et.al.	2410.17505	null
2024-10-22	AG-SLAM: Active Gaussian Splatting SLAM	Wen Jiang et.al.	2410.17422	null
2024-10-22	SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes	Cheng-De Fan et.al.	2410.17249	null
2024-10-18	GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting	Yusen Xie et.al.	2410.17084	null
2024-10-22	E-3DGS: Gaussian Splatting with Exposure and Motion Events	Xiaoting Yin et.al.	2410.16995	link
2024-10-21	3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors	Xi Liu et.al.	2410.16266	null
2024-10-22	Fully Explicit Dynamic Gaussian Splatting	Junoh Lee et.al.	2410.15629	null
2024-10-22	EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting	Bohao Liao et.al.	2410.15392	null
2024-10-18	Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set	Wenyuan Zhang et.al.	2410.14189	null
2024-10-17	DepthSplat: Connecting Gaussian Splatting and Depth	Haofei Xu et.al.	2410.13862	link
2024-10-17	DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering	Jiahao Lu et.al.	2410.13607	link
2024-10-17	GlossyGS: Inverse Rendering of Glossy Objects with 3D Gaussian Splatting	Shuichang Lai et.al.	2410.13349	null
2024-10-16	3D Gaussian Splatting in Robotics: A Survey	Siting Zhu et.al.	2410.12262	link
2024-10-15	SplatPose+: Real-time Image-Based Pose-Agnostic 3D Anomaly Detection	Yizhe Liu et.al.	2410.12080	link
2024-10-15	LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images	Yuzhou Cheng et.al.	2410.11505	null
2024-10-15	MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields	Yuru Xiao et.al.	2410.11394	null
2024-10-15	GSORB-SLAM: Gaussian Splatting SLAM benefits from ORB features and Transmittance information	Wancai Zheng et.al.	2410.11356	null
2024-10-15	Scalable Indoor Novel-View Synthesis using Drone-Captured 360 Imagery with 3D Gaussian Splatting	Yuanbo Chen et.al.	2410.11285	null
2024-10-14	Few-shot Novel View Synthesis using Depth Aware 3D Gaussian Splatting	Raja Kumar et.al.	2410.11080	link
2024-10-15	4-LEGS: 4D Language Embedded Gaussian Splatting	Gal Fiebelman et.al.	2410.10719	null
2024-10-11	SurgicalGS: Dynamic 3D Gaussian Splatting for Accurate Robotic-Assisted Surgical Scene Reconstruction	Jialei Chen et.al.	2410.09292	null
2024-10-11	MeshGS: Adaptive Mesh-Aligned Gaussian Splatting for High-Quality Rendering	Jaehoon Choi et.al.	2410.08941	null
2024-10-11	Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars	Xuan Huang et.al.	2410.08840	link
2024-10-11	Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization	Christian Schmidt et.al.	2410.08743	link
2024-10-10	FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction	Irving Fang et.al.	2410.08282	null
2024-10-10	Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics	Junyi Cao et.al.	2410.08257	null
2024-10-10	Poison-splat: Computation Cost Attack on 3D Gaussian Splatting	Jiahao Lu et.al.	2410.08190	link
2024-10-10	DifFRelight: Diffusion-Based Facial Performance Relighting	Mingming He et.al.	2410.08188	null
2024-10-10	Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency	Florian Hahlbohm et.al.	2410.08129	null
2024-10-10	IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera	Jian Huang et.al.	2410.08107	link
2024-10-11	Fast Feedforward 3D Gaussian Splatting Compression	Yihang Chen et.al.	2410.08017	link
2024-10-10	MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting	Ruijie Zhu et.al.	2410.07707	link
2024-10-09	Spiking GS: Towards High-Accuracy and Low-Cost Surface Reconstruction via Spiking Neuron-based Gaussian Splatting	Weixing Zhang et.al.	2410.07266	link
2024-10-09	3D Representation Methods: A Survey	Zhengren Wang et.al.	2410.06475	null
2024-10-08	HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction	Shengji Tang et.al.	2410.06245	null
2024-10-08	GSLoc: Visual Localization with 3D Gaussian Splatting	Kazii Botashev et.al.	2410.06165	null
2024-10-08	Comparative Analysis of Novel View Synthesis and Photogrammetry for 3D Forest Stand Reconstruction and extraction of individual tree parameters	Guoji Tian et.al.	2410.05772	null
2024-10-07	GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting	Yukang Cao et.al.	2410.05259	null
2024-10-07	DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects	Nidhi Mathihalli et.al.	2410.05097	link
2024-10-07	PhotoReg: Photometrically Registering 3D Gaussian Splatting Models	Ziwen Yuan et.al.	2410.05044	null
2024-10-07	6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering	Zhongpai Gao et.al.	2410.04974	null
2024-10-07	Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting	Matthew Strong et.al.	2410.04680	link
2024-10-06	Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting for Robust Ground-View Scene Rendering	Yonghan Lee et.al.	2410.04646	null
2024-10-04	Variational Bayes Gaussian Splatting	Toon Van de Maele et.al.	2410.03592	link
2024-10-03	Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats	Mingyang Xie et.al.	2410.02764	null
2024-10-03	GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering	Hongze Chen et.al.	2410.02619	null
2024-10-07	SuperGS: Super-Resolution 3D Gaussian Splatting via Latent Feature Field and Gradient-guided Splitting	Shiyun Xie et.al.	2410.02571	link
2024-10-02	MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis	Xiaobiao Du et.al.	2410.02103	link
2024-10-03	EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis	Alexander Mai et.al.	2410.01804	null
2024-10-02	3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection	Yang Cao et.al.	2410.01647	link
2024-10-02	Gaussian Splatting in Mirrors: Reflection-Aware Rendering via Virtual Camera Optimization	Zihan Wang et.al.	2410.01614	link
2024-10-02	UW-GS: Distractor-Aware 3D Gaussian Splatting for Enhanced Underwater Scene Reconstruction	Haoran Wang et.al.	2410.01517	link
2024-10-02	EVA-Gaussian: 3D Gaussian-based Real-time Human Novel View Synthesis under Diverse Camera Settings	Yingdong Hu et.al.	2410.01425	null
2024-10-02	CaRtGS: Computational Alignment for Real-Time Gaussian Splatting SLAM	Dapeng Feng et.al.	2410.00486	link
2024-10-01	Seamless Augmented Reality Integration in Arthroscopy: A Pipeline for Articular Reconstruction and Guidance	Hongchao Shu et.al.	2410.00386	null
2024-10-01	GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving	Zhangshuo Qi et.al.	2410.00299	link
2024-09-30	RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning	Yuxuan Wu et.al.	2409.20291	null
2024-09-30	Robust Gaussian Splatting SLAM by Leveraging Loop Closure	Zunjie Zhu et.al.	2409.20111	null
2024-10-01	RNG: Relightable Neural Gaussians	Jiahui Fan et.al.	2409.19702	null
2024-09-28	1st Place Solution to the 8th HANDS Workshop Challenge – ARCTIC Track: 3DGS-based Bimanual Category-agnostic Interaction Reconstruction	Jeongwan On et.al.	2409.19215	null
2024-09-26	HGS-Planner: Hierarchical Planning Framework for Active Scene Reconstruction Using 3D Gaussian Splatting	Zijun Xu et.al.	2409.17624	null
2024-09-25	SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model	Daniel Yang et.al.	2409.17345	null
2024-09-25	Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM	Phu Pham et.al.	2409.16944	null
2024-09-24	GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization	Gennady Sidorov et.al.	2409.16502	link
2024-09-24	Frequency-based View Selection in Gaussian Splatting Reconstruction	Monica M. Q. Li et.al.	2409.16470	null
2024-09-26	Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities	Peizhi Yan et.al.	2409.16147	link
2024-09-23	Human Hair Reconstruction with Strand-Aligned 3D Gaussians	Egor Zakharov et.al.	2409.14778	null
2024-09-22	MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views	Wangze Xu et.al.	2409.14316	null
2024-09-21	SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality	Hongjia Zhai et.al.	2409.14067	null
2024-09-20	Elite-EvGS: Learning Event-based 3D Gaussian Splatting by Distilling Event-to-Video Priors	Zixin Zhang et.al.	2409.13392	null
2024-09-20	3D-GSW: 3D Gaussian Splatting Watermark for Protecting Copyrights in Radiance Fields	Youngdong Jang et.al.	2409.13222	null
2024-09-19	MGSO: Monocular Real-time Photometric SLAM with Efficient 3D Gaussian Splatting	Yan Song Hu et.al.	2409.13055	null
2024-09-18	SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation	Mingze Sun et.al.	2409.11682	link
2024-09-18	Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks	Joji Joseph et.al.	2409.11681	link
2024-09-17	GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module	Yichen Zhang et.al.	2409.11307	null
2024-09-17	SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction	Marko Mihajlovic et.al.	2409.11211	null
2024-09-17	GLC-SLAM: Gaussian Splatting SLAM with Efficient Loop Closure	Ziheng Xu et.al.	2409.10982	null
2024-09-16	Phys3DGS: Physically-based 3D Gaussian Splatting for Inverse Rendering	Euntae Choi et.al.	2409.10335	null
2024-09-16	BEINGS: Bayesian Embodied Image-goal Navigation with Gaussian Splatting	Wugang Meng et.al.	2409.10216	link
2024-09-16	DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments	Mahmud A. Mohamad et.al.	2409.10041	link
2024-09-15	MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation	Shuzhao Xie et.al.	2409.09756	null
2024-09-17	A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis	Yohan Poirier-Ginter et.al.	2409.08947	null
2024-09-13	AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius	Xinzhe Wang et.al.	2409.08669	null
2024-09-13	Dense Point Clouds Matter: Dust-GS for Scene Reconstruction from Sparse Viewpoints	Shan Chen et.al.	2409.08613	null
2024-09-13	CSS: Overcoming Pose and Scene Challenges in Crowd-Sourced 3D Gaussian Splatting	Runze Chen et.al.	2409.08562	null
2024-09-12	FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally	Qiuhong Shen et.al.	2409.08270	link
2024-09-12	Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis	Qian Chen et.al.	2409.08042	link
2024-09-12	SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length	Bangya Liu et.al.	2409.07759	null
2024-09-11	Self-Evolving Depth-Supervised 3D Gaussian Splatting from Rendered Stereo Pairs	Sadra Safadoust et.al.	2409.07456	null
2024-09-11	Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models	Haibo Yang et.al.	2409.07452	link
2024-09-11	ThermalGaussian: Thermal 3D Gaussian Splatting	Rongfeng Lu et.al.	2409.07200	link
2024-09-10	GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction	Junyi Chen et.al.	2409.06685	null
2024-09-10	Sources of Uncertainty in 3D Scene Reconstruction	Marcus Klasson et.al.	2409.06407	link
2024-09-09	Lagrangian Hashing for Compressed Neural Field Representations	Shrisudhan Govindarajan et.al.	2409.05334	null
2024-09-08	GS-PT: Exploiting 3D Gaussian Splatting for Comprehensive Point Cloud Understanding via Self-supervised Learning	Keyi Liu et.al.	2409.04963	null
2024-09-11	Fisheye-GS: Lightweight and Extensible Gaussian Splatting Module for Fisheye Cameras	Zimu Liao et.al.	2409.04751	link
2024-09-06	GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers	Lorenza Prospero et.al.	2409.04196	link
2024-09-06	3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors	Yujun Huang et.al.	2409.04013	link
2024-09-05	LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors	Hanyang Yu et.al.	2409.03456	null
2024-09-05	Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction	Shen Chen et.al.	2409.03213	null
2024-09-04	Object Gaussian for Monocular 6D Pose Estimation from Sparse Views	Luqing Luo et.al.	2409.02581	null
2024-09-04	GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving	Huasong Han et.al.	2409.02382	null
2024-09-03	DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction	Jenny Seidenschwarz et.al.	2409.02104	null
2024-09-03	PRoGS: Progressive Rendering of Gaussian Splats	Brent Zoomers et.al.	2409.01761	null
2024-09-03	GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting	Zixuan Guo et.al.	2409.01581	null
2024-09-02	Free-DyGS: Camera-Pose-Free Scene Reconstruction based on Gaussian Splatting for Dynamic Surgical Videos	Qian Li et.al.	2409.01003	null
2024-09-06	3D Gaussian Splatting for Large-scale 3D Surface Reconstruction from Aerial Images	YuanZheng Wu et.al.	2409.00381	null
2024-08-30	OG-Mapping: Octree-based Structured 3D Gaussians for Online Dense Mapping	Meng Wang et.al.	2408.17223	null
2024-08-29	ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model	Fangfu Liu et.al.	2408.16767	null
2024-08-28	Towards Realistic Example-based Modeling via 3D Gaussian Stitching	Xinyu Gao et.al.	2408.15708	null
2024-08-27	Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty	Saining Zhang et.al.	2408.15242	link
2024-08-27	Learning-based Multi-View Stereo: A Survey	Fangjinhua Wang et.al.	2408.15235	null
2024-08-27	LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming	Yuang Shi et.al.	2408.14823	link
2024-08-26	Avatar Concept Slider: Manipulate Concepts In Your Human Avatar With Fine-grained Control	Yixuan He et.al.	2408.13995	null
2024-08-27	Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs	Brandon Smart et.al.	2408.13912	null
2024-08-25	TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers	Chuanrui Zhang et.al.	2408.13770	null
2024-08-25	SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting	Wenrui Li et.al.	2408.13711	link
2024-08-23	BiGS: Bidirectional Gaussian Primitives for Relightable 3D Gaussian Splatting	Zhenyuan Liu et.al.	2408.13370	null
2024-08-23	FLoD: Integrating Flexible Level of Detail into 3D Gaussian Splatting for Customizable Rendering	Yunji Seo et.al.	2408.12894	null
2024-08-26	GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion	Jiaxin Wei et.al.	2408.12677	link
2024-08-22	Subsurface Scattering for 3D Gaussian Splatting	Jan-Niklas Dihlmann et.al.	2408.12282	null
2024-08-21	Robust 3D Gaussian Splatting for Novel View Synthesis in Presence of Distractors	Paul Ungermann et.al.	2408.11697	link
2024-08-27	Pano2Room: Novel View Synthesis from a Single Indoor Panorama	Guo Pu et.al.	2408.11413	link
2024-08-20	GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting	Changkun Liu et.al.	2408.11085	link
2024-08-20	ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining	Qi Ma et.al.	2408.10906	null
2024-08-20	DEGAS: Detailed Expressions on Full-Body Gaussian Avatars	Zhijing Shao et.al.	2408.10588	link
2024-08-20	LoopSplat: Loop Closure by Registering 3D Gaussian Splats	Liyuan Zhu et.al.	2408.10154	link
2024-08-20	Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting	Sheng Ye et.al.	2408.09130	link
2024-08-16	Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS	Wei Sun et.al.	2408.08723	null
2024-08-15	WaterSplatting: Fast Underwater 3D Scene Reconstruction Using Gaussian Splatting	Huapeng Li et.al.	2408.08206	null
2024-08-19	FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering	Guofeng Feng et.al.	2408.07967	link
2024-08-14	3D Gaussian Editing with A Single Image	Guan Luo et.al.	2408.07540	null
2024-08-13	SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis	Saptarshi Neil Sinha et.al.	2408.06975	null
2024-08-12	Mipmap-GS: Let Gaussians Deform with Scale-specific Mipmap for Anti-aliasing Rendering	Jiameng Li et.al.	2408.06286	link
2024-08-12	Developing Smart MAVs for Autonomous Inspection in GPS-denied Constructions	Paoqiang Pan et.al.	2408.06030	null
2024-08-10	Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis	Zhongche Qu et.al.	2408.05635	null
2024-08-09	DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow	Hangyu Li et.al.	2408.05008	null
2024-08-08	InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting	Xin-Yi Yu et.al.	2408.04249	null
2024-08-07	Towards Real-Time Gaussian Splatting: Accelerating 3DGS through Photometric SLAM	Yan Song Hu et.al.	2408.03825	null
2024-08-07	Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields	Joo Chan Lee et.al.	2408.03822	null
2024-08-07	3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting	Zhe Jun Tang et.al.	2408.03753	link
2024-08-07	PRTGS: Precomputed Radiance Transfer of Gaussian Splats for Real-Time High-Quality Relighting	Yijia Guo et.al.	2408.03538	null
2024-08-02	A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness	Lutao Jiang et.al.	2408.01269	null
2024-08-02	Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion	Ke Li et.al.	2408.01225	link
2024-08-07	IG-SLAM: Instant Gaussian SLAM	F. Aykut Sarikamis et.al.	2408.01126	null
2024-08-01	LoopSparseGS: Loop Based Sparse-View Friendly Gaussian Splatting	Zhenyu Bao et.al.	2408.00254	null
2024-07-31	Localized Gaussian Splatting Editing with Contextual Awareness	Hanyuan Xiao et.al.	2408.00083	null
2024-07-31	Expressive Whole-Body 3D Gaussian Avatar	Gyeongsik Moon et.al.	2407.21686	null
2024-07-30	SceneTeller: Language-to-3D Scene Generation	Başak Melis Öcal et.al.	2407.20727	null
2024-07-29	Radiance Fields for Robotic Teleoperation	Maximum Wilder-Smith et.al.	2407.20194	link
2024-07-24	3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities	Yanqi Bao et.al.	2407.17418	link
2024-07-23	HDRSplat: Gaussian Splatting for High Dynamic Range 3D Scene Reconstruction from Raw Images	Shreyas Singh et.al.	2407.16503	link
2024-07-23	Integrating Meshes and 3D Gaussians for Indoor Scene Reconstruction with SAM Mask Guidance	Jiyeop Kim et.al.	2407.16173	null
2024-07-22	6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model	Matteo Bortolon et.al.	2407.15484	null
2024-07-22	Enhancement of 3D Gaussian Splatting using Raw Mesh for Photorealistic Recreation of Architectures	Ruizhe Wang et.al.	2407.15435	null
2024-07-21	HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions	Haiyang Zhou et.al.	2407.15187	null
2024-07-20	Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting	Tianle Zeng et.al.	2407.14846	null
2024-07-19	DirectL: Efficient Radiance Fields Rendering for 3D Light Field Displays	Zongyuan Yang et.al.	2407.14053	null
2024-07-20	Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation	Zongrui Li et.al.	2407.13584	link
2024-07-18	EaDeblur-GS: Event assisted 3D Deblur Reconstruction with Gaussian Splatting	Yuchen Weng et.al.	2407.13520	null
2024-07-17	Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections	Congrong Xu et.al.	2407.12306	null
2024-07-16	MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification	Zhuoxiao Li et.al.	2407.11840	null
2024-07-16	Click-Gaussian: Interactive Segmentation to Any 3D Gaussians	Seokhun Choi et.al.	2407.11793	null
2024-07-16	SlingBAG: Sliding ball adaptive growth algorithm with differentiable radiation enables super-efficient iterative 3D photoacoustic image reconstruction	Shuang Li et.al.	2407.11781	link
2024-07-16	Ev-GS: Event-based Gaussian splatting for Efficient and Accurate Radiance Field Rendering	Jingqian Wu et.al.	2407.11343	null
2024-07-14	3DEgo: 3D Editing on the Go!	Umar Khalid et.al.	2407.10102	null
2024-07-14	SpikeGS: 3D Gaussian Splatting from Spike Streams with High-Speed Camera Motion	Jiyuan Zhang et.al.	2407.10062	null
2024-07-12	StyleSplat: 3D Object Style Transfer with Gaussian Splatting	Sahil Jain et.al.	2407.09473	null
2024-07-11	WildGaussians: 3D Gaussian Splatting in the Wild	Jonas Kulhanek et.al.	2407.08447	link
2024-07-11	Survey on Fundamental Deep Learning 3D Reconstruction Techniques	Yonge Bai et.al.	2407.08137	null
2024-07-17	MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition	Aggelina Chatziagapi et.al.	2407.07284	null
2024-07-09	Reference-based Controllable Scene Stylization with Gaussian Splatting	Yiqun Mei et.al.	2407.07220	null
2024-07-10	3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes	Nicolas Moenne-Loccoz et.al.	2407.07090	null
2024-07-07	PICA: Physics-Integrated Clothed Avatar	Bo Peng et.al.	2407.05324	null
2024-07-06	SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction	Weixing Xie et.al.	2407.05023	link
2024-07-12	Segment Any 4D Gaussians	Shengxiang Ji et.al.	2407.04504	null
2024-07-04	PFGS: High Fidelity Point Cloud Rendering via Feature Splatting	Jiaxu Wang et.al.	2407.03857	link
2024-07-04	SpikeGS: Reconstruct 3D scene via fast-moving bio-inspired sensors	Yijia Guo et.al.	2407.03771	null
2024-07-04	VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors	Sungwon Hwang et.al.	2407.02945	link
2024-07-03	Free-SurGS: SfM-Free 3D Gaussian Splatting for Surgical Scene Reconstruction	Jiaxin Guo et.al.	2407.02918	link
2024-07-04	AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction	Mustafa Khan et.al.	2407.02598	null
2024-07-02	TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation	Chaofan Luo et.al.	2407.02034	null
2024-07-01	GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting	Chenxin Li et.al.	2407.01301	null
2024-07-02	RTGS: Enabling Real-Time Gaussian Splatting on Mobile Devices Using Efficiency-Guided Pruning and Foveated Rendering	Weikai Lin et.al.	2407.00435	link
2024-06-29	OccFusion: Rendering Occluded Humans with Generative Diffusion Priors	Adam Sun et.al.	2407.00316	null
2024-06-28	SpotlessSplats: Ignoring Distractors in 3D Gaussian Splatting	Sara Sabour et.al.	2406.20055	null
2024-06-28	EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting	Daiwei Zhang et.al.	2406.19811	null
2024-06-27	Lightweight Predictive 3D Gaussian Splats	Junli Cao et.al.	2406.19434	link
2024-06-26	On Scaling Up 3D Gaussian Splatting Training	Hexu Zhao et.al.	2406.18533	link
2024-06-26	GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality	Taoran Yi et.al.	2406.18462	null
2024-06-26	Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning	Muhammad Salman Ali et.al.	2406.18214	link
2024-06-26	GS-Octree: Octree-based 3D Gaussian Splatting for Robust Object-level 3D Reconstruction Under Strong Lighting	Jiaze Li et.al.	2406.18199	null
2024-06-25	NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods	Jonas Kulhanek et.al.	2406.17345	null
2024-06-24	Reducing the Memory Footprint of 3D Gaussian Splatting	Panagiotis Papantonakis et.al.	2406.17074	null
2024-06-23	LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction	Hengyu Liu et.al.	2406.16073	link
2024-06-23	Learning with Noisy Ground Truth: From 2D Classification to 3D Reconstruction	Yangdi Lu et.al.	2406.15982	null
2024-06-21	Taming 3DGS: High-Quality Radiance Fields with Limited Resources	Saswat Subhajyoti Mallick et.al.	2406.15643	link
2024-06-21	Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks	Alex Quach et.al.	2406.15149	null
2024-06-18	Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models	Paul Henderson et.al.	2406.13099	null
2024-06-18	HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors	Panwang Pan et.al.	2406.12459	link
2024-06-17	A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets	Bernhard Kerbl et.al.	2406.12080	null
2024-06-22	RetinaGS: Scalable Training for Dense Scene Rendering with Billion-Scale 3D Gaussians	Bingling Li et.al.	2406.11836	null
2024-06-18	Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting	Junha Hyung et.al.	2406.11672	null
2024-06-14	Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections	Jiacong Xu et.al.	2406.10373	null
2024-06-14	L4GM: Large 4D Gaussian Reconstruction Model	Jiawei Ren et.al.	2406.10324	null
2024-06-14	PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting	Alex Hanson et.al.	2406.10219	link
2024-06-14	GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors	Xiqian Yu et.al.	2406.10111	null
2024-06-14	Unified Gaussian Primitives for Scene Representation and Rendering	Yang Zhou et.al.	2406.09733	null
2024-06-13	Modeling Ambient Scene Dynamics for Free-view Synthesis	Meng-Li Shih et.al.	2406.09395	null
2024-06-13	GGHead: Fast and Generalizable 3D Gaussian Heads	Tobias Kirschstein et.al.	2406.09377	null
2024-06-13	Gaussian-Forest: Hierarchical-Hybrid 3D Gaussian Splatting for Compressed Scene Modeling	Fengyi Zhang et.al.	2406.08759	null
2024-06-12	ICE-G: Image Conditional Editing of 3D Gaussian Splats	Vishnu Jaganathan et.al.	2406.08488	null
2024-06-12	Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models	Yuxuan Xue et.al.	2406.08475	null
2024-06-12	From Chaos to Clarity: 3DGS in the Dark	Zhihao Li et.al.	2406.08300	null
2024-06-11	Trim 3D Gaussian Splatting for Accurate Geometry Representation	Lue Fan et.al.	2406.07499	null
2024-06-11	Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field	Chao Wang et.al.	2406.07329	null
2024-06-10	GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation	Haozhe Xie et.al.	2406.06526	link
2024-06-10	PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction	Danpeng Chen et.al.	2406.06521	null
2024-06-10	MVGamba: Unify 3D Content Generation as State Space Sequence Modeling	Xuanyu Yi et.al.	2406.06367	link
2024-06-10	Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis	Xin Jin et.al.	2406.06216	link
2024-06-09	RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering	Rui Zhang et.al.	2406.05852	null
2024-06-09	VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction	Hanlin Chen et.al.	2406.05774	null
2024-06-06	A Survey on 3D Human Avatar Modeling – From Reconstruction to Generation	Ruihe Wang et.al.	2406.04253	null
2024-06-06	Localized Gaussian Point Management	Haosen Yang et.al.	2406.04251	null
2024-06-06	Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction	Diwen Wan et.al.	2406.03697	link
2024-06-10	Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion	Tianyi Xiong et.al.	2406.02972	null
2024-06-05	Adversarial Generation of Hierarchical Gaussians for 3D Generative Model	Sangeek Hyun et.al.	2406.02968	link
2024-06-04	3D-HGS: 3D Half-Gaussian Splatting	Haolin Li et.al.	2406.02720	link
2024-06-06	Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting	Inkyu Shin et.al.	2406.02541	null
2024-06-04	SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition	Van Minh Nguyen et.al.	2406.02533	null
2024-06-04	DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering	Zhongpai Gao et.al.	2406.02518	null
2024-06-04	WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections	Yuze Wang et.al.	2406.02407	null
2024-06-04	Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning	Jiaxu Wang et.al.	2406.02370	null
2024-06-04	OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding	Yanmin Wu et.al.	2406.02058	null
2024-06-04	FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping	Yuzhou Ji et.al.	2406.01916	null
2024-06-03	Tetrahedron Splatting for 3D Generation	Chun Gu et.al.	2406.01579	link
2024-06-03	DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors	Tianyu Huang et.al.	2406.01476	link
2024-06-03	RaDe-GS: Rasterizing Depth in Gaussian Splatting	Baowen Zhang et.al.	2406.01467	link
2024-05-31	ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model	Yufei Wang et.al.	2405.20721	link
2024-05-31	R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction	Ruyi Zha et.al.	2405.20693	link
2024-05-30	$\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving	Nan Huang et.al.	2405.20323	link
2024-06-03	A Pixel Is Worth More Than One 3D Gaussians in Single-View 3D Reconstruction	Jianghao Shen et.al.	2405.20310	null
2024-05-29	EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images	Wangbo Yu et.al.	2405.20224	null
2024-05-30	Object-centric Reconstruction and Tracking of Dynamic Unknown Objects using 3D Gaussian Splatting	Kuldeep R Barad et.al.	2405.20104	null
2024-05-30	GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction	Haodong Xiang et.al.	2405.19671	null
2024-05-30	Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian	Wei Sun et.al.	2405.19657	null
2024-05-30	TAMBRIDGE: Bridging Frame-Centered Tracking and 3D Gaussian Splatting for Enhanced SLAM	Peifeng Jiang et.al.	2405.19614	null
2024-05-29	NPGA: Neural Parametric Gaussian Avatars	Simon Giebenhain et.al.	2405.19331	null
2024-05-29	LP-3DGS: Learning to Prune 3D Gaussian Splatting	Zhaoliang Zhang et.al.	2405.18784	link
2024-05-28	A Grid-Free Fluid Solver based on Gaussian Spatial Representation	Jingrui Xing et.al.	2405.18133	null
2024-05-28	FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes	Yunsong Wang et.al.	2405.17958	link
2024-05-28	A Refined 3D Gaussian Representation for High-Quality Dynamic Scene Reconstruction	Bin Zhang et.al.	2405.17891	null
2024-05-29	HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction	Haoyu Zhao et.al.	2405.17872	link
2024-05-30	Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting	Shuojue Yang et.al.	2405.17835	link
2024-05-28	Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh	Xiangjun Gao et.al.	2405.17811	null
2024-05-28	SafeguardGS: 3D Gaussian Primitive Pruning While Avoiding Catastrophic Scene Destruction	Yongjae Lee et.al.	2405.17793	link
2024-05-29	DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos	Linhan Wang et.al.	2405.17705	link
2024-05-27	GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane	Yansong Qu et.al.	2405.17596	null
2024-05-27	DOF-GS: Adjustable Depth-of-Field 3D Gaussian Splatting for Refocusing, Defocus Rendering and Blur Removal	Yujie Wang et.al.	2405.17351	null
2024-05-27	Memorize What Matters: Emergent Scene Decomposition from Multitraverse	Yiming Li et.al.	2405.17187	link
2024-05-28	F-3DGS: Factorized Coordinates and Representations for 3D Gaussian Splatting	Xiangyu Sun et.al.	2405.17083	null
2024-05-28	SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain	Butian Xiong et.al.	2405.16923	null
2024-05-28	PyGS: Large-scale Scene Representation with Pyramidal 3D Gaussian Splatting	Zipeng Wang et.al.	2405.16829	null
2024-05-26	Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians	Erik Sandström et.al.	2405.16544	link
2024-05-24	Feature Splatting for Better Novel View Synthesis with Low Overlap	T. Berriel Martins et.al.	2405.15518	link
2024-05-24	GSDeformer: Direct Cage-based Deformation for 3D Gaussian Splatting	Jiajun Huang et.al.	2405.15491	null
2024-05-27	HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting	Yuanhao Cai et.al.	2405.15125	link
2024-05-24	GS-Hider: Hiding Messages into 3D Gaussian Splatting	Xuanyu Zhang et.al.	2405.15118	null
2024-05-23	TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing	Teng Xu et.al.	2405.14455	null
2024-05-24	RoGS: Large Scale Road Surface Reconstruction based on 2D Gaussian Splatting	Zhiheng Feng et.al.	2405.14342	link
2024-05-22	DoGaussian: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus	Yu Chen et.al.	2405.13943	link
2024-05-22	Gaussian Time Machine: A Real-Time Rendering Methodology for Time-Variant Appearances	Licheng Shen et.al.	2405.13694	null
2024-05-21	Gaussian Control with Hierarchical Semantic Graphs in 3D Human Recovery	Hongsheng Wang et.al.	2405.12477	null
2024-05-20	GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details	Boqian Li et.al.	2405.12420	link
2024-05-22	AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field	Rong Liu et.al.	2405.12369	link
2024-05-20	Embracing Radiance Field Rendering in 6G: Over-the-Air Training and Inference with 3D Contents	Guanlin Wu et.al.	2405.12155	null
2024-05-20	CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization	Jiawei Zhang et.al.	2405.12110	link
2024-05-21	Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping	Tianhao Wu et.al.	2405.12069	null
2024-05-20	MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections	Jiayue Liu et.al.	2405.11921	null
2024-05-18	Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching	Xingyu Miao et.al.	2405.11252	link
2024-05-18	MotionGS: Compact Gaussian Splatting SLAM by Motion Filter	Xinli Guo et.al.	2405.11129	link
2024-05-17	Photorealistic 3D Urban Scene Reconstruction and Point Cloud Extraction using Google Earth Imagery and Gaussian Splatting	Kyle Gao et.al.	2405.11021	null
2024-05-17	ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation	Pengzhi Li et.al.	2405.10508	null
2024-05-16	GS-Planner: A Gaussian-Splatting-based Planning Framework for Active High-Fidelity Reconstruction	Rui Jin et.al.	2405.10142	null
2024-05-11	Direct Learning of Mesh and Appearance via 3D Gaussian Splatting	Ancheng Lin et.al.	2405.06945	null
2024-05-10	I3DGS: Improve 3D Gaussian Splatting from Multiple Dimensions	Jinwei Lin et.al.	2405.06408	null
2024-05-09	DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation	Sitian Shen et.al.	2405.05800	null
2024-05-09	FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting	Yikun Ma et.al.	2405.05768	null
2024-05-18	NGM-SLAM: Gaussian Splatting SLAM with Radiance Field Submap	Mingrui Li et.al.	2405.05702	null
2024-05-09	Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview	Yuhang Ming et.al.	2405.05526	null
2024-05-08	GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields	Yuanhao Gong et.al.	2405.05446	null
2024-05-06	A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose	Kaiwen Jiang et.al.	2405.03659	null
2024-05-03	HoloGS: Instant Depth-based 3D Gaussian Splatting with Microsoft HoloLens 2	Miriam Jäger et.al.	2405.02005	null
2024-05-01	Spectrally Pruned Gaussian Fields with Neural Compensation	Runyi Yang et.al.	2405.00676	link
2024-04-30	GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting	Kai Zhang et.al.	2404.19702	null
2024-04-29	SAGS: Structure-Aware 3D Gaussian Splatting	Evangelos Ververas et.al.	2404.19149	null
2024-04-29	MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing	Cong Wang et.al.	2404.19026	null
2024-04-29	DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing	Minghao Chen et.al.	2404.18929	null
2024-04-29	Bootstrap 3D Reconstructed Scenes from 3D Gaussian Splatting	Yifei Gao et.al.	2404.18669	null
2024-04-29	3D Gaussian Splatting with Deferred Reflection	Keyang Ye et.al.	2404.18454	link
2024-04-29	Reconstructing Satellites in 3D from Amateur Telescope Images	Zhiming Chang et.al.	2404.18394	null
2024-04-26	SLAM for Indoor Mapping of Wide Area Construction Environments	Vincent Ress et.al.	2404.17215	null
2024-04-25	GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting	Kyusun Cho et.al.	2404.16012	link
2024-04-25	OMEGAS: Object Mesh Extraction from Large Scenes Guided by Gaussian Segmentation	Lizhi Wang et.al.	2404.15891	link
2024-04-22	Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses	Inhee Lee et.al.	2404.14410	null
2024-04-22	CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding	Guibiao Liao et.al.	2404.14249	link
2024-04-28	GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting	Hongyun Yu et.al.	2404.14037	null
2024-04-21	GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal	Yuxin Wang et.al.	2404.13679	null
2024-04-19	Learn2Talk: 3D Talking Face Learns from 2D Talking Face	Yixiang Zhuang et.al.	2404.12888	null
2024-04-19	EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation	Wenkai Liu et.al.	2404.12777	null
2024-04-22	Does Gaussian Splatting need SFM Initialization?	Yalda Foroutan et.al.	2404.12547	null
2024-04-22	Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos	Isabella Liu et.al.	2404.12379	null
2024-04-17	RainyScape: Unsupervised Rainy Scene Reconstruction using Decoupled Neural Rendering	Xianqiang Lyu et.al.	2404.11401	null
2024-04-18	DeblurGS: Gaussian Splatting for Camera Motion Blur	Jeongtaek Oh et.al.	2404.11358	null
2024-04-17	Novel View Synthesis for Cinematic Anatomy on Mobile and Immersive Displays	Simon Niedermayr et.al.	2404.11285	null
2024-04-16	Gaussian Opacity Fields: Efficient and Compact Surface Reconstruction in Unbounded Scenes	Zehao Yu et.al.	2404.10772	null
2024-04-16	Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks	Florian Barthel et.al.	2404.10625	null
2024-04-16	AbsGS: Recovering Fine Details for 3D Gaussian Splatting	Zongxin Ye et.al.	2404.10484	null
2024-04-16	SRGS: Super-Resolution 3D Gaussian Splatting	Xiang Feng et.al.	2404.10318	link
2024-04-15	LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives	Jiadi Cui et.al.	2404.09748	null
2024-04-15	3D Gaussian Splatting as Markov Chain Monte Carlo	Shakiba Kheradmand et.al.	2404.09591	null
2024-04-16	LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field	Jiyang Li et.al.	2404.08966	link
2024-04-15	OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering	Jingrui Ye et.al.	2404.08449	null
2024-04-10	RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion	Jaidev Shriram et.al.	2404.07199	null
2024-04-10	Gaussian-LIC: Photo-realistic LiDAR-Inertial-Camera SLAM with 3D Gaussian Splatting	Xiaolei Lang et.al.	2404.06926	null
2024-04-10	SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection	Mathis Kruse et.al.	2404.06832	link
2024-04-12	SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera	Gaole Dai et.al.	2404.06710	null
2024-04-14	3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis	Zhicheng Lu et.al.	2404.06270	null
2024-04-09	Gaussian Pancakes: Geometrically-Regularized 3D Gaussian Splatting for Realistic Endoscopic Reconstruction	Sierra Bonilla et.al.	2404.06128	link
2024-04-09	Revising Densification in Gaussian Splatting	Samuel Rota Bulò et.al.	2404.06109	null
2024-04-09	Hash3D: Training-free Acceleration for 3D Generation	Xingyi Yang et.al.	2404.06091	link
2024-04-08	StylizedGS: Controllable Stylization for 3D Gaussian Splatting	Dingxi Zhang et.al.	2404.05220	null
2024-04-06	Z-Splat: Z-Axis Gaussian Splatting for Camera-Sonar Fusion	Ziyuan Qu et.al.	2404.04687	link
2024-04-05	Robust Gaussian Splatting	François Darmon et.al.	2404.04211	null
2024-04-04	Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting	Jeongmin Bae et.al.	2404.03613	null
2024-04-08	OmniGS: Omnidirectional Gaussian Splatting for Fast Radiance Field Reconstruction using Omnidirectional Images	Longwei Li et.al.	2404.03202	link
2024-04-03	TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Surrounding Autonomous Driving Scenes	Cheng Zhao et.al.	2404.02410	null
2024-04-01	Mirror-3DGS: Incorporating Mirror Reflections into 3D Gaussian Splatting	Jiarui Meng et.al.	2404.01168	null
2024-04-07	CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians	Yang Liu et.al.	2404.01133	link
2024-04-01	MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements	Lisong C. Sun et.al.	2404.00923	null
2024-03-30	3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting	Xiaoyang Lyu et.al.	2404.00409	null
2024-03-29	InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds	Zhiwen Fan et.al.	2403.20309	link
2024-03-29	Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces	Mauro Comi et.al.	2403.20275	null
2024-03-29	HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes	Ke Wu et.al.	2403.20159	null
2024-03-29	SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior	Zhongrui Yu et.al.	2403.20079	null
2024-03-29	HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes	Zhuopeng Li et.al.	2403.20032	null
2024-03-28	GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling	Bowen Zhang et.al.	2403.19655	null
2024-03-28	GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond	Chongjie Ye et.al.	2403.19632	link
2024-03-28	CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians	Avinash Paliwal et.al.	2403.19495	link
2024-03-29	Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction	Qiuhong Shen et.al.	2403.18795	link
2024-03-26	Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians	Kerui Ren et.al.	2403.17898	link
2024-03-26	2D Gaussian Splatting for Geometrically Accurate Radiance Fields	Binbin Huang et.al.	2403.17888	link
2024-03-26	DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing	Matias Turkulainen et.al.	2403.17822	link
2024-03-25	GSDF: 3DGS Meets SDF for Improved Rendering and Reconstruction	Mulin Yu et.al.	2403.16964	null
2024-03-23	Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections	Dongbin Zhang et.al.	2403.15704	null
2024-03-22	Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting	Jun Guo et.al.	2403.15624	null
2024-03-22	Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting	Zheng Zhang et.al.	2403.15530	null
2024-03-22	STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians	Yifei Zeng et.al.	2403.14939	null
2024-03-21	MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images	Yuedong Chen et.al.	2403.14627	link
2024-03-21	Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering	Antoine Guédon et.al.	2403.14554	null
2024-03-21	HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression	Yihang Chen et.al.	2403.14530	link
2024-03-21	Isotropic Gaussian Splatting for Real-Time Radiance Field Rendering	Yuanhao Gong et.al.	2403.14244	null
2024-03-19	GVGEN: Text-to-3D Generation with Volumetric Representation	Xianglong He et.al.	2403.12957	null
2024-03-19	HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting	Hongyu Zhou et.al.	2403.12722	null
2024-03-22	RGBD GS-ICP SLAM	Seongbo Ha et.al.	2403.12550	link
2024-03-19	High-Fidelity SLAM Using Gaussian Splatting with Rendering-Guided Densification and Regularized Optimization	Shuo Sun et.al.	2403.12535	link
2024-03-20	View-Consistent 3D Editing with Gaussian Splatting	Yuxuan Wang et.al.	2403.11868	null
2024-03-19	BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting	Lingzhe Zhao et.al.	2403.11831	link
2024-03-18	NEDS-SLAM: A Novel Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting	Yiming Ji et.al.	2403.11679	null
2024-03-20	GaussNav: Gaussian Splatting for Visual Navigation	Xiaohan Lei et.al.	2403.11625	link
2024-03-18	3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration	Quentin Herau et.al.	2403.11577	null
2024-03-18	Fed3DGS: Scalable 3D Gaussian Splatting with Federated Learning	Teppei Suzuki et.al.	2403.11460	link
2024-03-18	Bridging 3D Gaussian and Mesh for Freeview Video Rendering	Yuting Xiao et.al.	2403.11453	null
2024-03-18	Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction	Zhiyang Guo et.al.	2403.11447	null
2024-03-18	BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors	Tingyang Zhang et.al.	2403.11427	null
2024-03-18	Beyond Uncertainty: Risk-Aware Active View Acquisition for Safe Robot Navigation and 3D Scene Understanding with FisherRF	Guangyi Liu et.al.	2403.11396	null
2024-03-17	3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization	Peng Jiang et.al.	2403.11367	null
2024-03-17	Compact 3D Gaussian Splatting For Dense Visual SLAM	Tianchen Deng et.al.	2403.11247	link
2024-03-15	SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians	Hiba Dahmani et.al.	2403.10427	null
2024-03-15	GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time	Hao Li et.al.	2403.10147	null
2024-03-15	Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing	Tian-Xing Xu et.al.	2403.10050	null
2024-03-14	Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting	Aiden Swann et.al.	2403.09875	null
2024-03-14	GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping	Yuhang Zheng et.al.	2403.09637	link
2024-03-14	Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting	Jaewoo Jung et.al.	2403.09413	link
2024-03-14	A New Split Algorithm for 3D Gaussian Splatting	Qiyuan Feng et.al.	2403.09143	null
2024-03-14	GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing	Jing Wu et.al.	2403.08733	link
2024-03-13	Gaussian Splatting in Style	Abhishek Saroha et.al.	2403.08498	null
2024-03-12	StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting	Kunhao Liu et.al.	2403.07807	null
2024-03-13	DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization	Jiahe Li et.al.	2403.06912	link
2024-03-11	FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization	Jiahui Zhang et.al.	2403.06908	null
2024-03-07	Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis	Yuanhao Cai et.al.	2403.04116	link
2024-02-29	3D Gaussian Model for Animation and Texturing	Xiangzhi Eric Wang et.al.	2402.19441	null
2024-02-27	VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction	Jiaqi Lin et.al.	2402.17427	null
2024-02-24	Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting	Ziyi Yang et.al.	2402.15870	null
2024-02-22	GaussianPro: 3D Gaussian Splatting with Progressive Propagation	Kai Cheng et.al.	2402.14650	null
2024-02-21	Identifying Unnecessary 3D Gaussians using Clustering for Fast Rendering of 3D Gaussian Splatting	Joongho Jo et.al.	2402.13827	null
2024-02-20	How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey	Fabio Tosi et.al.	2402.13255	link
2024-02-15	GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering	Abdullah Hamdi et.al.	2402.10128	link
2024-02-11	3D Gaussian as a New Vision Era: A Survey	Ben Fei et.al.	2402.07181	null
2024-02-13	GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data	Haoyuan Li et.al.	2402.06198	null
2024-02-09	HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting	Zhenglin Zhou et.al.	2402.06149	link
2024-02-06	Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos	Alfredo Rivero et.al.	2402.03723	null
2024-02-07	4D Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes	Yuanxing Duan et.al.	2402.03307	link
2024-02-01	360-GS: Layout-guided Panoramic Gaussian Splatting For Indoor Roaming	Jiayang Bai et.al.	2402.00763	null

Text-to-Video

Publish Date	Title	Authors	PDF	Code
2025-07-23	Yume: An Interactive World Generation Model	Xiaofeng Mao et.al.	2507.17744	null
2025-07-23	EndoGen: Conditional Autoregressive Endoscopic Video Generation	Xinyu Liu et.al.	2507.17388	null
2025-07-22	Controllable Video Generation: A Survey	Yue Ma et.al.	2507.16869	null
2025-07-22	MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation	Yanchen Liu et.al.	2507.16310	null
2025-07-22	PUSA V1.0: Surpassing Wan-I2V with $500 Training Cost by Vectorized Timestep Adaptation	Yaofang Liu et.al.	2507.16116	null
2025-07-21	Can Your Model Separate Yolks with a Water Bottle? Benchmarking Physical Commonsense Understanding in Video Generation Models	Enes Sanli et.al.	2507.15824	null
2025-07-21	TokensGen: Harnessing Condensed Tokens for Long Video Generation	Wenqi Ouyang et.al.	2507.15728	null
2025-07-21	Conditional Video Generation for High-Efficiency Video Compression	Fangqiu Yi et.al.	2507.15269	null
2025-07-19	BusterX++: Towards Unified Cross-Modal AI-Generated Content Detection and Explanation with MLLM	Haiquan Wen et.al.	2507.14632	null
2025-07-17	$\nabla$ NABLA: Neighborhood Adaptive Block-Level Attention	Dmitrii Mikhailov et.al.	2507.13546	null
2025-07-17	“PhyWorldBench”: A Comprehensive Evaluation of Physical Realism in Text-to-Video Models	Jing Gu et.al.	2507.13428	null
2025-07-17	Taming Diffusion Transformer for Real-Time Mobile Video Generation	Yushu Wu et.al.	2507.13343	null
2025-07-17	Leveraging Pre-Trained Visual Models for AI-Generated Video Detection	Keerthi Veeramachaneni et.al.	2507.13224	null
2025-07-17	LoViC: Efficient Long Video Generation with Context Compression	Jiaxiu Jiang et.al.	2507.12952	null
2025-07-17	World Model-Based End-to-End Scene Generation for Accident Anticipation in Autonomous Driving	Yanchen Guan et.al.	2507.12762	null
2025-07-15	NarrLV: Towards a Comprehensive Narrative-Centric Evaluation for Long Video Generation Models	X. Feng et.al.	2507.11245	null
2025-07-14	Flows and Diffusions on the Neural Manifold	Daniel Saragih et.al.	2507.10623	null
2025-07-12	$I^{2}$-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting	Zhimin Liao et.al.	2507.09144	null
2025-07-11	Taming generative video models for zero-shot optical flow extraction	Seungwoo Kim et.al.	2507.09082	null
2025-07-11	Detecting Deepfake Talking Heads from Facial Biometric Anomalies	Justin D. Norman et.al.	2507.08917	null
2025-07-11	Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective	Hangjie Yuan et.al.	2507.08801	null
2025-07-11	Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers	Wongi Jeong et.al.	2507.08422	null
2025-07-14	M2DAO-Talker: Harmonizing Multi-granular Motion Decoupling and Alternating Optimization for Talking-head Generation	Kui Jiang et.al.	2507.08307	null
2025-07-10	Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling	Haoyu Wu et.al.	2507.07982	null
2025-07-10	Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions	Longfei Li et.al.	2507.07978	null
2025-07-10	Scaling RL to Long Videos	Yukang Chen et.al.	2507.07966	null
2025-07-11	T-GVC: Trajectory-Guided Generative Video Coding at Ultra-Low Bitrates	Zhitao Wang et.al.	2507.07633	null
2025-07-09	A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality	Mohamed Elmoghany et.al.	2507.07202	null
2025-07-09	Physics-Grounded Motion Forecasting via Equation Discovery for Trajectory-Guided Image-to-Video Generation	Tao Feng et.al.	2507.06830	null
2025-07-14	Democratizing High-Fidelity Co-Speech Gesture Video Generation	Xu Yang et.al.	2507.06812	null
2025-07-09	PromptTea: Let Prompts Tell TeaCache the Optimal Threshold	Zishen Huang et.al.	2507.06739	null
2025-07-09	FIFA: Unified Faithfulness Evaluation Framework for Text-to-Video and Video-to-Text Generation	Liqiang Jing et.al.	2507.06523	null
2025-07-08	Bridging Sequential Deep Operator Network and Video Diffusion: Residual Refinement of Spatio-Temporal PDE Solutions	Jaewan Park et.al.	2507.06133	null
2025-07-09	Omni-Video: Democratizing Unified Video Understanding and Generation	Zhiyu Tan et.al.	2507.06119	null
2025-07-09	Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation	Zhenghao Zhang et.al.	2507.05963	null
2025-07-08	MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos	Rongsheng Wang et.al.	2507.05675	null
2025-07-07	HV-MMBench: Benchmarking MLLMs for Human-Centric Video Understanding	Yuxuan Cai et.al.	2507.04909	null
2025-07-07	Music2Palette: Emotion-aligned Color Palette Generation via Cross-Modal Representation Learning	Jiayun Hu et.al.	2507.04758	null
2025-07-07	Identity-Preserving Text-to-Video Generation Guided by Simple yet Effective Spatial-Temporal Decoupled Representations	Yuji Wang et.al.	2507.04705	null
2025-07-06	MambaVideo for Discrete Video Tokenization with Channel-Split Quantization	Dawit Mureja Argaw et.al.	2507.04559	null
2025-07-06	CLIP-RL: Surgical Scene Segmentation Using Contrastive Language-Vision Pretraining & Reinforcement Learning	Fatmaelzahraa Ali Ahmed et.al.	2507.04317	null
2025-07-05	PresentAgent: Multimodal Agent for Presentation Video Generation	Jingwei Shi et.al.	2507.04036	null
2025-07-05	EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation	Rang Meng et.al.	2507.03905	null
2025-07-08	StreamDiT: Real-Time Streaming Text-to-Video Generation	Akio Kodaira et.al.	2507.03745	null
2025-07-03	RefTok: Reference-Based Tokenization for Video Generation	Xiang Fan et.al.	2507.02862	null
2025-07-03	Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching	Xin Zhou et.al.	2507.02860	null
2025-07-03	AnyI2V: Animating Any Conditional Image with Motion Control	Ziye Li et.al.	2507.02857	null
2025-07-03	Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation	François Rozet et.al.	2507.02608	null
2025-07-02	LongAnimation: Long Animation Generation with Dynamic Global-Local Memory	Nan Chen et.al.	2507.01945	null
2025-07-02	SD-Acc: Accelerating Stable Diffusion through Phase-aware Sampling and Hardware Co-Optimizations	Zhican Wang et.al.	2507.01309	null
2025-07-02	LLM-based Realistic Safety-Critical Driving Video Generation	Yongjie Fu et.al.	2507.01264	null
2025-07-02	AIGVE-MACS: Unified Multi-Aspect Commenting and Scoring Model for AI-Generated Video Evaluation	Xiao Liu et.al.	2507.01255	null
2025-07-01	Geometry-aware 4D Video Generation for Robot Manipulation	Zeyi Liu et.al.	2507.01099	null
2025-07-01	Populate-A-Scene: Affordance-Aware Human Video Generation	Mengyi Shan et.al.	2507.00334	null
2025-06-30	FreeLong++: Training-Free Long Video Generation via Multi-band SpectralFusion	Yu Lu et.al.	2507.00162	null
2025-06-30	Epona: Autoregressive Diffusion World Model for Autonomous Driving	Kaiwen Zhang et.al.	2506.24113	null
2025-06-30	VMoBA: Mixture-of-Block Attention for Video Diffusion Models	Jianzong Wu et.al.	2506.23858	null
2025-07-03	RGC-VQA: An Exploration Database for Robotic-Generated Video Quality Assessment	Jianing Jin et.al.	2506.23852	null
2025-06-30	SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation	Shuai Tan et.al.	2506.23690	null
2025-06-30	ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models	Zixun Fang et.al.	2506.23513	null
2025-06-29	Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis	Lei-lei Li et.al.	2506.23263	null
2025-06-29	RoboScape: Physics-informed Embodied World Model	Yu Shang et.al.	2506.23135	null
2025-07-01	Listener-Rewarded Thinking in VLMs for Image Preferences	Alexander Gambashidze et.al.	2506.22832	null
2025-06-27	RoboEnvision: A Long-Horizon Video Generation Model for Multi-Task Robot Manipulation	Liudi Yang et.al.	2506.22007	null
2025-06-26	SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture	Kehan Sui et.al.	2506.21478	null
2025-06-27	ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models	Hongbo Liu et.al.	2506.21356	null
2025-06-26	HieraSurg: Hierarchy-Aware Diffusion Model for Surgical Video Generation	Diego Biagini et.al.	2506.21287	null
2025-06-26	Video Virtual Try-on with Conditional Diffusion Transformer Inpainter	Cheng Zou et.al.	2506.21270	null
2025-06-27	DFVEdit: Conditional Delta Flow Vector for Zero-shot Video Editing	Lingling Cai et.al.	2506.20967	null
2025-06-26	Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models	Donggoo Kang et.al.	2506.20946	null
2025-06-25	Video Perception Models for 3D Scene Synthesis	Rui Huang et.al.	2506.20601	null
2025-06-25	BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos	Jiahao Lin et.al.	2506.20103	null
2025-06-24	Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation	Xingyang Li et.al.	2506.19852	null
2025-06-24	GenHSI: Controllable Generation of Human-Scene Interaction Videos	Zekun Li et.al.	2506.19840	null
2025-06-24	SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution	Liangbin Xie et.al.	2506.19838	null
2025-06-24	Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router	Yubo Huang et.al.	2506.19833	null
2025-06-24	Training-Free Motion Customization for Distilled Video Generators with Adaptive Test-Time Distillation	Jintao Rong et.al.	2506.19348	null
2025-06-23	VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory	Runjia Li et.al.	2506.18903	null
2025-06-23	From Virtual Games to Real-World Play	Wenqiang Sun et.al.	2506.18901	null
2025-06-23	FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation	Kaiyi Huang et.al.	2506.18899	null
2025-06-23	MinD: Unified Visual Imagination and Control via Hierarchical World Models	Xiaowei Chi et.al.	2506.18897	null
2025-06-23	OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation	Qijun Gan et.al.	2506.18866	null
2025-06-23	Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset	Zhuowei Chen et.al.	2506.18851	null
2025-06-23	Matrix-Game: Interactive World Foundation Model	Yifan Zhang et.al.	2506.18701	null
2025-06-23	RDPO: Real Data Preference Optimization for Physics Consistency Video Generation	Wenxu Qian et.al.	2506.18655	null
2025-06-23	BulletGen: Improving 4D Reconstruction with Bullet-Time Generation	Denys Rozumnyi et.al.	2506.18601	null
2025-06-23	VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning	Xuanyu Zhang et.al.	2506.18564	null
2025-06-23	Emergent Temporal Correspondences from Video Diffusion Transformers	Jisu Nam et.al.	2506.17220	link
2025-06-20	Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition	Jiaqi Li et.al.	2506.17201	null
2025-06-20	Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation	Riccardo Corvi et.al.	2506.16802	null
2025-06-19	VideoGAN-based Trajectory Proposal for Automated Vehicles	Annajoyce Mariani et.al.	2506.16209	link
2025-06-19	FastInit: Fast Noise Initialization for Temporally Consistent Video Generation	Chengyu Bai et.al.	2506.16119	null
2025-06-19	PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models	Tianchen Zhao et.al.	2506.16054	null
2025-06-19	Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization	Cong Wang et.al.	2506.15980	link
2025-06-20	Sekai: A Video Dataset towards World Exploration	Zhen Li et.al.	2506.15675	null
2025-06-20	Show-o2: Improved Native Unified Multimodal Models	Jinheng Xie et.al.	2506.15564	link
2025-06-17	Causally Steered Diffusion for Automated Video Counterfactual Generation	Nikos Spyrou et.al.	2506.14404	link
2025-06-17	CausalDiffTab: Mixed-Type Causal-Aware Diffusion for Tabular Data Generation	Jia-Chen Zhang et.al.	2506.14206	null
2025-06-18	VideoMAR: Autoregressive Video Generation with Continuous Tokens	Hu Yu et.al.	2506.14168	null
2025-06-16	UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions	Zhucun Xue et.al.	2506.13691	null
2025-06-16	STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation	Jiamin Wang et.al.	2506.13138	null
2025-06-15	iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer	Zhelun Shen et.al.	2506.12847	null
2025-06-13	SignAligner: Harmonizing Complementary Pose Modalities for Coherent Sign Language Generation	Xu Wang et.al.	2506.11621	null
2025-06-12	GenWorld: Towards Detecting AI-generated Real-world Simulation Videos	Weiliang Chen et.al.	2506.10975	null
2025-06-12	M4V: Multi-Modal Mamba for Text-to-Video Generation	Jiancheng Huang et.al.	2506.10915	null
2025-06-12	GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning	Xiaoyi Bao et.al.	2506.10639	null
2025-06-12	DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers	Lizhen Wang et.al.	2506.10568	null
2025-06-12	AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation	Haoyuan Shi et.al.	2506.10540	null
2025-06-11	PlayerOne: Egocentric World Simulator	Yuanpeng Tu et.al.	2506.09995	null
2025-06-11	InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions	Zhenzhi Wang et.al.	2506.09984	null
2025-06-11	ReSim: Reliable World Simulation for Autonomous Driving	Jiazhi Yang et.al.	2506.09981	null
2025-06-11	DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning	Dongxu Liu et.al.	2506.09644	null
2025-06-11	Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation	Shanchuan Lin et.al.	2506.09350	null
2025-06-10	Seedance 1.0: Exploring the Boundaries of Video Generation Models	Yu Gao et.al.	2506.09113	null
2025-06-10	FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation	Zheqi He et.al.	2506.09081	link
2025-06-10	MagCache: Fast Video Generation with Magnitude-Aware Cache	Zehong Ma et.al.	2506.09045	link
2025-06-11	Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models	Xuanchi Ren et.al.	2506.09042	link
2025-06-10	HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation	Ziyao Huang et.al.	2506.08797	null
2025-06-10	How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models	Huixuan Zhang et.al.	2506.08351	null
2025-06-09	Seeing Voices: Generating A-Roll Video from Audio with Mirage	Aditi Sundararaman et.al.	2506.08279	null
2025-06-09	Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion	Xun Huang et.al.	2506.08009	null
2025-06-09	Dreamland: Controllable World Creation with Simulator and Generative Models	Sicheng Mo et.al.	2506.08006	null
2025-06-09	Audio-Sync Video Generation with Multi-Stream Temporal Control	Shuchen Weng et.al.	2506.08003	null
2025-06-09	Generative Modeling of Weights: Generalization or Memorization?	Boya Zeng et.al.	2506.07998	link
2025-06-09	Video Unlearning via Low-Rank Refusal Vector	Simone Facchiano et.al.	2506.07891	null
2025-06-09	PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement	Teng Hu et.al.	2506.07848	null
2025-06-09	Consistent Video Editing as Flow-Driven Image-to-Video Generation	Ge Wang et.al.	2506.07713	null
2025-06-10	From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models	Pablo Acuaviva et.al.	2506.07280	null
2025-06-08	TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation	Min-Jung Kim et.al.	2506.07205	null
2025-06-08	Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models	Sangwon Jang et.al.	2506.07177	null
2025-06-06	Restereo: Diffusion stereo video generation and restoration	Xingchang Huang et.al.	2506.06023	null
2025-06-06	LLIA – Enabling Low-Latency Interactive Avatars: Real-Time Audio-Driven Portrait Video Generation with Diffusion Models	Haojie Yu et.al.	2506.05806	null
2025-06-05	EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh	Tao Hu et.al.	2506.05554	null
2025-06-05	ContentV: Efficient Training of Video Generation Models with Limited Compute	Wenfeng Lin et.al.	2506.05343	null
2025-06-09	Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers	Haosong Liu et.al.	2506.05096	null
2025-06-05	FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation	Huihan Wang et.al.	2506.04956	link
2025-06-05	DualX-VSR: Dual Axial Spatial × Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation	Shuo Cao et.al.	2506.04830	null
2025-06-06	FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion	Akide Liu et.al.	2506.04648	null
2025-06-05	Follow-Your-Creation: Empowering 4D Creation through Video Inpainting	Yue Ma et.al.	2506.04590	null
2025-06-04	LayerFlow: A Unified Model for Layer-aware Video Generation	Sihui Ji et.al.	2506.04228	null
2025-06-04	UNIC: Unified In-Context Video Editing	Zixuan Ye et.al.	2506.04216	null
2025-06-05	FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers	Xuanhua He et.al.	2506.04213	null
2025-06-04	DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models	Ziyi Wu et.al.	2506.03517	null
2025-06-03	Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas	Austin Silveria et.al.	2506.03275	null
2025-06-03	IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation	Yuanze Lin et.al.	2506.03150	null
2025-06-03	Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval	Jiwen Yu et.al.	2506.03141	null
2025-06-03	CamCloneMaster: Enabling Reference-based Camera Control for Video Generation	Yawen Luo et.al.	2506.03140	null
2025-06-03	AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation	Lu Qiu et.al.	2506.03126	null
2025-06-03	DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation	Zhengyao Lv et.al.	2506.03123	null
2025-06-03	TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models	Chetwin Low et.al.	2506.03099	null
2025-06-03	ORV: 4D Occupancy-centric Robot Video Generation	Xiuyu Yang et.al.	2506.03079	link
2025-06-03	Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers	Pengtao Chen et.al.	2506.03065	null
2025-06-03	LinkTo-Anime: A 2D Animation Optical Flow Dataset from 3D Model Rendering	Xiaoyi Feng et.al.	2506.02733	null
2025-06-03	LumosFlow: Motion-Guided Long Video Generation	Jiahao Chen et.al.	2506.02497	null
2025-05-30	MiniMax-Remover: Taming Bad Noise Helps Video Object Removal	Bojia Zi et.al.	2505.24873	null
2025-05-30	DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds	Jiaxu Zhang et.al.	2505.24733	null
2025-05-30	UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation	Yang-Tian Sun et.al.	2505.24521	null
2025-05-30	Interactive Video Generation via Domain Adaptation	Ishaan Rawal et.al.	2505.24253	null
2025-05-30	STORK: Improving the Fidelity of Mid-NFE Sampling for Diffusion and Flow Matching Models	Zheng Tan et.al.	2505.24210	link
2025-05-29	MAGREF: Masked Guidance for Any-Reference Video Generation	Yufan Deng et.al.	2505.23742	link
2025-05-29	VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos	Tingyu Song et.al.	2505.23693	link
2025-05-29	VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models	Xiangdong Zhang et.al.	2505.23656	link
2025-05-29	VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation	Shi-Xue Zhang et.al.	2505.23484	link
2025-05-29	Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis	Hengyuan Cao et.al.	2505.23325	null
2025-05-29	RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer	Liu Liu et.al.	2505.23171	null
2025-05-29	Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing	Tongtong Su et.al.	2505.23134	link
2025-05-29	MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation	Siyuan Wang et.al.	2505.23120	link
2025-05-29	GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion	Gwanghyun Kim et.al.	2505.23085	null
2025-05-29	MOVi: Training-free Text-conditioned Multi-Object Video Generation	Aimon Rahman et.al.	2505.22980	null
2025-05-28	Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation	Zhe Kong et.al.	2505.22647	link
2025-05-28	Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers	Weilun Feng et.al.	2505.22167	null
2025-05-28	FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing	Guanwen Feng et.al.	2505.22141	null
2025-05-28	LatentMove: Towards Complex Human Movement Video Generation	Ashkan Taghipour et.al.	2505.22046	null
2025-05-28	PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms	Yifei Xia et.al.	2505.22016	null
2025-05-28	Learning World Models for Interactive Video Generation	Taiye Chen et.al.	2505.21996	null
2025-05-27	HDRSDR-VQA: A Subjective Video Quality Dataset for HDR and SDR Comparative Evaluation	Bowen Chen et.al.	2505.21831	null
2025-05-27	Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation	Ke Zhang et.al.	2505.21653	null
2025-05-27	VideoMarkBench: Benchmarking Robustness of Video Watermarking	Zhengyuan Jiang et.al.	2505.21620	link
2025-05-27	Frame In-N-Out: Unbounded Controllable Image-to-Video Generation	Boyang Wang et.al.	2505.21491	null
2025-05-27	Dynamic Vision from EEG Brain Recordings: How much does EEG know?	Prajwal Singh et.al.	2505.21385	null
2025-05-28	SageAttention2++: A More Efficient Implementation of SageAttention2	Jintao Zhang et.al.	2505.21136	link
2025-05-27	Minute-Long Videos with Dual Parallelisms	Zeqing Wang et.al.	2505.21070	link
2025-05-27	RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy	Aiyue Chen et.al.	2505.21036	null
2025-05-27	Frame-Level Captions for Long Video Generation with Complex Multi Scenes	Guangcong Zheng et.al.	2505.20827	null
2025-05-27	Learning Generalizable Robot Policy with Human Demonstration Video as a Prompt	Xiang Zhu et.al.	2505.20795	null
2025-05-27	Photography Perspective Composition: Towards Aesthetic Perspective Recommendation	Lujian Yao et.al.	2505.20655	null
2025-05-27	Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training	Bolin Lai et.al.	2505.20629	null
2025-05-28	OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation	Shenghai Yuan et.al.	2505.20292	link
2025-05-27	Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM	Peng Liu et.al.	2505.19901	null
2025-05-26	DriveCamSim: Generalizable Camera Simulation via Explicit Camera Modeling for Autonomous Driving	Wenchao Sun et.al.	2505.19692	link
2025-05-26	TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs	Juntong Wang et.al.	2505.19535	null
2025-05-26	The Role of Video Generation in Enhancing Data-Limited Action Understanding	Wei Li et.al.	2505.19495	null
2025-05-26	Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals	Nate Gillman et.al.	2505.19386	null
2025-05-25	From Single Images to Motion Policies via Video-Generation Environment Representations	Weiming Zhi et.al.	2505.19306	null
2025-05-25	SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation	Shenggan Cheng et.al.	2505.19151	null
2025-05-25	WorldEval: World Model as Real-World Robot Policies Evaluator	Yaxuan Li et.al.	2505.19017	null
2025-05-24	Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation	Shuo Yang et.al.	2505.18875	null
2025-05-24	VORTA: Efficient Video Diffusion via Routing Sparse Attention	Wenhao Sun et.al.	2505.18809	link
2025-05-23	WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions	Zizhang Li et.al.	2505.18151	null
2025-05-23	DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation	Junhao Chen et.al.	2505.18078	null
2025-05-23	SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain	Jiawei Zhou et.al.	2505.17727	null
2025-05-23	Scaling Image and Video Generation via Test-Time Evolutionary Search	Haoran He et.al.	2505.17618	null
2025-05-23	InfLVG: Reinforce Inference-Time Consistent Long Video Generation with GRPO	Xueji Fang et.al.	2505.17574	link
2025-05-22	Training-Free Efficient Video Generation via Dynamic Token Carving	Yuechen Zhang et.al.	2505.16864	link
2025-05-22	Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts	Taewon Kang et.al.	2505.16819	null
2025-05-22	MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM	Siwei Meng et.al.	2505.16456	null
2025-05-23	Challenger: Affordable Adversarial Driving Video Generation	Zhiyuan Xu et.al.	2505.15880	null
2025-05-21	Generative AI for Autonomous Driving: A Review	Katharina Winter et.al.	2505.15863	null
2025-05-25	Interspatial Attention for Efficient 4D Human Video Generation	Ruizhi Shao et.al.	2505.15800	null
2025-05-21	AvatarShield: Visual Reinforcement Learning for Human-Centric Video Forgery Detection	Zhipei Xu et.al.	2505.15173	null
2025-05-21	CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation	Xinran Wang et.al.	2505.15145	link
2025-05-20	Programmatic Video Prediction Using Large Language Models	Hao Tang et.al.	2505.14948	link
2025-05-20	Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers	Sucheng Ren et.al.	2505.14687	link
2025-05-20	LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer	Changgu Chen et.al.	2505.14167	null
2025-05-20	Hunyuan-Game: Industrial-grade Intelligent Game Creation Model	Ruihuang Li et.al.	2505.14135	null
2025-05-19	FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance	Dian Shao et.al.	2505.13437	null
2025-05-19	MAGI-1: Autoregressive Video Generation at Scale	Sand. ai et.al.	2505.13211	link
2025-05-19	DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories	Joel Jang et.al.	2505.12705	link
2025-05-19	Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking	Zihan Su et.al.	2505.12667	null
2025-05-19	BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation	Haiquan Wen et.al.	2505.12620	link
2025-05-18	Video-GPT via Next Clip Diffusion	Shaobin Zhuang et.al.	2505.12489	null
2025-05-17	LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation	Jiarui Wang et.al.	2505.12098	link
2025-05-17	VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption	Tianxiong Zhong et.al.	2505.12053	null
2025-05-16	QVGen: Pushing the Limit of Quantized Video Generative Models	Yushi Huang et.al.	2505.11497	null
2025-05-16	Face Consistency Benchmark for GenAI Video	Michal Podstawski et.al.	2505.11425	null
2025-05-14	Aquarius: A Family of Industry-Level Video Generation Models for Marketing Scenarios	Huafeng Shi et.al.	2505.10584	null
2025-05-16	MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation	Yanbo Ding et.al.	2505.10238	link
2025-05-15	ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars	Rui-Yang Ju et.al.	2505.10072	null
2025-05-18	EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models	Hu Yue et.al.	2505.09694	link
2025-05-15	Generating time-consistent dynamics with discriminator-guided image diffusion models	Philipp Hess et.al.	2505.09089	null
2025-05-13	Generative AI for Autonomous Driving: Frontiers and Opportunities	Yuping Wang et.al.	2505.08854	link
2025-05-13	Symbolically-Guided Visual Plan Inference from Uncurated Video Data	Wenyan Yang et.al.	2505.08444	null
2025-05-12	DanceGRPO: Unleashing GRPO on Visual Generation	Zeyue Xue et.al.	2505.07818	null
2025-05-12	ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models	Ozgur Kara et.al.	2505.07652	null
2025-05-16	Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model	Wei Li et.al.	2505.07449	link
2025-05-15	Generative Pre-trained Autoregressive Diffusion Transformer	Yuan Zhang et.al.	2505.07344	null
2025-05-11	DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models	Junhao Xia et.al.	2505.07057	null
2025-05-11	BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation	Panwen Hu et.al.	2505.06985	null
2025-05-10	Jailbreaking the Text-to-Video Generative Models	Jiayang Liu et.al.	2505.06679	null
2025-05-10	ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images	Xianghao Kong et.al.	2505.06537	null
2025-05-08	T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models	Xuyang Guo et.al.	2505.04946	null
2025-05-08	HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation	Teng Hu et.al.	2505.04512	null
2025-05-06	Real-Time Person Image Synthesis Using a Flow Matching Model	Jiwoo Jeong et.al.	2505.03562	link
2025-05-06	Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights	Zhaiming Shen et.al.	2505.03205	null
2025-05-04	DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization	Wenchuan Wang et.al.	2505.02192	null
2025-05-03	PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth	Bu Jin et.al.	2505.01729	null
2025-05-02	VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos	Zongxia Li et.al.	2505.01481	link
2025-05-02	FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis	Jiangtong Tan et.al.	2505.01172	link
2025-05-01	Controllable Weather Synthesis and Removal with Video Diffusion Models	Chih-Hao Lin et.al.	2505.00704	null
2025-05-01	T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation	Xuyang Guo et.al.	2505.00337	null
2025-04-30	Direct Motion Models for Assessing Generated Videos	Kelsey Allen et.al.	2505.00209	null
2025-04-30	Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis	Michal Geyer et.al.	2505.00135	null
2025-04-30	ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction	Qihao Liu et.al.	2504.21855	null
2025-04-30	HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation	Haiyang Zhou et.al.	2504.21650	link
2025-04-30	Simple Visual Artifact Detection in Sora-Generated Videos	Misora Sugiyama et.al.	2504.21334	null
2025-04-30	Capturing Conditional Dependence via Auto-regressive Diffusion Models	Xunpeng Huang et.al.	2504.21314	null
2025-04-29	TesserAct: Learning 4D Embodied World Models	Haoyu Zhen et.al.	2504.20995	null
2025-04-29	DDPS: Discrete Diffusion Posterior Sampling for Paths in Layered Graphs	Hao Luan et.al.	2504.20754	null
2025-04-29	Advance Fake Video Detection via Vision Transformers	Joy Battocchio et.al.	2504.20669	null
2025-04-28	DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer	Junpeng Jiang et.al.	2504.19614	null
2025-04-26	Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning	Yifan Xie et.al.	2504.18810	null
2025-04-26	Stealing Creator’s Workflow: A Creator-Inspired Agentic Framework with Iterative Feedback Loop for Improved Scientific Short-form Generation	Jong Inn Park et.al.	2504.18805	null
2025-04-25	NoiseController: Towards Consistent Multi-view Video Generation via Noise Decomposition and Collaboration	Haotian Dong et.al.	2504.18448	null
2025-04-23	Subject-driven Video Generation via Disentangled Identity and Motion	Daneul Kim et.al.	2504.17816	null
2025-04-24	Dynamic Camera Poses and Where to Find Them	Chris Rockwell et.al.	2504.17788	null
2025-04-24	MV-Crafter: An Intelligent System for Music-guided Video Generation	Chuer Chen et.al.	2504.17267	null
2025-04-24	DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks	Yinqi Li et.al.	2504.17253	link
2025-04-25	We’ll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback	Minkyu Choi et.al.	2504.17180	null
2025-04-23	BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation	Ruotong Wang et.al.	2504.16907	null
2025-04-23	ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance	Ying Li et.al.	2504.16464	null
2025-04-23	VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models	Xuming Hu et.al.	2504.16359	null
2025-04-22	Survey of Video Diffusion Models: Foundations, Implementations, and Applications	Yimu Wang et.al.	2504.16081	link
2025-04-22	Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework	Xinyuan Song et.al.	2504.16016	null
2025-04-22	Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning	Wang Lin et.al.	2504.15932	null
2025-04-22	Satellite to GroundScape – Large-scale Consistent Ground View Generation from Satellite Views	Ningli Xu et.al.	2504.15786	null
2025-04-22	DiTPainter: Efficient Video Inpainting with Diffusion Transformers	Xian Wu et.al.	2504.15661	null
2025-04-21	Solving New Tasks by Adapting Internet Video Knowledge	Calvin Luo et.al.	2504.15369	null
2025-04-21	Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform	Xianpan Zhou et.al.	2504.15182	null
2025-04-21	DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation	Weijie He et.al.	2504.15032	null
2025-04-21	Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation	Chenjie Cao et.al.	2504.14899	link
2025-04-20	Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis	Jingjing Ren et.al.	2504.14470	null
2025-04-19	SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation	Minho Park et.al.	2504.14396	link
2025-04-21	SkyReels-V2: Infinite-length Film Generative Model	Guibin Chen et.al.	2504.13074	link
2025-04-21	Packing Input Frame Context in Next-Frame Prediction Models for Video Generation	Lvmin Zhang et.al.	2504.12626	link
2025-04-16	VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate	Zhihang Yuan et.al.	2504.12259	link
2025-04-16	Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM	Zirui Pan et.al.	2504.12048	null
2025-04-16	The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation	Bingjie Gao et.al.	2504.11739	null
2025-04-17	VideoPanda: Video Panoramic Diffusion with Multi-view Attention	Kevin Xie et.al.	2504.11389	null
2025-04-15	InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation	Yukang Lin et.al.	2504.10905	null
2025-04-15	OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding	Dianbing Xi et.al.	2504.10825	null
2025-04-14	H-MoRe: Learning Human-centric Motion Representation for Action Analysis	Zhanbo Huang et.al.	2504.10676	link
2025-04-14	H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models	Yushu Wu et.al.	2504.10567	null
2025-04-14	FingER: Content Aware Fine-grained Evaluation with Reasoning for AI-Generated Videos	Rui Chen et.al.	2504.10358	null
2025-04-14	Aligning Anime Video Generation with Human Feedback	Bingwen Zhu et.al.	2504.10044	null
2025-04-14	EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise	Chao Liu et.al.	2504.09789	null
2025-04-13	CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models	Pooja Guhan et.al.	2504.09472	null
2025-04-11	Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model	Team Seawead et.al.	2504.08685	null
2025-04-11	Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization	Jialu Li et.al.	2504.08641	null
2025-04-11	Diffusion Models for Robotic Manipulation: A Survey	Rosa Wolf et.al.	2504.08438	null
2025-04-11	EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model	Renda Li et.al.	2504.08344	null
2025-04-11	RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements	Guangcong Zheng et.al.	2504.08212	link
2025-04-11	TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation	Ruineng Li et.al.	2504.08181	null
2025-04-10	Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction	Zeren Jiang et.al.	2504.07961	link
2025-04-10	Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos	Rundong Luo et.al.	2504.07940	null
2025-04-10	Diffusion Transformers for Tabular Data Time Series Generation	Fabrizio Garuti et.al.	2504.07566	link
2025-04-09	EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation	Diljeet Jagpal et.al.	2504.06861	null
2025-04-09	DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation	Wangbo Zhao et.al.	2504.06803	link
2025-04-09	RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism	Elia Peruzzo et.al.	2504.06672	null
2025-04-09	Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception	Ruotian Peng et.al.	2504.06666	null
2025-04-08	CamContextI2V: Context-aware Controllable Video Generation	Luis Denninger et.al.	2504.06022	link
2025-04-07	One-Minute Video Generation with Test-Time Training	Karan Dalal et.al.	2504.05298	null
2025-04-07	Video-Bench: Human-Aligned Video Generation Benchmark	Hui Han et.al.	2504.04907	null
2025-04-05	Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization	Yikai Wang et.al.	2504.04153	link
2025-04-05	Multi-identity Human Image Animation with Structural Video Diffusion	Zhenzhi Wang et.al.	2504.04126	null
2025-04-05	Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models	Xuyang Guo et.al.	2504.04051	null
2025-04-05	DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion	Maksim Siniukov et.al.	2504.04010	null
2025-04-04	Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models	Xuran Ma et.al.	2504.03140	link
2025-04-03	How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models	Pascal Chang et.al.	2504.03072	null
2025-04-03	Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments	Chenyu Zhang et.al.	2504.02918	null
2025-04-03	Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets	Chuning Zhu et.al.	2504.02792	null
2025-04-03	Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model	Shengjun Zhang et.al.	2504.02764	null
2025-04-04	Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation	Fa-Ting Hong et.al.	2504.02542	link
2025-04-03	ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer	Jiayi Gao et.al.	2504.02451	link
2025-04-03	SkyReels-A2: Compose Anything in Video Diffusion Transformers	Zhengcong Fei et.al.	2504.02436	link
2025-04-04	MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition	Takahiro Shirakawa et.al.	2504.02361	null
2025-04-03	OmniCam: Unified Multimodal Video Generation via Camera Control	Xiaoda Yang et.al.	2504.02312	null
2025-04-02	WorldPrompter: Traversable Text-to-Scene Generation	Zhaoyang Zhang et.al.	2504.02045	null
2025-04-03	VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step	Hanyang Wang et.al.	2504.01956	null
2025-04-01	WorldScore: A Unified Evaluation Benchmark for World Generation	Haoyi Duan et.al.	2504.00983	null
2025-04-01	DecoFuse: Decomposing and Fusing the “What”, “Where”, and “How” for Brain-Inspired fMRI-to-Video Decoding	Chong Li et.al.	2504.00432	null
2025-03-31	GazeLLM: Multimodal LLMs incorporating Human Visual Attention	Jun Rekimoto et.al.	2504.00221	null
2025-03-31	Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation	Shengqiong Wu et.al.	2503.24379	null
2025-04-01	HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation	Boyuan Wang et.al.	2503.24026	null
2025-03-31	JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation	Fangda Chen et.al.	2503.23951	null
2025-04-01	On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices	Bosung Kim et.al.	2503.23796	link
2025-03-31	HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation	Kun Liu et.al.	2503.23715	null
2025-03-30	VideoGen-Eval: Agent-based System for Video Generation Evaluation	Yuhang Yang et.al.	2503.23452	link
2025-03-30	JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization	Kai Liu et.al.	2503.23377	null
2025-04-02	Towards Physically Plausible Video Generation via VLM Planning	Xindi Yang et.al.	2503.23368	null
2025-03-30	MoCha: Towards Movie-Grade Talking Character Synthesis	Cong Wei et.al.	2503.23307	null
2025-03-30	SketchVideo: Sketch-based Video Generation and Editing	Feng-Lin Liu et.al.	2503.23284	null
2025-03-28	Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion Model	Jangho Park et.al.	2503.22622	null
2025-03-28	EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation	Hadrien Reynaud et.al.	2503.22357	null
2025-03-28	CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving	Yishen Ji et.al.	2503.22231	null
2025-03-27	VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models	Chi-Pin Huang et.al.	2503.21781	null
2025-03-27	Exploring the Evolution of Physics Cognition in Video Generation: A Survey	Minghui Lin et.al.	2503.21765	link
2025-03-27	VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness	Dian Zheng et.al.	2503.21755	link
2025-03-27	Audio-driven Gesture Generation via Deviation Feature in the Latent Space	Jiahui Chen et.al.	2503.21616	null
2025-03-27	ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model	Jinwei Qi et.al.	2503.21144	null
2025-03-26	RecTable: Fast Modeling Tabular Data with Rectified Flow	Masane Fuchi et.al.	2503.20731	link
2025-03-26	AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports	Xiangwen Zhang et.al.	2503.20654	null
2025-03-26	GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving	Lloyd Russell et.al.	2503.20523	null
2025-03-26	VPO: Aligning Text-to-Video Generation Models with Prompt Optimization	Jiale Cheng et.al.	2503.20491	link
2025-03-26	Wan: Open and Advanced Large-Scale Video Generative Models	WanTeam et.al.	2503.20314	link
2025-03-26	Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models	Prin Phunyaphibarn et.al.	2503.20240	null
2025-03-26	Video Motion Graphs	Haiyang Liu et.al.	2503.20218	null
2025-03-25	Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors	Yuke Lou et.al.	2503.20118	null
2025-03-25	Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals	Stefan Stojanov et.al.	2503.19953	null
2025-03-25	FullDiT: Multi-Task Video Generative Foundation Model with Full Attention	Xuan Ju et.al.	2503.19907	null
2025-03-25	Mask²DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation	Tianhao Qi et.al.	2503.19881	null
2025-03-25	AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers	Jiazhi Guan et.al.	2503.19824	null
2025-03-25	AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset	Haiyu Zhang et.al.	2503.19462	null
2025-03-26	Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing	Jaihoon Kim et.al.	2503.19385	null
2025-03-25	MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation	Yukang Lin et.al.	2503.19383	null
2025-03-26	EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models	Yufei Cai et.al.	2503.19369	link
2025-03-25	Long-Context Autoregressive Video Modeling with Next-Frame Prediction	Yuchao Gu et.al.	2503.19325	link
2025-03-25	Aether: Geometric-Aware Unified World Modeling	Aether Team et.al.	2503.18945	null
2025-03-24	Video-T1: Test-Time Scaling for Video Generation	Fangfu Liu et.al.	2503.18942	null
2025-03-24	Training-free Diffusion Acceleration with Bottleneck Sampling	Ye Tian et.al.	2503.18940	null
2025-03-25	AMD-Hummingbird: Towards an Efficient Text-to-Video Model	Takashi Isobe et.al.	2503.18559	link
2025-03-24	EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation	Qiang Qu et.al.	2503.18552	null
2025-03-24	Can Text-to-Video Generation help Video-Language Alignment?	Luca Zanella et.al.	2503.18507	null
2025-03-24	Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation	Dingcheng Zhen et.al.	2503.18429	null
2025-03-24	Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance	Sicong Feng et.al.	2503.18386	null
2025-03-23	LongDiff: Training-Free Long Video Generation in One Go	Zhuoling Li et.al.	2503.18150	null
2025-03-23	TransAnimate: Taming Layer Diffusion to Generate RGBA Video	Xuewei Chen et.al.	2503.17934	null
2025-03-21	Position: Interactive Generative Video as Next-Generation Game Engine	Jiwen Yu et.al.	2503.17359	null
2025-03-21	AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process	Junjie Hu et.al.	2503.17029	null
2025-03-21	Enabling Versatile Controls for Video Diffusion Models	Xu Zhang et.al.	2503.16983	link
2025-03-21	Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model	Yingying Fan et.al.	2503.16942	null
2025-03-20	XAttention: Block Sparse Attention with Antidiagonal Scoring	Ruyi Xu et.al.	2503.16428	link
2025-03-20	MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance	Quanhao Li et.al.	2503.16421	null
2025-03-20	ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos	Haolin Yang et.al.	2503.16400	null
2025-03-20	PoseTraj: Pose-Aware Trajectory Control in Video Diffusion	Longbin Ji et.al.	2503.16068	null
2025-03-20	Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models	Marc Benedí San Millán et.al.	2503.15996	null
2025-03-20	MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving	Haiguang Wang et.al.	2503.15875	link
2025-03-20	VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling	Hyojun Go et.al.	2503.15855	null
2025-03-19	Temporal Regularization Makes Your Video Generator Stronger	Harold Haodong Chen et.al.	2503.15417	null
2025-03-20	VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention	Mingzhe Zheng et.al.	2503.15138	null
2025-03-18	MusicInfuser: Making Video Diffusion Listen and Dance	Susung Hong et.al.	2503.14505	null
2025-03-18	MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation	Hongyu Zhang et.al.	2503.14428	null
2025-03-18	Impossible Videos	Zechen Bai et.al.	2503.14378	null
2025-03-18	LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models	Yu Cheng et.al.	2503.14325	link
2025-03-18	Concat-ID: Towards Universal Identity-Preserving Video Synthesis	Yong Zhong et.al.	2503.14151	null
2025-03-18	Fast Autoregressive Video Generation with Diagonal Decoding	Yang Ye et.al.	2503.14070	null
2025-03-18	AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark	Xinhao Xiang et.al.	2503.14064	link
2025-03-17	Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction	Zheyuan Liu et.al.	2503.12953	null
2025-03-17	AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations	Quang Trung Truong et.al.	2503.12828	null
2025-03-16	SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs	Guibiao Liao et.al.	2503.12535	null
2025-03-15	A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI	Paula Andrea Pérez-Toro et.al.	2503.12102	null
2025-03-15	SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering	Byeongjun Park et.al.	2503.12024	link
2025-03-14	ReCamMaster: Camera-Controlled Generative Rendering from A Single Video	Jianhong Bai et.al.	2503.11647	null
2025-03-14	HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models	Ziqin Zhou et.al.	2503.11513	null
2025-03-14	TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation	Hongxiang Zhao et.al.	2503.11423	null
2025-03-14	Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model	Haoyang Huang et.al.	2503.11251	link
2025-03-14	Cross-Modal Learning for Music-to-Music-Video Description Generation	Zhuoyuan Mao et.al.	2503.11190	null
2025-03-13	CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models	Hao He et.al.	2503.10592	null
2025-03-13	Long Context Tuning for Video Generation	Yuwei Guo et.al.	2503.10589	null
2025-03-13	CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance	Yufan Deng et.al.	2503.10391	null
2025-03-13	Semantic Latent Motion for Portrait Video Generation	Qiyuan Zhang et.al.	2503.10096	null
2025-03-16	VMBench: A Benchmark for Perception-Aligned Video Motion Generation	Xinran Ling et.al.	2503.10076	link
2025-03-13	UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?	Yuanxin Liu et.al.	2503.09949	link
2025-03-13	VideoMerge: Towards Training-free Long Video Generation	Siyang Zhang et.al.	2503.09926	null
2025-03-12	LuciBot: Automated Robot Policy Learning from Generated Videos	Xiaowen Qiu et.al.	2503.09871	null
2025-03-14	On the Limitations of Vision-Language Models in Understanding Image Transforms	Ahmad Mustafa Anis et.al.	2503.09837	null
2025-03-12	I2V3D: Controllable image-to-video generation with 3D guidance	Zhiyuan Zhang et.al.	2503.09733	null
2025-03-12	PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop	Chenyu Li et.al.	2503.09595	link
2025-03-12	Unified Dense Prediction of Video Diffusion	Lehan Yang et.al.	2503.09344	null
2025-03-12	Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space	Jian Zhu et.al.	2503.09215	null
2025-03-13	WonderVerse: Extendable 3D Scene Generation with Video Generative Models	Hao Feng et.al.	2503.09160	null
2025-03-12	Reangle-A-Video: 4D Video Generation as Video-to-Video Translation	Hyeonho Jeong et.al.	2503.09151	null
2025-03-11	REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder	Yitian Zhang et.al.	2503.08665	null
2025-03-11	Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling	Subin Kim et.al.	2503.08605	null
2025-03-12	$^R$FLAV: Rolling Flow matching for infinite Audio Video generation	Alex Ergasti et.al.	2503.08307	link
2025-03-11	WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation	Jing Wang et.al.	2503.08153	null
2025-03-11	ObjectMover: Generative Object Movement with Video Prior	Xin Yu et.al.	2503.08037	null
2025-03-11	How Can Video Generative AI Transform K-12 Education? Examining Teachers’ Perspectives through TPACK and TAM	Unggi Lee et.al.	2503.08003	null
2025-03-10	DreamRelation: Relation-Centric Video Customization	Yujie Wei et.al.	2503.07602	null
2025-03-11	VACE: All-in-One Video Creation and Editing	Zeyinzi Jiang et.al.	2503.07598	null
2025-03-10	AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion	Mingzhen Sun et.al.	2503.07418	null
2025-03-10	Automated Movie Generation via Multi-Agent CoT Planning	Weijia Wu et.al.	2503.07314	link
2025-03-09	VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation	Hritik Bansal et.al.	2503.06800	null
2025-03-09	TR-DQ: Time-Rotation Diffusion Quantization	Yihua Shao et.al.	2503.06564	null
2025-03-09	QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation	Junyi Wu et.al.	2503.06545	link
2025-03-11	LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation	Quanjian Song et.al.	2503.06508	link
2025-03-09	Generative Video Bi-flow	Chen Liu et.al.	2503.06364	null
2025-03-08	Text2Story: Advancing Video Storytelling with Text Guidance	Taewon Kang et.al.	2503.06310	null
2025-03-08	Object-Centric World Model for Language-Guided Manipulation	Youngjoon Jeong et.al.	2503.06170	null
2025-03-08	VACT: A Video Automatic Causal Testing System and a Benchmark	Haotong Yang et.al.	2503.06163	null
2025-03-07	MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio	Xuenan Xu et.al.	2503.05242	link
2025-03-07	Unified Reward Model for Multimodal Understanding and Generation	Yibin Wang et.al.	2503.05236	null
2025-03-06	Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation	Alexey Buzovkin et.al.	2503.04871	link
2025-03-06	FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video	Yue Gao et.al.	2503.04720	null
2025-03-06	What Are You Doing? A Closer Look at Controllable Human Video Generation	Emanuele Bugliarello et.al.	2503.04666	null
2025-03-08	The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation	Aoxiong Yin et.al.	2503.04606	link
2025-03-05	GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control	Xuanchi Ren et.al.	2503.03751	link
2025-03-08	Rethinking Video Tokenization: A Conditioned Diffusion-based Approach	Nianzu Yang et.al.	2503.03708	link
2025-03-05	DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance	Zhao Yang et.al.	2503.03689	link
2025-03-05	High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights	Yuna Kato et.al.	2503.03558	link
2025-03-05	Video Super-Resolution: All You Need is a Video Diffusion Model	Zhihao Zhan et.al.	2503.03355	null
2025-03-04	GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning	Zhun Mou et.al.	2503.02341	null
2025-03-03	VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation	Wenhao Wang et.al.	2503.01739	link
2025-03-03	VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors	Juil Koo et.al.	2503.01107	null
2025-03-02	Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think	Jie Tian et.al.	2503.00948	link
2025-03-01	Learning to Animate Images from A Few Videos to Portray Delicate Human Actions	Haoxin Li et.al.	2503.00276	null
2025-03-04	Unified Video Action Model	Shuang Li et.al.	2503.00200	null
2025-02-28	Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos	Zhiyu Tan et.al.	2502.21314	null
2025-02-28	Training-free and Adaptive Sparse Attention for Efficient Long Video Generation	Yifei Xia et.al.	2502.21079	null
2025-02-28	HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models	Xiao Wang et.al.	2502.20811	null
2025-02-28	WorldModelBench: Judging Video Generation Models As World Models	Dacheng Li et.al.	2502.20694	null
2025-02-27	Mobius: Text to Seamless Looping Video Generation via Latent Shift	Xiuli Bi et.al.	2502.20307	link
2025-02-27	FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute	Sotiris Anagnostidis et.al.	2502.20126	null
2025-02-27	C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation	Yuhao Li et.al.	2502.19868	link
2025-02-26	FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion model	Lingzhou Mu et.al.	2502.19455	null
2025-03-03	TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis	Menghao Li et.al.	2502.19454	null
2025-02-25	SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference	Jintao Zhang et.al.	2502.18137	link
2025-02-25	A Survey: Spatiotemporal Consistency in Video Generation	Zhiyu Yin et.al.	2502.17863	null
2025-02-24	X-Dancer: Expressive Music to Human Dance Video Generation	Zeyuan Chen et.al.	2502.17414	null
2025-02-24	VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing	Xiangpeng Yang et.al.	2502.17258	null
2025-02-24	Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions	Zhong Li et.al.	2502.17119	link
2025-02-21	RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers	Min Zhao et.al.	2502.15894	null
2025-02-21	VaViM and VaVAM: Autonomous Driving through Video Generative Modeling	Florent Bartoccioni et.al.	2502.15672	link
2025-02-20	Hardware-Friendly Static Quantization Method for Video Diffusion Transformers	Sanghyun Yi et.al.	2502.15077	null
2025-02-20	LAVID: An Agentic LVLM Framework for Diffusion-Generated Video Detection	Qingyuan Liu et.al.	2502.14994	null
2025-02-20	Improving the Diffusability of Autoencoders	Ivan Skorokhodov et.al.	2502.14831	null
2025-02-21	RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers	Ke Cao et.al.	2502.14377	null
2025-02-20	Designing Parameter and Compute Efficient Diffusion Transformers using Distillation	Vignesh Sundaresha et.al.	2502.14226	null
2025-02-19	FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation	Yunpeng Zhang et.al.	2502.13995	link
2025-02-19	LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation	Junchen Fu et.al.	2502.12945	null
2025-02-18	VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation	Xinlong Chen et.al.	2502.12782	link
2025-02-18	MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation	Sihyun Yu et.al.	2502.12632	null
2025-02-17	LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities	Florian Sestak et.al.	2502.12128	link
2025-02-17	DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation	Zhihang Yuan et.al.	2502.11897	link
2025-02-17	Object-Centric Image to Video Generation with Language Guidance	Angel Villar-Corrales et.al.	2502.11655	null
2025-02-16	MaskFlow: Discrete Flows For Flexible and Efficient Long Video Generation	Michael Fuest et.al.	2502.11234	null
2025-02-16	Phantom: Subject-consistent video generation via cross-modal alignment	Lijie Liu et.al.	2502.11079	null
2025-02-17	Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model	Guoqing Ma et.al.	2502.10248	link
2025-02-14	RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control	Teng Li et.al.	2502.10059	null
2025-02-14	GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation	Hongyin Zhang et.al.	2502.09268	null
2025-02-12	CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation	Qinghe Wang et.al.	2502.08639	null
2025-02-12	FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis	Wonjoon Jin et.al.	2502.08244	null
2025-02-12	Learning Human Skill Generators at Key-Step Levels	Yilu Wu et.al.	2502.08234	null
2025-02-12	AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance	Zhao Wang et.al.	2502.08189	null
2025-02-12	Next Block Prediction: Video Generation via Semi-Autoregressive Modeling	Shuhuai Ren et.al.	2502.07737	null
2025-02-14	Magic 1-For-1: Generating One Minute Video Clips within One Minute	Hongwei Yi et.al.	2502.07701	link
2025-02-12	VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation	Sixiao Zheng et.al.	2502.07531	null
2025-02-13	Enhance-A-Video: Better Generated Video for Free	Yang Luo et.al.	2502.07508	link
2025-02-11	Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos	Haowen Gao et.al.	2502.07327	null
2025-02-11	Articulate That Object Part (ATOP): 3D Part Articulation from Text and Motion Personalization	Aditya Vora et.al.	2502.07278	null
2025-02-11	Contextual Gesture: Co-Speech Gesture Video Generation through Context-aware Gesture Representation	Pinxin Liu et.al.	2502.07239	null
2025-02-10	Lotus: Creating Short Videos From Long Videos With Abstractive and Extractive Summarization	Aadit Barua et.al.	2502.07096	null
2025-02-10	Conditional diffusion model with spatial attention and latent embedding for medical image segmentation	Behzad Hejrati et.al.	2502.06997	link
2025-02-10	Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT	Dongyang Liu et.al.	2502.06782	null
2025-02-10	History-Guided Video Diffusion	Kiwhan Song et.al.	2502.06764	null
2025-02-10	Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists	Bojia Zi et.al.	2502.06734	null
2025-02-10	TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models	Yangguang Li et.al.	2502.06608	link
2025-02-10	CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers	D. She et.al.	2502.06527	null
2025-02-10	Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile	Hangliang Ding et.al.	2502.06155	null
2025-02-08	Towards AI-driven Sign Language Generation with Non-manual Markers	Han Zhang et.al.	2502.05661	null
2025-02-08	Training-Free Constrained Generation With Stable Diffusion Models	Stefano Zampini et.al.	2502.05625	null
2025-02-08	A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction	Yongfan Chen et.al.	2502.05503	link
2025-02-07	FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation	Shilong Zhang et.al.	2502.05179	link
2025-02-07	Goku: Flow Based Video Generative Foundation Models	Shoufa Chen et.al.	2502.04896	null
2025-02-07	HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation	Qijun Gan et.al.	2502.04847	null
2025-02-06	Fast Video Generation with Sliding Tile Attention	Peiyuan Zhang et.al.	2502.04507	null
2025-02-06	UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation	Wenzhang Sun et.al.	2502.04393	null
2025-02-06	MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation	Jinbo Xing et.al.	2502.04299	null
2025-02-06	Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression	Lirui Wang et.al.	2502.04296	null
2025-02-06	Content-Rich AIGC Video Quality Assessment via Intricate Text Alignment and Motion-Aware Consistency	Shangkun Sun et.al.	2502.04076	link
2025-02-06	UniForm: A Unified Diffusion Transformer for Audio-Video Generation	Lei Zhao et.al.	2502.03897	null
2025-02-05	Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach	Yunuo Chen et.al.	2502.03639	null
2025-02-05	FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise	Yunlong Yuan et.al.	2502.03496	null
2025-02-05	MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent	Xinyao Liao et.al.	2502.03207	null
2025-02-04	Controllable Video Generation with Provable Disentanglement	Yifan Shen et.al.	2502.02690	null
2025-02-04	VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models	Hila Chefer et.al.	2502.02492	null
2025-02-05	IPO: Iterative Preference Optimization for Text-to-Video Generation	Xiaomeng Yang et.al.	2502.02088	null
2025-02-03	VILP: Imitation Learning with Latent Video Planning	Zhengtong Xu et.al.	2502.01784	link
2025-02-03	Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity	Haocheng Xi et.al.	2502.01776	null
2025-02-05	MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation	Haibo Tong et.al.	2502.01719	null
2025-02-02	HuViDPO: Enhancing Video Generation through Direct Preference Optimization for Human-Centric Alignment	Lifan Jiang et.al.	2502.01690	null
2025-02-03	Improved Training Technique for Latent Consistency Models	Quan Dao et.al.	2502.01441	link
2025-02-03	VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control	Lifan Jiang et.al.	2502.01101	link
2025-02-03	OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models	Gaojie Lin et.al.	2502.01061	null
2025-02-03	Pushing the Boundaries of State Space Models for Image and Video Generation	Yicong Hong et.al.	2502.00972	null
2025-01-31	Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search	Yuta Oshima et.al.	2501.19252	null
2025-01-30	Every Image Listens, Every Image Dances: Music-Driven Image Animation	Zhikang Dong et.al.	2501.18801	null
2025-01-28	CascadeV: An Implementation of Wurstchen Architecture for Video Generation	Wenfeng Lin et.al.	2501.16612	link
2025-01-26	“See What I Imagine, Imagine What I See”: Human-AI Co-Creation System for 360$^\circ$ Panoramic Video Generation in VR	Yunge Wen et.al.	2501.15456	null
2025-01-24	VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking	Runyi Hu et.al.	2501.14195	link
2025-01-23	Improving Video Generation with Human Feedback	Jie Liu et.al.	2501.13918	null
2025-01-23	EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion	Jiangchuan Wei et.al.	2501.13452	null
2025-01-21	Taming Teacher Forcing for Masked Autoregressive Video Generation	Deyu Zhou et.al.	2501.12389	null
2025-01-22	Video Depth Anything: Consistent Depth Estimation for Super-Long Videos	Sili Chen et.al.	2501.12375	null
2025-01-20	GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video	Zhenliang Ni et.al.	2501.11340	null
2025-01-20	CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation	Zheng Chong et.al.	2501.11325	link
2025-01-18	EMO2: End-Effector Guided Audio-Driven Avatar Video Generation	Linrui Tian et.al.	2501.10687	null
2025-01-17	DiffuEraser: A Diffusion Model for Video Inpainting	Xiaowen Li et.al.	2501.10018	link
2025-01-17	RichSpace: Enriching Text-to-Video Prompt Space via Text Embedding Interpolation	Yuefan Cao et.al.	2501.09982	null
2025-01-16	VideoWorld: Exploring Knowledge Learning from Unlabeled Videos	Zhongwei Ren et.al.	2501.09781	null
2025-01-16	Learnings from Scaling Visual Tokenizers for Reconstruction and Generation	Philippe Hansen-Estruch et.al.	2501.09755	null
2025-01-14	Do generative video models learn physical principles from watching videos?	Saman Motamed et.al.	2501.09038	link
2025-01-15	Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion	Jingyuan Chen et.al.	2501.09019	null
2025-01-15	RepVideo: Rethinking Cross-Layer Representation for Video Generation	Chenyang Si et.al.	2501.08994	null
2025-01-15	Comprehensive Subjective and Objective Evaluation Method for Text-generated Video	Zelu Qi et.al.	2501.08545	null
2025-01-14	Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models	Weichen Fan et.al.	2501.08453	null
2025-01-14	3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering	Meenakshi Krishnan et.al.	2501.08370	null
2025-01-14	GameFactory: Creating New Games with Generative Interactive Videos	Jiwen Yu et.al.	2501.08325	null
2025-01-14	Diffusion Adversarial Post-Training for One-Step Video Generation	Shanchuan Lin et.al.	2501.08316	null
2025-01-14	LayerAnimate: Layer-specific Control for Animation	Yuxue Yang et.al.	2501.08295	null
2025-01-14	FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors	Yabo Zhang et.al.	2501.08225	link
2025-01-13	BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations	Weixi Feng et.al.	2501.07647	null
2025-01-13	Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss	Xinyu Zhang et.al.	2501.07563	null
2025-01-11	Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning	Maomao Li et.al.	2501.06438	null
2025-01-10	MEt3R: Measuring Multi-View Consistency in Generated Images	Mohammad Asim et.al.	2501.06336	null
2025-01-10	Multi-subject Open-set Personalization in Video Generation	Tsai-Shien Chen et.al.	2501.06187	null
2025-01-10	VideoAuteur: Towards Long Narrative Video Generation	Junfei Xiao et.al.	2501.06173	null
2025-01-08	Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion	Yongjia Ma et.al.	2501.05484	null
2025-01-09	Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces	Aniruddha Mahapatra et.al.	2501.05442	null
2025-01-08	ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning	Yuzhou Huang et.al.	2501.04698	null
2025-01-08	LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition	Bowen Hao et.al.	2501.04204	null
2025-01-07	Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers	Yuechen Zhang et.al.	2501.03931	link
2025-01-09	Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control	Zekai Gu et.al.	2501.03847	link
2025-01-07	Motion-Aware Generative Frame Interpolation	Guozhen Zhang et.al.	2501.03699	null
2025-01-06	License Plate Images Generation with Diffusion Models	Mariia Shpir et.al.	2501.03374	null
2025-01-06	Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation	Guy Yariv et.al.	2501.03059	null
2025-01-06	TransPixar: Advancing Text-to-Video Generation with Transparency	Luozhou Wang et.al.	2501.03006	link
2025-01-06	Brick-Diffusion: Generating Long Videos with Brick-to-Wall Denoising	Yunlong Yuan et.al.	2501.02741	null
2025-01-05	GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking	Weikang Bian et.al.	2501.02690	null
2025-01-04	Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey	Zongxia Li et.al.	2501.02189	link
2025-01-03	JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing	Qili Wang et.al.	2501.01798	link
2025-01-06	VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control	Yuanpeng Tu et.al.	2501.01427	null
2025-01-03	Free-Form Motion Control: A Synthetic Video Generation Dataset with Controllable Camera and Object Motions	Xincheng Shuai et.al.	2501.01425	null
2025-01-02	On Unifying Video Generation and Camera Pose Estimation	Chun-Hao Paul Huang et.al.	2501.01409	null
2025-01-01	Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform	Cheonsu Jeong et.al.	2501.00750	null
2025-01-03	DreamDrive: Generative 4D Scene Modeling from Street View Images	Jiageng Mao et.al.	2501.00601	null
2024-12-30	LTX-Video: Realtime Video Latent Diffusion	Yoav HaCohen et.al.	2501.00103	link
2024-12-30	Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model	Yifei Huang et.al.	2412.21080	link
2024-12-30	VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation	Jiazheng Xu et.al.	2412.21059	link
2024-12-30	ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation	Ting Zhang et.al.	2412.20901	null
2024-12-30	Dialogue Director: Bridging the Gap in Dialogue Visualization for Multimodal Storytelling	Min Zhang et.al.	2412.20725	null
2024-12-29	Open-Sora: Democratizing Efficient Video Production for All	Zangwei Zheng et.al.	2412.20404	link
2024-12-27	Generative Video Propagation	Shaoteng Liu et.al.	2412.19761	null
2024-12-30	VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models	Tao Wu et.al.	2412.19645	null
2024-12-30	DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT	Xiaotao Hu et.al.	2412.19505	link
2024-12-25	Accelerating Diffusion Transformers with Dual Feature Caching	Chang Zou et.al.	2412.18911	link
2024-12-24	Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation	Faraz Waseem et.al.	2412.18688	null
2024-12-24	DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers	Yuntao Chen et.al.	2412.18607	null
2024-12-24	ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation	Hongjie Li et.al.	2412.18600	null
2024-12-24	DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation	Minghong Cai et.al.	2412.18597	link
2024-12-23	Large Motion Video Autoencoding with Cross-modal Video VAE	Yazhou Xing et.al.	2412.17805	null
2024-12-23	VidTwin: Video VAE with Decoupled Structure and Dynamics	Yuchi Wang et.al.	2412.17726	link
2024-12-23	FFA Sora, video generation as fundus fluorescein angiography simulator	Xinyuan Wu et.al.	2412.17346	null
2024-12-23	Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory	Xingyao Li et.al.	2412.17254	null
2024-12-22	SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults	Jinzhi Wang et.al.	2412.17077	null
2024-12-22	Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation	Luoxu Jin et.al.	2412.17042	null
2024-12-21	GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space	Souhaib Attaiki et.al.	2412.16717	null
2024-12-21	TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models	Haocheng Huang et.al.	2412.16700	null
2024-12-21	VAST 1.0: A Unified Framework for Controllable and Consistent Video Generation	Chi Zhang et.al.	2412.16677	null
2024-12-21	Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance	Beiyuan Zhang et.al.	2412.16495	null
2024-12-20	DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization	Zihan Ding et.al.	2412.15689	null
2024-12-20	CustomTTT: Motion and Appearance Customized Video Generation via Test-Time Training	Xiuli Bi et.al.	2412.15646	link
2024-12-19	AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation	Moayed Haji-Ali et.al.	2412.15191	null
2024-12-19	Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM	Yatai Ji et.al.	2412.15156	link
2024-12-19	Parallelized Autoregressive Visual Generation	Yuqing Wang et.al.	2412.15119	null
2024-12-19	Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations	Yucheng Hu et.al.	2412.14803	null
2024-12-19	Consistent Human Image and Video Generation with Spatially Conditioned Diffusion	Mingdeng Cao et.al.	2412.14531	link
2024-12-19	DirectorLLM for Human-Centric Video Generation	Kunpeng Song et.al.	2412.14484	null
2024-12-18	Autoregressive Video Generation without Vector Quantization	Haoge Deng et.al.	2412.14169	link
2024-12-18	VideoDPO: Omni-Preference Alignment for Video Diffusion Generation	Runtao Liu et.al.	2412.14167	null
2024-12-18	AKiRa: Augmentation Kit on Rays for optical video generation	Xi Wang et.al.	2412.14158	null
2024-12-18	SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation	Tong Chen et.al.	2412.14018	null
2024-12-18	Real-time One-Step Diffusion-based Expressive Portrait Videos Generation	Hanzhong Guo et.al.	2412.13479	link
2024-12-18	SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation	Kazuki Shimada et.al.	2412.13462	null
2024-12-17	CompactFlowNet: Efficient Real-time Optical Flow Estimation on Mobile Devices	Andrei Znobishchev et.al.	2412.13273	null
2024-12-17	MotionBridge: Dynamic Video Inbetweening with Flexible Controls	Maham Tanveer et.al.	2412.13190	null
2024-12-17	VidTok: A Versatile and Open-Source Video Tokenizer	Anni Tang et.al.	2412.13061	link
2024-12-16	Can video generation replace cinematographers? Research on the cinematic language of generated video	Xiaozhe Li et.al.	2412.12223	null
2024-12-16	InterDyn: Controllable Interactive Dynamics with Video Diffusion Models	Rick Akkerman et.al.	2412.11785	null
2024-12-16	Generative Inbetweening through Frame-wise Conditions-Driven Video Generation	Tianyi Zhu et.al.	2412.11755	link
2024-12-16	VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting	Muhammet Furkan Ilaslan et.al.	2412.11621	link
2024-12-15	GenLit: Reformulating Single-Image Relighting as Video Generation	Shrisha Bharadwaj et.al.	2412.11224	null
2024-12-15	DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes	Jinxiu Liu et.al.	2412.11100	null
2024-12-14	Video Diffusion Transformers are In-Context Learners	Zhengcong Fei et.al.	2412.10783	link
2024-12-13	SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device	Yushu Wu et.al.	2412.10494	null
2024-12-16	TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation	Xingrui Wang et.al.	2412.10275	null
2024-12-13	Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark	Yudong Jiang et.al.	2412.10255	link
2024-12-13	LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity	Hongjie Wang et.al.	2412.09856	null
2024-12-13	MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion	Xunnong Xu et.al.	2412.09828	null
2024-12-12	Doe-1: Closed-Loop Autonomous Driving with Large World Model	Wenzhao Zheng et.al.	2412.09627	link
2024-12-12	OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation	Weiqi Li et.al.	2412.09623	null
2024-12-12	Owl-1: Omni World Model for Consistent Long Video Generation	Yuanhui Huang et.al.	2412.09600	link
2024-12-12	LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors	Yabo Chen et.al.	2412.09597	null
2024-12-12	Video Creation by Demonstration	Yihong Sun et.al.	2412.09551	null
2024-12-12	UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer	Delong Liu et.al.	2412.09389	link
2024-12-12	T-SVG: Text-Driven Stereoscopic Video Generation	Qiao Jin et.al.	2412.09323	null
2024-12-12	InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption	Tiehan Fan et.al.	2412.09283	null
2024-12-12	LVMark: Robust Watermark for latent video diffusion models	MinHyuk Jang et.al.	2412.09122	null
2024-12-12	Enhancing Facial Consistency in Conditional Video Generation via Facial Landmark Transformation	Lianrui Mu et.al.	2412.08976	null
2024-12-11	Pysical Informed Driving World Model	Zhuoran Yang et.al.	2412.08410	null
2024-12-11	FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks	Chongkai Gao et.al.	2412.08261	null
2024-12-11	VSD2M: A Large-scale Vision-language Sticker Dataset for Multi-frame Animated Sticker Generation	Zhiqiang Yuan et.al.	2412.08259	null
2024-12-11	UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics	Xi Chen et.al.	2412.07774	null
2024-12-10	From Slow Bidirectional to Fast Causal Video Generators	Tianwei Yin et.al.	2412.07772	null
2024-12-10	SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints	Jianhong Bai et.al.	2412.07760	link
2024-12-10	3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation	Xiao Fu et.al.	2412.07759	null
2024-12-10	Multi-Shot Character Consistency for Text-to-Video Generation	Yuval Atzmon et.al.	2412.07750	null
2024-12-10	StyleMaster: Stylize Your Video with Artistic Generation and Translation	Zixuan Ye et.al.	2412.07744	null
2024-12-10	STIV: Scalable Text and Image Conditioned Video Generation	Zongyu Lin et.al.	2412.07730	null
2024-12-10	ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer	Jinyi Hu et.al.	2412.07720	link
2024-12-09	SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations	Zhaorun Chen et.al.	2412.06878	null
2024-12-08	Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training	Zhenghong Zhou et.al.	2412.06029	null
2024-12-08	FlexDiT: Dynamic Token Density Control for Diffusion Transformer	Shuning Chang et.al.	2412.06028	link
2024-12-08	Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation	Hyeonho Jeong et.al.	2412.06016	null
2024-12-08	Accelerating Video Diffusion Models via Distribution Matching	Yuanzhi Zhu et.al.	2412.05899	null
2024-12-08	MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation	Shuwei Shi et.al.	2412.05848	null
2024-12-08	Self-Guidance: Boosting Flow and Diffusion Generation on Their Own	Tiancheng Li et.al.	2412.05827	null
2024-12-07	Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation	Leonardo Pina et.al.	2412.05694	null
2024-12-06	Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model	Lening Wang et.al.	2412.05280	link
2024-12-06	Mind the Time: Temporally-Controlled Multi-Event Video Generation	Ziyi Wu et.al.	2412.05263	null
2024-12-06	UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving	Rui Chen et.al.	2412.04842	link
2024-12-05	Using Diffusion Priors for Video Amodal Segmentation	Kaihua Chen et.al.	2412.04623	null
2024-12-05	PaintScene4D: Consistent 4D Scene Generation from Text Prompts	Vinayak Gupta et.al.	2412.04471	null
2024-12-05	MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation	Longtao Zheng et.al.	2412.04448	null
2024-12-05	DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models	Yizhuo Li et.al.	2412.04446	null
2024-12-05	GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration	Kaiyi Huang et.al.	2412.04440	null
2024-12-05	Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation	Yuying Ge et.al.	2412.04432	link
2024-12-05	Instructional Video Generation	Yayuan Li et.al.	2412.04189	null
2024-12-05	IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation	Sejong Yang et.al.	2412.04000	null
2024-12-05	DiffSign: AI-Assisted Generation of Customizable Sign Language Videos With Enhanced Realism	Sudha Krishnamurthy et.al.	2412.03878	link
2024-12-05	Movie Gen: SWOT Analysis of Meta’s Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries	Abul Ehtesham et.al.	2412.03837	null
2024-12-04	Advancing Auto-Regressive Continuation for Video Frames	Ruibo Ming et.al.	2412.03758	null
2024-12-04	Navigation World Models	Amir Bar et.al.	2412.03572	null
2024-12-04	Imagine360: Immersive 360 Video Generation from Perspective Anchor	Jing Tan et.al.	2412.03552	null
2024-12-04	Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention	Hannan Lu et.al.	2412.03520	null
2024-12-04	SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model	Yan Li et.al.	2412.03430	null
2024-12-04	MaterialPicker: Multi-Modal Material Generation with Diffusion Transformers	Xiaohe Ma et.al.	2412.03225	null
2024-12-04	Mimir: Improving Video Diffusion Models for Precise Text Understanding	Shuai Tan et.al.	2412.03085	null
2024-12-03	Motion Prompting: Controlling Video Generation with Motion Trajectories	Daniel Geng et.al.	2412.02700	null
2024-12-03	AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction	Lingteng Qiu et.al.	2412.02684	null
2024-12-03	Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback	Hiroki Furuta et.al.	2412.02617	null
2024-12-03	VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation	Mingzhe Zheng et.al.	2412.02259	link
2024-12-02	World-consistent Video Diffusion with Explicit 3D Modeling	Qihang Zhang et.al.	2412.01821	null
2024-12-02	Driving Scene Synthesis on Free-form Trajectories with Generative Prior	Zeyu Yang et.al.	2412.01717	null
2024-12-04	InfinityDrive: Breaking Time Limits in Driving World Models	Xi Guo et.al.	2412.01522	null
2024-12-02	CPA: Camera-pose-awareness Diffusion Transformer for Video Generation	Yuelei Wang et.al.	2412.01429	null
2024-12-02	MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models	Xiaomin Li et.al.	2412.01343	null
2024-12-02	Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation	Xin Yan et.al.	2412.01316	null
2024-11-29	Fleximo: Towards Flexible Text-to-Human Motion Video Generation	Yuhang Zhang et.al.	2411.19459	null
2024-11-28	Trajectory Attention for Fine-grained Video Motion Control	Zeqi Xiao et.al.	2411.19324	null
2024-11-28	MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation	Daewon Yoon et.al.	2411.19121	null
2024-11-28	Timestep Embedding Tells: It’s Time to Cache for Video Diffusion Model	Feng Liu et.al.	2411.19108	null
2024-11-28	SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing	Rong-Cheng Tu et.al.	2411.18983	null
2024-12-02	AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers	Sherwin Bahmani et.al.	2411.18673	null
2024-11-27	Towards Chunk-Wise Generation for Long Videos	Siyang Zhang et.al.	2411.18668	null
2024-11-27	Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models	Yiming Wu et.al.	2411.18375	null
2024-11-30	MotionCharacter: Identity-Preserving and Motion Controllable Human Video Generation	Haopeng Fang et.al.	2411.18281	null
2024-11-26	Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey	Hong-Hanh Nguyen-Le et.al.	2411.17911	null
2024-11-27	Accelerating Vision Diffusion Transformers with Skip Branches	Guanjie Chen et.al.	2411.17616	link
2024-11-26	Identity-Preserving Text-to-Video Generation by Frequency Decomposition	Shenghai Yuan et.al.	2411.17440	link
2024-11-26	AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation	Ziyi Xu et.al.	2411.17383	null
2024-11-26	AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM	Jiarui Wang et.al.	2411.17221	link
2024-11-28	PhysMotion: Physics-Grounded Dynamics From a Single Image	Xiyang Tan et.al.	2411.17189	null
2024-11-26	PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation	Hengjia Li et.al.	2411.17048	null
2024-11-26	Free$^2$Guide: Gradient-Free Path Integral Control for Enhancing Text-to-Video Generation with Large Vision-Language Models	Jaemin Kim et.al.	2411.17041	null
2024-11-25	Pathways on the Image Manifold: Image Editing via Video Generation	Noam Rotstein et.al.	2411.16819	null
2024-11-25	DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation	Zun Wang et.al.	2411.16657	null
2024-11-25	Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric	Zhichao Zhang et.al.	2411.16619	null
2024-11-25	Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing	Kaifeng Gao et.al.	2411.16375	link
2024-11-23	Optical-Flow Guided Prompt Optimization for Coherent Video Generation	Hyelin Nam et.al.	2411.15540	null
2024-11-22	MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation	Weijia Wu et.al.	2411.15262	link
2024-11-22	VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement	Daeun Lee et.al.	2411.15115	null
2024-11-21	Understanding World or Predicting Future? A Comprehensive Survey of World Models	Jingtao Ding et.al.	2411.14499	null
2024-11-21	StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart	Jian Shi et.al.	2411.14295	link
2024-11-21	TaQ-DiT: Time-aware Quantization for Diffusion Transformers	Xinyan Liu et.al.	2411.14172	null
2024-11-21	MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control	Ruiyuan Gao et.al.	2411.13807	null
2024-11-20	What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality	Zihan Wang et.al.	2411.13609	null
2024-11-20	REDUCIO! Generating 1024 $\times$ 1024 Video within 16 Seconds using Extremely Compressed Motion Latents	Rui Tian et.al.	2411.13552	link
2024-11-20	VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models	Ziqi Huang et.al.	2411.13503	link
2024-11-19	Towards motion from video diffusion models	Paul Janson et.al.	2411.12831	null
2024-11-19	Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting	Haoyu Zhao et.al.	2411.12789	null
2024-11-19	PoM: Efficient Image and Video Generation with the Polynomial Mixer	David Picard et.al.	2411.12663	link
2024-11-18	Medical Video Generation for Disease Progression Simulation	Xu Cao et.al.	2411.11943	null
2024-11-18	SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input	Zhen Lv et.al.	2411.11934	null
2024-11-19	SoK: On the Role and Future of AIGC Watermarking in the Era of Gen-AI	Kui Ren et.al.	2411.11478	null
2024-11-18	Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge	Qinglong Cao et.al.	2411.11343	null
2024-11-17	SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration	Jintao Zhang et.al.	2411.10958	link
2024-11-16	ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models	Vipula Rawte et.al.	2411.10867	null
2024-11-16	AnimateAnything: Consistent and Controllable Animation for Video Generation	Guojun Lei et.al.	2411.10836	null
2024-11-15	OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models	Mathis Koroglu et.al.	2411.10501	null
2024-11-14	Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance	Md Fahim Anjum et.al.	2411.09174	null
2024-11-14	VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation	Youpeng Wen et.al.	2411.09153	null
2024-11-16	A Survey on Vision Autoregressive Model	Kai Jiang et.al.	2411.08666	null
2024-11-13	EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation	Xiaofeng Wang et.al.	2411.08380	null
2024-11-13	Motion Control for Enhanced Complex Action Video Generation	Qiang Zhou et.al.	2411.08328	null
2024-11-12	Artificial Intelligence for Biomedical Video Generation	Linyuan Li et.al.	2411.07619	null
2024-11-10	I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength	Wanquan Feng et.al.	2411.06525	null
2024-11-08	Autoregressive Models in Vision: A Survey	Jing Xiong et.al.	2411.05902	link
2024-11-08	WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making	Zhilong Zhang et.al.	2411.05619	null
2024-11-07	SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation	Koichi Namekata et.al.	2411.04989	null
2024-11-07	Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification	Mischa Dombrowski et.al.	2411.04956	null
2024-11-07	DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion	Wenqiang Sun et.al.	2411.04928	null
2024-11-11	StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration	Panwen Hu et.al.	2411.04925	null
2024-11-07	MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views	Yuedong Chen et.al.	2411.04924	link
2024-11-07	Taming Rectified Flow for Inversion and Editing	Jiangshan Wang et.al.	2411.04746	link
2024-11-05	TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation	Wenhao Wang et.al.	2411.04709	null
2024-11-05	Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey	Ao Fu et.al.	2411.02914	null
2024-11-07	Adaptive Caching for Faster Video Generation with Diffusion Transformers	Kumara Kahatapitiya et.al.	2411.02397	null
2024-11-04	How Far is Video Generation from World Model: A Physical Law Perspective	Bingyi Kang et.al.	2411.02385	null
2024-11-03	Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation	Zhenbin Wang et.al.	2411.01647	null
2024-11-02	Fast and Memory-Efficient Video Diffusion Using Streamlined Inference	Zheng Zhan et.al.	2411.01171	link
2024-11-01	GameGen-X: Interactive Open-world Game Video Generation	Haoxuan Che et.al.	2411.00769	link
2024-11-04	Fashion-VDM: Video Diffusion Model for Virtual Try-On	Johanna Karras et.al.	2411.00225	null
2024-10-31	Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning	Penghui Ruan et.al.	2410.24219	link
2024-10-31	Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts	Xiang Deng et.al.	2410.23836	null
2024-10-31	SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation	Yining Hong et.al.	2410.23277	null
2024-10-30	LumiSculpt: A Consistency Lighting Control Network for Video Generation	Yuxin Zhang et.al.	2410.22979	null
2024-10-30	HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models	Shengkai Zhang et.al.	2410.22901	link
2024-10-29	Investigating Memorization in Video Diffusion Models	Chen Chen et.al.	2410.21669	null
2024-10-28	LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior	Hanyu Wang et.al.	2410.21264	null
2024-10-28	Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient	Yintai Ma et.al.	2410.20657	null
2024-10-27	ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation	Zongyi Li et.al.	2410.20502	null
2024-10-26	MarDini: Masked Autoregressive Diffusion for Video Generation at Scale	Haozhe Liu et.al.	2410.20280	null
2024-10-26	Your Image is Secretly the Last Frame of a Pseudo Video	Wenlong Chen et.al.	2410.20158	null
2024-10-26	GiVE: Guiding Visual Encoder to Perceive Overlooked Information	Junjie Li et.al.	2410.20109	null
2024-10-26	GHIL-Glue: Hierarchical Control with Filtered Subgoal Images	Kyle B. Hatch et.al.	2410.20018	null
2024-10-25	FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality	Zhengyao Lv et.al.	2410.19355	null
2024-10-24	Framer: Interactive Frame Interpolation	Wen Wang et.al.	2410.18978	null
2024-10-24	Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances	Shilin Lu et.al.	2410.18775	link
2024-10-23	WorldSimBench: Towards Video Generation Models as World Simulators	Yiran Qin et.al.	2410.18072	null
2024-10-23	VISAGE: Video Synthesis using Action Graphs for Surgery	Yousef Yeganeh et.al.	2410.17751	null
2024-10-21	3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors	Xi Liu et.al.	2410.16266	null
2024-10-20	EVA: An Embodied World Model for Future Video Anticipation	Xiaowei Chi et.al.	2410.15461	null
2024-10-20	Allegro: Open the Black Box of Commercial-Level Video Generation Model	Yuan Zhou et.al.	2410.15458	link
2024-10-20	FrameBridge: Improving Image-to-Video Generation with Bridge Models	Yuji Wang et.al.	2410.15371	null
2024-10-27	VidPanos: Generative Panoramic Videos from Casual Panning Videos	Jingwei Ma et.al.	2410.13832	null
2024-10-17	DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control	Yujie Wei et.al.	2410.13830	null
2024-10-18	DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation	Hanbo Cheng et.al.	2410.13726	link
2024-10-17	Movie Gen: A Cast of Media Foundation Models	Adam Polyak et.al.	2410.13720	link
2024-10-21	DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation	Guosheng Zhao et.al.	2410.13571	null
2024-10-18	Fundus to Fluorescein Angiography Video Generation as a Retinal Generative Foundation Model	Weiyi Zhang et.al.	2410.13242	null
2024-10-17	AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations	Qian Tao et.al.	2410.13212	null
2024-10-16	SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation	Jaehong Yoon et.al.	2410.12761	null
2024-10-16	Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices	Zhiyuan Ma et.al.	2410.11795	null
2024-10-14	Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models	Jingzhi Bao et.al.	2410.10821	link
2024-10-14	LVD-2M: A Long-take Video Dataset with Temporally Dense Captions	Tianwei Xiong et.al.	2410.10816	link
2024-10-14	Boosting Camera Motion Control for Video Diffusion Transformers	Soon Yau Cheong et.al.	2410.10802	null
2024-10-14	Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention	Dejia Xu et.al.	2410.10774	null
2024-10-14	DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships	Zhang Wan et.al.	2410.10751	null
2024-10-16	MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting	Yue Zhang et.al.	2410.10122	link
2024-10-15	VideoAgent: Self-Improving Video Generation	Achint Soni et.al.	2410.10076	link
2024-10-11	Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities	Abhijay Ghildyal et.al.	2410.08534	null
2024-10-10	Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content	Qiuheng Wang et.al.	2410.08260	null
2024-10-10	Scaling Laws For Diffusion Transformers	Zhengyang Liang et.al.	2410.08184	null
2024-10-10	Progressive Autoregressive Video Diffusion Models	Desai Xie et.al.	2410.08151	link
2024-10-10	HARIVO: Harnessing Text-to-Image Models for Video Generation	Mingi Kwon et.al.	2410.07763	null
2024-10-10	Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation	Jiahao Cui et.al.	2410.07718	link
2024-10-10	MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion	Onkar Susladkar et.al.	2410.07659	link
2024-10-09	Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis	Bohan Zeng et.al.	2410.07155	link
2024-10-08	BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way	Jiazi Bu et.al.	2410.06241	null
2024-10-08	GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation	Chi-Lam Cheang et.al.	2410.06158	null
2024-10-08	Pyramidal Flow Matching for Efficient Video Generative Modeling	Yang Jin et.al.	2410.05954	link
2024-10-08	SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution	Qi Tang et.al.	2410.05799	link
2024-10-08	T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design	Jiachen Li et.al.	2410.05677	null
2024-10-08	ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler	Serin Yang et.al.	2410.05651	null
2024-10-08	TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation	Gihyun Kwon et.al.	2410.05591	link
2024-10-07	Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation	Fanqing Meng et.al.	2410.05363	link
2024-10-10	The Dawn of Video Generation: Preliminary Explorations with SORA-like Models	Ailing Zeng et.al.	2410.05227	null
2024-10-07	Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality	Ge Ya et.al.	2410.05203	link
2024-10-07	ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction	Hyungjin Chung et.al.	2410.04721	null
2024-10-06	Realizing Video Summarization from the Path of Language-based Semantic Understanding	Kuan-Chen Mu et.al.	2410.04511	null
2024-10-03	People are poorly equipped to detect AI-powered voice clones	Sarah Barrington et.al.	2410.03791	null
2024-10-04	Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach	Yaofang Liu et.al.	2410.03160	link
2024-10-04	ECHOPulse: ECG controlled echocardio-grams video generation	Yiwei Li et.al.	2410.03143	link
2024-10-03	Loong: Generating Minute-level Long Videos with Autoregressive Language Models	Yuqing Wang et.al.	2410.02757	null
2024-10-03	SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration	Jintao Zhang et.al.	2410.02367	link
2024-10-02	COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation	Mingzhen Sun et.al.	2410.01718	null
2024-10-02	MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation	Mingzhen Sun et.al.	2410.01594	link
2024-10-01	Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining	Jie Cheng et.al.	2410.00564	link
2024-09-30	ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning	Jian Shi et.al.	2410.00262	link
2024-09-30	Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs	Zicheng Zhang et.al.	2409.20063	null
2024-09-30	Replace Anyone in Videos	Xiang Wang et.al.	2409.19911	link
2024-09-27	PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation	Shaowei Liu et.al.	2409.18964	link
2024-09-27	Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions	Iskander Azangulov et.al.	2409.18804	null
2024-09-26	Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation	Huan Yang et.al.	2409.17674	null
2024-09-26	A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation	Masato Ishii et.al.	2409.17550	link
2024-09-25	Pose-Guided Fine-Grained Sign Language Video Generation	Tongkai Shi et.al.	2409.16709	null
2024-09-24	Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation	Homanga Bharadhwaj et.al.	2409.16283	null
2024-09-23	Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond	Hong Chen et.al.	2409.14993	null
2024-09-23	Advancing Video Quality Assessment for AIGC	Xinli Yue et.al.	2409.14888	null
2024-09-23	Video-to-Audio Generation with Fine-grained Temporal Semantics	Yuchen Hu et.al.	2409.14709	null
2024-09-22	Dormant: Defending against Pose-driven Human Image Animation	Jiachen Zhou et.al.	2409.14424	link
2024-09-27	JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation	Hadrien Reynaud et.al.	2409.14149	null
2024-09-20	JoyHallo: Digital human model for Mandarin	Sheng Shi et.al.	2409.13268	null
2024-09-19	Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation	Chenyu Wang et.al.	2409.12532	null
2024-09-19	Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework	Xinyi Ying et.al.	2409.12448	link
2024-09-17	OSV: One Step is Enough for High-Quality Image to Video Generation	Xiaofeng Mao et.al.	2409.11367	null
2024-09-19	The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives	Samee Arif et.al.	2409.11261	link
2024-09-16	Embodiment-Agnostic Action Planning via Object-Part Scene Flow	Weiliang Tang et.al.	2409.10032	null
2024-09-13	STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment	Yong Ren et.al.	2409.08601	null
2024-09-11	DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures	Steven Hogue et.al.	2409.07649	null
2024-09-11	Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models	Haibo Yang et.al.	2409.07452	link
2024-09-11	EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion	Jian Zhang et.al.	2409.07255	link
2024-09-10	SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation	Teng Hu et.al.	2409.06633	null
2024-09-10	G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer	Jinzhi Zhang et.al.	2409.06322	null
2024-09-11	MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control	Yining Yao et.al.	2409.06189	null
2024-09-12	DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation	Wei Wu et.al.	2409.05463	null
2024-09-06	Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task	Jing Wang et.al.	2409.04005	link
2024-09-06	DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes	Jianbiao Mei et.al.	2409.04003	link
2024-09-04	PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation	Jun Ling et.al.	2409.02657	null
2024-09-05	Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency	Jianwen Jiang et.al.	2409.02634	null
2024-09-03	DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos	Wenbo Hu et.al.	2409.02095	link
2024-09-05	CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention	Gaojie Lin et.al.	2409.01876	null
2024-09-03	DiVE: DiT-based Video Generation with Enhanced Control	Junpeng Jiang et.al.	2409.01595	null
2024-09-02	AMG: Avatar Motion Guided Video Generation	Zhangsihao Yang et.al.	2409.01502	link
2024-09-09	OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model	Liuhan Chen et.al.	2409.01199	link
2024-08-31	Compositional 3D-aware Video Generation with LLM Director	Hanxin Zhu et.al.	2409.00558	null
2024-08-30	CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion	Yiran Chen et.al.	2408.17424	null
2024-08-30	VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers	Juncan Deng et.al.	2408.17131	null
2024-08-29	One-Shot Learning Meets Depth Diffusion in Multi-Object Videos	Anisha Jain et.al.	2408.16704	null
2024-08-29	DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving	Yongjie Fu et.al.	2408.16647	null
2024-08-29	Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation	Xiaoyu Jin et.al.	2408.16506	null
2024-08-28	GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model	Yongjie Fu et.al.	2408.15868	null
2024-08-27	GenRec: Unifying Video Generation and Recognition with Diffusion Models	Zejia Weng et.al.	2408.15241	link
2024-08-27	Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance	Weiyi Zhang et.al.	2408.15217	link
2024-08-28	SurGen: Text-Guided Diffusion Model for Surgical Video Generation	Joseph Cho et.al.	2408.14028	null
2024-09-02	Training-free Long Video Generation with Chain of Diffusion Model Experts	Wenhao Li et.al.	2408.13423	null
2024-08-24	TVG: A Training-free Transition Video Generation Method with Diffusion Models	Rui Zhang et.al.	2408.13413	null
2024-08-23	CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities	Tao Wu et.al.	2408.13239	link
2024-08-23	EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation	Cong Wang et.al.	2408.13005	null
2024-08-22	xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations	Can Qin et.al.	2408.12590	null
2024-08-22	Real-Time Video Generation with Pyramid Attention Broadcast	Xuanlei Zhao et.al.	2408.12588	link
2024-08-21	DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework	Zhifei Xie et.al.	2408.11788	null
2024-08-21	TrackGo: A Flexible and Efficient Method for Controllable Video Generation	Haitao Zhou et.al.	2408.11475	null
2024-08-19	Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation	Liu He et.al.	2408.10453	null
2024-08-19	Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data	Tao Yang et.al.	2408.10119	null
2024-08-19	Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation	Yunxin Li et.al.	2408.09787	link
2024-08-18	SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama	Jing Tang et.al.	2408.09333	link
2024-08-21	JPEG-LM: LLMs as Image Generators with Canonical Codec Representations	Xiaochuang Han et.al.	2408.08459	null
2024-08-16	FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance	Jiasong Feng et.al.	2408.08189	null
2024-08-15	When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding	Pingping Zhang et.al.	2408.08093	null
2024-08-14	Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving	Yuqing Wen et.al.	2408.07605	null
2024-08-15	ControlNeXt: Powerful and Efficient Control for Image and Video Generation	Bohao Peng et.al.	2408.06070	link
2024-08-20	Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE	Yiying Yang et.al.	2408.05477	null
2024-08-10	High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model	Weizhi Zhong et.al.	2408.05416	null
2024-08-08	Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics	Ruining Li et.al.	2408.04631	null
2024-08-05	VidGen-1M: A Large-Scale Dataset for Text-to-video Generation	Zhiyu Tan et.al.	2408.02629	null
2024-08-01	Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion	Manuel Kansy et.al.	2408.00458	null
2024-07-31	Tora: Trajectory-oriented Diffusion Transformer for Video Generation	Zhenghao Zhang et.al.	2407.21705	link
2024-07-31	Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation	Junxuan Yu et.al.	2407.21490	null
2024-07-31	Fine-gained Zero-shot Video Sampling	Dengsheng Chen et.al.	2407.21475	null
2024-07-31	Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model	Zhichao Zhang et.al.	2407.21408	null
2024-08-04	Adding Multimodal Controls to Whole-body Human Motion Generation	Yuxuan Bian et.al.	2407.21136	link
2024-07-30	EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos	Aashish Rai et.al.	2407.20592	null
2024-07-29	FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention	Yu Lu et.al.	2407.19918	null
2024-07-29	Synthetic Thermal and RGB Videos for Automatic Pain Assessment utilizing a Vision-MLP Architecture	Stefanos Gkikas et.al.	2407.19811	null
2024-07-28	FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models	Changgu Chen et.al.	2407.19453	link
2024-07-27	Faster Image2Video Generation: A Closer Look at CLIP Image Embedding’s Impact on Spatio-Temporal Cross-Attentions	Ashkan Taghipour et.al.	2407.19205	null
2024-07-26	UniForensics: Face Forgery Detection via General Facial Representation	Ziyuan Fang et.al.	2407.19079	null
2024-07-24	SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency	Yiming Xie et.al.	2407.17470	null
2024-07-28	HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation	Zhenzhi Wang et.al.	2407.17438	link
2024-07-23	MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence	Canyu Zhao et.al.	2407.16655	null
2024-07-23	Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data	Hengyu Fu et.al.	2407.16134	null
2024-07-23	Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos	Jiahe Liu et.al.	2407.16124	link
2024-07-21	Flow as the Cross-Domain Manipulation Interface	Mengda Xu et.al.	2407.15208	null
2024-07-21	Anchored Diffusion for Video Face Reenactment	Idan Kligvasser et.al.	2407.15153	null
2024-07-19	T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation	Kaiyue Sun et.al.	2407.14505	link
2024-07-19	Unlearning Concepts from Text-to-Video Diffusion Models	Shiqi Liu et.al.	2407.14209	null
2024-07-25	Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion	Boyang Deng et.al.	2407.13759	null
2024-07-18	Multi-sentence Video Grounding for Long Video Generation	Wei Feng et.al.	2407.13219	null
2024-07-20	VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control	Sherwin Bahmani et.al.	2407.12781	null
2024-07-17	Towards Understanding Unsafe Video Generation	Yan Pang et.al.	2407.12581	link
2024-07-15	IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation	Yuanhao Zhai et.al.	2407.10937	link
2024-07-15	A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication	Jingyi Deng et.al.	2407.10575	null
2024-07-13	Learning Online Scale Transformation for Talking Head Video Generation	Fa-Ting Hong et.al.	2407.09965	null
2024-07-12	Inference Optimization of Foundation Models on AI Accelerators	Youngsuk Park et.al.	2407.09111	null
2024-07-16	Bora: Biomedical Generalist Video Generation Model	Weixiang Sun et.al.	2407.08944	null
2024-07-11	Still-Moving: Customized Video Generation without Customized Video Data	Hila Chefer et.al.	2407.08674	null
2024-07-11	A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights	Wentao Lei et.al.	2407.08428	link
2024-07-11	E2VIDiff: Perceptual Events-to-Video Reconstruction using Diffusion Priors	Jinxiu Liang et.al.	2407.08231	null
2024-07-10	VEnhancer: Generative Space-Time Enhancement for Video Generation	Jingwen He et.al.	2407.07667	null
2024-07-10	Video-to-Audio Generation with Hidden Alignment	Manjie Xu et.al.	2407.07464	null
2024-07-12	Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task	Yiran Yang et.al.	2407.06617	link
2024-07-08	MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions	Xuan Ju et.al.	2407.06358	null
2024-07-08	Dynamics of quantum turbulence in axially rotating thermal counterflow	Ritesh Dwivedi et.al.	2407.06311	link
2024-07-08	VIMI: Grounding Video Generation through Multi-modal Instruction	Yuwei Fang et.al.	2407.06304	null
2024-07-08	The Tug-of-War Between Deepfake Generation and Detection	Hannah Lee et.al.	2407.06174	null
2024-07-08	T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models	Yibo Miao et.al.	2407.05965	null
2024-07-08	This&That: Language-Gesture Controlled Video Generation for Robot Planning	Boyang Wang et.al.	2407.05530	null
2024-07-05	Unsupervised Video Summarization via Reinforcement Learning and a Trained Evaluator	Mehryar Abbasi et.al.	2407.04258	null
2024-07-03	Robot Shape and Location Retention in Video Generation Using Diffusion Models	Peng Wang et.al.	2407.02873	link
2024-07-02	OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation	Kepan Nan et.al.	2407.02371	null
2024-07-04	GVDIFF: Grounded Text-to-Video Generation with Diffusion Models	Huanzhang Dou et.al.	2407.01921	null
2024-07-01	Evaluation of Text-to-Video Generation Models: A Dynamics Perspective	Mingxiang Liao et.al.	2407.01094	link
2024-06-29	SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix	Peng Dai et.al.	2407.00367	null
2024-06-28	MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance	Yuang Zhang et.al.	2406.19680	null
2024-06-27	What Matters in Detecting AI-Generated Videos like Sora?	Chirui Chang et.al.	2406.19568	null
2024-06-26	ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation	Shenghai Yuan et.al.	2406.18522	link
2024-06-25	Text-Animator: Controllable Visual Text Video Generation	Lin Liu et.al.	2406.17777	null
2024-06-25	MotionBooth: Motion-Aware Customized Text-to-Video Generation	Jianzong Wu et.al.	2406.17758	null
2024-06-24	FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models	Haonan Qiu et.al.	2406.16863	link
2024-06-24	Dreamitate: Real-World Visuomotor Policy Learning via Video Generation	Junbang Liang et.al.	2406.16862	null
2024-06-24	Video-Infinity: Distributed Long Video Generation	Zhenxiong Tan et.al.	2406.16260	null
2024-06-23	Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation	Rafael Redondo et.al.	2406.16155	null
2024-06-22	MVOC: a training-free multiple video object composition method with diffusion models	Wei Wang et.al.	2406.15829	link
2024-06-24	VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation	Xuan He et.al.	2406.15252	null
2024-06-20	Fantastic Copyrighted Beasts and How (Not) to Generate Them	Luxi He et.al.	2406.14526	null
2024-06-20	SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset	Josef Dai et.al.	2406.14477	link
2024-06-20	Video Generation with Learned Action Prior	Meenakshi Sarkar et.al.	2406.14436	null
2024-06-20	ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning	Zhongjie Duan et.al.	2406.14130	link
2024-06-19	Splatter a Video: Video Gaussian Representation for Versatile Processing	Yang-Tian Sun et.al.	2406.13870	null
2024-06-21	GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation	Baiqi Li et.al.	2406.13743	link
2024-06-19	ARDuP: Active Region Video Diffusion for Universal Policies	Shuaiyi Huang et.al.	2406.13301	null
2024-06-19	Neural Residual Diffusion Models for Deep Scalable Vision Generation	Zhiyuan Ma et.al.	2406.13215	null
2024-06-18	Generative Artificial Intelligence-Guided User Studies: An Application for Air Taxi Services	Shengdi Xiao et.al.	2406.12296	null
2024-06-17	NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation	Niu Guanchen et.al.	2406.11259	null
2024-06-17	Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion	Rishab Parthasarathy et.al.	2406.11196	link
2024-06-16	ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models	Kaifeng Gao et.al.	2406.10981	link
2024-06-14	VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs	Rohit Bharadwaj et.al.	2406.10326	link
2024-06-14	Training-free Camera Control for Video Generation	Chen Hou et.al.	2406.10126	null
2024-06-13	Turns Out I’m Not Real: Towards Robust Detection of AI-Generated Videos	Qingyuan Liu et.al.	2406.09601	null
2024-06-13	Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs	Zijia Zhao et.al.	2406.09367	link
2024-06-12	Vivid-ZOO: Multi-View Video Generation with Diffusion Model	Bing Li et.al.	2406.08659	null
2024-06-12	TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation	Weixi Feng et.al.	2406.08656	link
2024-06-12	DiTFastAttn: Attention Compression for Diffusion Transformer Models	Zhihang Yuan et.al.	2406.08552	null
2024-06-12	Hierarchical Patch Diffusion Models for High-Resolution Video Generation	Ivan Skorokhodov et.al.	2406.07792	null
2024-06-11	HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness	Zihui Xue et.al.	2406.07754	null
2024-06-11	AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation	Kai Wang et.al.	2406.07686	null
2024-06-11	4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models	Heng Yu et.al.	2406.07472	null
2024-06-11	Visual Representation Learning with Stochastic Frame Prediction	Huiwon Jang et.al.	2406.07398	null
2024-06-09	Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion	Ge Ya Luo et.al.	2406.05630	link
2024-06-12	MotionClone: Training-Free Motion Cloning for Controllable Video Generation	Pengyang Ling et.al.	2406.05338	link
2024-06-07	CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion	Xingrui Wang et.al.	2406.05082	null
2024-06-07	Zero-Shot Video Editing through Adaptive Sliding Score Distillation	Lianghan Zhu et.al.	2406.04888	null
2024-06-07	Online Continual Learning of Video Diffusion Models From a Single Video Stream	Jason Yoo et.al.	2406.04814	null
2024-06-06	GenAI Arena: An Open Evaluation Platform for Generative Models	Dongfu Jiang et.al.	2406.04485	null
2024-06-06	ShareGPT4Video: Improving Video Understanding and Generation with Better Captions	Lin Chen et.al.	2406.04325	null
2024-06-06	SF-V: Single Forward Video Generation Model	Zhixing Zhang et.al.	2406.04324	link
2024-06-06	VideoTetris: Towards Compositional Text-to-Video Generation	Ye Tian et.al.	2406.04277	link
2024-06-05	VideoPhy: Evaluating Physical Commonsense for Video Generation	Hritik Bansal et.al.	2406.03520	null
2024-06-05	Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control	Jingyun Xue et.al.	2406.03035	null
2024-06-04	ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation	Tianchen Zhao et.al.	2406.02540	link
2024-06-04	V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation	Cong Wang et.al.	2406.02511	null
2024-06-04	CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation	Dejia Xu et.al.	2406.02509	null
2024-06-04	I4VGen: Image as Stepping Stone for Text-to-Video Generation	Xiefan Guo et.al.	2406.02230	null
2024-06-04	Learning Temporally Consistent Video Depth from Video Diffusion Priors	Jiahao Shao et.al.	2406.01493	null
2024-06-03	DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors	Tianyu Huang et.al.	2406.01476	link
2024-06-04	Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation	Enhui Ma et.al.	2406.01349	null
2024-06-03	UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation	Xiang Wang et.al.	2406.01188	null
2024-06-03	ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation	Shaoshu Yang et.al.	2406.00908	link
2024-06-02	EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing	Hadrien Reynaud et.al.	2406.00808	link
2024-05-31	4Diffusion: Multi-view Video Diffusion Model for 4D Generation	Haiyu Zhang et.al.	2405.20674	null
2024-05-30	Improving the Training of Rectified Flows	Sangyun Lee et.al.	2405.20320	link
2024-05-30	CV-VAE: A Compatible Video VAE for Latent Generative Video Models	Sijie Zhao et.al.	2405.20279	link
2024-06-02	MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model	Muyao Niu et.al.	2405.20222	link
2024-05-30	Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion	Jiangkai Wu et.al.	2405.20032	link
2024-05-30	DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark	Haoxing Chen et.al.	2405.19707	link
2024-05-29	EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture	Jiaqi Xu et.al.	2405.18991	link
2024-05-29	T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback	Jiachen Li et.al.	2405.18750	link
2024-05-28	Phased Consistency Model	Fu-Yun Wang et.al.	2405.18407	link
2024-05-28	RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives	Jaehong Yoon et.al.	2405.18406	link
2024-05-28	VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers	Jun Zheng et.al.	2405.18326	null
2024-05-28	EG4D: Explicit Generation of 4D Object without Score Distillation	Qi Sun et.al.	2405.18132	link
2024-05-28	MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling	Bowen Zhang et.al.	2405.18003	link
2024-05-28	Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation	Akio Hayakawa et.al.	2405.17842	link
2024-05-27	RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance	Jiaojiao Fan et.al.	2405.17661	null
2024-05-27	ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance	Jiannan Huang et.al.	2405.17532	link
2024-05-27	Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control	Zhengfei Kuang et.al.	2405.17414	null
2024-05-27	Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer	Ruizhi Shao et.al.	2405.17405	null
2024-05-27	Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability	Shenyuan Gao et.al.	2405.17398	link
2024-05-28	Controllable Longer Image Animation with Diffusion Models	Qiang Wang et.al.	2405.17306	null
2024-05-27	Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation	Zhoujie Fu et.al.	2405.16849	null
2024-05-27	Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels	Yikai Wang et.al.	2405.16822	null
2024-05-26	Towards Multi-Task Multi-Modal Models: A Video Generative Perspective	Lijun Yu et.al.	2405.16728	null
2024-05-28	Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation	Jinlin Liu et.al.	2405.16393	null
2024-05-25	Video Prediction Models as General Visual Encoders	James Maier et.al.	2405.16382	null
2024-05-24	Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation	Shentong Mo et.al.	2405.15881	null
2024-05-24	A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence	Ali Kashefi et.al.	2405.15406	link
2024-05-24	iVideoGPT: Interactive VideoGPTs are Scalable World Models	Jialong Wu et.al.	2405.15223	link
2024-05-23	Video Diffusion Models are Training-free Motion Interpreter and Controller	Zeqi Xiao et.al.	2405.14864	null
2024-05-24	Fisher Flow Matching for Generative Modeling over Discrete Data	Oscar Davis et.al.	2405.14664	null
2024-05-24	PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control	Yong Zhong et.al.	2405.14582	null
2024-05-23	MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes	Ruiyuan Gao et.al.	2405.14475	null
2024-05-22	ReVideo: Remake a Video with Motion and Content Control	Chong Mou et.al.	2405.13865	null
2024-05-22	MotionCraft: Physics-based Zero-Shot Video Generation	Luca Savant Aira et.al.	2405.13557	link
2024-05-22	Enhanced Creativity and Ideation through Stable Video Synthesis	Elijah Miller et.al.	2405.13357	null
2024-05-21	CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers	Andrew Marmon et.al.	2405.13195	null
2024-05-21	OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models	Zhaojian Yu et.al.	2405.12843	link
2024-05-21	DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control	Hong Chen et.al.	2405.12796	null
2024-05-19	FIFO-Diffusion: Generating Infinite Videos from Text without Training	Jihwan Kim et.al.	2405.11473	link
2024-05-17	From Sora What We Can See: A Survey of Text-to-Video Generation	Rui Sun et.al.	2405.10674	link
2024-05-15	Dance Any Beat: Blending Beats with Visuals in Dance Video Generation	Xuanchen Wang et.al.	2405.09266	null
2024-05-13	The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective	Andrew Shin et.al.	2405.08720	null
2024-05-10	OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation	Jinwei Lin et.al.	2405.06547	link
2024-05-08	Reviewing Intelligent Cinematography: AI research for camera-based video production	Adrian Azzarelli et.al.	2405.05039	null
2024-05-15	TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation	Hritik Bansal et.al.	2405.04682	link
2024-05-07	Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation	Dogucan Yaman et.al.	2405.04327	null
2024-05-07	Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models	Fan Bao et.al.	2405.04233	null
2024-05-07	Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models	Zhixuan Chu et.al.	2405.04180	link
2024-05-07	Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method	Peisong He et.al.	2405.04133	null
2024-05-06	Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond	Zheng Zhu et.al.	2405.03520	link
2024-05-06	Video Diffusion Models: A Survey	Andrew Melnik et.al.	2405.03150	link
2024-05-10	Matten: Video Generation with Mamba-Attention	Yu Gao et.al.	2405.03025	null
2024-05-02	StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation	Yupeng Zhou et.al.	2405.01434	link
2024-05-05	VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization	Yuliang Liu et.al.	2404.19652	link
2024-04-30	Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model	Wentao Lei et.al.	2404.19277	null
2024-04-29	FlexiFilm: Long Video Generation with Flexible Conditions	Yichen Ouyang et.al.	2404.18620	link
2024-04-25	Synthesizing Audio from Silent Video using Sequence to Sequence Modeling	Hugo Garrido-Lestache Belinchon et.al.	2404.17608	link
2024-04-25	TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models	Haomiao Ni et.al.	2404.16306	link
2024-04-26	Semantically consistent Video-to-Audio Generation using Multimodal Language Large Model	Gehui Chen et.al.	2404.16305	null
2024-04-24	Beyond Deepfake Images: Detecting AI-Generated Videos	Danial Samadi Vahdati et.al.	2404.15955	null
2024-05-01	MotionMaster: Training-free Camera Motion Transfer For Video Generation	Teng Hu et.al.	2404.15789	null
2024-04-23	ID-Animator: Zero-Shot Identity-Preserving Human Video Generation	Xuanhua He et.al.	2404.15275	link
2024-04-22	TAVGBench: Benchmarking Text to Audible-Video Generation	Yuxin Mao et.al.	2404.14381	link
2024-04-23	Accelerating Image Generation with Sub-path Linear Approximation Model	Chen Xu et.al.	2404.13903	null
2024-04-27	Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap	Bowen Qu et.al.	2404.13573	link
2024-04-21	Motion-aware Latent Diffusion Models for Video Frame Interpolation	Zhilin Huang et.al.	2404.13534	null
2024-04-20	Music Consistency Models	Zhengcong Fei et.al.	2404.13358	null
2024-04-19	PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation	Tianyuan Zhang et.al.	2404.13026	null
2024-04-19	ConCLVD: Controllable Chinese Landscape Video Generation via Diffusion Model	Dingming Liu et.al.	2404.12903	null
2024-04-18	On the Content Bias in Fréchet Video Distance	Songwei Ge et.al.	2404.12391	null
2024-04-18	RoboDreamer: Learning Compositional World Models for Robot Imagination	Siyuan Zhou et.al.	2404.12377	null
2024-04-18	AniClipart: Clipart Animation with Text-to-Video Priors	Ronghuan Wu et.al.	2404.12347	null
2024-04-15	Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model	Han Lin et.al.	2404.09967	null
2024-04-16	LoopAnimate: Loopable Salient Object Animation	Fanyi Wang et.al.	2404.09172	null
2024-04-13	THQA: A Perceptual Quality Assessment Database for Talking Heads	Yingjie Zhou et.al.	2404.09003	link
2024-04-16	LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field	Jiyang Li et.al.	2404.08966	link
2024-04-10	A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos	Suleyman Ozdel et.al.	2404.07351	null
2024-04-08	Action-conditioned video data improves predictability	Meenakshi Sarkar et.al.	2404.05439	null
2024-04-07	MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators	Shenghai Yuan et.al.	2404.05014	link
2024-04-07	AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment	Yuanfeng Xu et.al.	2404.04946	null
2024-04-02	CameraCtrl: Enabling Camera Control for Text-to-Video Generation	Hao He et.al.	2404.02101	link
2024-04-02	Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model	Xu He et.al.	2404.01862	link
2024-03-28	A Review of Multi-Modal Large Language and Vision Models	Kilian Carolan et.al.	2404.01322	null
2024-04-01	Evaluating Text-to-Visual Generation with Image-to-Text Generation	Zhiqiu Lin et.al.	2404.01291	link
2024-03-30	Grid Diffusion Models for Text-to-Video Generation	Taegyeong Lee et.al.	2404.00234	null
2024-03-29	Motion Inversion for Video Customization	Luozhou Wang et.al.	2403.20193	null
2024-03-28	Frame by Familiar Frame: Understanding Replication in Video Diffusion Models	Aimon Rahman et.al.	2403.19593	null
2024-03-26	Tutorial on Diffusion Models for Imaging and Vision	Stanley H. Chan et.al.	2403.18103	null
2024-03-26	TC4D: Trajectory-Conditioned Text-to-4D Generation	Sherwin Bahmani et.al.	2403.17920	null
2024-03-26	Annotated Biomedical Video Generation using Denoising Diffusion Probabilistic Models and Flow Fields	Rüveyda Yilmaz et.al.	2403.17808	link
2024-03-25	TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models	Zhongwei Zhang et.al.	2403.17005	null
2024-03-25	A Survey on Long Video Generation: Challenges, Methods, and Prospects	Chengxuan Li et.al.	2403.16407	null
2024-03-24	Opportunities and challenges in the application of large artificial intelligence models in radiology	Liangrui Pan et.al.	2403.16112	null
2024-03-23	Adaptive Super Resolution For One-Shot Talking-Head Generation	Luchuan Song et.al.	2403.15944	link
2024-03-22	Spectral Motion Alignment for Video Motion Transfer using Diffusion Models	Geon Yeong Park et.al.	2403.15249	null
2024-03-21	StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text	Roberto Henschel et.al.	2403.14773	link
2024-03-21	Explorative Inbetweening of Time and Space	Haiwen Feng et.al.	2403.14611	null
2024-03-22	AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks	Max Ku et.al.	2403.14468	link
2024-03-21	Enabling Visual Composition and Animation in Unsupervised Video Generation	Aram Davtyan et.al.	2403.14368	null
2024-03-21	StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN	Jongwoo Choi et.al.	2403.14186	link
2024-03-21	Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition	Sihyun Yu et.al.	2403.14148	null
2024-03-20	Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation	Fu-Yun Wang et.al.	2403.13745	link
2024-03-22	S2DM: Sector-Shaped Diffusion Models for Video Generation	Haoran Lang et.al.	2403.13408	null
2024-03-22	Mora: Enabling Generalist Video Generation via A Multi-Agent Framework	Zhengqing Yuan et.al.	2403.13248	link
2024-03-19	AnimateDiff-Lightning: Cross-Model Diffusion Distillation	Shanchuan Lin et.al.	2403.12706	null
2024-03-18	CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility	Bojia Zi et.al.	2403.12035	link
2024-03-18	VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model	Qi Zuo et.al.	2403.12010	null
2024-03-19	Subjective-Aligned Dateset and Metric for Text-to-Video Quality Assessment	Tengchuan Kou et.al.	2403.11956	link
2024-03-18	Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing	Juan Zhang et.al.	2403.11700	null
2024-03-17	Endora: Video Generation Models as Endoscopy Simulators	Chenxin Li et.al.	2403.11050	null
2024-03-15	DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers	Xuanlei Zhao et.al.	2403.10266	link
2024-03-15	Animate Your Motion: Turning Still Images into Dynamic Videos	Mingxiao Li et.al.	2403.10179	null
2024-03-14	Video Editing via Factorized Diffusion Distillation	Uriel Singer et.al.	2403.09334	null
2024-03-17	Intention-driven Ego-to-Exo Video Generation	Hongchen Luo et.al.	2403.09194	null
2024-03-13	VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis	Enric Corona et.al.	2403.08764	null
2024-03-13	Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts	Yue Ma et.al.	2403.08268	link
2024-03-12	AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production	Jiuniu Wang et.al.	2403.07952	null
2024-03-10	WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs	Deshun Yang et.al.	2403.07944	null
2024-03-12	SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces	Yuta Oshima et.al.	2403.07711	link
2024-03-15	DragAnything: Motion Control for Anything using Entity Representation	Weijia Wu et.al.	2403.07420	link
2024-03-11	DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation	Guosheng Zhao et.al.	2403.06845	null
2024-03-11	A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos	Weixia Zhang et.al.	2403.06421	link
2024-03-11	Video Generation with Consistency Tuning	Chaoyi Wang et.al.	2403.06356	null
2024-03-10	FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing	Youyuan Zhang et.al.	2403.06269	null
2024-03-10	BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering	Xinmin Qiu et.al.	2403.06243	null
2024-03-10	VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models	Wenhao Wang et.al.	2403.06098	link
2024-03-08	VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models	Yabo Zhang et.al.	2403.05438	link
2024-03-08	Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation	Joseph Cho et.al.	2403.05131	null
2024-03-07	A spatiotemporal style transfer algorithm for dynamic visual stimulus generation	Antonino Greco et.al.	2403.04940	null
2024-03-08	Pix2Gif: Motion-Guided Diffusion for GIF Generation	Hitesh Kandala et.al.	2403.04634	link
2024-03-05	Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation	Weijie Li et.al.	2403.02827	null
2024-03-06	UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control	Xuweiyi Chen et.al.	2403.02332	link
2024-03-05	AtomoVideo: High Fidelity Image-to-Video Generation	Litong Gong et.al.	2403.01800	null
2024-03-02	SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code	Ziniu Hu et.al.	2403.01248	null
2024-03-01	Abductive Ego-View Accident Video Understanding for Safe Driving Perception	Jianwu Fang et.al.	2403.00436	null
2024-02-29	Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers	Tsai-Shien Chen et.al.	2402.19479	null
2024-02-28	Context-aware Talking Face Video Generation	Meidai Xuanyuan et.al.	2402.18092	null
2024-02-27	EMO: Emote Portrait Alive – Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions	Linrui Tian et.al.	2402.17485	null
2024-02-27	Sora Generates Videos with Stunning Geometrical Consistency	Xuanyi Li et.al.	2402.17403	null
2024-02-28	Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models	Yixin Liu et.al.	2402.17177	link
2024-02-27	Video as the New Language for Real-World Decision Making	Sherry Yang et.al.	2402.17139	null
2024-02-22	Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis	Willi Menapace et.al.	2402.14797	null
2024-02-22	Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models	Yixuan Ren et.al.	2402.14780	null
2024-02-21	Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation	Kihong Kim et.al.	2402.13729	null
2024-02-24	UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing	Jianhong Bai et.al.	2402.13185	null
2024-02-20	Neural Network Diffusion	Kai Wang et.al.	2402.13144	link
2024-02-20	VGMShield: Mitigating Misuse of Video Generative Models	Yan Pang et.al.	2402.13126	link
2024-02-19	Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same	Sungjun Ahn et.al.	2402.12412	null
2024-02-16	Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation	Lanqing Guo et.al.	2402.10491	link
2024-02-14	Magic-Me: Identity-Specific Video Customized Diffusion	Ze Ma et.al.	2402.09368	link
2024-02-10	Denoising Diffusion Probabilistic Models in Six Simple Steps	Richard E. Turner et.al.	2402.04384	null