
Updated on 2025.06.28

Usage instructions: here

3D Segmentation

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-06-26 | SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark | Alex Costanzino et al. | 2506.21549 | null |
| 2025-06-26 | GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding | Zijun Lin et al. | 2506.21188 | null |
| 2025-06-24 | ReCoGNet: Recurrent Context-Guided Network for 3D MRI Prostate Segmentation | Ahmad Mustafa et al. | 2506.19687 | null |
| 2025-06-22 | Auto-Regressive Surface Cutting | Yang Li et al. | 2506.18017 | null |
| 2025-06-17 | I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | Yu Qi et al. | 2506.14495 | null |
| 2025-06-20 | Cross-Modal Geometric Hierarchy Fusion: An Implicit-Submap Driven Framework for Resilient 3D Place Recognition | Xiaohui Jiang et al. | 2506.14243 | link |
| 2025-06-17 | Unified Representation Space for 3D Visual Grounding | Yinuo Zheng et al. | 2506.14238 | null |
| 2025-06-09 | PIG: Physically-based Multi-Material Interaction with 3D Gaussians | Zeyu Xiao et al. | 2506.07657 | null |
| 2025-06-06 | NeurNCD: Novel Class Discovery via Implicit Neural Representation | Junming Wang et al. | 2506.06412 | null |
| 2025-06-05 | From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes | Tianxu Wang et al. | 2506.04897 | null |
| 2025-06-05 | Midplane based 3D single pass unbiased segment-to-segment contact interaction using penalty method | Indrajeet Sahu et al. | 2506.04841 | null |
| 2025-06-05 | OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model | Kunshen Zhang et al. | 2506.04837 | link |
| 2025-05-28 | Zero-Shot 3D Visual Grounding from Vision-Language Models | Rong Li et al. | 2505.22429 | null |
| 2025-05-26 | Rep3D: Re-parameterize Large 3D Kernels with Low-Rank Receptive Modeling for Medical Imaging | Ho Hin Lee et al. | 2505.19603 | null |
| 2025-05-23 | SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding | Xuerui Qiu et al. | 2505.17674 | null |
| 2025-06-03 | A Unified Multi-Scale Attention-Based Network for Automatic 3D Segmentation of Lung Parenchyma & Nodules In Thoracic CT Images | Muhammad Abdullah et al. | 2505.17602 | link |
| 2025-05-23 | From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation | Mahmoud Chick Zaouali et al. | 2505.17402 | null |
| 2025-05-18 | Attention-Enhanced U-Net for Accurate Segmentation of COVID-19 Infected Lung Regions in CT Scans | Amal Lahchim et al. | 2505.12298 | null |
| 2025-05-17 | iSegMan: Interactive Segment-and-Manipulate 3D Gaussians | Yian Zhao et al. | 2505.11934 | null |
| 2025-05-15 | MOSAIC: A Multi-View 2.5D Organ Slice Selector with Cross-Attentional Reasoning for Anatomically-Aware CT Localization in Medical Organ Segmentation | Hania Ghouse et al. | 2505.10672 | null |
| 2025-05-27 | HWA-UNETR: Hierarchical Window Aggregate UNETR for 3D Multimodal Gastric Lesion Segmentation | Jiaming Liang et al. | 2505.10464 | link |
| 2025-05-13 | Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | Zongchuang Zhao et al. | 2505.08725 | link |
| 2025-05-08 | DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | Henry Zheng et al. | 2505.04965 | null |
| 2025-05-20 | AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding | Feng Xiao et al. | 2505.04058 | link |
| 2025-05-04 | Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving | Alexey Nekrasov et al. | 2505.02148 | null |
| 2025-05-03 | Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes | Jie Liu et al. | 2505.01726 | null |
| 2025-04-30 | SAM4EM: Efficient memory-based two stage prompt-free segment anything model adapter for complex 3D neuroscience electron microscopy stacks | Uzair Shah et al. | 2504.21544 | link |
| 2025-05-04 | Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation | Victoria Yue Chen et al. | 2504.19718 | null |
| 2025-04-24 | OmniMamba4D: Spatio-temporal Mamba for longitudinal CT lesion segmentation | Justin Namuk Kim et al. | 2504.09655 | null |
| 2025-04-13 | Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding | Atharv Mahesh Mane et al. | 2504.09623 | link |
| 2025-04-11 | DSM: Building A Diverse Semantic Map for 3D Visual Grounding | Qinghongbing Xie et al. | 2504.08307 | null |
| 2025-04-09 | MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs | Jiawei Mao et al. | 2504.06897 | null |
| 2025-04-08 | InvNeRF-Seg: Fine-Tuning a Pre-Trained NeRF for 3D Object Segmentation | Jiangsan Zhao et al. | 2504.05751 | null |
| 2025-04-01 | Deconver: A Deconvolutional Network for Medical Image Segmentation | Pooya Ashtari et al. | 2504.00302 | link |
| 2025-03-30 | ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning | Zhenyang Liu et al. | 2503.23297 | null |
| 2025-03-28 | TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting | Boyang et al. | 2503.22676 | null |
| 2025-03-28 | NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving | Fuhao Li et al. | 2503.22436 | null |
| 2025-03-28 | Segment then Splat: A Unified Approach for 3D Open-Vocabulary Segmentation based on Gaussian Splatting | Yiren Lu et al. | 2503.22204 | null |
| 2025-03-26 | COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Jiaxin Zhang et al. | 2503.19443 | link |
| 2025-03-24 | DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation | Karim Abou Zeid et al. | 2503.18944 | link |
| 2025-03-24 | ZECO: ZeroFusion Guided 3D MRI Conditional Generation | Feiran Wang et al. | 2503.18246 | link |
| 2025-03-23 | PanopticSplatting: End-to-End Panoptic Gaussian Splatting | Yuxuan Xie et al. | 2503.18073 | null |
| 2025-03-19 | SPNeRF: Open Vocabulary 3D Neural Scene Segmentation with Superpoints | Weiwen Hu et al. | 2503.15712 | null |
| 2025-03-19 | Federated Continual 3D Segmentation With Single-round Communication | Can Peng et al. | 2503.15414 | null |
| 2025-03-18 | Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting | Runsong Zhu et al. | 2503.14029 | link |
| 2025-03-17 | Adaptive Transformer Attention and Multi-Scale Fusion for Spine 3D Segmentation | Yanlin Xiang et al. | 2503.12853 | null |
| 2025-03-12 | QuickDraw: Fast Visualization, Analysis and Active Learning for Medical Image Segmentation | Daniel Syomichev et al. | 2503.09885 | link |
| 2025-03-17 | WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images | Yansong Guo et al. | 2503.08407 | null |
| 2025-03-11 | nnInteractive: Redefining 3D Promptable Segmentation | Fabian Isensee et al. | 2503.08373 | link |
| 2025-03-11 | Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving | Runwei Guan et al. | 2503.08336 | null |
| 2025-03-10 | SegResMamba: An Efficient Architecture for 3D Medical Image Segmentation | Badhan Kumar Das et al. | 2503.07766 | null |
| 2025-03-07 | HexPlane Representation for 3D Semantic Scene Understanding | Zeren Chen et al. | 2503.05127 | null |
| 2025-03-03 | OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging | Yijie Tang et al. | 2503.01309 | null |
| 2025-02-27 | Open-Vocabulary Semantic Part Segmentation of 3D Human | Keito Suzuki et al. | 2502.19782 | null |
| 2025-02-27 | Deep Learning-Based Approach for Automatic 2D and 3D MRI Segmentation of Gliomas | Kiranmayee Janardhan et al. | 2502.19760 | null |
| 2025-02-27 | ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding | Qihang Peng et al. | 2502.19247 | null |
| 2025-02-26 | Subclass Classification of Gliomas Using MRI Fusion Technique | Kiranmayee Janardhan et al. | 2502.18775 | null |
| 2025-02-22 | Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field | Wenhao Hu et al. | 2502.16303 | null |
| 2025-02-20 | Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing | Yoel Levy et al. | 2502.14789 | null |
| 2025-02-19 | Pericoronary adipose tissue attenuation as a predictor of functional severity of coronary stenosis | Marta Pillitteri et al. | 2502.13649 | null |
| 2025-02-18 | Learning Wall Segmentation in 3D Vessel Trees using Sparse Annotations | Hinrich Rahlfs et al. | 2502.12801 | null |
| 2025-02-14 | Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Wenxuan Guo et al. | 2502.10392 | link |
| 2025-02-04 | Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation | Junha Lee et al. | 2502.02548 | null |
| 2025-02-20 | Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection | Boyu Mi et al. | 2502.01401 | link |
| 2025-02-01 | Vision-Language Modeling in PET/CT for Visual Grounding of Positive Findings | Zachary Huemann et al. | 2502.00528 | null |
| 2025-01-31 | Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields | Xingyu Miao et al. | 2501.19084 | link |
| 2025-01-30 | Full-Head Segmentation of MRI with Abnormal Brain Anatomy: Model and Data Release | Andrew M Birnbaum et al. | 2501.18716 | link |
| 2025-01-29 | 3DSES: an indoor Lidar point cloud segmentation dataset with real and pseudo-labels from a 3D model | Maxime Mérizette et al. | 2501.17534 | null |
| 2025-01-27 | CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation | Xiaochuan Ma et al. | 2501.16246 | null |
| 2025-01-18 | No More Sliding Window: Efficient 3D Medical Image Segmentation with Differentiable Top-k Patch Sampling | Young Seok Jeon et al. | 2501.10814 | null |
| 2025-01-16 | AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring | Xinyi Wang et al. | 2501.09428 | null |
| 2025-01-17 | Text-guided Synthetic Geometric Augmentation for Zero-shot 3D Understanding | Kohei Torimi et al. | 2501.09278 | null |
| 2025-01-12 | 3DCoMPaT200: Language-Grounded Compositional Understanding of Parts and Materials of 3D Shapes | Mahmoud Ahmed et al. | 2501.06785 | link |
| 2025-01-10 | Swin-X2S: Reconstructing 3D Shape from 2D Biplanar X-ray with Swin Transformers | Kuan Liu et al. | 2501.05961 | null |
| 2025-01-07 | Self-adaptive vision-language model for 3D segmentation of pulmonary artery and vein | Xiaotong Guo et al. | 2501.03722 | null |
| 2025-01-09 | GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models | Zhangyang Qi et al. | 2501.01428 | link |
| 2025-01-02 | ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding | Austin T. Wang et al. | 2501.01366 | null |
| 2024-12-31 | OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies | Runnan Chen et al. | 2501.00326 | null |
| 2024-12-28 | Advances in Additive Manufacturing of 3D-segmented Plastic Scintillator Detectors for Particle Tracking and Calorimetry | Umut Kose et al. | 2412.20267 | null |
| 2024-12-24 | LangSurf: Language-Embedded Surface Gaussians for 3D Scene Understanding | Hao Li et al. | 2412.17635 | null |
| 2024-12-22 | GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs | Xingrui Wang et al. | 2412.16932 | link |
| 2024-12-18 | MobiFuse: A High-Precision On-device Depth Perception System with Multi-Data Fusion | Jinrui Zhang et al. | 2412.13848 | null |
| 2024-12-14 | DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting | Luis Wiedmann et al. | 2412.10972 | link |

Reasoning Segmentation

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2025-06-12 | MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models | Yu Huang et al. | 2506.10465 | null |
| 2025-06-11 | Decoupling the Image Perception and Multimodal Reasoning for Reasoning Segmentation with Digital Twin Representations | Yizhen Li et al. | 2506.07943 | null |
| 2025-06-05 | OpenMaskDINO3D: Reasoning 3D Segmentation via Large Language Model | Kunshen Zhang et al. | 2506.04837 | link |
| 2025-06-04 | RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought | Yi Lu et al. | 2506.04277 | null |
| 2025-05-29 | PixelThink: Towards Efficient Chain-of-Pixel Reasoning | Song Wang et al. | 2505.23727 | null |
| 2025-06-15 | PARTONOMY: Large Multimodal Models with Part-Level Visual Understanding | Ansel Blume et al. | 2505.20759 | null |
| 2025-05-24 | Reasoning Segmentation for Images and Videos: A Survey | Yiqing Shen et al. | 2505.18816 | null |
| 2025-05-22 | PRS-Med: Position Reasoning Segmentation with Vision-Language Model in Medical Imaging | Quoc-Huy Trinh et al. | 2505.11872 | null |
| 2025-05-17 | RVTBench: A Benchmark for Visual Reasoning Tasks | Yiqing Shen et al. | 2505.11838 | link |
| 2025-05-05 | LISAT: Language-Instructed Segmentation Assistant for Satellite Imagery | Jerome Quenum et al. | 2505.02829 | null |
| 2025-04-17 | SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Qianqian Sun et al. | 2504.12704 | null |
| 2025-04-23 | MediSee: Reasoning-based Pixel-level Perception in Medical Images | Qinyue Tong et al. | 2504.11008 | null |
| 2025-04-15 | LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation | Hanning Chen et al. | 2504.10854 | null |
| 2025-04-01 | POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation | Lanyun Zhu et al. | 2504.00640 | null |
| 2025-03-27 | Online Reasoning Video Segmentation with Just-in-Time Digital Twins | Yiqing Shen et al. | 2503.21056 | null |
| 2025-03-26 | Operating Room Workflow Analysis via Reasoning Segmentation over Digital Twins | Yiqing Shen et al. | 2503.21054 | null |
| 2025-03-23 | MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation | Jiaxin Huang et al. | 2503.18135 | null |
| 2025-03-19 | VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation | Shoubin Yu et al. | 2503.14350 | null |
| 2025-03-18 | MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation | Donggon Jang et al. | 2503.13881 | link |
| 2025-03-13 | Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA | Zhixuan Li et al. | 2503.10225 | null |
| 2025-03-11 | TSCnet: A Text-driven Semantic-level Controllable Framework for Customized Low-Light Image Enhancement | Miao Zhang et al. | 2503.08168 | null |
| 2025-03-25 | Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts | Shiu-hong Kao et al. | 2503.07503 | null |
| 2025-03-13 | InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models | Yuchen Yan et al. | 2503.06692 | null |
| 2025-03-09 | Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Yuqi Liu et al. | 2503.06520 | link |
| 2025-03-04 | UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface | Hao Tang et al. | 2503.01342 | link |
| 2025-02-13 | Pixel-Level Reasoning Segmentation via Multi-turn Conversations | Dexian Cai et al. | 2502.09447 | link |
| 2025-01-15 | The Devil is in Temporal Token: High Quality Video Reasoning Segmentation | Sitong Gong et al. | 2501.08549 | link |
| 2024-12-19 | PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation | Muntasir Wahed et al. | 2412.15209 | null |
| 2024-12-18 | InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models | Cong Wei et al. | 2412.14006 | link |
| 2024-12-02 | HyperSeg: Towards Universal Visual Segmentation with Large Language Model | Cong Wei et al. | 2411.17606 | link |
| 2024-11-21 | Multimodal 3D Reasoning Segmentation with Complex Scenes | Xueying Jiang et al. | 2411.13927 | null |
| 2024-11-15 | Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Andong Deng et al. | 2411.09921 | null |
| 2024-10-31 | SegLLM: Multi-round Reasoning Segmentation | XuDong Wang et al. | 2410.18923 | null |
| 2024-09-29 | One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos | Zechen Bai et al. | 2409.19603 | link |
| 2024-09-20 | Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model | Li Zhou et al. | 2409.13407 | link |
| 2025-02-10 | Visual Agents as Fast and Slow Thinkers | Guangyan Sun et al. | 2408.08862 | link |

3D Generative

Publish Date Title Authors PDF Code
2025-06-26 Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image Pufan Li et.al. 2506.21152 null
2025-06-25 WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration Chaojun Ni et.al. 2506.20590 null
2025-06-23 3D Arena: An Open Platform for Generative 3D Evaluation Dylan Ebert et.al. 2506.18787 null
2025-06-23 Geometry-Aware Preference Learning for 3D Texture Generation AmirHossein Zamani et.al. 2506.18331 null
2025-06-13 VEIGAR: View-consistent Explicit Inpainting and Geometry Alignment for 3D object Removal Pham Khai Nguyen Do et.al. 2506.15821 null
2025-06-18 Nabla-R2D3: Effective and Efficient 3D Diffusion Alignment with 2D Rewards Qingming Liu et.al. 2506.15684 null
2025-06-18 Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material Team Hunyuan3D et.al. 2506.15442 link
2025-06-17 RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills Chunru Lin et.al. 2506.14763 null
2025-06-16 Disentangling 3D from Large Vision-Language Models for Controlled Portrait Generation Nick Yiwen Huang et.al. 2506.14015 null
2025-06-16 Dive3D: Diverse Distillation-based Text-to-3D Generation via Score Implicit Matching Weimin Bai et.al. 2506.13594 null
2025-06-11 DreamCS: Geometry-Aware Text-to-3D Generation with Unpaired 3D Reward Supervision Xiandong Zou et.al. 2506.09814 null
2025-06-10 Orientation Matters: Making 3D Generative Models Orientation-Aligned Yichong Lu et.al. 2506.08640 null
2025-06-09 Squeeze3D: Your 3D Generation Model is Secretly an Extreme Neural Compressor Rishit Dagli et.al. 2506.07932 null
2025-06-09 R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation William Ljungbergh et.al. 2506.07826 null
2025-06-09 NOVA3D: Normal Aligned Video Diffusion Model for Single Image to 3D Generation Yuxiao Yang et.al. 2506.07698 null
2025-06-05 PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers Yuchen Lin et.al. 2506.05573 null
2025-06-02 ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding Junliang Ye et.al. 2506.01853 link
2025-05-31 ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary Zeqi Gu et.al. 2506.00742 null
2025-05-30 LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework Xin Kang et.al. 2505.24245 null
2025-05-29 Universal Radial Scaling of Large-Scale Black Hole Accretion for Magnetically Arrested And Rocking Accretion Disks Aretaios Lalakos et.al. 2505.23888 null
2025-05-28 Advancing high-fidelity 3D and Texture Generation with 2.5D latents Xin Yang et.al. 2505.21050 null
2025-05-27 Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction Yifei Wang et.al. 2505.20755 null
2025-05-30 ART-DECO: Arbitrary Text Guidance for 3D Detailizer Construction Qimin Chen et.al. 2505.20431 null
2025-05-26 Harnessing the Power of Training-Free Techniques in Text-to-2D Generation for Text-to-3D Generation via Score Distillation Sampling Junhong Lee et.al. 2505.19868 null
2025-05-26 Global stability for the compressible isentropic magnetohydrodynamic equations in 3D bounded domains with Navier-slip boundary conditions Yang Liu et.al. 2505.19749 null
2025-05-23 SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation Dekai Zhu et.al. 2505.17721 null
2025-05-26 Direct3D-S2: Gigascale 3D Generation Made Easy with Spatial Sparse Attention Shuang Wu et.al. 2505.17412 null
2025-05-22 MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM Siwei Meng et.al. 2505.16456 null
2025-05-21 Constructing a 3D Town from a Single Image Kaizhi Zheng et.al. 2505.15765 null
2025-05-20 Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image Yuxuan Wang et.al. 2505.14537 null
2025-05-21 Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling Zhihao Li et.al. 2505.14521 null
2025-05-19 Touch2Shape: Touch-Conditioned 3D Diffusion for Shape Exploration and Reconstruction Yuanbo Wang et.al. 2505.13091 null
2025-05-15 Pharmacophore-Conditioned Diffusion Model for Ligand-Based De Novo Drug Design Amira Alakhdar et.al. 2505.10545 null
2025-05-13 Long timescale numerical simulations of large, super-critical accretion discs P. Chris Fragile et.al. 2505.08859 null
2025-05-12 Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets Weiyu Li et.al. 2505.07747 null
2025-05-11 CMD: Controllable Multiview Diffusion for 3D Editing and Progressive Generation Peng Li et.al. 2505.07003 null
2025-05-07 Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation Yiming Qin et.al. 2505.05505 link
2025-05-07 Score Distillation Sampling for Audio: Source Separation, Synthesis, and Beyond Jessie Richter-Powell et.al. 2505.04621 null
2025-05-07 Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting Feng Yang et.al. 2505.04262 null
2025-05-07 S3D: Sketch-Driven 3D Model Generation Hail Song et.al. 2505.04185 link
2025-05-06 Effects of transient stellar emissions on planetary climates of tidally-locked exo-earths Howard Chen et.al. 2505.03723 null
2025-05-03 Rethinking Score Distilling Sampling for 3D Editing and Generation Xingyu Miao et.al. 2505.01888 null
2025-04-30 3D Stylization via Large Reconstruction Model Ipek Oztas et.al. 2504.21836 null
2025-04-29 A 3D pocket-aware and affinity-guided diffusion model for lead optimization Anjie Qiao et.al. 2504.21065 null
2025-04-28 CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback Chenhan Jiang et.al. 2504.19860 null
2025-04-27 Making Physical Objects with Generative AI and Robotic Assembly: Considering Fabrication Constraints, Sustainability, Time, Functionality, and Accessibility Alexander Htet Kyaw et.al. 2504.19131 null
2025-04-25 Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation Shivam Duggal et.al. 2504.18509 null
2025-04-24 DiMeR: Disentangled Mesh Reconstruction Model Lutao Jiang et.al. 2504.17670 link
2025-04-23 Global stability for compressible isentropic Navier-Stokes equations in 3D bounded domains with Navier-slip boundary conditions Yang Liu et.al. 2504.17136 null
2025-04-22 Text-based Animatable 3D Avatars with Morphable Model Alignment Yiqian Wu et.al. 2504.15835 link
2025-04-21 Cyc3D: Fine-grained Controllable 3D Generation via Cycle Consistency Regularization Hongbin Xu et.al. 2504.14975 null
2025-04-17 HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation Wenqi Dong et.al. 2504.13072 null
2025-04-17 RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins Yao Mu et.al. 2504.13059 null
2025-04-17 SOPHY: Generating Simulation-Ready Objects with Physical Materials Junyi Cao et.al. 2504.12684 null
2025-04-16 Recent Advance in 3D Object and Scene Generation: A Survey Xiang Tang et.al. 2504.11734 null
2025-04-15 3D full-GR simulations of magnetorotational core-collapse supernovae on GPUs: A systematic study of rotation rates and magnetic fields Swapnil Shankar et.al. 2504.11537 null
2025-04-14 Art3D: Training-Free 3D Generation from Flat-Colored Illustration Xiaoyan Cong et.al. 2504.10466 null
2025-04-14 ESCT3D: Efficient and Selectively Controllable Text-Driven 3D Content Generation with Gaussian Splatting Huiqi Wu et.al. 2504.10316 null
2025-04-16 GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting Junlin Hao et.al. 2504.10001 null
2025-04-11 GeoTexBuild: 3D Building Model Generation from Map Footprints Ruizhe Wang et.al. 2504.08419 null
2025-04-11 Generative AI for Film Creation: A Survey of Recent Advances Ruihan Zhang et.al. 2504.08296 null
2025-04-10 Gen3DEval: Using vLLMs for Automatic Evaluation of Generated 3D Objects Shalini Maiti et.al. 2504.08125 null
2025-04-10 ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian Splatting Junbang Liu et.al. 2504.08100 link
2025-04-11 Objaverse++: Curated 3D Object Dataset with Quality Annotations Chendi Lin et.al. 2504.07334 link
2025-04-10 Stochastic Ray Tracing of 3D Transparent Gaussians Xin Sun et.al. 2504.06598 null
2025-04-05 Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization Yikai Wang et.al. 2504.04153 link
2025-04-04 D-Garment: Physics-Conditioned Latent Diffusion for Dynamic Garment Deformations Antoine Dumoulin et.al. 2504.03468 null
2025-04-03 Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization Kangle Deng et.al. 2504.02817 null
2025-04-03 ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation Yuan Zhou et.al. 2504.02316 link
2025-04-03 WonderTurbo: Generating Interactive 3D World in 0.72 Seconds Chaojun Ni et.al. 2504.02261 null
2025-04-02 WorldPrompter: Traversable Text-to-Scene Generation Zhaoyang Zhang et.al. 2504.02045 null
2025-04-02 3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting Hao Wu et.al. 2504.01619 null
2025-04-02 High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model Yiyang Shen et.al. 2504.01512 null
2025-04-03 Distilling Multi-view Diffusion Models into 3D Generators Hao Qin et.al. 2504.00457 null
2025-03-31 Pre-training with 3D Synthetic Data: Learning 3D Point Cloud Instance Segmentation from 3D Synthetic Scenes Daichi Otsuka et.al. 2503.24229 null
2025-03-28 DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness Ruining Li et.al. 2503.22677 null
2025-03-28 Clouds and Hazes in GJ 1214b’s Metal-Rich Atmosphere Isaac Malsky et.al. 2503.22608 null
2025-03-28 CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving Yishen Ji et.al. 2503.22231 null
2025-03-27 3DGen-Bench: Comprehensive Benchmark Suite for 3D Generative Models Yuhan Zhang et.al. 2503.21745 null
2025-03-27 Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data Zhiyuan Ma et.al. 2503.21694 link
2025-03-29 GenFusion: Closing the Loop between Reconstruction and Generation via Videos Sibo Wu et.al. 2503.21219 null
2025-03-26 FB-4D: Spatial-Temporal Coherent Dynamic 3D Content Generation with Feature Banks Jinwei Li et.al. 2503.20784 link
2025-03-27 MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation Jinnan Chen et.al. 2503.20519 null
2025-03-24 MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing Lingting Zhu et.al. 2503.18461 null
2025-03-23 Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook Xu Zheng et.al. 2503.18016 null
2025-03-20 SynCity: Training-Free Generation of 3D Worlds Paul Engstler et.al. 2503.16420 null
2025-03-26 Unleashing Vecset Diffusion Model for Fast Shape Generation Zeqiang Lai et.al. 2503.16302 link
2025-03-21 Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens Shuqi Lu et.al. 2503.16278 link
2025-03-20 Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation Tiange Xiang et.al. 2503.15877 null
2025-03-19 Shap-MeD Nicolás Laverde et.al. 2503.15562 null
2025-03-18 MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling Damian Boborzi et.al. 2503.14002 link
2025-03-17 Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images Tianhao Wu et.al. 2503.13439 null
2025-03-16 VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting Songen Gu et.al. 2503.12383 null
2025-03-15 DecompDreamer: Advancing Structured 3D Asset Generation with Multi-Object Decomposition and Gaussian Splatting Utkarsh Nath et.al. 2503.11981 null
2025-03-14 PBR3DGen: A VLM-guided Mesh Generation with High-quality PBR Texture Xiaokang Wei et.al. 2503.11368 null
2025-03-08 Text-to-3D Generation using Jensen-Shannon Score Distillation Khoi Do et.al. 2503.10660 null
2025-03-13 Hyper3D: Efficient 3D Representation via Hybrid Triplane and Octree Feature for Enhanced 3D Shape Variational Auto-Encoders Jingyu Guo et.al. 2503.10403 null
2025-03-13 RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling Itay Chachy et.al. 2503.09601 link
2025-03-11 MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention Yuhan Wang et.al. 2503.08664 link
2025-03-12 CDI3D: Cross-guided Dense-view Interpolation for 3D Reconstruction Zhiyuan Wu et.al. 2503.08005 null
2025-03-10 DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation Xiaoliang Ju et.al. 2503.06900 null
2025-03-09 A Mesh Is Worth 512 Numbers: Spectral-domain Diffusion Modeling for High-dimension Shape Generation Jiajie Fan et.al. 2503.06485 null
2025-03-08 GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation Ye Tao et.al. 2503.06136 null
2025-03-07 Decay of solutions of nonlinear Dirac equations Sebastian Herr et.al. 2503.05410 null
2025-03-06 Simulating the Real World: A Unified Survey of Multimodal Generative Models Yuqi Hu et.al. 2503.04641 link
2025-03-03 On the behavior of the Generalized Alignment Index (GALI) method for dissipative systems Henok Tenaw Moges et.al. 2503.01784 null
2025-03-03 The Interplay between Dust Dynamics and Turbulence Induced by the Vertical Shear Instability Pinghui Huang et.al. 2503.01656 null
2025-03-03 Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation Jiantao Lin et.al. 2503.01370 link
2025-03-02 DreamPrinting: Volumetric Printing Primitives for High-Fidelity 3D Printing Youjia Wang et.al. 2503.00887 null
2025-03-01 GenVDM: Generating Vector Displacement Maps From a Single Image Yuezhi Yang et.al. 2503.00605 null
2025-02-28 CADDreamer: CAD object Generation from Single-view Images Yuan Li et.al. 2502.20732 null
2025-02-27 Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting Hengyu Meng et.al. 2502.20045 null
2025-02-27 GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors An Li et.al. 2502.19896 null
2025-02-24 Evidence for Low Universal Equilibrium Black Hole Spin in Luminous Magnetically Arrested Disks Beverly Lowell et.al. 2502.17559 null
2025-02-24 RELICT: A Replica Detection Framework for Medical Image Generation Orhun Utku Aydin et.al. 2502.17360 link
2025-02-25 Evolution 6.0: Evolving Robotic Capabilities Through Generative Design Muhammad Haris Khan et.al. 2502.17034 null
2025-02-23 Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control Jinbo Yan et.al. 2502.16475 null
2025-02-21 Generative AI Framework for 3D Object Generation in Augmented Reality Majid Behravan et.al. 2502.15869 null
2025-02-28 WorldCraft: Photo-Realistic 3D World Creation and Customization via LLM Agents Xinhang Liu et.al. 2502.15601 null
2025-02-20 Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting Boying Li et.al. 2502.14931 null
2025-02-18 CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image Kaixin Yao et.al. 2502.12894 null
2025-02-18 RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation Chenxi Zheng et.al. 2502.12640 null
2025-02-18 NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation Zhiyuan Liu et.al. 2502.12638 link
2025-02-18 Not-So-Optimal Transport Flows for 3D Point Cloud Generation Ka-Hei Hui et.al. 2502.12456 null
2025-02-17 A new convection scheme for GCMs of temperate sub-Neptunes Edouard F. L. Barrier et.al. 2502.12234 null
2025-02-17 GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text Gyumin Shim et.al. 2502.11642 null
2025-02-13 X-SG $^2$ S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks Zihang Cheng et.al. 2502.10475 null
2025-02-13 Latent Radiance Fields with 3D-aware 2D Representations Chaoyi Zhou et.al. 2502.09613 null
2025-02-17 ConsistentDreamer: View-Consistent Meshes Through Balanced Multi-View Gaussian Optimization Onat Şahin et.al. 2502.09278 null
2025-02-10 Grounding Creativity in Physics: A Brief Survey of Physical Priors in AIGC Siwei Meng et.al. 2502.07007 null
2025-02-10 Transfer Your Perspective: Controllable 3D Generation from Any Viewpoint in a Driving Scene Tai-Yu Pan et.al. 2502.06682 null
2025-02-10 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Yangguang Li et.al. 2502.06608 link
2025-02-10 Relativistic Gas Accretion onto Supermassive Black Hole Binaries from Inspiral through Merger Lorenzo Ennoggi et.al. 2502.06389 null
2025-02-05 DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization Zhenglin Zhou et.al. 2502.04370 null
2025-02-04 ShapeShifter: 3D Variations Using Multiscale and Sparse Point-Voxel Diffusion Nissim Maruani et.al. 2502.02187 null
2025-01-31 TRAPPIST-1 d: Exo-Venus, Exo-Earth or Exo-Dead? M. J. Way et.al. 2502.00132 null
2025-01-29 Towards Training-Free Open-World Classification with 3D Generative Models Xinzhe Xia et.al. 2501.17547 null
2025-01-28 CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Nikolai Kalischek et.al. 2501.17162 null
2025-01-28 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Chenguo Lin et.al. 2501.16764 null
2025-01-27 BAG: Body-Aligned 3D Wearable Asset Generation Zhongjin Luo et.al. 2501.16177 null
2025-01-26 Comparative clinical evaluation of “memory-efficient” synthetic 3d generative adversarial networks (gan) head-to-head to state of art: results on computed tomography of the chest Mahshid Shiri et.al. 2501.15572 null
2025-01-22 InsTex: Indoor Scenes Stylized Texture Synthesis Yunfan Zhang et.al. 2501.13969 null
2025-01-22 Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation Akshay Krishnan et.al. 2501.13087 null
2025-01-17 Mitigating Hallucinations on Object Attributes using Multiview Images and Negative Instructions Zhijie Tan et.al. 2501.10011 null
2025-01-16 CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation Hwan Heo et.al. 2501.09433 link
2025-01-13 UnCommon Objects in 3D Xingchen Liu et.al. 2501.07574 link
2025-01-12 Synthetic Prior for Few-Shot Drivable Head Avatar Inversion Wojciech Zielonka et.al. 2501.06903 null
2025-01-09 Consistent Flow Distillation for Text-to-3D Generation Runjie Yan et.al. 2501.05445 null
2025-01-09 Zero-1-to-G: Taming Pretrained 2D Diffusion Model for Direct 3D Generation Xuyi Meng et.al. 2501.05427 null
2025-01-07 Chirpy3D: Continuous Part Latents for Creative 3D Bird Generation Kam Woh Ng et.al. 2501.04144 link
2025-01-04 Taming Feed-forward Reconstruction Models as Latent Encoders for 3D Generative Models Suttisak Wizadwongsa et.al. 2501.00651 null
2024-12-30 PERSE: Personalized 3D Generative Avatars from A Single Portrait Hyunsoo Cha et.al. 2412.21206 null
2025-01-02 Prometheus: 3D-Aware Latent Diffusion Models for Feed-Forward Text-to-3D Scene Generation Yuanbo Yang et.al. 2412.21117 null
2024-12-29 Toward Scene Graph and Layout Guided Complex 3D Scene Generation Yu-Hsiang Huang et.al. 2412.20473 null
2024-12-26 Habitability in 4-D: Predicting the Climates of Earth Analogs across Rotation and Orbital Configurations Arthur D. Adams et.al. 2412.19357 link
2024-12-29 PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models Minghao Chen et.al. 2412.18608 null
2024-12-23 ArchComplete: Autoregressive 3D Architectural Design Generation with Hierarchical Diffusion-Based Upsampling S. Rasoulzadeh et.al. 2412.17957 link
2024-12-21 GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space Souhaib Attaiki et.al. 2412.16717 null
2024-12-18 AdvIRL: Reinforcement Learning-Based Adversarial Attacks on 3D NeRF Models Tommy Nguyen et.al. 2412.16213 link
2024-12-20 GCA-3D: Towards Generalized and Consistent Domain Adaptation of 3D Generators Hengjia Li et.al. 2412.15491 null
2024-12-18 DreaMark: Rooting Watermark in Score Distillation Sampling Generated Neural Radiance Fields Xingyu Zhu et.al. 2412.15278 null
2024-12-19 DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation Wang Zhao et.al. 2412.15200 null
2024-12-19 LiftRefine: Progressively Refined View Synthesis from 3D Lifting with Volume-Triplane Representations Tung Do et.al. 2412.14464 null
2024-12-18 GraphicsDreamer: Image to 3D Generation with Physical Consistency Pei Chen et.al. 2412.14214 null
2024-12-15 Benchmarking and Learning Multi-Dimensional Quality Evaluator for Text-to-3D Generation Yujie Zhang et.al. 2412.11170 null
2024-12-17 Virtual Trial Room with Computer Vision and Machine Learning Tulashi Prasad Joshi et.al. 2412.10710 null
2024-12-13 GT23D-Bench: A Comprehensive General Text-to-3D Generation Benchmark Sitong Su et.al. 2412.09997 null
2024-12-11 DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models Kevin Miao et.al. 2412.09648 null
2024-12-19 SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing Xueting Li et.al. 2412.09545 null
2024-12-09 Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation Ruihan Gao et.al. 2412.06785 link
2024-12-09 Diverse Score Distillation Yanbo Xu et.al. 2412.06780 null
2024-12-14 You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale Baorui Ma et.al. 2412.06699 link
2024-12-09 Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy Yuxuan Xue et.al. 2412.06698 null
2024-12-08 Enhanced 3D Generation by 2D Editing Haoran Li et.al. 2412.05929 null
2024-12-07 Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation Wenqing Wang et.al. 2412.05560 null
2024-12-06 DNF: Unconditional 4D Generation with Dictionary-based Neural Fields Xinyi Zhang et.al. 2412.05161 null
2024-12-05 PaintScene4D: Consistent 4D Scene Generation from Text Prompts Vinayak Gupta et.al. 2412.04471 null
2024-12-05 Turbo3D: Ultra-fast Text-to-3D Generation Hanzhe Hu et.al. 2412.04470 null
2024-12-05 InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models Yifan Lu et.al. 2412.03934 null
2024-12-04 MV-Adapter: Multi-view Consistent Image Generation Made Easy Zehuan Huang et.al. 2412.03632 null
2024-12-04 MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation Zehuan Huang et.al. 2412.03558 null
2024-12-04 CLAS: A Machine Learning Enhanced Framework for Exploring Large 3D Design Datasets XiuYu Zhang et.al. 2412.02996 null
2024-12-03 Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation Yiftach Edelstein et.al. 2412.02631 null
2024-12-03 Continual Learning of Personalized Generative Face Models with Experience Replay Annie N. Wang et.al. 2412.02627 null
2024-12-03 HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset Zedong Chu et.al. 2412.02317 link
2024-12-03 Viewpoint Consistency in 3D Generation via Attention and CLIP Guidance Qing Zhang et.al. 2412.02287 null
2024-12-03 3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation Jinzhi Zhang et.al. 2412.02202 null
2024-12-03 CLERF: Contrastive LEaRning for Full Range Head Pose Estimation Ting-Ruen Wei et.al. 2412.02066 null
2024-12-02 World-consistent Video Diffusion with Explicit 3D Modeling Qihang Zhang et.al. 2412.01821 null
2024-12-02 Structured 3D Latents for Scalable and Versatile 3D Generation Jianfeng Xiang et.al. 2412.01506 link
2024-11-30 Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects Amir Barda et.al. 2412.00518 null
2024-11-28 3D-WAG: Hierarchical Wavelet-Guided Autoregressive Generation for High-Fidelity 3D Shapes Tejaswini Medi et.al. 2411.19037 null
2024-11-28 RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning Jiacheng Wang et.al. 2411.18866 null
2024-11-27 G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation Tianxing Chen et.al. 2411.18369 null
2024-11-27 ModeDreamer: Mode Guiding Score Distillation for Text-to-3D Generation using Reference Image Prompts Uy Dieu Tran et.al. 2411.18135 null
2024-11-26 Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation Xiang Li et.al. 2411.17763 null
2024-11-27 SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE Yongwei Chen et.al. 2411.16856 null
2024-11-27 DetailGen3D: Generative 3D Geometry Enhancement via Data-Dependent Flow Ken Deng et.al. 2411.16820 null
2024-11-25 SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis Hyojun Go et.al. 2411.16443 link
2024-11-24 Fixing the Perspective: A Critical Examination of Zero-1-to-3 Jack Yu et.al. 2411.15706 null
2024-11-26 Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction Huiwon Jang et.al. 2411.14762 null
2024-11-22 Any-to-3D Generation via Hybrid Diffusion Supervision Yijun Fan et.al. 2411.14715 null
2024-11-26 Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation Yuanhao Cai et.al. 2411.14384 null
2024-11-19 Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting Haoyu Zhao et.al. 2411.12789 null
2024-11-21 FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting Fangyu Wu et.al. 2411.12089 null
2024-11-18 sMoRe: Enhancing Object Manipulation and Organization in Mixed Reality Spaces with LLMs and Generative AI Yunhao Xing et.al. 2411.11752 null
2024-11-18 MVLight: Relightable Text-to-3D Generation via Light-conditioned Multi-View Diffusion Dongseok Shim et.al. 2411.11475 null
2024-11-18 Thickness-dependent Topological Phases and Flat Bands in Rhombohedral Multilayer Graphene H. B. Xiao et.al. 2411.11359 null
2024-11-17 Direct and Explicit 3D Generation from a Single Image Haoyu Wu et.al. 2411.10947 null
2024-11-16 ARM: Appearance Reconstruction Model for Relightable 3D Generation Xiang Feng et.al. 2411.10825 null
2024-11-14 LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models Zhengyi Wang et.al. 2411.09595 null
2024-11-16 A Survey on Vision Autoregressive Model Kai Jiang et.al. 2411.08666 null
2024-11-12 GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation Yushi Lan et.al. 2411.08033 null
2024-11-12 Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings Aditya Sanghi et.al. 2411.08017 link
2024-11-16 SAMPart3D: Segment Any Part in 3D Objects Yunhan Yang et.al. 2411.07184 link
2024-11-09 AI-Driven Stylization of 3D Environments Yuanbo Chen et.al. 2411.06067 null
2024-11-08 Autoregressive Models in Vision: A Survey Jing Xiong et.al. 2411.05902 link
2024-11-07 DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Wenqiang Sun et.al. 2411.04928 null
2024-11-05 Tencent Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation Xianghui Yang et.al. 2411.02293 null
2024-11-03 DreamPolish: Domain Score Distillation With Progressive Geometry Generation Yean Cheng et.al. 2411.01602 null
2024-10-31 Manipulating Vehicle 3D Shapes through Latent Space Editing JiangDong Miao et.al. 2410.23931 null
2024-11-01 Fast Transients from Magnetic Disks Around Non-Spinning Collapsar Black Holes Justin Bopp et.al. 2410.22401 null
2024-10-16 TV-3DG: Mastering Text-to-3D Customized Generation with Visual Prompt Jiahui Yang et.al. 2410.21299 null
2024-10-28 CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians Chongjian Ge et.al. 2410.20723 null
2024-10-30 DiffGS: Functional Gaussian Splatting Diffusion Junsheng Zhou et.al. 2410.19657 null
2024-10-24 3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation Hansheng Chen et.al. 2410.18974 link
2024-10-23 GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation Ruowei Wang et.al. 2410.17802 link
2024-10-23 Under the magnifying glass: A combined 3D model applied to cloudy warm Saturn type exoplanets around M-dwarfs Sven Kiefer et.al. 2410.17716 null
2024-10-21 MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors Honghua Chen et.al. 2410.16272 null
2024-10-22 LucidFusion: Generating 3D Gaussians with Arbitrary Unposed Images Hao He et.al. 2410.15636 null
2024-10-20 Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint Junwei Zhou et.al. 2410.15391 null
2024-10-16 DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model Jingxiang Sun et.al. 2410.12928 null
2024-10-15 Robotic Arm Platform for Multi-View Image Acquisition and 3D Reconstruction in Minimally Invasive Surgery Alexander Saikia et.al. 2410.11703 null
2024-10-15 Evolutionary Retrofitting Mathurin Videau et.al. 2410.11330 null
2024-10-13 GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation Dingdong Yang et.al. 2410.10037 null
2024-10-12 ControLRM: Fast and Controllable 3D Generation via Large Reconstruction Model Hongbin Xu et.al. 2410.09592 null
2024-10-12 Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors Hritam Basak et.al. 2410.09467 null
2024-10-11 SceneCraft: Layout-Guided 3D Scene Generation Xiuyu Yang et.al. 2410.09049 link
2024-10-11 Semantic Score Distillation Sampling for Compositional Text-to-3D Generation Ling Yang et.al. 2410.09009 link
2024-10-11 One-shot Generative Domain Adaptation in 3D GANs Ziqiang Li et.al. 2410.08824 link
2024-10-10 RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image Xiaoxue Chen et.al. 2410.08181 null
2024-10-10 SeMv-3D: Towards Semantic and Mutil-view Consistency simultaneously for General Text-to-3D Generation with Triplane Priors Xiao Cai et.al. 2410.07658 null
2024-10-09 DreamMesh4D: Video-to-4D Generation with Sparse-Controlled Gaussian-Mesh Hybrid Representation Zhiqi Li et.al. 2410.06756 null
2024-10-02 OCC-MLLM-Alpha: Empowering Multi-modal Large Language Model for the Understanding of Occluded Objects with Self-Supervised Test-Time Learning Shuxin Yang et.al. 2410.01861 null
2024-10-02 Towards Native Generative Model for 3D Head Avatar Yiyu Zhuang et.al. 2410.01226 null
2024-10-01 Extreme scale height variations and nozzle shocks in warped disks Nicholas Kaaz et.al. 2410.00961 null
2024-10-02 Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation Junlin Han et.al. 2410.00890 null
2024-09-29 Global well-posedness of the fractional dissipative system in the framework of variable Fourier–Besov spaces Gastón Vergara-Hermosilla et.al. 2410.00060 null
2024-09-30 Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images Bahri Batuhan Bilecen et.al. 2409.20530 null
2024-09-27 Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly Alexander Htet Kyaw et.al. 2409.18390 null
2024-09-26 Long-lived neutron-star remnants from asymmetric binary neutron star mergers: element formation, kilonova signals and gravitational waves Sebastiano Bernuzzi et.al. 2409.18185 null
2024-09-25 Disco4D: Disentangled 4D Human Generation and Animation from a Single Image Hui En Pang et.al. 2409.17280 null
2024-09-19 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion Zhaoxi Chen et.al. 2409.12957 link
2024-09-18 Vista3D: Unravel the 3D Darkside of a Single Image Qiuhong Shen et.al. 2409.12193 link
2024-09-17 Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion Zhenwei Wang et.al. 2409.11406 null
2024-09-16 The Spin Zone: Synchronously and Asynchronously Rotating Exoplanets Have Spectral Differences in Transmission Nicholas Scarsdale et.al. 2409.10752 null
2024-09-11 DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation Haibo Yang et.al. 2409.07454 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 Some effects of limited wall-sensor availability on flow estimation with 3D-GANs Antonio Cuéllar et.al. 2409.07348 null
2024-09-11 Detectability Simulations of a NIR Surface Biosignature on Proxima Centauri b with Future Space Observatories Connor O. Metz et.al. 2409.07289 null
2024-09-12 3DGCQA: A Quality Assessment Database for 3D AI-Generated Contents Yingjie Zhou et.al. 2409.07236 link
2024-09-10 G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer Jinzhi Zhang et.al. 2409.06322 null
2024-09-19 DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping Zeyu Cai et.al. 2409.05099 null
2024-09-04 Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models Zhibin Liu et.al. 2409.02851 link
2024-09-03 ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis Wangbo Yu et.al. 2409.02048 null
2024-08-27 OctFusion: Octree-based Diffusion Models for 3D Shape Generation Bojun Xiong et.al. 2408.14732 link
2024-08-28 PhysPart: Physically Plausible Part Completion for Interactable Objects Rundong Luo et.al. 2408.13724 null
2024-08-26 Focus on Neighbors and Know the Whole: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation Bonan Li et.al. 2408.13149 null
2024-08-23 Atlas Gaussians Diffusion for 3D Generation with Infinite Number of Points Haitao Yang et.al. 2408.13055 null
2024-08-22 Multimodal Foundational Models for Unsupervised 3D General Obstacle Detection Tamás Matuszka et.al. 2408.12322 null
2024-08-27 Pano2Room: Novel View Synthesis from a Single Indoor Panorama Guo Pu et.al. 2408.11413 link
2024-08-20 Large Point-to-Gaussian Model for Image-to-3D Generation Longfei Lu et.al. 2408.10935 null
2024-08-19 SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views Chao Xu et.al. 2408.10195 null
2024-08-15 Single-image coherent reconstruction of objects and humans Sarthak Batra et.al. 2408.08086 null
2024-08-15 MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing Chenjie Cao et.al. 2408.08000 null
2024-08-12 Efficient and Scalable Point Cloud Generation with Sparse Point-Voxel Diffusion Models Ioannis Romanelis et.al. 2408.06145 link
2024-08-12 Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation Utkarsh Nath et.al. 2408.05938 null
2024-08-09 DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow Hangyu Li et.al. 2408.05008 null
2024-08-06 An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion Xingguang Yan et.al. 2408.03178 null
2024-08-09 DreamLCM: Towards High-Quality Text-to-3D Generation via Latent Consistency Model Yiming Zhong et.al. 2408.02993 link
2024-08-05 SceneMotifCoder: Example-driven Visual Program Learning for Generating 3D Object Arrangements Hou In Ivan Tam et.al. 2408.02211 null
2024-08-02 A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness Lutao Jiang et.al. 2408.01269 null
2024-07-30 Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering Yanpeng Zhao et.al. 2407.20908 link
2024-07-28 Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle Zhenyu Tang et.al. 2407.19548 null
2024-07-25 Signatures of Low Mass Black Hole-Neutron Star Mergers Rahime Matur et.al. 2407.18045 null
2024-07-23 She’s Got Her Mother’s Hair: End-to-End Collapsar Simulations Unveil the Origin of Black Holes’ Magnetic Field Ore Gottlieb et.al. 2407.16745 null
2024-07-23 DreamDissector: Learning Disentangled Text-to-3D Generation from 2D Diffusion Priors Zizheng Yan et.al. 2407.16260 null
2024-07-19 HOTS3D: Hyper-Spherical Optimal Transport for Semantic Alignment of Text-to-3D Generation Zezeng Li et.al. 2407.14419 null
2024-07-19 PlacidDreamer: Advancing Harmony in Text-to-3D Generation Shuo Huang et.al. 2407.13976 link
2024-07-20 Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation Zongrui Li et.al. 2407.13584 link
2024-07-17 4Dynamic: Text-to-4D Generation with Hybrid Priors Yu-Jie Yuan et.al. 2407.12684 null
2024-07-17 JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation Chenhan Jiang et.al. 2407.12291 null
2024-07-16 Superintegrable families of magnetic monopoles with non-radial potential in curved background Antonella Marchesiello et.al. 2407.11709 null
2024-07-17 VividDreamer: Invariant Score Distillation For Hyper-Realistic Text-to-3D Generation Wenjie Zhuo et.al. 2407.09822 null
2024-07-08 Tailor3D: Customized 3D Assets Editing and Generation with Dual-Side Images Zhangyang Qi et.al. 2407.06191 null
2024-07-08 On a new 3D generalized Hunter-Saxton equation Sergei Sakovich et.al. 2407.05723 null
2024-07-05 Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters Benoit Baillif et.al. 2407.04424 link
2024-07-05 Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos Leonhard Sommer et.al. 2407.04384 link
2024-07-03 NEBULA: Neural Empirical Bayes Under Latent Representations for Efficient and Controllable Design of Molecular Libraries Ewa M. Nowara et.al. 2407.03428 link
2024-07-02 Meta 3D AssetGen: Text-to-Mesh Generation with High-Quality Geometry, Texture, and PBR Materials Yawar Siddiqui et.al. 2407.02445 null
2024-07-02 ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation Zhiyuan Ma et.al. 2407.02040 link
2024-07-01 fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence Francis Williams et.al. 2407.01781 null
2024-07-01 VolETA: One- and Few-shot Food Volume Estimation Ahmad AlMughrabi et.al. 2407.01717 link
2024-07-01 GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting Chenxin Li et.al. 2407.01301 null
2024-06-27 From Efficient Multimodal Models to World Models: A Survey Xinji Mai et.al. 2407.00118 null
2024-06-27 In LIGO’s Sight? Vigorous Coherent Gravitational Waves from Cooled Collapsar Disks Ore Gottlieb et.al. 2406.19452 null
2024-06-26 Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling Abril Corona-Figueroa et.al. 2406.18422 link
2024-06-25 Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text Xinyang Li et.al. 2406.17601 link
2024-06-25 Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds Hongliang Zeng et.al. 2406.17342 null
2024-07-01 Geometry-Aware Score Distillation via 3D Consistent Noising and Gradient Consistency Modeling Min-Seop Kwak et.al. 2406.16695 null
2024-06-24 YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals Sandeep Mishra et.al. 2406.16273 null
2024-06-21 GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation Chubin Zhang et.al. 2406.15333 link
2024-06-21 A3D: Does Diffusion Dream about 3D Alignment? Savva Ignatyev et.al. 2406.15020 null
2024-06-21 VividDreamer: Towards High-Fidelity and Efficient Text-to-3D Generation Zixuan Chen et.al. 2406.14964 null
2024-06-14 OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control Yuzhong Huang et.al. 2406.10000 null
2024-06-14 GradeADreamer: Enhanced Text-to-3D Generation Using Gaussian Splatting and Multi-View Diffusion Trapoom Ukarapol et.al. 2406.09850 link
2024-06-15 2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction Tianqi Chen et.al. 2406.08374 null
2024-06-12 Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata Dongsu Zhang et.al. 2406.08292 null
2024-06-12 SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models Abhay Rawat et.al. 2406.07840 null
2024-06-11 C3DAG: Controlled 3D Animal Generation using 3D pose guidance Sandeep Mishra et.al. 2406.07742 null
2024-06-11 Instant 3D Human Avatar Generation using Image Diffusion Models Nikos Kolotouros et.al. 2406.07516 null
2024-06-11 4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models Heng Yu et.al. 2406.07472 null
2024-06-11 Efficient 3D Molecular Generation with Flow Matching and Scale Optimal Transport Ross Irwin et.al. 2406.07266 null
2024-06-10 PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation Zhenyu Li et.al. 2406.06679 null
2024-06-10 GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation Haozhe Xie et.al. 2406.06526 link
2024-06-10 MVGamba: Unify 3D Content Generation as State Space Sequence Modeling Xuanyu Yi et.al. 2406.06367 link
2024-06-09 GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement Peiye Zhuang et.al. 2406.05649 null
2024-06-11 Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion Fangfu Liu et.al. 2406.04338 null
2024-06-07 DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data Qihao Liu et.al. 2406.04322 link
2024-06-07 GeoGen: Geometry-Aware Generative Modeling via Signed Distance Functions Salvatore Esposito et.al. 2406.04254 null
2024-06-05 Text-to-Image Rectified Flow as Plug-and-Play Priors Xiaofeng Yang et.al. 2406.03293 link
2024-06-05 Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion Hao Wen et.al. 2406.03184 link
2024-06-05 Adversarial Generation of Hierarchical Gaussians for 3D Generative Model Sangeek Hyun et.al. 2406.02968 link
2024-06-03 TAGMol: Target-Aware Gradient-guided Molecule Generation Vineeth Dorna et.al. 2406.01650 link
2024-06-03 Tetrahedron Splatting for 3D Generation Chun Gu et.al. 2406.01579 link
2024-06-04 Towards Practical Single-shot Motion Synthesis Konstantinos Roditakis et.al. 2406.01136 null
2024-06-02 Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models Wenqiang Sun et.al. 2406.00750 null
2024-06-04 Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors Ohad Rahamim et.al. 2406.00687 link
2024-05-31 Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation Shuzhou Yang et.al. 2405.20669 link
2024-05-30 What makes a cosmic filament? The dynamical origin and identity of filaments I. fundamentals in 2D Job Feldbrugge et.al. 2405.20475 null
2024-05-30 GECO: Generative Image-to-3D within a SECOnd Chen Wang et.al. 2405.20327 null
2024-06-05 PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting Qiaowei Miao et.al. 2405.19957 link
2024-05-28 Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication Yunuo Chen et.al. 2405.18515 null
2024-05-28 SubDLe: identification of substructures in cosmological simulations with deep learning Michela Esposito et.al. 2405.18257 null
2024-05-27 PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance Haohan Weng et.al. 2405.16890 null
2024-05-27 Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation Zhoujie Fu et.al. 2405.16849 null
2024-05-24 ExactDreamer: High-Fidelity Text-to-3D Content Creation via Exact Score Matching Yumin Zhang et.al. 2405.15914 link
2024-05-24 Score Distillation via Reparametrized DDIM Artem Lukoianov et.al. 2405.15891 link
2024-05-24 Automating the Diagnosis of Human Vision Disorders by Cross-modal 3D Generation Li Zhang et.al. 2405.15239 link
2024-05-23 CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner Weiyu Li et.al. 2405.14979 link
2024-05-23 Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer Shuang Wu et.al. 2405.14832 null
2024-05-23 MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes Ruiyuan Gao et.al. 2405.14475 null
2024-05-22 Multi-Zone Modeling of Black Hole Accretion and Feedback in 3D GRMHD: Bridging Vast Spatial and Temporal Scales Hyerin Cho et.al. 2405.13887 null
2024-05-22 Metabook: An Automatically Generated Augmented Reality Storybook Interaction System to Improve Children’s Engagement in Storytelling Yibo Wang et.al. 2405.13701 null
2024-05-18 Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching Xingyu Miao et.al. 2405.11252 link
2024-05-16 Flow Score Distillation for Diverse Text-to-3D Generation Runjie Yan et.al. 2405.10988 null
2024-05-23 Describing heat dissipation in the resistive state of three-dimensional superconductors Leonardo Rodrigues Cadorim et.al. 2405.10415 null
2024-05-16 Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion Xinyang Li et.al. 2405.09874 null
2024-05-16 The metallicity and carbon-to-oxygen ratio of the ultra-hot Jupiter WASP-76b from Gemini-S/IGRINS Megan Weiner Mansfield et.al. 2405.09769 null
2024-05-15 A Survey On Text-to-3D Contents Generation In The Wild Chenhan Jiang et.al. 2405.09431 null
2024-05-15 3D Shape Augmentation with Content-Aware Shape Resizing Mingxiang Chen et.al. 2405.09050 null
2024-05-13 DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation Ziang Cao et.al. 2405.08055 link
2024-05-13 Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning Wenqi Dong et.al. 2405.08054 null
2024-05-14 SketchDream: Sketch-based Text-to-3D Generation and Editing Feng-Lin Liu et.al. 2405.06461 null
2024-04-30 GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Kai Zhang et.al. 2404.19702 null
2024-04-30 MicroDreamer: Zero-shot 3D Generation in $\sim$20 Seconds by Score-based Iterative Reconstruction Luxi Chen et.al. 2404.19525 link
2024-04-26 Multi-view Image Prompted Multi-view Diffusion for Improved 3D Generation Seungwook Kim et.al. 2404.17419 null
2024-04-25 Interactive3D: Create What You Want by Interactive 3D Generation Shaocong Dong et.al. 2404.16510 null
2024-04-22 X-Ray: A Sequential 3D Representation for Generation Tao Hu et.al. 2404.14329 link
2024-04-18 MeshLRM: Large Reconstruction Model for High-Quality Mesh Xinyue Wei et.al. 2404.12385 null
2024-04-17 Shaping Realities: Enhancing 3D Generative AI with Fabrication Constraints Faraz Faruqi et.al. 2404.10142 null
2024-04-14 InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models Jiale Xu et.al. 2404.07191 link
2024-04-10 Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior Fan Lu et.al. 2404.06780 null
2024-04-09 Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion Fan Yang et.al. 2404.06429 link
2024-04-09 DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation Junkai Yan et.al. 2404.06119 link
2024-04-09 Hash3D: Training-free Acceleration for 3D Generation Xingyi Yang et.al. 2404.06091 link
2024-04-08 StylizedGS: Controllable Stylization for 3D Gaussian Splatting Dingxi Zhang et.al. 2404.05220 null
2024-04-11 Diffusion Time-step Curriculum for One Image to 3D Generation Xuanyu Yi et.al. 2404.04562 link
2024-04-03 Design2Cloth: 3D Cloth Generation from 2D Masks Jiali Zheng et.al. 2404.02686 null
2024-04-02 Towards Robust 3D Pose Transfer with Adversarial Learning Haoyu Chen et.al. 2404.02242 null
2024-04-02 Black Hole-Disk Interactions in Magnetically Arrested Active Galactic Nuclei: General Relativistic Magnetohydrodynamic Simulations Using A Time-Dependent, Binary Metric Sean M. Ressler et.al. 2404.02193 null
2024-04-02 Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models Zeyu Yang et.al. 2404.02148 link
2024-04-07 Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation Wangguandong Zheng et.al. 2404.01843 null
2024-04-01 FlexiDreamer: Single Image-to-3D Generation with FlexiCubes Ruowen Zhao et.al. 2404.00987 link
2024-03-29 Talk3D: High-Fidelity Talking Portrait Synthesis via Personalized 3D Generative Prior Jaehoon Ko et.al. 2403.20153 link
2024-04-05 GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling Bowen Zhang et.al. 2403.19655 null
2024-03-28 Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation Yujin Chen et.al. 2403.19319 null
2024-03-29 Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction Qiuhong Shen et.al. 2403.18795 link
2024-03-25 DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion Yuanze Lin et.al. 2403.17237 null
2024-03-25 VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation Yang Chen et.al. 2403.17001 null
2024-03-25 Exploiting Priors from 3D Diffusion Models for RGB-Based One-Shot View Planning Sicong Pan et.al. 2403.16803 link
2024-03-22 InterFusion: Text-Driven Generation of 3D Human-Object Interaction Sisi Dai et.al. 2403.15612 link
2024-03-22 LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis Kevin Xie et.al. 2403.15385 null
2024-03-22 ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars Zhenwei Wang et.al. 2403.15383 link
2024-03-22 DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow Kyungmin Lee et.al. 2403.14966 null
2024-03-22 STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians Yifei Zeng et.al. 2403.14939 null
2024-03-21 DreamReward: Text-to-3D Generation with Human Preference Junliang Ye et.al. 2403.14613 null
2024-03-20 Compress3D: a Compressed Latent Space for 3D Generation from a Single Image Bowen Zhang et.al. 2403.13524 null
2024-03-17 General Line Coordinates in 3D Joshua Martinez et.al. 2403.13014 null
2024-03-19 GVGEN: Text-to-3D Generation with Volumetric Representation Xianglong He et.al. 2403.12957 null
2024-03-19 Precise-Physics Driven Text-to-3D Generation Qingshan Xu et.al. 2403.12438 null
2024-03-19 ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance Yongwei Chen et.al. 2403.12409 null
2024-03-18 VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models Junlin Han et.al. 2403.12034 null
2024-03-19 Generic 3D Diffusion Adapter Using Controlled Multi-View Editing Hansheng Chen et.al. 2403.12032 link
2024-03-18 LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation Yushi Lan et.al. 2403.12019 link
2024-03-18 SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion Vikram Voleti et.al. 2403.12008 null
2024-03-17 BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis Lutao Jiang et.al. 2403.11273 link
2024-03-15 Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding Pengkun Liu et.al. 2403.10395 link
2024-03-19 Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting Zhiqi Li et.al. 2403.09981 link
2024-03-14 Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation Fangfu Liu et.al. 2403.09625 null
2024-03-14 Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph Donglin Di et.al. 2403.09236 link
2024-03-14 Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior Cheng Chen et.al. 2403.09140 null
2024-03-13 UniLiDAR: Bridge the domain gap among different LiDARs for continual learning Zikun Xu et.al. 2403.08512 null
2024-03-11 3D simulations of TRAPPIST-1e with varying CO2, CH4 and haze profiles Mei Ting Mak et.al. 2403.06928 null
2024-03-11 ExoCubed: A Riemann-Solver based Cubed-Sphere Dynamic Core for Planetary Atmospheres Sihe Chen et.al. 2403.06844 link
2024-03-11 V3D: Video Diffusion Models are Effective 3D Generators Zilong Chen et.al. 2403.06738 link
2024-03-11 3D-aware Image Generation and Editing with Multi-modal Conditions Bo Li et.al. 2403.06470 null
2024-03-08 CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model Zhengyi Wang et.al. 2403.05034 null
2024-03-04 3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors Fangzhou Hong et.al. 2403.02234 link
2024-03-04 TripoSR: Fast 3D Object Reconstruction from a Single Image Dmitry Tochilkin et.al. 2403.02151 link
2024-03-08 G3DR: Generative 3D Reconstruction in ImageNet Pradyumna Reddy et.al. 2403.00939 link
2024-02-28 The VOROS: Lifting ROC curves to 3D Christopher Ratigan et.al. 2402.18689 link
2024-02-27 DivAvatar: Diverse 3D Avatar Generation with a Single Prompt Weijing Tao et.al. 2402.17292 null
2024-02-22 Place Anything into Any Video Ziling Liu et.al. 2402.14316 null
2024-02-22 MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion Xin-Yang Zheng et.al. 2402.14253 null
2024-02-20 MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction Shitao Tang et.al. 2402.12712 null
2024-02-19 Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability Xuelin Qian et.al. 2402.12225 null
2024-02-13 IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation Luke Melas-Kyriazi et.al. 2402.08682 null
2024-02-11 GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting Xiaoyu Zhou et.al. 2402.07207 null
2024-02-08 AvatarMMC: 3D Head Avatar Generation and Editing with Multi-Modal Conditioning Wamiq Reyaz Para et.al. 2402.05803 null
2024-02-07 SPAD: Spatially Aware Multiview Diffusers Yash Kant et.al. 2402.05235 null
2024-02-05 Retrieval-Augmented Score Distillation for Text-to-3D Generation Junyoung Seo et.al. 2402.02972 link
2024-02-02 A Comprehensive Survey on 3D Content Generation Jian Liu et.al. 2402.01166 link

3D Gaussian Splatting

Publish Date Title Authors PDF Code
2025-06-26 MADrive: Memory-Augmented Driving Scene Modeling Polina Karpikova et.al. 2506.21520 null
2025-06-26 EndoFlow-SLAM: Real-Time Endoscopic SLAM with Flow-Constrained Gaussian Splatting Taoyu Wu et.al. 2506.21420 null
2025-06-26 Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image Pufan Li et.al. 2506.21152 null
2025-06-26 User-in-the-Loop View Sampling with Error Peaking Visualization Ayaka Yasunaga et.al. 2506.21009 null
2025-06-25 3DGH: 3D Head Generation with Composable Hair and Face Chengan He et.al. 2506.20875 null
2025-06-24 Virtual Memory for 3D Gaussian Splatting Jonathan Haberl et.al. 2506.19415 null
2025-06-23 GRAND-SLAM: Local Optimization for Globally Consistent Large-Scale Multi-Agent Gaussian SLAM Annika Thomas et.al. 2506.18885 null
2025-06-23 Reconstructing Tornadoes in 3D with Gaussian Splatting Adam Yang et.al. 2506.18677 null
2025-06-21 3D Gaussian Splatting for Fine-Detailed Surface Reconstruction in Large-Scale Scene Shihan Chen et.al. 2506.17636 null
2025-06-20 Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting Tianjiao Yu et.al. 2506.17212 null
2025-06-23 R3eVision: A Survey on Robust Rendering, Restoration, and Enhancement for 3D Low-Level Vision Weeyoung Kwon et.al. 2506.16262 link
2025-06-24 RA-NeRF: Robust Neural Radiance Field Reconstruction with Accurate Camera Pose Estimation under Complex Trajectories Qingsong Yan et.al. 2506.15242 null
2025-06-17 Peering into the Unknown: Active View Selection with Neural Uncertainty Maps for 3D Reconstruction Zhengquan Zhang et.al. 2506.14856 null
2025-06-17 3DGS-IEval-15K: A Large-scale Image Quality Evaluation Database for 3D Gaussian-Splatting Yuke Xing et.al. 2506.14642 link
2025-06-17 HRGS: Hierarchical Gaussian Splatting for Memory-Efficient High-Resolution 3D Reconstruction Changbai Li et.al. 2506.14229 null
2025-06-23 GAF: Gaussian Action Field as a Dynamic World Model for Robotic Manipulation Ying Chai et.al. 2506.14135 null
2025-06-16 GRaD-Nav++: Vision-Language Model Enabled Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics Qianzhong Chen et.al. 2506.14009 null
2025-06-16 PF-LHM: 3D Animatable Avatar Reconstruction from Pose-free Articulated Human Images Lingteng Qiu et.al. 2506.13766 null
2025-06-16 Multiview Geometric Regularization of Gaussian Splatting for Accurate Radiance Fields Jungeon Kim et.al. 2506.13508 null
2025-06-16 GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction Jinguang Tong et.al. 2506.13110 null
2025-06-15 Metropolis-Hastings Sampling for 3D Gaussian Reconstruction Hyunjin Kim et.al. 2506.12945 null
2025-06-17 Efficient multi-view training for 3D Gaussian Splatting Minhyuk Choi et.al. 2506.12727 null
2025-06-14 Perceptual-GS: Scene-adaptive Perceptual Densification for Gaussian Splatting Hongbi Zhou et.al. 2506.12400 null
2025-06-12 PointGS: Point Attention-Aware Sparse View Synthesis with Gaussian Splatting Lintao Xiang et.al. 2506.10335 null
2025-06-11 DGS-LRM: Real-Time Deformable 3D Gaussian Reconstruction From Monocular Videos Chieh Hubert Lin et.al. 2506.09997 null
2025-06-11 Gaussian Herding across Pens: An Optimal Transport Perspective on Global Gaussian Reduction for 3DGS Tao Wang et.al. 2506.09534 null
2025-06-11 HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene Jianing Chen et.al. 2506.09518 null
2025-06-11 TinySplat: Feedforward Approach for Generating Compact 3D Scene Representation Zetian Song et.al. 2506.09479 null
2025-06-12 ODG: Occupancy Prediction Using Dual Gaussians Yunxiao Shi et.al. 2506.09417 null
2025-06-10 StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams Zike Wu et.al. 2506.08862 link
2025-06-11 Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting Keyi Liu et.al. 2506.08777 null
2025-06-10 SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting Mengjiao Ma et.al. 2506.08710 null
2025-06-10 Complex-Valued Holographic Radiance Fields Yicheng Zhan et.al. 2506.08350 null
2025-06-09 Speedy Deformable 3D Gaussian Splatting: Fast Rendering and Compression of Dynamic Scenes Allen Tu et.al. 2506.07917 link
2025-06-09 GaussianVAE: Adaptive Learning Dynamics of 3D Gaussians for High-Fidelity Super-Resolution Shuja Khalid et.al. 2506.07897 null
2025-06-09 R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation William Ljungbergh et.al. 2506.07826 null
2025-06-09 OpenSplat3D: Open-Vocabulary 3D Instance Segmentation using Gaussian Splatting Jens Piekenbrinck et.al. 2506.07697 null
2025-06-09 ProSplat: Improved Feed-Forward 3D Gaussian Splatting for Wide-Baseline Sparse Views Xiaohan Lu et.al. 2506.07670 null
2025-06-09 PIG: Physically-based Multi-Material Interaction with 3D Gaussians Zeyu Xiao et.al. 2506.07657 null
2025-06-09 Hierarchical Scoring with 3D Gaussian Splatting for Instance Image-Goal Navigation Yijie Deng et.al. 2506.07338 null
2025-06-08 Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization Zhican Wang et.al. 2506.07069 null
2025-06-08 Hybrid Mesh-Gaussian Representation for Efficient Indoor Scene Reconstruction Binxiao Huang et.al. 2506.06988 null
2025-06-07 Gaussian Mapping for Evolving Scenes Vladimir Yugay et.al. 2506.06909 null
2025-06-06 Dy3DGS-SLAM: Monocular 3D Gaussian Splatting SLAM for Dynamic Environments Mingrui Li et.al. 2506.05965 null
2025-06-06 SurGSplat: Progressive Geometry-Constrained Gaussian Splatting for Surgical Scene Reconstruction Yuchao Zheng et.al. 2506.05935 null
2025-06-06 Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy Yu Feng et.al. 2506.05682 null
2025-06-05 VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction Ziyue Zhu et.al. 2506.05563 null
2025-06-05 On-the-fly Reconstruction for Large-Scale Novel View Synthesis from Unposed Images Andreas Meuleman et.al. 2506.05558 null
2025-06-05 ODE-GS: Latent ODEs for Dynamic Scene Extrapolation with 3D Gaussian Splatting Daniel Wang et.al. 2506.05480 null
2025-06-05 Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting Duochao Shi et.al. 2506.05327 null
2025-06-05 Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training Aneesh Deogan et.al. 2506.05092 null
2025-06-05 Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting Alfred T. Christiansen et.al. 2506.05009 null
2025-06-05 Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer Filip Slezak et.al. 2506.04908 null
2025-06-05 Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations Gaia Di Lorenzo et.al. 2506.04789 null
2025-06-04 Pseudo-Simulation for Autonomous Driving Wei Cao et.al. 2506.04218 link
2025-06-04 FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting Hengyu Liu et.al. 2506.04174 null
2025-06-04 Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data Ben Moran et.al. 2506.04120 null
2025-06-04 SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting Shengjie Lin et.al. 2506.03594 link
2025-06-04 Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting Chengqi Li et.al. 2506.03538 null
2025-06-03 Multi-Spectral Gaussian Splatting with Neural Color Representation Lukas Meyer et.al. 2506.03407 null
2025-06-03 Large Processor Chip Model Kaiyan Chang et.al. 2506.02929 null
2025-06-04 Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone Zheng Liu et.al. 2506.02774 null
2025-06-03 RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS Chuanyu Fu et.al. 2506.02751 null
2025-06-03 EyeNavGS: A 6-DoF Navigation Dataset and Record-n-Replay Software for Real-World 3DGS Scenes in VR Zihao Ding et.al. 2506.02380 link
2025-06-02 GSCodec Studio: A Modular Framework for Gaussian Splat Compression Sicheng Li et.al. 2506.01822 link
2025-06-02 WorldExplorer: Towards Generating Fully Navigable 3D Scenes Manuel-Andreas Schneider et.al. 2506.01799 null
2025-06-01 Globally Consistent RGB-D SLAM with 2D Gaussian Splatting Xingguang Zhong et.al. 2506.00970 link
2025-05-30 3D Gaussian Splat Vulnerabilities Matthew Hull et.al. 2506.00280 link
2025-05-30 Adaptive Voxelization for Transform coding of 3D Gaussian splatting data Chenjunjie Wang et.al. 2506.00271 null
2025-05-30 Understanding while Exploring: Semantics-driven Active Mapping Liyan Chen et.al. 2506.00225 null
2025-05-30 AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion Yangyi Huang et.al. 2505.24877 null
2025-05-30 TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores Zimu Liao et.al. 2505.24796 link
2025-05-30 Tackling View-Dependent Semantics in 3D Language Gaussian Splatting Jiazhong Cen et.al. 2505.24746 link
2025-05-30 LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework Xin Kang et.al. 2505.24245 null
2025-05-29 3DGEER: Exact and Efficient Volumetric Rendering with 3D Gaussians Zixun Huang et.al. 2505.24053 link
2025-05-30 ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS Weijie Wang et.al. 2505.23734 link
2025-05-29 AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views Lihan Jiang et.al. 2505.23716 null
2025-05-29 Mobi-$\pi$: Mobilizing Your Robot Learning Policy Jingyun Yang et.al. 2505.23692 null
2025-05-29 Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting Chuandong Liu et.al. 2505.23280 link
2025-05-29 LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering Jonas Kulhanek et.al. 2505.23158 null
2025-05-29 Pose-free 3D Gaussian splatting via shape-ray estimation Youngju Na et.al. 2505.22978 null
2025-05-28 3DGS Compression with Sparsity-guided Hierarchical Transform Coding Hao Xu et.al. 2505.22908 null
2025-05-28 STDR: Spatio-Temporal Decoupling for Real-Time Dynamic Scene Rendering Zehao Li et.al. 2505.22400 null
2025-05-28 UP-SLAM: Adaptively Structured Gaussian SLAM with Uncertainty Prediction in Dynamic Environments Wancai Zheng et.al. 2505.22335 null
2025-05-28 Learning Fine-Grained Geometry for Sparse-View Splatting via Cascade Depth Loss Wenjun Lu et.al. 2505.22279 null
2025-05-28 Hyperspectral Gaussian Splatting Sunil Kumar Narayanan et.al. 2505.21890 null
2025-05-27 Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility Yidi Li et.al. 2505.21377 link
2025-05-27 Structure from Collision Takuhiro Kaneko et.al. 2505.21335 null
2025-05-29 3D-UIR: 3D Gaussian for Underwater 3D Scene Reconstruction via Physics Based Appearance-Medium Decoupling Jieyu Yuan et.al. 2505.21238 null
2025-05-28 CityGo: Lightweight Urban Modeling and Rendering with Proxy Buildings and Residual Gaussians Weihang Liu et.al. 2505.21041 null
2025-05-27 Intern-GS: Vision Model Guided Sparse-View 3D Gaussian Splatting Xiangyu Sun et.al. 2505.20729 null
2025-05-27 Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting Zechen Li et.al. 2505.20714 link
2025-05-26 ParticleGS: Particle-Based Dynamics Modeling of 3D Gaussians for Prior-free Motion Extrapolation Jinsheng Quan et.al. 2505.20270 link
2025-05-26 OB3D: A New Dataset for Benchmarking Omnidirectional 3D Reconstruction Using Blender Shintaro Ito et.al. 2505.20126 link
2025-05-26 K-Buffers: A Plug-in Method for Enhancing Neural Fields with Multiple Buffers Haofan Ren et.al. 2505.19564 link
2025-05-25 Improving Novel view synthesis of 360$^\circ$ Scenes in Extremely Sparse Views by Jointly Training Hemisphere Sampled Synthetic Images Guangan Chen et.al. 2505.19264 link
2025-05-25 Triangle Splatting for Real-Time Radiance Field Rendering Jan Held et.al. 2505.19175 null
2025-05-25 FHGS: Feature-Homogenized Gaussian Splatting Q. G. Duan et.al. 2505.19154 null
2025-05-25 Veta-GS: View-dependent deformable 3D Gaussian Splatting for thermal infrared Novel-view Synthesis Myeongseok Nam et.al. 2505.19138 null
2025-05-25 VPGS-SLAM: Voxel-based Progressive 3D Gaussian SLAM in Large-Scale Scenes Tianchen Deng et.al. 2505.18992 link
2025-05-24 Efficient Differentiable Hardware Rasterization for 3D Gaussian Splatting Yitian Yuan et.al. 2505.18764 null
2025-05-24 SuperGS: Consistent and Detailed 3D Super-Resolution Scene Reconstruction via Gaussian Splatting Shiyun Xie et.al. 2505.18649 null
2025-05-23 Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance Jack Goffinet et.al. 2505.18342 null
2025-05-23 CGS-GAN: 3D Consistent Gaussian Splatting GANs for High Resolution Human Head Synthesis Florian Barthel et.al. 2505.17590 null
2025-05-23 From Flight to Insight: Semantic 3D Reconstruction for Aerial Inspection via Gaussian Splatting and Language-Guided Segmentation Mahmoud Chick Zaouali et.al. 2505.17402 null
2025-05-22 Motion Matters: Compact Gaussian Streaming for Free-Viewpoint Video Reconstruction Jiacong Chen et.al. 2505.16533 null
2025-05-21 RUSplatting: Robust 3D Gaussian Splatting for Sparse-View Underwater Scene Reconstruction Zhuodong Jiang et.al. 2505.15737 null
2025-05-21 PlantDreamer: Achieving Realistic 3D Plant Models with Diffusion-Guided Gaussian Splatting Zane K J Hartley et.al. 2505.15528 null
2025-05-21 GS2E: Gaussian Splatting is an Effective Data Generator for Event Stream Generation Yuchen Li et.al. 2505.15287 null
2025-05-21 MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models Yifan Liu et.al. 2505.15185 link
2025-05-20 Scan, Materialize, Simulate: A Generalizable Framework for Physically Grounded Robot Planning Amine Elhafsi et.al. 2505.14938 null
2025-05-20 Personalize Your Gaussian: Consistent 3D Scene Personalization from a Single Image Yuxuan Wang et.al. 2505.14537 null
2025-05-20 MGStream: Motion-aware 3D Gaussian for Streamable Dynamic Scene Reconstruction Zhenyu Bao et.al. 2505.13839 link
2025-05-19 3D Gaussian Adaptive Reconstruction for Fourier Light-Field Microscopy Chenyu Xu et.al. 2505.12875 null
2025-05-19 TACOcc: Target-Adaptive Cross-Modal Fusion with Volume Rendering for 3D Semantic Occupancy Luyao Lei et.al. 2505.12693 null
2025-05-18 Is Semantic SLAM Ready for Embedded Systems? A Comparative Survey Calvin Galagain et.al. 2505.12384 null
2025-05-17 GTR: Gaussian Splatting Tracking and Reconstruction of Unknown Objects Based on Appearance and Geometric Complexity Takuya Ikeda et.al. 2505.11905 null
2025-05-16 GrowSplat: Constructing Temporal Digital Twins of Plants with Gaussian Splats Simeon Adebola et.al. 2505.10923 null
2025-05-16 EA-3DGS: Efficient and Adaptive 3D Gaussians with Highly Enhanced Quality for outdoor scenes Jianlin Guo et.al. 2505.10787 link
2025-05-14 ExploreGS: a vision-based low overhead framework for 3D scene reconstruction Yunji Feng et.al. 2505.10578 null
2025-05-15 Consistent Quantity-Quality Control across Scenes for Deployment-Aware Gaussian Splatting Fengdi Zhang et.al. 2505.10473 link
2025-05-15 VRSplat: Fast and Robust Gaussian Splatting for Virtual Reality Xuechang Tu et.al. 2505.10144 link
2025-05-15 Advances in Radiance Field for Dynamic Scene: From Neural Field to Gaussian Field Jinlong Fan et.al. 2505.10049 link
2025-05-15 Large-Scale Gaussian Splatting SLAM Zhe Xin et.al. 2505.09915 null
2025-05-14 Real2Render2Real: Scaling Robot Data Without Dynamics Simulation or Robot Hardware Justin Yu et.al. 2505.09601 null
2025-05-13 DLO-Splatting: Tracking Deformable Linear Objects Using 3D Gaussian Splatting Holly Dinkel et.al. 2505.08644 null
2025-05-13 FOCI: Trajectory Optimization on Gaussian Splats Mario Gomez Andreu et.al. 2505.08510 null
2025-05-13 A Survey of 3D Reconstruction with Event Cameras: From Event-based Geometry to Neural 3D Rendering Chuanzhi Xu et.al. 2505.08438 null
2025-05-10 Virtualized 3D Gaussians: Flexible Cluster-based Level-of-Detail System for Real-Time Rendering of Composed Scenes Xijie Yang et.al. 2505.06523 null
2025-05-08 TeGA: Texture Space Gaussian Avatars for High-Resolution Dynamic Head Modeling Gengyan Li et.al. 2505.05672 null
2025-05-08 Steepest Descent Density Control for Compact 3D Gaussian Splatting Peihao Wang et.al. 2505.05587 null
2025-05-08 SVAD: From Single Image to 3D Avatar via Synthetic Data Generation with Video Diffusion and Data Augmentation Yonwoo Choi et.al. 2505.05475 link
2025-05-08 Time of the Flight of the Gaussians: Optimizing Depth Indirectly in Dynamic Radiance Fields Runfeng Li et.al. 2505.05356 null
2025-05-07 SGCR: Spherical Gaussians for Efficient 3D Curve Reconstruction Xinran Yang et.al. 2505.04668 link
2025-05-07 Bridging Geometry-Coherent Text-to-3D Generation with Multi-View Diffusion Priors and Gaussian Splatting Feng Yang et.al. 2505.04262 null
2025-05-06 3D Gaussian Splatting Data Compression with Mixture of Priors Lei Liu et.al. 2505.03310 null
2025-05-04 SparSplat: Fast Multi-View Reconstruction with Generalizable 2D Gaussian Splatting Shubhendu Jena et.al. 2505.02175 null
2025-05-04 GarmentGS: Point-Cloud Guided Gaussian Splatting for High-Fidelity Non-Watertight 3D Garment Reconstruction Zhihao Tang et.al. 2505.02126 null
2025-05-03 HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder Qi Yang et.al. 2505.01938 link
2025-05-03 GenSync: A Generalized Talking Head Framework for Audio-driven Multi-Subject Lip-Sync using 3D Gaussian Splatting Anushka Agarwal et.al. 2505.01928 null
2025-05-03 Visual enhancement and 3D representation for underwater scenes: a review Guoxi Huang et.al. 2505.01869 null
2025-05-03 AquaGS: Fast Underwater Scene Reconstruction with SfM-Free Gaussian Splatting Junhao Shi et.al. 2505.01799 null
2025-05-02 FalconWing: An Open-Source Platform for Ultra-Light Fixed-Wing Aircraft Research Yan Miao et.al. 2505.01383 null
2025-05-02 Compensating Spatiotemporally Inconsistent Observations for Online Dynamic 3D Gaussian Splatting Youngsik Yun et.al. 2505.01235 null
2025-04-30 A Survey on 3D Reconstruction Techniques in Plant Phenotyping: From Classical Methods to Neural Radiance Fields (NeRF), 3D Gaussian Splatting (3DGS), and Beyond Jiajia Li et.al. 2505.00737 link
2025-04-29 GauSS-MI: Gaussian Splatting Shannon Mutual Information for Active 3D Reconstruction Yuhan Xie et.al. 2504.21067 link
2025-04-29 GaussTrap: Stealthy Poisoning Attacks on 3D Gaussian Splatting for Targeted Scene Confusion Jiaxin Hong et.al. 2504.20829 null
2025-04-29 EfficientHuman: Efficient Training and Reconstruction of Moving Human using Articulated 2D Gaussian Hao Tian et.al. 2504.20607 null
2025-04-29 Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting Hanxi Liu et.al. 2504.20403 null
2025-05-01 GSFeatLoc: Visual Localization Using Feature Correspondence on 3D Gaussian Splatting Jongwon Lee et.al. 2504.20379 null
2025-04-28 Mesh-Learner: Texturing Mesh with Spherical Harmonics Yunfei Wan et.al. 2504.19938 link
2025-04-28 CE-NPBG: Connectivity Enhanced Neural Point-Based Graphics for Novel View Synthesis in Autonomous Driving Scenes Mohammad Altillawi et.al. 2504.19557 null
2025-04-28 GSFF-SLAM: 3D Semantic Gaussian Splatting SLAM via Feature Field Zuxing Lu et.al. 2504.19409 null
2025-04-30 4DGS-CC: A Contextual Coding Framework for 4D Gaussian Splatting Data Compression Zicong Chen et.al. 2504.18925 null
2025-05-01 TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians Letian Huang et.al. 2504.18768 null
2025-04-28 RGS-DR: Reflective Gaussian Surfels with Deferred Rendering for Shiny Objects Georgios Kouros et.al. 2504.18468 null
2025-04-25 PerfCam: Digital Twinning for Production Lines Using 3D Gaussian Splatting and Vision Models Michel Gokan Khan et.al. 2504.18165 link
2025-04-24 iVR-GS: Inverse Volume Rendering for Explorable Visualization via Editable 3D Gaussian Splatting Kaiyuan Tang et.al. 2504.17954 link
2025-04-23 Visibility-Uncertainty-guided 3D Gaussian Inpainting via Scene Conceptional Learning Mingxuan Cui et.al. 2504.17815 link
2025-04-24 CasualHDRSplat: Robust High Dynamic Range 3D Gaussian Splatting from Casually Captured Videos Shucheng Gong et.al. 2504.17728 link
2025-04-23 HUG: Hierarchical Urban Gaussian Splatting with Block-Based Reconstruction Zhongtao Wang et.al. 2504.16606 null
2025-04-23 ToF-Splatting: Dense SLAM using Sparse Time-of-Flight Depth and Multi-Frame Integration Andrea Conti et.al. 2504.16545 null
2025-04-21 StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians Cailin Zhuang et.al. 2504.15281 null
2025-04-21 MoBGS: Motion Deblurring Dynamic 3D Gaussian Splatting for Blurry Monocular Video Minh-Quan Viet Bui et.al. 2504.15122 null
2025-04-20 NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation Junyuan Fang et.al. 2504.14638 null
2025-04-20 VGNC: Reducing the Overfitting of Sparse-view 3DGS via Validation-guided Gaussian Number Control Lifeng Lin et.al. 2504.14548 null
2025-04-20 Metamon-GS: Enhancing Representability with Variance-Guided Densification and Light Encoding Junyan Su et.al. 2504.14460 null
2025-04-23 SEGA: Drivable 3D Gaussian Head Avatar from a Single Image Chen Guo et.al. 2504.14373 null
2025-04-18 EG-Gaussian: Epipolar Geometry and Graph Network Enhanced 3D Gaussian Splatting Beizhen Zhao et.al. 2504.13540 null
2025-04-17 Volume Encoding Gaussians: Transfer Function-Agnostic 3D Gaussians for Volume Rendering Landon Dyken et.al. 2504.13339 null
2025-04-17 Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation Sizhe Yang et.al. 2504.13175 null
2025-04-18 ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos Zetong Zhang et.al. 2504.13167 null
2025-04-17 Digital Twin Generation from Visual Data: A Survey Andrew Melnik et.al. 2504.13159 link
2025-04-17 Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs Shaohui Dai et.al. 2504.13153 link
2025-04-17 GSAC: Leveraging Gaussian Splatting for Photorealistic Avatar Creation with Unity Integration Rendong Zhang et.al. 2504.12999 link
2025-04-17 Second-order Optimization of Gaussian Splats with Importance Sampling Hamza Pehlivan et.al. 2504.12905 null
2025-04-17 AAA-Gaussians: Anti-Aliased and Artifact-Free 3D Gaussian Rendering Michael Steiner et.al. 2504.12811 null
2025-04-17 CAGE-GS: High-fidelity Cage Based 3D Gaussian Splatting Deformation Yifei Tong et.al. 2504.12800 null
2025-04-17 TSGS: Improving Gaussian Splatting for Transparent Surface Reconstruction via Normal and De-lighting Priors Mingwei Li et.al. 2504.12799 null
2025-04-17 ARAP-GS: Drag-driven As-Rigid-As-Possible 3D Gaussian Splatting Editing with Diffusion Prior Xiao Han et.al. 2504.12788 null
2025-04-16 CAGS: Open-Vocabulary 3D Scene Understanding with Context-Aware Gaussian Splatting Wei Sun et.al. 2504.11893 null
2025-04-16 3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians Zeming Wei et.al. 2504.11218 link
2025-04-15 3D Gabor Splatting: Reconstruction of High-frequency Surface Texture using Gabor Noise Haato Watanabe et.al. 2504.11003 null
2025-04-15 LL-Gaussian: Low-Light Scene Reconstruction and Enhancement via Gaussian Splatting for Novel View Synthesis Hao Sun et.al. 2504.10331 null
2025-04-14 EBAD-Gaussian: Event-driven Bundle Adjusted Deblur Gaussian Splatting Yufei Deng et.al. 2504.10012 null
2025-04-16 GaussVideoDreamer: 3D Scene Generation with Video Diffusion and Inconsistency-Aware Gaussian Splatting Junlin Hao et.al. 2504.10001 null
2025-04-13 DropoutGS: Dropping Out Gaussians for Better Sparse-view Rendering Yexing Xu et.al. 2504.09491 null
2025-04-12 A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds Jizong Peng et.al. 2504.09129 null
2025-04-12 BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting Jeongwan On et.al. 2504.09097 null
2025-04-12 You Need a Transition Plane: Bridging Continuous Panoramic 3D Reconstruction with Perspective Gaussian Splatting Zhijie Shen et.al. 2504.09062 null
2025-04-15 BlockGaussian: Efficient Large-Scale Scene Novel View Synthesis via Adaptive Block-Based Gaussian Splatting Yongchang Wu et.al. 2504.09048 link
2025-04-11 FMLGS: Fast Multilevel Language Embedded Gaussians for Part-level Interactive Agents Xin Tan et.al. 2504.08581 null
2025-04-10 InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians Kefan Chen et.al. 2504.07949 null
2025-04-10 View-Dependent Uncertainty Estimation of 3D Gaussian Splatting Chenyu Han et.al. 2504.07370 null
2025-04-09 Wheat3DGS: In-field 3D Reconstruction, Instance Segmentation and Phenotyping of Wheat Heads with Gaussian Splatting Daiwei Zhang et.al. 2504.06978 null
2025-04-09 IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments Can Zhang et.al. 2504.06827 null
2025-04-09 SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering Hanxiao Sun et.al. 2504.06815 link
2025-04-10 Stochastic Ray Tracing of 3D Transparent Gaussians Xin Sun et.al. 2504.06598 null
2025-04-08 Micro-splatting: Maximizing Isotropic Constraints for Refined Optimization in 3D Gaussian Splatting Jee Won Lee et.al. 2504.05740 null
2025-04-07 View-Dependent Deformation Fields for 2D Editing of 3D Models Martin El Mqirmi et.al. 2504.05544 null
2025-04-07 L3GS: Layered 3D Gaussian Splats for Efficient 3D Scene Delivery Yi-Zhen Tsai et.al. 2504.05517 link
2025-04-07 Let it Snow! Animating Static Gaussian Scenes With Dynamic Weather Effects Gal Fiebelman et.al. 2504.05296 null
2025-04-07 PanoDreamer: Consistent Text to 360-Degree Scene Generation Zhexiao Xiong et.al. 2504.05152 null
2025-04-07 Embracing Dynamics: Dynamics-aware 4D Gaussian Splatting SLAM Zhicong Sun et.al. 2504.04844 link
2025-04-07 DeclutterNeRF: Generative-Free 3D Scene Recovery for Occlusion Removal Wanzhou Liu et.al. 2504.04679 null
2025-04-05 3R-GS: Best Practice in Optimizing Camera Poses Along with 3DGS Zhisheng Huang et.al. 2504.04294 null
2025-04-05 Interpretable Single-View 3D Gaussian Splatting using Unsupervised Hierarchical Disentangled Representation Learning Yuyang Zhang et.al. 2504.04190 null
2025-04-04 HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration Boyuan Wang et.al. 2504.03536 null
2025-04-03 Compressing 3D Gaussian Splatting by Noise-Substituted Vector Quantization Haishan Wang et.al. 2504.03059 link
2025-04-03 MonoGS++: Fast and Accurate Monocular RGB Gaussian SLAM Renwu Li et.al. 2504.02437 null
2025-04-03 ConsDreamer: Advancing Multi-View Consistency for Zero-Shot Text-to-3D Generation Yuan Zhou et.al. 2504.02316 link
2025-04-02 UAVTwin: Neural Digital Twins for UAVs using Gaussian Splatting Jaehoon Choi et.al. 2504.02158 null
2025-04-02 Diffusion-Guided Gaussian Splatting for Large-Scale Unconstrained 3D Reconstruction and Novel View Synthesis Niluthpol Chowdhury Mithun et.al. 2504.01960 null
2025-04-02 BOGausS: Better Optimized Gaussian Splatting Stéphane Pateux et.al. 2504.01844 null
2025-04-02 FlowR: Flowing from Sparse to Dense 3D Reconstructions Tobias Fischer et.al. 2504.01647 null
2025-04-02 3DBonsai: Structure-Aware Bonsai Modeling Using Conditioned 3D Gaussian Splatting Hao Wu et.al. 2504.01619 null
2025-04-02 RealityAvatar: Towards Realistic Loose Clothing Modeling in Animatable 3D Gaussian Avatars Yahui Li et.al. 2504.01559 null
2025-04-02 Luminance-GS: Adapting 3D Gaussian Splatting to Challenging Lighting Conditions with View-Adaptive Curve Adjustment Ziteng Cui et.al. 2504.01503 link
2025-04-02 3D Gaussian Inverse Rendering with Approximated Global Illumination Zirui Wu et.al. 2504.01358 null
2025-04-01 DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting Hyunwoo Park et.al. 2504.00773 null
2025-04-01 UnIRe: Unsupervised Instance Decomposition for Dynamic Urban Scene Reconstruction Yunxuan Mao et.al. 2504.00763 null
2025-04-01 Monocular and Generalizable Gaussian Talking Head Animation Shengjie Gong et.al. 2504.00665 null
2025-03-31 StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting Shakiba Kheradmand et.al. 2503.24366 null
2025-04-01 Visual Acoustic Fields Yuelei Li et.al. 2503.24270 null
2025-03-31 DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting Seungjun Lee et.al. 2503.24210 null
2025-03-31 Learning 3D-Gaussian Simulators from RGB Videos Mikel Zhobro et.al. 2503.24009 null
2025-03-31 ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image Tianyi Gong et.al. 2503.23881 null
2025-03-30 Gaussian Blending Unit: An Edge GPU Plug-in for Real-Time Gaussian-Based Rendering in AR/VR Zhifan Ye et.al. 2503.23625 null
2025-03-30 Enhancing 3D Gaussian Splatting Compression via Spatial Condition-based Prediction Jingui Ma et.al. 2503.23337 null
2025-03-30 ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning Zhenyang Liu et.al. 2503.23297 null
2025-03-29 NeuralGS: Bridging Neural Fields and 3D Gaussian Splatting for Compact 3D Representations Zhenyu Tang et.al. 2503.23162 null
2025-03-29 CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction Yuanyuan Gao et.al. 2503.23044 null
2025-03-28 TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting Boyang et.al. 2503.22676 null
2025-03-28 AH-GS: Augmented 3D Gaussian Splatting for High-Frequency Detail Representation Chenyang Xu et.al. 2503.22324 null
2025-03-28 Follow Your Motion: A Generic Temporal Consistency Portrait Editing Framework with Trajectory Guidance Haijie Yang et.al. 2503.22225 null
2025-03-28 ABC-GS: Alignment-Based Controllable Style Transfer for 3D Gaussian Splatting Wenjie Liu et.al. 2503.22218 null
2025-03-31 Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering Hao Feng et.al. 2503.22159 null
2025-03-27 Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying Hairong Yin et.al. 2503.21767 null
2025-03-28 LandMarkSystem Technical Report Zhenxiang Ma et.al. 2503.21364 link
2025-03-27 Frequency-Aware Gaussian Splatting Decomposition Yishai Lavi et.al. 2503.21226 null
2025-03-26 PGC: Physics-Based Gaussian Cloth from a Single Pose Michelle Guo et.al. 2503.20779 null
2025-03-26 TC-GS: Tri-plane based compression for 3D Gaussian Splatting Taorui Wang et.al. 2503.20221 link
2025-03-26 EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis Sheng Miao et.al. 2503.20168 null
2025-03-25 Thin-Shell-SfT: Fine-Grained Monocular Non-rigid 3D Surface Tracking with Neural Deformation Fields Navami Kairanda et.al. 2503.19976 null
2025-03-26 A Survey on Event-driven 3D Reconstruction: Development under Different Categories Chuanzhi Xu et.al. 2503.19753 null
2025-03-28 GaussianUDF: Inferring Unsigned Distance Functions through 3D Gaussian Splatting Shujuan Li et.al. 2503.19458 null
2025-03-25 SparseGS-W: Sparse-View 3D Gaussian Splatting in the Wild with Generative Priors Yiqing Li et.al. 2503.19452 null
2025-03-26 COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting Jiaxin Zhang et.al. 2503.19443 link
2025-03-25 MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection Jee Won Lee et.al. 2503.19330 null
2025-03-25 HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting Xinpeng Liu et.al. 2503.19232 link
2025-03-24 NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting Yulong Zheng et.al. 2503.18794 null
2025-03-24 GS-Marker: Generalizable and Robust Watermarking for 3D Gaussian Splatting Lijiang Li et.al. 2503.18718 null
2025-03-24 Hardware-Rasterized Ray-Based Gaussian Splatting Samuel Rota Bulò et.al. 2503.18682 null
2025-03-24 LLGS: Unsupervised Gaussian Splatting for Image Enhancement and Reconstruction in Pure Dark Environment Haoran Wang et.al. 2503.18640 null
2025-03-25 StableGS: A Floater-Free Framework for 3D Gaussian Splatting Luchao Wang et.al. 2503.18458 null
2025-03-24 4DGC: Rate-Aware 4D Gaussian Compression for Efficient Streamable Free-Viewpoint Video Qiang Hu et.al. 2503.18421 null
2025-03-24 DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds Youyu Chen et.al. 2503.18402 null
2025-03-24 GI-SLAM: Gaussian-Inertial SLAM Xulang Liu et.al. 2503.18275 null
2025-03-23 Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving Junhao Ge et.al. 2503.18108 link
2025-03-23 PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding Hongjia Zhai et.al. 2503.18107 null
2025-03-21 TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting Jianchuan Chen et.al. 2503.17032 null
2025-03-21 DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery Jiadong Tang et.al. 2503.16964 null
2025-03-21 Optimized Minimal 3D Gaussian Splatting Joo Chan Lee et.al. 2503.16924 null
2025-03-20 SAGE: Semantic-Driven Adaptive Gaussian Splatting in Extended Reality Chiara Schiavo et.al. 2503.16747 null
2025-03-20 GauRast: Enhancing GPU Triangle Rasterizers to Accelerate 3D Gaussian Splatting Sixu Li et.al. 2503.16681 null
2025-03-20 M3: 3D-Spatial MultiModal Memory Xueyan Zou et.al. 2503.16413 link
2025-03-20 Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images Shengjun Zhang et.al. 2503.16338 null
2025-03-20 OccluGaussian: Occlusion-Aware Gaussian Splatting for Large Scene Reconstruction and Rendering Shiyong Liu et.al. 2503.16177 null
2025-03-20 Enhancing Close-up Novel View Synthesis via Pseudo-labeling Jiatong Xia et.al. 2503.15908 link
2025-03-20 VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling Hyojun Go et.al. 2503.15855 null
2025-03-20 BARD-GS: Blur-Aware Reconstruction of Dynamic Scenes via Gaussian Splatting Yiren Lu et.al. 2503.15835 null
2025-03-18 HandSplat: Embedding-Driven Gaussian Splatting for High-Fidelity Hand Rendering Yilan Dong et.al. 2503.14736 null
2025-03-18 Optimized 3D Gaussian Splatting using Coarse-to-Fine Image Frequency Modulation Umar Farooq et.al. 2503.14475 null
2025-03-18 Improving Adaptive Density Control for 3D Gaussian Splatting Glenn Grubert et.al. 2503.14274 link
2025-03-18 Lightweight Gradient-Aware Upscaling of 3D Gaussian Splatting Images Simon Niedermayr et.al. 2503.14171 null
2025-03-18 Light4GS: Lightweight Compact 4D Gaussian Splatting Generation via Context Model Mufan Liu et.al. 2503.13948 null
2025-03-17 Gaussian On-the-Fly Splatting: A Progressive Framework for Robust Near Real-Time 3DGS Optimization Yiwei Xu et.al. 2503.13086 null
2025-03-17 CAT-3DGS Pro: A New Benchmark for Efficient 3DGS Compression Yu-Ting Zhan et.al. 2503.12862 null
2025-03-17 CompMarkGS: Robust Watermarking for Compression 3D Gaussian Splatting Sumin In et.al. 2503.12836 null
2025-03-17 AV-Surf: Surface-Enhanced Geometry-Aware Novel-View Acoustic Synthesis Hadam Baek et.al. 2503.12806 null
2025-03-16 SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs Guibiao Liao et.al. 2503.12535 null
2025-03-16 VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting Songen Gu et.al. 2503.12383 null
2025-03-18 GS-I$^{3}$: Gaussian Splatting for Surface Reconstruction from Illumination-Inconsistent Images Tengfei Wang et.al. 2503.12335 link
2025-03-16 Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene Jiahao Wu et.al. 2503.12307 null
2025-03-18 3D Gaussian Splatting against Moving Objects for High-Fidelity Street Scene Reconstruction Peizhen Zheng et.al. 2503.12001 link
2025-03-15 DynaGSLAM: Real-Time Gaussian-Splatting SLAM for Online Rendering, Tracking, Motion Predictions of Moving Objects in Dynamic Scenes Runfa Blark Li et.al. 2503.11979 null
2025-03-14 Advancing 3D Gaussian Splatting Editing with Complementary and Consensus Information Xuanqi Zhang et.al. 2503.11601 null
2025-03-14 EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting Di Li et.al. 2503.11345 null
2025-03-14 Uncertainty-Aware Normal-Guided Gaussian Splatting for Surface Reconstruction from Sparse Image Sequences Zhen Tan et.al. 2503.11172 null
2025-03-13 LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds Lingteng Qiu et.al. 2503.10625 link
2025-03-13 VicaSplat: A Single Run is All You Need for 3D Gaussian Splatting and Camera Estimation from Unposed Video Frames Zhiqi Li et.al. 2503.10286 null
2025-03-13 ROODI: Reconstructing Occluded Objects with Denoising Inpainters Yeonjin Chang et.al. 2503.10256 null
2025-03-15 3D Student Splatting and Scooping Jialin Zhu et.al. 2503.10148 link
2025-03-13 GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping Jinfeng Liu et.al. 2503.10143 null
2025-03-12 Physics-Aware Human-Object Rendering from Sparse Views via 3D Gaussian Splatting Weiquan Wang et.al. 2503.09640 null
2025-03-12 Hybrid Rendering for Multimodal Autonomous Driving: Merging Neural and Physics-Based Simulation Máté Tóth et.al. 2503.09464 null
2025-03-12 Online Language Splatting Saimouli Katragadda et.al. 2503.09447 null
2025-03-12 Close-up-GS: Enhancing Close-Up View Synthesis in 3D Gaussian Splatting with Progressive Self-Training Jiatong Xia et.al. 2503.09396 null
2025-03-11 PCGS: Progressive Compression of 3D Gaussian Splatting Yihang Chen et.al. 2503.08511 link
2025-03-11 HRAvatar: High-Quality and Relightable Gaussian Head Avatar Dongbin Zhang et.al. 2503.08224 null
2025-03-11 S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction Guangting Zheng et.al. 2503.08217 null
2025-03-11 Dynamic Scene Reconstruction: Recent Advance in Real-time Rendering and Streaming Jiaxuan Zhu et.al. 2503.08166 null
2025-03-11 ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting Junfu Guo et.al. 2503.08135 null
2025-03-13 MVGSR: Multi-View Consistency Gaussian Splatting for Robust Surface Reconstruction Chenfeng Hou et.al. 2503.08093 null
2025-03-11 GigaSLAM: Large-Scale Monocular SLAM with Hierachical Gaussian Splats Kai Deng et.al. 2503.08071 link
2025-03-11 7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting Zhongpai Gao et.al. 2503.07946 null
2025-03-10 POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality Joey Wilson et.al. 2503.07819 null
2025-03-10 SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting Jiahui Zhang et.al. 2503.07476 null
2025-03-10 EigenGS Representation: From Eigenspace to Gaussian Image Space Lo-Wei Tai et.al. 2503.07446 null
2025-03-10 All That Glitters Is Not Gold: Key-Secured 3D Secrets within 3D Gaussian Splatting Yan Ren et.al. 2503.07191 link
2025-03-10 Frequency-Aware Density Control via Reparameterization for High-Quality Rendering of 3D Gaussian Splatting Zhaojie Zeng et.al. 2503.07000 link
2025-03-09 REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints Di Wu et.al. 2503.06677 null
2025-03-09 StructGS: Adaptive Spherical Harmonics and Rendering Enhancements for Superior 3D Gaussian Splatting Zexu Huang et.al. 2503.06462 null
2025-03-08 SplatTalk: 3D VQA with Gaussian Splatting Anh Thai et.al. 2503.06271 null
2025-03-08 StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams Yang LI et.al. 2503.06235 null
2025-03-08 ForestSplats: Deformable transient field for Gaussian Splatting in the Wild Wongi Park et.al. 2503.06179 null
2025-03-08 Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction Kai Li et.al. 2503.06161 null
2025-03-07 Free Your Hands: Lightweight Relightable Turntable Capture Pipeline Jiahui Fan et.al. 2503.05511 null
2025-03-07 LiDAR-enhanced 3D Gaussian Splatting Mapping Jian Shen et.al. 2503.05425 null
2025-03-07 Self-Modeling Robots by Photographing Kejun Hu et.al. 2503.05398 null
2025-03-07 CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images Jungho Lee et.al. 2503.05332 link
2025-03-07 STGA: Selective-Training Gaussian Head Avatars Hanzhi Guo et.al. 2503.05196 null
2025-03-07 MGSR: 2D/3D Mutual-boosted Gaussian Splatting for High-fidelity Surface Reconstruction under Various Light Conditions Qingyuan Zhou et.al. 2503.05182 null
2025-03-07 SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting Linqi Yang et.al. 2503.05174 null
2025-03-07 SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting Xiaotong Huang et.al. 2503.05168 null
2025-03-07 EvolvingGS: High-Fidelity Streamable Volumetric Video via Evolving 3D Gaussian Representation Chao Zhang et.al. 2503.05162 null
2025-03-07 GaussianCAD: Robust Self-Supervised CAD Reconstruction from Three Orthographic Views Using 3D Gaussian Splatting Zheng Zhou et.al. 2503.05161 null
2025-03-06 S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting Yecong Wan et.al. 2503.04314 null
2025-03-06 Instrument-Splatting: Controllable Photorealistic Reconstruction of Surgical Instruments Using Gaussian Splatting Shuojue Yang et.al. 2503.04082 null
2025-03-06 Beyond Existance: Fulfill 3D Reconstructed Scenes with Pseudo Details Yifei Gao et.al. 2503.04037 null
2025-03-06 GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding Xihan Wang et.al. 2503.04034 null
2025-03-06 GRaD-Nav: Efficiently Learning Visual Drone Navigation with Gaussian Radiance Fields and Differentiable Dynamics Qianzhong Chen et.al. 2503.03984 null
2025-03-04 2DGS-Avatar: Animatable High-fidelity Clothed Avatar via 2D Gaussian Splatting Qipeng Yan et.al. 2503.02452 null
2025-03-04 DQO-MAP: Dual Quadrics Multi-Object mapping with Gaussian Splatting Haoyuan Li et.al. 2503.02223 link
2025-03-03 Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization Jamie Wynn et.al. 2503.02009 null
2025-03-03 Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Jay Zhangjie Wu et.al. 2503.01774 null
2025-03-03 OpenGS-SLAM: Open-Set Dense Semantic SLAM with 3D Gaussian Splatting for Object-Level Scene Understanding Dianyi Yang et.al. 2503.01646 null
2025-03-03 FGS-SLAM: Fourier-based Gaussian Splatting for Real-time SLAM with Sparse and Dense Map Fusion Yansong Xu et.al. 2503.01109 null
2025-03-02 Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization You Shen et.al. 2503.00881 null
2025-03-02 Vid2Fluid: 3D Dynamic Fluid Assets from Single-View Videos with Generative Gaussian Splatting Zhiwei Zhao et.al. 2503.00868 null
2025-03-02 PSRGS: Progressive Spectral Residual of 3D Gaussian for High-Frequency Recovery BoCheng Li et.al. 2503.00848 null
2025-03-02 DoF-Gaussian: Controllable Depth-of-Field for 3D Gaussian Splatting Liao Shen et.al. 2503.00746 null
2025-03-03 FlexDrive: Toward Trajectory Flexibility in Driving Scene Reconstruction and Rendering Jingqiu Zhou et.al. 2502.21093 null
2025-02-28 EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering John J. Han et.al. 2502.20669 null
2025-02-27 No Parameters, No Problem: 3D Gaussian Splatting without Camera Intrinsics and Extrinsics Dongbo Shi et.al. 2502.19800 null
2025-02-27 Open-Vocabulary Semantic Part Segmentation of 3D Human Keito Suzuki et.al. 2502.19782 null
2025-02-26 Compression in 3D Gaussian Splatting: A Survey of Methods, Trends, and Future Directions Muhammad Salman Ali et.al. 2502.19457 null
2025-02-26 Does 3D Gaussian Splatting Need Accurate Volumetric Rendering? Adam Celarek et.al. 2502.19318 link
2025-02-28 OpenFly: A Versatile Toolchain and Large-scale Benchmark for Aerial Vision-Language Navigation Yunpeng Gao et.al. 2502.18041 null
2025-02-27 UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting Haoyuan Li et.al. 2502.17860 null
2025-02-24 Laplace-Beltrami Operator for Gaussian Splatting Hongyu Zhou et.al. 2502.17531 null
2025-02-24 Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting Chong Cheng et.al. 2502.17377 null
2025-02-24 VR-Pipe: Streamlining Hardware Graphics Pipeline for Volume Rendering Junseo Lee et.al. 2502.17078 null
2025-02-23 Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration Kim Jun-Seong et.al. 2502.16652 null
2025-02-23 Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control Jinbo Yan et.al. 2502.16475 null
2025-02-21 RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes Sicheng Yu et.al. 2502.15633 null
2025-02-20 GS-Cache: A GS-Cache Inference Framework for Large-scale Gaussian Splatting Models Miao Tao et.al. 2502.14938 null
2025-02-20 Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting Boying Li et.al. 2502.14931 null
2025-02-20 CDGS: Confidence-Aware Depth Regularization for 3D Gaussian Splatting Qilin Zhang et.al. 2502.14684 link
2025-02-20 OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving Yedong Shen et.al. 2502.14235 null
2025-02-19 GlossGau: Efficient Inverse Rendering for Glossy Surface with Anisotropic Spherical Gaussian Bang Du et.al. 2502.14129 null
2025-02-19 3D Gaussian Splatting aided Localization for Large and Complex Indoor-Environments Vincent Ress et.al. 2502.13803 null
2025-02-18 RadSplatter: Extending 3D Gaussian Splatting to Radio Frequencies for Wireless Radiomap Extrapolation Yiheng Wang et.al. 2502.12686 null
2025-02-17 3D Gaussian Inpainting with Depth-Guided Cross-View Consistency Sheng-Yu Huang et.al. 2502.11801 null
2025-02-17 Exploring the Versal AI Engine for 3D Gaussian Splatting Kotaro Shimamura et.al. 2502.11782 null
2025-02-17 GaussianMotion: End-to-End Learning of Animatable Gaussian Avatars with Pose Guidance from Text Gyumin Shim et.al. 2502.11642 null
2025-02-16 OMG: Opacity Matters in Material Modeling with Gaussian Splatting Silong Yong et.al. 2502.10988 null
2025-02-16 GS-GVINS: A Tightly-integrated GNSS-Visual-Inertial Navigation System Augmented by 3D Gaussian Splatting Zelin Zhou et.al. 2502.10975 null
2025-02-15 E-3DGS: Event-Based Novel View Rendering of Large-Scale Scenes Using 3D Gaussian Splatting Sohaib Zahid et.al. 2502.10827 null
2025-02-13 X-SG$^2$S: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks Zihang Cheng et.al. 2502.10475 null
2025-02-12 Interactive Holographic Visualization for 3D Facial Avatar Tri Tung Nguyen Nguyen et.al. 2502.08085 null
2025-02-11 TranSplat: Surface Embedding-guided 3D Gaussian Splatting for Transparent Object Manipulation Jeongyun Kim et.al. 2502.07840 link
2025-02-11 Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors Lin-Zhuo Chen et.al. 2502.07615 null
2025-02-05 GARAD-SLAM: 3D GAussian splatting for Real-time Anti Dynamic SLAM Mingrui Li et.al. 2502.03228 null
2025-02-05 GP-GS: Gaussian Processes for Enhanced Gaussian Splatting Zhihao Guo et.al. 2502.02283 link
2025-02-04 LAYOUTDREAMER: Physics-guided Layout for Text-to-3D Compositional Scene Generation Yang Zhou et.al. 2502.01949 null
2025-02-11 UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping Aashish Rai et.al. 2502.01846 null
2025-02-03 Scalable 3D Gaussian Splatting-Based RF Signal Spatial Propagation Modeling Kang Yang et.al. 2502.01826 null
2025-02-03 VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion Shaoting Zhu et.al. 2502.01536 null
2025-02-02 EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis Junuk Cha et.al. 2502.00654 null
2025-01-31 Lifting by Gaussians: A Simple, Fast and Flexible Method for 3D Instance Segmentation Rohan Chacko et.al. 2502.00173 null
2025-01-31 Advancing Dense Endoscopic Reconstruction with Gaussian Splatting-driven Surface Normal-aware Tracking and Mapping Yiming Huang et.al. 2501.19319 link
2025-01-31 RaySplats: Ray Tracing based Gaussian Splatting Krzysztof Byrski et.al. 2501.19196 link
2025-01-31 JGHand: Joint-Driven Animatable Hand Avater via 3D Gaussian Splatting Zhoutao Sun et.al. 2501.19088 null
2025-01-30 Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting Yansong Qu et.al. 2501.18672 null
2025-01-29 3D Reconstruction of Shoes for Augmented Reality Pratik Shrestha et.al. 2501.18643 null
2025-01-31 VoD-3DGS: View-opacity-Dependent 3D Gaussian Splatting Mateusz Nowak et.al. 2501.17978 null
2025-01-29 CrowdSplat: Exploring Gaussian Splatting For Crowd Rendering Xiaohan Sun et.al. 2501.17792 link
2025-01-29 FeatureGS: Eigenvalue-Feature Optimization in 3D Gaussian Splatting for Geometrically Accurate and Artifact-Reduced Reconstruction Miriam Jäger et.al. 2501.17655 null
2025-01-28 Evaluating CrowdSplat: Perceived Level of Detail for Gaussian Crowds Xiaohan Sun et.al. 2501.17085 null
2025-01-28 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation Chenguo Lin et.al. 2501.16764 null
2025-01-25 Towards Better Robustness: Progressively Joint Pose-3DGS Learning for Arbitrarily Long Videos Zhen-Hui Dong et.al. 2501.15096 null
2025-01-25 HuGDiffusion: Generalizable Single-Image Human Rendering via 3D Gaussian Diffusion Yingzhi Tang et.al. 2501.15008 null
2025-01-24 HAMMER: Heterogeneous, Multi-Robot Semantic Gaussian Splatting Javier Yu et.al. 2501.14147 null
2025-01-27 3DGS$^2$: Near Second-order Converging 3D Gaussian Splatting Lei Lan et.al. 2501.13975 null
2025-01-23 GoDe: Gaussians on Demand for Progressive Level of Detail and Scalable Compression Francesco Di Sario et.al. 2501.13558 null
2025-01-23 MultiDreamer3D: Multi-concept 3D Customization with Concept-Aware Diffusion Guidance Wooseok Song et.al. 2501.13449 null
2025-01-23 GeomGS: LiDAR-Guided Geometry-Aware Gaussian Splatting for Robot Localization Jaewon Lee et.al. 2501.13417 null
2025-01-23 VIGS SLAM: IMU-based Large-Scale 3D Gaussian Splatting SLAM Gyuhyeon Pak et.al. 2501.13402 null
2025-01-23 Deblur-Avatar: Animatable Avatars from Motion-Blurred Monocular Videos Xianrui Luo et.al. 2501.13335 null
2025-01-22 Sketch and Patch: Efficient 3D Gaussian Representation for Man-Made Scenes Yuang Shi et.al. 2501.13045 null
2025-01-21 DARB-Splatting: Generalizing Splatting with Decaying Anisotropic Radial Basis Functions Vishagar Arunan et.al. 2501.12369 null
2025-01-22 HAC++: Towards 100X Compression of 3D Gaussian Splatting Yihang Chen et.al. 2501.12255 link
2025-01-22 GSVC: Efficient Video Representation and Compression Through 2D Gaussian Splatting Longan Wang et.al. 2501.12060 null
2025-01-20 See In Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization Zongqi He et.al. 2501.11508 null
2025-01-19 RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering Chenlu Zhan et.al. 2501.11102 null
2025-01-15 BloomScene: Lightweight Structured 3D Gaussian Splatting for Crossmodal Scene Generation Xiaolu Hou et.al. 2501.10462 link
2025-01-20 GSTAR: Gaussian Surface Tracking and Reconstruction Chengwei Zheng et.al. 2501.10283 null
2025-01-16 Creating Virtual Environments with 3D Gaussian Splatting: A Comparative Study Shi Qiu et.al. 2501.09302 null
2025-01-15 CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation Qi Ma et.al. 2501.08982 null
2025-01-15 GS-LIVO: Real-Time LiDAR, Inertial, and Visual Multi-sensor Fused Odometry with Gaussian Mapping Sheng Hong et.al. 2501.08672 null
2025-01-14 3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering Meenakshi Krishnan et.al. 2501.08370 null
2025-01-13 UnCommon Objects in 3D Xingchen Liu et.al. 2501.07574 link
2025-01-13 3DGS-to-PC: Convert a 3D Gaussian Splatting Scene into a Dense Point Cloud or Mesh Lewis A G Stuart et.al. 2501.07478 link
2025-01-14 SplatMAP: Online Dense Monocular SLAM with 3D Gaussian Splatting Yue Hu et.al. 2501.07015 null
2025-01-12 Synthetic Prior for Few-Shot Drivable Head Avatar Inversion Wojciech Zielonka et.al. 2501.06903 null
2025-01-12 ActiveGAMER: Active GAussian Mapping through Efficient Rendering Liyan Chen et.al. 2501.06897 null
2025-01-11 NVS-SQA: Exploring Self-Supervised Quality Representation Learning for Neurally Synthesized Scenes without References Qiang Qu et.al. 2501.06488 link
2025-01-10 Locality-aware Gaussian Compression for Fast and High-quality Rendering Seungjoo Shin et.al. 2501.05757 null
2025-01-13 Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance Dimitrios Gerogiannis et.al. 2501.05379 null
2025-01-09 Scaffold-SLAM: Structured 3D Gaussians for Simultaneous Localization and Photorealistic Mapping Wen Tianci et.al. 2501.05242 null
2025-01-08 GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting Andrew Bond et.al. 2501.04782 null
2025-01-07 MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting Sangwoon Kwak et.al. 2501.03714 null
2025-01-07 DehazeGS: Seeing Through Fog with 3D Gaussian Splatting Jinze Yu et.al. 2501.03659 null
2025-01-07 ConcealGS: Concealing Invisible Copyright Information in 3D Gaussian Splatting Yifeng Yang et.al. 2501.03605 link
2025-01-06 Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs Soonbin Lee et.al. 2501.03399 null
2025-01-06 HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation Wentian Qu et.al. 2501.02845 null
2025-01-03 Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision Alberta Longhini et.al. 2501.01715 null
2025-01-03 CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction Chenhao Zhang et.al. 2501.01695 null
2025-01-03 PG-SAG: Parallel Gaussian Splatting for Fine-Grained Large-Scale Urban Buildings Reconstruction via Semantic-Aware Grouping Tengfei Wang et.al. 2501.01677 link
2025-01-02 Deformable Gaussian Splatting for Efficient and High-Fidelity Reconstruction of Surgical Scenes Jiwei Shan et.al. 2501.01101 null
2025-01-02 EasySplat: View-Adaptive Learning makes 3D Gaussian Splatting Easy Ao Gao et.al. 2501.01003 null
2024-12-31 PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM Runnan Chen et.al. 2501.00352 null
2024-12-31 SG-Splatting: Accelerating 3D Gaussian Splatting with Spherical Gaussians Yiwen Wang et.al. 2501.00342 null
2024-12-30 PERSE: Personalized 3D Generative Avatars from A Single Portrait Hyunsoo Cha et.al. 2412.21206 null
2024-12-30 KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences Keng-Wei Chang et.al. 2412.20767 null
2024-12-29 MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks Yifei Liu et.al. 2412.20522 link
2024-12-28 DEGSTalk: Decomposed Per-Embedding Gaussian Fields for Hair-Preserving Talking Face Synthesis Kaijun Deng et.al. 2412.20148 link
2024-12-28 GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting Atticus J. Zeller et.al. 2412.20056 link
2024-12-27 Dust to Tower: Coarse-to-Fine Photo-Realistic Scene Reconstruction from Sparse Uncalibrated Images Xudong Cai et.al. 2412.19518 null
2024-12-27 Learning Radiance Fields from a Single Snapshot Compressive Image Yunhao Li et.al. 2412.19483 null
2024-12-26 BeSplat – Gaussian Splatting from a Single Blurry Image and Event Stream Gopi Raju Matta et.al. 2412.19370 link
2024-12-26 Generating Editable Head Avatars with 3D Gaussian GANs Guohao Li et.al. 2412.19149 link
2024-12-26 CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting Siyu Jiao et.al. 2412.19142 null
2024-12-26 MVS-GS: High-Quality 3D Gaussian Splatting Mapping via Online Multi-View Stereo Byeonggwon Lee et.al. 2412.19130 null
2024-12-25 WeatherGS: 3D Scene Reconstruction in Adverse Weather Conditions via Gaussian Splatting Chenghao Qian et.al. 2412.18862 link
2024-12-25 GSAVS: Gaussian Splatting-based Autonomous Vehicle Simulator Rami Wilson et.al. 2412.18816 null
2024-12-25 ArtNVG: Content-Style Separated Artistic Neighboring-View Gaussian Stylization Zixiao Gu et.al. 2412.18783 null
2024-12-24 RSGaussian: 3D Gaussian Splatting with LiDAR for Aerial Remote Sensing Novel View Synthesis Yiling Yao et.al. 2412.18380 null
2024-12-23 GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance Jingqiu Zhou et.al. 2412.17715 null
2024-12-23 CoSurfGS: Collaborative 3D Surface Gaussian Splatting with Distributed Learning for Large Scene Reconstruction Yuanyuan Gao et.al. 2412.17612 null
2024-12-23 Balanced 3DGS: Gaussian-wise Parallelism Rendering with Fine-Grained Tiling Hao Gui et.al. 2412.17378 null
2024-12-22 GSemSplat: Generalizable Semantic 3D Gaussian Splatting from Uncalibrated Image Pairs Xingrui Wang et.al. 2412.16932 link
2024-12-22 GeoTexDensifier: Geometry-Texture-Aware Densification for High-Quality Photorealistic 3D Gaussian Splatting Hanqing Jiang et.al. 2412.16809 null
2024-12-21 Topology-Aware 3D Gaussian Splatting: Leveraging Persistent Homology for Optimized Structural Integrity Tianqi Shen et.al. 2412.16619 link
2024-12-21 OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities Suyoung Lee et.al. 2412.16604 null
2024-12-20 Interactive Scene Authoring with Specialized Generative Primitives Clément Jambon et.al. 2412.16253 null
2024-12-20 CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images Jungho Lee et.al. 2412.16028 null
2024-12-20 AvatarPerfect: User-Assisted 3D Gaussian Splatting Avatar Refinement with Automatic Pose Suggestion Jotaro Sakamiya et.al. 2412.15609 null
2024-12-20 EGSRAL: An Enhanced 3D Gaussian Splatting based Renderer with Automated Labeling for Large-Scale Driving Scene Yixiong Huo et.al. 2412.15550 link
2024-12-19 GSRender: Deduplicated Occupancy Prediction via Weakly Supervised 3D Gaussian Splatting Qianpu Sun et.al. 2412.14579 null
2024-12-19 Improving Geometry in Sparse-View 3DGS via Reprojection-based DoF Separation Yongsung Kim et.al. 2412.14568 null
2024-12-18 GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians Xiaobao Wei et.al. 2412.13983 link
2024-12-18 GAGS: Granularity-Aware Feature Distillation for Language Gaussian Splatting Yuning Peng et.al. 2412.13654 null
2024-12-18 4D Radar-Inertial Odometry based on Gaussian Modeling and Multi-Hypothesis Scan Matching Fernando Amodeo et.al. 2412.13639 link
2024-12-18 Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields Tao Lu et.al. 2412.13547 null
2024-12-18 Vivar: A Generative AR System for Intuitive Multi-Modal Sensor Data Presentation Yunqi Guo et.al. 2412.13509 null
2024-12-17 CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image Wonseok Roh et.al. 2412.12906 null
2024-12-17 HyperGS: Hyperspectral 3D Gaussian Splatting Christopher Thirgood et.al. 2412.12849 null
2024-12-17 3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting Qi Wu et.al. 2412.12507 link
2024-12-16 Wonderland: Navigating 3D Scenes from a Single Image Hanwen Liang et.al. 2412.12091 null
2024-12-16 SweepEvGS: Event-Based 3D Gaussian Splatting for Macro and Micro Radiance Field Rendering from a Single Sweep Jingqian Wu et.al. 2412.11579 null
2024-12-16 EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting Dong In Lee et.al. 2412.11520 null
2024-12-14 DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting Luis Wiedmann et.al. 2412.10972 link
2024-12-13 SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians Siyun Liang et.al. 2412.10231 null
2024-12-18 SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video Jongmin Park et.al. 2412.09982 null
2024-12-13 RP-SLAM: Real-time Photorealistic SLAM with Efficient 3D Gaussian Splatting Lizhi Bai et.al. 2412.09868 null
2024-12-12 PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields Sean Wu et.al. 2412.09680 link
2024-12-12 LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors Yabo Chen et.al. 2412.09597 null
2024-12-12 LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting Haotian Mao et.al. 2412.09176 null
2024-12-10 Proc-GS: Procedural Building Generation for City Assembly with 3D Gaussians Yixuan Li et.al. 2412.07660 null
2024-12-10 Faster and Better 3D Splatting via Group Training Chengbo Wang et.al. 2412.07608 null
2024-12-10 ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery Yanzhe Lyu et.al. 2412.07494 null
2024-12-10 EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering Toshiya Yura et.al. 2412.07293 null
2024-12-09 Deblur4DGS: 4D Gaussian Splatting from Blurry Monocular Video Renlong Wu et.al. 2412.06424 link
2024-12-09 4D Gaussian Splatting with Scale-aware Residual Field and Adaptive Optimization for Real-time Rendering of Temporally Complex Dynamic Scenes Jinbo Yan et.al. 2412.06299 null
2024-12-12 Advancing Extended Reality with 3D Gaussian Splatting: Innovations and Prospects Shi Qiu et.al. 2412.06257 null
2024-12-09 Splatter-360: Generalizable 360$^{\circ}$ Gaussian Splatting for Wide-baseline Panoramic Images Zheng Chen et.al. 2412.06250 link
2024-12-09 Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction Seungtae Nam et.al. 2412.06234 null
2024-12-07 Temporally Compressed 3D Gaussian Splatting for Dynamic Scenes Saqib Javed et.al. 2412.05700 null
2024-12-07 WATER-GS: Toward Copyright Protection for 3D Gaussian Splatting via Universal Watermarking Yuqi Tan et.al. 2412.05695 null
2024-12-07 Template-free Articulated Gaussian Splatting for Real-time Reposable Dynamic View Synthesis Diwen Wan et.al. 2412.05570 null
2024-12-07 Text-to-3D Gaussian Splatting with Physics-Grounded Motion Generation Wenqing Wang et.al. 2412.05560 null
2024-12-07 Radiant: Large-scale 3D Gaussian Rendering based on Hierarchical Framework Haosong Peng et.al. 2412.05546 null
2024-12-06 Extrapolated Urban View Synthesis Benchmark Xiangyu Han et.al. 2412.05256 link
2024-12-06 MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussian Splatting Peng Chen et.al. 2412.04955 link
2024-12-06 Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction Jixuan Fan et.al. 2412.04887 link
2024-12-06 WRF-GS: Wireless Radiation Field Reconstruction with 3D Gaussian Splatting Chaozheng Wen et.al. 2412.04832 link
2024-12-06 Pushing Rendering Boundaries: Hard Gaussian Splatting Qingshan Xu et.al. 2412.04826 null
2024-12-05 QUEEN: QUantized Efficient ENcoding of Dynamic Gaussians for Streaming Free-viewpoint Videos Sharath Girish et.al. 2412.04469 null
2024-12-06 PBDyG: Position Based Dynamic Gaussians for Motion-Aware Clothed Human Avatars Shota Sasaki et.al. 2412.04433 null
2024-12-05 Multi-View Pose-Agnostic Change Localization with Zero Labels Chamuditha Jayanga Galappaththige et.al. 2412.03911 link
2024-12-05 HybridGS: Decoupling Transients and Statics with 2D and 3D Gaussian Splatting Jingyu Lin et.al. 2412.03844 link
2024-12-04 Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos Hanxue Liang et.al. 2412.03526 null
2024-12-04 2DGS-Room: Seed-Guided 2D Gaussian Splatting with Geometric Constrains for High-Fidelity Indoor Scene Reconstruction Wanting Zhang et.al. 2412.03428 null
2024-12-04 Volumetrically Consistent 3D Gaussian Rasterization Chinmay Talegaonkar et.al. 2412.03378 link
2024-12-04 SGSST: Scaling Gaussian Splatting StyleTransfer Bruno Galerne et.al. 2412.03371 link
2024-12-04 Splats in Splats: Embedding Invisible 3D Watermark within Gaussian Splatting Yijia Guo et.al. 2412.03121 null
2024-12-03 Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects Abdurrahman Zeybey et.al. 2412.02803 null
2024-12-03 RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians Qiankun Gao et.al. 2412.02493 link
2024-12-03 GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos Zhiyuan Chen et.al. 2412.02267 null
2024-12-03 Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance Jing Zeng et.al. 2412.02249 null
2024-12-03 How to Use Diffusion Priors under Sparse Views? Qisen Wang et.al. 2412.02225 link
2024-12-03 SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images Junqiu Yu et.al. 2412.02140 null
2024-12-03 Gaussian Object Carver: Object-Compositional Gaussian Splatting with surfaces completion Liu Liu et.al. 2412.02075 link
2024-12-02 Occam’s LGS: A Simple Approach for Language Gaussian Splatting Jiahuan Cheng et.al. 2412.01807 null
2024-12-02 CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion Kai He et.al. 2412.01792 null
2024-12-02 Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes Lihan Jiang et.al. 2412.01745 null
2024-12-02 HUGSIM: A Real-Time, Photo-Realistic and Closed-Loop Simulator for Autonomous Driving Hongyu Zhou et.al. 2412.01718 null
2024-12-02 GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting Zixuan Chen et.al. 2411.19895 link
2024-11-29 TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting Bojun Xiong et.al. 2411.19654 link
2024-11-29 Tortho-Gaussian: Splatting True Digital Orthophoto Maps Xin Wang et.al. 2411.19594 null
2024-11-29 Gaussian Splashing: Direct Volumetric Rendering Underwater Nir Mualem et.al. 2411.19588 null
2024-11-29 Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding Wenbo Zhang et.al. 2411.19551 link
2024-12-02 GausSurf: Geometry-Guided 3D Gaussian Splatting for Surface Reconstruction Jiepeng Wang et.al. 2411.19454 null
2024-11-29 RF-3DGS: Wireless Channel Modeling with Radio Radiance Field and 3D Gaussian Splatting Lihao Zhang et.al. 2411.19420 link
2024-11-28 InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception Haijie Li et.al. 2411.19235 null
2024-11-28 Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes Thomas Wimmer et.al. 2411.19233 link
2024-11-28 RIGI: Rectifying Image-to-3D Generation Inconsistency via Uncertainty-aware Learning Jiacheng Wang et.al. 2411.18866 null
2024-11-27 Textured Gaussians for Enhanced 3D Scene Appearance Modeling Brian Chao et.al. 2411.18625 null
2024-11-27 PhyCAGE: Physically Plausible Compositional 3D Asset Generation from a Single Image Han Yan et.al. 2411.18548 null
2024-11-27 HEMGS: A Hybrid Entropy Model for 3D Gaussian Splatting Data Compression Lei Liu et.al. 2411.18473 null
2024-11-27 Neural Surface Priors for Editable Gaussian Splatting Jakub Szymkowiak et.al. 2411.18311 link
2024-11-27 Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters Zhiyang Guo et.al. 2411.18197 null
2024-11-27 GLS: Geometry-aware 3D Language Gaussian Splatting Jiaxiong Qiu et.al. 2411.18066 link
2024-11-27 HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction Wei Zhang et.al. 2411.17982 link
2024-11-26 DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting Christian Homeyer et.al. 2411.17660 link
2024-11-26 Distractor-free Generalizable 3D Gaussian Splatting Yanqi Bao et.al. 2411.17605 link
2024-11-28 SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting Gyeongjin Kang et.al. 2411.17190 null
2024-11-25 G2SDF: Surface Reconstruction from Explicit Gaussians with Implicit SDFs Kunyi Li et.al. 2411.16898 null
2024-11-25 PreF3R: Pose-Free Feed-Forward 3D Gaussian Splatting from Variable-length Image Sequence Zequn Chen et.al. 2411.16877 null
2024-11-25 SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving Georg Hess et.al. 2411.16816 link
2024-11-25 SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis Hyojun Go et.al. 2411.16443 link
2024-11-25 Quadratic Gaussian Splatting for Efficient and Detailed Surface Reconstruction Ziyu Zhang et.al. 2411.16392 null
2024-11-25 Event-boosted Deformable 3D Gaussians for Fast Dynamic Scene Reconstruction Wenhao Xu et.al. 2411.16180 null
2024-11-24 ZeroGS: Training 3D Gaussian Splatting from Unposed Images Yu Chen et.al. 2411.15779 null
2024-11-24 GSurf: 3D Reconstruction via Signed Distance Fields with Direct Gaussian Supervision Xu Baixin et.al. 2411.15723 link
2024-11-23 Gassidy: Gaussian Splatting SLAM in Dynamic Environments Long Wen et.al. 2411.15476 null
2024-11-23 SplatSDF: Boosting Neural Implicit SDF via Gaussian Splatting Fusion Runfa Blark Li et.al. 2411.15468 null
2024-11-22 UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations Yuan Ren et.al. 2411.15355 null
2024-11-22 3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes Jan Held et.al. 2411.14974 link
2024-11-22 Dynamics-Aware Gaussian Splatting Streaming Towards Fast On-the-Fly Training for 4D Reconstruction Zhening Liu et.al. 2411.14847 null
2024-11-22 VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving Haiming Zhang et.al. 2411.14716 null
2024-11-21 NexusSplats: Efficient 3D Gaussian Splatting in the Wild Yuzhou Tang et.al. 2411.14514 null
2024-11-21 Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation Zhuoman Liu et.al. 2411.14423 null
2024-11-21 SplatR : Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching Arjun P S et.al. 2411.14322 link
2024-11-20 Generating 3D-Consistent Videos from Unposed Internet Photos Gene Chou et.al. 2411.13549 null
2024-11-20 GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting Xiaobao Wei et.al. 2411.12981 null
2024-11-19 PR-ENDO: Physically Based Relightable Gaussian Splatting for Endoscopy Joanna Kaleta et.al. 2411.12510 link
2024-11-19 SCIGS: 3D Gaussians Splatting from a Snapshot Compressive Image Zixu Wang et.al. 2411.12471 null
2024-11-20 Beyond Gaussians: Fast and High-Fidelity 3D Splatting with Linear Kernels Haodong Chen et.al. 2411.12440 null
2024-11-19 LiV-GS: LiDAR-Vision Integration for 3D Gaussian Splatting SLAM in Outdoor Environments Renxiang Xiao et.al. 2411.12185 null
2024-11-19 Sketch-guided Cage-based 3D Gaussian Splatting Deformation Tianhao Xie et.al. 2411.12168 null
2024-11-21 FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting Fangyu Wu et.al. 2411.12089 null
2024-11-18 TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction DaDong Jiang et.al. 2411.11941 null
2024-11-18 DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes Chensheng Peng et.al. 2411.11921 link
2024-11-18 RoboGSim: A Real2Sim2Real Robotic Gaussian Splatting Simulator Xinhai Li et.al. 2411.11839 null
2024-11-18 GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views Boyao Zhou et.al. 2411.11363 null
2024-11-17 VeGaS: Video Gaussian Splatting Weronika Smolak-Dyżewska et.al. 2411.11024 link
2024-11-15 The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods Yifu Tao et.al. 2411.10546 null
2024-11-15 USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting Kang Chen et.al. 2411.10504 link
2024-11-15 Efficient Density Control for 3D Gaussian Splatting Xiaobin Deng et.al. 2411.10133 link
2024-11-15 GSEditPro: 3D Gaussian Splatting Editing with Attention-based Progressive Localization Yanhao Sun et.al. 2411.10033 null
2024-11-15 GGAvatar: Reconstructing Garment-Separated 3D Gaussian Splatting Avatars from Monocular Video Jingxuan Chen et.al. 2411.09952 link
2024-11-14 Adversarial Attacks Using Differentiable Rendering: A Survey Matthew Hull et.al. 2411.09749 null
2024-11-14 DyGASR: Dynamic Generalized Exponential Splatting with Surface Alignment for Accelerated 3D Mesh Reconstruction Shengchao Zhao et.al. 2411.09156 null
2024-11-13 Towards More Accurate Fake Detection on Images Generated from Advanced Generative and Neural Rendering Models Chengdong Dong et.al. 2411.08642 null
2024-11-13 Biomass phenotyping of oilseed rape through UAV multi-view oblique imaging with 3DGS and SAM model Yutao Shen et.al. 2411.08453 null
2024-11-13 MBA-SLAM: Motion Blur Aware Dense Visual SLAM with Radiance Fields Representation Peng Wang et.al. 2411.08279 link
2024-11-14 Projecting Gaussian Ellipsoids While Avoiding Affine Projection Approximation Han Qi et.al. 2411.07579 null
2024-11-12 GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting Umangi Jain et.al. 2411.07555 null
2024-11-12 HiCoM: Hierarchical Coherent Motion for Streamable Dynamic Scene with 3D Gaussian Splatting Qiankun Gao et.al. 2411.07541 link
2024-11-12 GUS-IR: Gaussian Splatting with Unified Shading for Inverse Rendering Zhihao Liang et.al. 2411.07478 null
2024-11-11 A Hierarchical Compression Technique for 3D Gaussian Splatting Compression He Huang et.al. 2411.06976 null
2024-11-10 Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction Decai Chen et.al. 2411.06602 null
2024-11-12 SplatFormer: Point Transformer for Robust 3D Gaussian Splatting Yutong Chen et.al. 2411.06390 link
2024-11-10 Through the Curved Cover: Synthesizing Cover Aberrated Scenes with Refractive Field Liuyue Xie et.al. 2411.06365 null
2024-11-09 AI-Driven Stylization of 3D Environments Yuanbo Chen et.al. 2411.06067 null
2024-11-09 GaussianSpa: An “Optimizing-Sparsifying” Simplification Framework for Compact and High-Quality 3D Gaussian Splatting Yangming Zhang et.al. 2411.06019 null
2024-11-07 ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing Jun-Kun Chen et.al. 2411.05006 null
2024-11-07 MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views Yuedong Chen et.al. 2411.04924 link
2024-11-08 GS2Pose: Two-stage 6D Object Pose Estimation Guided by Gaussian Splatting Jilan Mei et.al. 2411.03807 null
2024-11-06 3DGS-CD: 3D Gaussian Splatting-based Change Detection for Physical Object Rearrangement Ziqi Lu et.al. 2411.03706 link
2024-11-06 Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis Rui Peng et.al. 2411.03637 link
2024-11-05 Object and Contact Point Tracking in Demonstrations Using 3D Gaussian Splatting Michael Büttner et.al. 2411.03555 null
2024-11-05 HFGaussian: Learning Generalizable Gaussian Human with Integrated Human Features Arnab Dey et.al. 2411.03086 null
2024-11-05 LVI-GS: Tightly-coupled LiDAR-Visual-Inertial SLAM using 3D Gaussian Splatting Huibin Zhao et.al. 2411.02703 null
2024-11-04 Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting Joey Wilson et.al. 2411.02547 null
2024-11-06 SplatOverflow: Asynchronous Hardware Troubleshooting Amritansh Kwatra et.al. 2411.02332 null
2024-11-05 FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training Ruihong Yin et.al. 2411.02229 null
2024-11-06 GVKF: Gaussian Voxel Kernel Functions for Highly Efficient Surface Reconstruction in Open Scenes Gaochao Song et.al. 2411.01853 null
2024-11-01 CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes Yang Liu et.al. 2411.00771 null
2024-10-31 Aquatic-GS: A Hybrid 3D Representation for Underwater Scenes Shaohua Liu et.al. 2411.00239 null
2024-10-31 Self-Ensembling Gaussian Splatting for Few-shot Novel View Synthesis Chen Zhao et.al. 2411.00144 link
2024-10-31 No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images Botao Ye et.al. 2410.24207 link
2024-11-01 GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering Kai Ye et.al. 2410.24204 null
2024-10-31 GaussianMarker: Uncertainty-Aware Copyright Protection of 3D Gaussian Splatting Xiufeng Huang et.al. 2410.23718 null
2024-10-31 GS-Blur: A 3D Scene-Based Dataset for Realistic Image Deblurring Dongwoo Lee et.al. 2410.23658 link
2024-10-30 ELMGS: Enhancing memory and computation scaLability through coMpression for 3D Gaussian Splatting Muhammad Salman Ali et.al. 2410.23213 null
2024-10-31 Epipolar-Free 3D Gaussian Splatting for Generalizable Novel View Synthesis Zhiyuan Min et.al. 2410.22817 null
2024-10-29 PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting Sunghwan Hong et.al. 2410.22128 link
2024-10-29 FreeGaussian: Guidance-free Controllable 3D Gaussian Splats with Flow Derivatives Qizhi Chen et.al. 2410.22070 null
2024-10-28 CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians Chongjian Ge et.al. 2410.20723 null
2024-10-28 ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings Suyoung Lee et.al. 2410.20686 link
2024-10-27 Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering Meng Wei et.al. 2410.20593 null
2024-10-30 DiffGS: Functional Gaussian Splatting Diffusion Junsheng Zhou et.al. 2410.19657 null
2024-10-25 Robotic Learning in your Backyard: A Neural Simulator from Open Source Components Liyou Zhou et.al. 2410.19564 link
2024-10-25 Content-Aware Radiance Fields: Aligning Model Complexity with Scene Intricacy Through Learned Bitwidth Quantization Weihang Liu et.al. 2410.19483 link
2024-10-24 Sort-free Gaussian Splatting via Weighted Sum Rendering Qiqi Hou et.al. 2410.18931 null
2024-10-24 Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling Mingtong Zhang et.al. 2410.18912 null
2024-10-27 Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis Liang Han et.al. 2410.18822 null
2024-10-23 VR-Splatting: Foveated Radiance Field Rendering via 3D Gaussian Splatting and Neural Points Linus Franke et.al. 2410.17932 null
2024-10-23 PLGS: Robust Panoptic Lifting with 3D Gaussian Splatting Yu Wang et.al. 2410.17505 null
2024-10-22 AG-SLAM: Active Gaussian Splatting SLAM Wen Jiang et.al. 2410.17422 null
2024-10-22 SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes Cheng-De Fan et.al. 2410.17249 null
2024-10-18 GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting Yusen Xie et.al. 2410.17084 null
2024-10-22 E-3DGS: Gaussian Splatting with Exposure and Motion Events Xiaoting Yin et.al. 2410.16995 link
2024-10-21 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors Xi Liu et.al. 2410.16266 null
2024-10-22 Fully Explicit Dynamic Gaussian Splatting Junoh Lee et.al. 2410.15629 null
2024-10-22 EF-3DGS: Event-Aided Free-Trajectory 3D Gaussian Splatting Bohao Liao et.al. 2410.15392 null
2024-10-18 Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set Wenyuan Zhang et.al. 2410.14189 null
2024-10-17 DepthSplat: Connecting Gaussian Splatting and Depth Haofei Xu et.al. 2410.13862 link
2024-10-17 DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering Jiahao Lu et.al. 2410.13607 link
2024-10-17 GlossyGS: Inverse Rendering of Glossy Objects with 3D Gaussian Splatting Shuichang Lai et.al. 2410.13349 null
2024-10-16 3D Gaussian Splatting in Robotics: A Survey Siting Zhu et.al. 2410.12262 link
2024-10-15 SplatPose+: Real-time Image-Based Pose-Agnostic 3D Anomaly Detection Yizhe Liu et.al. 2410.12080 link
2024-10-15 LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images Yuzhou Cheng et.al. 2410.11505 null
2024-10-15 MCGS: Multiview Consistency Enhancement for Sparse-View 3D Gaussian Radiance Fields Yuru Xiao et.al. 2410.11394 null
2024-10-15 GSORB-SLAM: Gaussian Splatting SLAM benefits from ORB features and Transmittance information Wancai Zheng et.al. 2410.11356 null
2024-10-15 Scalable Indoor Novel-View Synthesis using Drone-Captured 360 Imagery with 3D Gaussian Splatting Yuanbo Chen et.al. 2410.11285 null
2024-10-14 Few-shot Novel View Synthesis using Depth Aware 3D Gaussian Splatting Raja Kumar et.al. 2410.11080 link
2024-10-15 4-LEGS: 4D Language Embedded Gaussian Splatting Gal Fiebelman et.al. 2410.10719 null
2024-10-11 SurgicalGS: Dynamic 3D Gaussian Splatting for Accurate Robotic-Assisted Surgical Scene Reconstruction Jialei Chen et.al. 2410.09292 null
2024-10-11 MeshGS: Adaptive Mesh-Aligned Gaussian Splatting for High-Quality Rendering Jaehoon Choi et.al. 2410.08941 null
2024-10-11 Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars Xuan Huang et.al. 2410.08840 link
2024-10-11 Look Gauss, No Pose: Novel View Synthesis using Gaussian Splatting without Accurate Pose Initialization Christian Schmidt et.al. 2410.08743 link
2024-10-10 FusionSense: Bridging Common Sense, Vision, and Touch for Robust Sparse-View Reconstruction Irving Fang et.al. 2410.08282 null
2024-10-10 Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics Junyi Cao et.al. 2410.08257 null
2024-10-10 Poison-splat: Computation Cost Attack on 3D Gaussian Splatting Jiahao Lu et.al. 2410.08190 link
2024-10-10 DifFRelight: Diffusion-Based Facial Performance Relighting Mingming He et.al. 2410.08188 null
2024-10-10 Efficient Perspective-Correct 3D Gaussian Splatting Using Hybrid Transparency Florian Hahlbohm et.al. 2410.08129 null
2024-10-10 IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera Jian Huang et.al. 2410.08107 link
2024-10-11 Fast Feedforward 3D Gaussian Splatting Compression Yihang Chen et.al. 2410.08017 link
2024-10-10 MotionGS: Exploring Explicit Motion Guidance for Deformable 3D Gaussian Splatting Ruijie Zhu et.al. 2410.07707 link
2024-10-09 Spiking GS: Towards High-Accuracy and Low-Cost Surface Reconstruction via Spiking Neuron-based Gaussian Splatting Weixing Zhang et.al. 2410.07266 link
2024-10-09 3D Representation Methods: A Survey Zhengren Wang et.al. 2410.06475 null
2024-10-08 HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction Shengji Tang et.al. 2410.06245 null
2024-10-08 GSLoc: Visual Localization with 3D Gaussian Splatting Kazii Botashev et.al. 2410.06165 null
2024-10-08 Comparative Analysis of Novel View Synthesis and Photogrammetry for 3D Forest Stand Reconstruction and extraction of individual tree parameters Guoji Tian et.al. 2410.05772 null
2024-10-07 GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting Yukang Cao et.al. 2410.05259 null
2024-10-07 DreamSat: Towards a General 3D Model for Novel View Synthesis of Space Objects Nidhi Mathihalli et.al. 2410.05097 link
2024-10-07 PhotoReg: Photometrically Registering 3D Gaussian Splatting Models Ziwen Yuan et.al. 2410.05044 null
2024-10-07 6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering Zhongpai Gao et.al. 2410.04974 null
2024-10-07 Next Best Sense: Guiding Vision and Touch with FisherRF for 3D Gaussian Splatting Matthew Strong et.al. 2410.04680 link
2024-10-06 Mode-GS: Monocular Depth Guided Anchored 3D Gaussian Splatting for Robust Ground-View Scene Rendering Yonghan Lee et.al. 2410.04646 null
2024-10-04 Variational Bayes Gaussian Splatting Toon Van de Maele et.al. 2410.03592 link
2024-10-03 Flash-Splat: 3D Reflection Removal with Flash Cues and Gaussian Splats Mingyang Xie et.al. 2410.02764 null
2024-10-03 GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering Hongze Chen et.al. 2410.02619 null
2024-10-07 SuperGS: Super-Resolution 3D Gaussian Splatting via Latent Feature Field and Gradient-guided Splitting Shiyun Xie et.al. 2410.02571 link
2024-10-02 MVGS: Multi-view-regulated Gaussian Splatting for Novel View Synthesis Xiaobiao Du et.al. 2410.02103 link
2024-10-03 EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis Alexander Mai et.al. 2410.01804 null
2024-10-02 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection Yang Cao et.al. 2410.01647 link
2024-10-02 Gaussian Splatting in Mirrors: Reflection-Aware Rendering via Virtual Camera Optimization Zihan Wang et.al. 2410.01614 link
2024-10-02 UW-GS: Distractor-Aware 3D Gaussian Splatting for Enhanced Underwater Scene Reconstruction Haoran Wang et.al. 2410.01517 link
2024-10-02 EVA-Gaussian: 3D Gaussian-based Real-time Human Novel View Synthesis under Diverse Camera Settings Yingdong Hu et.al. 2410.01425 null
2024-10-02 CaRtGS: Computational Alignment for Real-Time Gaussian Splatting SLAM Dapeng Feng et.al. 2410.00486 link
2024-10-01 Seamless Augmented Reality Integration in Arthroscopy: A Pipeline for Articular Reconstruction and Guidance Hongchao Shu et.al. 2410.00386 null
2024-10-01 GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving Zhangshuo Qi et.al. 2410.00299 link
2024-09-30 RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning Yuxuan Wu et.al. 2409.20291 null
2024-09-30 Robust Gaussian Splatting SLAM by Leveraging Loop Closure Zunjie Zhu et.al. 2409.20111 null
2024-10-01 RNG: Relightable Neural Gaussians Jiahui Fan et.al. 2409.19702 null
2024-09-28 1st Place Solution to the 8th HANDS Workshop Challenge – ARCTIC Track: 3DGS-based Bimanual Category-agnostic Interaction Reconstruction Jeongwan On et.al. 2409.19215 null
2024-09-26 HGS-Planner: Hierarchical Planning Framework for Active Scene Reconstruction Using 3D Gaussian Splatting Zijun Xu et.al. 2409.17624 null
2024-09-25 SeaSplat: Representing Underwater Scenes with 3D Gaussian Splatting and a Physically Grounded Image Formation Model Daniel Yang et.al. 2409.17345 null
2024-09-25 Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM Phu Pham et.al. 2409.16944 null
2024-09-24 GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization Gennady Sidorov et.al. 2409.16502 link
2024-09-24 Frequency-based View Selection in Gaussian Splatting Reconstruction Monica M. Q. Li et.al. 2409.16470 null
2024-09-26 Gaussian Deja-vu: Creating Controllable 3D Gaussian Head-Avatars with Enhanced Generalization and Personalization Abilities Peizhi Yan et.al. 2409.16147 link
2024-09-23 Human Hair Reconstruction with Strand-Aligned 3D Gaussians Egor Zakharov et.al. 2409.14778 null
2024-09-22 MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views Wangze Xu et.al. 2409.14316 null
2024-09-21 SplatLoc: 3D Gaussian Splatting-based Visual Localization for Augmented Reality Hongjia Zhai et.al. 2409.14067 null
2024-09-20 Elite-EvGS: Learning Event-based 3D Gaussian Splatting by Distilling Event-to-Video Priors Zixin Zhang et.al. 2409.13392 null
2024-09-20 3D-GSW: 3D Gaussian Splatting Watermark for Protecting Copyrights in Radiance Fields Youngdong Jang et.al. 2409.13222 null
2024-09-19 MGSO: Monocular Real-time Photometric SLAM with Efficient 3D Gaussian Splatting Yan Song Hu et.al. 2409.13055 null
2024-09-18 SRIF: Semantic Shape Registration Empowered by Diffusion-based Image Morphing and Flow Estimation Mingze Sun et.al. 2409.11682 link
2024-09-18 Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks Joji Joseph et.al. 2409.11681 link
2024-09-17 GS-Net: Generalizable Plug-and-Play 3D Gaussian Splatting Module Yichen Zhang et.al. 2409.11307 null
2024-09-17 SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction Marko Mihajlovic et.al. 2409.11211 null
2024-09-17 GLC-SLAM: Gaussian Splatting SLAM with Efficient Loop Closure Ziheng Xu et.al. 2409.10982 null
2024-09-16 Phys3DGS: Physically-based 3D Gaussian Splatting for Inverse Rendering Euntae Choi et.al. 2409.10335 null
2024-09-16 BEINGS: Bayesian Embodied Image-goal Navigation with Gaussian Splatting Wugang Meng et.al. 2409.10216 link
2024-09-16 DENSER: 3D Gaussians Splatting for Scene Reconstruction of Dynamic Urban Environments Mahmud A. Mohamad et.al. 2409.10041 link
2024-09-15 MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation Shuzhao Xie et.al. 2409.09756 null
2024-09-17 A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis Yohan Poirier-Ginter et.al. 2409.08947 null
2024-09-13 AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius Xinzhe Wang et.al. 2409.08669 null
2024-09-13 Dense Point Clouds Matter: Dust-GS for Scene Reconstruction from Sparse Viewpoints Shan Chen et.al. 2409.08613 null
2024-09-13 CSS: Overcoming Pose and Scene Challenges in Crowd-Sourced 3D Gaussian Splatting Runze Chen et.al. 2409.08562 null
2024-09-12 FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally Qiuhong Shen et.al. 2409.08270 link
2024-09-12 Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis Qian Chen et.al. 2409.08042 link
2024-09-12 SwinGS: Sliding Window Gaussian Splatting for Volumetric Video Streaming with Arbitrary Length Bangya Liu et.al. 2409.07759 null
2024-09-11 Self-Evolving Depth-Supervised 3D Gaussian Splatting from Rendered Stereo Pairs Sadra Safadoust et.al. 2409.07456 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 ThermalGaussian: Thermal 3D Gaussian Splatting Rongfeng Lu et.al. 2409.07200 link
2024-09-10 GigaGS: Scaling up Planar-Based 3D Gaussians for Large Scene Surface Reconstruction Junyi Chen et.al. 2409.06685 null
2024-09-10 Sources of Uncertainty in 3D Scene Reconstruction Marcus Klasson et.al. 2409.06407 link
2024-09-09 Lagrangian Hashing for Compressed Neural Field Representations Shrisudhan Govindarajan et.al. 2409.05334 null
2024-09-08 GS-PT: Exploiting 3D Gaussian Splatting for Comprehensive Point Cloud Understanding via Self-supervised Learning Keyi Liu et.al. 2409.04963 null
2024-09-11 Fisheye-GS: Lightweight and Extensible Gaussian Splatting Module for Fisheye Cameras Zimu Liao et.al. 2409.04751 link
2024-09-06 GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Lorenza Prospero et.al. 2409.04196 link
2024-09-06 3D-GP-LMVIC: Learning-based Multi-View Image Coding with 3D Gaussian Geometric Priors Yujun Huang et.al. 2409.04013 link
2024-09-05 LM-Gaussian: Boost Sparse-view 3D Gaussian Splatting with Large Model Priors Hanyang Yu et.al. 2409.03456 null
2024-09-05 Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction Shen Chen et.al. 2409.03213 null
2024-09-04 Object Gaussian for Monocular 6D Pose Estimation from Sparse Views Luqing Luo et.al. 2409.02581 null
2024-09-04 GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving Huasong Han et.al. 2409.02382 null
2024-09-03 DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction Jenny Seidenschwarz et.al. 2409.02104 null
2024-09-03 PRoGS: Progressive Rendering of Gaussian Splats Brent Zoomers et.al. 2409.01761 null
2024-09-03 GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting Zixuan Guo et.al. 2409.01581 null
2024-09-02 Free-DyGS: Camera-Pose-Free Scene Reconstruction based on Gaussian Splatting for Dynamic Surgical Videos Qian Li et.al. 2409.01003 null
2024-09-06 3D Gaussian Splatting for Large-scale 3D Surface Reconstruction from Aerial Images YuanZheng Wu et.al. 2409.00381 null
2024-08-30 OG-Mapping: Octree-based Structured 3D Gaussians for Online Dense Mapping Meng Wang et.al. 2408.17223 null
2024-08-29 ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model Fangfu Liu et.al. 2408.16767 null
2024-08-28 Towards Realistic Example-based Modeling via 3D Gaussian Stitching Xinyu Gao et.al. 2408.15708 null
2024-08-27 Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty Saining Zhang et.al. 2408.15242 link
2024-08-27 Learning-based Multi-View Stereo: A Survey Fangjinhua Wang et.al. 2408.15235 null
2024-08-27 LapisGS: Layered Progressive 3D Gaussian Splatting for Adaptive Streaming Yuang Shi et.al. 2408.14823 link
2024-08-26 Avatar Concept Slider: Manipulate Concepts In Your Human Avatar With Fine-grained Control Yixuan He et.al. 2408.13995 null
2024-08-27 Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs Brandon Smart et.al. 2408.13912 null
2024-08-25 TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers Chuanrui Zhang et.al. 2408.13770 null
2024-08-25 SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting Wenrui Li et.al. 2408.13711 link
2024-08-23 BiGS: Bidirectional Gaussian Primitives for Relightable 3D Gaussian Splatting Zhenyuan Liu et.al. 2408.13370 null
2024-08-23 FLoD: Integrating Flexible Level of Detail into 3D Gaussian Splatting for Customizable Rendering Yunji Seo et.al. 2408.12894 null
2024-08-26 GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion Jiaxin Wei et.al. 2408.12677 link
2024-08-22 Subsurface Scattering for 3D Gaussian Splatting Jan-Niklas Dihlmann et.al. 2408.12282 null
2024-08-21 Robust 3D Gaussian Splatting for Novel View Synthesis in Presence of Distractors Paul Ungermann et.al. 2408.11697 link
2024-08-27 Pano2Room: Novel View Synthesis from a Single Indoor Panorama Guo Pu et.al. 2408.11413 link
2024-08-20 GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting Changkun Liu et.al. 2408.11085 link
2024-08-20 ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining Qi Ma et.al. 2408.10906 null
2024-08-20 DEGAS: Detailed Expressions on Full-Body Gaussian Avatars Zhijing Shao et.al. 2408.10588 link
2024-08-20 LoopSplat: Loop Closure by Registering 3D Gaussian Splats Liyuan Zhu et.al. 2408.10154 link
2024-08-20 Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting Sheng Ye et.al. 2408.09130 link
2024-08-16 Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS Wei Sun et.al. 2408.08723 null
2024-08-15 WaterSplatting: Fast Underwater 3D Scene Reconstruction Using Gaussian Splatting Huapeng Li et.al. 2408.08206 null
2024-08-19 FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering Guofeng Feng et.al. 2408.07967 link
2024-08-14 3D Gaussian Editing with A Single Image Guan Luo et.al. 2408.07540 null
2024-08-13 SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis Saptarshi Neil Sinha et.al. 2408.06975 null
2024-08-12 Mipmap-GS: Let Gaussians Deform with Scale-specific Mipmap for Anti-aliasing Rendering Jiameng Li et.al. 2408.06286 link
2024-08-12 Developing Smart MAVs for Autonomous Inspection in GPS-denied Constructions Paoqiang Pan et.al. 2408.06030 null
2024-08-10 Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis Zhongche Qu et.al. 2408.05635 null
2024-08-09 DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow Hangyu Li et.al. 2408.05008 null
2024-08-08 InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting Xin-Yi Yu et.al. 2408.04249 null
2024-08-07 Towards Real-Time Gaussian Splatting: Accelerating 3DGS through Photometric SLAM Yan Song Hu et.al. 2408.03825 null
2024-08-07 Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields Joo Chan Lee et.al. 2408.03822 null
2024-08-07 3iGS: Factorised Tensorial Illumination for 3D Gaussian Splatting Zhe Jun Tang et.al. 2408.03753 link
2024-08-07 PRTGS: Precomputed Radiance Transfer of Gaussian Splats for Real-Time High-Quality Relighting Yijia Guo et.al. 2408.03538 null
2024-08-02 A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness Lutao Jiang et.al. 2408.01269 null
2024-08-02 Reality Fusion: Robust Real-time Immersive Mobile Robot Teleoperation with Volumetric Visual Data Fusion Ke Li et.al. 2408.01225 link
2024-08-07 IG-SLAM: Instant Gaussian SLAM F. Aykut Sarikamis et.al. 2408.01126 null
2024-08-01 LoopSparseGS: Loop Based Sparse-View Friendly Gaussian Splatting Zhenyu Bao et.al. 2408.00254 null
2024-07-31 Localized Gaussian Splatting Editing with Contextual Awareness Hanyuan Xiao et.al. 2408.00083 null
2024-07-31 Expressive Whole-Body 3D Gaussian Avatar Gyeongsik Moon et.al. 2407.21686 null
2024-07-30 SceneTeller: Language-to-3D Scene Generation Başak Melis Öcal et.al. 2407.20727 null
2024-07-29 Radiance Fields for Robotic Teleoperation Maximum Wilder-Smith et.al. 2407.20194 link
2024-07-24 3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities Yanqi Bao et.al. 2407.17418 link
2024-07-23 HDRSplat: Gaussian Splatting for High Dynamic Range 3D Scene Reconstruction from Raw Images Shreyas Singh et.al. 2407.16503 link
2024-07-23 Integrating Meshes and 3D Gaussians for Indoor Scene Reconstruction with SAM Mask Guidance Jiyeop Kim et.al. 2407.16173 null
2024-07-22 6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model Matteo Bortolon et.al. 2407.15484 null
2024-07-22 Enhancement of 3D Gaussian Splatting using Raw Mesh for Photorealistic Recreation of Architectures Ruizhe Wang et.al. 2407.15435 null
2024-07-21 HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions Haiyang Zhou et.al. 2407.15187 null
2024-07-20 Realistic Surgical Image Dataset Generation Based On 3D Gaussian Splatting Tianle Zeng et.al. 2407.14846 null
2024-07-19 DirectL: Efficient Radiance Fields Rendering for 3D Light Field Displays Zongyuan Yang et.al. 2407.14053 null
2024-07-20 Connecting Consistency Distillation to Score Distillation for Text-to-3D Generation Zongrui Li et.al. 2407.13584 link
2024-07-18 EaDeblur-GS: Event assisted 3D Deblur Reconstruction with Gaussian Splatting Yuchen Weng et.al. 2407.13520 null
2024-07-17 Splatfacto-W: A Nerfstudio Implementation of Gaussian Splatting for Unconstrained Photo Collections Congrong Xu et.al. 2407.12306 null
2024-07-16 MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification Zhuoxiao Li et.al. 2407.11840 null
2024-07-16 Click-Gaussian: Interactive Segmentation to Any 3D Gaussians Seokhun Choi et.al. 2407.11793 null
2024-07-16 SlingBAG: Sliding ball adaptive growth algorithm with differentiable radiation enables super-efficient iterative 3D photoacoustic image reconstruction Shuang Li et.al. 2407.11781 link
2024-07-16 Ev-GS: Event-based Gaussian splatting for Efficient and Accurate Radiance Field Rendering Jingqian Wu et.al. 2407.11343 null
2024-07-14 3DEgo: 3D Editing on the Go! Umar Khalid et.al. 2407.10102 null
2024-07-14 SpikeGS: 3D Gaussian Splatting from Spike Streams with High-Speed Camera Motion Jiyuan Zhang et.al. 2407.10062 null
2024-07-12 StyleSplat: 3D Object Style Transfer with Gaussian Splatting Sahil Jain et.al. 2407.09473 null
2024-07-11 WildGaussians: 3D Gaussian Splatting in the Wild Jonas Kulhanek et.al. 2407.08447 link
2024-07-11 Survey on Fundamental Deep Learning 3D Reconstruction Techniques Yonge Bai et.al. 2407.08137 null
2024-07-17 MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition Aggelina Chatziagapi et.al. 2407.07284 null
2024-07-09 Reference-based Controllable Scene Stylization with Gaussian Splatting Yiqun Mei et.al. 2407.07220 null
2024-07-10 3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes Nicolas Moenne-Loccoz et.al. 2407.07090 null
2024-07-07 PICA: Physics-Integrated Clothed Avatar Bo Peng et.al. 2407.05324 null
2024-07-06 SurgicalGaussian: Deformable 3D Gaussians for High-Fidelity Surgical Scene Reconstruction Weixing Xie et.al. 2407.05023 link
2024-07-12 Segment Any 4D Gaussians Shengxiang Ji et.al. 2407.04504 null
2024-07-04 PFGS: High Fidelity Point Cloud Rendering via Feature Splatting Jiaxu Wang et.al. 2407.03857 link
2024-07-04 SpikeGS: Reconstruct 3D scene via fast-moving bio-inspired sensors Yijia Guo et.al. 2407.03771 null
2024-07-04 VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors Sungwon Hwang et.al. 2407.02945 link
2024-07-03 Free-SurGS: SfM-Free 3D Gaussian Splatting for Surgical Scene Reconstruction Jiaxin Guo et.al. 2407.02918 link
2024-07-04 AutoSplat: Constrained Gaussian Splatting for Autonomous Driving Scene Reconstruction Mustafa Khan et.al. 2407.02598 null
2024-07-02 TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation Chaofan Luo et.al. 2407.02034 null
2024-07-01 GaussianStego: A Generalizable Stenography Pipeline for Generative 3D Gaussians Splatting Chenxin Li et.al. 2407.01301 null
2024-07-02 RTGS: Enabling Real-Time Gaussian Splatting on Mobile Devices Using Efficiency-Guided Pruning and Foveated Rendering Weikai Lin et.al. 2407.00435 link
2024-06-29 OccFusion: Rendering Occluded Humans with Generative Diffusion Priors Adam Sun et.al. 2407.00316 null
2024-06-28 SpotlessSplats: Ignoring Distractors in 3D Gaussian Splatting Sara Sabour et.al. 2406.20055 null
2024-06-28 EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting Daiwei Zhang et.al. 2406.19811 null
2024-06-27 Lightweight Predictive 3D Gaussian Splats Junli Cao et.al. 2406.19434 link
2024-06-26 On Scaling Up 3D Gaussian Splatting Training Hexu Zhao et.al. 2406.18533 link
2024-06-26 GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality Taoran Yi et.al. 2406.18462 null
2024-06-26 Trimming the Fat: Efficient Compression of 3D Gaussian Splats through Pruning Muhammad Salman Ali et.al. 2406.18214 link
2024-06-26 GS-Octree: Octree-based 3D Gaussian Splatting for Robust Object-level 3D Reconstruction Under Strong Lighting Jiaze Li et.al. 2406.18199 null
2024-06-25 NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods Jonas Kulhanek et.al. 2406.17345 null
2024-06-24 Reducing the Memory Footprint of 3D Gaussian Splatting Panagiotis Papantonakis et.al. 2406.17074 null
2024-06-23 LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction Hengyu Liu et.al. 2406.16073 link
2024-06-23 Learning with Noisy Ground Truth: From 2D Classification to 3D Reconstruction Yangdi Lu et.al. 2406.15982 null
2024-06-21 Taming 3DGS: High-Quality Radiance Fields with Limited Resources Saswat Subhajyoti Mallick et.al. 2406.15643 link
2024-06-21 Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks Alex Quach et.al. 2406.15149 null
2024-06-18 Sampling 3D Gaussian Scenes in Seconds with Latent Diffusion Models Paul Henderson et.al. 2406.13099 null
2024-06-18 HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors Panwang Pan et.al. 2406.12459 link
2024-06-17 A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets Bernhard Kerbl et.al. 2406.12080 null
2024-06-22 RetinaGS: Scalable Training for Dense Scene Rendering with Billion-Scale 3D Gaussians Bingling Li et.al. 2406.11836 null
2024-06-18 Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting Junha Hyung et.al. 2406.11672 null
2024-06-14 Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections Jiacong Xu et.al. 2406.10373 null
2024-06-14 L4GM: Large 4D Gaussian Reconstruction Model Jiawei Ren et.al. 2406.10324 null
2024-06-14 PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting Alex Hanson et.al. 2406.10219 link
2024-06-14 GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors Xiqian Yu et.al. 2406.10111 null
2024-06-14 Unified Gaussian Primitives for Scene Representation and Rendering Yang Zhou et.al. 2406.09733 null
2024-06-13 Modeling Ambient Scene Dynamics for Free-view Synthesis Meng-Li Shih et.al. 2406.09395 null
2024-06-13 GGHead: Fast and Generalizable 3D Gaussian Heads Tobias Kirschstein et.al. 2406.09377 null
2024-06-13 Gaussian-Forest: Hierarchical-Hybrid 3D Gaussian Splatting for Compressed Scene Modeling Fengyi Zhang et.al. 2406.08759 null
2024-06-12 ICE-G: Image Conditional Editing of 3D Gaussian Splats Vishnu Jaganathan et.al. 2406.08488 null
2024-06-12 Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models Yuxuan Xue et.al. 2406.08475 null
2024-06-12 From Chaos to Clarity: 3DGS in the Dark Zhihao Li et.al. 2406.08300 null
2024-06-11 Trim 3D Gaussian Splatting for Accurate Geometry Representation Lue Fan et.al. 2406.07499 null
2024-06-11 Cinematic Gaussians: Real-Time HDR Radiance Fields with Depth of Field Chao Wang et.al. 2406.07329 null
2024-06-10 GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation Haozhe Xie et.al. 2406.06526 link
2024-06-10 PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction Danpeng Chen et.al. 2406.06521 null
2024-06-10 MVGamba: Unify 3D Content Generation as State Space Sequence Modeling Xuanyu Yi et.al. 2406.06367 link
2024-06-10 Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis Xin Jin et.al. 2406.06216 link
2024-06-09 RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering Rui Zhang et.al. 2406.05852 null
2024-06-09 VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction Hanlin Chen et.al. 2406.05774 null
2024-06-06 A Survey on 3D Human Avatar Modeling – From Reconstruction to Generation Ruihe Wang et.al. 2406.04253 null
2024-06-06 Localized Gaussian Point Management Haosen Yang et.al. 2406.04251 null
2024-06-06 Superpoint Gaussian Splatting for Real-Time High-Fidelity Dynamic Scene Reconstruction Diwen Wan et.al. 2406.03697 link
2024-06-10 Event3DGS: Event-Based 3D Gaussian Splatting for High-Speed Robot Egomotion Tianyi Xiong et.al. 2406.02972 null
2024-06-05 Adversarial Generation of Hierarchical Gaussians for 3D Generative Model Sangeek Hyun et.al. 2406.02968 link
2024-06-04 3D-HGS: 3D Half-Gaussian Splatting Haolin Li et.al. 2406.02720 link
2024-06-06 Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting Inkyu Shin et.al. 2406.02541 null
2024-06-04 SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition Van Minh Nguyen et.al. 2406.02533 null
2024-06-04 DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering Zhongpai Gao et.al. 2406.02518 null
2024-06-04 WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections Yuze Wang et.al. 2406.02407 null
2024-06-04 Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning Jiaxu Wang et.al. 2406.02370 null
2024-06-04 OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding Yanmin Wu et.al. 2406.02058 null
2024-06-04 FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping Yuzhou Ji et.al. 2406.01916 null
2024-06-03 Tetrahedron Splatting for 3D Generation Chun Gu et.al. 2406.01579 link
2024-06-03 DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors Tianyu Huang et.al. 2406.01476 link
2024-06-03 RaDe-GS: Rasterizing Depth in Gaussian Splatting Baowen Zhang et.al. 2406.01467 link
2024-05-31 ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model Yufei Wang et.al. 2405.20721 link
2024-05-31 R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction Ruyi Zha et.al. 2405.20693 link
2024-05-30 $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving Nan Huang et.al. 2405.20323 link
2024-06-03 A Pixel Is Worth More Than One 3D Gaussians in Single-View 3D Reconstruction Jianghao Shen et.al. 2405.20310 null
2024-05-29 EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images Wangbo Yu et.al. 2405.20224 null
2024-05-30 Object-centric Reconstruction and Tracking of Dynamic Unknown Objects using 3D Gaussian Splatting Kuldeep R Barad et.al. 2405.20104 null
2024-05-30 GaussianRoom: Improving 3D Gaussian Splatting with SDF Guidance and Monocular Cues for Indoor Scene Reconstruction Haodong Xiang et.al. 2405.19671 null
2024-05-30 Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian Wei Sun et.al. 2405.19657 null
2024-05-30 TAMBRIDGE: Bridging Frame-Centered Tracking and 3D Gaussian Splatting for Enhanced SLAM Peifeng Jiang et.al. 2405.19614 null
2024-05-29 NPGA: Neural Parametric Gaussian Avatars Simon Giebenhain et.al. 2405.19331 null
2024-05-29 LP-3DGS: Learning to Prune 3D Gaussian Splatting Zhaoliang Zhang et.al. 2405.18784 link
2024-05-28 A Grid-Free Fluid Solver based on Gaussian Spatial Representation Jingrui Xing et.al. 2405.18133 null
2024-05-28 FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes Yunsong Wang et.al. 2405.17958 link
2024-05-28 A Refined 3D Gaussian Representation for High-Quality Dynamic Scene Reconstruction Bin Zhang et.al. 2405.17891 null
2024-05-29 HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction Haoyu Zhao et.al. 2405.17872 link
2024-05-30 Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting Shuojue Yang et.al. 2405.17835 link
2024-05-28 Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh Xiangjun Gao et.al. 2405.17811 null
2024-05-28 SafeguardGS: 3D Gaussian Primitive Pruning While Avoiding Catastrophic Scene Destruction Yongjae Lee et.al. 2405.17793 link
2024-05-29 DC-Gaussian: Improving 3D Gaussian Splatting for Reflective Dash Cam Videos Linhan Wang et.al. 2405.17705 link
2024-05-27 GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane Yansong Qu et.al. 2405.17596 null
2024-05-27 DOF-GS: Adjustable Depth-of-Field 3D Gaussian Splatting for Refocusing, Defocus Rendering and Blur Removal Yujie Wang et.al. 2405.17351 null
2024-05-27 Memorize What Matters: Emergent Scene Decomposition from Multitraverse Yiming Li et.al. 2405.17187 link
2024-05-28 F-3DGS: Factorized Coordinates and Representations for 3D Gaussian Splatting Xiangyu Sun et.al. 2405.17083 null
2024-05-28 SA-GS: Semantic-Aware Gaussian Splatting for Large Scene Reconstruction with Geometry Constrain Butian Xiong et.al. 2405.16923 null
2024-05-28 PyGS: Large-scale Scene Representation with Pyramidal 3D Gaussian Splatting Zipeng Wang et.al. 2405.16829 null
2024-05-26 Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians Erik Sandström et.al. 2405.16544 link
2024-05-24 Feature Splatting for Better Novel View Synthesis with Low Overlap T. Berriel Martins et.al. 2405.15518 link
2024-05-24 GSDeformer: Direct Cage-based Deformation for 3D Gaussian Splatting Jiajun Huang et.al. 2405.15491 null
2024-05-27 HDR-GS: Efficient High Dynamic Range Novel View Synthesis at 1000x Speed via Gaussian Splatting Yuanhao Cai et.al. 2405.15125 link
2024-05-24 GS-Hider: Hiding Messages into 3D Gaussian Splatting Xuanyu Zhang et.al. 2405.15118 null
2024-05-23 TIGER: Text-Instructed 3D Gaussian Retrieval and Coherent Editing Teng Xu et.al. 2405.14455 null
2024-05-24 RoGS: Large Scale Road Surface Reconstruction based on 2D Gaussian Splatting Zhiheng Feng et.al. 2405.14342 link
2024-05-22 DoGaussian: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus Yu Chen et.al. 2405.13943 link
2024-05-22 Gaussian Time Machine: A Real-Time Rendering Methodology for Time-Variant Appearances Licheng Shen et.al. 2405.13694 null
2024-05-21 Gaussian Control with Hierarchical Semantic Graphs in 3D Human Recovery Hongsheng Wang et.al. 2405.12477 null
2024-05-20 GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details Boqian Li et.al. 2405.12420 link
2024-05-22 AtomGS: Atomizing Gaussian Splatting for High-Fidelity Radiance Field Rong Liu et.al. 2405.12369 link
2024-05-20 Embracing Radiance Field Rendering in 6G: Over-the-Air Training and Inference with 3D Contents Guanlin Wu et.al. 2405.12155 null
2024-05-20 CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization Jiawei Zhang et.al. 2405.12110 link
2024-05-21 Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping Tianhao Wu et.al. 2405.12069 null
2024-05-20 MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections Jiayue Liu et.al. 2405.11921 null
2024-05-18 Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching Xingyu Miao et.al. 2405.11252 link
2024-05-18 MotionGS: Compact Gaussian Splatting SLAM by Motion Filter Xinli Guo et.al. 2405.11129 link
2024-05-17 Photorealistic 3D Urban Scene Reconstruction and Point Cloud Extraction using Google Earth Imagery and Gaussian Splatting Kyle Gao et.al. 2405.11021 null
2024-05-17 ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation Pengzhi Li et.al. 2405.10508 null
2024-05-16 GS-Planner: A Gaussian-Splatting-based Planning Framework for Active High-Fidelity Reconstruction Rui Jin et.al. 2405.10142 null
2024-05-11 Direct Learning of Mesh and Appearance via 3D Gaussian Splatting Ancheng Lin et.al. 2405.06945 null
2024-05-10 I3DGS: Improve 3D Gaussian Splatting from Multiple Dimensions Jinwei Lin et.al. 2405.06408 null
2024-05-09 DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation Sitian Shen et.al. 2405.05800 null
2024-05-09 FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting Yikun Ma et.al. 2405.05768 null
2024-05-18 NGM-SLAM: Gaussian Splatting SLAM with Radiance Field Submap Mingrui Li et.al. 2405.05702 null
2024-05-09 Benchmarking Neural Radiance Fields for Autonomous Robots: An Overview Yuhang Ming et.al. 2405.05526 null
2024-05-08 GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields Yuanhao Gong et.al. 2405.05446 null
2024-05-06 A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose Kaiwen Jiang et.al. 2405.03659 null
2024-05-03 HoloGS: Instant Depth-based 3D Gaussian Splatting with Microsoft HoloLens 2 Miriam Jäger et.al. 2405.02005 null
2024-05-01 Spectrally Pruned Gaussian Fields with Neural Compensation Runyi Yang et.al. 2405.00676 link
2024-04-30 GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting Kai Zhang et.al. 2404.19702 null
2024-04-29 SAGS: Structure-Aware 3D Gaussian Splatting Evangelos Ververas et.al. 2404.19149 null
2024-04-29 MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing Cong Wang et.al. 2404.19026 null
2024-04-29 DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing Minghao Chen et.al. 2404.18929 null
2024-04-29 Bootstrap 3D Reconstructed Scenes from 3D Gaussian Splatting Yifei Gao et.al. 2404.18669 null
2024-04-29 3D Gaussian Splatting with Deferred Reflection Keyang Ye et.al. 2404.18454 link
2024-04-29 Reconstructing Satellites in 3D from Amateur Telescope Images Zhiming Chang et.al. 2404.18394 null
2024-04-26 SLAM for Indoor Mapping of Wide Area Construction Environments Vincent Ress et.al. 2404.17215 null
2024-04-25 GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting Kyusun Cho et.al. 2404.16012 link
2024-04-25 OMEGAS: Object Mesh Extraction from Large Scenes Guided by Gaussian Segmentation Lizhi Wang et.al. 2404.15891 link
2024-04-22 Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses Inhee Lee et.al. 2404.14410 null
2024-04-22 CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding Guibiao Liao et.al. 2404.14249 link
2024-04-28 GaussianTalker: Speaker-specific Talking Head Synthesis via 3D Gaussian Splatting Hongyun Yu et.al. 2404.14037 null
2024-04-21 GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal Yuxin Wang et.al. 2404.13679 null
2024-04-19 Learn2Talk: 3D Talking Face Learns from 2D Talking Face Yixiang Zhuang et.al. 2404.12888 null
2024-04-19 EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation Wenkai Liu et.al. 2404.12777 null
2024-04-22 Does Gaussian Splatting need SFM Initialization? Yalda Foroutan et.al. 2404.12547 null
2024-04-22 Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos Isabella Liu et.al. 2404.12379 null
2024-04-17 RainyScape: Unsupervised Rainy Scene Reconstruction using Decoupled Neural Rendering Xianqiang Lyu et.al. 2404.11401 null
2024-04-18 DeblurGS: Gaussian Splatting for Camera Motion Blur Jeongtaek Oh et.al. 2404.11358 null
2024-04-17 Novel View Synthesis for Cinematic Anatomy on Mobile and Immersive Displays Simon Niedermayr et.al. 2404.11285 null
2024-04-16 Gaussian Opacity Fields: Efficient and Compact Surface Reconstruction in Unbounded Scenes Zehao Yu et.al. 2404.10772 null
2024-04-16 Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks Florian Barthel et.al. 2404.10625 null
2024-04-16 AbsGS: Recovering Fine Details for 3D Gaussian Splatting Zongxin Ye et.al. 2404.10484 null
2024-04-16 SRGS: Super-Resolution 3D Gaussian Splatting Xiang Feng et.al. 2404.10318 link
2024-04-15 LetsGo: Large-Scale Garage Modeling and Rendering via LiDAR-Assisted Gaussian Primitives Jiadi Cui et.al. 2404.09748 null
2024-04-15 3D Gaussian Splatting as Markov Chain Monte Carlo Shakiba Kheradmand et.al. 2404.09591 null
2024-04-16 LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field Jiyang Li et.al. 2404.08966 link
2024-04-15 OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering Jingrui Ye et.al. 2404.08449 null
2024-04-10 RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion Jaidev Shriram et.al. 2404.07199 null
2024-04-10 Gaussian-LIC: Photo-realistic LiDAR-Inertial-Camera SLAM with 3D Gaussian Splatting Xiaolei Lang et.al. 2404.06926 null
2024-04-10 SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection Mathis Kruse et.al. 2404.06832 link
2024-04-12 SpikeNVS: Enhancing Novel View Synthesis from Blurry Images via Spike Camera Gaole Dai et.al. 2404.06710 null
2024-04-14 3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis Zhicheng Lu et.al. 2404.06270 null
2024-04-09 Gaussian Pancakes: Geometrically-Regularized 3D Gaussian Splatting for Realistic Endoscopic Reconstruction Sierra Bonilla et.al. 2404.06128 link
2024-04-09 Revising Densification in Gaussian Splatting Samuel Rota Bulò et.al. 2404.06109 null
2024-04-09 Hash3D: Training-free Acceleration for 3D Generation Xingyi Yang et.al. 2404.06091 link
2024-04-08 StylizedGS: Controllable Stylization for 3D Gaussian Splatting Dingxi Zhang et.al. 2404.05220 null
2024-04-06 Z-Splat: Z-Axis Gaussian Splatting for Camera-Sonar Fusion Ziyuan Qu et.al. 2404.04687 link
2024-04-05 Robust Gaussian Splatting François Darmon et.al. 2404.04211 null
2024-04-04 Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting Jeongmin Bae et.al. 2404.03613 null
2024-04-08 OmniGS: Omnidirectional Gaussian Splatting for Fast Radiance Field Reconstruction using Omnidirectional Images Longwei Li et.al. 2404.03202 link
2024-04-03 TCLC-GS: Tightly Coupled LiDAR-Camera Gaussian Splatting for Surrounding Autonomous Driving Scenes Cheng Zhao et.al. 2404.02410 null
2024-04-01 Mirror-3DGS: Incorporating Mirror Reflections into 3D Gaussian Splatting Jiarui Meng et.al. 2404.01168 null
2024-04-07 CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians Yang Liu et.al. 2404.01133 link
2024-04-01 MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements Lisong C. Sun et.al. 2404.00923 null
2024-03-30 3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting Xiaoyang Lyu et.al. 2404.00409 null
2024-03-29 InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds Zhiwen Fan et.al. 2403.20309 link
2024-03-29 Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces Mauro Comi et.al. 2403.20275 null
2024-03-29 HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes Ke Wu et.al. 2403.20159 null
2024-03-29 SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior Zhongrui Yu et.al. 2403.20079 null
2024-03-29 HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes Zhuopeng Li et.al. 2403.20032 null
2024-03-28 GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling Bowen Zhang et.al. 2403.19655 null
2024-03-28 GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond Chongjie Ye et.al. 2403.19632 link
2024-03-28 CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians Avinash Paliwal et.al. 2403.19495 link
2024-03-29 Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction Qiuhong Shen et.al. 2403.18795 link
2024-03-26 Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians Kerui Ren et.al. 2403.17898 link
2024-03-26 2D Gaussian Splatting for Geometrically Accurate Radiance Fields Binbin Huang et.al. 2403.17888 link
2024-03-26 DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing Matias Turkulainen et.al. 2403.17822 link
2024-03-25 GSDF: 3DGS Meets SDF for Improved Rendering and Reconstruction Mulin Yu et.al. 2403.16964 null
2024-03-23 Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections Dongbin Zhang et.al. 2403.15704 null
2024-03-22 Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting Jun Guo et.al. 2403.15624 null
2024-03-22 Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting Zheng Zhang et.al. 2403.15530 null
2024-03-22 STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians Yifei Zeng et.al. 2403.14939 null
2024-03-21 MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images Yuedong Chen et.al. 2403.14627 link
2024-03-21 Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering Antoine Guédon et.al. 2403.14554 null
2024-03-21 HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression Yihang Chen et.al. 2403.14530 link
2024-03-21 Isotropic Gaussian Splatting for Real-Time Radiance Field Rendering Yuanhao Gong et.al. 2403.14244 null
2024-03-19 GVGEN: Text-to-3D Generation with Volumetric Representation Xianglong He et.al. 2403.12957 null
2024-03-19 HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting Hongyu Zhou et.al. 2403.12722 null
2024-03-22 RGBD GS-ICP SLAM Seongbo Ha et.al. 2403.12550 link
2024-03-19 High-Fidelity SLAM Using Gaussian Splatting with Rendering-Guided Densification and Regularized Optimization Shuo Sun et.al. 2403.12535 link
2024-03-20 View-Consistent 3D Editing with Gaussian Splatting Yuxuan Wang et.al. 2403.11868 null
2024-03-19 BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting Lingzhe Zhao et.al. 2403.11831 link
2024-03-18 NEDS-SLAM: A Novel Neural Explicit Dense Semantic SLAM Framework using 3D Gaussian Splatting Yiming Ji et.al. 2403.11679 null
2024-03-20 GaussNav: Gaussian Splatting for Visual Navigation Xiaohan Lei et.al. 2403.11625 link
2024-03-18 3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration Quentin Herau et.al. 2403.11577 null
2024-03-18 Fed3DGS: Scalable 3D Gaussian Splatting with Federated Learning Teppei Suzuki et.al. 2403.11460 link
2024-03-18 Bridging 3D Gaussian and Mesh for Freeview Video Rendering Yuting Xiao et.al. 2403.11453 null
2024-03-18 Motion-aware 3D Gaussian Splatting for Efficient Dynamic Scene Reconstruction Zhiyang Guo et.al. 2403.11447 null
2024-03-18 BAGS: Building Animatable Gaussian Splatting from a Monocular Video with Diffusion Priors Tingyang Zhang et.al. 2403.11427 null
2024-03-18 Beyond Uncertainty: Risk-Aware Active View Acquisition for Safe Robot Navigation and 3D Scene Understanding with FisherRF Guangyi Liu et.al. 2403.11396 null
2024-03-17 3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization Peng Jiang et.al. 2403.11367 null
2024-03-17 Compact 3D Gaussian Splatting For Dense Visual SLAM Tianchen Deng et.al. 2403.11247 link
2024-03-15 SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians Hiba Dahmani et.al. 2403.10427 null
2024-03-15 GGRt: Towards Generalizable 3D Gaussians without Pose Priors in Real-Time Hao Li et.al. 2403.10147 null
2024-03-15 Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing Tian-Xing Xu et.al. 2403.10050 null
2024-03-14 Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting Aiden Swann et.al. 2403.09875 null
2024-03-14 GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping Yuhang Zheng et.al. 2403.09637 link
2024-03-14 Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting Jaewoo Jung et.al. 2403.09413 link
2024-03-14 A New Split Algorithm for 3D Gaussian Splatting Qiyuan Feng et.al. 2403.09143 null
2024-03-14 GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing Jing Wu et.al. 2403.08733 link
2024-03-13 Gaussian Splatting in Style Abhishek Saroha et.al. 2403.08498 null
2024-03-12 StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting Kunhao Liu et.al. 2403.07807 null
2024-03-13 DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization Jiahe Li et.al. 2403.06912 link
2024-03-11 FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization Jiahui Zhang et.al. 2403.06908 null
2024-03-07 Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis Yuanhao Cai et.al. 2403.04116 link
2024-02-29 3D Gaussian Model for Animation and Texturing Xiangzhi Eric Wang et.al. 2402.19441 null
2024-02-27 VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction Jiaqi Lin et.al. 2402.17427 null
2024-02-24 Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting Ziyi Yang et.al. 2402.15870 null
2024-02-22 GaussianPro: 3D Gaussian Splatting with Progressive Propagation Kai Cheng et.al. 2402.14650 null
2024-02-21 Identifying Unnecessary 3D Gaussians using Clustering for Fast Rendering of 3D Gaussian Splatting Joongho Jo et.al. 2402.13827 null
2024-02-20 How NeRFs and 3D Gaussian Splatting are Reshaping SLAM: a Survey Fabio Tosi et.al. 2402.13255 link
2024-02-15 GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering Abdullah Hamdi et.al. 2402.10128 link
2024-02-11 3D Gaussian as a New Vision Era: A Survey Ben Fei et.al. 2402.07181 null
2024-02-13 GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data Haoyuan Li et.al. 2402.06198 null
2024-02-09 HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting Zhenglin Zhou et.al. 2402.06149 link
2024-02-06 Rig3DGS: Creating Controllable Portraits from Casual Monocular Videos Alfredo Rivero et.al. 2402.03723 null
2024-02-07 4D Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes Yuanxing Duan et.al. 2402.03307 link
2024-02-01 360-GS: Layout-guided Panoramic Gaussian Splatting For Indoor Roaming Jiayang Bai et.al. 2402.00763 null

Text-to-Video

Publish Date Title Authors PDF Code
2025-06-26 SmoothSinger: A Conditional Diffusion Model for Singing Voice Synthesis with Multi-Resolution Architecture Kehan Sui et.al. 2506.21478 null
2025-06-26 ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models Hongbo Liu et.al. 2506.21356 null
2025-06-26 HieraSurg: Hierarchy-Aware Diffusion Model for Surgical Video Generation Diego Biagini et.al. 2506.21287 null
2025-06-26 Video Virtual Try-on with Conditional Diffusion Transformer Inpainter Cheng Zou et.al. 2506.21270 null
2025-06-26 DFVEdit: Conditional Delta Flow Vector for Zero-shot Video Editing Lingling Cai et.al. 2506.20967 null
2025-06-26 Consistent Zero-shot 3D Texture Synthesis Using Geometry-aware Diffusion and Temporal Video Models Donggoo Kang et.al. 2506.20946 null
2025-06-25 Video Perception Models for 3D Scene Synthesis Rui Huang et.al. 2506.20601 null
2025-06-25 BrokenVideos: A Benchmark Dataset for Fine-Grained Artifact Localization in AI-Generated Videos Jiahao Lin et.al. 2506.20103 null
2025-06-24 Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation Xingyang Li et.al. 2506.19852 null
2025-06-24 GenHSI: Controllable Generation of Human-Scene Interaction Videos Zekun Li et.al. 2506.19840 null
2025-06-24 SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution Liangbin Xie et.al. 2506.19838 null
2025-06-24 Bind-Your-Avatar: Multi-Talking-Character Video Generation with Dynamic 3D-mask-based Embedding Router Yubo Huang et.al. 2506.19833 null
2025-06-24 Training-Free Motion Customization for Distilled Video Generators with Adaptive Test-Time Distillation Jintao Rong et.al. 2506.19348 null
2025-06-23 VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory Runjia Li et.al. 2506.18903 null
2025-06-23 From Virtual Games to Real-World Play Wenqiang Sun et.al. 2506.18901 null
2025-06-23 FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation Kaiyi Huang et.al. 2506.18899 null
2025-06-23 MinD: Unified Visual Imagination and Control via Hierarchical World Models Xiaowei Chi et.al. 2506.18897 null
2025-06-23 OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation Qijun Gan et.al. 2506.18866 null
2025-06-23 Phantom-Data: Towards a General Subject-Consistent Video Generation Dataset Zhuowei Chen et.al. 2506.18851 null
2025-06-23 Matrix-Game: Interactive World Foundation Model Yifan Zhang et.al. 2506.18701 null
2025-06-23 RDPO: Real Data Preference Optimization for Physics Consistency Video Generation Wenxu Qian et.al. 2506.18655 null
2025-06-23 BulletGen: Improving 4D Reconstruction with Bullet-Time Generation Denys Rozumnyi et.al. 2506.18601 null
2025-06-23 VQ-Insight: Teaching VLMs for AI-Generated Video Quality Understanding via Progressive Visual Reinforcement Learning Xuanyu Zhang et.al. 2506.18564 null
2025-06-23 Emergent Temporal Correspondences from Video Diffusion Transformers Jisu Nam et.al. 2506.17220 link
2025-06-20 Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition Jiaqi Li et.al. 2506.17201 null
2025-06-20 Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation Riccardo Corvi et.al. 2506.16802 null
2025-06-19 VideoGAN-based Trajectory Proposal for Automated Vehicles Annajoyce Mariani et.al. 2506.16209 null
2025-06-19 FastInit: Fast Noise Initialization for Temporally Consistent Video Generation Chengyu Bai et.al. 2506.16119 null
2025-06-19 PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models Tianchen Zhao et.al. 2506.16054 null
2025-06-19 Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization Cong Wang et.al. 2506.15980 null
2025-06-20 Sekai: A Video Dataset towards World Exploration Zhen Li et.al. 2506.15675 null
2025-06-20 Show-o2: Improved Native Unified Multimodal Models Jinheng Xie et.al. 2506.15564 link
2025-06-17 Causally Steered Diffusion for Automated Video Counterfactual Generation Nikos Spyrou et.al. 2506.14404 null
2025-06-17 CausalDiffTab: Mixed-Type Causal-Aware Diffusion for Tabular Data Generation Jia-Chen Zhang et.al. 2506.14206 null
2025-06-18 VideoMAR: Autoregressive Video Generation with Continuous Tokens Hu Yu et.al. 2506.14168 null
2025-06-16 UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions Zhucun Xue et.al. 2506.13691 null
2025-06-16 STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation Jiamin Wang et.al. 2506.13138 null
2025-06-15 iDiT-HOI: Inpainting-based Hand Object Interaction Reenactment via Video Diffusion Transformer Zhelun Shen et.al. 2506.12847 null
2025-06-13 SignAligner: Harmonizing Complementary Pose Modalities for Coherent Sign Language Generation Xu Wang et.al. 2506.11621 null
2025-06-12 GenWorld: Towards Detecting AI-generated Real-world Simulation Videos Weiliang Chen et.al. 2506.10975 null
2025-06-12 M4V: Multi-Modal Mamba for Text-to-Video Generation Jiancheng Huang et.al. 2506.10915 null
2025-06-12 GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning Xiaoyi Bao et.al. 2506.10639 null
2025-06-12 DreamActor-H1: High-Fidelity Human-Product Demonstration Video Generation via Motion-designed Diffusion Transformers Lizhen Wang et.al. 2506.10568 null
2025-06-12 AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation Haoyuan Shi et.al. 2506.10540 null
2025-06-11 PlayerOne: Egocentric World Simulator Yuanpeng Tu et.al. 2506.09995 null
2025-06-11 InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions Zhenzhi Wang et.al. 2506.09984 null
2025-06-11 ReSim: Reliable World Simulation for Autonomous Driving Jiazhi Yang et.al. 2506.09981 null
2025-06-11 DGAE: Diffusion-Guided Autoencoder for Efficient Latent Representation Learning Dongxu Liu et.al. 2506.09644 null
2025-06-11 Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Shanchuan Lin et.al. 2506.09350 null
2025-06-10 Seedance 1.0: Exploring the Boundaries of Video Generation Models Yu Gao et.al. 2506.09113 null
2025-06-10 FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation Zheqi He et.al. 2506.09081 null
2025-06-10 MagCache: Fast Video Generation with Magnitude-Aware Cache Zehong Ma et.al. 2506.09045 link
2025-06-11 Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models Xuanchi Ren et.al. 2506.09042 link
2025-06-10 HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation Ziyao Huang et.al. 2506.08797 null
2025-06-10 How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models Huixuan Zhang et.al. 2506.08351 null
2025-06-09 Seeing Voices: Generating A-Roll Video from Audio with Mirage Aditi Sundararaman et.al. 2506.08279 null
2025-06-09 Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion Xun Huang et.al. 2506.08009 null
2025-06-09 Dreamland: Controllable World Creation with Simulator and Generative Models Sicheng Mo et.al. 2506.08006 null
2025-06-09 Audio-Sync Video Generation with Multi-Stream Temporal Control Shuchen Weng et.al. 2506.08003 null
2025-06-09 Generative Modeling of Weights: Generalization or Memorization? Boya Zeng et.al. 2506.07998 link
2025-06-09 Video Unlearning via Low-Rank Refusal Vector Simone Facchiano et.al. 2506.07891 null
2025-06-09 PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement Teng Hu et.al. 2506.07848 null
2025-06-09 Consistent Video Editing as Flow-Driven Image-to-Video Generation Ge Wang et.al. 2506.07713 null
2025-06-10 From Generation to Generalization: Emergent Few-Shot Learning in Video Diffusion Models Pablo Acuaviva et.al. 2506.07280 null
2025-06-08 TV-LiVE: Training-Free, Text-Guided Video Editing via Layer Informed Vitality Exploitation Min-Jung Kim et.al. 2506.07205 null
2025-06-08 Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models Sangwon Jang et.al. 2506.07177 null
2025-06-06 Restereo: Diffusion stereo video generation and restoration Xingchang Huang et.al. 2506.06023 null
2025-06-06 LLIA – Enabling Low-Latency Interactive Avatars: Real-Time Audio-Driven Portrait Video Generation with Diffusion Models Haojie Yu et.al. 2506.05806 null
2025-06-05 EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh Tao Hu et.al. 2506.05554 null
2025-06-05 ContentV: Efficient Training of Video Generation Models with Limited Compute Wenfeng Lin et.al. 2506.05343 null
2025-06-09 Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers Haosong Liu et.al. 2506.05096 null
2025-06-05 FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation Huihan Wang et.al. 2506.04956 null
2025-06-05 DualX-VSR: Dual Axial Spatial $\times$ Temporal Transformer for Real-World Video Super-Resolution without Motion Compensation Shuo Cao et.al. 2506.04830 null
2025-06-06 FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion Akide Liu et.al. 2506.04648 null
2025-06-05 Follow-Your-Creation: Empowering 4D Creation through Video Inpainting Yue Ma et.al. 2506.04590 null
2025-06-04 LayerFlow: A Unified Model for Layer-aware Video Generation Sihui Ji et.al. 2506.04228 null
2025-06-04 UNIC: Unified In-Context Video Editing Zixuan Ye et.al. 2506.04216 null
2025-06-05 FullDiT2: Efficient In-Context Conditioning for Video Diffusion Transformers Xuanhua He et.al. 2506.04213 null
2025-06-04 DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models Ziyi Wu et.al. 2506.03517 null
2025-06-03 Chipmunk: Training-Free Acceleration of Diffusion Transformers with Dynamic Column-Sparse Deltas Austin Silveria et.al. 2506.03275 null
2025-06-03 IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation Yuanze Lin et.al. 2506.03150 null
2025-06-03 Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval Jiwen Yu et.al. 2506.03141 null
2025-06-03 CamCloneMaster: Enabling Reference-based Camera Control for Video Generation Yawen Luo et.al. 2506.03140 null
2025-06-03 AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Lu Qiu et.al. 2506.03126 null
2025-06-03 DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation Zhengyao Lv et.al. 2506.03123 null
2025-06-03 TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models Chetwin Low et.al. 2506.03099 null
2025-06-03 ORV: 4D Occupancy-centric Robot Video Generation Xiuyu Yang et.al. 2506.03079 link
2025-06-03 Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers Pengtao Chen et.al. 2506.03065 null
2025-06-03 LinkTo-Anime: A 2D Animation Optical Flow Dataset from 3D Model Rendering Xiaoyi Feng et.al. 2506.02733 null
2025-06-03 LumosFlow: Motion-Guided Long Video Generation Jiahao Chen et.al. 2506.02497 null
2025-05-30 MiniMax-Remover: Taming Bad Noise Helps Video Object Removal Bojia Zi et.al. 2505.24873 null
2025-05-30 DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds Jiaxu Zhang et.al. 2505.24733 null
2025-05-30 UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation Yang-Tian Sun et.al. 2505.24521 null
2025-05-30 Interactive Video Generation via Domain Adaptation Ishaan Rawal et.al. 2505.24253 null
2025-05-30 STORK: Improving the Fidelity of Mid-NFE Sampling for Diffusion and Flow Matching Models Zheng Tan et.al. 2505.24210 link
2025-05-29 MAGREF: Masked Guidance for Any-Reference Video Generation Yufan Deng et.al. 2505.23742 link
2025-05-29 VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Tingyu Song et.al. 2505.23693 link
2025-05-29 VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Xiangdong Zhang et.al. 2505.23656 link
2025-05-29 VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation Shi-Xue Zhang et.al. 2505.23484 link
2025-05-29 Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis Hengyuan Cao et.al. 2505.23325 null
2025-05-29 RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer Liu Liu et.al. 2505.23171 null
2025-05-29 Zero-to-Hero: Zero-Shot Initialization Empowering Reference-Based Video Appearance Editing Tongtong Su et.al. 2505.23134 link
2025-05-29 MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation Siyuan Wang et.al. 2505.23120 link
2025-05-29 GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion Gwanghyun Kim et.al. 2505.23085 null
2025-05-29 MOVi: Training-free Text-conditioned Multi-Object Video Generation Aimon Rahman et.al. 2505.22980 null
2025-05-28 Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation Zhe Kong et.al. 2505.22647 link
2025-05-28 Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers Weilun Feng et.al. 2505.22167 null
2025-05-28 FaceEditTalker: Interactive Talking Head Generation with Facial Attribute Editing Guanwen Feng et.al. 2505.22141 null
2025-05-28 LatentMove: Towards Complex Human Movement Video Generation Ashkan Taghipour et.al. 2505.22046 null
2025-05-28 PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms Yifei Xia et.al. 2505.22016 null
2025-05-28 Learning World Models for Interactive Video Generation Taiye Chen et.al. 2505.21996 null
2025-05-27 HDRSDR-VQA: A Subjective Video Quality Dataset for HDR and SDR Comparative Evaluation Bowen Chen et.al. 2505.21831 null
2025-05-27 Think Before You Diffuse: LLMs-Guided Physics-Aware Video Generation Ke Zhang et.al. 2505.21653 null
2025-05-27 VideoMarkBench: Benchmarking Robustness of Video Watermarking Zhengyuan Jiang et.al. 2505.21620 link
2025-05-27 Frame In-N-Out: Unbounded Controllable Image-to-Video Generation Boyang Wang et.al. 2505.21491 null
2025-05-27 Dynamic Vision from EEG Brain Recordings: How much does EEG know? Prajwal Singh et.al. 2505.21385 null
2025-05-28 SageAttention2++: A More Efficient Implementation of SageAttention2 Jintao Zhang et.al. 2505.21136 link
2025-05-27 Minute-Long Videos with Dual Parallelisms Zeqing Wang et.al. 2505.21070 link
2025-05-27 RainFusion: Adaptive Video Generation Acceleration via Multi-Dimensional Visual Redundancy Aiyue Chen et.al. 2505.21036 null
2025-05-27 Frame-Level Captions for Long Video Generation with Complex Multi Scenes Guangcong Zheng et.al. 2505.20827 null
2025-05-27 Learning Generalizable Robot Policy with Human Demonstration Video as a Prompt Xiang Zhu et.al. 2505.20795 null
2025-05-27 Photography Perspective Composition: Towards Aesthetic Perspective Recommendation Lujian Yao et.al. 2505.20655 null
2025-05-27 Incorporating Flexible Image Conditioning into Text-to-Video Diffusion Models without Training Bolin Lai et.al. 2505.20629 null
2025-05-28 OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation Shenghai Yuan et.al. 2505.20292 link
2025-05-27 Dynamic-I2V: Exploring Image-to-Video Generation Models via Multimodal LLM Peng Liu et.al. 2505.19901 null
2025-05-26 DriveCamSim: Generalizable Camera Simulation via Explicit Camera Modeling for Autonomous Driving Wenchao Sun et.al. 2505.19692 link
2025-05-26 TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs Juntong Wang et.al. 2505.19535 null
2025-05-26 The Role of Video Generation in Enhancing Data-Limited Action Understanding Wei Li et.al. 2505.19495 null
2025-05-26 Force Prompting: Video Generation Models Can Learn and Generalize Physics-based Control Signals Nate Gillman et.al. 2505.19386 null
2025-05-25 From Single Images to Motion Policies via Video-Generation Environment Representations Weiming Zhi et.al. 2505.19306 null
2025-05-25 SRDiffusion: Accelerate Video Diffusion Inference via Sketching-Rendering Cooperation Shenggan Cheng et.al. 2505.19151 null
2025-05-25 WorldEval: World Model as Real-World Robot Policies Evaluator Yaxuan Li et.al. 2505.19017 null
2025-05-24 Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation Shuo Yang et.al. 2505.18875 null
2025-05-24 VORTA: Efficient Video Diffusion via Routing Sparse Attention Wenhao Sun et.al. 2505.18809 link
2025-05-23 WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions Zizhang Li et.al. 2505.18151 null
2025-05-23 DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation Junhao Chen et.al. 2505.18078 null
2025-05-23 SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain Jiawei Zhou et.al. 2505.17727 null
2025-05-23 Scaling Image and Video Generation via Test-Time Evolutionary Search Haoran He et.al. 2505.17618 null
2025-05-23 InfLVG: Reinforce Inference-Time Consistent Long Video Generation with GRPO Xueji Fang et.al. 2505.17574 link
2025-05-22 Training-Free Efficient Video Generation via Dynamic Token Carving Yuechen Zhang et.al. 2505.16864 link
2025-05-22 Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts Taewon Kang et.al. 2505.16819 null
2025-05-22 MAGIC: Motion-Aware Generative Inference via Confidence-Guided LLM Siwei Meng et.al. 2505.16456 null
2025-05-23 Challenger: Affordable Adversarial Driving Video Generation Zhiyuan Xu et.al. 2505.15880 null
2025-05-21 Generative AI for Autonomous Driving: A Review Katharina Winter et.al. 2505.15863 null
2025-05-25 Interspatial Attention for Efficient 4D Human Video Generation Ruizhi Shao et.al. 2505.15800 null
2025-05-21 AvatarShield: Visual Reinforcement Learning for Human-Centric Video Forgery Detection Zhipei Xu et.al. 2505.15173 null
2025-05-21 CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation Xinran Wang et.al. 2505.15145 link
2025-05-20 Programmatic Video Prediction Using Large Language Models Hao Tang et.al. 2505.14948 link
2025-05-20 Grouping First, Attending Smartly: Training-Free Acceleration for Diffusion Transformers Sucheng Ren et.al. 2505.14687 link
2025-05-20 LMP: Leveraging Motion Prior in Zero-Shot Video Generation with Diffusion Transformer Changgu Chen et.al. 2505.14167 null
2025-05-20 Hunyuan-Game: Industrial-grade Intelligent Game Creation Model Ruihuang Li et.al. 2505.14135 null
2025-05-19 FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance Dian Shao et.al. 2505.13437 null
2025-05-19 MAGI-1: Autoregressive Video Generation at Scale Sand. ai et.al. 2505.13211 link
2025-05-19 DreamGen: Unlocking Generalization in Robot Learning through Neural Trajectories Joel Jang et.al. 2505.12705 link
2025-05-19 Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking Zihan Su et.al. 2505.12667 null
2025-05-19 BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation Haiquan Wen et.al. 2505.12620 link
2025-05-18 Video-GPT via Next Clip Diffusion Shaobin Zhuang et.al. 2505.12489 null
2025-05-17 LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation Jiarui Wang et.al. 2505.12098 link
2025-05-17 VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption Tianxiong Zhong et.al. 2505.12053 null
2025-05-16 QVGen: Pushing the Limit of Quantized Video Generative Models Yushi Huang et.al. 2505.11497 null
2025-05-16 Face Consistency Benchmark for GenAI Video Michal Podstawski et.al. 2505.11425 null
2025-05-14 Aquarius: A Family of Industry-Level Video Generation Models for Marketing Scenarios Huafeng Shi et.al. 2505.10584 null
2025-05-16 MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation Yanbo Ding et.al. 2505.10238 link
2025-05-15 ToonifyGB: StyleGAN-based Gaussian Blendshapes for 3D Stylized Head Avatars Rui-Yang Ju et.al. 2505.10072 null
2025-05-18 EWMBench: Evaluating Scene, Motion, and Semantic Quality in Embodied World Models Hu Yue et.al. 2505.09694 link
2025-05-15 Generating time-consistent dynamics with discriminator-guided image diffusion models Philipp Hess et.al. 2505.09089 null
2025-05-13 Generative AI for Autonomous Driving: Frontiers and Opportunities Yuping Wang et.al. 2505.08854 link
2025-05-13 Symbolically-Guided Visual Plan Inference from Uncurated Video Data Wenyan Yang et.al. 2505.08444 null
2025-05-12 DanceGRPO: Unleashing GRPO on Visual Generation Zeyue Xue et.al. 2505.07818 null
2025-05-12 ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models Ozgur Kara et.al. 2505.07652 null
2025-05-16 Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model Wei Li et.al. 2505.07449 link
2025-05-15 Generative Pre-trained Autoregressive Diffusion Transformer Yuan Zhang et.al. 2505.07344 null
2025-05-11 DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models Junhao Xia et.al. 2505.07057 null
2025-05-11 BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation Panwen Hu et.al. 2505.06985 null
2025-05-10 Jailbreaking the Text-to-Video Generative Models Jiayang Liu et.al. 2505.06679 null
2025-05-10 ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images Xianghao Kong et.al. 2505.06537 null
2025-05-08 T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models Xuyang Guo et.al. 2505.04946 null
2025-05-08 HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Teng Hu et.al. 2505.04512 null
2025-05-06 Real-Time Person Image Synthesis Using a Flow Matching Model Jiwoo Jeong et.al. 2505.03562 link
2025-05-06 Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights Zhaiming Shen et.al. 2505.03205 null
2025-05-04 DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization Wenchuan Wang et.al. 2505.02192 null
2025-05-03 PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth Bu Jin et.al. 2505.01729 null
2025-05-02 VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations for Synthetic Videos Zongxia Li et.al. 2505.01481 link
2025-05-02 FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis Jiangtong Tan et.al. 2505.01172 link
2025-05-01 Controllable Weather Synthesis and Removal with Video Diffusion Models Chih-Hao Lin et.al. 2505.00704 null
2025-05-01 T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation Xuyang Guo et.al. 2505.00337 null
2025-04-30 Direct Motion Models for Assessing Generated Videos Kelsey Allen et.al. 2505.00209 null
2025-04-30 Eye2Eye: A Simple Approach for Monocular-to-Stereo Video Synthesis Michal Geyer et.al. 2505.00135 null
2025-04-30 ReVision: High-Quality, Low-Cost Video Generation with Explicit 3D Physics Modeling for Complex Motion and Interaction Qihao Liu et.al. 2504.21855 null
2025-04-30 HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation Haiyang Zhou et.al. 2504.21650 link
2025-04-30 Simple Visual Artifact Detection in Sora-Generated Videos Misora Sugiyama et.al. 2504.21334 null
2025-04-30 Capturing Conditional Dependence via Auto-regressive Diffusion Models Xunpeng Huang et.al. 2504.21314 null
2025-04-29 TesserAct: Learning 4D Embodied World Models Haoyu Zhen et.al. 2504.20995 null
2025-04-29 DDPS: Discrete Diffusion Posterior Sampling for Paths in Layered Graphs Hao Luan et.al. 2504.20754 null
2025-04-29 Advance Fake Video Detection via Vision Transformers Joy Battocchio et.al. 2504.20669 null
2025-04-28 DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer Junpeng Jiang et.al. 2504.19614 null
2025-04-26 Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning Yifan Xie et.al. 2504.18810 null
2025-04-26 Stealing Creator’s Workflow: A Creator-Inspired Agentic Framework with Iterative Feedback Loop for Improved Scientific Short-form Generation Jong Inn Park et.al. 2504.18805 null
2025-04-25 NoiseController: Towards Consistent Multi-view Video Generation via Noise Decomposition and Collaboration Haotian Dong et.al. 2504.18448 null
2025-04-23 Subject-driven Video Generation via Disentangled Identity and Motion Daneul Kim et.al. 2504.17816 null
2025-04-24 Dynamic Camera Poses and Where to Find Them Chris Rockwell et.al. 2504.17788 null
2025-04-24 MV-Crafter: An Intelligent System for Music-guided Video Generation Chuer Chen et.al. 2504.17267 null
2025-04-24 DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks Yinqi Li et.al. 2504.17253 link
2025-04-25 We’ll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback Minkyu Choi et.al. 2504.17180 null
2025-04-23 BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation Ruotong Wang et.al. 2504.16907 null
2025-04-23 ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance Ying Li et.al. 2504.16464 null
2025-04-23 VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models Xuming Hu et.al. 2504.16359 null
2025-04-22 Survey of Video Diffusion Models: Foundations, Implementations, and Applications Yimu Wang et.al. 2504.16081 link
2025-04-22 Efficient Temporal Consistency in Diffusion-Based Video Editing with Adaptor Modules: A Theoretical Framework Xinyuan Song et.al. 2504.16016 null
2025-04-22 Reasoning Physical Video Generation with Diffusion Timestep Tokens via Reinforcement Learning Wang Lin et.al. 2504.15932 null
2025-04-22 Satellite to GroundScape – Large-scale Consistent Ground View Generation from Satellite Views Ningli Xu et.al. 2504.15786 null
2025-04-22 DiTPainter: Efficient Video Inpainting with Diffusion Transformers Xian Wu et.al. 2504.15661 null
2025-04-21 Solving New Tasks by Adapting Internet Video Knowledge Calvin Luo et.al. 2504.15369 null
2025-04-21 Tiger200K: Manually Curated High Visual Quality Video Dataset from UGC Platform Xianpan Zhou et.al. 2504.15182 null
2025-04-21 DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation Weijie He et.al. 2504.15032 null
2025-04-21 Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation Chenjie Cao et.al. 2504.14899 link
2025-04-20 Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis Jingjing Ren et.al. 2504.14470 null
2025-04-19 SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Minho Park et.al. 2504.14396 link
2025-04-21 SkyReels-V2: Infinite-length Film Generative Model Guibin Chen et.al. 2504.13074 link
2025-04-21 Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Lvmin Zhang et.al. 2504.12626 link
2025-04-16 VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate Zhihang Yuan et.al. 2504.12259 link
2025-04-16 Modular-Cam: Modular Dynamic Camera-view Video Generation with LLM Zirui Pan et.al. 2504.12048 null
2025-04-16 The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation Bingjie Gao et.al. 2504.11739 null
2025-04-17 VideoPanda: Video Panoramic Diffusion with Multi-view Attention Kevin Xie et.al. 2504.11389 null
2025-04-15 InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation Yukang Lin et.al. 2504.10905 null
2025-04-15 OmniVDiff: Omni Controllable Video Diffusion for Generation and Understanding Dianbing Xi et.al. 2504.10825 null
2025-04-14 H-MoRe: Learning Human-centric Motion Representation for Action Analysis Zhanbo Huang et.al. 2504.10676 link
2025-04-14 H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models Yushu Wu et.al. 2504.10567 null
2025-04-14 FingER: Content Aware Fine-grained Evaluation with Reasoning for AI-Generated Videos Rui Chen et.al. 2504.10358 null
2025-04-14 Aligning Anime Video Generation with Human Feedback Bingwen Zhu et.al. 2504.10044 null
2025-04-14 EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise Chao Liu et.al. 2504.09789 null
2025-04-13 CamMimic: Zero-Shot Image To Camera Motion Personalized Video Generation Using Diffusion Models Pooja Guhan et.al. 2504.09472 null
2025-04-11 Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Team Seawead et.al. 2504.08685 null
2025-04-11 Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization Jialu Li et.al. 2504.08641 null
2025-04-11 Diffusion Models for Robotic Manipulation: A Survey Rosa Wolf et.al. 2504.08438 null
2025-04-11 EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model Renda Li et.al. 2504.08344 null
2025-04-11 RealCam-Vid: High-resolution Video Dataset with Dynamic Scenes and Metric-scale Camera Movements Guangcong Zheng et.al. 2504.08212 link
2025-04-11 TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation Ruineng Li et.al. 2504.08181 null
2025-04-10 Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction Zeren Jiang et.al. 2504.07961 link
2025-04-10 Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos Rundong Luo et.al. 2504.07940 null
2025-04-10 Diffusion Transformers for Tabular Data Time Series Generation Fabrizio Garuti et.al. 2504.07566 link
2025-04-09 EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation Diljeet Jagpal et.al. 2504.06861 null
2025-04-09 DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation Wangbo Zhao et.al. 2504.06803 link
2025-04-09 RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism Elia Peruzzo et.al. 2504.06672 null
2025-04-09 Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception Ruotian Peng et.al. 2504.06666 null
2025-04-08 CamContextI2V: Context-aware Controllable Video Generation Luis Denninger et.al. 2504.06022 link
2025-04-07 One-Minute Video Generation with Test-Time Training Karan Dalal et.al. 2504.05298 null
2025-04-07 Video-Bench: Human-Aligned Video Generation Benchmark Hui Han et.al. 2504.04907 null
2025-04-05 Video4DGen: Enhancing Video and 4D Generation through Mutual Optimization Yikai Wang et.al. 2504.04153 link
2025-04-05 Multi-identity Human Image Animation with Structural Video Diffusion Zhenzhi Wang et.al. 2504.04126 null
2025-04-05 Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models Xuyang Guo et.al. 2504.04051 null
2025-04-05 DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion Maksim Siniukov et.al. 2504.04010 null
2025-04-04 Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models Xuran Ma et.al. 2504.03140 link
2025-04-03 How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models Pascal Chang et.al. 2504.03072 null
2025-04-03 Morpheus: Benchmarking Physical Reasoning of Video Generative Models with Real Physical Experiments Chenyu Zhang et.al. 2504.02918 null
2025-04-03 Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets Chuning Zhu et.al. 2504.02792 null
2025-04-03 Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model Shengjun Zhang et.al. 2504.02764 null
2025-04-04 Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation Fa-Ting Hong et.al. 2504.02542 link
2025-04-03 ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer Jiayi Gao et.al. 2504.02451 link
2025-04-03 SkyReels-A2: Compose Anything in Video Diffusion Transformers Zhengcong Fei et.al. 2504.02436 link
2025-04-04 MG-Gen: Single Image to Motion Graphics Generation with Layer Decomposition Takahiro Shirakawa et.al. 2504.02361 null
2025-04-03 OmniCam: Unified Multimodal Video Generation via Camera Control Xiaoda Yang et.al. 2504.02312 null
2025-04-02 WorldPrompter: Traversable Text-to-Scene Generation Zhaoyang Zhang et.al. 2504.02045 null
2025-04-03 VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step Hanyang Wang et.al. 2504.01956 null
2025-04-01 WorldScore: A Unified Evaluation Benchmark for World Generation Haoyi Duan et.al. 2504.00983 null
2025-04-01 DecoFuse: Decomposing and Fusing the “What”, “Where”, and “How” for Brain-Inspired fMRI-to-Video Decoding Chong Li et.al. 2504.00432 null
2025-03-31 GazeLLM: Multimodal LLMs incorporating Human Visual Attention Jun Rekimoto et.al. 2504.00221 null
2025-03-31 Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Shengqiong Wu et.al. 2503.24379 null
2025-04-01 HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation Boyuan Wang et.al. 2503.24026 null
2025-03-31 JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation Fangda Chen et.al. 2503.23951 null
2025-04-01 On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices Bosung Kim et.al. 2503.23796 link
2025-03-31 HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation Kun Liu et.al. 2503.23715 null
2025-03-30 VideoGen-Eval: Agent-based System for Video Generation Evaluation Yuhang Yang et.al. 2503.23452 link
2025-03-30 JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization Kai Liu et.al. 2503.23377 null
2025-04-02 Towards Physically Plausible Video Generation via VLM Planning Xindi Yang et.al. 2503.23368 null
2025-03-30 MoCha: Towards Movie-Grade Talking Character Synthesis Cong Wei et.al. 2503.23307 null
2025-03-30 SketchVideo: Sketch-based Video Generation and Editing Feng-Lin Liu et.al. 2503.23284 null
2025-03-28 Zero4D: Training-Free 4D Video Generation From Single Video Using Off-the-Shelf Video Diffusion Model Jangho Park et.al. 2503.22622 null
2025-03-28 EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation Hadrien Reynaud et.al. 2503.22357 null
2025-03-28 CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving Yishen Ji et.al. 2503.22231 null
2025-03-27 VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models Chi-Pin Huang et.al. 2503.21781 null
2025-03-27 Exploring the Evolution of Physics Cognition in Video Generation: A Survey Minghui Lin et.al. 2503.21765 link
2025-03-27 VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness Dian Zheng et.al. 2503.21755 link
2025-03-27 Audio-driven Gesture Generation via Deviation Feature in the Latent Space Jiahui Chen et.al. 2503.21616 null
2025-03-27 ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model Jinwei Qi et.al. 2503.21144 null
2025-03-26 RecTable: Fast Modeling Tabular Data with Rectified Flow Masane Fuchi et.al. 2503.20731 link
2025-03-26 AccidentSim: Generating Physically Realistic Vehicle Collision Videos from Real-World Accident Reports Xiangwen Zhang et.al. 2503.20654 null
2025-03-26 GAIA-2: A Controllable Multi-View Generative World Model for Autonomous Driving Lloyd Russell et.al. 2503.20523 null
2025-03-26 VPO: Aligning Text-to-Video Generation Models with Prompt Optimization Jiale Cheng et.al. 2503.20491 link
2025-03-26 Wan: Open and Advanced Large-Scale Video Generative Models WanTeam et.al. 2503.20314 link
2025-03-26 Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models Prin Phunyaphibarn et.al. 2503.20240 null
2025-03-26 Video Motion Graphs Haiyang Liu et.al. 2503.20218 null
2025-03-25 Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors Yuke Lou et.al. 2503.20118 null
2025-03-25 Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals Stefan Stojanov et.al. 2503.19953 null
2025-03-25 FullDiT: Multi-Task Video Generative Foundation Model with Full Attention Xuan Ju et.al. 2503.19907 null
2025-03-25 Mask$^2$DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation Tianhao Qi et.al. 2503.19881 null
2025-03-25 AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers Jiazhi Guan et.al. 2503.19824 null
2025-03-25 AccVideo: Accelerating Video Diffusion Model with Synthetic Dataset Haiyu Zhang et.al. 2503.19462 null
2025-03-26 Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing Jaihoon Kim et.al. 2503.19385 null
2025-03-25 MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation Yukang Lin et.al. 2503.19383 null
2025-03-26 EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models Yufei Cai et.al. 2503.19369 link
2025-03-25 Long-Context Autoregressive Video Modeling with Next-Frame Prediction Yuchao Gu et.al. 2503.19325 link
2025-03-25 Aether: Geometric-Aware Unified World Modeling Aether Team et.al. 2503.18945 null
2025-03-24 Video-T1: Test-Time Scaling for Video Generation Fangfu Liu et.al. 2503.18942 null
2025-03-24 Training-free Diffusion Acceleration with Bottleneck Sampling Ye Tian et.al. 2503.18940 null
2025-03-25 AMD-Hummingbird: Towards an Efficient Text-to-Video Model Takashi Isobe et.al. 2503.18559 link
2025-03-24 EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation Qiang Qu et.al. 2503.18552 null
2025-03-24 Can Text-to-Video Generation help Video-Language Alignment? Luca Zanella et.al. 2503.18507 null
2025-03-24 Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation Dingcheng Zhen et.al. 2503.18429 null
2025-03-24 Resource-Efficient Motion Control for Video Generation via Dynamic Mask Guidance Sicong Feng et.al. 2503.18386 null
2025-03-23 LongDiff: Training-Free Long Video Generation in One Go Zhuoling Li et.al. 2503.18150 null
2025-03-23 TransAnimate: Taming Layer Diffusion to Generate RGBA Video Xuewei Chen et.al. 2503.17934 null
2025-03-21 Position: Interactive Generative Video as Next-Generation Game Engine Jiwen Yu et.al. 2503.17359 null
2025-03-21 AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process Junjie Hu et.al. 2503.17029 null
2025-03-21 Enabling Versatile Controls for Video Diffusion Models Xu Zhang et.al. 2503.16983 link
2025-03-21 Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model Yingying Fan et.al. 2503.16942 null
2025-03-20 XAttention: Block Sparse Attention with Antidiagonal Scoring Ruyi Xu et.al. 2503.16428 link
2025-03-20 MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance Quanhao Li et.al. 2503.16421 null
2025-03-20 ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos Haolin Yang et.al. 2503.16400 null
2025-03-20 PoseTraj: Pose-Aware Trajectory Control in Video Diffusion Longbin Ji et.al. 2503.16068 null
2025-03-20 Animating the Uncaptured: Humanoid Mesh Animation with Video Diffusion Models Marc Benedí San Millán et.al. 2503.15996 null
2025-03-20 MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving Haiguang Wang et.al. 2503.15875 link
2025-03-20 VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling Hyojun Go et.al. 2503.15855 null
2025-03-19 Temporal Regularization Makes Your Video Generator Stronger Harold Haodong Chen et.al. 2503.15417 null
2025-03-20 VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention Mingzhe Zheng et.al. 2503.15138 null
2025-03-18 MusicInfuser: Making Video Diffusion Listen and Dance Susung Hong et.al. 2503.14505 null
2025-03-18 MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation Hongyu Zhang et.al. 2503.14428 null
2025-03-18 Impossible Videos Zechen Bai et.al. 2503.14378 null
2025-03-18 LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models Yu Cheng et.al. 2503.14325 link
2025-03-18 Concat-ID: Towards Universal Identity-Preserving Video Synthesis Yong Zhong et.al. 2503.14151 null
2025-03-18 Fast Autoregressive Video Generation with Diagonal Decoding Yang Ye et.al. 2503.14070 null
2025-03-18 AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark Xinhao Xiang et.al. 2503.14064 link
2025-03-17 Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction Zheyuan Liu et.al. 2503.12953 null
2025-03-17 AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations Quang Trung Truong et.al. 2503.12828 null
2025-03-16 SPC-GS: Gaussian Splatting with Semantic-Prompt Consistency for Indoor Open-World Free-view Synthesis from Sparse Inputs Guibiao Liao et.al. 2503.12535 null
2025-03-15 A Speech-to-Video Synthesis Approach Using Spatio-Temporal Diffusion for Vocal Tract MRI Paula Andrea Pérez-Toro et.al. 2503.12102 null
2025-03-15 SteerX: Creating Any Camera-Free 3D and 4D Scenes with Geometric Steering Byeongjun Park et.al. 2503.12024 link
2025-03-14 ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Jianhong Bai et.al. 2503.11647 null
2025-03-14 HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models Ziqin Zhou et.al. 2503.11513 null
2025-03-14 TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation Hongxiang Zhao et.al. 2503.11423 null
2025-03-14 Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model Haoyang Huang et.al. 2503.11251 link
2025-03-14 Cross-Modal Learning for Music-to-Music-Video Description Generation Zhuoyuan Mao et.al. 2503.11190 null
2025-03-13 CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models Hao He et.al. 2503.10592 null
2025-03-13 Long Context Tuning for Video Generation Yuwei Guo et.al. 2503.10589 null
2025-03-13 CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance Yufan Deng et.al. 2503.10391 null
2025-03-13 Semantic Latent Motion for Portrait Video Generation Qiyuan Zhang et.al. 2503.10096 null
2025-03-16 VMBench: A Benchmark for Perception-Aligned Video Motion Generation Xinran Ling et.al. 2503.10076 link
2025-03-13 UVE: Are MLLMs Unified Evaluators for AI-Generated Videos? Yuanxin Liu et.al. 2503.09949 link
2025-03-13 VideoMerge: Towards Training-free Long Video Generation Siyang Zhang et.al. 2503.09926 null
2025-03-12 LuciBot: Automated Robot Policy Learning from Generated Videos Xiaowen Qiu et.al. 2503.09871 null
2025-03-14 On the Limitations of Vision-Language Models in Understanding Image Transforms Ahmad Mustafa Anis et.al. 2503.09837 null
2025-03-12 I2V3D: Controllable image-to-video generation with 3D guidance Zhiyuan Zhang et.al. 2503.09733 null
2025-03-12 PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop Chenyu Li et.al. 2503.09595 link
2025-03-12 Unified Dense Prediction of Video Diffusion Lehan Yang et.al. 2503.09344 null
2025-03-12 Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latant Space Jian Zhu et.al. 2503.09215 null
2025-03-13 WonderVerse: Extendable 3D Scene Generation with Video Generative Models Hao Feng et.al. 2503.09160 null
2025-03-12 Reangle-A-Video: 4D Video Generation as Video-to-Video Translation Hyeonho Jeong et.al. 2503.09151 null
2025-03-11 REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder Yitian Zhang et.al. 2503.08665 null
2025-03-11 Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling Subin Kim et.al. 2503.08605 null
2025-03-12 $^R$FLAV: Rolling Flow matching for infinite Audio Video generation Alex Ergasti et.al. 2503.08307 link
2025-03-11 WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation Jing Wang et.al. 2503.08153 null
2025-03-11 ObjectMover: Generative Object Movement with Video Prior Xin Yu et.al. 2503.08037 null
2025-03-11 How Can Video Generative AI Transform K-12 Education? Examining Teachers’ Perspectives through TPACK and TAM Unggi Lee et.al. 2503.08003 null
2025-03-10 DreamRelation: Relation-Centric Video Customization Yujie Wei et.al. 2503.07602 null
2025-03-11 VACE: All-in-One Video Creation and Editing Zeyinzi Jiang et.al. 2503.07598 null
2025-03-10 AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion Mingzhen Sun et.al. 2503.07418 null
2025-03-10 Automated Movie Generation via Multi-Agent CoT Planning Weijia Wu et.al. 2503.07314 link
2025-03-09 VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation Hritik Bansal et.al. 2503.06800 null
2025-03-09 TR-DQ: Time-Rotation Diffusion Quantization Yihua Shao et.al. 2503.06564 null
2025-03-09 QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation Junyi Wu et.al. 2503.06545 link
2025-03-11 LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation Quanjian Song et.al. 2503.06508 link
2025-03-09 Generative Video Bi-flow Chen Liu et.al. 2503.06364 null
2025-03-08 Text2Story: Advancing Video Storytelling with Text Guidance Taewon Kang et.al. 2503.06310 null
2025-03-08 Object-Centric World Model for Language-Guided Manipulation Youngjoon Jeong et.al. 2503.06170 null
2025-03-08 VACT: A Video Automatic Causal Testing System and a Benchmark Haotong Yang et.al. 2503.06163 null
2025-03-07 MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio Xuenan Xu et.al. 2503.05242 link
2025-03-07 Unified Reward Model for Multimodal Understanding and Generation Yibin Wang et.al. 2503.05236 null
2025-03-06 Toward Lightweight and Fast Decoders for Diffusion Models in Image and Video Generation Alexey Buzovkin et.al. 2503.04871 link
2025-03-06 FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video Yue Gao et.al. 2503.04720 null
2025-03-06 What Are You Doing? A Closer Look at Controllable Human Video Generation Emanuele Bugliarello et.al. 2503.04666 null
2025-03-08 The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation Aoxiong Yin et.al. 2503.04606 link
2025-03-05 GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control Xuanchi Ren et.al. 2503.03751 link
2025-03-08 Rethinking Video Tokenization: A Conditioned Diffusion-based Approach Nianzu Yang et.al. 2503.03708 link
2025-03-05 DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance Zhao Yang et.al. 2503.03689 link
2025-03-05 High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights Yuna Kato et.al. 2503.03558 link
2025-03-05 Video Super-Resolution: All You Need is a Video Diffusion Model Zhihao Zhan et.al. 2503.03355 null
2025-03-04 GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning Zhun Mou et.al. 2503.02341 null
2025-03-03 VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation Wenhao Wang et.al. 2503.01739 link
2025-03-03 VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors Juil Koo et.al. 2503.01107 null
2025-03-02 Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think Jie Tian et.al. 2503.00948 link
2025-03-01 Learning to Animate Images from A Few Videos to Portray Delicate Human Actions Haoxin Li et.al. 2503.00276 null
2025-03-04 Unified Video Action Model Shuang Li et.al. 2503.00200 null
2025-02-28 Raccoon: Multi-stage Diffusion Training with Coarse-to-Fine Curating Videos Zhiyu Tan et.al. 2502.21314 null
2025-02-28 Training-free and Adaptive Sparse Attention for Efficient Long Video Generation Yifei Xia et.al. 2502.21079 null
2025-02-28 HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models Xiao Wang et.al. 2502.20811 null
2025-02-28 WorldModelBench: Judging Video Generation Models As World Models Dacheng Li et.al. 2502.20694 null
2025-02-27 Mobius: Text to Seamless Looping Video Generation via Latent Shift Xiuli Bi et.al. 2502.20307 link
2025-02-27 FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute Sotiris Anagnostidis et.al. 2502.20126 null
2025-02-27 C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation Yuhao Li et.al. 2502.19868 link
2025-02-26 FLAP: Fully-controllable Audio-driven Portrait Video Generation through 3D head conditioned diffusion mode Lingzhou Mu et.al. 2502.19455 null
2025-03-03 TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis Menghao Li et.al. 2502.19454 null
2025-02-25 SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference Jintao Zhang et.al. 2502.18137 link
2025-02-25 ASurvey: Spatiotemporal Consistency in Video Generation Zhiyu Yin et.al. 2502.17863 null
2025-02-24 X-Dancer: Expressive Music to Human Dance Video Generation Zeyuan Chen et.al. 2502.17414 null
2025-02-24 VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Xiangpeng Yang et.al. 2502.17258 null
2025-02-24 Diffusion Models for Tabular Data: Challenges, Current Progress, and Future Directions Zhong Li et.al. 2502.17119 link
2025-02-21 RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers Min Zhao et.al. 2502.15894 null
2025-02-21 VaViM and VaVAM: Autonomous Driving through Video Generative Modeling Florent Bartoccioni et.al. 2502.15672 link
2025-02-20 Hardware-Friendly Static Quantization Method for Video Diffusion Transformers Sanghyun Yi et.al. 2502.15077 null
2025-02-20 LAVID: An Agentic LVLM Framework for Diffusion-Generated Video Detection Qingyuan Liu et.al. 2502.14994 null
2025-02-20 Improving the Diffusability of Autoencoders Ivan Skorokhodov et.al. 2502.14831 null
2025-02-21 RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers Ke Cao et.al. 2502.14377 null
2025-02-20 Designing Parameter and Compute Efficient Diffusion Transformers using Distillation Vignesh Sundaresha et.al. 2502.14226 null
2025-02-19 FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation Yunpeng Zhang et.al. 2502.13995 link
2025-02-19 LLMPopcorn: An Empirical Study of LLMs as Assistants for Popular Micro-video Generation Junchen Fu et.al. 2502.12945 null
2025-02-18 VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation Xinlong Chen et.al. 2502.12782 link
2025-02-18 MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation Sihyun Yu et.al. 2502.12632 null
2025-02-17 LaM-SLidE: Latent Space Modeling of Spatial Dynamical Systems via Linked Entities Florian Sestak et.al. 2502.12128 link
2025-02-17 DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation Zhihang Yuan et.al. 2502.11897 link
2025-02-17 Object-Centric Image to Video Generation with Language Guidance Angel Villar-Corrales et.al. 2502.11655 null
2025-02-16 MaskFlow: Discrete Flows For Flexible and Efficient Long Video Generation Michael Fuest et.al. 2502.11234 null
2025-02-16 Phantom: Subject-consistent video generation via cross-modal alignment Lijie Liu et.al. 2502.11079 null
2025-02-17 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Guoqing Ma et.al. 2502.10248 link
2025-02-14 RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control Teng Li et.al. 2502.10059 null
2025-02-14 GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation Hongyin Zhang et.al. 2502.09268 null
2025-02-12 CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation Qinghe Wang et.al. 2502.08639 null
2025-02-12 FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis Wonjoon Jin et.al. 2502.08244 null
2025-02-12 Learning Human Skill Generators at Key-Step Levels Yilu Wu et.al. 2502.08234 null
2025-02-12 AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance Zhao Wang et.al. 2502.08189 null
2025-02-12 Next Block Prediction: Video Generation via Semi-Autoregressive Modeling Shuhuai Ren et.al. 2502.07737 null
2025-02-14 Magic 1-For-1: Generating One Minute Video Clips within One Minute Hongwei Yi et.al. 2502.07701 link
2025-02-12 VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation Sixiao Zheng et.al. 2502.07531 null
2025-02-13 Enhance-A-Video: Better Generated Video for Free Yang Luo et.al. 2502.07508 link
2025-02-11 Generative Ghost: Investigating Ranking Bias Hidden in AI-Generated Videos Haowen Gao et.al. 2502.07327 null
2025-02-11 Articulate That Object Part (ATOP): 3D Part Articulation from Text and Motion Personalization Aditya Vora et.al. 2502.07278 null
2025-02-11 Contextual Gesture: Co-Speech Gesture Video Generation through Context-aware Gesture Representation Pinxin Liu et.al. 2502.07239 null
2025-02-10 Lotus: Creating Short Videos From Long Videos With Abstractive and Extractive Summarization Aadit Barua et.al. 2502.07096 null
2025-02-10 Conditional diffusion model with spatial attention and latent embedding for medical image segmentation Behzad Hejrati et.al. 2502.06997 link
2025-02-10 Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT Dongyang Liu et.al. 2502.06782 null
2025-02-10 History-Guided Video Diffusion Kiwhan Song et.al. 2502.06764 null
2025-02-10 Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists Bojia Zi et.al. 2502.06734 null
2025-02-10 TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Yangguang Li et.al. 2502.06608 link
2025-02-10 CustomVideoX: 3D Reference Attention Driven Dynamic Adaptation for Zero-Shot Customized Video Diffusion Transformers D. She et.al. 2502.06527 null
2025-02-10 Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile Hangliang Ding et.al. 2502.06155 null
2025-02-08 Towards AI-driven Sign Language Generation with Non-manual Markers Han Zhang et.al. 2502.05661 null
2025-02-08 Training-Free Constrained Generation With Stable Diffusion Models Stefano Zampini et.al. 2502.05625 null
2025-02-08 A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction Yongfan Chen et.al. 2502.05503 link
2025-02-07 FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation Shilong Zhang et.al. 2502.05179 link
2025-02-07 Goku: Flow Based Video Generative Foundation Models Shoufa Chen et.al. 2502.04896 null
2025-02-07 HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation Qijun Gan et.al. 2502.04847 null
2025-02-06 Fast Video Generation with Sliding Tile Attention Peiyuan Zhang et.al. 2502.04507 null
2025-02-06 UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation Wenzhang Sun et.al. 2502.04393 null
2025-02-06 MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Jinbo Xing et.al. 2502.04299 null
2025-02-06 Learning Real-World Action-Video Dynamics with Heterogeneous Masked Autoregression Lirui Wang et.al. 2502.04296 null
2025-02-06 Content-Rich AIGC Video Quality Assessment via Intricate Text Alignment and Motion-Aware Consistency Shangkun Sun et.al. 2502.04076 link
2025-02-06 UniForm: A Unified Diffusion Transformer for Audio-Video Generation Lei Zhao et.al. 2502.03897 null
2025-02-05 Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach Yunuo Chen et.al. 2502.03639 null
2025-02-05 FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise Yunlong Yuan et.al. 2502.03496 null
2025-02-05 MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent Xinyao Liao et.al. 2502.03207 null
2025-02-04 Controllable Video Generation with Provable Disentanglement Yifan Shen et.al. 2502.02690 null
2025-02-04 VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Hila Chefer et.al. 2502.02492 null
2025-02-05 IPO: Iterative Preference Optimization for Text-to-Video Generation Xiaomeng Yang et.al. 2502.02088 null
2025-02-03 VILP: Imitation Learning with Latent Video Planning Zhengtong Xu et.al. 2502.01784 link
2025-02-03 Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity Haocheng Xi et.al. 2502.01776 null
2025-02-05 MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation Haibo Tong et.al. 2502.01719 null
2025-02-02 HuViDPO: Enhancing Video Generation through Direct Preference Optimization for Human-Centric Alignment Lifan Jiang et.al. 2502.01690 null
2025-02-03 Improved Training Technique for Latent Consistency Models Quan Dao et.al. 2502.01441 link
2025-02-03 VidSketch: Hand-drawn Sketch-Driven Video Generation with Diffusion Control Lifan Jiang et.al. 2502.01101 link
2025-02-03 OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Gaojie Lin et.al. 2502.01061 null
2025-02-03 Pushing the Boundaries of State Space Models for Image and Video Generation Yicong Hong et.al. 2502.00972 null
2025-01-31 Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search Yuta Oshima et.al. 2501.19252 null
2025-01-30 Every Image Listens, Every Image Dances: Music-Driven Image Animation Zhikang Dong et.al. 2501.18801 null
2025-01-28 CascadeV: An Implementation of Wurstchen Architecture for Video Generation Wenfeng Lin et.al. 2501.16612 link
2025-01-26 “See What I Imagine, Imagine What I See”: Human-AI Co-Creation System for 360$^\circ$ Panoramic Video Generation in VR Yunge Wen et.al. 2501.15456 null
2025-01-24 VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking Runyi Hu et.al. 2501.14195 link
2025-01-23 Improving Video Generation with Human Feedback Jie Liu et.al. 2501.13918 null
2025-01-23 EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion Jiangchuan Wei et.al. 2501.13452 null
2025-01-21 Taming Teacher Forcing for Masked Autoregressive Video Generation Deyu Zhou et.al. 2501.12389 null
2025-01-22 Video Depth Anything: Consistent Depth Estimation for Super-Long Videos Sili Chen et.al. 2501.12375 null
2025-01-20 GenVidBench: A Challenging Benchmark for Detecting AI-Generated Video Zhenliang Ni et.al. 2501.11340 null
2025-01-20 CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation Zheng Chong et.al. 2501.11325 link
2025-01-18 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation Linrui Tian et.al. 2501.10687 null
2025-01-17 DiffuEraser: A Diffusion Model for Video Inpainting Xiaowen Li et.al. 2501.10018 link
2025-01-17 RichSpace: Enriching Text-to-Video Prompt Space via Text Embedding Interpolation Yuefan Cao et.al. 2501.09982 null
2025-01-16 VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Zhongwei Ren et.al. 2501.09781 null
2025-01-16 Learnings from Scaling Visual Tokenizers for Reconstruction and Generation Philippe Hansen-Estruch et.al. 2501.09755 null
2025-01-14 Do generative video models learn physical principles from watching videos? Saman Motamed et.al. 2501.09038 link
2025-01-15 Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Jingyuan Chen et.al. 2501.09019 null
2025-01-15 RepVideo: Rethinking Cross-Layer Representation for Video Generation Chenyang Si et.al. 2501.08994 null
2025-01-15 Comprehensive Subjective and Objective Evaluation Method for Text-generated Video Zelu Qi et.al. 2501.08545 null
2025-01-14 Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models Weichen Fan et.al. 2501.08453 null
2025-01-14 3D Gaussian Splatting with Normal Information for Mesh Extraction and Improved Rendering Meenakshi Krishnan et.al. 2501.08370 null
2025-01-14 GameFactory: Creating New Games with Generative Interactive Videos Jiwen Yu et.al. 2501.08325 null
2025-01-14 Diffusion Adversarial Post-Training for One-Step Video Generation Shanchuan Lin et.al. 2501.08316 null
2025-01-14 LayerAnimate: Layer-specific Control for Animation Yuxue Yang et.al. 2501.08295 null
2025-01-14 FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors Yabo Zhang et.al. 2501.08225 link
2025-01-13 BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations Weixi Feng et.al. 2501.07647 null
2025-01-13 Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss Xinyu Zhang et.al. 2501.07563 null
2025-01-11 Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning Maomao Li et.al. 2501.06438 null
2025-01-10 MEt3R: Measuring Multi-View Consistency in Generated Images Mohammad Asim et.al. 2501.06336 null
2025-01-10 Multi-subject Open-set Personalization in Video Generation Tsai-Shien Chen et.al. 2501.06187 null
2025-01-10 VideoAuteur: Towards Long Narrative Video Generation Junfei Xiao et.al. 2501.06173 null
2025-01-08 Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion Yongjia Ma et.al. 2501.05484 null
2025-01-09 Progressive Growing of Video Tokenizers for Highly Compressed Latent Spaces Aniruddha Mahapatra et.al. 2501.05442 null
2025-01-08 ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning Yuzhou Huang et.al. 2501.04698 null
2025-01-08 LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition Bowen Hao et.al. 2501.04204 null
2025-01-07 Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers Yuechen Zhang et.al. 2501.03931 link
2025-01-09 Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Zekai Gu et.al. 2501.03847 link
2025-01-07 Motion-Aware Generative Frame Interpolation Guozhen Zhang et.al. 2501.03699 null
2025-01-06 License Plate Images Generation with Diffusion Models Mariia Shpir et.al. 2501.03374 null
2025-01-06 Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation Guy Yariv et.al. 2501.03059 null
2025-01-06 TransPixar: Advancing Text-to-Video Generation with Transparency Luozhou Wang et.al. 2501.03006 link
2025-01-06 Brick-Diffusion: Generating Long Videos with Brick-to-Wall Denoising Yunlong Yuan et.al. 2501.02741 null
2025-01-05 GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking Weikang Bian et.al. 2501.02690 null
2025-01-04 Benchmark Evaluations, Applications, and Challenges of Large Vision Language Models: A Survey Zongxia Li et.al. 2501.02189 link
2025-01-03 JoyGen: Audio-Driven 3D Depth-Aware Talking-Face Video Editing Qili Wang et.al. 2501.01798 link
2025-01-06 VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Yuanpeng Tu et.al. 2501.01427 null
2025-01-03 Free-Form Motion Control: A Synthetic Video Generation Dataset with Controllable Camera and Object Motions Xincheng Shuai et.al. 2501.01425 null
2025-01-02 On Unifying Video Generation and Camera Pose Estimation Chun-Hao Paul Huang et.al. 2501.01409 null
2025-01-01 Beyond Text: Implementing Multimodal Large Language Model-Powered Multi-Agent Systems Using a No-Code Platform Cheonsu Jeong et.al. 2501.00750 null
2025-01-03 DreamDrive: Generative 4D Scene Modeling from Street View Images Jiageng Mao et.al. 2501.00601 null
2024-12-30 LTX-Video: Realtime Video Latent Diffusion Yoav HaCohen et.al. 2501.00103 link
2024-12-30 Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model Yifei Huang et.al. 2412.21080 link
2024-12-30 VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation Jiazheng Xu et.al. 2412.21059 link
2024-12-30 ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation Ting Zhang et.al. 2412.20901 null
2024-12-30 Dialogue Director: Bridging the Gap in Dialogue Visualization for Multimodal Storytelling Min Zhang et.al. 2412.20725 null
2024-12-29 Open-Sora: Democratizing Efficient Video Production for All Zangwei Zheng et.al. 2412.20404 link
2024-12-27 Generative Video Propagation Shaoteng Liu et.al. 2412.19761 null
2024-12-30 VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models Tao Wu et.al. 2412.19645 null
2024-12-30 DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT Xiaotao Hu et.al. 2412.19505 link
2024-12-25 Accelerating Diffusion Transformers with Dual Feature Caching Chang Zou et.al. 2412.18911 link
2024-12-24 Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation Faraz Waseem et.al. 2412.18688 null
2024-12-24 DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers Yuntao Chen et.al. 2412.18607 null
2024-12-24 ZeroHSI: Zero-Shot 4D Human-Scene Interaction by Video Generation Hongjie Li et.al. 2412.18600 null
2024-12-24 DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation Minghong Cai et.al. 2412.18597 link
2024-12-23 Large Motion Video Autoencoding with Cross-modal Video VAE Yazhou Xing et.al. 2412.17805 null
2024-12-23 VidTwin: Video VAE with Decoupled Structure and Dynamics Yuchi Wang et.al. 2412.17726 link
2024-12-23 FFA Sora, video generation as fundus fluorescein angiography simulator Xinyuan Wu et.al. 2412.17346 null
2024-12-23 Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory Xingyao Li et.al. 2412.17254 null
2024-12-22 SubstationAI: Multimodal Large Model-Based Approaches for Analyzing Substation Equipment Faults Jinzhi Wang et.al. 2412.17077 null
2024-12-22 Adapting Image-to-Video Diffusion Models for Large-Motion Frame Interpolation Luoxu Jin et.al. 2412.17042 null
2024-12-21 GANFusion: Feed-Forward Text-to-3D with Diffusion in GAN Space Souhaib Attaiki et.al. 2412.16717 null
2024-12-21 TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models Haocheng Huang et.al. 2412.16700 null
2024-12-21 VAST 1.0: A Unified Framework for Controllable and Consistent Video Generation Chi Zhang et.al. 2412.16677 null
2024-12-21 Follow-Your-MultiPose: Tuning-Free Multi-Character Text-to-Video Generation via Pose Guidance Beiyuan Zhang et.al. 2412.16495 null
2024-12-20 DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization Zihan Ding et.al. 2412.15689 null
2024-12-20 CustomTTT: Motion and Appearance Customized Video Generation via Test-Time Training Xiuli Bi et.al. 2412.15646 link
2024-12-19 AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation Moayed Haji-Ali et.al. 2412.15191 null
2024-12-19 Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM Yatai Ji et.al. 2412.15156 link
2024-12-19 Parallelized Autoregressive Visual Generation Yuqing Wang et.al. 2412.15119 null
2024-12-19 Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations Yucheng Hu et.al. 2412.14803 null
2024-12-19 Consistent Human Image and Video Generation with Spatially Conditioned Diffusion Mingdeng Cao et.al. 2412.14531 link
2024-12-19 DirectorLLM for Human-Centric Video Generation Kunpeng Song et.al. 2412.14484 null
2024-12-18 Autoregressive Video Generation without Vector Quantization Haoge Deng et.al. 2412.14169 link
2024-12-18 VideoDPO: Omni-Preference Alignment for Video Diffusion Generation Runtao Liu et.al. 2412.14167 null
2024-12-18 AKiRa: Augmentation Kit on Rays for optical video generation Xi Wang et.al. 2412.14158 null
2024-12-18 SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation Tong Chen et.al. 2412.14018 null
2024-12-18 Real-time One-Step Diffusion-based Expressive Portrait Videos Generation Hanzhong Guo et.al. 2412.13479 link
2024-12-18 SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation Kazuki Shimada et.al. 2412.13462 null
2024-12-17 CompactFlowNet: Efficient Real-time Optical Flow Estimation on Mobile Devices Andrei Znobishchev et.al. 2412.13273 null
2024-12-17 MotionBridge: Dynamic Video Inbetweening with Flexible Controls Maham Tanveer et.al. 2412.13190 null
2024-12-17 VidTok: A Versatile and Open-Source Video Tokenizer Anni Tang et.al. 2412.13061 link
2024-12-16 Can video generation replace cinematographers? Research on the cinematic language of generated video Xiaozhe Li et.al. 2412.12223 null
2024-12-16 InterDyn: Controllable Interactive Dynamics with Video Diffusion Models Rick Akkerman et.al. 2412.11785 null
2024-12-16 Generative Inbetweening through Frame-wise Conditions-Driven Video Generation Tianyi Zhu et.al. 2412.11755 link
2024-12-16 VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting Muhammet Furkan Ilaslan et.al. 2412.11621 link
2024-12-15 GenLit: Reformulating Single-Image Relighting as Video Generation Shrisha Bharadwaj et.al. 2412.11224 null
2024-12-15 DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes Jinxiu Liu et.al. 2412.11100 null
2024-12-14 Video Diffusion Transformers are In-Context Learners Zhengcong Fei et.al. 2412.10783 link
2024-12-13 SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device Yushu Wu et.al. 2412.10494 null
2024-12-16 TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation Xingrui Wang et.al. 2412.10275 null
2024-12-13 Exploring the Frontiers of Animation Video Generation in the Sora Era: Method, Dataset and Benchmark Yudong Jiang et.al. 2412.10255 link
2024-12-13 LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity Hongjie Wang et.al. 2412.09856 null
2024-12-13 MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion Xunnong Xu et.al. 2412.09828 null
2024-12-12 Doe-1: Closed-Loop Autonomous Driving with Large World Model Wenzhao Zheng et.al. 2412.09627 link
2024-12-12 OmniDrag: Enabling Motion Control for Omnidirectional Image-to-Video Generation Weiqi Li et.al. 2412.09623 null
2024-12-12 Owl-1: Omni World Model for Consistent Long Video Generation Yuanhui Huang et.al. 2412.09600 link
2024-12-12 LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors Yabo Chen et.al. 2412.09597 null
2024-12-12 Video Creation by Demonstration Yihong Sun et.al. 2412.09551 null
2024-12-12 UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer Delong Liu et.al. 2412.09389 link
2024-12-12 T-SVG: Text-Driven Stereoscopic Video Generation Qiao Jin et.al. 2412.09323 null
2024-12-12 InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption Tiehan Fan et.al. 2412.09283 null
2024-12-12 LVMark: Robust Watermark for latent video diffusion models MinHyuk Jang et.al. 2412.09122 null
2024-12-12 Enhancing Facial Consistency in Conditional Video Generation via Facial Landmark Transformation Lianrui Mu et.al. 2412.08976 null
2024-12-11 Physical Informed Driving World Model Zhuoran Yang et.al. 2412.08410 null
2024-12-11 FLIP: Flow-Centric Generative Planning for General-Purpose Manipulation Tasks Chongkai Gao et.al. 2412.08261 null
2024-12-11 VSD2M: A Large-scale Vision-language Sticker Dataset for Multi-frame Animated Sticker Generation Zhiqiang Yuan et.al. 2412.08259 null
2024-12-11 UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics Xi Chen et.al. 2412.07774 null
2024-12-10 From Slow Bidirectional to Fast Causal Video Generators Tianwei Yin et.al. 2412.07772 null
2024-12-10 SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Jianhong Bai et.al. 2412.07760 link
2024-12-10 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation Xiao Fu et.al. 2412.07759 null
2024-12-10 Multi-Shot Character Consistency for Text-to-Video Generation Yuval Atzmon et.al. 2412.07750 null
2024-12-10 StyleMaster: Stylize Your Video with Artistic Generation and Translation Zixuan Ye et.al. 2412.07744 null
2024-12-10 STIV: Scalable Text and Image Conditioned Video Generation Zongyu Lin et.al. 2412.07730 null
2024-12-10 ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Jinyi Hu et.al. 2412.07720 link
2024-12-09 SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations Zhaorun Chen et.al. 2412.06878 null
2024-12-08 Latent-Reframe: Enabling Camera Control for Video Diffusion Model without Training Zhenghong Zhou et.al. 2412.06029 null
2024-12-08 FlexDiT: Dynamic Token Density Control for Diffusion Transformer Shuning Chang et.al. 2412.06028 link
2024-12-08 Track4Gen: Teaching Video Diffusion Models to Track Points Improves Video Generation Hyeonho Jeong et.al. 2412.06016 null
2024-12-08 Accelerating Video Diffusion Models via Distribution Matching Yuanzhi Zhu et.al. 2412.05899 null
2024-12-08 MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation Shuwei Shi et.al. 2412.05848 null
2024-12-08 Self-Guidance: Boosting Flow and Diffusion Generation on Their Own Tiancheng Li et.al. 2412.05827 null
2024-12-07 Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation Leonardo Pina et.al. 2412.05694 null
2024-12-06 Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model Lening Wang et.al. 2412.05280 link
2024-12-06 Mind the Time: Temporally-Controlled Multi-Event Video Generation Ziyi Wu et.al. 2412.05263 null
2024-12-06 UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving Rui Chen et.al. 2412.04842 link
2024-12-05 Using Diffusion Priors for Video Amodal Segmentation Kaihua Chen et.al. 2412.04623 null
2024-12-05 PaintScene4D: Consistent 4D Scene Generation from Text Prompts Vinayak Gupta et.al. 2412.04471 null
2024-12-05 MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation Longtao Zheng et.al. 2412.04448 null
2024-12-05 DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models Yizhuo Li et.al. 2412.04446 null
2024-12-05 GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration Kaiyi Huang et.al. 2412.04440 null
2024-12-05 Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation Yuying Ge et.al. 2412.04432 link
2024-12-05 Instructional Video Generation Yayuan Li et.al. 2412.04189 null
2024-12-05 IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation Sejong Yang et.al. 2412.04000 null
2024-12-05 DiffSign: AI-Assisted Generation of Customizable Sign Language Videos With Enhanced Realism Sudha Krishnamurthy et.al. 2412.03878 link
2024-12-05 Movie Gen: SWOT Analysis of Meta’s Generative AI Foundation Model for Transforming Media Generation, Advertising, and Entertainment Industries Abul Ehtesham et.al. 2412.03837 null
2024-12-04 Advancing Auto-Regressive Continuation for Video Frames Ruibo Ming et.al. 2412.03758 null
2024-12-04 Navigation World Models Amir Bar et.al. 2412.03572 null
2024-12-04 Imagine360: Immersive 360 Video Generation from Perspective Anchor Jing Tan et.al. 2412.03552 null
2024-12-04 Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention Hannan Lu et.al. 2412.03520 null
2024-12-04 SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model Yan Li et.al. 2412.03430 null
2024-12-04 MaterialPicker: Multi-Modal Material Generation with Diffusion Transformers Xiaohe Ma et.al. 2412.03225 null
2024-12-04 Mimir: Improving Video Diffusion Models for Precise Text Understanding Shuai Tan et.al. 2412.03085 null
2024-12-03 Motion Prompting: Controlling Video Generation with Motion Trajectories Daniel Geng et.al. 2412.02700 null
2024-12-03 AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction Lingteng Qiu et.al. 2412.02684 null
2024-12-03 Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback Hiroki Furuta et.al. 2412.02617 null
2024-12-03 VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation Mingzhe Zheng et.al. 2412.02259 link
2024-12-02 World-consistent Video Diffusion with Explicit 3D Modeling Qihang Zhang et.al. 2412.01821 null
2024-12-02 Driving Scene Synthesis on Free-form Trajectories with Generative Prior Zeyu Yang et.al. 2412.01717 null
2024-12-04 InfinityDrive: Breaking Time Limits in Driving World Models Xi Guo et.al. 2412.01522 null
2024-12-02 CPA: Camera-pose-awareness Diffusion Transformer for Video Generation Yuelei Wang et.al. 2412.01429 null
2024-12-02 MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models Xiaomin Li et.al. 2412.01343 null
2024-12-02 Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation Xin Yan et.al. 2412.01316 null
2024-11-29 Fleximo: Towards Flexible Text-to-Human Motion Video Generation Yuhang Zhang et.al. 2411.19459 null
2024-11-28 Trajectory Attention for Fine-grained Video Motion Control Zeqi Xiao et.al. 2411.19324 null
2024-11-28 MSG score: A Comprehensive Evaluation for Multi-Scene Video Generation Daewon Yoon et.al. 2411.19121 null
2024-11-28 Timestep Embedding Tells: It’s Time to Cache for Video Diffusion Model Feng Liu et.al. 2411.19108 null
2024-11-28 SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing Rong-Cheng Tu et.al. 2411.18983 null
2024-12-02 AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers Sherwin Bahmani et.al. 2411.18673 null
2024-11-27 Towards Chunk-Wise Generation for Long Videos Siyang Zhang et.al. 2411.18668 null
2024-11-27 Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models Yiming Wu et.al. 2411.18375 null
2024-11-30 MotionCharacter: Identity-Preserving and Motion Controllable Human Video Generation Haopeng Fang et.al. 2411.18281 null
2024-11-26 Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey Hong-Hanh Nguyen-Le et.al. 2411.17911 null
2024-11-27 Accelerating Vision Diffusion Transformers with Skip Branches Guanjie Chen et.al. 2411.17616 link
2024-11-26 Identity-Preserving Text-to-Video Generation by Frequency Decomposition Shenghai Yuan et.al. 2411.17440 link
2024-11-26 AnchorCrafter: Animate CyberAnchors Selling Your Products via Human-Object Interacting Video Generation Ziyi Xu et.al. 2411.17383 null
2024-11-26 AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM Jiarui Wang et.al. 2411.17221 link
2024-11-28 PhysMotion: Physics-Grounded Dynamics From a Single Image Xiyang Tan et.al. 2411.17189 null
2024-11-26 PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation Hengjia Li et.al. 2411.17048 null
2024-11-26 Free$^2$Guide: Gradient-Free Path Integral Control for Enhancing Text-to-Video Generation with Large Vision-Language Models Jaemin Kim et.al. 2411.17041 null
2024-11-25 Pathways on the Image Manifold: Image Editing via Video Generation Noam Rotstein et.al. 2411.16819 null
2024-11-25 DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation Zun Wang et.al. 2411.16657 null
2024-11-25 Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric Zhichao Zhang et.al. 2411.16619 null
2024-11-25 Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing Kaifeng Gao et.al. 2411.16375 link
2024-11-23 Optical-Flow Guided Prompt Optimization for Coherent Video Generation Hyelin Nam et.al. 2411.15540 null
2024-11-22 MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation Weijia Wu et.al. 2411.15262 link
2024-11-22 VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement Daeun Lee et.al. 2411.15115 null
2024-11-21 Understanding World or Predicting Future? A Comprehensive Survey of World Models Jingtao Ding et.al. 2411.14499 null
2024-11-21 StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart Jian Shi et.al. 2411.14295 link
2024-11-21 TaQ-DiT: Time-aware Quantization for Diffusion Transformers Xinyan Liu et.al. 2411.14172 null
2024-11-21 MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control Ruiyuan Gao et.al. 2411.13807 null
2024-11-20 What You See Is What Matters: A Novel Visual and Physics-Based Metric for Evaluating Video Generation Quality Zihan Wang et.al. 2411.13609 null
2024-11-20 REDUCIO! Generating 1024$\times$1024 Video within 16 Seconds using Extremely Compressed Motion Latents Rui Tian et.al. 2411.13552 link
2024-11-20 VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models Ziqi Huang et.al. 2411.13503 link
2024-11-19 Towards motion from video diffusion models Paul Janson et.al. 2411.12831 null
2024-11-19 Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting Haoyu Zhao et.al. 2411.12789 null
2024-11-19 PoM: Efficient Image and Video Generation with the Polynomial Mixer David Picard et.al. 2411.12663 link
2024-11-18 Medical Video Generation for Disease Progression Simulation Xu Cao et.al. 2411.11943 null
2024-11-18 SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input Zhen Lv et.al. 2411.11934 null
2024-11-19 SoK: On the Role and Future of AIGC Watermarking in the Era of Gen-AI Kui Ren et.al. 2411.11478 null
2024-11-18 Teaching Video Diffusion Model with Latent Physical Phenomenon Knowledge Qinglong Cao et.al. 2411.11343 null
2024-11-17 SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration Jintao Zhang et.al. 2411.10958 link
2024-11-16 ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models Vipula Rawte et.al. 2411.10867 null
2024-11-16 AnimateAnything: Consistent and Controllable Animation for Video Generation Guojun Lei et.al. 2411.10836 null
2024-11-15 OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models Mathis Koroglu et.al. 2411.10501 null
2024-11-14 Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance Md Fahim Anjum et.al. 2411.09174 null
2024-11-14 VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation Youpeng Wen et.al. 2411.09153 null
2024-11-16 A Survey on Vision Autoregressive Model Kai Jiang et.al. 2411.08666 null
2024-11-13 EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation Xiaofeng Wang et.al. 2411.08380 null
2024-11-13 Motion Control for Enhanced Complex Action Video Generation Qiang Zhou et.al. 2411.08328 null
2024-11-12 Artificial Intelligence for Biomedical Video Generation Linyuan Li et.al. 2411.07619 null
2024-11-10 I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength Wanquan Feng et.al. 2411.06525 null
2024-11-08 Autoregressive Models in Vision: A Survey Jing Xiong et.al. 2411.05902 link
2024-11-08 WHALE: Towards Generalizable and Scalable World Models for Embodied Decision-making Zhilong Zhang et.al. 2411.05619 null
2024-11-07 SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation Koichi Namekata et.al. 2411.04989 null
2024-11-07 Uncovering Hidden Subspaces in Video Diffusion Models Using Re-Identification Mischa Dombrowski et.al. 2411.04956 null
2024-11-07 DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion Wenqiang Sun et.al. 2411.04928 null
2024-11-11 StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration Panwen Hu et.al. 2411.04925 null
2024-11-07 MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views Yuedong Chen et.al. 2411.04924 link
2024-11-07 Taming Rectified Flow for Inversion and Editing Jiangshan Wang et.al. 2411.04746 link
2024-11-05 TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation Wenhao Wang et.al. 2411.04709 null
2024-11-05 Exploring the Interplay Between Video Generation and World Models in Autonomous Driving: A Survey Ao Fu et.al. 2411.02914 null
2024-11-07 Adaptive Caching for Faster Video Generation with Diffusion Transformers Kumara Kahatapitiya et.al. 2411.02397 null
2024-11-04 How Far is Video Generation from World Model: A Physical Law Perspective Bingyi Kang et.al. 2411.02385 null
2024-11-03 Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation Zhenbin Wang et.al. 2411.01647 null
2024-11-02 Fast and Memory-Efficient Video Diffusion Using Streamlined Inference Zheng Zhan et.al. 2411.01171 link
2024-11-01 GameGen-X: Interactive Open-world Game Video Generation Haoxuan Che et.al. 2411.00769 link
2024-11-04 Fashion-VDM: Video Diffusion Model for Virtual Try-On Johanna Karras et.al. 2411.00225 null
2024-10-31 Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning Penghui Ruan et.al. 2410.24219 link
2024-10-31 Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts Xiang Deng et.al. 2410.23836 null
2024-10-31 SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation Yining Hong et.al. 2410.23277 null
2024-10-30 LumiSculpt: A Consistency Lighting Control Network for Video Generation Yuxin Zhang et.al. 2410.22979 null
2024-10-30 HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models Shengkai Zhang et.al. 2410.22901 link
2024-10-29 Investigating Memorization in Video Diffusion Models Chen Chen et.al. 2410.21669 null
2024-10-28 LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior Hanyu Wang et.al. 2410.21264 null
2024-10-28 Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient Yintai Ma et.al. 2410.20657 null
2024-10-27 ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation Zongyi Li et.al. 2410.20502 null
2024-10-26 MarDini: Masked Autoregressive Diffusion for Video Generation at Scale Haozhe Liu et.al. 2410.20280 null
2024-10-26 Your Image is Secretly the Last Frame of a Pseudo Video Wenlong Chen et.al. 2410.20158 null
2024-10-26 GiVE: Guiding Visual Encoder to Perceive Overlooked Information Junjie Li et.al. 2410.20109 null
2024-10-26 GHIL-Glue: Hierarchical Control with Filtered Subgoal Images Kyle B. Hatch et.al. 2410.20018 null
2024-10-25 FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality Zhengyao Lv et.al. 2410.19355 null
2024-10-24 Framer: Interactive Frame Interpolation Wen Wang et.al. 2410.18978 null
2024-10-24 Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances Shilin Lu et.al. 2410.18775 link
2024-10-23 WorldSimBench: Towards Video Generation Models as World Simulators Yiran Qin et.al. 2410.18072 null
2024-10-23 VISAGE: Video Synthesis using Action Graphs for Surgery Yousef Yeganeh et.al. 2410.17751 null
2024-10-21 3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors Xi Liu et.al. 2410.16266 null
2024-10-20 EVA: An Embodied World Model for Future Video Anticipation Xiaowei Chi et.al. 2410.15461 null
2024-10-20 Allegro: Open the Black Box of Commercial-Level Video Generation Model Yuan Zhou et.al. 2410.15458 link
2024-10-20 FrameBridge: Improving Image-to-Video Generation with Bridge Models Yuji Wang et.al. 2410.15371 null
2024-10-27 VidPanos: Generative Panoramic Videos from Casual Panning Videos Jingwei Ma et.al. 2410.13832 null
2024-10-17 DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control Yujie Wei et.al. 2410.13830 null
2024-10-18 DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation Hanbo Cheng et.al. 2410.13726 link
2024-10-17 Movie Gen: A Cast of Media Foundation Models Adam Polyak et.al. 2410.13720 link
2024-10-21 DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation Guosheng Zhao et.al. 2410.13571 null
2024-10-18 Fundus to Fluorescein Angiography Video Generation as a Retinal Generative Foundation Model Weiyi Zhang et.al. 2410.13242 null
2024-10-17 AsymKV: Enabling 1-Bit Quantization of KV Cache with Layer-Wise Asymmetric Quantization Configurations Qian Tao et.al. 2410.13212 null
2024-10-16 SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation Jaehong Yoon et.al. 2410.12761 null
2024-10-16 Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices Zhiyuan Ma et.al. 2410.11795 null
2024-10-14 Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models Jingzhi Bao et.al. 2410.10821 link
2024-10-14 LVD-2M: A Long-take Video Dataset with Temporally Dense Captions Tianwei Xiong et.al. 2410.10816 link
2024-10-14 Boosting Camera Motion Control for Video Diffusion Transformers Soon Yau Cheong et.al. 2410.10802 null
2024-10-14 Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention Dejia Xu et.al. 2410.10774 null
2024-10-14 DragEntity: Trajectory Guided Video Generation using Entity and Positional Relationships Zhang Wan et.al. 2410.10751 null
2024-10-16 MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting Yue Zhang et.al. 2410.10122 link
2024-10-15 VideoAgent: Self-Improving Video Generation Achint Soni et.al. 2410.10076 link
2024-10-11 Quality Prediction of AI Generated Images and Videos: Emerging Trends and Opportunities Abhijay Ghildyal et.al. 2410.08534 null
2024-10-10 Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content Qiuheng Wang et.al. 2410.08260 null
2024-10-10 Scaling Laws For Diffusion Transformers Zhengyang Liang et.al. 2410.08184 null
2024-10-10 Progressive Autoregressive Video Diffusion Models Desai Xie et.al. 2410.08151 link
2024-10-10 HARIVO: Harnessing Text-to-Image Models for Video Generation Mingi Kwon et.al. 2410.07763 null
2024-10-10 Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation Jiahao Cui et.al. 2410.07718 link
2024-10-10 MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion Onkar Susladkar et.al. 2410.07659 link
2024-10-09 Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis Bohan Zeng et.al. 2410.07155 link
2024-10-08 BroadWay: Boost Your Text-to-Video Generation Model in a Training-free Way Jiazi Bu et.al. 2410.06241 null
2024-10-08 GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation Chi-Lam Cheang et.al. 2410.06158 null
2024-10-08 Pyramidal Flow Matching for Efficient Video Generative Modeling Yang Jin et.al. 2410.05954 link
2024-10-08 SeeClear: Semantic Distillation Enhances Pixel Condensation for Video Super-Resolution Qi Tang et.al. 2410.05799 link
2024-10-08 T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design Jiachen Li et.al. 2410.05677 null
2024-10-08 ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler Serin Yang et.al. 2410.05651 null
2024-10-08 TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation Gihyun Kwon et.al. 2410.05591 link
2024-10-07 Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation Fanqing Meng et.al. 2410.05363 link
2024-10-10 The Dawn of Video Generation: Preliminary Explorations with SORA-like Models Ailing Zeng et.al. 2410.05227 null
2024-10-07 Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality Ge Ya Luo et.al. 2410.05203 link
2024-10-07 ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction Hyungjin Chung et.al. 2410.04721 null
2024-10-06 Realizing Video Summarization from the Path of Language-based Semantic Understanding Kuan-Chen Mu et.al. 2410.04511 null
2024-10-03 People are poorly equipped to detect AI-powered voice clones Sarah Barrington et.al. 2410.03791 null
2024-10-04 Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach Yaofang Liu et.al. 2410.03160 link
2024-10-04 ECHOPulse: ECG controlled echocardio-grams video generation Yiwei Li et.al. 2410.03143 link
2024-10-03 Loong: Generating Minute-level Long Videos with Autoregressive Language Models Yuqing Wang et.al. 2410.02757 null
2024-10-03 SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration Jintao Zhang et.al. 2410.02367 link
2024-10-02 COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation Mingzhen Sun et.al. 2410.01718 null
2024-10-02 MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation Mingzhen Sun et.al. 2410.01594 link
2024-10-01 Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining Jie Cheng et.al. 2410.00564 link
2024-09-30 ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning Jian Shi et.al. 2410.00262 link
2024-09-30 Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs Zicheng Zhang et.al. 2409.20063 null
2024-09-30 Replace Anyone in Videos Xiang Wang et.al. 2409.19911 link
2024-09-27 PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation Shaowei Liu et.al. 2409.18964 link
2024-09-27 Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions Iskander Azangulov et.al. 2409.18804 null
2024-09-26 Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation Huan Yang et.al. 2409.17674 null
2024-09-26 A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation Masato Ishii et.al. 2409.17550 link
2024-09-25 Pose-Guided Fine-Grained Sign Language Video Generation Tongkai Shi et.al. 2409.16709 null
2024-09-24 Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation Homanga Bharadhwaj et.al. 2409.16283 null
2024-09-23 Multi-Modal Generative AI: Multi-modal LLM, Diffusion and Beyond Hong Chen et.al. 2409.14993 null
2024-09-23 Advancing Video Quality Assessment for AIGC Xinli Yue et.al. 2409.14888 null
2024-09-23 Video-to-Audio Generation with Fine-grained Temporal Semantics Yuchen Hu et.al. 2409.14709 null
2024-09-22 Dormant: Defending against Pose-driven Human Image Animation Jiachen Zhou et.al. 2409.14424 link
2024-09-27 JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation Hadrien Reynaud et.al. 2409.14149 null
2024-09-20 JoyHallo: Digital human model for Mandarin Sheng Shi et.al. 2409.13268 null
2024-09-19 Denoising Reuse: Exploiting Inter-frame Motion Consistency for Efficient Video Latent Generation Chenyu Wang et.al. 2409.12532 null
2024-09-19 Infrared Small Target Detection in Satellite Videos: A New Dataset and A Novel Recurrent Feature Refinement Framework Xinyi Ying et.al. 2409.12448 link
2024-09-17 OSV: One Step is Enough for High-Quality Image to Video Generation Xiaofeng Mao et.al. 2409.11367 null
2024-09-19 The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives Samee Arif et.al. 2409.11261 link
2024-09-16 Embodiment-Agnostic Action Planning via Object-Part Scene Flow Weiliang Tang et.al. 2409.10032 null
2024-09-13 STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment Yong Ren et.al. 2409.08601 null
2024-09-11 DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures Steven Hogue et.al. 2409.07649 null
2024-09-11 Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Haibo Yang et.al. 2409.07452 link
2024-09-11 EMOdiffhead: Continuously Emotional Control in Talking Head Generation via Diffusion Jian Zhang et.al. 2409.07255 link
2024-09-10 SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation Teng Hu et.al. 2409.06633 null
2024-09-10 G3PT: Unleash the power of Autoregressive Modeling in 3D Generation via Cross-scale Querying Transformer Jinzhi Zhang et.al. 2409.06322 null
2024-09-11 MyGo: Consistent and Controllable Multi-View Driving Video Generation with Camera Control Yining Yao et.al. 2409.06189 null
2024-09-12 DriveScape: Towards High-Resolution Controllable Multi-View Driving Video Generation Wei Wu et.al. 2409.05463 null
2024-09-06 Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task Jing Wang et.al. 2409.04005 link
2024-09-06 DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes Jianbiao Mei et.al. 2409.04003 link
2024-09-04 PoseTalk: Text-and-Audio-based Pose Control and Motion Refinement for One-Shot Talking Head Generation Jun Ling et.al. 2409.02657 null
2024-09-05 Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Jianwen Jiang et.al. 2409.02634 null
2024-09-03 DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Wenbo Hu et.al. 2409.02095 link
2024-09-05 CyberHost: Taming Audio-driven Avatar Diffusion Model with Region Codebook Attention Gaojie Lin et.al. 2409.01876 null
2024-09-03 DiVE: DiT-based Video Generation with Enhanced Control Junpeng Jiang et.al. 2409.01595 null
2024-09-02 AMG: Avatar Motion Guided Video Generation Zhangsihao Yang et.al. 2409.01502 link
2024-09-09 OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model Liuhan Chen et.al. 2409.01199 link
2024-08-31 Compositional 3D-aware Video Generation with LLM Director Hanxin Zhu et.al. 2409.00558 null
2024-08-30 CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion Yiran Chen et.al. 2408.17424 null
2024-08-30 VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers Juncan Deng et.al. 2408.17131 null
2024-08-29 One-Shot Learning Meets Depth Diffusion in Multi-Object Videos Anisha Jain et.al. 2408.16704 null
2024-08-29 DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving Yongjie Fu et.al. 2408.16647 null
2024-08-29 Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation Xiaoyu Jin et.al. 2408.16506 null
2024-08-28 GenDDS: Generating Diverse Driving Video Scenarios with Prompt-to-Video Generative Model Yongjie Fu et.al. 2408.15868 null
2024-08-27 GenRec: Unifying Video Generation and Recognition with Diffusion Models Zejia Weng et.al. 2408.15241 link
2024-08-27 Fundus2Video: Cross-Modal Angiography Video Generation from Static Fundus Photography with Clinical Knowledge Guidance Weiyi Zhang et.al. 2408.15217 link
2024-08-28 SurGen: Text-Guided Diffusion Model for Surgical Video Generation Joseph Cho et.al. 2408.14028 null
2024-09-02 Training-free Long Video Generation with Chain of Diffusion Model Experts Wenhao Li et.al. 2408.13423 null
2024-08-24 TVG: A Training-free Transition Video Generation Method with Diffusion Models Rui Zhang et.al. 2408.13413 null
2024-08-23 CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities Tao Wu et.al. 2408.13239 link
2024-08-23 EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation Cong Wang et.al. 2408.13005 null
2024-08-22 xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Can Qin et.al. 2408.12590 null
2024-08-22 Real-Time Video Generation with Pyramid Attention Broadcast Xuanlei Zhao et.al. 2408.12588 link
2024-08-21 DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework Zhifei Xie et.al. 2408.11788 null
2024-08-21 TrackGo: A Flexible and Efficient Method for Controllable Video Generation Haitao Zhou et.al. 2408.11475 null
2024-08-19 Kubrick: Multimodal Agent Collaborations for Synthetic Video Generation Liu He et.al. 2408.10453 null
2024-08-19 Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data Tao Yang et.al. 2408.10119 null
2024-08-19 Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation Yunxin Li et.al. 2408.09787 link
2024-08-18 SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama Jing Tang et.al. 2408.09333 link
2024-08-21 JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Xiaochuang Han et.al. 2408.08459 null
2024-08-16 FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance Jiasong Feng et.al. 2408.08189 null
2024-08-15 When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding Pingping Zhang et.al. 2408.08093 null
2024-08-14 Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving Yuqing Wen et.al. 2408.07605 null
2024-08-15 ControlNeXt: Powerful and Efficient Control for Image and Video Generation Bohao Peng et.al. 2408.06070 link
2024-08-20 Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE Yiying Yang et.al. 2408.05477 null
2024-08-10 High-fidelity and Lip-synced Talking Face Synthesis via Landmark-based Diffusion Model Weizhi Zhong et.al. 2408.05416 null
2024-08-08 Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics Ruining Li et.al. 2408.04631 null
2024-08-05 VidGen-1M: A Large-Scale Dataset for Text-to-video Generation Zhiyu Tan et.al. 2408.02629 null
2024-08-01 Reenact Anything: Semantic Video Motion Transfer Using Motion-Textual Inversion Manuel Kansy et.al. 2408.00458 null
2024-07-31 Tora: Trajectory-oriented Diffusion Transformer for Video Generation Zhenghao Zhang et.al. 2407.21705 link
2024-07-31 Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation Junxuan Yu et.al. 2407.21490 null
2024-07-31 Fine-gained Zero-shot Video Sampling Dengsheng Chen et.al. 2407.21475 null
2024-07-31 Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model Zhichao Zhang et.al. 2407.21408 null
2024-08-04 Adding Multimodal Controls to Whole-body Human Motion Generation Yuxuan Bian et.al. 2407.21136 link
2024-07-30 EgoSonics: Generating Synchronized Audio for Silent Egocentric Videos Aashish Rai et.al. 2407.20592 null
2024-07-29 FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention Yu Lu et.al. 2407.19918 null
2024-07-29 Synthetic Thermal and RGB Videos for Automatic Pain Assessment utilizing a Vision-MLP Architecture Stefanos Gkikas et.al. 2407.19811 null
2024-07-28 FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models Changgu Chen et.al. 2407.19453 link
2024-07-27 Faster Image2Video Generation: A Closer Look at CLIP Image Embedding’s Impact on Spatio-Temporal Cross-Attentions Ashkan Taghipour et.al. 2407.19205 null
2024-07-26 UniForensics: Face Forgery Detection via General Facial Representation Ziyuan Fang et.al. 2407.19079 null
2024-07-24 SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency Yiming Xie et.al. 2407.17470 null
2024-07-28 HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation Zhenzhi Wang et.al. 2407.17438 link
2024-07-23 MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence Canyu Zhao et.al. 2407.16655 null
2024-07-23 Diffusion Transformer Captures Spatial-Temporal Dependencies: A Theory for Gaussian Process Data Hengyu Fu et.al. 2407.16134 null
2024-07-23 Fréchet Video Motion Distance: A Metric for Evaluating Motion Consistency in Videos Jiahe Liu et.al. 2407.16124 link
2024-07-21 Flow as the Cross-Domain Manipulation Interface Mengda Xu et.al. 2407.15208 null
2024-07-21 Anchored Diffusion for Video Face Reenactment Idan Kligvasser et.al. 2407.15153 null
2024-07-19 T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Kaiyue Sun et.al. 2407.14505 link
2024-07-19 Unlearning Concepts from Text-to-Video Diffusion Models Shiqi Liu et.al. 2407.14209 null
2024-07-25 Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion Boyang Deng et.al. 2407.13759 null
2024-07-18 Multi-sentence Video Grounding for Long Video Generation Wei Feng et.al. 2407.13219 null
2024-07-20 VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control Sherwin Bahmani et.al. 2407.12781 null
2024-07-17 Towards Understanding Unsafe Video Generation Yan Pang et.al. 2407.12581 link
2024-07-15 IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation Yuanhao Zhai et.al. 2407.10937 link
2024-07-15 A Survey of Defenses against AI-generated Visual Media: Detection, Disruption, and Authentication Jingyi Deng et.al. 2407.10575 null
2024-07-13 Learning Online Scale Transformation for Talking Head Video Generation Fa-Ting Hong et.al. 2407.09965 null
2024-07-12 Inference Optimization of Foundation Models on AI Accelerators Youngsuk Park et.al. 2407.09111 null
2024-07-16 Bora: Biomedical Generalist Video Generation Model Weixiang Sun et.al. 2407.08944 null
2024-07-11 Still-Moving: Customized Video Generation without Customized Video Data Hila Chefer et.al. 2407.08674 null
2024-07-11 A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights Wentao Lei et.al. 2407.08428 link
2024-07-11 E2VIDiff: Perceptual Events-to-Video Reconstruction using Diffusion Priors Jinxiu Liang et.al. 2407.08231 null
2024-07-10 VEnhancer: Generative Space-Time Enhancement for Video Generation Jingwen He et.al. 2407.07667 null
2024-07-10 Video-to-Audio Generation with Hidden Alignment Manjie Xu et.al. 2407.07464 null
2024-07-12 Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task Yiran Yang et.al. 2407.06617 link
2024-07-08 MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions Xuan Ju et.al. 2407.06358 null
2024-07-08 Dynamics of quantum turbulence in axially rotating thermal counterflow Ritesh Dwivedi et.al. 2407.06311 link
2024-07-08 VIMI: Grounding Video Generation through Multi-modal Instruction Yuwei Fang et.al. 2407.06304 null
2024-07-08 The Tug-of-War Between Deepfake Generation and Detection Hannah Lee et.al. 2407.06174 null
2024-07-08 T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models Yibo Miao et.al. 2407.05965 null
2024-07-08 This&That: Language-Gesture Controlled Video Generation for Robot Planning Boyang Wang et.al. 2407.05530 null
2024-07-05 Unsupervised Video Summarization via Reinforcement Learning and a Trained Evaluator Mehryar Abbasi et.al. 2407.04258 null
2024-07-03 Robot Shape and Location Retention in Video Generation Using Diffusion Models Peng Wang et.al. 2407.02873 link
2024-07-02 OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation Kepan Nan et.al. 2407.02371 null
2024-07-04 GVDIFF: Grounded Text-to-Video Generation with Diffusion Models Huanzhang Dou et.al. 2407.01921 null
2024-07-01 Evaluation of Text-to-Video Generation Models: A Dynamics Perspective Mingxiang Liao et.al. 2407.01094 link
2024-06-29 SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix Peng Dai et.al. 2407.00367 null
2024-06-28 MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance Yuang Zhang et.al. 2406.19680 null
2024-06-27 What Matters in Detecting AI-Generated Videos like Sora? Chirui Chang et.al. 2406.19568 null
2024-06-26 ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation Shenghai Yuan et.al. 2406.18522 link
2024-06-25 Text-Animator: Controllable Visual Text Video Generation Lin Liu et.al. 2406.17777 null
2024-06-25 MotionBooth: Motion-Aware Customized Text-to-Video Generation Jianzong Wu et.al. 2406.17758 null
2024-06-24 FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models Haonan Qiu et.al. 2406.16863 link
2024-06-24 Dreamitate: Real-World Visuomotor Policy Learning via Video Generation Junbang Liang et.al. 2406.16862 null
2024-06-24 Video-Infinity: Distributed Long Video Generation Zhenxiong Tan et.al. 2406.16260 null
2024-06-23 Listen and Move: Improving GANs Coherency in Agnostic Sound-to-Video Generation Rafael Redondo et.al. 2406.16155 null
2024-06-22 MVOC: a training-free multiple video object composition method with diffusion models Wei Wang et.al. 2406.15829 link
2024-06-24 VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation Xuan He et.al. 2406.15252 null
2024-06-20 Fantastic Copyrighted Beasts and How (Not) to Generate Them Luxi He et.al. 2406.14526 null
2024-06-20 SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset Josef Dai et.al. 2406.14477 link
2024-06-20 Video Generation with Learned Action Prior Meenakshi Sarkar et.al. 2406.14436 null
2024-06-20 ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning Zhongjie Duan et.al. 2406.14130 link
2024-06-19 Splatter a Video: Video Gaussian Representation for Versatile Processing Yang-Tian Sun et.al. 2406.13870 null
2024-06-21 GenAI-Bench: Evaluating and Improving Compositional Text-to-Visual Generation Baiqi Li et.al. 2406.13743 link
2024-06-19 ARDuP: Active Region Video Diffusion for Universal Policies Shuaiyi Huang et.al. 2406.13301 null
2024-06-19 Neural Residual Diffusion Models for Deep Scalable Vision Generation Zhiyuan Ma et.al. 2406.13215 null
2024-06-18 Generative Artificial Intelligence-Guided User Studies: An Application for Air Taxi Services Shengdi Xiao et.al. 2406.12296 null
2024-06-17 NLDF: Neural Light Dynamic Fields for Efficient 3D Talking Head Generation Niu Guanchen et.al. 2406.11259 null
2024-06-17 Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion Rishab Parthasarathy et.al. 2406.11196 link
2024-06-16 ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models Kaifeng Gao et.al. 2406.10981 link
2024-06-14 VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs Rohit Bharadwaj et.al. 2406.10326 link
2024-06-14 Training-free Camera Control for Video Generation Chen Hou et.al. 2406.10126 null
2024-06-13 Turns Out I’m Not Real: Towards Robust Detection of AI-Generated Videos Qingyuan Liu et.al. 2406.09601 null
2024-06-13 Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs Zijia Zhao et.al. 2406.09367 link
2024-06-12 Vivid-ZOO: Multi-View Video Generation with Diffusion Model Bing Li et.al. 2406.08659 null
2024-06-12 TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation Weixi Feng et.al. 2406.08656 link
2024-06-12 DiTFastAttn: Attention Compression for Diffusion Transformer Models Zhihang Yuan et.al. 2406.08552 null
2024-06-12 Hierarchical Patch Diffusion Models for High-Resolution Video Generation Ivan Skorokhodov et.al. 2406.07792 null
2024-06-11 HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness Zihui Xue et.al. 2406.07754 null
2024-06-11 AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation Kai Wang et.al. 2406.07686 null
2024-06-11 4Real: Towards Photorealistic 4D Scene Generation via Video Diffusion Models Heng Yu et.al. 2406.07472 null
2024-06-11 Visual Representation Learning with Stochastic Frame Prediction Huiwon Jang et.al. 2406.07398 null
2024-06-09 Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion Ge Ya Luo et.al. 2406.05630 link
2024-06-12 MotionClone: Training-Free Motion Cloning for Controllable Video Generation Pengyang Ling et.al. 2406.05338 link
2024-06-07 CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion Xingrui Wang et.al. 2406.05082 null
2024-06-07 Zero-Shot Video Editing through Adaptive Sliding Score Distillation Lianghan Zhu et.al. 2406.04888 null
2024-06-07 Online Continual Learning of Video Diffusion Models From a Single Video Stream Jason Yoo et.al. 2406.04814 null
2024-06-06 GenAI Arena: An Open Evaluation Platform for Generative Models Dongfu Jiang et.al. 2406.04485 null
2024-06-06 ShareGPT4Video: Improving Video Understanding and Generation with Better Captions Lin Chen et.al. 2406.04325 null
2024-06-06 SF-V: Single Forward Video Generation Model Zhixing Zhang et.al. 2406.04324 link
2024-06-06 VideoTetris: Towards Compositional Text-to-Video Generation Ye Tian et.al. 2406.04277 link
2024-06-05 VideoPhy: Evaluating Physical Commonsense for Video Generation Hritik Bansal et.al. 2406.03520 null
2024-06-05 Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control Jingyun Xue et.al. 2406.03035 null
2024-06-04 ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation Tianchen Zhao et.al. 2406.02540 link
2024-06-04 V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation Cong Wang et.al. 2406.02511 null
2024-06-04 CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation Dejia Xu et.al. 2406.02509 null
2024-06-04 I4VGen: Image as Stepping Stone for Text-to-Video Generation Xiefan Guo et.al. 2406.02230 null
2024-06-04 Learning Temporally Consistent Video Depth from Video Diffusion Priors Jiahao Shao et.al. 2406.01493 null
2024-06-03 DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors Tianyu Huang et.al. 2406.01476 link
2024-06-04 Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation Enhui Ma et.al. 2406.01349 null
2024-06-03 UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation Xiang Wang et.al. 2406.01188 null
2024-06-03 ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation Shaoshu Yang et.al. 2406.00908 link
2024-06-02 EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing Hadrien Reynaud et.al. 2406.00808 link
2024-05-31 4Diffusion: Multi-view Video Diffusion Model for 4D Generation Haiyu Zhang et.al. 2405.20674 null
2024-05-30 Improving the Training of Rectified Flows Sangyun Lee et.al. 2405.20320 link
2024-05-30 CV-VAE: A Compatible Video VAE for Latent Generative Video Models Sijie Zhao et.al. 2405.20279 link
2024-06-02 MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model Muyao Niu et.al. 2405.20222 link
2024-05-30 Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion Jiangkai Wu et.al. 2405.20032 link
2024-05-30 DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark Haoxing Chen et.al. 2405.19707 link
2024-05-29 EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture Jiaqi Xu et.al. 2405.18991 link
2024-05-29 T2V-Turbo: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback Jiachen Li et.al. 2405.18750 link
2024-05-28 Phased Consistency Model Fu-Yun Wang et.al. 2405.18407 link
2024-05-28 RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives Jaehong Yoon et.al. 2405.18406 link
2024-05-28 VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers Jun Zheng et.al. 2405.18326 null
2024-05-28 EG4D: Explicit Generation of 4D Object without Score Distillation Qi Sun et.al. 2405.18132 link
2024-05-28 MAVIN: Multi-Action Video Generation with Diffusion Models via Transition Video Infilling Bowen Zhang et.al. 2405.18003 link
2024-05-28 Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation Akio Hayakawa et.al. 2405.17842 link
2024-05-27 RefDrop: Controllable Consistency in Image or Video Generation via Reference Feature Guidance Jiaojiao Fan et.al. 2405.17661 null
2024-05-27 ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance Jiannan Huang et.al. 2405.17532 link
2024-05-27 Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control Zhengfei Kuang et.al. 2405.17414 null
2024-05-27 Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer Ruizhi Shao et.al. 2405.17405 null
2024-05-27 Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability Shenyuan Gao et.al. 2405.17398 link
2024-05-28 Controllable Longer Image Animation with Diffusion Models Qiang Wang et.al. 2405.17306 null
2024-05-27 Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation Zhoujie Fu et.al. 2405.16849 null
2024-05-27 Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels Yikai Wang et.al. 2405.16822 null
2024-05-26 Towards Multi-Task Multi-Modal Models: A Video Generative Perspective Lijun Yu et.al. 2405.16728 null
2024-05-28 Disentangling Foreground and Background Motion for Enhanced Realism in Human Video Generation Jinlin Liu et.al. 2405.16393 null
2024-05-25 Video Prediction Models as General Visual Encoders James Maier et.al. 2405.16382 null
2024-05-24 Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation Shentong Mo et.al. 2405.15881 null
2024-05-24 A Misleading Gallery of Fluid Motion by Generative Artificial Intelligence Ali Kashefi et.al. 2405.15406 link
2024-05-24 iVideoGPT: Interactive VideoGPTs are Scalable World Models Jialong Wu et.al. 2405.15223 link
2024-05-23 Video Diffusion Models are Training-free Motion Interpreter and Controller Zeqi Xiao et.al. 2405.14864 null
2024-05-24 Fisher Flow Matching for Generative Modeling over Discrete Data Oscar Davis et.al. 2405.14664 null
2024-05-24 PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control Yong Zhong et.al. 2405.14582 null
2024-05-23 MagicDrive3D: Controllable 3D Generation for Any-View Rendering in Street Scenes Ruiyuan Gao et.al. 2405.14475 null
2024-05-22 ReVideo: Remake a Video with Motion and Content Control Chong Mou et.al. 2405.13865 null
2024-05-22 MotionCraft: Physics-based Zero-Shot Video Generation Luca Savant Aira et.al. 2405.13557 link
2024-05-22 Enhanced Creativity and Ideation through Stable Video Synthesis Elijah Miller et.al. 2405.13357 null
2024-05-21 CamViG: Camera Aware Image-to-Video Generation with Multimodal Transformers Andrew Marmon et.al. 2405.13195 null
2024-05-21 OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models Zhaojian Yu et.al. 2405.12843 link
2024-05-21 DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control Hong Chen et.al. 2405.12796 null
2024-05-19 FIFO-Diffusion: Generating Infinite Videos from Text without Training Jihwan Kim et.al. 2405.11473 link
2024-05-17 From Sora What We Can See: A Survey of Text-to-Video Generation Rui Sun et.al. 2405.10674 link
2024-05-15 Dance Any Beat: Blending Beats with Visuals in Dance Video Generation Xuanchen Wang et.al. 2405.09266 null
2024-05-13 The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective Andrew Shin et.al. 2405.08720 null
2024-05-10 OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation Jinwei Lin et.al. 2405.06547 link
2024-05-08 Reviewing Intelligent Cinematography: AI research for camera-based video production Adrian Azzarelli et.al. 2405.05039 null
2024-05-15 TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation Hritik Bansal et.al. 2405.04682 link
2024-05-07 Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation Dogucan Yaman et.al. 2405.04327 null
2024-05-07 Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models Fan Bao et.al. 2405.04233 null
2024-05-07 Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models Zhixuan Chu et.al. 2405.04180 link
2024-05-07 Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method Peisong He et.al. 2405.04133 null
2024-05-06 Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond Zheng Zhu et.al. 2405.03520 link
2024-05-06 Video Diffusion Models: A Survey Andrew Melnik et.al. 2405.03150 link
2024-05-10 Matten: Video Generation with Mamba-Attention Yu Gao et.al. 2405.03025 null
2024-05-02 StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Yupeng Zhou et.al. 2405.01434 link
2024-05-05 VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization Yuliang Liu et.al. 2404.19652 link
2024-04-30 Bridge to Non-Barrier Communication: Gloss-Prompted Fine-grained Cued Speech Gesture Generation with Diffusion Model Wentao Lei et.al. 2404.19277 null
2024-04-29 FlexiFilm: Long Video Generation with Flexible Conditions Yichen Ouyang et.al. 2404.18620 link
2024-04-25 Synthesizing Audio from Silent Video using Sequence to Sequence Modeling Hugo Garrido-Lestache Belinchon et.al. 2404.17608 link
2024-04-25 TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models Haomiao Ni et.al. 2404.16306 link
2024-04-26 Semantically consistent Video-to-Audio Generation using Multimodal Language Large Model Gehui Chen et.al. 2404.16305 null
2024-04-24 Beyond Deepfake Images: Detecting AI-Generated Videos Danial Samadi Vahdati et.al. 2404.15955 null
2024-05-01 MotionMaster: Training-free Camera Motion Transfer For Video Generation Teng Hu et.al. 2404.15789 null
2024-04-23 ID-Animator: Zero-Shot Identity-Preserving Human Video Generation Xuanhua He et.al. 2404.15275 link
2024-04-22 TAVGBench: Benchmarking Text to Audible-Video Generation Yuxin Mao et.al. 2404.14381 link
2024-04-23 Accelerating Image Generation with Sub-path Linear Approximation Model Chen Xu et.al. 2404.13903 null
2024-04-27 Exploring AIGC Video Quality: A Focus on Visual Harmony, Video-Text Consistency and Domain Distribution Gap Bowen Qu et.al. 2404.13573 link
2024-04-21 Motion-aware Latent Diffusion Models for Video Frame Interpolation Zhilin Huang et.al. 2404.13534 null
2024-04-20 Music Consistency Models Zhengcong Fei et.al. 2404.13358 null
2024-04-19 PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation Tianyuan Zhang et.al. 2404.13026 null
2024-04-19 ConCLVD: Controllable Chinese Landscape Video Generation via Diffusion Model Dingming Liu et.al. 2404.12903 null
2024-04-18 On the Content Bias in Fréchet Video Distance Songwei Ge et.al. 2404.12391 null
2024-04-18 RoboDreamer: Learning Compositional World Models for Robot Imagination Siyuan Zhou et.al. 2404.12377 null
2024-04-18 AniClipart: Clipart Animation with Text-to-Video Priors Ronghuan Wu et.al. 2404.12347 null
2024-04-15 Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model Han Lin et.al. 2404.09967 null
2024-04-16 LoopAnimate: Loopable Salient Object Animation Fanyi Wang et.al. 2404.09172 null
2024-04-13 THQA: A Perceptual Quality Assessment Database for Talking Heads Yingjie Zhou et.al. 2404.09003 link
2024-04-16 LoopGaussian: Creating 3D Cinemagraph with Multi-view Images via Eulerian Motion Field Jiyang Li et.al. 2404.08966 link
2024-04-10 A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos Suleyman Ozdel et.al. 2404.07351 null
2024-04-08 Action-conditioned video data improves predictability Meenakshi Sarkar et.al. 2404.05439 null
2024-04-07 MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators Shenghai Yuan et.al. 2404.05014 link
2024-04-07 AnimateZoo: Zero-shot Video Generation of Cross-Species Animation via Subject Alignment Yuanfeng Xu et.al. 2404.04946 null
2024-04-02 CameraCtrl: Enabling Camera Control for Text-to-Video Generation Hao He et.al. 2404.02101 link
2024-04-02 Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model Xu He et.al. 2404.01862 link
2024-03-28 A Review of Multi-Modal Large Language and Vision Models Kilian Carolan et.al. 2404.01322 null
2024-04-01 Evaluating Text-to-Visual Generation with Image-to-Text Generation Zhiqiu Lin et.al. 2404.01291 link
2024-03-30 Grid Diffusion Models for Text-to-Video Generation Taegyeong Lee et.al. 2404.00234 null
2024-03-29 Motion Inversion for Video Customization Luozhou Wang et.al. 2403.20193 null
2024-03-28 Frame by Familiar Frame: Understanding Replication in Video Diffusion Models Aimon Rahman et.al. 2403.19593 null
2024-03-26 Tutorial on Diffusion Models for Imaging and Vision Stanley H. Chan et.al. 2403.18103 null
2024-03-26 TC4D: Trajectory-Conditioned Text-to-4D Generation Sherwin Bahmani et.al. 2403.17920 null
2024-03-26 Annotated Biomedical Video Generation using Denoising Diffusion Probabilistic Models and Flow Fields Rüveyda Yilmaz et.al. 2403.17808 link
2024-03-25 TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models Zhongwei Zhang et.al. 2403.17005 null
2024-03-25 A Survey on Long Video Generation: Challenges, Methods, and Prospects Chengxuan Li et.al. 2403.16407 null
2024-03-24 Opportunities and challenges in the application of large artificial intelligence models in radiology Liangrui Pan et.al. 2403.16112 null
2024-03-23 Adaptive Super Resolution For One-Shot Talking-Head Generation Luchuan Song et.al. 2403.15944 link
2024-03-22 Spectral Motion Alignment for Video Motion Transfer using Diffusion Models Geon Yeong Park et.al. 2403.15249 null
2024-03-21 StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text Roberto Henschel et.al. 2403.14773 link
2024-03-21 Explorative Inbetweening of Time and Space Haiwen Feng et.al. 2403.14611 null
2024-03-22 AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks Max Ku et.al. 2403.14468 link
2024-03-21 Enabling Visual Composition and Animation in Unsupervised Video Generation Aram Davtyan et.al. 2403.14368 null
2024-03-21 StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN Jongwoo Choi et.al. 2403.14186 link
2024-03-21 Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition Sihyun Yu et.al. 2403.14148 null
2024-03-20 Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation Fu-Yun Wang et.al. 2403.13745 link
2024-03-22 S2DM: Sector-Shaped Diffusion Models for Video Generation Haoran Lang et.al. 2403.13408 null
2024-03-22 Mora: Enabling Generalist Video Generation via A Multi-Agent Framework Zhengqing Yuan et.al. 2403.13248 link
2024-03-19 AnimateDiff-Lightning: Cross-Model Diffusion Distillation Shanchuan Lin et.al. 2403.12706 null
2024-03-18 CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility Bojia Zi et.al. 2403.12035 link
2024-03-18 VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model Qi Zuo et.al. 2403.12010 null
2024-03-19 Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment Tengchuan Kou et.al. 2403.11956 link
2024-03-18 Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing Juan Zhang et.al. 2403.11700 null
2024-03-17 Endora: Video Generation Models as Endoscopy Simulators Chenxin Li et.al. 2403.11050 null
2024-03-15 DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers Xuanlei Zhao et.al. 2403.10266 link
2024-03-15 Animate Your Motion: Turning Still Images into Dynamic Videos Mingxiao Li et.al. 2403.10179 null
2024-03-14 Video Editing via Factorized Diffusion Distillation Uriel Singer et.al. 2403.09334 null
2024-03-17 Intention-driven Ego-to-Exo Video Generation Hongchen Luo et.al. 2403.09194 null
2024-03-13 VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis Enric Corona et.al. 2403.08764 null
2024-03-13 Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts Yue Ma et.al. 2403.08268 link
2024-03-12 AesopAgent: Agent-driven Evolutionary System on Story-to-Video Production Jiuniu Wang et.al. 2403.07952 null
2024-03-10 WorldGPT: A Sora-Inspired Video AI Agent as Rich World Models from Text and Image Inputs Deshun Yang et.al. 2403.07944 null
2024-03-12 SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces Yuta Oshima et.al. 2403.07711 link
2024-03-15 DragAnything: Motion Control for Anything using Entity Representation Weijia Wu et.al. 2403.07420 link
2024-03-11 DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation Guosheng Zhao et.al. 2403.06845 null
2024-03-11 A Comparative Study of Perceptual Quality Metrics for Audio-driven Talking Head Videos Weixia Zhang et.al. 2403.06421 link
2024-03-11 Video Generation with Consistency Tuning Chaoyi Wang et.al. 2403.06356 null
2024-03-10 FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing Youyuan Zhang et.al. 2403.06269 null
2024-03-10 BlazeBVD: Make Scale-Time Equalization Great Again for Blind Video Deflickering Xinmin Qiu et.al. 2403.06243 null
2024-03-10 VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models Wenhao Wang et.al. 2403.06098 link
2024-03-08 VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models Yabo Zhang et.al. 2403.05438 link
2024-03-08 Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation Joseph Cho et.al. 2403.05131 null
2024-03-07 A spatiotemporal style transfer algorithm for dynamic visual stimulus generation Antonino Greco et.al. 2403.04940 null
2024-03-08 Pix2Gif: Motion-Guided Diffusion for GIF Generation Hitesh Kandala et.al. 2403.04634 link
2024-03-05 Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation Weijie Li et.al. 2403.02827 null
2024-03-06 UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control Xuweiyi Chen et.al. 2403.02332 link
2024-03-05 AtomoVideo: High Fidelity Image-to-Video Generation Litong Gong et.al. 2403.01800 null
2024-03-02 SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code Ziniu Hu et.al. 2403.01248 null
2024-03-01 Abductive Ego-View Accident Video Understanding for Safe Driving Perception Jianwu Fang et.al. 2403.00436 null
2024-02-29 Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Tsai-Shien Chen et.al. 2402.19479 null
2024-02-28 Context-aware Talking Face Video Generation Meidai Xuanyuan et.al. 2402.18092 null
2024-02-27 EMO: Emote Portrait Alive – Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Linrui Tian et.al. 2402.17485 null
2024-02-27 Sora Generates Videos with Stunning Geometrical Consistency Xuanyi Li et.al. 2402.17403 null
2024-02-28 Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Yixin Liu et.al. 2402.17177 link
2024-02-27 Video as the New Language for Real-World Decision Making Sherry Yang et.al. 2402.17139 null
2024-02-22 Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis Willi Menapace et.al. 2402.14797 null
2024-02-22 Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models Yixuan Ren et.al. 2402.14780 null
2024-02-21 Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation Kihong Kim et.al. 2402.13729 null
2024-02-24 UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing Jianhong Bai et.al. 2402.13185 null
2024-02-20 Neural Network Diffusion Kai Wang et.al. 2402.13144 link
2024-02-20 VGMShield: Mitigating Misuse of Video Generative Models Yan Pang et.al. 2402.13126 link
2024-02-19 Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same Sungjun Ahn et.al. 2402.12412 null
2024-02-16 Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation Lanqing Guo et.al. 2402.10491 link
2024-02-14 Magic-Me: Identity-Specific Video Customized Diffusion Ze Ma et.al. 2402.09368 link
2024-02-10 Denoising Diffusion Probabilistic Models in Six Simple Steps Richard E. Turner et.al. 2402.04384 null