scene4D
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Paper
•
2503.10437
•
Published
•
33
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
Paper
•
2503.09642
•
Published
•
19
VGGT: Visual Geometry Grounded Transformer
Paper
•
2503.11651
•
Published
•
35
1000+ FPS 4D Gaussian Splatting for Dynamic Scene Rendering
Paper
•
2503.16422
•
Published
•
14
SynCity: Training-Free Generation of 3D Worlds
Paper
•
2503.16420
•
Published
•
27
M3: 3D-Spatial MultiModal Memory
Paper
•
2503.16413
•
Published
•
15
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
Paper
•
2503.18470
•
Published
•
3
Any6D: Model-free 6D Pose Estimation of Novel Objects
Paper
•
2503.18673
•
Published
•
3
FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement
Paper
•
2503.04919
•
Published
•
8
FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis
Paper
•
2503.13265
•
Published
•
15
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Paper
•
2503.20776
•
Published
•
10
Segment Any Motion in Videos
Paper
•
2503.22268
•
Published
•
19
Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
Paper
•
2503.20785
•
Published
•
22
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
Paper
•
2504.01956
•
Published
•
41
TAPIP3D: Tracking Any Point in Persistent 3D Geometry
Paper
•
2504.14717
•
Published
•
8
Towards Understanding Camera Motions in Any Video
Paper
•
2504.15376
•
Published
•
155
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence
Paper
•
2506.10600
•
Published
•
8
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams
Paper
•
2506.08862
•
Published
•
5
PlayerOne: Egocentric World Simulator
Paper
•
2506.09995
•
Published
•
34
π^3: Scalable Permutation-Equivariant Visual Geometry Learning
Paper
•
2507.13347
•
Published
•
65
SpatialTrackerV2: 3D Point Tracking Made Easy
Paper
•
2507.12462
•
Published
•
18
PhysX: Physical-Grounded 3D Asset Generation
Paper
•
2507.12465
•
Published
•
43
Streaming 4D Visual Geometry Transformer
Paper
•
2507.11539
•
Published
•
14
Yume: An Interactive World Generation Model
Paper
•
2507.17744
•
Published
•
88
Reconstructing 4D Spatial Intelligence: A Survey
Paper
•
2507.21045
•
Published
•
35
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels
Paper
•
2507.21809
•
Published
•
136
NeRF Is a Valuable Assistant for 3D Gaussian Splatting
Paper
•
2507.23374
•
Published
•
11
DreamScene: 3D Gaussian-based End-to-end Text-to-3D Scene Generation
Paper
•
2507.13985
•
Published
•
6
Matrix-3D: Omnidirectional Explorable 3D World Generation
Paper
•
2508.08086
•
Published
•
75
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation
Paper
•
2508.01126
•
Published
•
5
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration
Paper
•
2508.11379
•
Published
•
12
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer
Paper
•
2508.10893
•
Published
•
31
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting
Paper
•
2508.17811
•
Published
•
6
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper
•
2508.17437
•
Published
•
38
DA^2: Depth Anything in Any Direction
Paper
•
2509.26618
•
Published
•
25
TTT3R: 3D Reconstruction as Test-Time Training
Paper
•
2509.26645
•
Published
•
14
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
Paper
•
2510.23691
•
Published
•
53