mesh

Table of Contents

1. OTHERS

1.1. DIFFERENT PRIMITIVE

1.1.1. FLEXIDREAMER

  • FlexiDreamer: Single Image-to-3D Generation with FlexiCubes
    • leveraging a flexible gradient-based extraction known as FlexiCubes
    • direct acquisition of the target mesh
    • about 1 minute; first extracts the mesh, then textures it

1.2. MESH HUMAN FEEDBACK

2. CAD

  • BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry
    • structured latent geometry in a hierarchical tree, representing a whole CAD solid
      • generating complicated geometry

3. FROM SOME SOURCE

  • ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections
    • uses 2D diffusion as a prior
  • PIXEL ALIGNMENT

3.1. SKETCH

  • Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation
    • converts both sketches and CAD drawings into meshes
    • without requiring any paired datasets during training
  • Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation (sketch control to 3D generation)
  • 3Doodle: Compact Abstraction of Objects with 3D Strokes
    • abstract sketches containing 3D characteristic shapes of the objects
  • Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation
    • generating realistic 3D Gaussians consistent with the color style described in the textual description

3.2. SIMPLIFICATION

  • Point-Cloud Completion with Pretrained Text-to-image Diffusion Models
    • from an incomplete point cloud to a 3D model
  • Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives
    • turns scenes into big blocks; interpretable, easy to manipulate, and suited for physics-based simulations
  • FlexiCubes: Flexible Isosurface Extraction for Gradient-Based Mesh Optimization
    • hierarchically-adaptive meshes, local flexible adjustments
  • ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance
    • learning to combine multiple models
    • adjust their sizes, rotation angles, and locations to create a 3D asset that matches the given image
  • GetMesh: A Controllable Model for High-quality Mesh Generation and Manipulation
    • mesh autoencoder; points are re-organized into a triplane representation by projecting them onto the triplane
    • combining mesh parts across categories, adding/removing mesh parts

3.2.1. LIFTING

  • 3D-LFM: Lifting Foundation Model =best=
    • lifting of 3D structure and camera from 2D landmarks
  • 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D
    • utilize multi-view 3D reconstruction to fuse generated MV images into consistent 3D objects

3.3. IMAGE TO 3D

  • SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections (landscapes, scenarios)
  • Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation
    • image + text + shape prior = mesh
  • One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization
    • uses SD to enforce consistency of the diffusion generation
    • Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors =best=
      • uses both priors in both stages; coarse geometry first, then details
  • DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model (generates nerf)
  • ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation =best=
  • Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior
    • subject-specific and multi-modal diffusion model; gets NeRF
    • enhances texture from coarse
  • DREAMDISTRIBUTION
  • HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D
  • ZeroShape: Regression-based Zero-shot Shape Reconstruction (just the shape)
    • trained to directly regress the object shape; computationally efficient
  • FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
    • extract 3D geometric features from the 2D input
    • incorporating attention to fuse images from different viewpoints
  • GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
    • encodes the image, depth, and normal into latent space, fed into U-Net to generate
  • T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image
    • coarse-to-fine to refine mesh details
  • Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion
    • refines coarse generative results, high consistency

3.3.1. FAST

  • Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  • TGS: Triplane meets Gaussian Splatting =best=
    • fast single-view 3d reconstruction with transformers
3.3.1.1. LRM
  • LRM: Large Reconstruction Model for Single Image to 3D (5 seconds)
    • learn to directly predict NeRF from image
  • OpenLRM: Open-Source Large Reconstruction Models
    • Image-to-3D in 10+ seconds!

3.3.2. ZERO-1-TO-3

  • Zero-1-to-3: Zero-shot One Image to 3D Object (Zero123Plus is the follow-up model; the relative-pose conditioning is sketched after this list)
    • DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior =best=
      • image to guide the geometry sculpting and texture boosting
      • imbuing it with 3D knowledge of the scene being optimized = view-consistent guidance for the scene
    • Wonder3D: Single Image to 3D using Cross-Domain Diffusion =best=
      • geometry-aware normal fusion algorithm that merges the generated 2D representations
      • PERFLOW
    • iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views
      • repurposing Zero123 for camera pose estimation
  • ViewFusion: Towards Multi-View Consistency via Interpolated Denoising
    • auto-regressive method, previously generated views as context for the next view generation
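
A minimal sketch of Zero-1-to-3-style relative camera conditioning (assumption: following the public implementation, which appends [Δelevation, sin(Δazimuth), cos(Δazimuth), Δradius] to the CLIP image embedding; exact dimensions may differ):

    import math
    import torch
    import torch.nn as nn

    class RelativePoseConditioner(nn.Module):
        def __init__(self, clip_dim: int = 768):
            super().__init__()
            # project [CLIP image embedding | 4 pose scalars] back to CLIP width
            self.proj = nn.Linear(clip_dim + 4, clip_dim)

        def forward(self, clip_emb, d_elev, d_azim, d_radius):
            # angles in radians, radius relative; all shape (B,)
            pose = torch.stack(
                [d_elev, torch.sin(d_azim), torch.cos(d_azim), d_radius], dim=-1)
            return self.proj(torch.cat([clip_emb, pose], dim=-1))

    cond = RelativePoseConditioner()
    emb = cond(torch.randn(2, 768),                   # CLIP embedding of input view
               torch.tensor([0.1, -0.2]),             # Δelevation
               torch.tensor([math.pi/4, math.pi/2]),  # Δazimuth
               torch.tensor([0.0, 0.1]))              # Δradius
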
3.3.2.1. ISOTROPIC3D
  • Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding
    • fine-tune a text-to-3D diffusion model by substituting its text encoder with an image encoder
    • a single image CLIP embedding to generate multi-view images
3.3.2.2. HYPERDREAMER
  • HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image
    • interactively select regions via clicks and edit the texture with text guidance
    • modeling region-aware materials with high-resolution textures and enabling user-friendly editing
    • guidance: semantic segmentation and data-driven priors (albedo, roughness, specular properties)
3.3.2.3. ESCHERNET
  • EscherNet: A Generative Model for Scalable View Synthesis
    • self-attention over M target views ensures target consistency; more consistent across frames
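
A minimal sketch of the joint cross-view attention idea (not EscherNet's exact architecture): tokens of all M target views are fused into one sequence, so every view attends to every other view's tokens:

    import torch
    import torch.nn as nn

    class CrossViewSelfAttention(nn.Module):
        def __init__(self, dim=320, heads=8):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

        def forward(self, x):
            # x: (B, M, L, D) latent tokens of M target views
            B, M, L, D = x.shape
            tokens = x.reshape(B, M * L, D)   # one sequence spanning all views
            out, _ = self.attn(tokens, tokens, tokens)
            return out.reshape(B, M, L, D)

    y = CrossViewSelfAttention()(torch.randn(1, 4, 64, 320))  # 4 views, 64 tokens each
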
3.3.2.4. TRIPOSR
  • TripoSR: Fast 3D Object Reconstruction from a Single Image
    • capable of creating high-quality outputs in less than a second
    • USDZ format
3.3.2.5. FDGAUSSIAN
  • FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model
    • extract 3D geometric features from the 2D input
3.3.2.6. MVD-FUSION
  • MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation
    • output to multiple 3D representations such as point clouds, textured meshes, and Gaussian splats

4. MESH EDITION

  • SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds
    • 3D editing directly by a feed-forward network, one second per edit
  • Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting
    • visibility-aware adaptive repainting (view-consistent)
  • IMAGE SCULPTING =best=
  • STABLEIDENTITY: inserting identity
  • threefiner: an interface for text-guided mesh refinement

4.1. INPAINTING

  • SC-Diff: 3D Shape Completion with Latent Diffusion Models
    • realistic shape completions at superior resolutions

4.2. MESH TO MESH

  • Generic 3D Diffusion Adapter Using Controlled Multi-View Editing
    • MVEdit as 3D counterpart of SDEdit
    • 3D Adapter lifts 2D; also for texture generation

4.3. MOUSE EDITING

  • MagicClay: Sculpting Meshes With Generative Neural Fields
    • maintains consistency between both a mesh and a Signed Distance Field (SDF) representation

4.3.1. DRAGTEX

  • DragTex: Generative Point-Based Texture Editing on 3D Mesh
    • dragging the texture
    • diffusion model to blend locally inconsistent textures between different views
      • enabling locally consistent texture editing

5. MULTIVIEW DIFFUSION

  • 1.1.1
  • MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion, panorama
    • MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single to Sparse-view 3D Object Reconstruction
      • 2D latent features learn 3D consistency
  • Zero123++: a Single Image to Consistent Multi-view Diffusion
  • ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
    • single-image novel view synthesis; multi-object scenes with complex backgrounds, indoor and outdoor
    • camera conditioning parameterization and normalization scheme
  • LIFTING 3.3.2
  • HOLODIFFUSION: Training a 3D Diffusion Model using 2D Images

5.1. FROM VIDEO

  • IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
    • uses video diffusion instead, reducing the number of evaluations of the 2D generator network by 10-100×
  • V3D: Video Diffusion Models are Effective 3D Generators
    • given a single image

5.2. FROM SD

  • CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models
    • start from sd checkpoints
    • incorporates 3D novel view synthesis into the object customization process =best=
    • with flexible background control
  • DreamComposer: Controllable 3D Object Generation via Multi-View Conditions =best=
    • view-aware 3D lifting module to obtain 3D representations, which are injected into the 2D model
  • 7.3.1.1

5.2.1. MVDREAM

  • MVDream: Multi-view Diffusion for 3D Generation =best one=
    • generates a NeRF without the Janus problem, from normal 2D diffusion
    • sd + multi-view dataset rendered from 3D assets
    • 2D diffusion + consistency of 3D data
    • SPAD: Spatially Aware Multiview Diffusers
      • utilize Plücker coordinates derived from camera rays and inject them as positional encoding (sketched after this list)
      • resolves the Janus issue
      • leverage the multi-view Score Distillation Sampling (SDS) for 3D asset generation
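
A minimal sketch of the Plücker ray encoding SPAD injects (assumption: the standard parameterization (d, o × d), giving a 6-channel per-pixel positional encoding):

    import torch
    import torch.nn.functional as F

    def plucker_embedding(origins, directions):
        # origins, directions: (..., 3) per-pixel rays; returns (..., 6)
        d = F.normalize(directions, dim=-1)
        moment = torch.cross(origins, d, dim=-1)  # o x d, constant along the ray
        return torch.cat([d, moment], dim=-1)

    o = torch.zeros(2, 64, 64, 3)   # camera centers broadcast per pixel
    d = torch.randn(2, 64, 64, 3)   # per-pixel ray directions
    enc = plucker_embedding(o, d)   # (2, 64, 64, 6)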

5.3. NOVEL VIEW

  • GeNVS: Generative Novel View Synthesis with 3D-Aware Diffusion Models (1 image to 3d video)
    • MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
      • generate new consistent perspective view
  • SyncDreamer: Generating Multiview-consistent Images from a Single-view Image
    • multiview diffusion model that models the joint probability distribution of multiview images
    • 3D-aware feature attention
  • Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
    • Multi-view Reconstruction Consistency (MRC) metric
  • POSE - POSITION

5.3.1. GAUSSIAN NOVEL VIEW

  • MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
    • via plane sweeping in 3D space; geometric cues for estimating depth; 10× fewer parameters, 2× faster inference
  • SNAP-IT

5.3.2. DREAMDISTRIBUTION

  • DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models =best=
    • finds a prompt distribution from reference images, then uses it to generate new 2D/3D instances
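
A minimal sketch of the prompt-distribution idea (assumption: a diagonal Gaussian over learnable prompt token embeddings, sampled via the reparameterization trick; the paper's exact parameterization may differ):

    import torch
    import torch.nn as nn

    class PromptDistribution(nn.Module):
        def __init__(self, n_tokens=8, dim=768):
            super().__init__()
            self.mu = nn.Parameter(torch.randn(n_tokens, dim) * 0.02)
            self.log_sigma = nn.Parameter(torch.zeros(n_tokens, dim))

        def sample(self, batch_size):
            # reparameterized draw keeps mu/log_sigma trainable
            eps = torch.randn(batch_size, *self.mu.shape)
            return self.mu + self.log_sigma.exp() * eps  # (B, n_tokens, dim)

    prompts = PromptDistribution().sample(4)  # feed to the text-conditioning path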

5.3.3. 3D-AWARE IMAGE EDITING

  • Reference-Based 3D-Aware Image Editing with Triplane
    • leveraging 3D-aware triplanes, edits are versatile, allowing rendering from various viewpoints

6. TEXTURES

  • TEXTURE
  • Learning Disentangled Avatars with Hybrid 3D Representations
    • hair, face, body and clothing can be learned then disentangled yet jointly rendered
  • Paint Anything 3D with Lighting-Less Texture Diffusion Models (Tencent) =best=
    • lighting-less textures; paints an image over the input model, generates texture from text

6.1. SCENE TEXTURES

  • SCENE SYNTHESIS
  • Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models (textures)
  • DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (vr)
    • coarse-to-fine panoramic texture generation that considers both geometry and texture cues
      • imagines a panorama, then propagates it with inpainting

6.2. MATERIAL

  • ControlMat: A Controlled Generative Approach to Material Capture; diffusion
    • uncontrolled illumination as input; gets a variety of materials that could correspond to the input image
  • TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
    • synthesize textures for given 3D geometries
    • aggregates the different denoising predictions on a shared latent texture map (see the sketch after this list)
  • Alchemist: Parametric Control of Material Properties with Diffusion Models
    • control material attributes of objects like roughness, metallic, albedo, and transparency in real images
    • edit material properties in real-world images while preserving all other attributes
  • TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
    • relightable textures from a small number of input images to target 3D shapes
  • UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures
    • removes lighting effects so the avatar can then be rendered under various lighting conditions
  • Holo-Gen: by Unity, generate physically-based rendering (PBR) material properties for 3D objects
  • FlashTex: Fast Relightable Mesh Texturing with LightControlNet
    • texturing an input 3D mesh with prompt, high-quality and relightable textures
    • based on the ControlNet architecture
  • 3DStyle-Diffusion: Pursuing Fine-grained Text-driven 3D Stylization with 2D Diffusion Models =best=
    • given input mesh, use 2d diffusion to generate coherent and relightable textures exploiting depth maps
  • 4.2
  • SyncTweedies: A General Generative Framework Based on Synchronized Diffusions
    • denoising in multiple instance spaces
    • 3D Mesh Texturing, gaussian splat texturing, depth to panorama
  • MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment
    • sd as a prior to texture a 3D model
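
A minimal sketch of aggregating per-view denoising predictions on a shared latent texture map, the idea behind TexFusion and SyncTweedies (assumptions: per-pixel texel indices uv_idx and visibility masks come from rasterization; the papers' sequential schemes are more involved):

    import torch

    def aggregate_to_texture(view_latents, uv_idx, mask, tex_size=64, dim=4):
        # view_latents: (V, H*W, C); uv_idx: (V, H*W) flat texel index (long);
        # mask: (V, H*W) bool visibility
        tex = torch.zeros(tex_size * tex_size, dim)
        weight = torch.zeros(tex_size * tex_size, 1)
        for lat, idx, m in zip(view_latents, uv_idx, mask):
            idx, lat = idx[m], lat[m]
            tex.index_add_(0, idx, lat)                  # scatter predictions
            weight.index_add_(0, idx, torch.ones(len(idx), 1))
        return (tex / weight.clamp(min=1)).reshape(tex_size, tex_size, dim)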

6.2.1. FURTHER ENHANCEMENT

  • Enhancing Texture Generation with High-Fidelity Using Advanced Texture Priors
    • rough texture as the initial input, texture enhancement, eliminating noise and "point gaps"

6.2.2. HAIR

  • Neural Haircut: Prior-Guided Strand-Based Hair Reconstruction, realism hair modeling, personalization
  • CT2Hair: High-Fidelity 3D Hair Modeling using Computed Tomography
    • real-world hair wigs as input, populates dense strands
  • HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles
    • latent diffusion model that operates in a common hairstyle UV space

6.2.3. CLOTH

  • parent: diffusion
  • DANCING
  • Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models
    • clothed-human mesh diffusion with text guidance
  • TryOnDiffusion: A Tale of Two UNets (dress one with clothes from another, pose-body change)
  • DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion
  • X-MDPT: Cross-view Masked Diffusion Transformers for Person Image Synthesis
    • changes the pose, keeps the clothes; employs masked diffusion transformers on latent patches
  • OOTDiffusion: A highly controllable open source tool for virtual clothing try-on =best one=
  • GALA: Generating Animatable Layered Assets from a Single Scan
    • from one scan, gets all the clothes as independent, separated, segmented parts
  • DressCode: Autoregressively Sewing and Generating Garments from Text Guidance
    • generate sewing patterns with text guidance, also for editing
6.2.3.1. GEN MESH
  • Garment3DGen: 3D Garment Stylization and Texture Generation
    • synthesize 3D garment assets from a base mesh given a single input image as guidance
  • Design2Cloth: 3D Cloth Generation from 2D Masks
    • synthesizes decomposed 3D meshes from a single image

6.3. FACE

6.3.1. AVATAR

  • TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
    • diffusion model, imagines it, outputs geometry and texture
  • Relightable and Animatable Neural Avatar from Sparse-View Video
    • relightable neural avatars from monocular inputs; pose AND surface intersection, light visibility
  • SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance
    • geometry is constrained to human-body prior; high-quality meshes and textures
  • AlteredAvatar: Stylizing Dynamic 3D Avatars with Fast Style Adaptation
  • GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning
    • SDF-based implicit mesh learning; primitive-based 3D Gaussian representation to facilitate animation
  • Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute
    • intermediate codebook features
  • AniDress: Animatable Loose-Dressed Avatar from Sparse Views Using Garment Rigging Model
    • generating animatable human avatars in loose clothes from sparse multi-view videos
  • M3Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing
    • multi-modal framework for face generation and editing
  • One2Avatar: Generative Implicit Head Avatar For Few-shot User Adaptation =best=
    • 3D animatable photo-realistic head avatar as prior, personalizable with few shot (one image)
  • MAGICMIRROR: style and subject transformations =best=
6.3.1.1. EXPRESSION
  • Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis
    • to generate a talking portrait video, estimating expression and head pose
  • Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
    • annotated emotional and style labels, auto-encoder decoupling expressions and identities
  • Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters =best=
    • automatically animate virtual human faces
  1. AUDIO TO VIDEO
    • EMO: Emote Portrait Alive =best=
      • audio to video (directly)
    • AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
      • driven by audio and a reference portrait image

6.3.2. FACE DIFFUSION

  • parent: diffusion
  • Relightify: Relightable 3D Faces from a Single Image via Diffusion Models
    • creates from a photo a set of textures (BRDF) to generate a realistic 3D face
  • SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
    • mitigates identity leakage by masking facial regions and utilizing disentangled identity and non-identity features

7. MESH DIFFUSION

7.1. TEXT-TO-3D

  • Text-to-3D with classifier score distillation =best=
    • guidance alone is enough for effective text-to-3D generation (the underlying SDS objective is sketched after this list)
  • StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D =best quality= (Gaussians)
    • image-space diffusion for geometric precision, latent-space diffusion for vivid color
    • high-fidelity(quality) 3D models
  • Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors =best=
    • both a 3D and a 2D diffusion process (priors), bidirectional guidance (20 minutes)
  • UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation
    • albedo-normal aligned multi-view diffusion (to enable relighting)
  • PolyDiff: Generating 3D Polygonal Meshes with Diffusion Models
    • operates natively on the polygonal mesh, trained to restore the original mesh structure
  • 3.3.1.1
  • RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
    • trained with extra image-to-depth and image-normal priors; maps diffused together
  • HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation (7 seconds)
  • AToM: Amortized Text-to-Mesh using 2D Diffusion
    • high-quality textured meshes, 1 second inference, 10 times reduction in training cost, unseen prompts
  • L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects
    • compose a desired object via trial-and-error within the 3D simulation environment
  • Instant Text-to-3D Mesh with PeRFlow-T2I + TripoSR
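
Most entries above build on Score Distillation Sampling (SDS) or a variant of it. A minimal sketch using diffusers-style components (assumptions: unet/scheduler are frozen Stable Diffusion parts, text_emb stacks [unconditional; text] embeddings, and the w(t) weighting is omitted):

    import torch

    def sds_grad(unet, scheduler, latents, text_emb, guidance_scale=100.0):
        # latents: (B, 4, 64, 64) encoding of the current differentiable render
        t = torch.randint(20, 980, (latents.shape[0],), device=latents.device)
        noise = torch.randn_like(latents)
        noisy = scheduler.add_noise(latents, noise, t)
        with torch.no_grad():
            eps_uncond, eps_text = unet(
                noisy.repeat(2, 1, 1, 1), t.repeat(2),
                encoder_hidden_states=text_emb).sample.chunk(2)
            eps = eps_uncond + guidance_scale * (eps_text - eps_uncond)
        # this residual is injected as the gradient on the rendered latents,
        # then backpropagated into the 3D representation's parameters
        return eps - noise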

7.1.1. VOLUME DIFFUSION

  • WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space
    • trained without direct supervision from multi-view or 3D data and doesn't require pose or camera distributions
      • the autoencoder captures the images' underlying 3D structure (unlike previous GAN approaches)
  • VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder
    • 3D latent representation, seconds to minutes

7.2. CHARACTER

  • AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose
    • text descriptions and pose guidance
    • uses an MLP NeRF
  • HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation
  • Fast Registration of Photorealistic Avatars for VR Facial Animation
  • Synthesizing Moving People with 3D Control

7.3. FROM 2D DIFFUSION

7.3.1. 3D PRIOR

  • GeoDream: Disentangling 2D and Geometric Priors for High-Fidelity and Consistent 3D Generation
    • multi-view diffusion serves as native 3D geometric priors
    • disentangling 2D and 3D priors allows us to refine 3D geometric priors further
  • DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors
    • diffusion guides the 3D generator finetuning with informative direction
  • CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
    • Convolutional Reconstruction Model (CRM), feed-forward single image-to-3D generative
    • the triplane exhibits spatial correspondence with six orthographic images (triplane lookup is sketched after this list)
  • Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior
    • modulate the output of the 2D diffusion model to the correct patterns of the template views
  • ISOTROPIC3D
  • Compress3D: a Compressed Latent Space for 3D Generation from a Single Image
    • encodes 3D models into a compact triplane latent space; 7 seconds diffusion
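
A minimal sketch of the triplane feature lookup these models share (assumption: features from the three axis-aligned planes are summed; some works average or concatenate instead):

    import torch
    import torch.nn.functional as F

    def sample_triplane(planes, pts):
        # planes: (3, C, R, R) feature planes; pts: (N, 3) in [-1, 1]^3
        coords = [pts[:, [0, 1]], pts[:, [0, 2]], pts[:, [1, 2]]]  # xy, xz, yz
        feats = 0
        for plane, uv in zip(planes, coords):
            grid = uv.view(1, -1, 1, 2)                   # (1, N, 1, 2)
            f = F.grid_sample(plane[None], grid, align_corners=True)
            feats = feats + f.view(plane.shape[0], -1).T  # (N, C)
        return feats  # decoded by a small MLP into density/color in practice

    feats = sample_triplane(torch.randn(3, 32, 64, 64), torch.rand(100, 3) * 2 - 1)
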
7.3.1.1. 3D ADDED TO 2D PRIOR
  • Retrieval-Augmented Score Distillation for Text-to-3D Generation =best=
    • adapt the diffusion model’s 2D prior toward view consistency
    • added controllability and negligible training cost
7.3.1.2. VIEWDIFF
  • ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models
    • autoregressive 3D-consistent images at any viewpoint
    • integrate 3D volume-rendering and cross-frame-attention layers into each block of sd
    • could the prior instead be an animation prior, since it puts attention across the batch?
      • 3D-consistent space

7.3.2. FAST

  • Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model =best, fast=
    • generates 4 views and then regresses a NeRF from them
    • Instant3D: Instant Text-to-3D Generation
  • One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
    • multi-view image generation, then to 3D using multi-view 3D native diffusion models
  • MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture (2 stages, within 20 minutes)

8. MESH NERF

  • parent: nerf
  • A Unified Framework for Surface Reconstruction; 3D from NeRF
  • Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures; latent nerf
    • AutoDecoding Latent 3D Diffusion Models, view-consistent appearance and geometry
  • SweetDreamer: lifts 2D diffusion outputs to 3D by aligning geometric priors, fixing geometry-related (multi-view inconsistency) issues
  • SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering
    • regularization term that encourages the gaussians to align well with the surface of the scene (a simplified version is sketched after this list)
    • and binds the gaussians to the mesh, which enables easy editing, sculpting, rigging, animating, compositing and relighting
  • NERF FROM TEXT LUCIDDREAMER
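
A simplified sketch in the spirit of SuGaR's regularization (loud caveat: this is not the paper's SDF-based term; it only encourages each gaussian to flatten into a disc that can hug a surface):

    import torch

    def flatness_loss(log_scales):
        # log_scales: (N, 3) anisotropic log-scales of the 3D gaussians
        thinnest = log_scales.exp().min(dim=-1).values
        return thinnest.mean()  # drive the thinnest axis toward zero

    loss = flatness_loss(torch.randn(10000, 3))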

Author: Tekakutli

Created: 2024-04-13 Sat 04:35