software

Table of Contents

1. IMAGE GENERATION

1.1. WORKFLOWS

1.1.1. KEYWORDS

1.2. SD GUIDES

1.2.1. LORA

1.3. SIDE TOOLS

1.3.1. FACE GENERATION - SWAP

1.4. HELPERS

  • DeepBump: generate normal & height maps from single pictures
  • ProPainter: object removal for videos using MetaCLIP and SAM
  • VCHITECT
  • supervision: ready-to-use people and object trackers (reusable computer vision tools), e.g. tracking employees in a zone

1.4.1. 2SKETCH, TO SKETCH

  • Anime2Sketch: generate sketch from anime image
  • Stylized Face Sketch Extraction via Generative Prior with Limited Data
    • generate a sketch from an image, using an example sketch image to set the style

1.4.2. STORYTELLING CAPTIONING

  • The Manga Whisperer: Automatically Generating Transcriptions for Comics (magi) storytelling, storyboard
    • generate a transcript: OCR, order panels, cluster characters

1.4.3. OPENPOSE EDITOR

1.4.5. DETECTORS (COMPUTER VISION)

1.4.6. ANTI GLOW

1.5. UIs

1.5.1. FRONT-ENDS

1.5.1.2. CODE
1.5.1.3. CPU
  • ggml: inference in pure C/C++ (interoperability, no Python dependency hell)
  • FastSD CPU: faster version of Stable Diffusion running on CPU
    • FastSD CPU beta 16 release with 2 steps fast inference
1.5.1.4. FASTER

1.6. TRAINING

1.6.1. FINETUNING

1.7. MODELS

1.7.1. CONTROLNET

1.8. COMFY

1.8.1. INSTALLATION SNIPPET

1.8.2. NEGATIVE LORAS

1.8.3. PROGRAMMATIC

1.8.3.1. PYTHON
1.8.3.2. CUSHY

1.8.4. COMFY NODES

1.8.4.1. AUDIO
1.8.4.2. REGIONAL EDITING
1.8.4.3. NATIVE OFFSET NOISE
  • Vectorscope CC: native Offset Noise (control over light, contrast, shadows)
1.8.4.4. UI MANAGER
1.8.4.6. OPTIMIZATION
  • comfy-todo: Token Downsampling for Efficient Generation of High-Resolution Images
1.8.4.7. 3D
1.8.4.8. STYLE
  1. TARGET STYLE-SUBJECT
    • StyleAligned: consistent style to all images in a batch
    • face swap
    • ComfyUI Portrait Master: generates prompts for skin color, expression, shape, light direction
1.8.4.10. IMAGE PROCESSING
  1. MASKING
    1. EDITING MASK
    2. IMPACT PACK
    3. PER-INSTANCE MASK
    4. GENERATE WITH TRANSPARENCY
  2. PREPROCESSORS
    1. LIGHTING
1.8.4.11. TEXT
  1. LLM
  2. TEXT ENCODERS
    1. PROMPT ENHANCE
    2. AUTO1111 TOKENS ON COMFY

      A1111’s token normalization and weighting in ComfyUI. This means you can reproduce on ComfyUI the same images generated with stable-diffusion-webui.

      1. LINK
    3. ADVANCED TOKEN WEIGHTS
1.8.4.12. IMAGE ENCODING
  1. IP ADAPTER
  2. COMFYUI INSTANTID
  3. VISION MODEL, IMAGE TO PROMPT
1.8.4.13. VIDEO
  1. DANCING
  2. MOTION
    1. LIP SYNC
  3. NOT JUST IMAGES

1.9. ONLINE SERVICES

1.9.1. THE HORDE

2. OTHERS VISUAL

  • PALLAIDIUM: Generative AI for the Blender VSE (Video Sequence Editor)
    • Text, video or image to video, image and audio
  • blender-stable-diffusion-render: addon for using Stable Diffusion to render texture bakes for objects

2.2. 3D

2.2.1. MESH GENERATION

  • threestudio: A unified framework for 3D content generation
    • ProlificDreamer, DreamFusion, Magic3D, SJC, Latent-NeRF, Fantasia3D, TextMesh, Zero-1-to-3, Magic123, InstructNeRF2NeRF, and Control4D are all implemented in this framework.
  • GSGEN: Text-to-3D using Gaussian Splatting
  • 3DTopia: two-stage text-to-3D generation model (generation within 5 minutes)

2.2.3. GAUSSIAN

3. TEXT

3.1. INFERENCE

3.1.1. UI

3.2. CODE

  • Open Interpreter, an open-source Code Interpreter
    • create and edit photos, summarize PDFs, control your browser, plot and analyze large datasets
  • DeepSeek Coder: several models

3.3. DATASET

4. VOICE GENERATION

4.1. REALTIME

Author: Tekakutli

Created: 2024-04-13 Sat 04:35