Space: C4G-HKUST/AnyTalker

Running on Zero

C4G-HKUST committed 9 days ago

Commit d0973b6 (1 parent: e69c1a8)

Add free-tier user limitation note: Fast mode can generate ~6s two-person video max

Files changed (2)

README.md

@@ -212,11 +212,12 @@ python app.py
 #### Generation Modes
 The Gradio demo provides two generation modes:
-- **Fast Mode (up to 240s GPU budget)**:
-  - Fixed 12 denoising steps for quick generation
   - Suitable for single-person videos or quick previews
   - Lower GPU usage quota consumption
-  - The 240s is the maximum GPU allocation time (budget), not the actual generation time
 - **Quality Mode (up to 720s GPU budget)**:
   - Custom denoising steps (adjustable via "Diffusion steps" slider)
@@ -225,7 +226,7 @@ The Gradio demo provides two generation modes:
   - The 720s is the maximum GPU allocation time (budget), not the actual generation time
   - With 40 denoising steps, approximately 10 seconds of video can be generated
-**Design Rationale**: Multi-person videos generally have longer duration and require more computational resources. To achieve better quality, especially for complex multi-person interactions, more denoising steps and longer GPU allocation time are needed. The Quality Mode provides sufficient Usage Quota (up to 720 seconds) to accommodate these requirements, while the Fast Mode offers a quick preview option with fixed 12 steps for faster iteration. Note that the GPU duration values (240s/720s) represent the maximum budget allocated, not the actual generation time.

 #### Generation Modes
 The Gradio demo provides two generation modes:
+- **Fast Mode (up to 210s GPU budget)**:
+  - Fixed 10 denoising steps for quick generation
   - Suitable for single-person videos or quick previews
   - Lower GPU usage quota consumption
+  - The 210s is the maximum GPU allocation time (budget), not the actual generation time
+  - **For free-tier users: Fast mode can generate approximately 6 seconds of two-person video at most; longer videos may timeout.**
 - **Quality Mode (up to 720s GPU budget)**:
   - Custom denoising steps (adjustable via "Diffusion steps" slider)
   - The 720s is the maximum GPU allocation time (budget), not the actual generation time
   - With 40 denoising steps, approximately 10 seconds of video can be generated
+**Design Rationale**: Multi-person videos generally have longer duration and require more computational resources. To achieve better quality, especially for complex multi-person interactions, more denoising steps and longer GPU allocation time are needed. The Quality Mode provides sufficient Usage Quota (up to 720 seconds) to accommodate these requirements, while the Fast Mode offers a quick preview option with fixed 10 steps for faster iteration. Note that the GPU duration values (210s/720s) represent the maximum budget allocated, not the actual generation time.
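The two modes described in the README hunk above can be summarized in a small Python sketch. This is only an illustration of the documented behavior, not code from `app.py`: the names `GenerationMode`, `resolve_steps`, and `free_tier_warning` are hypothetical, and the 6-second threshold is taken from the note added in this commit as an approximate guideline rather than a hard limit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GenerationMode:
    """Hypothetical container for the per-mode settings described in the README."""
    name: str
    gpu_budget_s: int            # maximum GPU allocation (budget), not actual run time
    fixed_steps: Optional[int]   # None means the user chooses steps via the slider


# Values taken from the README text after this commit.
FAST = GenerationMode("fast", gpu_budget_s=210, fixed_steps=10)
QUALITY = GenerationMode("quality", gpu_budget_s=720, fixed_steps=None)

# Approximate free-tier limit for two-person video in Fast mode (from the commit note).
FREE_TIER_FAST_TWO_PERSON_MAX_S = 6.0


def resolve_steps(mode: GenerationMode, slider_steps: int) -> int:
    """Return the denoising step count: fixed in Fast mode, slider value otherwise."""
    if mode.fixed_steps is not None:
        return mode.fixed_steps
    return slider_steps


def free_tier_warning(mode: GenerationMode, num_people: int,
                      video_seconds: float) -> Optional[str]:
    """Return a warning string when a free-tier Fast mode request may time out."""
    if (mode is FAST and num_people >= 2
            and video_seconds > FREE_TIER_FAST_TWO_PERSON_MAX_S):
        return ("Fast mode may time out for two-person videos longer than "
                "about 6 seconds on the free tier.")
    return None
```

For example, `resolve_steps(FAST, 40)` ignores the slider and returns the fixed step count, while `resolve_steps(QUALITY, 40)` passes the slider value through.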

app.py

@@ -768,7 +768,7 @@ def run_graio_demo(args):
                     )
                 gr.Markdown("""
                 **Generation Modes:**
-                - **Fast Mode (up to 210s GPU budget)**: Fixed 10 denoising steps for quick generation. Suitable for single-person videos or quick previews. The 210s is the maximum GPU allocation time, not the actual generation time.
                 - **Quality Mode (up to 720s GPU budget)**: Custom denoising steps (adjustable via "Diffusion steps" slider). Recommended for multi-person videos that require higher quality. The 720s is the maximum GPU allocation time, not the actual generation time. With 40 denoising steps, approximately 10 seconds of video can be generated.
                 *Note: The GPU duration (210s/720s) represents the maximum budget allocated, not the actual generation time. Multi-person videos generally require longer duration and more Usage Quota for better quality.*

                     )
                 gr.Markdown("""
                 **Generation Modes:**
+                - **Fast Mode (up to 210s GPU budget)**: Fixed 10 denoising steps for quick generation. Suitable for single-person videos or quick previews. The 210s is the maximum GPU allocation time, not the actual generation time. **For free-tier users: Fast mode can generate approximately 6 seconds of two-person video at most; longer videos may timeout.**
                 - **Quality Mode (up to 720s GPU budget)**: Custom denoising steps (adjustable via "Diffusion steps" slider). Recommended for multi-person videos that require higher quality. The 720s is the maximum GPU allocation time, not the actual generation time. With 40 denoising steps, approximately 10 seconds of video can be generated.
                 *Note: The GPU duration (210s/720s) represents the maximum budget allocated, not the actual generation time. Multi-person videos generally require longer duration and more Usage Quota for better quality.*