Stable Diffusion thread

doogyhatts

Arch-Supremacy Member
Joined
Feb 13, 2018
Messages
13,745
Reaction score
3,950
ODE - ordinary differential equation.
SDE - stochastic differential equation.
There are also Runge-Kutta methods.
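For anyone wondering what these three terms look like in practice, here is a rough sketch on a toy equation (dx/dt = -x). It is not taken from any actual sampler implementation: the function names, the toy drift and the constant noise scale of 0.1 are all made up for illustration. It shows one plain Euler step for an ODE, one Heun step (a second-order Runge-Kutta method), and one Euler-Maruyama step for an SDE, which is just the Euler step plus scaled Gaussian noise.

```python
import math
import random


def euler_step(f, x, t, h):
    """One explicit Euler step for the ODE dx/dt = f(x, t)."""
    return x + h * f(x, t)


def heun_step(f, x, t, h):
    """One Heun step (a second-order Runge-Kutta method)."""
    k1 = f(x, t)
    k2 = f(x + h * k1, t + h)
    return x + 0.5 * h * (k1 + k2)


def euler_maruyama_step(f, g, x, t, h, rng):
    """One Euler-Maruyama step for the SDE dx = f(x, t) dt + g(t) dW."""
    noise = rng.gauss(0.0, 1.0)
    return x + h * f(x, t) + g(t) * math.sqrt(h) * noise


def integrate(step, x0, t0, t1, n):
    """Apply a one-step method n times from t0 to t1."""
    h = (t1 - t0) / n
    x, t = x0, t0
    for _ in range(n):
        x = step(x, t, h)
        t += h
    return x


def drift(x, t):
    # Toy drift (assumption): dx/dt = -x, exact solution x(t) = x0 * exp(-t).
    return -x


exact = math.exp(-1.0)
euler_result = integrate(lambda x, t, h: euler_step(drift, x, t, h), 1.0, 0.0, 1.0, 10)
heun_result = integrate(lambda x, t, h: heun_step(drift, x, t, h), 1.0, 0.0, 1.0, 10)

rng = random.Random(0)  # fixed seed so the noisy run is repeatable
sde_result = integrate(
    lambda x, t, h: euler_maruyama_step(drift, lambda t: 0.1, x, t, h, rng),
    1.0, 0.0, 1.0, 10,
)

print(f"exact={exact:.5f} euler={euler_result:.5f} heun={heun_result:.5f} sde={sde_result:.5f}")
```

With the same 10 steps, the Heun result lands closer to the exact value than plain Euler, at the cost of two drift evaluations per step. The SDE result changes with the random seed, which is the practical difference between the two families.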

 

doogyhatts

AutoStudio (Lenovo Research) has a comic strip generation solution that is better than StoryDiffusion.

MOFA-video from Tencent AI Lab combines different elements together for image-to-video generation.

This is a low-VRAM solution for converting a realistic video to smooth anime.
It uses fewer ControlNet models than the more complex Diffutoon.
 

Prismatic

Supremacy Member
Joined
Jun 8, 2004
Messages
6,834
Reaction score
2,471
I'm a bit slow to the game, so I'm just starting out on images first.

Trying to push the limits of how much detail I can get into the image without it looking weird. Experimenting with culture-specific references, but it's quite hard because existing models are quite limited.

https://www.instagram.com/nihon_josei_zukan/





 

x1243x

Arch-Supremacy Member
Joined
Oct 29, 2004
Messages
11,828
Reaction score
3,671
The first and third are not bad, just that the hand in the first pic is off. You can inpaint it.
 

doogyhatts

Tencent AI Lab presents MimicMotion, which is better than its own MusePose.
AI characters can now dance and talk better, with better-looking (separated) hands.
Requires 23 GB of VRAM!

Here are 2 samples from the project page.
 

doogyhatts

I tested the DynamiCrafter CIL version.
I used 945M and disabled analytic_init. The output has a watermark.


Waiting for a non-watermarked model to be released.
 

AZE

High Supremacy Member
Joined
May 2, 2000
Messages
36,925
Reaction score
3,987
As for Lumina, we have to wait until they finish researching the 16-channel VAE that SD3 uses.
In the meantime, Kijai has provided a ComfyUI wrapper for it.
It may be worth trying out their Next-SFT model first, along with the Gemma-2b LLM that it uses.

https://huggingface.co/AuraDiffusion/16ch-vae
https://huggingface.co/ostris/vae-kl-f8-d16

A few people/groups are already training their own VAEs.
Actually, I still don't know why they created a 16-channel VAE.

The original is 8x8 compression, channels 3 -> 4, e.g. 64x64x3 -> 8x8x4.
The 16-channel one is the same 8x8 compression, channels 3 -> 16, e.g. 64x64x3 -> 8x8x16.
Why do they not drop the pixel-space compression to 4x4 while keeping the channels the same? e.g. 64x64x3 -> 16x16x4.

Anyway, a 16-channel one being better than a 4-channel one is obvious, all other things being equal, since they basically decrease the compression by 4x anyway.:crazy:
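A quick sanity check of the numbers above, as a rough sketch only. The helper names `latent_shape` and `compression_ratio` are made up for illustration and do not come from any VAE library. It just counts values in and values out for the three layouts mentioned, using the same 64x64x3 example.

```python
def latent_shape(height, width, downsample, latent_channels):
    """Latent (H, W, C) for a VAE that shrinks each side by `downsample`."""
    return (height // downsample, width // downsample, latent_channels)


def compression_ratio(height, width, downsample, latent_channels, image_channels=3):
    """How many input values map to one latent value."""
    h, w, c = latent_shape(height, width, downsample, latent_channels)
    return (height * width * image_channels) / (h * w * c)


# (downsample factor per side, latent channels) for the three layouts discussed.
configs = {
    "4-channel, 8x per side": (8, 4),
    "16-channel, 8x per side": (8, 16),
    "4-channel, 4x per side": (4, 4),
}

for name, (downsample, channels) in configs.items():
    shape = latent_shape(64, 64, downsample, channels)
    ratio = compression_ratio(64, 64, downsample, channels)
    print(f"{name}: 64x64x3 -> {shape[0]}x{shape[1]}x{shape[2]}, ratio {ratio:g}x")
```

By this count, the 16-channel layout and the 4x4 layout both keep 1024 latent values for a 64x64x3 input, four times as many as the original 256. The arithmetic alone does not say which of the two gives better images, only that they store the same amount of data.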
 

AZE

Lumina T2X uses a DiT architecture.
It can produce different kinds of output, not just images.



All the modern models (PixArt, Hunyuan, Lumina and SD3) are different variants of custom DiT models.:o
Other than PixArt, which is specialised for T2I, the architectures of the other three seem to be designed to be extensible to multimodal systems.
 

doogyhatts

Testing biomechanical creature generation using PixArt-Sigma.
At the moment, Midjourney still does best in this area.

PixArt-Sigma 1K model.
FmCNciT.png


PixArt-Sigma 2K model.
VWGSLFH.jpeg
 

doogyhatts

Comparing with the Lumina-Next-SFT HF demo.
Rdq4VPd.png


And the Hunyuan HF demo.
kJ1caQd.png
 