Stable diffusion thread

doogyhatts · Jun 26, 2024

zheng said:
pixart?

Yes. But refined using Photon.

doogyhatts · Jun 26, 2024

ODE - ordinary differential equation.
SDE - stochastic differential equation.
Also got Runge-Kutta stuff.
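For anyone wondering what the ODE solver names mean in practice, here is a minimal sketch comparing a plain Euler step with a classic 4th-order Runge-Kutta step. It uses a toy equation (dy/dt = -y), not a real diffusion model; the function names are made up for illustration. SDE samplers additionally inject noise at each step, which this sketch leaves out.

```python
import math

def f(t, y):
    # Toy ODE dy/dt = -y, whose exact solution is y(t) = exp(-t).
    # This stands in for the (much more complex) denoising ODE.
    return -y

def euler_step(f, t, y, h):
    # First-order method: one slope evaluation per step.
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    # Classic 4th-order Runge-Kutta: four slope evaluations per step.
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(step, f, y0, t_end, n_steps):
    # Apply the chosen step function n_steps times from t = 0 to t_end.
    h = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y

exact = math.exp(-1.0)
euler_err = abs(integrate(euler_step, f, 1.0, 1.0, 10) - exact)
rk4_err = abs(integrate(rk4_step, f, 1.0, 1.0, 10) - exact)
print(f"Euler error: {euler_err:.2e}, RK4 error: {rk4_err:.2e}")
```

With the same number of steps, the higher-order method lands much closer to the exact answer, at the cost of more function evaluations per step.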

doogyhatts · Jun 27, 2024

AutoStudio (Lenovo Research) has a comic strip generation solution better than StoryDiffusion.

MOFA-video from Tencent AI Lab combines different elements together for image-to-video generation.

This is a low-VRAM solution for converting a realistic video to smooth anime.
It uses fewer ControlNet models than the more complex Diffutoon.

doogyhatts · Jun 28, 2024

zheng said:
trying dream machine... wolf transform into a man... this is what i get

https://streamable.com/nuczki

Dream Machine can now take two input images and transform from one to the other.

doogyhatts · Jun 28, 2024

doogyhatts · Jun 29, 2024

Tutorial on how to sign up for Kling.

doogyhatts · Jun 30, 2024

New samplers and schedulers!

Prismatic · Jun 30, 2024

I'm a bit slow to the game so just starting out on images first.

Trying to push the limits of how much detail I can get into the image without it looking weird. Experimenting with culture-specific references, but quite hard cos existing models quite limited.

https://www.instagram.com/nihon_josei_zukan/

x1243x · Jun 30, 2024

Prismatic said:
I'm a bit slow to the game so just starting out on images first.

Trying to push the limits of how much detail I can get into the image without it looking weird. Experimenting with culture-specific references, but quite hard cos existing models quite limited.

https://www.instagram.com/nihon_josei_zukan/

first and third not bad.. just that the hand in the first pic is off.. can inpaint it

Prismatic · Jun 30, 2024

x1243x said:
first and third not bad.. just that the hand in the first pic is off.. can inpaint it

Oh thanks! I didn't actually notice it the first time! Managed to remove it in ver 2.

doogyhatts · Jul 1, 2024

DynamiCrafter will be updated soon with a new inference solution that allows for wider motions!
Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model

After which, we can simply extend the video length by using the last frame and regenerate a connecting video segment.
Then, we don't need to depend so much on Dream Machine.
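A rough sketch of the last-frame chaining idea, in case it helps. `generate_segment` is a hypothetical stand-in for whatever image-to-video call you use (it is not a DynamiCrafter API), and I am assuming it returns only the new frames that follow the start frame. Frames are plain integers here so the logic can be checked without a model.

```python
from typing import Callable, List

def extend_video(
    first_frame: int,
    generate_segment: Callable[[int], List[int]],
    n_segments: int,
) -> List[int]:
    # Start with the initial image, then repeatedly feed the last frame
    # back in as the conditioning image for the next segment.
    frames = [first_frame]
    for _ in range(n_segments):
        new_frames = generate_segment(frames[-1])
        frames.extend(new_frames)
    return frames

def fake_segment(start: int) -> List[int]:
    # Hypothetical stub: pretends to generate 3 frames after `start`.
    return [start + 1, start + 2, start + 3]

video = extend_video(0, fake_segment, n_segments=2)
print(video)
```

Each segment only sees one frame of context, so motion may not stay consistent across the joins.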

doogyhatts · Jul 2, 2024

Tencent AI Lab presents MimicMotion, better than its own MusePose.
Now AI-characters can do better dancing and talking, with better looking hands (separated).
Requires 23 GB VRAM!

Here are 2 samples from the project page.

doogyhatts · Jul 3, 2024

Tested DynamiCrafter CIL version.
I used 945M and disabled analytic_init. Has watermark.

Waiting for non-watermarked model to be released.

AZE · Jul 6, 2024

doogyhatts said:
As for Lumina, have to wait until they finish researching the 16-channel VAE that SD3 uses.
In the meantime, Kijai has provided a ComfyUI wrapper for it.
Maybe worth trying out their Next-SFT model first and the Gemma-2b LLM that it uses.

https://huggingface.co/AuraDiffusion/16ch-vae
https://huggingface.co/ostris/vae-kl-f8-d16

Got a few ppl/groups training their own VAEs already.
Actually still don't know why they created a 16-channel VAE.

The original is 8x8 compression, channels 3 -> 4, e.g. 64x64x3 -> 8x8x4.
The 16-channel one is the same 8x8 compression, channels 3 -> 16, e.g. 64x64x3 -> 8x8x16.
Why did they not drop the pixel-space compression to 4x4 while keeping the channels the same? e.g. 64x64x3 -> 16x16x4.

Anyway, a 16-channel one being better than a 4-channel one is obvious, all other things being equal; they basically decrease compression by 4x anyway. :crazy:
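The arithmetic behind the three cases above can be checked with a small sketch. The helper names are made up for illustration, and it only counts values; it says nothing about which layout trains or reconstructs better.

```python
def latent_shape(height, width, downsample, channels):
    # Spatial size shrinks by `downsample` on each side; channel count is set by the VAE.
    return (height // downsample, width // downsample, channels)

def compression_ratio(height, width, downsample, channels, in_channels=3):
    # Number of input values divided by number of latent values.
    lh, lw, lc = latent_shape(height, width, downsample, channels)
    return (height * width * in_channels) / (lh * lw * lc)

ratio_f8_c4 = compression_ratio(64, 64, downsample=8, channels=4)    # 64x64x3 -> 8x8x4
ratio_f8_c16 = compression_ratio(64, 64, downsample=8, channels=16)  # 64x64x3 -> 8x8x16
ratio_f4_c4 = compression_ratio(64, 64, downsample=4, channels=4)    # 64x64x3 -> 16x16x4

print(ratio_f8_c4, ratio_f8_c16, ratio_f4_c4)
```

The 16-channel 8x8 latent and the 4-channel 4x4 latent hold the same number of values, and both are compressed 4x less than the original 4-channel 8x8 latent.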

AZE · Jul 6, 2024

zheng said:
sai is gone case already

They planning for an "updated/fixed" model with new license.

AZE · Jul 6, 2024

doogyhatts said:
Lumina T2X uses DiT architecture.
Can do different outputs, not just image.

All the modern models, Pixart, Hunyuan, Lumina and SD3 are different variants of custom DiT models.

Other than Pixart, which is specialised for T2I, the other three architectures seem to be designed to be extensible for multimodal systems.

doogyhatts · Jul 6, 2024

Testing biomechanical creature using PixArt-Sigma.
At the moment, Midjourney still does the best in this area.

PixArt-Sigma 1K model.

PixArt-Sigma 2K model.

focus1974 · Jul 6, 2024

Prismatic said:
Oh thanks! I didn't actually notice it the first time! Managed to remove it in ver 2.

wa.. making money with instagram model ..

is it easy?
what you use? SDXL or just SD1.5?

doogyhatts · Jul 6, 2024

focus1974 said:
wa.. making money with instagram model ..

I think it is not possible to make any money off advertisements even if the account gets past 10K followers.
https://help.instagram.com/5485466918184985

doogyhatts · Jul 6, 2024

doogyhatts said:
Testing biomechanical creature using PixArt-Sigma.
At the moment, Midjourney still does the best in this area.

PixArt-Sigma 1K model.

PixArt-Sigma 2K model.

Comparing with Lumina-Next-SFT HF demo.

And Hunyuan HF demo.
