StableCascadeUNet

A UNet model from the Stable Cascade pipeline.

StableCascadeUNet

class diffusers.models.StableCascadeUNet

(
    in_channels: int = 16,
    out_channels: int = 16,
    timestep_ratio_embedding_dim: int = 64,
    patch_size: int = 1,
    conditioning_dim: int = 2048,
    block_out_channels: tuple = (2048, 2048),
    num_attention_heads: tuple = (32, 32),
    down_num_layers_per_block: tuple = (8, 24),
    up_num_layers_per_block: tuple = (24, 8),
    down_blocks_repeat_mappers: tuple[int] | None = (1, 1),
    up_blocks_repeat_mappers: tuple[int] | None = (1, 1),
    block_types_per_layer: tuple = (('SDCascadeResBlock', 'SDCascadeTimestepBlock', 'SDCascadeAttnBlock'), ('SDCascadeResBlock', 'SDCascadeTimestepBlock', 'SDCascadeAttnBlock')),
    clip_text_in_channels: int | None = None,
    clip_text_pooled_in_channels = 1280,
    clip_image_in_channels: int | None = None,
    clip_seq = 4,
    effnet_in_channels: int | None = None,
    pixel_mapper_in_channels: int | None = None,
    kernel_size = 3,
    dropout: float | tuple[float] = (0.1, 0.1),
    self_attn: bool | tuple[bool] = True,
    timestep_conditioning_type: tuple = ('sca', 'crp'),
    switch_level: tuple[bool] | None = None
)
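
In the constructor defaults, several tuple-valued arguments all have two entries, which suggests one entry per resolution level. The sketch below is a hypothetical pre-flight check built on that assumption: it compares the length of each per-level tuple against `block_out_channels` before a model is constructed. The helper name `check_level_lengths` and the choice of which arguments to compare are illustrative and not part of the Diffusers API.

```python
# Default values copied from the constructor signature.
DEFAULT_LEVEL_CONFIG = {
    "block_out_channels": (2048, 2048),
    "num_attention_heads": (32, 32),
    "down_num_layers_per_block": (8, 24),
    "up_num_layers_per_block": (24, 8),
    "down_blocks_repeat_mappers": (1, 1),
    "up_blocks_repeat_mappers": (1, 1),
}


def check_level_lengths(config):
    """Hypothetical helper: list the arguments whose tuple length differs
    from that of `block_out_channels` (assumed: one entry per level)."""
    expected = len(config["block_out_channels"])
    return [name for name, value in config.items() if len(value) != expected]


# The defaults are mutually consistent, so nothing is reported.
print(check_level_lengths(DEFAULT_LEVEL_CONFIG))

# A three-entry tuple next to two-entry ones is flagged by name.
bad_config = dict(DEFAULT_LEVEL_CONFIG, down_num_layers_per_block=(8, 24, 8))
print(check_level_lengths(bad_config))
```

Running a check like this before instantiation can surface a mismatched tuple early, without allocating any model weights.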

forward

(
    sample,
    timestep_ratio,
    clip_text_pooled,
    clip_text = None,
    clip_img = None,
    effnet = None,
    pixels = None,
    sca = None,
    crp = None,
    return_dict = True
)

Parameters

  • sample (torch.Tensor) — The noisy input sample.
  • timestep_ratio (torch.Tensor) — Timestep ratio used to compute the timestep embedding.
  • clip_text_pooled (torch.Tensor) — Pooled CLIP text embeddings.
  • clip_text (torch.Tensor, optional) — Sequence-level CLIP text embeddings.
  • clip_img (torch.Tensor, optional) — CLIP image embeddings.
  • effnet (torch.Tensor, optional) — EfficientNet feature map used as additional conditioning.
  • pixels (torch.Tensor, optional) — Pixel-level conditioning tensor. If None, a tensor of zeros is used.
  • sca (torch.Tensor, optional) — Conditioning value for the 'sca' entry of timestep_conditioning_type, used to build the timestep embedding.
  • crp (torch.Tensor, optional) — Conditioning value for the 'crp' entry of timestep_conditioning_type, used to build the timestep embedding.
  • return_dict (bool, optional, defaults to True) — Whether or not to return a StableCascadeUNetOutput instead of a plain tuple.
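
The parameter list separates three required inputs (sample, timestep_ratio, clip_text_pooled) from six optional conditioning inputs. As a minimal sketch of how a caller might assemble those arguments, the hypothetical helper below collects them into a keyword dictionary, rejects names that are not in the list above, and drops optional inputs left as None so that forward falls back to its defaults. `build_forward_kwargs` is illustrative and not part of Diffusers. Placeholder strings stand in for the torch.Tensor inputs so the snippet runs without PyTorch.

```python
# Optional forward() arguments, as named in the parameter list.
OPTIONAL_FORWARD_ARGS = ("clip_text", "clip_img", "effnet", "pixels", "sca", "crp")


def build_forward_kwargs(sample, timestep_ratio, clip_text_pooled, **optional):
    """Hypothetical helper: gather forward() keyword arguments."""
    unknown = sorted(set(optional) - set(OPTIONAL_FORWARD_ARGS))
    if unknown:
        raise TypeError(f"unexpected forward() arguments: {unknown}")
    kwargs = {
        "sample": sample,
        "timestep_ratio": timestep_ratio,
        "clip_text_pooled": clip_text_pooled,
    }
    # Skip optional inputs that are None; forward() already defaults them to None.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


# Placeholder strings stand in for tensors in this sketch.
kwargs = build_forward_kwargs(
    "latents", "ratio", "pooled", effnet="features", pixels=None
)
print(sorted(kwargs))

# With a real model instance (here called `unet`, an assumption), the call
# would look like: output = unet(**kwargs, return_dict=True)
```

Because pixels is left out of the dictionary here, the model would apply its documented fallback of a zero tensor for that input.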