AnimateDiff | High-Quality Text-to-Animation Video Generation
AnimateDiff is a framework for animating personalized text-to-image diffusion models without any model-specific tuning. In short, it takes the image that a text-to-image model would produce from your prompt and automatically turns it into a fairly smooth animation. The results are noticeably better than stitching video together with img2img in the Stable Diffusion WebUI, with flicker reduced to almost nothing.
Project Repository
Official site: AnimateDiff
GitHub:guoyww/AnimateDiff: Official implementation of AnimateDiff. (github.com)
Prerequisites
Before installing the project you also need Git and Conda. If either is not yet installed on your machine, please follow this site's tutorials first.
To install Git on Windows, see this article:
To install Conda on Windows, see this article:
Network Issues
During installation, some dependencies may fail to download even with a proxy enabled. Proxy configuration is not covered on this site; please consult a separate guide on setting one up.
Installation
If you are a beginner and not comfortable with the command line, press Win+R, type CMD in the dialog that appears, press Enter, and then run each of the following commands in order inside the CMD window.
First, decide on a working directory to hold the project's files and dependencies. This site uses a folder named openai.wiki in the root of the D drive, i.e. the full path D:\openai.wiki.
Run the following command in CMD. It checks whether the openai.wiki folder exists on the D drive and creates it if it does not.
if not exist D:\openai.wiki mkdir D:\openai.wiki
Next, run the following command to force the current working directory to the openai.wiki folder on the D drive.
cd /d D:\openai.wiki
Clone the project's GitHub repository into the openai.wiki folder.
git clone https://github.com/guoyww/AnimateDiff.git
If you cannot complete this step (the command errors out or the download will not finish), you can download the repository archive instead and extract it to D:\openai.wiki.
Environment Setup
Run the following command in CMD to switch to the AnimateDiff project directory.
cd /d D:\openai.wiki\AnimateDiff
Here we need to manually edit the environment.yaml file in the project root; otherwise environment creation fails with an error about xformers being impossible to install.
Open the file and you will see the following content:
name: animatediff
channels:
  - pytorch
  - xformers
dependencies:
  - python=3.10
  - pytorch==1.12.1
  - torchvision==0.13.1
  - torchaudio==0.12.1
  - cudatoolkit=11.3
  - xformers
  - pip
  - pip:
    - diffusers[torch]==0.11.1
    - transformers==4.25.1
    - imageio==2.27.0
    - gdown
    - einops
    - omegaconf
    - safetensors
Move the - xformers entry on line 11 down so that it becomes the first item under the pip: section (line 13 of the modified file); this installs xformers via pip instead of as a Conda package. The changed file should look like the example below.
Note: if this sounds confusing, simply copy the block below, overwrite the entire contents of your environment.yaml with it, and remember to save the file.
name: animatediff
channels:
  - pytorch
  - xformers
dependencies:
  - python=3.10
  - pytorch==1.12.1
  - torchvision==0.13.1
  - torchaudio==0.12.1
  - cudatoolkit=11.3
  - pip
  - pip:
    - xformers
    - diffusers[torch]==0.11.1
    - transformers==4.25.1
    - imageio==2.27.0
    - gdown
    - einops
    - omegaconf
    - safetensors
Run the following command in CMD. It creates a Conda virtual environment named animatediff based on environment.yaml.
conda env create -f environment.yaml
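As an optional check (not part of the original steps), you can list your Conda environments before moving on; animatediff should appear in the output:
conda env list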
Initialize Conda for cmd.exe to prevent possible errors in the later steps.
conda init cmd.exe
Activate the environment we just created, so that every dependency we install from here on goes into it.
conda activate animatediff
After running the command above, your CMD window is inside a Python virtual environment named animatediff.
Next, run the following command to uninstall the torch, torchvision and torchaudio packages currently installed in this environment.
pip uninstall -y torch torchvision torchaudio
Why?
If we do not reinstall them, the packages installed by default are the CPU-only build of PyTorch, and running the project fails with the error shown below.
(animatediff) D:\openai.wiki\AnimateDiff>python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
    Python 3.10.11 (you have 3.10.12)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: Could not find module 'C:\Users\openA\miniconda3\envs\animatediff\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
  warn(f"Failed to load image Python extension: {e}")
loaded temporal unet's pretrained weights from models/StableDiffusion\unet ...
### missing keys: 560;
### unexpected keys: 0;
### Temporal Module Parameters: 417.1376 M
Traceback (most recent call last):
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\openai.wiki\AnimateDiff\scripts\animate.py", line 159, in <module>
    main(args)
  File "D:\openai.wiki\AnimateDiff\scripts\animate.py", line 55, in main
    if is_xformers_available(): unet.enable_xformers_memory_efficient_attention()
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 215, in enable_xformers_memory_efficient_attention
    self.set_use_memory_efficient_attention_xformers(True)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 203, in set_use_memory_efficient_attention_xformers
    fn_recursive_set_mem_eff(module)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 196, in fn_recursive_set_mem_eff
    module.set_use_memory_efficient_attention_xformers(valid)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 203, in set_use_memory_efficient_attention_xformers
    fn_recursive_set_mem_eff(module)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 199, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\diffusers\modeling_utils.py", line 196, in fn_recursive_set_mem_eff
    module.set_use_memory_efficient_attention_xformers(valid)
  File "D:\openai.wiki\AnimateDiff\animatediff\models\attention.py", line 237, in set_use_memory_efficient_attention_xformers
    raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
As you can see from the error above, the GPU cannot be found or used at all. This problem cost me an entire day; it is not mentioned in the official documentation, and I did not see anyone reporting it in the official issue tracker either.
Once the uninstall has finished, run the following command to reinstall torch, torchvision and torchaudio as CUDA builds, so the GPU hardware can actually be used.
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
This may take a while, depending on your connection and proxy. When it finishes, the environment setup is complete.
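As an optional sanity check (not one of the original steps), you can ask PyTorch directly whether it was built with CUDA and can see your GPU; it should print a CUDA-enabled version string (something like 2.0.1+cu118) followed by True:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"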
Model Overview
Below are the models whose results are showcased officially; the YAML files shipped with the project correspond to the following models.
D:\openai.wiki\AnimateDiff\configs\prompts
└─1-ToonYou.yaml
└─2-Lyriel.yaml
└─3-RcnzCartoon.yaml
└─4-MajicMix.yaml
└─5-RealisticVision.yaml
└─6-Tusun.yaml
└─7-FilmVelvia.yaml
└─8-GhibliBackground.yaml
ToonYou
Civitai: https://civitai.com/models/30240/toonyou
Counterfeit V3.0
Civitai: https://civitai.com/models/4468/counterfeit-v30
Realistic Vision V2.0
Civitai: https://civitai.com/models/4201/realistic-vision-v20
majicMIX Realistic
Civitai: https://civitai.com/models/43331/majicmix-realistic
RCNZ Cartoon
Civitai: https://civitai.com/models/66347/rcnz-cartoon-3d
FilmVelvia
Civitai: https://civitai.com/models/33208/filmgirl-film-grain-lora-and-loha
Model Downloads | Official
Downloading the models below requires a working proxy, and even with one the downloads do not always succeed. For that reason the official sources are not recommended; the domestic netdisk mirror provided by this site is usually easier.
StableDiffusion
Run the following command in CMD to switch to the AnimateDiff project directory.
cd /d D:\openai.wiki\AnimateDiff
Run the following commands one at a time. They download the Stable Diffusion v1.5 weights (via Git LFS) into the models/StableDiffusion directory.
git lfs install
git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 models/StableDiffusion/
Note: this is extremely time-consuming. On my modest setup the clone took nearly 20 hours. The finished folder occupies 75.2 GB on disk, of which the actual model files account for 37.6 GB; the remainder is essentially Git's internal copy of the LFS objects.
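If disk space is tight, one option (which I have not verified on this exact setup) is to delete the hidden .git subfolder inside models\StableDiffusion after the clone has fully finished; the checked-out model files stay in place and you only lose the ability to git pull updates later. From the project directory in CMD:
rmdir /s /q models\StableDiffusion\.git
If rmdir refuses because of read-only files, deleting the .git folder in File Explorer (with hidden items shown) works just as well.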
Motion_Module
Run the following command in CMD to switch to the AnimateDiff project directory.
cd /d D:\openai.wiki\AnimateDiff
Then run the following command; it downloads the motion-module models into the models/Motion_Module directory.
bash download_bashscripts/0-MotionModule.sh
Note: this downloads two model files totalling 3.11 GB, which are stored automatically in the models/Motion_Module directory. If this method does not work for you, you can also download them from Google Drive [click here].
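One caveat: download_bashscripts/0-MotionModule.sh is a bash script, and plain CMD has no bash command, so the line above is easiest to run from Git Bash (installed alongside Git). Assuming the default Git for Windows install location (an assumption, adjust the path if yours differs), you could invoke it from CMD like this:
"C:\Program Files\Git\bin\bash.exe" download_bashscripts/0-MotionModule.sh
Alternatively, since gdown is already installed in this Conda environment and the files live on Google Drive, you can fetch each checkpoint by hand; <FILE_ID> below is a placeholder for the ID taken from the script or the Google Drive link, not a real value:
gdown "https://drive.google.com/uc?id=<FILE_ID>" -O models/Motion_Module/mm_sd_v15.ckpt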
DreamBooth_LoRA
The project currently provides 8 download scripts for these models, covering 11 model files totalling 27.6 GB.
Run the following commands one at a time (these are bash scripts as well, so the same Git Bash note as above applies); they download the models into the models/DreamBooth_LoRA directory.
bash download_bashscripts/1-ToonYou.sh
bash download_bashscripts/2-Lyriel.sh
bash download_bashscripts/3-RcnzCartoon.sh
bash download_bashscripts/4-MajicMix.sh
bash download_bashscripts/5-RealisticVision.sh
bash download_bashscripts/6-Tusun.sh
bash download_bashscripts/7-FilmVelvia.sh
bash download_bashscripts/8-GhibliBackground.sh
Model Downloads | Netdisk
Download all of the files below without changing the directory structure (68.4 GB in total), then move the folder named models into the project root. If you are asked whether to overwrite, choose Yes.
Running the Project
From now on, every time you run the project, first activate the Conda virtual environment we created, then run the launch command.
Run the following command in CMD to switch to the project directory.
cd /d D:\openai.wiki\AnimateDiff
Activate the Conda environment we created; otherwise the system's default Python would be used and the project would not run properly.
conda activate animatediff
Different models are selected by loading different YAML files. The project ships 8 prompt presets that help you generate content quickly. Let's take one of them as an example to see how this works.
python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
After the command above finishes, a GIF file is automatically written to the samples folder in the project root.
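If you want to check the results from the command line, a quick way to list every GIF produced so far (each run gets its own time-stamped subfolder under samples) is:
dir /s /b samples\*.gif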
Breaking that command down:
python -m scripts.animate
- Runs the script animate.py inside the scripts folder of the project root, using the environment's Python.
--config configs/prompts/1-ToonYou.yaml
- Required parameter. This YAML file determines what gets generated; in other words, it holds the prompts.
--pretrained_model_path models/StableDiffusion
- The path to the official Stable Diffusion model; no need to change it.
--L 16 --W 512 --H 512
- --L is the length of the animation in frames; testing suggests it should be a multiple of 8. The official default is 16 and there is usually no reason to change it.
- --W is the width of the generated animation.
- --H is the height of the generated animation.
An example with different values is shown right after this list.
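For example (a variation I have not run myself), a longer 24-frame clip at 768x512 would look like the command below; expect it to need more VRAM and noticeably more time:
python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 24 --W 768 --H 512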
You may still have one question: how do I control what content gets generated? That is decided by the contents of 1-ToonYou.yaml. Open that file and you will see the following.
ToonYou:
  base: ""
  path: "models/DreamBooth_LoRA/toonyou_beta3.safetensors"
  motion_module:
    - "models/Motion_Module/mm_sd_v14.ckpt"
    - "models/Motion_Module/mm_sd_v15.ckpt"
  seed: [10788741199826055526, 6520604954829636163, 6519455744612555650, 16372571278361863751]
  steps: 25
  guidance_scale: 7.5
  prompt:
    - "best quality, masterpiece, 1girl, looking at viewer, blurry background, upper body, contemporary, dress"
    - "masterpiece, best quality, 1girl, solo, cherry blossoms, hanami, pink flower, white flower, spring season, wisteria, petals, flower, plum blossoms, outdoors, falling petals, white hair, black eyes,"
    - "best quality, masterpiece, 1boy, formal, abstract, looking at viewer, masculine, marble pattern"
    - "best quality, masterpiece, 1girl, cloudy sky, dandelion, contrapposto, alternate hairstyle,"
  n_prompt:
    - ""
    - "badhandv4,easynegative,ng_deepnegative_v1_75t,verybadimagenegative_v1.3, bad-artist, bad_prompt_version2-neg, teeth"
    - ""
    - ""
base
- Path to a base model; in this preset it is simply left empty.
path
- Path to the LoRA / DreamBooth model used for this generation; you can also point it at a model of your own.
motion_module
- Paths to the official motion-module checkpoints; there is normally no need to change them.
seed
- The seed used for generation, the same idea as in Stable Diffusion.
steps
- Number of sampling steps; roughly, how many passes of computation go into each image.
guidance_scale
- Guidance scale (CFG), identical to the equivalent setting in Stable Diffusion.
prompt
- Positive prompts. Each line is a separate task: this example has 4 prompt lines, so 4 GIFs are generated, one per line.
n_prompt
- Negative prompts, matched line by line with the positive prompts; they may be left empty. In short, anything you do not want to see (NSFW content, for example) goes here.
A minimal custom config is sketched right after this list.
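To tie this together, here is a minimal sketch of a custom config following the same schema as the shipped presets. The file name (MyTest.yaml), the single prompt, the negative prompt and the seed are all made up for illustration; it simply reuses the ToonYou LoRA and one motion module you have already downloaded:
MyTest:
  base: ""
  path: "models/DreamBooth_LoRA/toonyou_beta3.safetensors"
  motion_module:
    - "models/Motion_Module/mm_sd_v15.ckpt"
  seed: [42]
  steps: 25
  guidance_scale: 7.5
  prompt:
    - "best quality, masterpiece, 1girl, walking in the rain, holding an umbrella, night city lights, reflections"
  n_prompt:
    - "worst quality, low quality, deformed, bad anatomy"
Save it as configs/prompts/MyTest.yaml and pass it to the script with --config configs/prompts/MyTest.yaml.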
Alright, that covers the theory. Now let's actually generate something: whatever happens, run it once and see whether the program works at all.
python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
This step takes a very, very long time; my card is a 2080 Ti, and it is increasingly struggling with workloads like this.
(animatediff) D:\openai.wiki\AnimateDiff>python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512 A matching Triton is not available, some optimizations will not be enabled. loaded temporal unet's pretrained weights from models/StableDiffusion\unet ... ### missing keys: 560; ### unexpected keys: 0; ### Temporal Module Parameters: 417.1376 M Downloading pytorch_model.bin: 100%|██████████████████████████████████████████████| 1.71G/1.71G [09:11<00:00, 3.10MB/s] C:\Users\openA\miniconda3\envs\animatediff\lib\site-packages\huggingface_hub\file_download.py:133: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\openA\.cache\huggingface\hub. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations. To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development warnings.warn(message) Some weights of the model checkpoint at openai/clip-vit-large-patch14 were not used when initializing CLIPTextModel: ['vision_model.encoder.layers.14.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.self_attn.v_proj.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.mlp.fc1.weight', 'vision_model.encoder.layers.17.mlp.fc1.weight', 'vision_model.encoder.layers.10.layer_norm1.weight', 'vision_model.encoder.layers.0.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.weight', 'vision_model.encoder.layers.9.self_attn.k_proj.weight', 'vision_model.encoder.layers.2.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.weight', 'vision_model.encoder.layers.20.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.mlp.fc2.bias', 'vision_model.encoder.layers.12.self_attn.q_proj.bias', 'vision_model.encoder.layers.0.self_attn.q_proj.weight', 'vision_model.encoder.layers.14.mlp.fc2.weight', 'vision_model.encoder.layers.22.layer_norm2.bias', 'vision_model.encoder.layers.9.layer_norm2.bias', 'vision_model.encoder.layers.12.layer_norm1.weight', 'vision_model.encoder.layers.18.self_attn.q_proj.weight', 'vision_model.embeddings.class_embedding', 'vision_model.encoder.layers.14.mlp.fc1.bias', 'vision_model.encoder.layers.21.layer_norm2.bias', 'vision_model.encoder.layers.3.self_attn.k_proj.bias', 'vision_model.encoder.layers.0.mlp.fc1.bias', 'vision_model.encoder.layers.3.self_attn.out_proj.weight', 'vision_model.encoder.layers.19.layer_norm1.bias', 'vision_model.encoder.layers.21.mlp.fc1.weight', 'vision_model.encoder.layers.11.layer_norm2.weight', 'vision_model.encoder.layers.21.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.bias', 'vision_model.encoder.layers.2.layer_norm1.weight', 'vision_model.encoder.layers.2.layer_norm1.bias', 'vision_model.encoder.layers.19.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.layer_norm2.bias', 
'vision_model.encoder.layers.15.mlp.fc1.bias', 'vision_model.encoder.layers.5.layer_norm1.weight', 'vision_model.encoder.layers.4.mlp.fc1.bias', 'vision_model.encoder.layers.11.self_attn.out_proj.weight', 'vision_model.encoder.layers.23.mlp.fc1.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.self_attn.v_proj.bias', 'vision_model.encoder.layers.5.self_attn.out_proj.bias', 'vision_model.encoder.layers.4.mlp.fc2.bias', 'vision_model.encoder.layers.7.layer_norm2.weight', 'vision_model.encoder.layers.19.layer_norm2.weight', 'vision_model.encoder.layers.8.layer_norm1.weight', 'vision_model.encoder.layers.15.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.mlp.fc2.bias', 'vision_model.encoder.layers.2.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.self_attn.v_proj.weight', 'vision_model.encoder.layers.13.mlp.fc1.weight', 'vision_model.encoder.layers.3.self_attn.k_proj.weight', 'vision_model.encoder.layers.13.self_attn.k_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.bias', 'vision_model.encoder.layers.1.self_attn.v_proj.weight', 'vision_model.encoder.layers.9.mlp.fc2.weight', 'vision_model.encoder.layers.13.self_attn.q_proj.bias', 'vision_model.encoder.layers.21.mlp.fc2.bias', 'vision_model.encoder.layers.4.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.layer_norm1.weight', 'vision_model.encoder.layers.20.self_attn.k_proj.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.weight', 'vision_model.encoder.layers.21.mlp.fc2.weight', 'vision_model.encoder.layers.18.mlp.fc1.weight', 'vision_model.encoder.layers.14.layer_norm2.bias', 'vision_model.encoder.layers.23.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.layer_norm2.bias', 'vision_model.encoder.layers.5.self_attn.q_proj.weight', 'vision_model.post_layernorm.bias', 'vision_model.encoder.layers.17.self_attn.q_proj.weight', 'vision_model.encoder.layers.23.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.out_proj.bias', 'vision_model.encoder.layers.5.self_attn.k_proj.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.mlp.fc2.weight', 'vision_model.encoder.layers.18.mlp.fc2.weight', 'vision_model.encoder.layers.18.layer_norm1.weight', 'vision_model.encoder.layers.8.layer_norm1.bias', 'vision_model.encoder.layers.7.layer_norm2.bias', 'vision_model.encoder.layers.14.layer_norm1.bias', 'vision_model.encoder.layers.15.self_attn.k_proj.bias', 'vision_model.encoder.layers.14.self_attn.v_proj.weight', 'vision_model.encoder.layers.6.self_attn.out_proj.weight', 'vision_model.encoder.layers.21.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.mlp.fc1.bias', 'vision_model.encoder.layers.19.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.out_proj.bias', 'vision_model.encoder.layers.18.mlp.fc2.bias', 'vision_model.encoder.layers.21.self_attn.v_proj.weight', 'vision_model.encoder.layers.19.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.layer_norm1.weight', 'vision_model.encoder.layers.10.self_attn.out_proj.bias', 'vision_model.encoder.layers.22.mlp.fc1.weight', 'vision_model.encoder.layers.10.layer_norm1.bias', 'vision_model.encoder.layers.18.self_attn.q_proj.bias', 'vision_model.encoder.layers.6.mlp.fc1.weight', 'vision_model.encoder.layers.2.layer_norm2.weight', 
'vision_model.encoder.layers.4.layer_norm2.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.bias', 'vision_model.encoder.layers.22.mlp.fc2.weight', 'vision_model.encoder.layers.1.mlp.fc1.bias', 'vision_model.encoder.layers.17.self_attn.v_proj.weight', 'vision_model.encoder.layers.21.self_attn.q_proj.bias', 'vision_model.encoder.layers.11.layer_norm1.bias', 'vision_model.encoder.layers.17.self_attn.q_proj.bias', 'vision_model.encoder.layers.20.self_attn.out_proj.bias', 'vision_model.encoder.layers.6.self_attn.v_proj.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.bias', 'vision_model.encoder.layers.22.mlp.fc1.bias', 'vision_model.encoder.layers.2.layer_norm2.bias', 'vision_model.encoder.layers.21.self_attn.out_proj.bias', 'vision_model.encoder.layers.13.mlp.fc1.bias', 'vision_model.encoder.layers.22.mlp.fc2.bias', 'vision_model.embeddings.position_embedding.weight', 'vision_model.encoder.layers.0.mlp.fc2.bias', 'vision_model.encoder.layers.8.self_attn.k_proj.bias', 'vision_model.encoder.layers.5.mlp.fc1.weight', 'vision_model.encoder.layers.7.self_attn.q_proj.weight', 'vision_model.encoder.layers.20.layer_norm1.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.bias', 'vision_model.encoder.layers.16.self_attn.q_proj.weight', 'vision_model.encoder.layers.10.layer_norm2.weight', 'vision_model.encoder.layers.23.self_attn.k_proj.bias', 'vision_model.encoder.layers.0.self_attn.v_proj.bias', 'vision_model.encoder.layers.4.self_attn.k_proj.weight', 'vision_model.encoder.layers.17.self_attn.out_proj.weight', 'vision_model.encoder.layers.6.mlp.fc1.bias', 'vision_model.encoder.layers.10.mlp.fc1.bias', 'vision_model.encoder.layers.7.mlp.fc1.weight', 'vision_model.encoder.layers.14.mlp.fc2.bias', 'vision_model.encoder.layers.23.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.weight', 'vision_model.encoder.layers.6.layer_norm1.bias', 'vision_model.encoder.layers.10.self_attn.out_proj.weight', 'vision_model.encoder.layers.22.self_attn.k_proj.bias', 'vision_model.encoder.layers.1.layer_norm2.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.weight', 'vision_model.encoder.layers.16.layer_norm2.bias', 'vision_model.embeddings.position_ids', 'vision_model.encoder.layers.5.layer_norm2.weight', 'vision_model.encoder.layers.11.self_attn.q_proj.bias', 'vision_model.encoder.layers.15.self_attn.q_proj.weight', 'vision_model.encoder.layers.13.mlp.fc2.bias', 'vision_model.encoder.layers.3.mlp.fc2.weight', 'vision_model.encoder.layers.14.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.mlp.fc2.weight', 'vision_model.encoder.layers.1.mlp.fc1.weight', 'vision_model.encoder.layers.16.self_attn.k_proj.weight', 'vision_model.encoder.layers.19.self_attn.v_proj.bias', 'vision_model.encoder.layers.16.layer_norm1.bias', 'vision_model.encoder.layers.13.mlp.fc2.weight', 'vision_model.encoder.layers.21.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.weight', 'vision_model.encoder.layers.20.layer_norm2.bias', 'vision_model.encoder.layers.9.layer_norm1.weight', 'vision_model.encoder.layers.14.layer_norm1.weight', 'vision_model.encoder.layers.0.layer_norm1.weight', 'vision_model.encoder.layers.14.mlp.fc1.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.bias', 'vision_model.encoder.layers.19.layer_norm1.weight', 'vision_model.encoder.layers.23.self_attn.v_proj.bias', 'vision_model.encoder.layers.6.self_attn.q_proj.bias', 'vision_model.encoder.layers.20.self_attn.k_proj.weight', 
'vision_model.encoder.layers.20.mlp.fc1.bias', 'vision_model.encoder.layers.20.layer_norm2.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.bias', 'vision_model.encoder.layers.18.self_attn.v_proj.weight', 'vision_model.encoder.layers.22.layer_norm2.weight', 'vision_model.encoder.layers.9.layer_norm2.weight', 'vision_model.encoder.layers.15.self_attn.out_proj.weight', 'vision_model.encoder.layers.22.self_attn.v_proj.bias', 'vision_model.encoder.layers.18.self_attn.v_proj.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.self_attn.out_proj.bias', 'vision_model.encoder.layers.21.layer_norm1.weight', 'vision_model.encoder.layers.1.mlp.fc2.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.weight', 'vision_model.encoder.layers.0.layer_norm2.weight', 'vision_model.encoder.layers.23.self_attn.out_proj.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.bias', 'vision_model.encoder.layers.12.layer_norm1.bias', 'vision_model.encoder.layers.2.mlp.fc2.weight', 'vision_model.encoder.layers.2.self_attn.k_proj.weight', 'vision_model.encoder.layers.16.self_attn.out_proj.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.mlp.fc2.bias', 'vision_model.encoder.layers.23.layer_norm2.weight', 'vision_model.encoder.layers.13.layer_norm2.bias', 'vision_model.encoder.layers.17.mlp.fc1.bias', 'vision_model.encoder.layers.18.layer_norm1.bias', 'vision_model.encoder.layers.9.mlp.fc1.weight', 'vision_model.encoder.layers.4.layer_norm2.weight', 'vision_model.encoder.layers.12.mlp.fc1.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.bias', 'vision_model.encoder.layers.17.mlp.fc2.weight', 'vision_model.encoder.layers.12.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.layer_norm1.bias', 'vision_model.encoder.layers.0.self_attn.v_proj.weight', 'vision_model.embeddings.patch_embedding.weight', 'vision_model.encoder.layers.1.self_attn.k_proj.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.bias', 'vision_model.encoder.layers.8.self_attn.q_proj.bias', 'vision_model.encoder.layers.18.self_attn.k_proj.bias', 'vision_model.encoder.layers.13.layer_norm2.weight', 'vision_model.encoder.layers.23.layer_norm1.weight', 'vision_model.encoder.layers.4.self_attn.v_proj.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.weight', 'vision_model.encoder.layers.19.layer_norm2.bias', 'vision_model.encoder.layers.15.self_attn.v_proj.weight', 'vision_model.encoder.layers.11.self_attn.k_proj.bias', 'vision_model.encoder.layers.9.self_attn.q_proj.weight', 'vision_model.encoder.layers.5.self_attn.out_proj.weight', 'vision_model.encoder.layers.1.self_attn.v_proj.bias', 'vision_model.encoder.layers.22.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.v_proj.weight', 'vision_model.encoder.layers.20.self_attn.q_proj.bias', 'text_projection.weight', 'vision_model.encoder.layers.1.self_attn.q_proj.weight', 'vision_model.encoder.layers.8.mlp.fc1.bias', 'vision_model.encoder.layers.8.mlp.fc1.weight', 'vision_model.encoder.layers.23.mlp.fc2.weight', 'vision_model.encoder.layers.5.self_attn.k_proj.bias', 'vision_model.encoder.layers.11.mlp.fc1.weight', 'vision_model.encoder.layers.17.self_attn.k_proj.weight', 'vision_model.encoder.layers.9.layer_norm1.bias', 'vision_model.encoder.layers.3.self_attn.v_proj.weight', 'vision_model.encoder.layers.3.mlp.fc2.bias', 'vision_model.encoder.layers.5.mlp.fc1.bias', 'vision_model.encoder.layers.18.layer_norm2.bias', 
'vision_model.encoder.layers.4.layer_norm1.weight', 'vision_model.encoder.layers.12.self_attn.v_proj.bias', 'vision_model.encoder.layers.14.self_attn.k_proj.weight', 'vision_model.encoder.layers.3.layer_norm1.weight', 'vision_model.pre_layrnorm.bias', 'vision_model.encoder.layers.19.self_attn.q_proj.bias', 'vision_model.encoder.layers.19.mlp.fc1.weight', 'vision_model.encoder.layers.6.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.mlp.fc2.bias', 'vision_model.encoder.layers.22.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.self_attn.q_proj.bias', 'vision_model.encoder.layers.1.layer_norm1.bias', 'vision_model.encoder.layers.11.self_attn.k_proj.weight', 'vision_model.encoder.layers.1.self_attn.out_proj.weight', 'vision_model.encoder.layers.5.self_attn.v_proj.weight', 'vision_model.encoder.layers.22.self_attn.v_proj.weight', 'vision_model.encoder.layers.22.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.layer_norm2.bias', 'vision_model.encoder.layers.11.self_attn.q_proj.weight', 'vision_model.encoder.layers.9.self_attn.k_proj.bias', 'vision_model.encoder.layers.15.layer_norm2.weight', 'vision_model.encoder.layers.2.self_attn.out_proj.weight', 'vision_model.encoder.layers.12.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.mlp.fc2.weight', 'vision_model.encoder.layers.8.self_attn.v_proj.weight', 'vision_model.encoder.layers.8.self_attn.q_proj.weight', 'vision_model.encoder.layers.14.layer_norm2.weight', 'vision_model.encoder.layers.18.layer_norm2.weight', 'vision_model.encoder.layers.1.layer_norm1.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.bias', 'vision_model.encoder.layers.10.self_attn.k_proj.bias', 'vision_model.post_layernorm.weight', 'vision_model.encoder.layers.21.self_attn.v_proj.bias', 'vision_model.encoder.layers.20.mlp.fc2.bias', 'vision_model.encoder.layers.13.self_attn.out_proj.bias', 'vision_model.encoder.layers.9.self_attn.out_proj.bias', 'vision_model.encoder.layers.17.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.layer_norm1.weight', 'vision_model.encoder.layers.19.mlp.fc1.bias', 'vision_model.encoder.layers.3.mlp.fc1.bias', 'vision_model.encoder.layers.15.layer_norm1.weight', 'vision_model.encoder.layers.17.self_attn.v_proj.bias', 'vision_model.encoder.layers.7.mlp.fc2.bias', 'vision_model.encoder.layers.4.mlp.fc1.weight', 'vision_model.encoder.layers.12.self_attn.k_proj.bias', 'vision_model.encoder.layers.20.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm2.bias', 'vision_model.encoder.layers.3.self_attn.q_proj.weight', 'vision_model.encoder.layers.3.layer_norm2.bias', 'vision_model.encoder.layers.15.layer_norm1.bias', 'vision_model.encoder.layers.14.self_attn.q_proj.bias', 'vision_model.encoder.layers.3.mlp.fc1.weight', 'vision_model.encoder.layers.22.self_attn.out_proj.weight', 'vision_model.encoder.layers.6.layer_norm1.weight', 'vision_model.encoder.layers.16.mlp.fc1.weight', 'vision_model.encoder.layers.23.layer_norm1.bias', 'vision_model.encoder.layers.1.layer_norm2.bias', 'vision_model.encoder.layers.17.layer_norm1.bias', 'vision_model.encoder.layers.16.self_attn.k_proj.bias', 'vision_model.encoder.layers.4.self_attn.k_proj.bias', 'vision_model.encoder.layers.7.self_attn.v_proj.bias', 'vision_model.encoder.layers.5.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.mlp.fc2.weight', 'vision_model.encoder.layers.11.self_attn.v_proj.weight', 'vision_model.encoder.layers.15.self_attn.k_proj.weight', 'vision_model.encoder.layers.21.layer_norm1.bias', 
'vision_model.encoder.layers.20.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.layer_norm2.weight', 'vision_model.encoder.layers.12.mlp.fc2.weight', 'vision_model.encoder.layers.7.layer_norm1.bias', 'vision_model.encoder.layers.13.layer_norm1.weight', 'vision_model.encoder.layers.0.mlp.fc1.weight', 'vision_model.encoder.layers.12.self_attn.q_proj.weight', 'vision_model.encoder.layers.13.self_attn.k_proj.bias', 'vision_model.encoder.layers.6.layer_norm2.bias', 'vision_model.encoder.layers.11.layer_norm1.weight', 'vision_model.encoder.layers.3.self_attn.v_proj.bias', 'vision_model.encoder.layers.3.self_attn.out_proj.bias', 'vision_model.encoder.layers.15.self_attn.v_proj.bias', 'vision_model.encoder.layers.10.self_attn.v_proj.bias', 'vision_model.encoder.layers.4.mlp.fc2.weight', 'vision_model.encoder.layers.13.self_attn.q_proj.weight', 'vision_model.pre_layrnorm.weight', 'vision_model.encoder.layers.15.layer_norm2.bias', 'vision_model.encoder.layers.2.self_attn.v_proj.bias', 'vision_model.encoder.layers.23.mlp.fc1.bias', 'vision_model.encoder.layers.0.self_attn.k_proj.weight', 'vision_model.encoder.layers.12.mlp.fc2.bias', 'vision_model.encoder.layers.5.self_attn.q_proj.bias', 'vision_model.encoder.layers.17.layer_norm2.bias', 'vision_model.encoder.layers.19.mlp.fc2.bias', 'vision_model.encoder.layers.7.mlp.fc2.weight', 'vision_model.encoder.layers.8.layer_norm2.weight', 'vision_model.encoder.layers.18.mlp.fc1.bias', 'vision_model.encoder.layers.6.self_attn.k_proj.bias', 'vision_model.encoder.layers.22.layer_norm1.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.weight', 'vision_model.encoder.layers.6.layer_norm2.weight', 'vision_model.encoder.layers.18.self_attn.k_proj.weight', 'vision_model.encoder.layers.6.self_attn.k_proj.weight', 'vision_model.encoder.layers.15.self_attn.q_proj.bias', 'vision_model.encoder.layers.7.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.layer_norm2.bias', 'vision_model.encoder.layers.0.mlp.fc2.weight', 'vision_model.encoder.layers.5.mlp.fc2.weight', 'vision_model.encoder.layers.11.mlp.fc2.weight', 'vision_model.encoder.layers.19.self_attn.k_proj.bias', 'vision_model.encoder.layers.1.self_attn.k_proj.bias', 'vision_model.encoder.layers.8.self_attn.out_proj.weight', 'vision_model.encoder.layers.9.mlp.fc1.bias', 'vision_model.encoder.layers.3.layer_norm2.weight', 'vision_model.encoder.layers.23.layer_norm2.bias', 'vision_model.encoder.layers.13.self_attn.out_proj.weight', 'vision_model.encoder.layers.9.self_attn.out_proj.weight', 'vision_model.encoder.layers.8.self_attn.k_proj.weight', 'vision_model.encoder.layers.4.self_attn.out_proj.weight', 'vision_model.encoder.layers.2.mlp.fc1.bias', 'vision_model.encoder.layers.16.self_attn.out_proj.weight', 'vision_model.encoder.layers.14.self_attn.k_proj.bias', 'vision_model.encoder.layers.13.self_attn.v_proj.bias', 'vision_model.encoder.layers.19.mlp.fc2.weight', 'vision_model.encoder.layers.20.layer_norm1.bias', 'vision_model.encoder.layers.7.mlp.fc1.bias', 'vision_model.encoder.layers.10.self_attn.q_proj.bias', 'vision_model.encoder.layers.13.layer_norm1.bias', 'vision_model.encoder.layers.3.layer_norm1.bias', 'vision_model.encoder.layers.9.mlp.fc2.bias', 'vision_model.encoder.layers.12.layer_norm2.weight', 'vision_model.encoder.layers.6.self_attn.q_proj.weight', 'vision_model.encoder.layers.0.layer_norm1.bias', 'vision_model.encoder.layers.7.self_attn.q_proj.bias', 'vision_model.encoder.layers.16.mlp.fc2.bias', 'vision_model.encoder.layers.16.self_attn.q_proj.bias', 
'vision_model.encoder.layers.22.layer_norm1.bias', 'visual_projection.weight', 'logit_scale', 'vision_model.encoder.layers.3.self_attn.q_proj.bias', 'vision_model.encoder.layers.18.self_attn.out_proj.bias', 'vision_model.encoder.layers.2.mlp.fc1.weight', 'vision_model.encoder.layers.5.layer_norm1.bias', 'vision_model.encoder.layers.15.self_attn.out_proj.bias', 'vision_model.encoder.layers.23.mlp.fc2.bias', 'vision_model.encoder.layers.15.mlp.fc1.weight', 'vision_model.encoder.layers.19.self_attn.out_proj.weight', 'vision_model.encoder.layers.2.self_attn.q_proj.weight', 'vision_model.encoder.layers.17.layer_norm2.weight', 'vision_model.encoder.layers.4.self_attn.q_proj.bias', 'vision_model.encoder.layers.12.mlp.fc1.bias', 'vision_model.encoder.layers.18.self_attn.out_proj.weight', 'vision_model.encoder.layers.6.mlp.fc2.weight', 'vision_model.encoder.layers.0.layer_norm2.bias', 'vision_model.encoder.layers.21.layer_norm2.weight', 'vision_model.encoder.layers.14.self_attn.out_proj.weight', 'vision_model.encoder.layers.16.mlp.fc1.bias', 'vision_model.encoder.layers.8.mlp.fc2.weight', 'vision_model.encoder.layers.0.self_attn.out_proj.weight'] - This IS expected if you are initializing CLIPTextModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing CLIPTextModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). current seed: 10788741199826055526 sampling best quality, masterpiece, 1girl, looking at viewer, blurry background, upper body, contemporary, dress ... 100%|███████████████████████████████████████████████████████████████████████████████| 25/25 [1:00:05<00:00, 144.23s/it] 100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [00:07<00:00, 2.04it/s] save to samples/1-ToonYou-2023-07-13T19-31-04/sample/best-quality,-masterpiece,-1girl,-looking-at-viewer,-blurry-background,-upper.gif current seed: 6520604954829636163 sampling masterpiece, best quality, 1girl, solo, cherry blossoms, hanami, pink flower, white flower, spring season, wisteria, petals, flower, plum blossoms, outdoors, falling petals, white hair, black eyes, ... 68%|███████████████████████████████████████████████████████ | 17/25 [43:32<19:27, 145.95s/it]
As the output above shows, a 16-frame GIF at 512x512 was generated; it took about one hour of compute on the 2080 Ti.
Usage Guide
The normal way of using the project was covered above, but it may still feel a bit foggy, so here is another worked example.
The project provides 8 presets in total, listed below.
python -m scripts.animate --config configs/prompts/1-ToonYou.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/2-Lyriel.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/3-RcnzCartoon.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/4-MajicMix.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/5-RealisticVision.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/6-Tusun.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/7-FilmVelvia.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
python -m scripts.animate --config configs/prompts/8-GhibliBackground.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
Looking at these, the only thing that changes is the configs/prompts/*.yaml path; switching that file switches the prompts and related settings.
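If you ever want to run every preset in one sitting, a CMD for loop can feed each YAML file to the script in turn (use %%f instead of %f if you put this in a .bat file). I have not timed this, and it will obviously take many hours:
for %f in (configs\prompts\*.yaml) do python -m scripts.animate --config %f --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512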
This time let's use 7-FilmVelvia as the example. Open configs/prompts/7-FilmVelvia.yaml under the project root and take a look at the file.
FilmVelvia:
  base: "models/DreamBooth_LoRA/majicmixRealistic_v4.safetensors"
  path: "models/DreamBooth_LoRA/FilmVelvia2.safetensors"
  motion_module:
    - "models/Motion_Module/mm_sd_v14.ckpt"
    - "models/Motion_Module/mm_sd_v15.ckpt"
  seed: [358675358833372813, 3519455280971923743, 11684545350557985081, 8696855302100399877]
  steps: 25
  guidance_scale: 7.5
  lora_alpha: 0.6
  prompt:
    - "a woman standing on the side of a road at night,girl, long hair, motor vehicle, car, looking at viewer, ground vehicle, night, hands in pockets, blurry background, coat, black hair, parted lips, bokeh, jacket, brown hair, outdoors, red lips, upper body, artist name"
    - ", dark shot,0mm, portrait quality of a arab man worker,boy, wasteland that stands out vividly against the background of the desert, barren landscape, closeup, moles skin, soft light, sharp, exposure blend, medium shot, bokeh, hdr, high contrast, cinematic, teal and orange5, muted colors, dim colors, soothing tones, low saturation, hyperdetailed, noir"
    - "fashion photography portrait of 1girl, offshoulder, fluffy short hair, soft light, rim light, beautiful shadow, low key, photorealistic, raw photo, natural skin texture, realistic eye and face details, hyperrealism, ultra high res, 4K, Best quality, masterpiece, necklace, cleavage, in the dark"
    - "In this lighthearted portrait, a woman is dressed as a fierce warrior, armed with an arsenal of paintbrushes and palette knives. Her war paint is composed of thick, vibrant strokes of color, and her armor is made of paint tubes and paint-splattered canvases. She stands victoriously atop a mountain of conquered blank canvases, with a beautiful, colorful landscape behind her, symbolizing the power of art and creativity. bust Portrait, close-up, Bright and transparent scene lighting, "
  n_prompt:
    - "cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
    - "cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
    - "wrong white balance, dark, cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
    - "wrong white balance, dark, cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
Leave everything at its default; prompt and n_prompt are the only two parameters we will change.
Both prompt and n_prompt contain 4 lines, which means 4 animations would be generated. Since this is just a test, keep only the last entry of each and delete the first three.
FilmVelvia:
  base: "models/DreamBooth_LoRA/majicmixRealistic_v4.safetensors"
  path: "models/DreamBooth_LoRA/FilmVelvia2.safetensors"
  motion_module:
    - "models/Motion_Module/mm_sd_v14.ckpt"
    - "models/Motion_Module/mm_sd_v15.ckpt"
  seed: [358675358833372813, 3519455280971923743, 11684545350557985081, 8696855302100399877]
  steps: 25
  guidance_scale: 7.5
  lora_alpha: 0.6
  prompt:
    - "In this lighthearted portrait, a woman is dressed as a fierce warrior, armed with an arsenal of paintbrushes and palette knives. Her war paint is composed of thick, vibrant strokes of color, and her armor is made of paint tubes and paint-splattered canvases. She stands victoriously atop a mountain of conquered blank canvases, with a beautiful, colorful landscape behind her, symbolizing the power of art and creativity. bust Portrait, close-up, Bright and transparent scene lighting, "
  n_prompt:
    - "wrong white balance, dark, cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
In plain language, the remaining prompt describes a lighthearted portrait of a woman dressed as a fierce warrior armed with paintbrushes and palette knives, standing victorious atop a mountain of blank canvases with a colorful landscape behind her (bust portrait, close-up, bright and transparent lighting). The n_prompt lists the usual unwanted artifacts: wrong white balance, darkness, cartoon/anime/sketch styles, low quality, deformed or mutated anatomy, extra or missing limbs, and so on.
In short, we kept only the 4th entry; the official sample generated from that 4th prompt is shown below.
On top of the existing prompt I added one sentence, "On her head was a purple cap", to make a new prompt, while leaving n_prompt unchanged. The modified content is as follows:
FilmVelvia:
  base: "models/DreamBooth_LoRA/majicmixRealistic_v4.safetensors"
  path: "models/DreamBooth_LoRA/FilmVelvia2.safetensors"
  motion_module:
    - "models/Motion_Module/mm_sd_v14.ckpt"
    - "models/Motion_Module/mm_sd_v15.ckpt"
  seed: [358675358833372813, 3519455280971923743, 11684545350557985081, 8696855302100399877]
  steps: 25
  guidance_scale: 7.5
  lora_alpha: 0.6
  prompt:
    - "In this lighthearted portrait, a woman is dressed as a fierce warrior, On her head was a purple cap,armed with an arsenal of paintbrushes and palette knives. Her war paint is composed of thick, vibrant strokes of color, and her armor is made of paint tubes and paint-splattered canvases. She stands victoriously atop a mountain of conquered blank canvases, with a beautiful, colorful landscape behind her, symbolizing the power of art and creativity. bust Portrait, close-up, Bright and transparent scene lighting, "
  n_prompt:
    - "wrong white balance, dark, cartoon, anime, sketches,worst quality, low quality, deformed, distorted, disfigured, bad eyes, wrong lips, weird mouth, bad teeth, mutated hands and fingers, bad anatomy, wrong anatomy, amputation, extra limb, missing limb, floating limbs, disconnected limbs, mutation, ugly, disgusting, bad_pictures, negative_hand-neg"
That is the modified file. You are not limited to adding a single sentence; you can rewrite the prompt completely, just as you would when using Stable Diffusion.
Remember to save after editing, then run the following command in CMD to switch to the project directory.
cd /d D:\openai.wiki\AnimateDiff
Activate the Conda environment; otherwise the system's default Python would be used and the project would not run correctly.
conda activate animatediff
Run the command below; the 7-FilmVelvia.yaml in it is the file we just modified.
python -m scripts.animate --config configs/prompts/7-FilmVelvia.yaml --pretrained_model_path models/StableDiffusion --L 16 --W 512 --H 512
Running this takes a long time to compute, so I will not wait for it here; on this aging machine it would conservatively take several hours.
Summary
This project is fairly demanding on hardware; on weaker machines it is extremely slow.
Still, of all the open-source projects covered on this site so far, it produces the best video results. If you have time to tinker, you could pair it with Stable Diffusion: preview images in SD first, then copy your prompt and settings into the YAML file here and adjust accordingly. That avoids the lucky-dip approach and gives you a much better chance of getting the animation you actually want.