ITCooky Recipes

Lets cooky it yammy things!

Let’s change the face in video or photo with DeepFaceLab!

Date: September 13, 2021

DeepFake sounds like honey to the Russian ear! DeepFaceLab has Russian roots and a gloomy present; suffice it to say that the only complete instructions and the support forum live on an adult web site – which, by the way, is why celebrities are already worried about the existence of DeepFake technology. However, I want to reassure the rest of the public: DeepFake does not threaten you. The main reason is that 2 Instagram photos are not enough for a DeepFake, and even 1000 are not enough, especially if they are taken from the same angle. For a truly convincing DeepFake you need thousands of head images in different positions. These can only be collected for celebrities with many roles and interviews, and a lot of computing power is needed, with latest-generation video cards, which means it is very expensive.

Personally, DeepFake fascinates me. This technology can bring even the most powerful home computers to their knees; a simple DeepFake costs them hours. Someone will remember that the first mp3 files were encoded at a speed of one song per night and say it is only a matter of technological development, but this is a different case.

How DeepFake works
It is best explained in terms of fruit. We want to make orange juice from apples. We take an orange in hand, squeeze it and get orange juice. Then we take an apple with the same hand, squeeze it and get orange juice. In real life this does not happen, but with DeepFake it does. The main thing here is the hand: it needs to be trained, and that requires a lot of computing power and apples. The oranges are not as important as a hand highly trained on apples; it can make grape juice with them too, it just needs a little extra training with grapes first.
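
In code, the “hand” is a shared encoder and each fruit gets its own decoder. Here is a minimal sketch of that idea in Python with PyTorch – my own illustration of the architecture, not DeepFaceLab’s actual model code, and the layer sizes are made up:

# One shared encoder ("the hand"), one decoder per identity.
import torch
import torch.nn as nn

class Encoder(nn.Module):  # learns to squeeze any face into a compact code
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(128 * 32 * 32, 256),
        )
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):  # one per identity: rebuilds that face from the code
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(256, 128 * 32 * 32)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
    def forward(self, z):
        return self.net(self.fc(z).view(-1, 128, 32, 32))

encoder = Encoder()
decoder_src = Decoder()  # trained only on the source face ("oranges")
decoder_dst = Decoder()  # trained only on the target face ("apples")

# The swap: encode a target face, but decode it with the SOURCE decoder.
dst_face = torch.rand(1, 3, 128, 128)      # stand-in for a target frame
fake = decoder_src(encoder(dst_face))      # target pose, source identity
print(fake.shape)                          # torch.Size([1, 3, 128, 128])

The trick is in the last lines: because both decoders share one encoder, the code for an “apple” face can be decoded by the “orange” decoder – exactly the juice substitution from the analogy.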

Video or photo
It does not matter. The video we want to change is split into frames and placed in the workspace/data_dst folder; instead of frames, you can simply put a photo there. If there are several faces in the photo, you can start manual recognition and select the desired face.

Install DeepFaceLab
DeepFaceLab has an official site github.com/iperov/DeepFaceLab but the links to the Windows builds point to a shady file-sharing site or even torrents.
I download the Windows version: DeepFaceLab_NVIDIA_up_to_RTX2080Ti_build_09_06_2021.exe

This is the second version of DeepFaceLab. Users note that version 1 did an excellent job with shadows and face color, while version 2 does not by default, and finding the correct settings is extremely difficult on weak systems (like mine).

Linux: I wanted to install it on Ubuntu, but the installation method is a bit dodgy – it includes software that is not free open source … and it is notable that the creator of DeepFaceLab prefers Windows, so I do everything on it, sorry!

I have an NVIDIA GeForce RTX 2070 with 2304 CUDA® cores and 8 GB of video RAM. CUDA cores are what will be needed, and there are not many here; the NVIDIA GeForce RTX 3070 already has 5888. It is important to note that my video card is weak for DeepFake, and any card will be weak if you want an excellent result overnight, in 12 hours; for that, maybe 3-4 of the best NVIDIA video cards are needed …

All PC configuration:
OS: Windows 10 Pro 21H1 19043.1165
CPU: AMD® Ryzen 7 3700X, 8 cores / 16 threads
MB: Gigabyte X570 I AORUS PRO WIFI
Mem: 64 GB DDR4 1330 MHz
GPU: Nvidia GeForce RTX 2070 8 GB
Disk: Samsung SSD 970 EVO Plus 1TB

Unzip it, and this is what you get

This is the software: there is no graphical interface, and everything is driven by running the scripts found in this folder.

In Windows, you should add it under Display > Graphics Settings so it gets the best video card performance.

Material for DeepFake
The Spanish actor Álvaro Morte will be the face for the transplant. I take the video Entrevista y sesión de doblaje con Álvaro Morte ‘Smalfoot’; it yields only 402 frames, but they are all in the same positions as in the donor video. By the way, this is not the first face I tried; before that I had already taken several. It is better to choose footage with more of the face visible, a free forehead, and so on.

If you have a live face available – let’s say yours – this simplifies the collection of facial data for the transplant. Shoot a video of yourself at the highest resolution. To get as many face positions as possible, you will have to: turn your head to the left, right, up, down, open your mouth, smile, smile and turn your head to the left, then to … and so on. Turn your head as you speak and smile. Take care with the light when you turn; a shadow may appear somewhere on your face, and this will lower the quality. It may be best to shoot against the background of a plain wall on a sunny day, but not in direct sunlight.

As the donor video, I first took two seconds of 720p from the trailer of the new Matrix, but after a night of processing the result was so terrible that … I decided to take a video of terrible quality instead: an old Mexican humorous series with the character Professor Jirafales

Let’s do a DeepFake
We go to the folder where DeepFaceLab was unzipped. The workspace folder already contains examples, so we delete everything; we run:
1) clear workspace.bat

Now we copy the video with Álvaro into workspace and rename it to data_src.mp4

We extract the frames; for this we run:
2) extract images from video data_src.bat
I change:
[0] Enter FPS ( ?:help ) : 3
[png] Output image format ( png/jpg ?:help ) : jpg
The frames appear in the workspace/data_src folder
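
Under the hood this step is just video-to-frames conversion. A rough Python equivalent with OpenCV – my sketch, since DeepFaceLab itself drives ffmpeg, and the exact file naming here is an assumption:

# Grab about N frames per second from the video and save them as JPGs.
import cv2
from pathlib import Path

def extract_frames(video_path, out_dir, fps=3):
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(str(video_path))
    native_fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if unknown
    step = max(1, round(native_fps / fps))          # keep every Nth frame
    count = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if count % step == 0:
            cv2.imwrite(str(out / f"{saved:05d}.jpg"), frame)
            saved += 1
        count += 1
    cap.release()
    return saved

print(extract_frames("workspace/data_src.mp4", "workspace/data_src", fps=3))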

Now we have to find the faces in these frames; we run:
4) data_src faceset extract.bat
This can also be done manually if, for example, there is more than one face in the frame

Here I change these parameters; the rest are default.
[f] Face type ( f/wf/head ?:help ) : f
[0] Max number of faces from image ( ?:help ) : 1

Extracting faces...
Running on GeForce RTX 2070
100%|##############################################################################| 402/402 [07:22<00:00,  3.10it/s]
-------------------------
Images found:        402
Faces detected:      399
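
To give an idea of what faceset extract is doing, here is a toy version in Python with OpenCV’s stock Haar cascade. DeepFaceLab uses a much stronger neural detector plus landmark alignment, so treat this only as the general shape of the step:

# Find a face box in each frame and crop it into an "aligned" folder.
import cv2
from pathlib import Path

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

src = Path("workspace/data_src")
out = src / "aligned"
out.mkdir(exist_ok=True)

for jpg in sorted(src.glob("*.jpg")):
    img = cv2.imread(str(jpg))
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces):  # like "Max number of faces: 1" - keep the largest only
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
        cv2.imwrite(str(out / jpg.name), img[y:y + h, x:x + w])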

Bad faces can be cleaned up by running:
4.2) data_src sort.bat
Remove the blurry ones, etc.; a sketch of the blur scoring follows the sorting list below.

Choose sorting method:
[0] blur
[1] motion_blur
[2] face yaw direction
[3] face pitch direction
[4] face rect size in source image
[5] histogram similarity
[6] histogram dissimilarity
[7] brightness
[8] hue
[9] amount of black pixels
[10] original filename
[11] one face in image
[12] absolute pixel difference
[13] best faces
[14] best faces faster
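
Here is the promised sketch of the blur method ([0]): a common way to score sharpness is the variance of the Laplacian – blurry faces score low, so the worst can be deleted first. My own illustration, not necessarily the exact formula the sort script uses:

# Rank aligned faces from blurriest to sharpest.
import cv2
from pathlib import Path

def blur_score(path):  # low variance of the Laplacian = blurry image
    gray = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

aligned = Path("workspace/data_src/aligned")
scored = sorted((blur_score(p), p.name) for p in aligned.glob("*.jpg"))
for score, name in scored[:10]:  # the 10 blurriest candidates for deletion
    print(f"{score:8.1f}  {name}")
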
Now we copy the Jirafales video into workspace and rename it to data_dst.mp4

We also need to extract frames from it; run:
3) extract images from video data_dst FULL FPS.bat
Here you are only asked for the image format
[png] Output image format ( png/jpg ?:help ) : jpg

We extract the faces:
5) data_dst faceset extract.bat
Here I also choose the face type
[f] Face type ( f/wf/head ?:help ) : f
They can also be cleaned up, but I have not done it.

Now we need to make the face masks

I don’t really understand if I’m doing everything in the correct order, but from the result I see that I am.

I run:
5.XSeg Generic) data_dst whole_face mask - apply.bat
5.XSeg Generic) data_src whole_face mask - apply.bat

It is the most important stage … here you have to judge and think. For my source faces, it is best to take the face from the eyebrows to the chin. To do this, you have to outline a couple of frames manually; it is better to choose frames with different expressions, because the training will then be based on them, as I noticed!
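
For intuition: an XSeg mask is in the end just a polygon you draw around the usable face area, rasterized into a binary mask, and the XSeg network then learns to predict such masks for every frame. A small sketch where the polygon points are invented:

# Rasterize a hand-drawn brow-to-chin polygon into a binary mask.
import cv2
import numpy as np

img = cv2.imread("workspace/data_src/aligned/00000.jpg")  # hypothetical frame
h, w = img.shape[:2]

# Hypothetical polygon, as fractions of the image size.
poly = np.array([[0.30, 0.35], [0.70, 0.35], [0.80, 0.55],
                 [0.65, 0.85], [0.35, 0.85], [0.20, 0.55]])
pts = (poly * [w, h]).astype(np.int32)

mask = np.zeros((h, w), np.uint8)
cv2.fillPoly(mask, [pts], 255)                   # inside the polygon = face
masked = cv2.bitwise_and(img, img, mask=mask)
cv2.imwrite("mask_preview.jpg", np.hstack([img, masked]))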

I don’t edit Jirafales; his face was recognized well by the machine, and you can check it with:
5.XSeg) data_dst mask - edit.bat

For Álvaro, I cut off the forehead by hand in several frames; the mask training will depend on them
5.XSeg) data_src mask - edit.bat

I start training masks:
5.XSeg) train.bat
Here I also choose the face type
[wf] Face type ( f/wf/head ?:help ) : f

I started the training. Here people advise taking a pretrained model and finishing it off: all humans have approximately the same faces – in the number of noses and eyes, almost everyone is the same! But I did not take it; it is a huge file, and to download it you must register on that forum.

You can watch the process: at first the pictures are scary, but this is necessary for the training; the twisted faces help.

How long should you train? They always say the more, the better! I look at the images in the preview and stop when they already look good.
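
If you prefer numbers to eyeballing previews, you could also log the loss values the trainer prints and stop once they flatten out. This is my own crude heuristic, not a DeepFaceLab feature:

# Stop when the average loss improved by less than 1% over the last window.
def should_stop(losses, window=3000, min_improvement=0.01):
    """losses: one loss value per iteration, oldest first."""
    if len(losses) < 2 * window:
        return False                      # not enough history yet
    recent = sum(losses[-window:]) / window
    earlier = sum(losses[-2 * window:-window]) / window
    return (earlier - recent) / max(earlier, 1e-9) < min_improvement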

I start applying these masks:
5.XSeg) data_src trained mask - apply.bat

And I start the main training:
6) train SAEHD.bat
Here I choose these settings and do not change the rest of the configuration
[0] Autobackup every N hour ( 0..24 ?:help ) : 1

[f] Face type ( h/mf/f/wf/head ?:help ) : f

[n] Eyes and mouth priority ( y/n ?:help ) : y

[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.0

The Face style power option is supposed to be responsible for color and shadows; it is very difficult to choose correctly, and it greatly affects the processing time. As a result, I only got a black hole on Álvaro’s nose instead of the shadow of Neo’s nose. Just reset it to zero when you notice that things are going wrong. After that you must continue the training; you will not see quick changes.

At the beginning of the training, in the preview of the result, we have a blot, but you can already recognize the mustache in it!

Here you can see that 68000 iterations have already been done; this model studied overnight, already trying to put Álvaro’s face on Neo – it is learning well!

You can stop training at any time by pressing [Enter]; you can really only check the position, because the image will be blurry at first. Then you start training again.

After 3000 iterations I stop to see how it goes, and I run:
7) merge SAEHD.bat

Here I select everything by default; first the instructions are shown, and then you can jump between the images by pressing [Tab]

First frame

In fact, the result is excellent. The edges can be corrected, the blur matches the video quality, the color has the same cyanotic tint – everything is fine within the low quality of the original. Álvaro’s face even stands out for its clarity!

I train a little more, then stop again and start correcting:
7) merge SAEHD.bat

Blur mask set to 78 with button [E] – so that the edges are blurred, though the color also fades
Erode mask set to 20 with button [W] – makes it look more like the donor
Blur/Sharpen set to -4 with button [H] – I blur it a little, otherwise it is too sharp

You can also trim Álvaro’s mask on all sides (I would like to trim just one side, but that is not possible);
for this, pressing [W] shrinks the mask and [S] grows it back – not immediately, some time is spent on processing.
You can zoom in on the face with button [J] and zoom out with [U]. What the erode and blur controls do to the mask is sketched below.
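
Roughly, this is what the erode and blur keys do to the mask, in OpenCV terms: pull the mask edge inward, then feather it so the transplanted face fades into the frame. A sketch with invented file names, using the values from above:

# Erode the mask, feather its edge, then alpha-blend fake over original.
import cv2
import numpy as np

mask = cv2.imread("face_mask.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
erode, blur = 20, 78                       # like "Erode: 20", "Blur: 78"

kernel = np.ones((erode, erode), np.uint8)
shrunk = cv2.erode(mask, kernel)                                # [W]: shrink inward
feathered = cv2.GaussianBlur(shrunk, (blur | 1, blur | 1), 0)   # [E]: soft edge

fake = cv2.imread("fake_face.jpg")         # hypothetical inputs, same size
orig = cv2.imread("original_frame.jpg")
alpha = feathered[..., None] / 255.0       # fake where mask is strong
out = (alpha * fake + (1 - alpha) * orig).astype(np.uint8)
cv2.imwrite("blended.jpg", out)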

It looks good; the transplant traces are minimized … for some reason, he looks like House M.D. now!

This is the first frame. I press [? /] to apply the settings to the next frame, and I keep pressing it until they are applied to all of them. At the end I press [ESC]

Now I save everything to a video; I run:
8) merged to mp4 lossless.bat
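
For reference, the lossless merge amounts to packing the merged frames back into a video and copying the audio from the original clip. A sketch shelling out to ffmpeg; the frame pattern and frame rate are assumptions, and the .bat may differ in details:

# Frames + original audio -> result.mp4, losslessly encoded.
import subprocess

subprocess.run([
    "ffmpeg", "-y",
    "-r", "25",                                  # assumed source frame rate
    "-i", "workspace/data_dst/merged/%05d.png",  # hypothetical frame pattern
    "-i", "workspace/data_dst.mp4",              # original, for its audio
    "-map", "0:v", "-map", "1:a?",               # video from frames, audio if present
    "-c:v", "libx264", "-crf", "0",              # lossless H.264
    "result.mp4",
], check=True)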

The file result.mp4 appears in the workspace folder. I notice some blending flaws, but the result is still good! I cut it and paste it together with the original here:

Now the photo
I will try to correct a great injustice: Álvaro Morte is not on the poster for the fifth season of La Casa de Papel. I manually delete all files in the workspace/data_dst folder and in workspace/data_dst/aligned; if there are other folders, I delete them too. I put the image of the poster into workspace/data_dst.

I extract the face manually:
5) data_dst faceset extract MANUAL.bat
I point at it with the mouse, click, and then press [ENTER]

Do:
5.XSeg Generic) data_dst whole_face mask - apply.bat
See:
5.XSeg) data_dst mask - edit.bat

Change:
5.XSeg) data_src trained mask - apply.bat
I don’t even know if this will work without training the masks first.

As I understand it, if you had a super-trained model (they say 400-500 thousand iterations) covering all face positions, you would not have to start training from scratch every time. But my training is weak.
I start the training:
6) train SAEHD.bat
Here I put 0.001
[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.001
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.001
Maybe this can give a suitable shade within a suitable time frame.

I did not train it for long, only half an hour; it did not affect the quality much, and I cannot leave it training for days.

I run:
7) merge SAEHD.bat

I improve the quality with the buttons, to my liking

It does not copy the sharp textures of the original; I suspect the model needs to be trained for a long time for that.
But from afar you can already take the photo for the original, and recognize Álvaro in the face too … if you try hard. If I had started with a less sharp photo, it would have been better, just as with the video. The main thing DeepFake did was put the face on (although you can see it is crooked around the eyes); the sharpness and colors of all this can be finished in photo editors, which will be faster.
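
The color part could even be automated instead of finished in a photo editor, for example by matching the result’s color histogram to the original poster. A sketch with scikit-image; the file names are hypothetical:

# Push the DeepFake output's colors toward the original poster's palette.
import cv2
import numpy as np
from skimage.exposure import match_histograms

result = cv2.imread("result_poster.jpg")       # DeepFake output
reference = cv2.imread("original_poster.jpg")  # the real poster

matched = match_histograms(result, reference, channel_axis=-1)
cv2.imwrite("result_colormatched.jpg", matched.astype(np.uint8))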

I’m not going to give up on DeepFake; I’ll keep learning it. Apparently my limit for now is low-quality video and photos without fine facial detail!


2 Responses to “Let’s change the face in video or photo with DeepFaceLab!”

  1. JR says:

    Great post. When you mention:

    start to apply these masks:
    5.XSeg) data_src trained mask – apply.bat

    You never say to do the dst version of that apply.bat

    Was that missed or after running 5.XSeg) train.bat you only need to apply the src trained mask info?

    • Александр says:

I usually apply only dst; I think applying src makes it like a whole new face set for the model, starting all over from 0
