TEXT TO IMAGE

The tutorial below walks you through Fooocus, an open source AI image generator that is highly developed and also has a user-friendly interface. I have some thoughts on text to image: it is a big mistake to assume that these platforms will create an image for you based on a simple demand. The truth of the matter is that you, and you alone, make the picture, using a novel technology as a technical assistant. Nothing less than that, but nothing more either. Read more here >>>

Fooocus, which was created by lllyasviel, can be downloaded from GitHub here: https://github.com/lllyasviel/Fooocus

At the top of the page you will see the code. When you scroll down, past the code section and a comparison chart with Midjourney, you will see the download link, which is Windows only. Do what it says on the page: download the zip file and uncompress it. In it you will see 3 .bat files. You only need to run the first of these, which is called run.bat. Double click on this file; it will open a DOS window and load the platform. The other two load other presets, which I personally haven't tried yet.

Note: The DOS window that opened when you ran run.bat should be minimized but kept open the whole time that you are working with Fooocus.

Once the run file has loaded, your default browser will open a new tab showing the Fooocus interface. What follows from here on is a series of tutorial screenshots:

MAIN

Inpainting and Outpainting

Now that we have gone through the main window of Fooocus, we come to the little "Input Image" check box on the bottom left. Once you check this, the page gets longer and you get a menu with 4 tabs.

The first of these is called "Upscale or Variation". With this you can upscale an image by dragging or uploading it into the little window. I would not advise this, since it takes a long time and there is software out there, which we will get to later, that does a very good job of this task. What you can also do is create variations of the uploaded or dragged image. In my experience "Vary Subtle" hardly changes the original image, so I wouldn't bother with that one. "Vary Strong", however, does make changes. The issue is that the changes are often too drastic and move too far from the original. It is a good idea to use the same prompts as the original, as well as the same seed instead of a random one.

As I have already shown above, you have access to your history, from where you can obtain all of this info. You have access not only to the session log but to your entire Fooocus history, which you can find in a folder called "Output". This sits in the Fooocus folder, nested inside the folder that you downloaded, which also contains the run.bat file. It is a good idea to clean up this output archive from time to time and move whatever you want to keep to an external drive, since it will otherwise end up taking up a huge amount of space on your C drive.
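If you would rather not do this cleanup by hand, the move can be scripted. What follows is only a minimal sketch in Python, not anything that ships with Fooocus: the function name, the 30 day cutoff and both folder paths in the usage comment are my own assumptions, so adjust them to your setup. It also assumes that the archive folder is not located inside the output folder.

```python
import shutil
import time
from pathlib import Path


def archive_old_outputs(output_dir, archive_dir, max_age_days=30, now=None):
    """Move files older than max_age_days from output_dir to archive_dir.

    The sub-folder layout is preserved. Returns the new paths of the moved files.
    """
    output_dir = Path(output_dir)
    archive_dir = Path(archive_dir)
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60

    # Collect the files first so the folder is not changed while it is scanned.
    candidates = [p for p in output_dir.rglob("*") if p.is_file()]

    moved = []
    for path in candidates:
        if path.stat().st_mtime >= cutoff:
            continue  # modified recently enough, leave it where it is
        target = archive_dir / path.relative_to(output_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


# Usage (both paths are hypothetical examples):
# archive_old_outputs(r"C:\Fooocus\Fooocus\Output", r"E:\FooocusArchive")
```

The script goes by the date a file was last modified, so anything that you generated or touched within the cutoff stays on the C drive.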

variation-all.jpg

The second tab is called "Image Prompt". Here you can place 4 different images inside 4 windows and create an iteration without having to write anything, although I find that writing an additional prompt helps. If you decide to do this, be super careful to use only your own images or Creative Commons images that you have permission to use. I am not showing a completed iteration here, since the result is somewhat similar to the "Vary Strong" feature above. The iterations are often too far removed from the prompt images, even if the prompt images are very similar to each other and you are using the same styles and the same seed.

image-prompt-1.jpg

The third tab is where the action really starts. It is called "Inpaint or Outpaint", and here you can do some really wonderful stuff, especially when it comes to achieving high end, detailed results. The first time that you use it, Fooocus will download some additional files to your computer, which takes a little while.

Outpainting makes the image that you drag into the window wider or higher or both. This is the default mode; it is what you see first in the little drop down menu.

outpaint.jpg

You can also use the default "Inpaint or Outpaint" mode to change content, as shown in the gallery below:

One of the other two choices in the drop down menu is "Improve Detail (face, hand, eyes, etc.)", which is extremely useful for face details or any other detailed stuff that you wish to improve upon. However, in order to get this to work really well you should upscale whatever it is that you want to improve to at least twice the size (3 or 4 times is even better), so that the algorithm has enough pixels to work with. Trying to improve details on a low resolution image will give you only mediocre results. Scroll down to the "Upscaling" section on this page to find out how to upscale using a free piece of software called Upscayl.

detail-5.png

So, here is the result, with the fixed face details on the left and the original raw state on the right. As you can see from the 100% zoom on the bottom, the fix was performed on a very large image which was upscaled to 3 times the original size. With the original sizes that Fooocus gives you, which are quite small, there is no way that you would get a good result. So, always remember to upscale before you start making detail fixes.
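To see why the upscale factor matters so much, it helps to look at the numbers. The little Python sketch below is only an illustration: the function name is my own, and the 120 x 150 pixel face region is a made-up example, not a measurement from the image above. It shows how the number of pixels that the algorithm gets to work with grows with the square of the upscale factor.

```python
def upscaled_region(width, height, factor):
    """Return the new size of a region and how many times more pixels it has."""
    new_width = width * factor
    new_height = height * factor
    pixel_gain = (new_width * new_height) // (width * height)
    return new_width, new_height, pixel_gain


# Hypothetical face region of 120 x 150 pixels in the original render.
for factor in (2, 3, 4):
    w, h, gain = upscaled_region(120, 150, factor)
    print(f"{factor}x: {w} x {h} pixels, {gain} times as many pixels")
```

At 3 times the size, the same face is covered by 9 times as many pixels, which is why the detail fix has so much more to work with.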

Under the "Inpaint or Outpaint" tab there is a third option in the drop down menu, called "Modify Content (add objects, change background, etc.)". I am not going to add screenshots to show you the procedure for this, since it is exactly like the one above for improving details: You drag or upload the picture into the window, paint over the area where you want to add, remove or replace things, make sure that the same styles that you find in your history log are enabled, and then press generate.

I will, however, show you 4 images, the "before" and then 3 variations, to show you the sorts of results that you can achieve. I deleted the main prompt at the top so that the algorithm did not get confused and add cats (a word in that main prompt). Then I wrote into the small inpaint prompt box "add gold leaves and gold art nouveau stuff scrolls remove table remove platform". I painted over the entire area around the cat, and this is what I got:

IN/OUTPAINT

The Prompt

Now that you know how it works, pay attention to the following while you are putting together your prompt, since this is the most important part of the whole technique. Essentially you are painting a picture with words rather than with a pencil, a brush or a computer mouse. But the process is exactly the same: you first have to imagine the picture. So close your eyes and visualize exactly what the picture should look like. This means bringing together many different visual components in your head and then putting them into a kind of natural writing (in other words, not keywords but proper small sentences, separated by commas) that is uncomplicated enough for the algorithm to understand but also holds all the information that it needs to do what you want it to do:

  • all of the objects that you want to see, even small things, written in order of importance in the big "positive" prompt box under the render area.

  • all of the things that you do not want to see, written in the small "negative" prompt box on the side.

  • the activity of the animate things (people, animals, vehicles, etc.) in the picture: Are they sitting down, flying, running, doing aerobics, sleeping, making love, eating, etc.? You have to specify all of that as well.

  • material of background (such as "infinity background" or "thickly wooded forest")

  • material of ground or placement surface (such as "tablecloth" or "dark soil")
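If it helps to think of the components listed so far as a checklist, here is one way to sketch that in Python. This is purely an illustration and has nothing to do with how Fooocus works internally: the function name and all of the example phrases are my own assumptions. It simply joins the pieces with commas, most important object first, into one text for the positive box and one for the negative box.

```python
def build_prompts(objects, activity, background, ground, unwanted):
    """Join the visual components into a positive and a negative prompt."""
    # Objects are expected in order of importance, most important first.
    parts = list(objects) + [activity, background, ground]
    positive = ", ".join(p.strip() for p in parts if p and p.strip())
    negative = ", ".join(u.strip() for u in unwanted if u and u.strip())
    return positive, negative


# Example values only; replace them with the picture that you have in mind.
positive, negative = build_prompts(
    objects=[
        "a big dark grey cat in a round glass bowl filled with water",
        "small dark fish swim around the cat",
    ],
    activity="the cat is sitting still",
    background="infinity background",
    ground="tablecloth",
    unwanted=["table", "platform"],
)
print(positive)
print(negative)
```

You would then paste the first line into the positive prompt box and the second into the negative one. Writing the pieces down separately like this makes it easier to spot which component you have forgotten.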

The mood of the picture:

  • happy

  • upbeat

  • party

  • sad

  • depressed

  • scary

  • gloomy

  • serious

  • business

  • moody

  • romantic

  • meeting

  • natural

  • appetizing

  • clean

  • hygienic

  • etc etc

The type of shot angle that you want. (Read more about camera angles here >>>) This could be:

  • closeup

  • high angle

  • low angle

  • dutch angle

  • eye level

  • bird's eye

  • medium angle

  • long angle