Tagnet: a CLIP tags exploration tool¶

Introduction¶
CLIP and VQGAN allow you to generate beautiful images from text. The descriptions of these images need to be more specific than ordinary natural language and are called prompts [1]. My goal for this document and code is to record how to get the best results from prompts.
On June 1, 2021, Aran Komatsuzaki tweeted that mentioning “Unreal Engine” changes the visual style and quality of an image. Since CLIP was trained on images from the Internet, “Unreal Engine” can be called one of its many sources of inspiration. Even before that, many people looked for tags: words that change how CLIP draws things.
I’ve experimented with many CLIP prompts using a Discord bot by BoneAmputee and decided to build a list of words I use often.
Then I experimented more, especially with a pencil style, and understood that I would need more than one list, because co-occurrences of the words form a graph! I have also added many prompts by other users, often with some editing and pre-processing to make them more uniform.
The prompts directory contains two files with mostly cleaned-up prompt samples.
Uses¶
Counting tags¶
The basic use of the tagnet utility is to count tags and display the number of occurrences of each tag.
You need to provide a directory with text files containing the prompts in the --path command-line argument.
tagnet.py --path ./prompts --mode count_tags
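Here is a minimal sketch of what counting might involve. It assumes that tags are the ";"-separated elements after the first one (as in the extract_tags example further down) and that case is ignored. The name count_tags_sketch is hypothetical and not part of tagnet.

```python
from collections import Counter


def count_tags_sketch(prompts):
    """Count tag occurrences across prompts, ignoring case (an assumption)."""
    counts = Counter()
    for prompt in prompts:
        # Assumption: the first ';'-separated element is the subject,
        # everything after it is a tag.
        elements = [part.strip() for part in prompt.split(';')]
        counts.update(tag.lower() for tag in elements[1:] if tag)
    return counts


prompts = [
    'Sunset in a forest ; VRay ; 3D ; High detail',
    'A quiet lake ; vray ; HDR',
]
counts = count_tags_sketch(prompts)
for tag, number in counts.most_common():
    print(tag, number)
```

The real utility also keeps track of the original casing of each tag, which this sketch skips.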
Filtering¶
You may also need to filter tags by the number of occurrences.
For now, these are the supported modes (you can put whitespace between the mode and the number): =, >, <, >=, <=
Examples:
tagnet.py --path ./prompts --mode count_tags --filter "=1"
tagnet.py --path ./prompts --mode count_tags --filter ">1"
tagnet.py --path ./prompts --mode count_tags --filter "< 3"
tagnet.py --path ./prompts --mode count_tags --filter "<= 8"
tagnet.py --path ./prompts --mode count_tags --filter ">=5"
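The filter strings above can be read as a comparison plus a threshold. The sketch below shows one way to turn such a string into a predicate and apply it to tag counts. The names make_filter_sketch, COMPARISONS and FILTER_PATTERN are made up for illustration, and tagnet's own parsing may differ.

```python
import operator
import re

# Each supported mode mapped to a comparison function.
COMPARISONS = {
    '=': operator.eq,
    '>': operator.gt,
    '<': operator.lt,
    '>=': operator.ge,
    '<=': operator.le,
}
# Two-character modes come first so that '>=' is not read as '>'.
FILTER_PATTERN = re.compile(r'(>=|<=|=|>|<)\s*(\d+)')


def make_filter_sketch(expression):
    """Turn a string such as '>= 5' into a predicate over integer counts."""
    match = FILTER_PATTERN.fullmatch(expression.strip())
    if match is None:
        raise ValueError('unsupported filter: %r' % expression)
    compare = COMPARISONS[match.group(1)]
    threshold = int(match.group(2))
    return lambda count: compare(count, threshold)


tag_counts = {'vray': 8, 'hdr': 3, 'isonoise': 1}
keep = make_filter_sketch('< 3')
filtered = {tag: n for tag, n in tag_counts.items() if keep(n)}
print(filtered)
```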
Tag graph¶
Displaying an approximate graph¶
Often, a prompt contains several tags, for example:
Sunset in a forest ; VRay ; 3D ; High detail
We’ve got two co-occurrences:
VRay and 3D
High detail and 3D
Edges in this graph are weighted by the number of such co-occurrences across all available prompts.
To generate and view the graph, run:
tagnet.py --mode display_graph --path ./prompts
The graph is rendered with Matplotlib and wxWidgets.
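The weighting described above can be sketched as follows. The example prompt yields two pairs for three tags, so this sketch assumes that only neighbouring tags are paired and that edges are undirected. Both are guesses from the example, not confirmed behaviour, and cooccurrence_edges_sketch is a hypothetical name.

```python
from collections import Counter


def cooccurrence_edges_sketch(prompts):
    """Count how often two neighbouring tags appear together."""
    edges = Counter()
    for prompt in prompts:
        # Assumption: the first element is the subject, the rest are tags.
        tags = [part.strip().lower() for part in prompt.split(';')][1:]
        # Assumption: each tag is paired only with the tag that follows it.
        for left, right in zip(tags, tags[1:]):
            # Sort the pair so (a, b) and (b, a) are the same undirected edge.
            edges[tuple(sorted((left, right)))] += 1
    return edges


prompts = [
    'Sunset in a forest ; VRay ; 3D ; High detail',
    'A city at night ; 3D ; VRay',
]
edges = cooccurrence_edges_sketch(prompts)
for pair, weight in edges.items():
    print(pair, weight)
```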

Displaying a web graph¶
There’s a frontend side of the project: CLIP graph visualized. You may want to view the online demo with existing tags or build your own tag graph and see how it differs:
# Replace "your_path" with a path that contains the prompt directory and is writable for the JSON export
# --path is a prompt directory
# --output_file is a path to a new JSON output file
tagnet.py --path ~/your_path/prompt_directory --mode export_graph --output_file ~/your_path/graph.json
Now you can clone the visualization repository to use it locally and copy the generated graph.json as a data source.
# Clone the repository
git clone git@github.com:6r1d/CLIP_graph_visualized.git
cd CLIP_graph_visualized
# Copy the generated graph.json
cp ~/your_path/graph.json ./graph.json
# Run a Python 3 web server locally on port 8080
# (any other web server with static file support might work)
python3 -m http.server --bind 0.0.0.0 8080
Now, by visiting “http://0.0.0.0:8080” or “http://127.0.0.1:8080”, you’ll be able to see your own version of the graph. The visualizer uses the force-graph library by Vasco Asturiano. It allows you to zoom in and out, see tag names and pan the workspace.
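The exact schema of the exported graph.json is not described in this document. By default, force-graph reads an object with a "nodes" list and a "links" list, where each link refers to node IDs through "source" and "target". The sketch below builds data in that shape from weighted tag pairs. The "name" and "weight" fields and the function name are assumptions made for illustration.

```python
import json


def build_graph_data_sketch(edge_weights):
    """Build a nodes/links structure from {(tag_a, tag_b): weight}."""
    names = sorted({name for edge in edge_weights for name in edge})
    ids = {name: index for index, name in enumerate(names)}
    nodes = [{'id': ids[name], 'name': name} for name in names]
    links = [
        # 'weight' is a hypothetical field, not a force-graph requirement.
        {'source': ids[a], 'target': ids[b], 'weight': weight}
        for (a, b), weight in sorted(edge_weights.items())
    ]
    return {'nodes': nodes, 'links': links}


edge_weights = {('3d', 'vray'): 2, ('3d', 'high detail'): 1}
graph = build_graph_data_sketch(edge_weights)
print(json.dumps(graph, indent=2))
```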

Code documentation¶
This section documents the code to make it easier to navigate and maintain.
Tagnet utility¶
Lib directory¶
cmd_args module¶
This module contains code for configuring command-line argument support and related argparse actions.
- class lib.cmd_args.NumberFilterAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)¶
An argparse.Action subclass that validates the number filters. Accepts inputs like <x, = x or >=x, where x is an integer. Ignores a space in the middle.
- Raises
ValueError – if an incorrect format is provided
- class lib.cmd_args.ReadableDirectoryAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)¶
An argparse.Action subclass that checks if a directory is readable.
- Raises
ArgumentTypeError – if a path is invalid
ArgumentTypeError – if a directory is unreadable
- lib.cmd_args.configure_parser()¶
Configures argparse to accept arguments needed by the tagnet utility like “path”, “output_file”, “mode”, “filter”, etc.
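To give a feel for how such actions fit together, here is a hedged sketch of two argparse.Action subclasses and a parser that uses them. The class and function names end in "Sketch" or "_sketch" to mark them as hypothetical. The argument names and modes come from the usage section above, while the normalization of the filter string is my own assumption.

```python
import argparse
import os
import re


class NumberFilterSketch(argparse.Action):
    """Validate filters such as '<3', '= 1' or '>=5'."""

    def __call__(self, parser, namespace, values, option_string=None):
        if re.fullmatch(r'(>=|<=|=|>|<)\s*\d+', values.strip()) is None:
            raise ValueError('incorrect filter format: %r' % values)
        # Assumption: store the filter with whitespace removed.
        setattr(namespace, self.dest, re.sub(r'\s+', '', values))


class ReadableDirectorySketch(argparse.Action):
    """Accept only paths that are existing, readable directories."""

    def __call__(self, parser, namespace, values, option_string=None):
        if not os.path.isdir(values):
            raise argparse.ArgumentTypeError('%r is not a valid directory' % values)
        if not os.access(values, os.R_OK):
            raise argparse.ArgumentTypeError('%r is not readable' % values)
        setattr(namespace, self.dest, values)


def configure_parser_sketch():
    parser = argparse.ArgumentParser(description='Hypothetical tagnet-like parser')
    parser.add_argument('--path', action=ReadableDirectorySketch)
    parser.add_argument('--mode', choices=['count_tags', 'display_graph', 'export_graph'])
    parser.add_argument('--filter', action=NumberFilterSketch)
    parser.add_argument('--output_file')
    return parser


args = configure_parser_sketch().parse_args(
    ['--path', '.', '--mode', 'count_tags', '--filter', '<= 8']
)
print(args.path, args.mode, args.filter)
```

Exceptions raised inside an action's __call__ are not converted into argparse usage errors, so the ValueError and ArgumentTypeError above reach the caller directly.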
process module¶
plot module¶
graph_util module¶
prompts module¶
Contains a function to load prompts from available files.
- lib.prompts.load_prompts(dir_path)¶
Resolves the directory path to a full path, lists the directory contents and loads all available prompts.
Example
>>> from lib.prompts import load_prompts
>>> prompts = load_prompts('./prompts')
>>> prompts[-3:]
[
    '.imagine α-pinene pool ; vray ; PBR ; HDR ; closeup ; DSLR ; hyperrealistic',
    '.imagine omicron ; vray ; hdr illumination ; contest winner',
    '.imagine the night ; vray ; isonoise ; contest winner ; highly sought art'
]
- Parameters
dir_path (str) – a path to the prompt directory
- Returns
a list of strings containing CLIP prompts
- lib.prompts.prompt_split(prompt, maxsplit=0)¶
Splits a prompt into its individual elements.
Examples
>>> prompt_split('.imagine the Fresnel lens ; in fine detail ; rendered in charcoal | realistic', 1)
['.imagine the Fresnel lens', 'in fine detail ; rendered in charcoal | realistic']
>>> prompt_split('in fine detail ; rendered in charcoal | realistic')
['in fine detail', 'rendered in charcoal', 'realistic']
- Parameters
prompt (str) – a prompt to split
maxsplit (int) – a maximum number of splits
- Returns
a list of strings containing prompt elements
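The two functions above could be approximated like this. The sketch assumes that every non-empty line of every file is one prompt and that ";" and "|" are the only separators, which matches the documented examples but may not match the real implementation. The "_sketch" names are hypothetical.

```python
import os
import re
import tempfile


def load_prompts_sketch(dir_path):
    """Read every non-empty line of every file in dir_path as one prompt."""
    full_path = os.path.abspath(os.path.expanduser(dir_path))
    prompts = []
    for name in sorted(os.listdir(full_path)):
        file_path = os.path.join(full_path, name)
        if not os.path.isfile(file_path):
            continue
        with open(file_path, encoding='utf-8') as prompt_file:
            prompts.extend(line.strip() for line in prompt_file if line.strip())
    return prompts


def prompt_split_sketch(prompt, maxsplit=0):
    """Split on ';' or '|' and trim whitespace; maxsplit=0 means no limit."""
    return re.split(r'\s*[;|]\s*', prompt.strip(), maxsplit=maxsplit)


# Write a throwaway prompt file so the example does not depend on ./prompts.
with tempfile.TemporaryDirectory() as directory:
    with open(os.path.join(directory, 'sample.txt'), 'w', encoding='utf-8') as sample:
        sample.write('.imagine omicron ; vray ; hdr illumination\n\n')
    loaded = load_prompts_sketch(directory)

parts = prompt_split_sketch(loaded[0])
print(parts)
```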
tags module¶
This module contains:
a generic tag processing class that corrects case, stores a tag list, counts tags
a function that extracts a list of tags from a CLIP prompt string
- class lib.tags.Tag_processor¶
Stores tag indices, properly cased tag names and the global count of tags.
- Variables
case_fix_dict (dict) – maps each lowercase tag to its properly cased form
tag_list (list) – a list of enumerated lowercase strings
global_tag_count (int) – a count of all the tags added
- add_tags(tag_list)¶
Works like put_tags, but returns nothing
- Parameters
tag_list (list) – a list of strings with tag names (case-insensitive)
Example
>>> from lib.tags import Tag_processor
>>> tp = Tag_processor()
>>> tp.add_tags(['SFX', 'high detail', 'light transport sharpening'])
- get_tag_list()¶
- Returns
A list of dictionaries with “id”, “name” and “rank” keys. The ID is an integer, the name is a string and the rank is a float: the tag’s count divided by the global tag count.
Example
>>> from lib.tags import Tag_processor
>>> tp = Tag_processor()
>>> tp.put_tags(['landscape', 'beautiful', 'neon'])
[0, 1, 2]
>>> tp.get_tag_list()
[
    {'id': 0, 'name': 'landscape', 'rank': 0.3333333333333333},
    {'id': 1, 'name': 'beautiful', 'rank': 0.3333333333333333},
    {'id': 2, 'name': 'neon', 'rank': 0.3333333333333333}
]
- get_tag_numbers()¶
Lists the tags together with their counts.
- Returns
a list of tuples, containing tag names and numbers
Example
>>> from lib.tags import Tag_processor
>>> tp = Tag_processor()
>>> tp.put_tags(['landscape', 'beautiful', 'neon'])
[0, 1, 2]
>>> tp.get_tag_numbers()
[('landscape', 1), ('beautiful', 1), ('neon', 1)]
- get_tag_rank(tag_id)¶
- Parameters
tag_id (int) – a tag index
- Returns
the rank of the tag: its count divided by the global tag count
- put_tag(tag)¶
- Parameters
tag (str) – a tag name, case-insensitive
- Returns
a tag ID
Example
>>> from lib.tags import Tag_processor
>>> tp = Tag_processor()
>>> tp.put_tag('VFX')
0
>>> tp.put_tag('HDR')
1
>>> tp.put_tag('DSLR')
2
- put_tags(tag_list)¶
- Parameters
tag_list (list) – a list of strings with tag names (case-insensitive)
- Returns
a list of tag IDs
Example
>>> from lib.tags import Tag_processor
>>> tp = Tag_processor()
>>> tp.put_tags(['SFX', 'high detail', 'light transport sharpening'])
[0, 1, 2]
- lib.tags.extract_tags(prompt)¶
Extract a list of the tags from a single prompt.
- Parameters
prompt (str) – a prompt for the CLIP neural network
Example
>>> from lib.tags import extract_tags
>>> extract_tags('.imagine the color clash ; HDR ; hyperrealistic ; contest winner')
[
    'HDR',
    'hyperrealistic',
    'contest winner'
]
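The bookkeeping that Tag_processor does can be illustrated with a much smaller class. This is only a sketch: TagProcessorSketch and its tag_counts attribute are hypothetical, and treating the first-seen spelling as the "proper" case is an assumption. The rank follows the documented definition, the tag count divided by the global tag count.

```python
class TagProcessorSketch:
    """A reduced, hypothetical take on the documented Tag_processor."""

    def __init__(self):
        self.case_fix_dict = {}    # lowercase tag -> first-seen casing (assumption)
        self.tag_list = []         # position in this list is the tag ID
        self.tag_counts = {}       # lowercase tag -> occurrences (hypothetical attribute)
        self.global_tag_count = 0  # number of tags added, repeats included

    def put_tag(self, tag):
        """Register one occurrence of a tag and return its ID."""
        key = tag.lower()
        if key not in self.case_fix_dict:
            self.case_fix_dict[key] = tag
            self.tag_list.append(key)
        self.tag_counts[key] = self.tag_counts.get(key, 0) + 1
        self.global_tag_count += 1
        # A linear search is fine for a sketch; a dict of IDs would scale better.
        return self.tag_list.index(key)

    def get_tag_rank(self, tag_id):
        """Return the tag's count divided by the global tag count."""
        return self.tag_counts[self.tag_list[tag_id]] / self.global_tag_count


tp = TagProcessorSketch()
ids = [tp.put_tag(tag) for tag in ['VFX', 'HDR', 'vfx', 'DSLR']]
print(ids, tp.get_tag_rank(0))
```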
Plans¶
Modes¶
Currently, there are two modes to display the graph: a wxWidgets interface in Python and a web interface.
There is potential to get more information out of the dataset by expanding the available modes.
Adjacency graph¶
A mode for an adjacency graph will require a bit more work, for example, exporting only the top N tags and limiting tag lengths so that everything can be displayed.
Experiments¶
Edge weighting¶
Edge weights in pair_mgr are currently divided by an edge_count parameter.
I am not sure this is the ideal option for showing the maximum amount of detail.
Weighting by relation¶
Will adding edges between tags like Abstract style and Abstract add more context? How should those edges be weighted properly?
Argparse¶
Save contents for tag_manager and pair_manager
Footnotes
- 1
Understanding Community Detection Algorithms with Python NetworkX
- 2
- 3
Node2vec: Scalable Feature Learning for Networks; How node2vec works — and what it can do that word2vec can’t
- 4
Paper2vec: Citation-Context Based Document Distributed Representation for Scholar Recommendation by Han Tian and Hankz Hankui Zhuo
Footnotes
- 1
arXiv: Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm by Laria Reynolds and Kyle McDonell. This article discusses the GPT-3 language model, but the same term applies to GPT-2, GPT-J and CLIP itself.