Burnins: Text alignment and padding based on font metrics#1752
iLLiCiTiT merged 20 commits into ynput:develop
Conversation
Ah, I see. This is caused by this line:
padding_y = padding - descent
This was done to shrink the box closer to the original dimensions. I've tested a few things and see three possible approaches:
Note: Options 1 and 3 are considerably simpler from a code perspective, which may make them more robust and easier to maintain. Options 1 and 2 arguably produce the cleanest visual results. For reference, the script used to generate these examples is attached below:
import subprocess

from PIL import ImageFont

# SETTINGS
FONT_FILE = "/mnt/pipeline/fonts/Roboto-Regular.ttf"
FONT_SIZE = 32
FONT_ALIGN = "font"  # "text", "baseline", "font"
ADJUST_PADDING = True


def draw_text(text: str, position: tuple[int, int], padding: int):
    """Build one drawtext filter string with a padded background box."""
    font = ImageFont.truetype(FONT_FILE, FONT_SIZE)
    args = [
        "drawtext@bottomleft",
        f"fontfile='{FONT_FILE}'",
        f"fontsize={FONT_SIZE}",
        f"text='{text}'",
        f"x={position[0]}",
        f"y_align={FONT_ALIGN}",
        "fontcolor=#FFFFFF@1.0",
        "box=1",
        "boxcolor=#338D3885@0.5",
    ]
    if FONT_ALIGN == "font":
        args.append(f"y=h-{position[1]}-font_a")
    else:
        args.append(f"y=h-{position[1]}")

    # get the height of the area below the baseline
    _, _, _, extra_bottom = font.getbbox(text, anchor="ms")
    # distance from the top of the current text to the top of a lowercase letter
    (_, dist_lowercase, _, _) = font.getbbox("a", anchor="la")
    (_, dist_top, _, _) = font.getbbox(text, anchor="la")

    pad_l = padding
    pad_r = padding
    if ADJUST_PADDING:
        pad_t = max(padding + dist_top - dist_lowercase, 0)
        pad_b = max(padding - extra_bottom, 0)
    else:
        pad_t = padding
        pad_b = padding

    # drawtext expects the four values in top|right|bottom|left order
    boxborderw = f"boxborderw={pad_t}|{pad_r}|{pad_b}|{pad_l}"
    args.append(boxborderw)
    return ":".join(args)


def main(i):
    print(f"===== Main {i=} ======")
    filters = [
        draw_text(f"p={i}", (20, 100), padding=i),
        draw_text("Abc", (100, 100), padding=i),
        draw_text("uvw", (200, 100), padding=i),
        draw_text("xyz", (300, 100), padding=i),
        draw_text("_", (400, 100), padding=i),
        draw_text("~", (500, 100), padding=i),
        draw_text("^", (600, 100), padding=i),
    ]
    args = [
        "ffmpeg",
        "-y",
        "-v", "quiet",
        "-i",
        "/mnt/d/work/checkerboard.png",
        "-vf",
        ",".join(filters),
        "-frames:v", "1",
        "-update", "1",
        f"/mnt/d/work/out_a_{i}.png",
    ]
    subprocess.run(args)


if __name__ == "__main__":
    # render one test frame per padding value
    for i in range(30):
        main(i)
    # for i in [0, 10, 20, 30]:
    #     main(i)
    # main(8)
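The padding adjustment in the script can be looked at on its own, without Pillow or ffmpeg. The sketch below is only an illustration: the function name adjusted_box_padding is hypothetical, and the metric values are made-up pixel numbers rather than measurements from Roboto or any other font.

```python
def adjusted_box_padding(padding, dist_top, dist_lowercase, extra_bottom):
    """Return (top, bottom) box padding, clamped at zero.

    dist_top:       top of the text's bounding box, measured from the ascender line
    dist_lowercase: the same measurement for a plain lowercase letter such as "a"
    extra_bottom:   how far the text extends below the baseline
    """
    # Taller text has a smaller dist_top, so it receives less top padding.
    pad_t = max(padding + dist_top - dist_lowercase, 0)
    # Text with descenders already fills part of the bottom padding.
    pad_b = max(padding - extra_bottom, 0)
    return pad_t, pad_b


# Hypothetical metrics in pixels, not taken from a real font.
print(adjusted_box_padding(10, dist_top=7, dist_lowercase=12, extra_bottom=0))
print(adjusted_box_padding(10, dist_top=12, dist_lowercase=12, extra_bottom=7))
print(adjusted_box_padding(0, dist_top=7, dist_lowercase=12, extra_bottom=7))
```

With these example numbers, text containing capitals loses top padding, text with a descender loses bottom padding, and at padding=0 both values are clamped to zero instead of going negative.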
|
I think option 1 is the best of all the options.
|
BUT what I think is important is that bottom padding should be decreased by So if I have
|
Ok, then if padding is If there is |
|
Not really, it is a combination of both. Bottom padding does not start from the "full height" bottom, but from the descentless bottom. So if padding is
EDITED: Why I think it is important: It is weird that the height is so big with padding compared to the side padding.
|
Ok, I think I got it. Here are some examples with and as an animation going from 0 to 30: text_v3.mp4
Explanation: using PIL's text anchor reference
at padding=10
this addresses:
at padding=0
this addresses:
Compared to web text, it's a bit different from how CSS does padding.
So it appears that the padding is simply applied around
Here is a jsfiddle version where I changed the padding background color to confirm this:
return args
def _drawtext(align, resolution, text, options):
I left this in, since it is used for ffmpeg_burnins._drawtext and I don't know if that function would break if the dict contains more than just the expected x and y keys.
|
They are added, but the font path is not escaped as expected, so it fails on Windows paths.
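For context, a Windows path breaks drawtext because the drive-letter colon collides with the ":" that separates filter options. The helper below is a rough sketch of one common workaround and not necessarily the fix used in this PR: the name escape_fontfile is hypothetical, it assumes the value is later wrapped in single quotes as in fontfile='...', and it does not handle single quotes inside the path.

```python
def escape_fontfile(path: str) -> str:
    """Escape a font path for a single-quoted drawtext fontfile option."""
    # Forward slashes are accepted on Windows and avoid backslash escaping.
    path = path.replace("\\", "/")
    # ":" separates drawtext options, so a drive-letter colon must be escaped.
    return path.replace(":", "\\:")


print(escape_fontfile(r"C:\Windows\Fonts\arial.ttf"))
print(escape_fontfile("/mnt/pipeline/fonts/Roboto-Regular.ttf"))
```

A POSIX path passes through unchanged. The exact escaping needed may differ depending on whether the filter string goes through a shell or is passed as an argument list.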
|
@iLLiCiTiT I had a look at Roboto Regular (left, the font I was using in my tests) vs.
to avoid clipping text that goes beyond the reported font metrics I replaced |
|
@BigRoy can you give it a go? Just to confirm... |
|
I made one more change that I'm not sure is related to my font: the rightmost x offset was off by 2 pixels. How to test:
BigRoy left a comment
Before
tes_maya_reviewMain_v005_h264.mp4
After
tes_maya_reviewMain_v006_h264.mp4
It's hard to tell for me whether it's better. I like the fact that the text is more centered within their backgrounds, but I personally prefer the old where all the boxes touch the sides, and the data is tighter to the 'sides'
I think if we can keep it so they are as tightly squeezed to the sides then I think this is solid. But now the font is coming in from the sides too much. I'm a bit worried if instead we change the 'defaults' for the text in settings that it may fix it to look more alike, but at the same time anyone with any override to the settings would likely not benefit from this newer standard - so not sure how to keep this better backwards compatible.
So, is there anything we can do to make it stick closer to the original state?
That is "autofixed" with settings conversion. To test it without conversion, set x and y offset to 0.
BigRoy left a comment
but I personally prefer the old where all the boxes touch the sides, and the data is tighter to the 'sides'
That is "autofixed" with settings conversion. To test it without conversion, set x and y offset to 0.
I see now. What a beauty.
tes_maya_reviewMain_v007_h264.mp4
|
Pillow does not calculate text width correctly. I will also test timecode and list values, because I believe we changed only one part.
|
Ok, all tested... Merging, and let's pray. |
Changelog Description
Make sure burnins rendered through FFmpeg's drawtext and aligned with the bottom in Extract Burnin don't jump around based on the text set in them. Align them by a static font height instead of the text height, so that a taller character appearing in the text for a moment doesn't make the box jump up or down.
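To illustrate why a static font height keeps the box steady, here is a small arithmetic sketch. All names and pixel values are hypothetical. In practice the ascent and descent could come from something like Pillow's ImageFont.getmetrics(), but that is an assumption about usage, not a description of the code in this PR.

```python
def box_top_from_text(frame_height, offset, text_height):
    """Top edge when the box is sized by the rendered text (varies per text)."""
    return frame_height - offset - text_height


def box_top_from_font(frame_height, offset, ascent, descent):
    """Top edge when the box is sized by the font's own ascent + descent."""
    return frame_height - offset - (ascent + descent)


FRAME_HEIGHT, OFFSET = 1080, 20
# Hypothetical pixel values, not measured from a real font.
ASCENT, DESCENT = 30, 8
TEXT_HEIGHTS = {"xyz": 23, "Abc": 24, "_": 3}

for text, text_height in TEXT_HEIGHTS.items():
    by_text = box_top_from_text(FRAME_HEIGHT, OFFSET, text_height)
    by_font = box_top_from_font(FRAME_HEIGHT, OFFSET, ASCENT, DESCENT)
    print(f"{text!r}: by text {by_text}, by font {by_font}")
```

The text-based top edge changes with every string, while the font-based top edge stays the same for all of them, which is the behaviour the changelog describes.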
Additional info
Came up internally here.
Testing notes:
{comment} at the bottom of the burnin.
Example Images