Images 💾

Last commit ⭐

commit 9495f64f913107ef746588986edb8de5e4c39db7
Author:     Nico Weber <thakis@chromium.org>
AuthorDate: Mon Jan 1 19:31:27 2024 -0500
Commit:     Andreas Kling <kling@serenityos.org>
CommitDate: Tue Jan 2 22:13:21 2024 +0100

    LibPDF: Improve hex string parsing
    
    A local (non-public) PDF I have lying around contains this in
    a page's operator stream:
    
    ```
    [<00b4003e> 3 <002600480051> 3 <005700550044004f0003> -29
    <00330044> 3 <0055> -3 <004e0040> 4 <0003> -29 <004c00560003> -31
    <0057004b> 4 <00480003> -37 <0050
    >] TJ
    ```
    
    That is, there's a newline in a hex string after a character.
    
    This led to `Parser error at offset 5184: Unexpected character`.
    
    The spec says in 3.2.3 String Objects, Hexadecimal Strings:
    """Each pair of hexadecimal digits defines one byte of the string.
    White-space characters (such as space, tab, carriage return, line feed,
    and form feed) are ignored."""
    
    But we didn't ignore whitespace before or after a character, only
    in between the bytes.
    
    The spec also says:
    """If the final digit of a hexadecimal string is missing—that is, if
    there is an odd number of digits—the final digit is assumed to be 0."""
    
    In that case, we were skipping the closing `>` twice -- or, more
    accurately, we ignored the character after it too. This has been
    wrong ever since #6974.
    
    Add a test that fails if either of the two changes isn't present.
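
The two spec rules quoted above can be illustrated with a minimal sketch. This is not the LibPDF code (which is C++); `parse_hex_string` and its signature are hypothetical names chosen for illustration. It assumes the input starts at the opening `<`, skips PDF white-space wherever it appears, pads an odd digit count with `0`, and consumes the closing `>` exactly once.

```python
# PDF white-space characters: NUL, tab, line feed, form feed, carriage return, space.
WHITESPACE = b"\x00\t\n\x0c\r "
HEX_DIGITS = b"0123456789abcdefABCDEF"


def parse_hex_string(data, pos=0):
    """Parse a hex string whose '<' is at data[pos].

    Returns (decoded bytes, position just past the closing '>').
    Hypothetical helper for illustration, not the LibPDF API.
    """
    if data[pos:pos + 1] != b"<":
        raise ValueError("expected '<' at offset %d" % pos)
    pos += 1

    digits = bytearray()
    while True:
        if pos >= len(data):
            raise ValueError("unterminated hex string")
        ch = data[pos]
        pos += 1  # consume exactly one byte per iteration, including '>'
        if ch == ord(">"):
            break
        if ch in WHITESPACE:
            continue  # ignored before, between, and after digits
        if ch not in HEX_DIGITS:
            raise ValueError("unexpected character at offset %d" % (pos - 1))
        digits.append(ch)

    # An odd number of digits means the final digit is assumed to be 0.
    if len(digits) % 2:
        digits.append(ord("0"))

    return bytes.fromhex(digits.decode("ascii")), pos
```

For example, `parse_hex_string(b"<0050\n>] TJ")` accepts the newline before `>` and leaves the position pointing at `]`, so the character after the closing `>` is not swallowed.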