Redaction Annotation Fill Not Matching Up With Redacted Section #3575
Comments
Inserting / Adding stuff to rotated pages can be confusing. For most methods in PyMuPDF you must pass rotated coordinates (for points, rectangles, ...) to get them in the right place.

import pymupdf as fitz  # PyMuPDF

RED = fitz.pdfcolor["red"]


def process_pdf(input_pdf_path, output_pdf_path):
    # Open the input PDF file
    document = fitz.open(input_pdf_path)
    # Iterate through each page
    for page in document:
        # 234 is half of the width of the page
        rect = fitz.Rect(0, 0, 234, 234)
        rot_rect = rect * page.derotation_matrix
        redact_annot = page.add_redact_annot(
            rot_rect, text=f"{page.number=}", text_color=RED
        )
        page.apply_redactions()
    document.ez_save(output_pdf_path)


if __name__ == "__main__":
    input_pdf_path = "input.pdf"  # Replace with the path to your input PDF
    output_pdf_path = "output.pdf"  # Replace with the path to your output PDF
    process_pdf(input_pdf_path, output_pdf_path)
    print(f"Processed PDF saved to {output_pdf_path}")
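
For readers wondering what the multiplication by `page.derotation_matrix` in the snippet above does to the rectangle, here is a minimal pure-Python sketch of the arithmetic. It assumes a top-left origin, clockwise page rotation in multiples of 90 degrees, and a made-up page size; `derotate_rect` is a hypothetical helper for illustration, not a PyMuPDF function and not the library's implementation.

```python
def derotate_rect(rect, rotation, width, height):
    """Map a rectangle from visible (rotated) to unrotated page coordinates.

    rect: (x0, y0, x1, y1) as seen on the rotated page, origin at top-left.
    rotation: page rotation in degrees clockwise (0, 90, 180 or 270).
    width, height: size of the unrotated page.
    """

    def derotate_point(x, y):
        if rotation == 0:
            return x, y
        if rotation == 90:
            return y, height - x
        if rotation == 180:
            return width - x, height - y
        if rotation == 270:
            return width - y, x
        raise ValueError("rotation must be 0, 90, 180 or 270")

    x0, y0, x1, y1 = rect
    ax, ay = derotate_point(x0, y0)
    bx, by = derotate_point(x1, y1)
    # Re-normalize so the first corner is again the top-left one.
    return (min(ax, bx), min(ay, by), max(ax, bx), max(ay, by))


# Assumed example: a 468 x 612 pt page stored with rotation 90,
# so it is displayed as 612 x 468 pt (8.5 x 6.5 in).
print(derotate_rect((0, 0, 234, 234), 90, 468, 612))  # (0, 378, 234, 612)
```

Under these assumptions, the square in the visible top-left corner corresponds to the bottom-left region of the unrotated page, which is why passing the visible rectangle unchanged puts the annotation in an unexpected place.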
Thanks for responding! This is part of the issue, but it is still not solving the issue of the redact_annot fill. The fill rectangle appears to be rendering separately from the redact_annot, and I'm not sure why. The black fill rect is not showing up here.
This file indeed does a few unexpected things!

import pymupdf as fitz  # PyMuPDF

RED = fitz.pdfcolor["red"]
BLACK = fitz.pdfcolor["black"]


def process_pdf(input_pdf_path, output_pdf_path):
    rect = fitz.Rect(0, 0, 234, 234)
    # Open the input PDF file
    src = fitz.open(input_pdf_path)
    doc = fitz.open()  # output file
    # Iterate through each page
    for src_page in src:
        # the output PDF will contain pages with rotation 0
        src_rect = src_page.rect
        w, h = src_rect.br
        src_rot = src_page.rotation
        src_page.set_rotation(0)
        # make output page having the visible dimension of the input
        page = doc.new_page(width=w, height=h)
        page.show_pdf_page(  # insert source page
            page.rect,
            src,
            src_page.number,
            rotate=-src_rot,  # reversed original rotation
        )
        # now we can redact in a worry-free manner
        redact_annot = page.add_redact_annot(
            rect, text=f"{page.number=}", text_color=RED, fill=BLACK
        )
        page.apply_redactions()
    doc.ez_save(output_pdf_path)


if __name__ == "__main__":
    input_pdf_path = "input.pdf"  # Replace with the path to your input PDF
    output_pdf_path = "output.pdf"  # Replace with the path to your output PDF
    process_pdf(input_pdf_path, output_pdf_path)
    print(f"Processed PDF saved to {output_pdf_path}")
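
One detail of the script above is that `w, h` are read from `src_page.rect` before `set_rotation(0)` is called, so the new page gets the dimensions the reader actually sees. A minimal sketch of that relationship, assuming rotations in multiples of 90 degrees (`visible_size` is a hypothetical helper, not part of PyMuPDF):

```python
def visible_size(width, height, rotation):
    """Return (width, height) of a page as displayed, given its unrotated size."""
    if rotation % 360 in (90, 270):
        # A quarter turn swaps the two dimensions.
        return height, width
    return width, height


# Assumed example: a 468 x 612 pt page stored with rotation 90.
print(visible_size(468, 612, 90))  # (612, 468)
```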
Closing the issue for lack of reaction.
Description of the bug
I am trying to redact words from a PDF, based on OCR-generated rectangles.
PyMuPDF has worked well for us, but I have run into a problem with a specific file that has some unusual properties (I've attached the file). The pages in this file are an unusual size (8.5 x 6.5 in) and some of them are rotated.
I would like the rectangle coordinates to be relative to the top left, but even before I do that, I have noticed that the redacted rectangle is not in the same place as the fill.
If this is not a bug, I would like to understand why these appear to be drawn in separate coordinate systems, and how to reconcile them.
How to reproduce the bug
This is a simple script that shows the problem in the files below:
Input:
input.pdf
Output:
output.pdf
PyMuPDF version
1.24.5
Operating system
Windows
Python version
3.11