Clean Up PDFs
Erase black borders, remove punch holes, despeckle, and auto-deskew pages of a PDF
Max file size: 128 MB
Powered by the GdPicture OCR Preprocessing SDK. Find out more: PSPDFKit GdPicture.NET OCR Preprocessing
Your files are safe!
We use the best encryption methods to protect your data.
All documents are automatically deleted from our servers after 30 minutes.
If you prefer, you can delete your file manually right after processing by clicking the bin icon.
How to clean up the pages of a PDF file online:
- To start, drop your PDF file or upload it from your device or your cloud storage service.
- Click on the filter you want to apply to your document: erase black borders, auto-deskew, punch hole removal, or despeckle.
- The filter engine automatically cleans up the document.
- Click on the Save button.
- Download the cleaned-up PDF file to your computer or save it directly to your cloud storage service.
Did you know?
Why is optimizing scanned documents so important? Besides improving the readability and visual appearance of the files, cleaning up scanned documents has other benefits.
Any detection engine, such as OCR, gives better results on a clean document. The same holds for recognizing barcodes, checkboxes in exam forms, special fonts on checks, and other elements.
You also get better compression results on cleaned-up documents. Tools like hyper-compression ensure the best quality/readability ratio for your PDFs and sometimes even improve scanned documents' readability, thanks to many optimization algorithms.
Once your documents are cleaned up, you can compress and convert them to PDF/A for long-term archiving and preservation. People who will use your documents in the future will thank you for this!
Scanned documents quite often contain unwanted and randomly disseminated artifacts known as “noise.” In the imaging domain, we even have “salt and pepper noise,” which is bright pixels on darker areas and dark pixels on brighter image areas, as if someone poured salt and pepper particles over the document (imaging likes metaphors).
There are many filters to remove noise from a scanned document.
The Despeckle filter removes noise from images without blurring edges. It attempts to detect complex areas and leave these intact while smoothing areas where noise will be noticeable. Despeckle can clean up dirty or faded drawings that show spots or speckles after scanning.
The Median filter reduces noise by blending the brightness of neighboring pixels. For each pixel, it searches the surrounding pixels for those of similar brightness, discards pixels that differ too much from their neighbors, and replaces the center pixel with the median brightness of the pixels that remain. It helps eliminate or reduce the appearance of motion in an image, or undesirable patterns that may appear in a scanned image.
Median filtering particularly enhances OCR results because it removes noise but preserves edges.
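As a rough illustration of the idea, here is a minimal sketch of a plain 3x3 median filter in Python. It is the textbook version, not the filter engine used by this tool: it skips the step of discarding dissimilar pixels, and the function name, the list-of-rows image layout, and the sample page are all assumptions made for the example.

```python
def median_filter(image, radius=1):
    """Replace each pixel with the median of its square neighborhood.

    `image` is a grayscale picture stored as a list of rows of ints (0-255).
    Coordinates outside the image are clamped to the nearest edge pixel,
    so every window holds the same odd number of values.
    """
    height, width = len(image), len(image[0])
    cleaned = []
    for y in range(height):
        row = []
        for x in range(width):
            window = sorted(
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            )
            row.append(window[len(window) // 2])  # middle value = median
        cleaned.append(row)
    return cleaned


# Hypothetical sample: dark area on the left, white area on the right,
# with one "salt" pixel (row 1) and one "pepper" pixel (row 3).
page = [
    [0,   0, 0, 255, 255, 255],
    [0, 255, 0, 255, 255, 255],
    [0,   0, 0, 255, 255, 255],
    [0,   0, 0, 255,   0, 255],
    [0,   0, 0, 255, 255, 255],
]
cleaned = median_filter(page)
```

In this sample both isolated noise pixels disappear, while the boundary between the dark and white areas stays in the same column, which is the edge-preserving behavior described above.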
Skew is an artifact that can appear during scanning: the document's text and images end up rotated at a slight angle. Most of the time, it occurs when the paper is misaligned in the scanner. Auto-deskew is the process of detecting and fixing this issue on scanned files, so that deskewed images have their text and images correctly aligned.
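One common way to detect the skew angle is the projection-profile method: try a range of small rotations and keep the one where ink piles up most sharply into horizontal rows. The sketch below shows that idea only. It is not necessarily how this tool detects skew, and the function name, parameters, and synthetic test data are assumptions made for the example.

```python
import math
from collections import Counter


def estimate_skew(ink_pixels, max_angle=5.0, step=0.5):
    """Return the rotation (in degrees) that best straightens the text lines.

    `ink_pixels` is a list of (x, y) coordinates of dark pixels.
    For each candidate angle, the points are rotated and their new y values
    are binned into rows. Straight text lines concentrate ink in few rows,
    which maximizes the sum of squared row counts.
    """
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / step))
    for i in range(steps + 1):
        angle = -max_angle + i * step
        t = math.radians(angle)
        sin_t, cos_t = math.sin(t), math.cos(t)
        # y coordinate of each point after rotating by `angle`
        profile = Counter(round(x * sin_t + y * cos_t) for x, y in ink_pixels)
        score = sum(count * count for count in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


# Synthetic page: three text baselines tilted by 2 degrees.
# Coordinates are kept as floats to keep the example simple.
slope = math.tan(math.radians(2.0))
skewed_ink = [(x, y0 + x * slope) for y0 in (20, 40, 60) for x in range(200)]
correction = estimate_skew(skewed_ink)  # rotation that undoes the tilt
```

The returned value is the correction to apply, in this sketch's own sign convention, so a page tilted by 2 degrees yields a correction of the opposite sign. A real deskew step would then rotate the page image by that angle.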
Deskewing increases character recognition accuracy because aligned text is much closer to what OCR software expects to encounter when it analyzes an image.
Brightness and contrast are well-known image adjustments, and they are particularly important for scanned documents because they can significantly improve readability.
We often forget about gamma correction, but changing gamma settings on a very light image will make it readable without darkening it. Its purpose is to optimize the contrast and brightness in the mid-tones while keeping the black and white elements.
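Gamma correction is usually written as a power law on normalized pixel values, out = 255 * (in / 255) ^ exponent. The Python sketch below assumes that form. The function name and the sample values are made up for the example, and some tools define gamma as the inverse of the exponent used here.

```python
def gamma_correct(image, exponent):
    """Apply a power-law curve to a grayscale image (list of rows, 0-255).

    In this sketch's convention, an exponent above 1 darkens the mid-tones
    and an exponent below 1 lightens them. Pure black (0) and pure white
    (255) map to themselves for any positive exponent.
    """
    # Precompute the curve once for all 256 possible input values.
    lookup = [round(255 * (value / 255) ** exponent) for value in range(256)]
    return [[lookup[pixel] for pixel in row] for row in image]


# Hypothetical washed-out scan: faint gray text (200) on a white page (255).
light_scan = [[0, 200, 255]]
corrected = gamma_correct(light_scan, 2.0)
```

With an exponent of 2.0, the faint gray value 200 moves down to 157 while 0 and 255 are unchanged, so the text gains contrast against the page without the whole image getting darker.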
A crop tool is useful when you need to cut out unwanted areas of a page. And if you need to remove black borders and punch holes, our clean-up widget will do it for you!