Use scan compression to reduce a scan’s PDF file size. This is useful when you need to send PDFs by email, and also reduces the amount of file storage required. Many organizations impose file size restrictions on email attachments, so the scanned PDF files need to be an email-friendly file size.
Types of scan compression
PaperCut Hive has 2 types of compression:
- Native : compression performed in the printer
- OCR Add-on : compression performed in the cloud
Native scan compression
When native compression is enabled, the printer produces a highly compressed PDF.
This is a native feature of the printer, as opposed to PaperCut Software compressing the file with the OCR Add-on, so it has no additional costs.
Because the printer performs the compression locally, it can be tied up processing large scan jobs, making it unavailable to other users during that time.
It is available on these brands:
- Canon
- HP OXPd
- Ricoh Smart SDK 2.12 or above
- Sharp OSA 5.1+ and n2.0 browser
- Toshiba V3 or above
Within the brands above, the specific model must also support the high-compression PDF feature, so that PaperCut Hive's native compression setting can map to it.
To use this compression, see Enabling native scan compression .
OCR Add-on: compression in the cloud
When this type of compression is enabled, the PDF file leaves the printer and goes to PaperCut’s cloud service where PaperCut compresses the PDF.
Because the compression happens in the cloud, not locally, the printer isn’t tied up processing and stays available to other users. This is valuable in scenarios where many users share the same printer, and for scan jobs with many pages.
You can choose from different compression levels: Low, Medium, and High.
Cloud compression is a feature of the OCR Add-on, which requires a subscription on top of the PaperCut Hive subscription. However, it also includes other useful features.
To use this compression, see Enabling OCR Add-on scan compression .
More details about the OCR Add-on
Feature | What it does |
---|---|
OCR | Optical Character Recognition adds a transparent layer of text to PDF files to make their text contents searchable. |
PDF/A | Ensures your scanned documents have long-term archival support in compliance with ISO standards. |
PDF compression | Provides 3 levels of compression to ensure organizations can find the right balance between file size and image quality. |
Blank page removal | Automatically excludes blank pages from your scanned document. |
Batch split after blank page | Insert blank sheets in a batch of documents to output separate PDFs from the same scan job. Great when, for example, a teacher scans school exams for a whole class. |
Batch split on page # | Takes a batch of documents and creates multiple PDFs according to a specified page count. |
Despeckle | Removes fine dust particles from the scan file. |
Deskew | Straightens angled pages for a better reading experience. |
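The two batch split features in the table can be sketched as follows. This is only an illustration of the idea, not PaperCut Hive's implementation: the function names, the `is_blank` callback, and the choice to drop the blank separator sheets from the output are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

Page = TypeVar("Page")


def split_on_page_count(pages: Sequence[Page], pages_per_file: int) -> List[List[Page]]:
    """Split a batch into consecutive groups of at most `pages_per_file` pages."""
    if pages_per_file < 1:
        raise ValueError("pages_per_file must be at least 1")
    return [
        list(pages[start:start + pages_per_file])
        for start in range(0, len(pages), pages_per_file)
    ]


def split_after_blank(
    pages: Sequence[Page], is_blank: Callable[[Page], bool]
) -> List[List[Page]]:
    """Start a new output document each time a blank separator page is found.

    Assumption: the blank separator pages themselves are not kept.
    """
    documents: List[List[Page]] = []
    current: List[Page] = []
    for page in pages:
        if is_blank(page):
            if current:  # ignore leading or repeated blank pages
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


# Hypothetical batch: strings stand in for scanned pages, "" for a blank sheet.
exams = ["a1", "a2", "", "b1", "", "c1", "c2"]
by_blank = split_after_blank(exams, lambda page: page == "")
by_count = split_on_page_count(["p1", "p2", "p3", "p4", "p5"], 2)
print(by_blank)
print(by_count)
```

Here `by_blank` holds three documents (one per exam) and `by_count` holds two 2-page documents followed by a final 1-page document.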
The effect of document processing features on scan file size
Note that if you use OCR Add-on scan compression, enabling any of these document processing features causes the PDF file sizes to increase:
- OCR (optical character recognition) to create searchable PDFs
- Blank page removal
- Batch splitting after blank page or on page number
- PDF/A
- Deskew
- Despeckle
When PaperCut Hive processes a document with any of the OCR Add-on features above, it converts the PDF pages into PNG image files to analyze them. For example, it converts pages to a PNG to optically recognize the text to create the searchable OCR text layer, or to check how blank a page is to decide if it can be removed, if blank page removal is enabled.
After conversion to PNG, the pages are reassembled in a PDF. The conversions across different file types increase the final PDF file size.
Enabling native scan compression
To enable this compression:
- In the PaperCut Hive admin console, go to Easy Print & Scan > Integrated Scanning > Add Quick Scan and click the Scan tab.
- In the Output Settings section, go to Format.
- Click the Use high compression PDF on printers that support it toggle to on.
Enabling the OCR Add-on: scan compression in the cloud
To enable this compression:
- In the PaperCut Hive admin console, go to Easy Print & Scan > Integrated Scanning > Add Quick Scan.
- Click the PDF Post Process tab and go to the PDF Compression section.
- Click the Use cloud compression toggle to on.
Lossy compression technology
To prioritize file size reduction, PaperCut Hive uses lossy compression. Lossy compression suits a variety of situations where file size is a primary concern, and the quality of the document can tolerate some loss of detail or information.
Lossy image compression is a method of reducing an image’s file size by selectively discarding some of the image data. It removes information that is less important or perceptually less significant, such as high detail or color information that is outside the range of human vision.
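As a rough illustration of the principle, the sketch below discards fine tonal detail from a synthetic grayscale image before applying ordinary lossless compression. It is not the algorithm PaperCut Hive uses: the synthetic image, the quantization step, and the use of `zlib` are assumptions chosen only to show that throwing away detail shrinks the result and cannot be undone.

```python
import random
import zlib


def make_noisy_gradient(width: int, height: int, seed: int = 0) -> bytes:
    """Build a synthetic 8-bit grayscale 'scan': a gradient plus sensor-like noise."""
    rng = random.Random(seed)
    pixels = bytearray()
    for _ in range(height):
        for x in range(width):
            value = x * 255 // (width - 1) + rng.randint(-8, 8)
            pixels.append(max(0, min(255, value)))
    return bytes(pixels)


def quantize(pixels: bytes, step: int) -> bytes:
    """Lossy step: snap each pixel down to a multiple of `step`, discarding detail."""
    return bytes((p // step) * step for p in pixels)


original = make_noisy_gradient(256, 256)
lossy = quantize(original, step=32)

# Same lossless compressor in both cases; only the input differs.
original_size = len(zlib.compress(original, 9))
lossy_size = len(zlib.compress(lossy, 9))

print(f"lossless only:   {original_size} bytes")
print(f"quantized first: {lossy_size} bytes")
# Many distinct input values collapse to one output value, so the
# original pixels cannot be reconstructed from the quantized data.
print(f"distinct values before: {len(set(original))}, after: {len(set(lossy))}")
```

The quantized image compresses to fewer bytes, at the cost of visible banding in a real image, which mirrors the trade-off described in the advantages and disadvantages that follow.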
Advantages of lossy image compression
- Smaller file size - Lossy compression can result in significantly smaller file sizes, which can be beneficial when storing large numbers of files.
- Faster transmission - Smaller file sizes also mean faster transmission over networks or the internet, which can improve the user experience.
- Reasonable quality - Lossy compression can achieve good image quality with a reasonable degree of compression, making it suitable for most applications where highest-quality images are not essential.
Disadvantages of lossy image compression
- Loss of quality - The main disadvantage of lossy compression is that it can result in a loss of image quality. Depending on the degree of compression and the characteristics of the image, this loss of quality can be noticeable and may reduce the usefulness of the image.
- Irreversibility - After an image is compressed using a lossy method, it is not possible to recover the original image. This can be problematic if the original image needs to be restored because the compression method is no longer suitable for the intended use.
- Limited editing - Lossy compression can make it more difficult to edit the image, especially if it involves resizing or cropping. This is because the compressed image may contain artifacts or other irregularities that can affect the editing process.
- Poor performance on certain types of images - Lossy compression might not perform well on certain types of images, such as those with high contrast or fine details.
Compression level recommendations
Choose your compression level based on the specific needs of your use case. If image quality is essential, or if the image needs to be edited or manipulated, no compression or low compression might be the better choice. However, if file size and transmission speed are the main considerations, medium or high compression might be more suitable.
PaperCut Hive’s OCR Add-on offers these 3 compression levels for scan output PDFs:
- Low compression - minimizes the downsides of lossy compression (detailed above)
- Medium compression - trade off between file size and image quality
- High compression - smallest file size, poorer quality text and images
All of these compression levels use lossy compression.
In general, choose Low compression to minimize the downsides of lossy compression detailed above, and Medium or High compression if file size is your greatest concern.
Here are some use cases:
- When the document contains images or graphics with fine details, select low compression to preserve the visual clarity of the document.
- When the document needs to be printed, select low compression to ensure that the document prints with high quality, especially if it contains graphics, images or text with small fonts. If a PDF with high compression is printed, the output quality may suffer.
- When the document needs to be edited, select low compression because the PDFs are often easier to edit than highly compressed PDFs, especially if the document contains images or graphics. High compression can lead to artifacts or errors when editing the document.
- Email attachments can have size limits, so using high compression PDFs can help you send large files as attachments more easily.
Overall, deciding on which PDF compression level to use for scanned documents depends on your specific requirements and intended use of the PDF document. Always consider factors such as image quality, print quality, editability, security, and archival value when making this decision.
Compression is not recommended if you need to preserve the quality of sensitive documents for future reference, for example, legal documents or patient records.
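The guidance in this section can be condensed into a small decision helper. This is a hedged sketch of the recommendations only, not a PaperCut Hive API: the `ScanNeeds` fields and the `recommend_compression` function are hypothetical names, and falling back to Medium when no specific need applies is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ScanNeeds:
    """Hypothetical description of how a scanned PDF will be used."""
    must_preserve_quality: bool = False  # e.g. legal documents, patient records
    fine_detail: bool = False            # images or graphics with fine details
    will_be_printed: bool = False
    will_be_edited: bool = False
    size_is_main_concern: bool = False   # e.g. strict email attachment limits


def recommend_compression(needs: ScanNeeds) -> str:
    """Return 'None', 'Low', 'Medium', or 'High' following the guidance above."""
    if needs.must_preserve_quality:
        return "None"  # compression is not recommended for these documents
    if needs.fine_detail or needs.will_be_printed or needs.will_be_edited:
        return "Low"   # quality-sensitive uses take priority over file size
    if needs.size_is_main_concern:
        return "High"  # smallest files; Medium is also reasonable here
    return "Medium"    # assumed default: balance file size and image quality


print(recommend_compression(ScanNeeds(will_be_printed=True)))
print(recommend_compression(ScanNeeds(size_is_main_concern=True)))
print(recommend_compression(ScanNeeds(must_preserve_quality=True)))
```

Quality-related needs are checked before file size so that a document that must be printed or edited is never pushed to High compression just because it is also being emailed.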
Setting the compression level
All document processing features belong to the OCR Add-on, which requires an additional paid subscription for the PaperCut Hive organization.
To subscribe to and enable the OCR Add-on, go to Add-ons > OCR Add-on.
To access and enable the OCR Add-on’s features:
- In the PaperCut Hive admin console, click Easy Print & Scan > Integrated Scanning.
- Either click Add Quick Scan or edit an existing one that shows the OCR ADD-ON IN USE label.
- Go to the PDF Post Process tab.
- Under OCR Add-on Settings, click PDF Compression.
Scan files too large to email? Deliver them via a cloud link
When creating a Quick Scan, the default options are:
- maximum scan file size: 10 MB
- if over 10 MB, then discard the file and notify the person who sent the scan that the file was too big and has been discarded.
Ideally, set the maximum deliverable scan file size in PaperCut Hive to match the maximum attachment size your organization’s email system accepts.
PaperCut’s maximum scan file size to send via email is 25 MB. As a reference, Gmail’s max attachment is 25 MB.
The alternative, for when a scan file exceeds the maximum size you set up, is to save the file in the cloud and send the recipient a temporary cloud download link that is valid for 24 hours.
The email delivery behavior will change as shown below.
Before: Sent to the sender of the scan | After: Sent to the scan's recipient, who could be the sender or someone else |
---|---|
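The delivery rules described in this section can be summarized in a short sketch. It is illustrative only and does not reflect PaperCut Hive's internal code: the enum and function names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
from enum import Enum

MB = 1024 * 1024  # assumption: binary megabytes
DEFAULT_MAX_BYTES = 10 * MB       # default maximum scan file size
EMAIL_HARD_LIMIT_BYTES = 25 * MB  # largest scan file size that can be emailed


class OversizeAction(Enum):
    DISCARD_AND_NOTIFY = "discard"  # default behavior
    CLOUD_LINK = "cloud_link"       # 24-hour temporary download link


class Delivery(Enum):
    EMAIL_ATTACHMENT = "attach the PDF to the email"
    DISCARD_AND_NOTIFY_SENDER = "discard the file and notify the sender"
    CLOUD_LINK_TO_RECIPIENT = "email a 24-hour download link to the recipient"


def decide_delivery(
    file_size_bytes: int,
    max_bytes: int = DEFAULT_MAX_BYTES,
    oversize_action: OversizeAction = OversizeAction.DISCARD_AND_NOTIFY,
) -> Delivery:
    """Pick the delivery outcome for a scan of the given size."""
    if max_bytes > EMAIL_HARD_LIMIT_BYTES:
        raise ValueError("maximum scan file size for email cannot exceed 25 MB")
    if file_size_bytes <= max_bytes:
        return Delivery.EMAIL_ATTACHMENT
    if oversize_action is OversizeAction.CLOUD_LINK:
        return Delivery.CLOUD_LINK_TO_RECIPIENT
    return Delivery.DISCARD_AND_NOTIFY_SENDER


print(decide_delivery(4 * MB).value)
print(decide_delivery(12 * MB).value)
print(decide_delivery(12 * MB, oversize_action=OversizeAction.CLOUD_LINK).value)
```

With the defaults, a 12 MB scan is discarded and the sender is notified; switching the oversize action to the cloud link delivers the same scan to its recipient instead.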
Handling frequent large scan file sizes
If large scan file sizes are common in your organization’s workflows, consider creating Quick Scan actions to cloud storage destinations as opposed to scanning to email.
These Quick Scan actions don’t have the file size limit imposed by email inboxes and the user who sends the scan will receive an email confirmation with a direct link to the scan file in the cloud storage.
Some cloud storage destinations have the added benefit of supporting scan automations, such as automatically creating folders and saving files in the right folders based on what the user enters at the MFD in response to prompts created by the admin.