
PDF Accessibility Auto-Tag API

PDF Accessibility Auto-Tag API Output Format

The output of the PDF Accessibility Auto-Tag API contains the following:

  • A version 1.7 tagged PDF file, with headings shifted if the shift headings option is set.
  • A report in XLSX format with information about how the document was tagged. The report is generated only if report generation is enabled.

API limitations

  • File size: Files up to a maximum of 100 MB are supported.
  • Number of Pages: Non-scanned PDFs up to 200 pages and scanned PDFs up to 100 pages are supported. Limits may be lower for files with a large number of tables.
  • Rate limits: Keep the request rate below 25 requests per minute.
  • Page Size: The API supports standard page sizes that are no more than 17.5” and no less than 6” in either dimension.
  • Hidden Objects: PDF files that contain content that is not visible on the page, such as JavaScript or OCGs (optional content groups), are not supported. Files that contain such hidden information may fail to process. In such cases, removing the hidden content and processing the file again may return a successful result.
  • Language: The API is currently optimized for English language content. Files containing content in French, German, Spanish, Danish, Dutch, Norwegian (Bokmål), Galician, Catalan, Finnish, Italian, Swedish, Portuguese, and Romanian should return good results most of the time. Files containing content in Afrikaans, Bosnian, Croatian, Czech, Hungarian, Indonesian, Malay, Polish, Russian, Serbian, Turkish, Hindi, Marathi, and other similar languages should often return good results. Non-English files may have issues with non-English punctuation. OCR is configured for English content.
  • OCR and Scan quality: The quality of text extracted from scanned files is dependent on the clarity of content in the input file and is currently configured for English content. Conditions like skewed pages, shadowing, obscured or overlapping fonts, and page resolution less than 200 DPI can all result in lower quality text output.
  • Form fields: Files containing XFA and other fillable form elements are not supported.
  • Unprotected files: The API supports files that are unprotected or where security restrictions allow editing of content. Files that are secured and do not allow editing of content will not be processed.
  • Annotations: Content in PDF files containing annotations such as highlights and sticky notes will be processed, but annotations that obscure text could impact output quality. Text within annotations will not be included in the output.
  • PDF Producers: The PDF Accessibility Auto-Tag API is designed to add tags to PDF files so that they are easier to make accessible. Files created by applications that produce other types of content, such as illustrations, CAD drawings, or other vector art, may not return high quality results.
  • PDF Collections: PDFs that are made from a collection of files including PDF Portfolios are not currently supported.
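
Several of the limits above are numeric and can be checked before a file is submitted. The following is a minimal sketch of such a pre-flight check, not part of the SDK. The class name AutotagPreflight and its method are hypothetical, "100 MB" is assumed to mean 100 * 1024 * 1024 bytes, and the caller is assumed to obtain the page count, the scanned flag, and the page dimensions (in PDF points, 72 per inch) from whatever PDF library is already in use.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight helper; the limits are copied from the list above. */
final class AutotagPreflight {
    // Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes.
    static final long MAX_FILE_BYTES = 100L * 1024 * 1024;
    static final int MAX_PAGES = 200;           // non-scanned PDFs
    static final int MAX_SCANNED_PAGES = 100;   // scanned PDFs
    static final double MIN_PAGE_INCHES = 6.0;
    static final double MAX_PAGE_INCHES = 17.5;
    static final double POINTS_PER_INCH = 72.0; // default PDF user-space unit

    /**
     * Returns a description of each documented limit the file violates.
     * An empty list means none of the numeric limits is exceeded; the API
     * may still reject the file for other reasons (tables, XFA, encryption).
     */
    static List<String> check(long fileBytes, int pageCount, boolean scanned,
                              double widthPoints, double heightPoints) {
        List<String> problems = new ArrayList<>();

        if (fileBytes > MAX_FILE_BYTES) {
            problems.add("file size exceeds 100 MB");
        }

        int pageLimit = scanned ? MAX_SCANNED_PAGES : MAX_PAGES;
        if (pageCount > pageLimit) {
            problems.add("page count exceeds " + pageLimit);
        }

        // Both dimensions must fall inside the 6 to 17.5 inch range.
        double shortSide = Math.min(widthPoints, heightPoints) / POINTS_PER_INCH;
        double longSide = Math.max(widthPoints, heightPoints) / POINTS_PER_INCH;
        if (shortSide < MIN_PAGE_INCHES || longSide > MAX_PAGE_INCHES) {
            problems.add("page size outside 6 to 17.5 inches");
        }
        return problems;
    }
}
```

For a 2 MB, 12-page US Letter file (612 x 792 points), AutotagPreflight.check(2_000_000L, 12, false, 612, 792) returns an empty list. The file size itself can be read with java.nio.file.Files.size.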

Error codes

| Scenario | Error code | Error message |
| --- | --- | --- |
| Unknown error/failure | ERROR | Unexpected error |
| Timeout | TIMEOUT | Unexpected error: Processing timeout |
| Disqualified | DISQUALIFIED | File is not suitable for conversion |
| Unsupported XFA file | DISQUALIFIED_XFA | File is not suitable for conversion: File contains an XFA form |
| Page limit violation | DISQUALIFIED_PAGE_LIMIT | File is not suitable for conversion: File exceeds page limit |
| Scan page limit violation | DISQUALIFIED_SCAN_PAGE_LIMIT | File is not suitable for conversion: Scanned file exceeds page limit |
| File size violation | DISQUALIFIED_FILE_SIZE | File is not suitable for conversion: File exceeds size limit |
| Encryption permission | DISQUALIFIED_PERMISSIONS | File is not suitable for conversion: File permissions do not allow conversion |
| Complex file | DISQUALIFIED_COMPLEX_FILE | File is not suitable for conversion: File content is too complex |
| Unsupported language | DISQUALIFIED_LANGUAGE | File is not suitable for conversion: File content language is unsupported |
| Bad PDF | BAD_PDF | The PDF file is damaged or its content is too complex |
| Bad PDF file type | BAD_PDF_FILE_TYPE | The input file is not a PDF file |
| Damaged input file | BAD_PDF_DAMAGED | The input file is damaged |
| Complex table | BAD_PDF_COMPLEX_TABLE | The input file contains a table that is too complex to process |
| Complex content | BAD_PDF_COMPLEX_INPUT | The input file contains content that is too complex to process |
| Unsupported font | BAD_PDF_UNSUPPORTED_FONT | The input file contains font data that is corrupted or not supported |
| Large PDF file | BAD_PDF_LARGE_FILE | The input file size exceeds the maximum allowed |
| Protected PDF | PROTECTED_PDF | PDF is encrypted or password-protected |
| Empty or corrupted input | BAD_INPUT | Input is corrupted or empty |
| Invalid input parameters | BAD_INPUT_PARAMS | Invalid input parameters |
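
The error codes fall into a few families that share a prefix, which client code can use to decide how to react. The sketch below is illustrative only: the class AutotagErrorCodes, the Family enum, and both methods are hypothetical names, and the retry policy (retry only TIMEOUT and the generic ERROR) is an assumption rather than documented API behavior. How the code string is obtained from a failed call is not shown here.

```java
/** Hypothetical helper that groups the documented error codes by prefix. */
final class AutotagErrorCodes {

    enum Family { DISQUALIFIED, BAD_PDF, BAD_INPUT, PROTECTED_PDF, TIMEOUT, UNKNOWN }

    /** Maps an error code from the table to its family; unrecognized codes map to UNKNOWN. */
    static Family familyOf(String code) {
        if (code == null) {
            return Family.UNKNOWN;
        }
        if (code.startsWith("DISQUALIFIED")) {
            return Family.DISQUALIFIED;   // file outside the documented limits
        }
        if (code.startsWith("BAD_PDF")) {
            return Family.BAD_PDF;        // damaged, non-PDF, or too complex
        }
        if (code.startsWith("BAD_INPUT")) {
            return Family.BAD_INPUT;      // empty input or invalid parameters
        }
        if (code.equals("PROTECTED_PDF")) {
            return Family.PROTECTED_PDF;  // encrypted or password-protected
        }
        if (code.equals("TIMEOUT")) {
            return Family.TIMEOUT;
        }
        return Family.UNKNOWN;            // includes the generic ERROR code
    }

    /**
     * Assumption: resubmitting the same file can only help for TIMEOUT and
     * the generic ERROR code; every other code describes the input itself.
     */
    static boolean worthRetrying(String code) {
        return "TIMEOUT".equals(code) || "ERROR".equals(code);
    }
}
```

With this grouping, AutotagErrorCodes.familyOf("DISQUALIFIED_XFA") yields Family.DISQUALIFIED, and AutotagErrorCodes.worthRetrying("PROTECTED_PDF") is false, since the file has to be changed before another attempt.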

Generate a version 1.7 tagged PDF with an XLSX report and shifted headings

The sample below generates a version 1.7 tagged PDF along with an XLSX report and shifts the headings in the output PDF file.

    // Get the samples from https://git.corp.adobe.com/dc/dc-cpf-sdk-java-samples/tree/beta
    // Run the sample:
    // mvn -f pom.xml exec:java -Dexec.mainClass=com.adobe.pdfservices.operation.samples.autotagpdf.AutotagPDFWithOptions

    package com.adobe.pdfservices.operation.samples.autotagpdf;

    // Imports for the PDF Services SDK classes used below are omitted here;
    // take them from the sample project.
    import java.io.IOException;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class AutotagPDFWithOptions {

        private static final Logger LOGGER = LoggerFactory.getLogger(AutotagPDFWithOptions.class);

        public static void main(String[] args) {

            try {
                // Initial setup, create credentials instance.
                Credentials credentials = Credentials.serviceAccountCredentialsBuilder()
                        .fromFile("pdfservices-api-credentials.json")
                        .build();

                // Create an ExecutionContext using credentials and create a new operation instance.
                ExecutionContext executionContext = ExecutionContext.create(credentials);

                AutotagPDFOperation autotagPDFOperation = AutotagPDFOperation.createNew();

                // Provide an input FileRef for the operation
                autotagPDFOperation.setInput(FileRef.createFromLocalFile("src/main/resources/autotagPdfInput.pdf"));

                // Build AutotagPDF options and set them into the operation
                AutotagPDFOptions autotagPDFOptions = AutotagPDFOptions.autotagPDFOptionsBuilder()
                        .shiftHeadings()
                        .generateReport()
                        .build();
                autotagPDFOperation.setOptions(autotagPDFOptions);

                // Execute the operation
                AutotagOutputFiles autotagOutputFiles = autotagPDFOperation.execute(executionContext);

                // Save the output files at the specified location
                autotagOutputFiles.saveTaggedPDF("output/AutotagPDFWithOptions-tagged.pdf");
                autotagOutputFiles.saveReport("output/AutotagPDFWithOptions-report.xlsx");

            } catch (ServiceApiException | IOException | ServiceUsageException e) {
                // Log through the class logger so the stack trace is kept
                LOGGER.error("Exception encountered while executing operation", e);
            }
        }
    }

Generate a tagged PDF from a PDF

The sample below generates a tagged PDF from a PDF.

    // Get the samples from https://git.corp.adobe.com/dc/dc-cpf-sdk-java-samples/tree/beta
    // Run the sample:
    // mvn -f pom.xml exec:java -Dexec.mainClass=com.adobe.pdfservices.operation.samples.autotagpdf.AutotagPDF

    package com.adobe.pdfservices.operation.samples.autotagpdf;

    // Imports for the PDF Services SDK classes used below are omitted here;
    // take them from the sample project.
    import java.io.IOException;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class AutotagPDF {

        private static final Logger LOGGER = LoggerFactory.getLogger(AutotagPDF.class);

        public static void main(String[] args) {

            try {
                // Initial setup, create credentials instance.
                Credentials credentials = Credentials.serviceAccountCredentialsBuilder()
                        .fromFile("pdfservices-api-credentials.json")
                        .build();

                // Create an ExecutionContext using credentials and create a new operation instance.
                ExecutionContext executionContext = ExecutionContext.create(credentials);

                AutotagPDFOperation autotagPDFOperation = AutotagPDFOperation.createNew();

                // Provide an input FileRef for the operation
                autotagPDFOperation.setInput(FileRef.createFromLocalFile("src/main/resources/autotagPdfInput.pdf"));

                // Execute the operation
                AutotagOutputFiles autotagOutputFiles = autotagPDFOperation.execute(executionContext);

                // Save the output file at the specified location
                autotagOutputFiles.saveTaggedPDF("output/AutotagPDF-tagged.pdf");

            } catch (ServiceApiException | IOException | ServiceUsageException e) {
                // Log through the class logger so the stack trace is kept
                LOGGER.error("Exception encountered while executing operation", e);
            }
        }
    }

Generate a tagged PDF by setting options with command line arguments

The sample below generates a tagged PDF by setting options through command line arguments.

Here is a sample list of command line arguments and their descriptions:

  • --input < input file path >
  • --output < output file path >
  • --report { If this argument is present then the output will be generated with the report }
  • --shift_headings { If this argument is present then the headings will be shifted in the output PDF file }

    // Get the samples from https://git.corp.adobe.com/dc/dc-cpf-sdk-java-samples/tree/beta
    // Run the sample:
    // mvn -f pom.xml exec:java -Dexec.mainClass=com.adobe.pdfservices.operation.samples.autotagpdf.AutotagPDFParamaterised -Dexec.args="--report --shift_headings --input src/main/resources/autotagPdfInput.pdf --output output/"

    package com.adobe.pdfservices.operation.samples.autotagpdf;

    // Imports for the PDF Services SDK classes used below are omitted here;
    // take them from the sample project.
    import java.io.IOException;
    import java.nio.file.Paths;
    import java.util.Arrays;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class AutotagPDFParamaterised {

        private static final Logger LOGGER = LoggerFactory.getLogger(AutotagPDFParamaterised.class);

        public static void main(String[] args) {
            LOGGER.info("--input " + getInputFilePathFromCmdArgs(args));
            LOGGER.info("--output " + getOutputFilePathFromCmdArgs(args));
            LOGGER.info("--report " + getGenerateReportFromCmdArgs(args));
            LOGGER.info("--shift_headings " + getShiftHeadingsFromCmdArgs(args));

            try {
                // Initial setup, create credentials instance.
                Credentials credentials = Credentials.serviceAccountCredentialsBuilder()
                        .fromFile("pdfservices-api-credentials.json")
                        .build();

                // Create an ExecutionContext using credentials and create a new operation instance.
                ExecutionContext executionContext = ExecutionContext.create(credentials);

                AutotagPDFOperation autotagPDFOperation = AutotagPDFOperation.createNew();

                // Set input for operation from command line args
                autotagPDFOperation.setInput(FileRef.createFromLocalFile(getInputFilePathFromCmdArgs(args)));

                // Get and build AutotagPDF options from command line args and set them into the operation
                AutotagPDFOptions autotagPDFOptions = getOptionsFromCmdArgs(args);
                autotagPDFOperation.setOptions(autotagPDFOptions);

                // Execute the operation
                AutotagOutputFiles autotagOutputFiles = autotagPDFOperation.execute(executionContext);

                // Save the output files at the specified location.
                // Paths.get joins the directory and file name, so --output works with or without a trailing slash.
                String outputPath = getOutputFilePathFromCmdArgs(args);
                autotagOutputFiles.saveTaggedPDF(
                        Paths.get(outputPath, "AutotagPDFParameterised-tagged.pdf").toString());
                if (autotagPDFOptions != null && autotagPDFOptions.isGenerateReport()) {
                    autotagOutputFiles.saveReport(
                            Paths.get(outputPath, "AutotagPDFParameterised-report.xlsx").toString());
                }

            } catch (ServiceApiException | IOException | ServiceUsageException e) {
                // Log through the class logger so the stack trace is kept
                LOGGER.error("Exception encountered while executing operation", e);
            }
        }

        // Builds the options object from the --report and --shift_headings flags
        private static AutotagPDFOptions getOptionsFromCmdArgs(String[] args) {
            boolean generateReport = getGenerateReportFromCmdArgs(args);
            boolean shiftHeadings = getShiftHeadingsFromCmdArgs(args);

            AutotagPDFOptions.Builder builder = AutotagPDFOptions.autotagPDFOptionsBuilder();

            if (generateReport) {
                builder.generateReport();
            }
            if (shiftHeadings) {
                builder.shiftHeadings();
            }

            return builder.build();
        }

        private static boolean getShiftHeadingsFromCmdArgs(String[] args) {
            return Arrays.asList(args).contains("--shift_headings");
        }

        private static boolean getGenerateReportFromCmdArgs(String[] args) {
            return Arrays.asList(args).contains("--report");
        }

        // Returns the value following --input, or the default sample file if it is absent
        private static String getInputFilePathFromCmdArgs(String[] args) {
            String inputFilePath = "src/main/resources/autotagPdfInput.pdf";
            int inputFilePathIndex = Arrays.asList(args).indexOf("--input");
            if (inputFilePathIndex >= 0 && inputFilePathIndex < args.length - 1) {
                inputFilePath = args[inputFilePathIndex + 1];
            } else {
                LOGGER.info("input file not specified, using default value : autotagPdfInput.pdf");
            }

            return inputFilePath;
        }

        // Returns the value following --output, or the default output directory if it is absent
        private static String getOutputFilePathFromCmdArgs(String[] args) {
            String outputFilePath = "output/";
            int outputFilePathIndex = Arrays.asList(args).indexOf("--output");
            if (outputFilePathIndex >= 0 && outputFilePathIndex < args.length - 1) {
                outputFilePath = args[outputFilePathIndex + 1];
            } else {
                LOGGER.info("output path not specified, using default value : output/");
            }

            return outputFilePath;
        }
    }
Copyright © 2024 Adobe. All rights reserved.