PDF Accessibility Auto-Tag API

API Output Format

The output of the PDF Accessibility Auto-Tag API contains the following:

The tagged PDF file.
A report in XLSX format, if the report generation was enabled. This report provides information related to the tags found in the original document, if any, and the auto-tagged document.

API limitations

File size: Files up to a maximum of 100 MB are supported.
Number of Pages: Non-scanned PDFs up to 200 pages and scanned PDFs up to 100 pages are supported. Limits may be lower for files with a large number of tables.
Rate limits: Keep the request rate below 25 requests per minute.
Page Size: The API supports standard page sizes no larger than 17.5” and no smaller than 6” in either dimension.
Hidden Objects: PDF files that contain content that is not visible on the page, such as JavaScript or OCG (optional content groups), are not supported. Files that contain such hidden information may fail to process. In such cases, removing the hidden content and processing the file again may return a successful result.
Language: The API is currently optimized for English language content. Files containing content in French, German, Spanish, Danish, Dutch, Norwegian (Bokmål), Galician, Catalan, Finnish, Italian, Swedish, Portuguese, and Romanian should return good results most of the time. Files containing content in Afrikaans, Bosnian, Croatian, Czech, Hungarian, Indonesian, Malay, Polish, Russian, Serbian, Turkish, Hindi, Marathi and other similar languages should return good results often. Non-English files may have issues with non-English punctuation. OCR is configured for English content.
OCR and Scan quality: The quality of text extracted from scanned files is dependent on the clarity of content in the input file and is currently configured for English content. Conditions like skewed pages, shadowing, obscured or overlapping fonts, and page resolution less than 200 DPI can all result in lower quality text output.
Form fields: Files containing XFA and other fillable form elements are not supported.
Unprotected files: The API supports files that are unprotected or where security restrictions allow editing of content. Files that are secured and do not allow editing of content will not be processed. If the password of a protected PDF is known, the permissions of the file can be modified using the PDF Services API as shown here.
Annotations: Content in PDF files containing annotations such as highlights and sticky notes will be processed, but annotations that obscure text could impact output quality. Text within annotations will not be included in the output.
PDF Producers: The PDF Accessibility Auto-Tag API is designed to add tags to a PDF, which makes it easier to make the file accessible. Files created from applications that produce other types of content, such as illustrations, CAD drawings or other types of vector art, may not return high quality results.
PDF Collections: PDFs that are made from a collection of files including PDF Portfolios are not currently supported.
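
Some of the limits above can be checked on the client before a file is uploaded. The sketch below is illustrative only: the function name check_limits and its parameters are hypothetical, the caller is assumed to obtain the page count and page dimensions from a PDF library of its choice, and 100 MB is assumed to mean 100 * 1024 * 1024 bytes. The API remains the authority on what it accepts.

```python
from typing import List, Sequence, Tuple

# Limits taken from the list above. The byte value of "100 MB" is an assumption.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024
MAX_PAGES_NON_SCANNED = 200
MAX_PAGES_SCANNED = 100
MIN_PAGE_INCHES = 6.0
MAX_PAGE_INCHES = 17.5


def check_limits(size_bytes: int, page_count: int, is_scanned: bool,
                 page_sizes_inches: Sequence[Tuple[float, float]]) -> List[str]:
    """Return human-readable limit violations (an empty list if none are found)."""
    problems = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file exceeds 100 MB")
    # Scanned files have a lower page limit than non-scanned files.
    page_limit = MAX_PAGES_SCANNED if is_scanned else MAX_PAGES_NON_SCANNED
    if page_count > page_limit:
        problems.append(f"file exceeds {page_limit} pages")
    # Every page must fit the supported range in both width and height.
    for number, (width, height) in enumerate(page_sizes_inches, start=1):
        if not all(MIN_PAGE_INCHES <= side <= MAX_PAGE_INCHES for side in (width, height)):
            problems.append(f"page {number} is outside the 6 to 17.5 inch range")
    return problems
```

For example, check_limits(5_000_000, 120, True, [(8.5, 11.0)]) returns ["file exceeds 100 pages"], because a scanned file is limited to 100 pages. A check like this cannot detect the other limitations, such as hidden objects or XFA forms.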

Error codes

Scenario	Error code	Error message
Unknown error / failure	ERROR	Unexpected error
Timeout	TIMEOUT	Unexpected error: Processing timeout
Disqualified	DISQUALIFIED	File is not suitable for conversion
Unsupported XFA file	DISQUALIFIED_XFA	File is not suitable for conversion: File contains an XFA form
Page limit violation	DISQUALIFIED_PAGE_LIMIT	File is not suitable for conversion: File exceeds page limit
Scan page limit violation	DISQUALIFIED_SCAN_PAGE_LIMIT	File is not suitable for conversion: Scanned file exceeds page limit
File size violation	DISQUALIFIED_FILE_SIZE	File is not suitable for conversion: File exceeds size limit
Encryption permission	DISQUALIFIED_PERMISSIONS	File is not suitable for conversion: File permissions do not allow conversion
Complex file	DISQUALIFIED_COMPLEX_FILE	File is not suitable for conversion: File content is too complex
Unsupported language	DISQUALIFIED_LANGUAGE	File is not suitable for conversion: File content language is unsupported
Bad PDF	BAD_PDF	The PDF file is damaged or its content is too complex
Bad PDF file type	BAD_PDF_FILE_TYPE	The input file is not a PDF file
Damaged input file	BAD_PDF_DAMAGED	The input file is damaged
Complex table	BAD_PDF_COMPLEX_TABLE	The input file contains a table that is too complex to process
Complex content	BAD_PDF_COMPLEX_INPUT	The input file contains content that is too complex to process
Unsupported font	BAD_PDF_UNSUPPORTED_FONT	The input file contains font data that is corrupted or not supported
Large PDF file	BAD_PDF_LARGE_FILE	The input file size exceeds the maximum allowed
Protected PDF	PROTECTED_PDF	PDF is encrypted or password-protected
Empty or corrupted input	BAD_INPUT	Input is corrupted or empty
Invalid input parameters	BAD_INPUT_PARAMS	Invalid input parameters
User not enrolled to allowed Atlas plans	INVALID_PLAN_CODE	Unauthorized to execute this operation. User is not enrolled to plans allowed for the service
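
The error codes in the table share prefixes, which a client can use to decide how to react. The following sketch groups them into coarse categories. The function name classify_error and the category labels are hypothetical and are not defined by the API; they only illustrate one way to branch on the codes listed above.

```python
def classify_error(code: str) -> str:
    """Map an error code from the table above to a coarse, illustrative category."""
    if code in ("ERROR", "TIMEOUT"):
        # Unexpected failures that are not tied to a specific property of the file.
        return "unexpected"
    if code in ("PROTECTED_PDF", "DISQUALIFIED_PERMISSIONS"):
        # The file is encrypted or its permissions do not allow conversion.
        return "protected"
    if code == "INVALID_PLAN_CODE":
        return "account"
    if code.startswith("DISQUALIFIED"):
        # The file is readable but not suitable for conversion.
        return "unsuitable_file"
    if code.startswith("BAD_PDF"):
        return "bad_pdf"
    if code.startswith("BAD_INPUT"):
        return "bad_input"
    return "unknown"
```

The order of the checks matters: DISQUALIFIED_PERMISSIONS is matched before the generic DISQUALIFIED prefix so that protection problems are reported separately.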

REST API

See our public API Reference for PDF Accessibility Auto-Tag API.

Generate tagged PDF from a PDF

The sample below generates a tagged PDF from a PDF.

Please refer to the API usage guide to understand how to use our APIs.

Samples follow for Java, .NET, Node.js, Python, and the REST API.

// Get the samples from https://github.com/adobe/pdfservices-java-sdk-samples
// Run the sample:
// mvn -f pom.xml exec:java -Dexec.mainClass=com.adobe.pdfservices.operation.samples.autotagpdf.AutotagPDF

public class AutotagPDF {
    // Initialize the logger.
    private static final Logger LOGGER = LoggerFactory.getLogger(AutotagPDF.class);

    public static void main(String[] args) {

        try (InputStream inputStream = Files.newInputStream(new File("src/main/resources/autotagPDFInput.pdf").toPath())) {
            // Initial setup, create credentials instance
            Credentials credentials = new ServicePrincipalCredentials(
                System.getenv("PDF_SERVICES_CLIENT_ID"),
                System.getenv("PDF_SERVICES_CLIENT_SECRET"));
        
            // Creates a PDF Services instance
            PDFServices pdfServices = new PDFServices(credentials);
        
            // Creates an asset(s) from source file(s) and upload
            Asset asset = pdfServices.upload(inputStream, PDFServicesMediaType.PDF.getMediaType());
        
            // Creates a new job instance
            AutotagPDFJob autotagPDFJob = new AutotagPDFJob(asset);
        
            // Submit the job and gets the job result
            String location = pdfServices.submit(autotagPDFJob);
            PDFServicesResponse<AutotagPDFResult> pdfServicesResponse = pdfServices.getJobResult(location, AutotagPDFResult.class);
        
            // Get content from the resulting asset(s)
            Asset resultAsset = pdfServicesResponse.getResult().getTaggedPDF();
            StreamAsset streamAsset = pdfServices.getContent(resultAsset);
        
            // Creates an output stream and copy stream asset's content to it
            Files.createDirectories(Paths.get("output/"));
            OutputStream outputStream = Files.newOutputStream(new File("output/autotagPDFOutput.pdf").toPath());
            LOGGER.info("Saving asset at output/autotagPDFOutput.pdf");
            IOUtils.copy(streamAsset.getInputStream(), outputStream);
            outputStream.close();
        } catch (ServiceApiException | IOException | SDKException | ServiceUsageException ex) {
            LOGGER.error("Exception encountered while executing operation", ex);
        }
    }
}

// Get the samples from https://github.com/adobe/PDFServices.NET.SDK.Samples
// Run the sample:
// cd AutotagPDF/
// dotnet run AutotagPDF.csproj

namespace AutotagPDF
{
    class Program
    {
        private static readonly ILog log = LogManager.GetLogger(typeof(Program));

        static void Main()
        {
            // Configure the logging
            ConfigureLogging();
            try
            {
                // Initial setup, create credentials instance from environment variables.
                Credentials credentials = Credentials.ServicePrincipalCredentialsBuilder()
                        .WithClientId(Environment.GetEnvironmentVariable("PDF_SERVICES_CLIENT_ID"))
                        .WithClientSecret(Environment.GetEnvironmentVariable("PDF_SERVICES_CLIENT_SECRET"))
                        .Build();

                // Create an ExecutionContext using credentials and create a new operation instance.
                ExecutionContext executionContext = ExecutionContext.Create(credentials);
                AutotagPDFOperation autotagPDFOperation = AutotagPDFOperation.CreateNew();

                // Provide an input FileRef for the operation
                autotagPDFOperation.SetInput(FileRef.CreateFromLocalFile(@"autotagPDFInput.pdf"));

                // Execute the operation
                AutotagPDFOutput autotagPDFOutput = autotagPDFOperation.Execute(executionContext);

                // Save the output file in the current directory (Path.Combine adds the separator)
                autotagPDFOutput.GetTaggedPDF().SaveAs(Path.Combine(Directory.GetCurrentDirectory(), "autotagPDFOutput.pdf"));
            }
            catch (ServiceUsageException ex)
            {
                log.Error("Exception encountered while executing operation", ex);
            }
            catch (ServiceApiException ex)
            {
                log.Error("Exception encountered while executing operation", ex);
            }
            catch (SDKException ex)
            {
                log.Error("Exception encountered while executing operation", ex);
            }
            catch (IOException ex)
            {
                log.Error("Exception encountered while executing operation", ex);
            }
            catch (Exception ex)
            {
                log.Error("Exception encountered while executing operation", ex);
            }
        }

        static void ConfigureLogging()
        {
            ILoggerRepository logRepository = LogManager.GetRepository(Assembly.GetEntryAssembly());
            XmlConfigurator.Configure(logRepository, new FileInfo("log4net.config"));
        }
    }
}

// Get the samples from https://github.com/adobe/pdfservices-node-sdk-samples
// Run the sample:
// node src/autotagpdf/autotag-pdf.js

const {
    ServicePrincipalCredentials,
    PDFServices,
    MimeType,
    AutotagPDFJob,
    AutotagPDFResult,
    SDKError,
    ServiceUsageError,
    ServiceApiError,
} = require("@adobe/pdfservices-node-sdk");
const fs = require("fs");

(async () => {
    let readStream;
    try {
        // Initial setup, create credentials instance
        const credentials = new ServicePrincipalCredentials({
            clientId: process.env.PDF_SERVICES_CLIENT_ID,
            clientSecret: process.env.PDF_SERVICES_CLIENT_SECRET
        });

        // Creates a PDF Services instance
        const pdfServices = new PDFServices({credentials});

        // Creates an asset(s) from source file(s) and upload
        readStream = fs.createReadStream("./autotagPDFInput.pdf");
        const inputAsset = await pdfServices.upload({
            readStream,
            mimeType: MimeType.PDF
        });

        // Creates a new job instance
        const job = new AutotagPDFJob({inputAsset});

        // Submit the job and get the job result
        const pollingURL = await pdfServices.submit({job});
        const pdfServicesResponse = await pdfServices.getJobResult({
            pollingURL,
            resultType: AutotagPDFResult
        });

        // Get content from the resulting asset(s)
        const resultAsset = pdfServicesResponse.result.taggedPDF;
        const streamAsset = await pdfServices.getContent({asset: resultAsset});

        // Creates an output stream and copy stream asset's content to it
        const outputFilePath = "./autotag-tagged.pdf";
        console.log(`Saving asset at ${outputFilePath}`);

        let writeStream = fs.createWriteStream(outputFilePath);
        streamAsset.readStream.pipe(writeStream);
    } catch (err) {
        if (err instanceof SDKError || err instanceof ServiceUsageError || err instanceof ServiceApiError) {
            console.log("Exception encountered while executing operation", err);
        } else {
            console.log("Exception encountered while executing operation", err);
        }
    } finally {
        readStream?.destroy();
    }
})();

# Get the samples from https://github.com/adobe/pdfservices-python-sdk-samples
# Run the sample:
# python src/autotagpdf/autotag_pdf.py

logging.basicConfig(level=os.environ.get('LOGLEVEL', 'INFO'))

try:
    # get base path.
    base_path = str(Path(__file__).parents[2])

    # Initial setup, create credentials instance.
    credentials = Credentials.service_principal_credentials_builder() \
        .with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')) \
        .with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')) \
        .build()

    # Create an ExecutionContext using credentials and create a new operation instance.
    execution_context = ExecutionContext.create(credentials)
    autotag_pdf_operation = AutotagPDFOperation.create_new()

    # Set operation input from a source file.
    input_file_path = 'autotagPdfInput.pdf'
    source = FileRef.create_from_local_file(base_path + '/resources/' + input_file_path)
    autotag_pdf_operation.set_input(source)

    # Execute the operation.
    autotag_pdf_output: AutotagPDFOutput = autotag_pdf_operation.execute(execution_context)

    input_file_name = Path(input_file_path).stem
    base_output_path = base_path + '/output/AutotagPDF/'

    Path(base_output_path).mkdir(parents=True, exist_ok=True)
    tagged_pdf_path = f'{base_output_path}{input_file_name}-tagged.pdf'

    # Save the result to the specified location.
    autotag_pdf_output.get_tagged_pdf().save_as(tagged_pdf_path)

except (ServiceApiException, ServiceUsageException, SdkException) as e:
    logging.exception(f'Exception encountered while executing operation: {e}')

# Please refer to our REST API docs for more information
# https://developer.adobe.com/document-services/docs/apis/#tag/PDF-Accessibility-Auto-Tag

curl --location --request POST 'https://pdf-services.adobe.io/operation/autotag' \
--header 'x-api-key: {{Placeholder for client_id}}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {{Placeholder for token}}' \
--data-raw '{
    "assetID": "urn:aaid:AS:UE1:23c30ee0-2e4d-46d6-87f2-087832fca718"
}'
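
The same request can be assembled from a script. The sketch below builds, but does not send, the POST request shown in the curl command, using only the Python standard library. The function name build_autotag_request and its parameters are hypothetical; the URL, headers, and body fields are taken from the curl sample. Obtaining the token, uploading the asset, and handling the response are out of scope here and are described in the REST API docs.

```python
import json
import urllib.request

AUTOTAG_URL = "https://pdf-services.adobe.io/operation/autotag"


def build_autotag_request(client_id: str, access_token: str,
                          asset_id: str) -> urllib.request.Request:
    """Build the auto-tag POST request without sending it."""
    # The request body carries the ID of an asset that was uploaded beforehand.
    body = json.dumps({"assetID": asset_id}).encode("utf-8")
    return urllib.request.Request(
        AUTOTAG_URL,
        data=body,
        method="POST",
        headers={
            "x-api-key": client_id,
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Keeping construction separate from sending makes it easy to inspect or log the request first.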

Generate tagged PDF by setting options with command line arguments

The sample below generates a tagged PDF by setting options through command line arguments.

Here is a sample list of command line arguments and their description:

--input < input file path >
--output < output file path >
--report { If this argument is present then the output will be generated with the report }
--shift_headings { If this argument is present then the headings will be shifted in the output PDF file }
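
As a rough illustration of how these four arguments behave, the sketch below parses them with Python's standard argparse module. It is not taken from the SDK samples: the function name parse_args and the default paths are hypothetical, and only the argument names come from the list above.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line arguments described above."""
    parser = argparse.ArgumentParser(description="Generate a tagged PDF.")
    # Default paths are placeholders chosen for this sketch.
    parser.add_argument("--input", default="autotagPDFInput.pdf",
                        help="input file path")
    parser.add_argument("--output", default="output/",
                        help="output file path")
    # Flags: present means enabled, absent means disabled.
    parser.add_argument("--report", action="store_true",
                        help="generate the report along with the tagged PDF")
    parser.add_argument("--shift_headings", action="store_true",
                        help="shift the headings in the output PDF file")
    return parser.parse_args(argv)
```

For example, parse_args(["--input", "a.pdf", "--report"]) yields input set to "a.pdf", report set to True, and shift_headings set to False.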

Samples are provided for Java, .NET, Node.js, and Python.

// Get the samples from https://github.com/adobe/pdfservices-java-sdk-samples
// Run the sample:
// mvn -f pom.xml exec:java -Dexec.mainClass=com.adobe.pdfservices.operation.samples.autotagpdf.AutotagPDFParameterised

public class AutotagPDFParameterised {

    private static final org.slf4j.Logger LOGGER = LoggerFactory.getLogger(AutotagPDFParameterised.class);

    public static void main(String[] args) {
        LOGGER.info("--input " + getInputFilePathFromCmdArgs(args));
        LOGGER.info("--output " + getOutputFilePathFromCmdArgs(args));
        LOGGER.info("--report " + getGenerateReportFromCmdArgs(args));
        LOGGER.info("--shift_headings " + getShiftHeadingsFromCmdArgs(args));
        
        try (InputStream inputStream = Files.newInputStream(new File(getInputFilePathFromCmdArgs(args)).toPath())) {
            // Initial setup, create credentials instance
            Credentials credentials = new ServicePrincipalCredentials(
                System.getenv("PDF_SERVICES_CLIENT_ID"),
                System.getenv("PDF_SERVICES_CLIENT_SECRET"));
        
            // Creates a PDF Services instance
            PDFServices pdfServices = new PDFServices(credentials);
        
            // Creates an asset(s) from source file(s) and upload
            Asset asset = pdfServices.upload(inputStream, PDFServicesMediaType.PDF.getMediaType());
        
            // Create parameters for the job
            AutotagPDFParams autotagPDFParams = getOptionsFromCmdArgs(args);
        
            // Creates a new job instance
            AutotagPDFJob autotagPDFJob = new AutotagPDFJob(asset)
                .setParams(autotagPDFParams);
        
            // Submit the job and gets the job result
            String location = pdfServices.submit(autotagPDFJob);
            PDFServicesResponse<AutotagPDFResult> pdfServicesResponse = pdfServices.getJobResult(location, AutotagPDFResult.class);
        
            // Get content from the resulting asset(s)
            Asset resultAsset = pdfServicesResponse.getResult().getTaggedPDF();
            Asset resultAssetReport = pdfServicesResponse.getResult().getReport();
            StreamAsset streamAsset = pdfServices.getContent(resultAsset);
            StreamAsset streamAssetReport = (autotagPDFParams != null && autotagPDFParams.isGenerateReport()) ?
                pdfServices.getContent(resultAssetReport) : null;
        
            // Creating output streams and copying stream assets' content to it
            String outputPath = getOutputFilePathFromCmdArgs(args);
            // Create the actual output directory, which may differ from "output/"
            Files.createDirectories(Paths.get(outputPath));
            OutputStream outputStream = Files.newOutputStream(new File(outputPath + "autotagPDFInput-tagged.pdf").toPath());
            LOGGER.info("Saving asset at " + outputPath + "autotagPDFInput-tagged.pdf");
            IOUtils.copy(streamAsset.getInputStream(), outputStream);
            outputStream.close();
            if (streamAssetReport != null) {
                OutputStream outputStreamReport = Files.newOutputStream(new File(outputPath + "autotagPDFInput-report.xlsx").toPath());
                LOGGER.info("Saving asset at " + outputPath + "autotagPDFInput-report.xlsx");
                IOUtils.copy(streamAssetReport.getInputStream(), outputStreamReport);
                outputStreamReport.close();
            }
        } catch (ServiceApiException | IOException | SDKException | ServiceUsageException e) {
            LOGGER.error("Exception encountered while executing operation", e);
        }
    }

    private static AutotagPDFParams getOptionsFromCmdArgs(String[] args) {
        Boolean generateReport = getGenerateReportFromCmdArgs(args);
        Boolean shiftHeadings = getShiftHeadingsFromCmdArgs(args);
        AutotagPDFParams.Builder autotagPDFParamsBuilder = AutotagPDFParams.autotagPDFParamsBuilder();
    
        if (generateReport)
            autotagPDFParamsBuilder.generateReport();
        if (shiftHeadings)
            autotagPDFParamsBuilder.shiftHeadings();
    
        return autotagPDFParamsBuilder.build();
    }
    
    private static Boolean getShiftHeadingsFromCmdArgs(String[] args) {
        return Arrays.asList(args).contains("--shift_headings");
    }
    
    private static Boolean getGenerateReportFromCmdArgs(String[] args) {
        return Arrays.asList(args).contains("--report");
    }
    
    private static String getInputFilePathFromCmdArgs(String[] args) {
        String inputFilePath = "src/main/resources/autotagPDFInput.pdf";
        int inputFilePathIndex = Arrays.asList(args).indexOf("--input");
        if (inputFilePathIndex >= 0 && inputFilePathIndex < args.length - 1) {
            inputFilePath = args[inputFilePathIndex + 1];
        } else
            LOGGER.info("input file not specified, using default value : autotagPDFInput.pdf");
    
        return inputFilePath;
    }
    
    private static String getOutputFilePathFromCmdArgs(String[] args) {
        String outputFilePath = "output/AutotagPDFParameterised/";
        int outputFilePathIndex = Arrays.asList(args).indexOf("--output");
        if (outputFilePathIndex >= 0 && outputFilePathIndex < args.length - 1) {
            outputFilePath = args[outputFilePathIndex + 1];
        } else
            LOGGER.info("output path not specified, using default value : " + outputFilePath);
    
        return outputFilePath;
    }
}

80        return Arrays.asList(args).contains("--report");
81    }
82    
83    private static String getInputFilePathFromCmdArgs(String[] args) {
84        String inputFilePath = "src/main/resources/autotagPDFInput.pdf";
85        int inputFilePathIndex = Arrays.asList(args).indexOf("--input");
86        if (inputFilePathIndex >= 0 && inputFilePathIndex < args.length - 1) {
87            inputFilePath = args[inputFilePathIndex + 1];
88        } else
89            LOGGER.info("input file not specified, using default value : autotagPDFInput.pdf");
90    
91        return inputFilePath;
92    }
93    
94    private static String getOutputFilePathFromCmdArgs(String[] args) {
95        String outputFilePath = "output/AutotagPDFParameterised/";
96        int outputFilePathIndex = Arrays.asList(args).indexOf("--output");
97        if (outputFilePathIndex >= 0 && outputFilePathIndex < args.length - 1) {
98            outputFilePath = args[outputFilePathIndex + 1];
99        } else
100            LOGGER.info("output path not specified, using default value : " + outputFilePath);
101    
102        return outputFilePath;
103    }
104}
105

// Get the samples from https://github.com/adobe/PDFServices.NET.SDK.Samples
// Run the sample:
// cd AutotagPDF/
// dotnet run .csproj

namespace AutotagPDFParameterised
{
    class Program
    {
    private static readonly ILog log = LogManager.GetLogger(typeof(Program));

    private static AutotagPDFOptions GetOptionsFromCmdArgs(String[] args)
    {
        Boolean generateReport = GetGenerateReportFromCmdArgs(args);
        Boolean shiftHeadings = GetShiftHeadingsFromCmdArgs(args);

        AutotagPDFOptions.Builder builder = AutotagPDFOptions.AutotagPDFOptionsBuilder();

        if (generateReport) builder.GenerateReport();
        if (shiftHeadings) builder.ShiftHeadings();

        return builder.Build();
    }

    private static Boolean GetShiftHeadingsFromCmdArgs(String[] args)
    {
        return Array.Exists(args, element => element == "--shift_headings");
    }

    private static Boolean GetGenerateReportFromCmdArgs(String[] args)
    {
        return Array.Exists(args, element => element == "--report");
    }

    private static String GetInputFilePathFromCmdArgs(String[] args)
    {
        String inputFilePath = @"autotagPdfInput.pdf";
        int inputFilePathIndex = Array.IndexOf(args, "--input");
        if (inputFilePathIndex >= 0 && inputFilePathIndex < args.Length - 1)
        {
            inputFilePath = args[inputFilePathIndex + 1];
        }
        else
            log.Info("input file not specified, using default value : autotagPdfInput.pdf");

        return inputFilePath;
    }

    private static String GetOutputFilePathFromCmdArgs(String[] args)
    {
        String outputFilePath = Directory.GetCurrentDirectory() + "/output/";
        int outputFilePathIndex = Array.IndexOf(args, "--output");
        if (outputFilePathIndex >= 0 && outputFilePathIndex < args.Length - 1)
        {
            outputFilePath = args[outputFilePathIndex + 1];
        }
        else
            log.Info("output path not specified, using default value : /output/");

        return outputFilePath;
    }

    static void Main(string[] args)
    {
        //Configure the logging
        ConfigureLogging();

        log.Info("--input " + GetInputFilePathFromCmdArgs(args));
        log.Info("--output " + GetOutputFilePathFromCmdArgs(args));
        log.Info("--report " + GetGenerateReportFromCmdArgs(args));
        log.Info("--shift_headings " + GetShiftHeadingsFromCmdArgs(args));

        try
        {
            // Initial setup, create credentials instance from environment variables.
            Credentials credentials = Credentials.ServicePrincipalCredentialsBuilder()
                    .WithClientId(Environment.GetEnvironmentVariable("PDF_SERVICES_CLIENT_ID"))
                    .WithClientSecret(Environment.GetEnvironmentVariable("PDF_SERVICES_CLIENT_SECRET"))
                    .Build();

            //Create an ExecutionContext using credentials and create a new operation instance.
            ExecutionContext executionContext = ExecutionContext.Create(credentials);
            AutotagPDFOperation autotagPDFOperation = AutotagPDFOperation.CreateNew();

            // Provide an input FileRef for the operation
            autotagPDFOperation.SetInput(FileRef.CreateFromLocalFile(GetInputFilePathFromCmdArgs(args)));

            // Get and Build AutotagPDF options from command line args and set them into the operation
            AutotagPDFOptions autotagPDFOptions = GetOptionsFromCmdArgs(args);
            autotagPDFOperation.SetOptions(autotagPDFOptions);

            // Execute the operation
            AutotagPDFOutput autotagPDFOutput = autotagPDFOperation.Execute(executionContext);

            // Save the output files at the specified location.
            // The default output path already includes the current directory, so use it as-is.
            string outputPath = GetOutputFilePathFromCmdArgs(args);
            FileRef taggedPDF = autotagPDFOutput.GetTaggedPDF();
            taggedPDF.SaveAs(outputPath + "autotagPDFInput-tagged.pdf");
            if (autotagPDFOptions != null && autotagPDFOptions.IsGenerateReport)
                autotagPDFOutput.GetReport()
                    .SaveAs(outputPath + "autotagPDFInput-report.xlsx");
        }
        catch (ServiceUsageException ex)
        {
            log.Error("Exception encountered while executing operation", ex);
        }
        catch (ServiceApiException ex)
        {
            log.Error("Exception encountered while executing operation", ex);
        }
        catch (SDKException ex)
        {
            log.Error("Exception encountered while executing operation", ex);
        }
        catch (IOException ex)
        {
            log.Error("Exception encountered while executing operation", ex);
        }
        catch (Exception ex)
        {
            log.Error("Exception encountered while executing operation", ex);
        }
    }

    static void ConfigureLogging()
    {
        ILoggerRepository logRepository = LogManager.GetRepository(Assembly.GetEntryAssembly());
        XmlConfigurator.Configure(logRepository, new FileInfo("log4net.config"));
    }
}
}

// Get the samples from https://github.com/adobe/pdfservices-node-sdk-samples
// Run the sample:
// node src/autotag/autotag-pdf-parameterised.js

const {
    ServicePrincipalCredentials,
    PDFServices,
    MimeType,
    AutotagPDFParams,
    AutotagPDFJob,
    AutotagPDFResult,
    SDKError,
    ServiceUsageError,
    ServiceApiError
} = require("@adobe/pdfservices-node-sdk");
const fs = require("fs");
const args = process.argv;

(async () => {
    let readStream;
    try {
        console.log("--input " + getInputFilePathFromCmdArgs(args));
        console.log("--output " + getOutputFilePathFromCmdArgs(args));
        console.log("--report " + getGenerateReportFromCmdArgs(args));
        console.log("--shift_headings " + getShiftHeadingsFromCmdArgs(args));

        // Initial setup, create credentials instance
        const credentials = new ServicePrincipalCredentials({
            clientId: process.env.PDF_SERVICES_CLIENT_ID,
            clientSecret: process.env.PDF_SERVICES_CLIENT_SECRET
        });

        // Creates a PDF Services instance
        const pdfServices = new PDFServices({credentials});

        // Creates an asset(s) from source file(s) and upload
        readStream = fs.createReadStream(getInputFilePathFromCmdArgs(args));
        const inputAsset = await pdfServices.upload({
            readStream,
            mimeType: MimeType.PDF
        });

        // Create parameters for the job
        const params = new AutotagPDFParams({
            generateReport: getGenerateReportFromCmdArgs(args),
            shiftHeadings: getShiftHeadingsFromCmdArgs(args)
        });

        // Creates a new job instance
        const job = new AutotagPDFJob({inputAsset, params});

        // Submit the job and get the job result
        const pollingURL = await pdfServices.submit({job});
        const pdfServicesResponse = await pdfServices.getJobResult({
            pollingURL,
            resultType: AutotagPDFResult
        });

        // Get content from the resulting asset(s)
        const resultAsset = pdfServicesResponse.result.taggedPDF;
        const resultAssetReport = pdfServicesResponse.result.report;
        const streamAsset = await pdfServices.getContent({asset: resultAsset});
        const streamAssetReport = resultAssetReport
            ? await pdfServices.getContent({asset: resultAssetReport})
            : undefined;

        // Creates an output stream and copy stream asset's content to it
        const outputPath = getOutputFilePathFromCmdArgs(args);
        const outputFilePath = outputPath + "autotagPDFInput-tagged.pdf";
        console.log(`Saving asset at ${outputFilePath}`);

        const writeStream = fs.createWriteStream(outputFilePath);
        streamAsset.readStream.pipe(writeStream);
        if (resultAssetReport) {
            const outputFileReportPath = outputPath + "autotagPDFInput-report.xlsx";
            console.log(`Saving asset at ${outputFileReportPath}`);

            const writeStream = fs.createWriteStream(outputFileReportPath);
            streamAssetReport.readStream.pipe(writeStream);
        }
    } catch (err) {
        if (err instanceof SDKError || err instanceof ServiceUsageError || err instanceof ServiceApiError) {
            console.log("Exception encountered while executing operation", err);
        } else {
            console.log("Exception encountered while executing operation", err);
        }
    } finally {
        readStream?.destroy();
    }
})();

function getInputFilePathFromCmdArgs(args) {
    let inputFilePath = "resources/autotagPdfInput.pdf";
    let inputFilePathIndex = args.indexOf("--input");
    if (inputFilePathIndex >= 0 && inputFilePathIndex < args.length - 1) {
        inputFilePath = args[inputFilePathIndex + 1];
    } else
        console.log("input file not specified, using default value : autotagPdfInput.pdf");
    return inputFilePath;
}

function getOutputFilePathFromCmdArgs(args) {
    let outputFilePath = "output/";
    let outputFilePathIndex = args.indexOf("--output");
    if (outputFilePathIndex >= 0 && outputFilePathIndex < args.length - 1) {
        outputFilePath = args[outputFilePathIndex + 1];
    } else {
        console.log("output path not specified, using default value : " + outputFilePath);
    }
    // Make sure the target directory exists, whether it is the default or user-supplied
    fs.mkdirSync(outputFilePath, {recursive: true});
    return outputFilePath;
}

function getGenerateReportFromCmdArgs(args) {
    return args.includes("--report");
}

function getShiftHeadingsFromCmdArgs(args) {
    return args.includes("--shift_headings");
}

# Get the samples from https://github.com/adobe/pdfservices-python-sdk-samples
# Run the sample:
# python src/autotagpdf/autotag_pdf_parameterised.py --report --shift_headings --input resources/autotagPdfInput.pdf --output output/

logging.basicConfig(level=os.environ.get('LOGLEVEL', 'INFO'))


class AutotagPDFParameterised:

    _input_path: str
    _output_path: str
    _generate_report: bool
    _shift_headings: bool

    base_path = str(Path(__file__).parents[2])

    def __init__(self):
        pass

    @staticmethod
    def parse_args(*args: str):
        if not args:
            args = sys.argv[1:]
        parser = argparse.ArgumentParser(description='Autotag PDF')

        parser.add_argument('--input', help='Input file path', type=Path, metavar='input')
        parser.add_argument('--output', help='Output path', type=Path, dest='output')
        parser.add_argument('--report', dest='report', action='store_true', help='Generate report (in XLSX format)',
                            default=False)
        parser.add_argument('--shift_headings', dest='shift_headings', action='store_true', help='Shift headings',
                            default=False)

        return parser.parse_args(args)

    def get_default_input_file_path(self) -> str:
        return self.base_path + '/resources/autotagPdfInput.pdf'

    def get_default_output_file_path(self) -> str:
        return self.base_path + '/output/AutotagPDFParameterised'

    def get_autotag_pdf_options(self) -> AutotagPDFOptions:
        shift_headings = self._shift_headings
        generate_report = self._generate_report

        builder: AutotagPDFOptions.Builder = AutotagPDFOptions.builder()
        if shift_headings:
            builder.with_shift_headings()
        if generate_report:
            builder.with_generate_report()
        return builder.build()

    def execute(self, *args: str) -> None:
        args = self.parse_args(*args)
        self._input_path = args.input if args.input else self.get_default_input_file_path()
        self._output_path = args.output if args.output else self.get_default_output_file_path()
        self._generate_report = args.report
        self._shift_headings = args.shift_headings

        self.autotag_pdf()

    def autotag_pdf(self):
        try:
            # Initial setup, create credentials instance from environment variables.
            credentials = Credentials.service_principal_credentials_builder() \
                .with_client_id(os.getenv('PDF_SERVICES_CLIENT_ID')) \
                .with_client_secret(os.getenv('PDF_SERVICES_CLIENT_SECRET')) \
                .build()

            # Create an ExecutionContext using credentials and create a new operation instance.
            execution_context = ExecutionContext.create(credentials)
            autotag_pdf_operation = AutotagPDFOperation.create_new()

            # Set operation input from a source file.
            source = FileRef.create_from_local_file(self._input_path)
            autotag_pdf_operation.set_input(source)

            # Build AutotagPDF options and set them into the operation
            autotag_pdf_operation.set_options(self.get_autotag_pdf_options())

            # Execute the operation.
            autotag_pdf_output: AutotagPDFOutput = autotag_pdf_operation.execute(execution_context)

            input_file_name = Path(self._input_path).stem
            base_output_path = self._output_path

            Path(base_output_path).mkdir(parents=True, exist_ok=True)

            # Save the result to the specified location.
            tagged_pdf_path = f'{base_output_path}/{input_file_name}-tagged.pdf'
            autotag_pdf_output.get_tagged_pdf().save_as(tagged_pdf_path)
            if self._generate_report:
                report_path = f'{base_output_path}/{input_file_name}-report.xlsx'
                autotag_pdf_output.get_report().save_as(report_path)

        except (ServiceApiException, ServiceUsageException, SdkException) as e:
            logging.exception(f'Exception encountered while executing operation: {e}')


if __name__ == "__main__":
    autotag_pdf_parameterised = AutotagPDFParameterised()
    autotag_pdf_parameterised.execute()