  Chapter 14: Document Stitching and Other Features
14.1 Document Stitching
14.2 Metadata
14.3 PDF/A Support

14.1 Document Stitching

AspPDF.NET is capable of joining together two or more PDFs to form a new document. This process is often referred to as document stitching.

14.1.1 AppendDocument Method

Document stitching is performed via the AppendDocument method provided by the PdfDocument object. This method expects a single argument: an instance of the PdfDocument object representing another document to be appended to the current document. The AppendDocument method can be called more than once to append multiple documents to the current one.

The PdfDocument object to which other documents are appended (the master document) can be either a new or an existing document. The PdfDocument objects that get appended must all be existing documents. A document cannot be appended to itself.

The master document determines the general and security properties of the resultant document.

The following code sample appends the file doc2.pdf to the end of the document doc1.pdf:

C#
PdfManager objPDF = new PdfManager();

// Open Document 1
PdfDocument objDoc1 = objPDF.OpenDocument( Server.MapPath("doc1.pdf") );

// Open Document 2
PdfDocument objDoc2 = objPDF.OpenDocument( Server.MapPath("doc2.pdf") );

// Append doc2 to doc1
objDoc1.AppendDocument( objDoc2 );
VB.NET
Dim objPDF As PdfManager = New PdfManager()

' Open Document 1
Dim objDoc1 As PdfDocument = objPDF.OpenDocument(Server.MapPath("doc1.pdf"))

' Open Document 2
Dim objDoc2 As PdfDocument = objPDF.OpenDocument(Server.MapPath("doc2.pdf"))

' Append doc2 to doc1
objDoc1.AppendDocument(objDoc2)

Click the links below to run this code sample:

http://localhost/asppdf.net/manual_14/14_stitch.cs.aspx
http://localhost/asppdf.net/manual_14/14_stitch.vb.aspx

Make sure AspPDF.NET is installed on your local machine and IIS is running for these links to work.

14.1.2 Applying and Removing Security

A cumulative document produced by appending one or more PDFs to a master document inherits the master document's security properties. For example, if a master document is encrypted and the documents appended to it are not, the resultant PDF will be encrypted with the same passwords and permission flags as the master document. Conversely, if the master document is unencrypted and encrypted documents are appended to it, the resultant document will be unencrypted.

This feature can be used to apply security to unsecure documents, as well as modify or remove security from encrypted documents. The idea is to create an empty document, call the Encrypt method on it if necessary, then append the PDF that needs security added or removed.

To comply with Adobe PDF licensing requirements, AspPDF.NET performs security removal only if the document being appended is opened using the owner password. Otherwise, an exception is thrown.

The following code sample applies security to the file doc1.pdf. Note that various document properties are being copied from the original document (doc1.pdf) to the new one, because by default the resultant PDF would inherit document properties of the master PDF (in our case, an empty document) and the original document's properties would be lost.

C#
...
// Create empty document
PdfDocument objDoc = objPdf.CreateDocument();

// Open Document 1
PdfDocument objDoc1 = objPdf.OpenDocument( Server.MapPath("doc1.pdf") );

// Copy properties
objDoc.Title		= objDoc1.Title;
objDoc.Creator		= objDoc1.Creator;
objDoc.Producer		= objDoc1.Producer;
objDoc.CreationDate = objDoc1.CreationDate;
objDoc.ModDate		= objDoc1.ModDate;

// Apply security to objDoc, use pdfFull permission by default
objDoc.Encrypt( "abc", "", 128 );

// Append doc1 to doc
objDoc.AppendDocument( objDoc1 );
...
VB.NET
...
' Create empty document
Dim objDoc As PdfDocument = objPdf.CreateDocument()

' Open Document 1
Dim objDoc1 As PdfDocument = objPdf.OpenDocument(Server.MapPath("doc1.pdf"))

' Copy properties
objDoc.Title = objDoc1.Title
objDoc.Creator = objDoc1.Creator
objDoc.Producer = objDoc1.Producer
objDoc.CreationDate = objDoc1.CreationDate
objDoc.ModDate = objDoc1.ModDate

' Apply security to objDoc, use pdfFull permission by default
objDoc.Encrypt("abc", "", 128)

' Append doc1 to doc
objDoc.AppendDocument(objDoc1)
...

Click the links below to run this code sample:

http://localhost/asppdf.net/manual_14/14_applysecurity.cs.aspx
http://localhost/asppdf.net/manual_14/14_applysecurity.vb.aspx

Make sure AspPDF.NET is installed on your local machine and IIS is running for these links to work.
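
The opposite case described in this section, removing security from an encrypted document, might be sketched as follows. This is an illustration rather than a tested sample. It assumes that the component's namespace is Persits.PDF and that OpenDocument accepts the password as an optional second argument; verify both against the object reference for your version. The class name SecurityRemoval and its parameters are hypothetical.

```csharp
using Persits.PDF; // assumption: namespace of the AspPDF.NET assembly

public static class SecurityRemoval
{
    // Writes an unencrypted copy of an encrypted PDF and returns the saved file name.
    public static string RemoveSecurity(string sourcePath, string ownerPassword, string targetPath)
    {
        PdfManager objPdf = new PdfManager();

        // Master document: new and never encrypted, so the result is unencrypted
        PdfDocument objDoc = objPdf.CreateDocument();

        // Assumption: the second argument is the password. It must be the
        // owner password, otherwise AppendDocument throws an exception.
        PdfDocument objDoc1 = objPdf.OpenDocument(sourcePath, ownerPassword);

        // Copy properties that would otherwise be lost
        objDoc.Title = objDoc1.Title;
        objDoc.Creator = objDoc1.Creator;
        objDoc.CreationDate = objDoc1.CreationDate;
        objDoc.ModDate = objDoc1.ModDate;

        // No call to Encrypt here: the master document stays unsecured
        objDoc.AppendDocument(objDoc1);

        return objDoc.Save(targetPath, false);
    }
}
```

The only differences from the sample above are the password passed when opening the source document and the absence of the Encrypt call on the master document.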

14.1.3 Making Changes to Documents Being Appended

As mentioned earlier, a document being appended must be an existing document opened via OpenDocument. Changes made to a document being appended will not propagate to the resultant compound document.

If you need to make changes to a document being appended, the following workaround is recommended:

PdfDocument objDoc1 = objPDF.OpenDocument(...);
PdfDocument objDoc2 = objPDF.OpenDocument(...);
// Make changes to objDoc2

PdfDocument objDoc3 = objPDF.OpenDocument( objDoc2.SaveToMemory() );
objDoc1.AppendDocument( objDoc3 );

This code fragment uses an intermediary memory-based document objDoc3 to hold the modified version of objDoc2.

14.1.4 Creating Multi-Page Documents Based on a Template

AppendDocument is not a very efficient way to create multi-page documents based on a single-page PDF template. We recommend that the method CreateGraphicsFromPage described in Section 9.6 be used for this task instead. For a code sample, see our KB Article PS130905190.

14.2 Metadata

All major Adobe products share a common technology that enables you to embed data describing a file, known as metadata, into the file itself. This technology, called Extensible Metadata Platform (XMP), uses XML as the syntax for metadata description. For more information on XMP, go to http://www.adobe.com/products/xmp.

XML tags used in an XMP data block are described by the Resource Description Framework (RDF) available at http://www.w3.org/RDF.

A typical metadata string may look as follows:

<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#' xmlns:iX='http://ns.adobe.com/iX/1.0/'>

 <rdf:Description about='' xmlns='http://ns.adobe.com/pdf/1.3/' xmlns:pdf='http://ns.adobe.com/pdf/1.3/'>
  <pdf:CreationDate>2002-12-24T07:48:28Z</pdf:CreationDate>
  <pdf:ModDate>2003-02-28T19:39:16+09:00</pdf:ModDate>
  <pdf:Producer>Acrobat Distiller 5.0.1 for Macintosh</pdf:Producer>
  <pdf:Title>Technical Specifications</pdf:Title>
  <pdf:Author>John Smith</pdf:Author>
 </rdf:Description>

 <rdf:Description about='' xmlns='http://ns.adobe.com/xap/1.0/' xmlns:xap='http://ns.adobe.com/xap/1.0/'>
  <xap:CreateDate>2002-12-24T07:48:28Z</xap:CreateDate>
  <xap:ModifyDate>2003-02-28T19:39:16+09:00</xap:ModifyDate>
  <xap:MetadataDate>2003-02-28T19:39:16+09:00</xap:MetadataDate>
  <xap:Title>
   <rdf:Alt>
    <rdf:li xml:lang='x-default'>Technical Specifications</rdf:li>
   </rdf:Alt>
  </xap:Title>
  <xap:Author>John Smith</xap:Author>
 </rdf:Description>

 <rdf:Description about='' xmlns='http://purl.org/dc/elements/1.1/' xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <dc:title>Technical Specifications</dc:title>
  <dc:creator>John Smith</dc:creator>
 </rdf:Description>

</rdf:RDF>

AspPDF.NET enables you to retrieve and specify metadata associated with a PDF document via the MetaData property of the PdfDocument object. The following code fragment extracts and prints out metadata from an existing PDF file:

PdfDocument objDoc = objPDF.OpenDocument(@"c:\path\somedoc.pdf");
Response.Write( objDoc.MetaData );

AspPDF.NET provides no functionality for parsing out individual metadata items. Any XML parser object can be used for that, such as Microsoft XML DOM.
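
As one possible approach, the string returned by the MetaData property can be parsed with the LINQ to XML classes that ship with .NET. The sketch below reads the Dublin Core title and handles both the plain-text form and the rdf:Alt form shown earlier. The class name XmpReader and the method GetTitle are hypothetical and not part of AspPDF.NET.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmpReader
{
    static readonly XNamespace Dc = "http://purl.org/dc/elements/1.1/";
    static readonly XNamespace Rdf = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Returns the value of the first <dc:title> element, or null if there is none.
    public static string GetTitle(string xmp)
    {
        XDocument doc = XDocument.Parse(xmp);

        XElement title = doc.Descendants(Dc + "title").FirstOrDefault();
        if (title == null)
            return null;

        // The title is either plain text or an rdf:Alt list of language
        // alternatives; prefer the x-default entry when a list is present.
        XElement li = title.Descendants(Rdf + "li")
                .FirstOrDefault(e => (string)e.Attribute(XNamespace.Xml + "lang") == "x-default")
            ?? title.Descendants(Rdf + "li").FirstOrDefault();

        return (li ?? title).Value.Trim();
    }
}
```

A call such as XmpReader.GetTitle(objDoc.MetaData) would then return the title text. Other tags can be read the same way by substituting the namespace and element name.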

14.3 PDF/A Support
14.3.1 PDF/A: PDF for Archiving

The PDF/A format is a subset of the regular PDF format with certain features, deemed incompatible with long-term archival and storage of documents, removed. PDF/A-compliant documents must be completely self-contained, with no reliance on external resources. The single most important requirement for PDF/A files is that all fonts must be embedded. Other requirements include:

  • Encryption is not allowed.
  • Documents must contain standards-based metadata.
  • Links to other documents and URLs are not allowed.
  • Use of device-dependent color spaces such as DeviceRGB is only allowed with some restrictions.
  • Certain other PDF features, such as JavaScript, XML Forms Architecture (XFA), LZW compression, and others, are not allowed.

There are currently three levels of PDF/A conformance: PDF/A-1, PDF/A-2 and PDF/A-3, with Level 1 subdivided into sublevels A and B.

For more information on PDF/A, see http://www.pdfa.org.

14.3.2 AspPDF.NET's Support for PDF/A

As of Version 3.3, AspPDF.NET is capable of producing PDF documents compliant with PDF/A-1b, the basic conformance level which ensures reliable reproduction of the visual appearance of the document. Even prior to Version 3.3, AspPDF.NET embedded all TrueType fonts and allowed metadata to be specified, thus meeting the most important PDF/A requirements. Version 3.3 bridges the remaining gap to full PDF/A-1b compliance by implementing the following features and enhancements:

  • The new PdfDocument.AddOutputIntent method enables mapping from a device-dependent color space such as DeviceRGB to a device-independent color space via an International Color Consortium (ICC) profile, thus satisfying the media-independent visual color reproduction requirement.
  • The entries /CIDToGIDMap and /CIDSet have been implemented for embedded TrueType fonts.
  • A bug that caused certain stream objects to lack the required end-of-line character before the keyword endstream has been fixed.

The AddOutputIntent method expects four arguments: the output condition, the output condition identifier, the path to the .icc profile file, and the number of color components in the device-dependent color space used by the document (1 for DeviceGray, 3 for DeviceRGB and 4 for DeviceCMYK). The output condition is a string concisely identifying the intended output device or production condition in human-readable form. The output condition identifier is a string that identifies the output device or production condition as it appears in an industry-standard registry, and can be set to "Custom".

The metadata format is XML-based and similar to that described in the previous section of this chapter, but must contain additional tags. The following example is a minimal metadata string required for PDF/A-1b compliance:

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>

   <x:xmpmeta x:xmptk="Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00" xmlns:x="adobe:ns:meta/">
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

         <rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
            <pdfaid:part>1</pdfaid:part>
            <pdfaid:conformance>B</pdfaid:conformance>
         </rdf:Description>

         <rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
            <pdf:Producer>Persits Software AspPDF for .NET - www.persits.com</pdf:Producer>
            <pdf:Keywords></pdf:Keywords>
         </rdf:Description>

      </rdf:RDF>
   </x:xmpmeta>

<?xpacket end="w"?>

There are several components of this metadata string that are worth noting:

  • The metadata must be enclosed within the <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?> and <?xpacket end="w"?> tags.
  • The PDF/A level of conformance must be specified via the <pdfaid:part> and <pdfaid:conformance> tags (1 and B, respectively, for AspPDF.NET's level of conformance).
  • The Producer value must match the current value for the PdfDocument.Producer property which is set to "Persits Software AspPDF for .NET - www.persits.com" by default.

In addition to the tags shown above, PDF/A metadata almost always contains "Dublin Core" (DC) tags as well, such as <dc:title> and <dc:description>, for example:

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>

   <x:xmpmeta x:xmptk="Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00" xmlns:x="adobe:ns:meta/">
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

         ...

         <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
            <dc:title>
               <rdf:Alt>
                  <rdf:li xml:lang="x-default">Sunset on the beach</rdf:li>
                  <rdf:li xml:lang="de-DE">Sonnenuntergang am Strand</rdf:li>
               </rdf:Alt>
            </dc:title>

            <dc:description>
               <rdf:Alt>
                  <rdf:li xml:lang="x-default">Hello, World</rdf:li>
                  <rdf:li xml:lang="de-DE">Hallo, Welt</rdf:li>
               </rdf:Alt>
            </dc:description>

         </rdf:Description>

      </rdf:RDF>
   </x:xmpmeta>

<?xpacket end="w"?>
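
The three points noted in this section (the xpacket wrapper, the conformance tags and the matching Producer value) can be checked before the string is assigned to the MetaData property. The following is a minimal sketch using only standard .NET classes. It is not part of AspPDF.NET, the names PdfAMetadataCheck and Check are hypothetical, and it only recognizes values written as elements, as in the examples above, not as attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class PdfAMetadataCheck
{
    static readonly XNamespace PdfaId = "http://www.aiim.org/pdfa/ns/id/";
    static readonly XNamespace Pdf = "http://ns.adobe.com/pdf/1.3/";

    // Returns a list of problems; an empty list means all three checks passed.
    public static List<string> Check(string xmp, string producer)
    {
        var problems = new List<string>();
        string trimmed = xmp.Trim();

        // 1. The packet must be enclosed in the xpacket begin/end instructions.
        if (!trimmed.StartsWith("<?xpacket begin=", StringComparison.Ordinal) ||
            !trimmed.EndsWith("<?xpacket end=\"w\"?>", StringComparison.Ordinal))
            problems.Add("missing xpacket wrapper");

        XDocument doc;
        try
        {
            doc = XDocument.Parse(xmp);
        }
        catch (XmlException)
        {
            problems.Add("metadata is not well-formed XML");
            return problems;
        }

        // 2. Conformance level: part 1, conformance B for PDF/A-1b.
        string part = (string)doc.Descendants(PdfaId + "part").FirstOrDefault();
        string conformance = (string)doc.Descendants(PdfaId + "conformance").FirstOrDefault();
        if (part != "1" || conformance != "B")
            problems.Add("pdfaid:part and pdfaid:conformance are not 1 and B");

        // 3. The Producer tag must match the document's Producer property.
        string xmpProducer = (string)doc.Descendants(Pdf + "Producer").FirstOrDefault();
        if (xmpProducer != producer)
            problems.Add("pdf:Producer does not match the Producer property");

        return problems;
    }
}
```

A typical call would be PdfAMetadataCheck.Check(strMetadata, objDoc.Producer). Passing these checks does not by itself guarantee PDF/A-1b compliance; a dedicated validator is still needed for that.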

14.3.3 Code Sample

The following code sample creates a PDF/A-1b compliant document by importing the URL http://www.asppdf.net, attaching the metadata from the text file metadata.txt located in the same folder as the code sample, and adding an output intent based on the color profile AdobeRGB1998.icc located in the sibling folder manual_16 of the installation.

The content of the file metadata.txt is as follows:

<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>

<x:xmpmeta x:xmptk="Adobe XMP Core 4.2.1-c041 52.342996, 2008/05/07-20:48:00" xmlns:x="adobe:ns:meta/">
	<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

		<rdf:Description rdf:about="" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
			<pdfaid:part>1</pdfaid:part>
			<pdfaid:conformance>B</pdfaid:conformance>
		</rdf:Description>

		<rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
			<pdf:Producer>Persits Software AspPDF for .NET - www.persits.com</pdf:Producer>
			<pdf:Keywords></pdf:Keywords>
		</rdf:Description>

		<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
			<dc:title>
				<rdf:Alt>
					<rdf:li xml:lang="x-default">@@@title@@@</rdf:li>
				</rdf:Alt>
			</dc:title>
		</rdf:Description>

	</rdf:RDF>
</x:xmpmeta>
	
<?xpacket end="w"?>
	

Note that this metadata file is actually a template: it contains the placeholder @@@title@@@ where the actual title should be. The code sample replaces the placeholder with the value of the PdfDocument.Title property (which is not known in advance, since ImportFromUrl sets it based on the HTML content it imports) to ensure that the document's Title entry and the value of the <dc:title> tag in the metadata match.

C#
PdfManager objPdf = new PdfManager();

// Create empty document
PdfDocument objDoc = objPdf.CreateDocument();

// Convert HTML to PDF
objDoc.ImportFromUrl("http://www.asppdf.net", "landscape=true; scale=0.75");

// Add metadata from a file
string strMetadata = objPdf.LoadTextFromFile(Server.MapPath("metadata.txt"));

// Replace placeholder with actual document title
strMetadata = strMetadata.Replace("@@@title@@@", objDoc.Title);

objDoc.MetaData = strMetadata;

// Add output intent using an RGB color profile. Borrow the .icc file from the manual_16 folder
string strProfilePath = Server.MapPath(".") + @"\..\manual_16\AdobeRGB1998.icc";
objDoc.AddOutputIntent("AdobeRGB", "Custom", strProfilePath, 3);

// Save document
string strPath = Server.MapPath("pdfa.pdf");
string strFileName = objDoc.Save(strPath, false);
VB.NET
Dim objPdf As PdfManager = New PdfManager()

' Create empty document
Dim objDoc As PdfDocument = objPdf.CreateDocument()

' Convert HTML to PDF
objDoc.ImportFromUrl("http://www.asppdf.net", "landscape=true; scale=0.75")

' Add metadata from a file
Dim strMetadata As String = objPdf.LoadTextFromFile(Server.MapPath("metadata.txt"))

' Replace placeholder with actual document title
strMetadata = strMetadata.Replace("@@@title@@@", objDoc.Title)

objDoc.MetaData = strMetadata

' Add output intent using an RGB color profile. Borrow the .icc file from the manual_16 folder
Dim strProfilePath As String = Server.MapPath(".") + "\..\manual_16\AdobeRGB1998.icc"
objDoc.AddOutputIntent("AdobeRGB", "Custom", strProfilePath, 3)

' Save document
Dim strPath As String = Server.MapPath("pdfa.pdf")
Dim strFileName As String = objDoc.Save(strPath, false)

Click the links below to run this code sample:

http://localhost/asppdf.net/manual_14/14_pdfa.cs.aspx
http://localhost/asppdf.net/manual_14/14_pdfa.vb.aspx

Make sure AspPDF.NET is installed on your local machine and IIS is running for these links to work.
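
One caveat about the placeholder replacement in this code sample: a page title containing characters such as & or < would make the metadata ill-formed if inserted as is. A possible refinement, sketched below with the standard .NET method SecurityElement.Escape, is to escape the title before substituting it. The class name MetadataTemplate and the method FillTitle are hypothetical and not part of AspPDF.NET.

```csharp
using System;
using System.Security;

public static class MetadataTemplate
{
    // Replaces the @@@title@@@ placeholder with the title,
    // escaping XML special characters (<, >, &, ", ') first.
    public static string FillTitle(string template, string title)
    {
        string safeTitle = SecurityElement.Escape(title ?? string.Empty);
        return template.Replace("@@@title@@@", safeTitle);
    }
}
```

In the sample above, the Replace call would become strMetadata = MetadataTemplate.FillTitle(strMetadata, objDoc.Title). An XML parser reading the metadata restores the original characters, so the <dc:title> value still matches the document's Title entry.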

  This site is owned and maintained by Persits Software, Inc. Copyright © 2003 - 2014. All Rights Reserved.