Submitting, processing and responding to PDF form data

1/7/2016 By Frank

If you have to choose between an HTML form and a PDF form - or maybe you are required to support both - then it is good to know about the differences between these two forms and what they have in common. The scope of this article is restricted to classic PDF forms - as opposed to XFA forms.

HTML Form

An HTML form looks like this:

    <h1>I want pizza!</h1>
    <form method="get" action="order">
      <p>Choose a size:</p>
      <input type="radio" name="size" value="small"> small
      <input type="radio" name="size" value="medium"> medium
      <input type="radio" name="size" value="large"> large
      <p>Choose ingredients:</p>
      <input type="checkbox" name="tomatoes" value="tomatoes"> tomatoes
      <input type="checkbox" name="onions" value="onions"> onions
      <input type="checkbox" name="tuna" value="tuna"> tuna
      <input type="checkbox" name="cheese" value="cheese"> cheese
      <p>My name:</p>
      <input type="text" name="name" /><br />
      <input type="submit" value="order" />
    </form>

Both the field itself and how it looks are represented by the same element, namely the input element. PDF, on the other hand, completely separates the notion of a field from its representation.

PDF Form

PDF fields are defined at document level, and each field may have zero or more visual representations called widgets. Each widget is associated with a page. The following diagram shows this structure and how document, page, field and widget are related:

PDF-fields-pages-and-widgets.png
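To make the relationship in the diagram concrete, here is a minimal sketch of the same structure as plain C# classes. All type and member names (PdfDocument, PdfField, PdfWidget, OnValue and so on) are hypothetical and exist only for illustration; they are not the object model of PDFKit.NET or Acrobat. The radio group is modelled here as one field with three widgets, which is an assumption about how the pizza form could be represented.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; not a real PDF library API.
class PdfPage
{
    public int Number { get; set; }
}

class PdfWidget
{
    public PdfPage Page { get; set; }     // each widget is associated with a page
    public string OnValue { get; set; }   // e.g. "small" for one radio button
}

class PdfField
{
    public string Name { get; set; }
    public string Value { get; set; }
    // A field has zero or more visual representations.
    public readonly List<PdfWidget> Widgets = new List<PdfWidget>();
}

class PdfDocument
{
    public readonly List<PdfPage> Pages = new List<PdfPage>();
    // Fields are defined at document level, not per page.
    public readonly List<PdfField> Fields = new List<PdfField>();
}

static class ModelDemo
{
    public static PdfDocument BuildPizzaForm()
    {
        var doc = new PdfDocument();
        var page = new PdfPage { Number = 1 };
        doc.Pages.Add(page);

        // One field named "size" with one widget per radio button.
        var size = new PdfField { Name = "size" };
        foreach (string option in new[] { "small", "medium", "large" })
        {
            size.Widgets.Add(new PdfWidget { Page = page, OnValue = option });
        }
        doc.Fields.Add(size);
        return doc;
    }
}

static class Program
{
    static void Main()
    {
        PdfDocument doc = ModelDemo.BuildPizzaForm();
        PdfField size = doc.Fields[0];
        Console.WriteLine(size.Name + " has " + size.Widgets.Count + " widgets");
    }
}
```

Compare this with the HTML form, where each input element is both the field and its appearance; here the value lives on the field, while position and look belong to the widgets.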

Throughout this article I will use the following PDF form:

example-pdf-form.png

I used Notepad to create the text part and then printed it to PDF. Next, I used Adobe Acrobat Pro DC to add the form elements. The size options "small", "medium" and "large" are radio buttons that belong to the same group, named "size". The ingredients are checkboxes with corresponding names. Finally, there is a text box named "name" and a button named "order".

Add a form submit button to the PDF

In order to submit a PDF form to a web endpoint, you need to add a button with a submit form action. You typically do this in Adobe Acrobat. Here is what the actions tab of the button properties dialog looks like after adding a submit form action:

add-pdf-form-submit-button.png

If you select the action and click the Edit button, you will see the available options for submitting form data:

submit-form-options.png

The selected export format is HTML. This will POST all form data to the specified URL when the button is clicked. Note that in contrast to an HTML form it is not possible to specify GET as the HTTP method. Later on we will see how to handle this request in an ASP.NET MVC application.
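With the HTML export format, the posted data should look like an ordinary HTML form post, that is, a URL-encoded body of name=value pairs. The following sketch parses such a body by hand, just to show what travels over the wire. The FormBody class and the sample body are my own assumptions for illustration; the exact names and values that Adobe Reader sends depend on the field names and export values in the form.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only. In an MVC app you would
// not parse the body yourself; the framework does this for you.
static class FormBody
{
    public static Dictionary<string, string> Parse(string body)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] pairs = body.Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string pair in pairs)
        {
            int eq = pair.IndexOf('=');
            string rawName = eq < 0 ? pair : pair.Substring(0, eq);
            string rawValue = eq < 0 ? "" : pair.Substring(eq + 1);
            result[Decode(rawName)] = Decode(rawValue);
        }
        return result;
    }

    // In URL-encoded form data a '+' stands for a space and
    // other special characters are percent-encoded.
    static string Decode(string s)
    {
        return Uri.UnescapeDataString(s.Replace('+', ' '));
    }
}

static class Program
{
    static void Main()
    {
        // Assumed sample body; not captured from a real submission.
        var form = FormBody.Parse("size=large&cheese=cheese&name=Mr+Frank");
        Console.WriteLine(form["size"] + " pizza for " + form["name"]);
        // prints: large pizza for Mr Frank
    }
}
```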

Open the PDF form

Let's open this form in the browser and see what happens: PDF pizza form demo.

As an implementation note, the form is located inside the Content folder of an MVC app and the action method looks like this:

    public class PizzaController : Controller
    {
        public ActionResult Form()
        {
            return File("~/Content/order-pizza.pdf", "application/pdf");
        }
    }

There is a good chance that your browser will render the PDF form itself instead of using the Adobe Reader plug-in. Google Chrome renders the PDF with its own built-in viewer and consequently breaks a great deal of PDF features, including submitting form data. Edge does the same thing. In fact, modern web browsers are dropping support for the NPAPI plug-in infrastructure on which the Adobe Reader plug-in relies. If you click the order button in the browser, nothing happens.

This is why Adobe made it possible to submit form data using the latest versions of Adobe Reader. Earlier versions of Adobe Reader did not allow this unless your document was Reader extended. (If you know exactly when this change entered Adobe Reader, please leave a comment. I tried to Google it, but without success.)

To get the full PDF experience when opening PDF documents or forms from the web, you must disable your browser's PDF viewer. Here are the steps for Google Chrome:

1. Browse to chrome://plugins

2. Click the disable link of the Chrome PDF Viewer

(Google for similar instructions for other browsers.)

If you now open the form in your browser using the same link, your default system PDF viewer (make sure it is Adobe Reader) opens the PDF outside the browser like this:

pdf-form-in-reader.png

Submit form data from Adobe Reader

Clicking the order button in Adobe Reader submits the form data to the endpoint http://www.tallcomponents.com/demos/pizza/order. Here is the ASP.NET MVC controller action that handles this request:

    public class PizzaController : Controller
    {
        [HttpPost]
        public ActionResult Order(Pizza pizza)
        {
            return View(pizza);
        }
    }

Model Pizza:

    public class Pizza
    {
        public string Size { get; set; }
        public string Tomatoes { get; set; }
        public string Onions { get; set; }
        public string Tuna { get; set; }
        public string Cheese { get; set; }
        public string Name { get; set; }
    }

View Order.cshtml:

    @model Pizza
    <h2>Hi @Model.Name!</h2>
    <p>
        Thanks for ordering a @Model.Size pizza.
        Tomatoes: @Model.Tomatoes.
        Onions: @Model.Onions.
        Tuna: @Model.Tuna.
        Cheese: @Model.Cheese.
    </p>

Note how MVC takes care of mapping form data to members of Pizza based on their names.
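To give an idea of what this name-based mapping involves, here is a deliberately naive sketch that uses reflection to copy form values to properties with the same name, ignoring case. NaiveBinder and PizzaModel are hypothetical names of my own; this is not how the MVC model binder is actually implemented, which also handles type conversion, validation and nested models.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Reduced copy of the Pizza model, renamed to keep this sketch self-contained.
class PizzaModel
{
    public string Size { get; set; }
    public string Cheese { get; set; }
    public string Name { get; set; }
}

// Hypothetical binder, for illustration only.
static class NaiveBinder
{
    public static T Bind<T>(IDictionary<string, string> form) where T : new()
    {
        var model = new T();
        foreach (KeyValuePair<string, string> entry in form)
        {
            // Field "size" matches property "Size" because case is ignored.
            PropertyInfo property = typeof(T).GetProperty(
                entry.Key,
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);

            // Unknown names and non-string properties are silently skipped.
            if (property != null && property.CanWrite
                && property.PropertyType == typeof(string))
            {
                property.SetValue(model, entry.Value, null);
            }
        }
        return model;
    }
}

static class Program
{
    static void Main()
    {
        var form = new Dictionary<string, string>
        {
            { "size", "large" },
            { "name", "Frank" },
            { "order", "" }        // the button has no matching property
        };
        PizzaModel pizza = NaiveBinder.Bind<PizzaModel>(form);
        Console.WriteLine(pizza.Name + " ordered a " + pizza.Size + " pizza");
        // prints: Frank ordered a large pizza
    }
}
```

Properties without a matching form value, such as Cheese in this example, simply keep their default value of null.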

After clicking the order button, the following dialog displays:

security-warning.png

After clicking Allow, Adobe Reader asks permission to open the response:

confirm-open-html.png

Apparently, Adobe Reader saves the response to a temporary location. After clicking Yes, the default browser displays the response:

response-in-pdf.png

This is as expected but far from a great user experience.

Return a PDF Response

The previous use case returned HTML as a response. Consequently, a browser instance opens and displays the HTML. Let's see what happens if we return the response as PDF.

I have created a second version of the order pizza form. The order button of this PDF submits the data to a second endpoint that returns a PDF response using PDFKit.NET, as follows:

    [HttpPost]
    public ActionResult Order2(Pizza pizza)
    {
        Document document = new Document();
        Page page = new Page(PageSize.Letter);
        document.Pages.Add(page);

        double margin = 72; // points
        MultilineTextShape text = new MultilineTextShape(
            margin, page.Height - margin, page.Width - 2 * margin);
        page.Overlay.Add(text);
        Fragment fragment = new Fragment(
            string.Format("Hi {0}!, thanks for ordering a {1} pizza!",
                pizza.Name, pizza.Size),
            Font.Helvetica,
            16);
        text.Fragments.Add(fragment);

        Response.ContentType = "application/pdf";
        Response.AppendHeader("Content-disposition", "attachment; filename=file.pdf");
        document.Write(Response.OutputStream);

        return null;
    }

If I now click the order button, a new instance opens showing the following response:

response-pdf.png

Return flattened PDF form

A form is said to be flattened if all fields have been replaced by non-editable graphics corresponding to the form data. Note that the fields have not just been disabled or made read-only but they have been removed entirely and replaced with non-interactive content.
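The idea can be sketched with a toy model: flattening copies what each widget currently draws into the static page content and then removes the fields. The types below (ToyForm, ToyField, ToyWidget, Flattener) are hypothetical and only mimic the concept; a real PDF library works on appearance streams and page content, not on strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy types; they only mimic the idea of flattening.
class ToyWidget
{
    public string Appearance { get; set; }   // what the widget currently draws
}

class ToyField
{
    public string Name { get; set; }
    public readonly List<ToyWidget> Widgets = new List<ToyWidget>();
}

class ToyForm
{
    // Non-interactive graphics on the page.
    public readonly List<string> PageContent = new List<string>();
    // Interactive fields, defined at document level.
    public readonly List<ToyField> Fields = new List<ToyField>();
}

static class Flattener
{
    public static void Flatten(ToyForm form)
    {
        // Step 1: turn every widget's current look into static content.
        foreach (ToyField field in form.Fields)
        {
            foreach (ToyWidget widget in field.Widgets)
            {
                form.PageContent.Add(widget.Appearance);
            }
        }
        // Step 2: remove the fields entirely, not just make them read-only.
        form.Fields.Clear();
    }
}

static class Program
{
    static void Main()
    {
        var form = new ToyForm();
        var name = new ToyField { Name = "name" };
        name.Widgets.Add(new ToyWidget { Appearance = "text box showing: Frank" });
        form.Fields.Add(name);

        Flattener.Flatten(form);
        Console.WriteLine(form.Fields.Count + " fields, "
            + form.PageContent.Count + " static graphics");
    }
}
```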

Let's see how we can return the flattened form as a response.

I have created a third version of the order pizza form.

The order button of this PDF submits the data to a third endpoint that uses PDFKit.NET to merge the submitted data with the original form and flattens the form as follows:

    [HttpPost]
    public ActionResult Order3(Pizza pizza)
    {
        using (FileStream file = new FileStream(
            Server.MapPath("~/Content/order-pizza3.pdf"),
            FileMode.Open, FileAccess.Read))
        {
            // import submitted data into original form
            Document document = new Document(file);
            FormData data = FormData.Create(System.Web.HttpContext.Current.Request);
            document.Import(data);

            // flatten form
            foreach (Field field in document.Fields)
            {
                foreach (Widget widget in field.Widgets)
                {
                    widget.Persistency = WidgetPersistency.Flatten;
                }
            }

            Response.ContentType = "application/pdf";
            Response.AppendHeader("Content-disposition", "inline; filename=file.pdf");
            document.Write(Response.OutputStream);

            return null;
        }
    }

If I now click the order button, a new instance opens showing the following response:

response-flat-pdf.png

The fields still look like fields, but they are actually graphics.

Download

Download the ASP.NET MVC project including PDF forms.