Manipulating a PDF form with Json Data

This isn't a super common scenario, but it has come up several times over the years. A client has a PDF form that they want filled out with data from a database or another form. The result is usually a report or something that gets emailed around, so I will show you how to manipulate a PDF form. The first few times I built this (ten years ago), the approach was different from my last refactor, where I was able to take advantage of some Service Stack utils to make it super easy, along with Aspose for the programmatic PDF changes. Since these were all commercial projects, we had no problem purchasing tools to make it work (years ago).

First, let's set up the data. I am going to assume you know how to create a form, save data to a database, and serialize/deserialize it. So in this case, let's just assume we have an object we can convert to/from JSON. Something like this will do for this demonstration:

{
    "firstName": "Sam",
    "isHuman": true,
    "birthDate": "1985-01-01",
    "isHappy": false
}
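If you want a typed version of that object on the C# side, a minimal sketch could look like the following. The `PersonForm` class and `JsonRoundTrip` helper are hypothetical names for illustration, and I am assuming `System.Text.Json` here only because it ships with .NET; any serializer will do.

```csharp
using System;
using System.Text.Json;

// Hypothetical typed shape of the demo JSON
public class PersonForm
{
    public string FirstName { get; set; }
    public bool IsHuman { get; set; }
    public string BirthDate { get; set; }
    public bool IsHappy { get; set; }
}

public static class JsonRoundTrip
{
    // camelCase naming so property names line up with the JSON keys shown above
    static readonly JsonSerializerOptions Options = new JsonSerializerOptions
    {
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase,
        WriteIndented = true
    };

    public static string ToJson(PersonForm form) =>
        JsonSerializer.Serialize(form, Options);

    public static PersonForm FromJson(string json) =>
        JsonSerializer.Deserialize<PersonForm>(json, Options);
}
```

The rest of this post only needs the JSON text itself, so this step is optional.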

Now, to the fun part. For PDF manipulation and editing I like to use Nitro Pro. It is not expensive and works great. The following screenshots are from that software, but you can do the same in Adobe, etc. To link the JSON values to the PDF form fields, I name each form field so that it maps to a JSON value or expression.

Text Field Example:

[Screenshot: text field properties in Nitro Pro]

Checkbox True Example:

[Screenshot: checkbox field properties in Nitro Pro, true case]

Checkbox False Example:

[Screenshot: checkbox field properties in Nitro Pro, false case]

Lastly, let's add a text field that outputs a custom date:

[Screenshot: custom date text field properties in Nitro Pro]

The logic should be obvious by now. To bring it all together, we will use the Service Stack JS Utils and Aspose to loop over each field and set its value. Each PDF field name is an expression that gets evaluated against the JSON object we have. Lastly, I have defined a `toCustomDate` method to output the date in the format I want. Of course, there could be some error checking, etc., as well, and we can add as many custom methods as we need.
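To see the evaluation step on its own, without any PDF involved, here is a minimal sketch. The field names in the array are illustrative (each is just an expression over the `form` argument), and `DemoMethods` is a hypothetical stand-in for the custom methods class. I am also assuming the `ServiceStack` and `ServiceStack.Script` namespaces for `JS`, `JSON`, and `ScriptMethods`.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using ServiceStack;          // assumed home of JS and JSON
using ServiceStack.Script;   // assumed home of ScriptMethods

// Hypothetical stand-in for the custom methods class used in this post
public class DemoMethods : ScriptMethods
{
    public string toCustomDate(string text) =>
        DateTime.Parse(text, CultureInfo.InvariantCulture)
                .ToString("dd MMM yyyy", CultureInfo.InvariantCulture);
}

public static class FieldExpressionDemo
{
    // Evaluates one field name (an expression) against the parsed JSON, exposed as "form"
    public static object Evaluate(string json, string fieldName)
    {
        var scope = JS.CreateScope(
            args: new Dictionary<string, object> { { "form", JSON.parse(json) } },
            functions: new DemoMethods());
        return JS.eval(fieldName, scope);
    }

    public static void Main()
    {
        var json = "{\"firstName\":\"Sam\",\"isHuman\":true,"
                 + "\"birthDate\":\"1985-01-01\",\"isHappy\":false}";

        // Illustrative field names, not necessarily the exact ones in the PDF
        var fieldNames = new[]
        {
            "form.firstName",
            "form.isHuman",
            "form.isHappy",
            "form.birthDate.toCustomDate()",
        };

        foreach (var name in fieldNames)
            Console.WriteLine($"{name} => {Evaluate(json, name)}");
    }
}
```

In the real thing those strings are not hard-coded; they come from the field names in the PDF, as in the full method below.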

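    // Namespaces used: System, System.IO, System.Collections.Generic,
    // Aspose.Pdf.Facades, ServiceStack, ServiceStack.Script

    public void GeneratePdf(string pathToPdfFile, string pathToExport, string pathToJson) {
        PdfFileInfo file = new PdfFileInfo(pathToPdfFile);

        // The form facade works on the same document that gets saved below
        Aspose.Pdf.Facades.Form f = new Aspose.Pdf.Facades.Form(file.Document);

        // Parse the JSON and expose it to the field expressions as "form"
        var json = pathToJson.ReadAllText();
        var formParse = JSON.parse(json);

        Dictionary<string, object> args = new Dictionary<string, object>();
        args.Add("form", formParse);

        var scope = JS.CreateScope(
            args: args,
            functions: new CustomMethods());

        // Each field name is an expression; evaluate it and fill the field by type
        foreach (var n in f.FieldNames) {
            switch (f.GetFieldType(n))
            {
                case FieldType.CheckBox:
                    FillBoolField(f, n, scope); // helper method
                    break;
                case FieldType.Text:
                    FillTextField(f, n, scope); // helper method
                    break;
            }
        }

        // Flatten so the filled values become static page content
        file.Document.Flatten();
        using (var stream = new MemoryStream())
        {
            file.Save(stream);
            File.WriteAllBytes(pathToExport, stream.ToArray());
        }
    }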

    void FillBoolField(Aspose.Pdf.Facades.Form f, string n, ScriptScopeContext scope)
    {
        try
        {
            // The field name itself is the expression to evaluate
            var b = JS.eval(n, scope);

            if (b != null)
            {
                // Convert once and reuse it; a direct (bool) cast throws if b is not a boxed bool
                bool bb = b.ConvertTo<bool>();
                Console.WriteLine($"{n} - {bb}");
                f.FillField(n, bb);
            }
            else
            {
                Console.WriteLine($"{n} didn't parse");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("exception: " + ex.Message);
        }
    }


    void FillTextField(Aspose.Pdf.Facades.Form f, string n, ScriptScopeContext scope)
    {
        try
        {
            // The field name itself is the expression to evaluate
            var b = JS.eval(n, scope);

            if (b != null)
            {
                f.FillField(n, b.ToString());
            }
            else
            {
                Console.WriteLine($"{n} didn't parse");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("exception: " + ex.Message);
        }
    }

    // Custom methods callable from the field-name expressions
    public class CustomMethods : ScriptMethods
    {
        public string toCustomString(object text) => text != null ? text.ToString() : string.Empty;

        public string toCustomDate(string text)
        {
            return DateTime.Parse(text).ToString("dd MMM yyyy");
        }
    }
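On the error-checking point: if a date value might be missing or malformed, a more defensive variant of the date helper could look like the sketch below. `DateFormatting`, `CustomDateFormat`, and `ToCustomDate` are hypothetical names, and pinning the culture to `InvariantCulture` is my assumption so the month abbreviation does not change with the machine's locale.

```csharp
using System;
using System.Globalization;

public static class DateFormatting
{
    // Single place to change the output format for every date field
    public const string CustomDateFormat = "dd MMM yyyy";

    // Returns an empty string instead of throwing when the value is missing or unparseable
    public static string ToCustomDate(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            return string.Empty;

        return DateTime.TryParse(text, CultureInfo.InvariantCulture, DateTimeStyles.None, out var date)
            ? date.ToString(CustomDateFormat, CultureInfo.InvariantCulture)
            : string.Empty;
    }
}
```

With this, `DateFormatting.ToCustomDate("1985-01-01")` returns `01 Jan 1985`, and an unparseable value such as `"not a date"` returns an empty string, so the PDF field is simply left blank. The same body could be used inside `toCustomDate` in `CustomMethods`.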

That is really the extent of the code required to populate a PDF form. The output of the small demo looks like the following. The last checkbox is form.isHappy, which is false because I am not a machine, so it isn't checked.

[Screenshot: the filled-in PDF form]

To cap this off, there are several options and variants for doing this. For instance, the Service Stack utility methods already come with a date format method, `form.birthDate.dateFormat('dd MMM yyyy')`, so I could have used that. However, if you have 20 different date fields that use the same format, I recommend wrapping it in your own method so you can easily change it later. This approach can also easily be extended to add signatures and images. I bet you didn't know how easy and fun working with PDF files could be! If anyone is interested in the full code set, I can provide a LINQPad project; just drop me a line in the comments.