How do I extract page destinations from bookmarks?


This article explains how one can determine which page a particular bookmark refers to.

The PDFKit.NET Bookmark class has no obvious property that holds a page destination. The reason is that a PDF bookmark is more than a pointer to a page: it is an entity that can have various actions associated with it, which run when the bookmark is clicked. Such an action may jump to a particular page, but it may also execute JavaScript code, for example. A Bookmark can have a sequence of such actions associated with it, and all of them are executed in order when someone clicks on it.

This means that one will need to inspect the actions in order to determine whether a Bookmark actually jumps to a page, and if so, which one.

Another complicating factor is that there are multiple ways to perform such a jump. The code below takes into account internal destinations and named destinations, which are the most common cases. It is also possible to jump to a particular page via a JavaScript action. This is highly unusual, though, and rather complex to detect, so we ignore that possibility below.

    private void listBookmarks(BookmarkCollection collection)
    {
        foreach (Bookmark bookmark in collection)
        {
            System.Diagnostics.Trace.WriteLine("bookmark: " + bookmark.Title);

            foreach (Action action in bookmark.Actions)
            {
                GoToAction gotoAction = action as GoToAction;
                if (gotoAction != null)
                {
                    // Arbitrary actions can be attached to a bookmark, but for references inside
                    // a document this is normally a GoToAction with either an InternalDestination
                    // or a NamedDestination. The position of an internal destination may only
                    // be partially defined. In that case, the Top, Left, Bottom, and
                    // Right values may contain NaN values (Not-A-Number).
                    //
                    // Please note that it is also possible to attach JavaScript to a bookmark and
                    // that this may execute code that goes to a particular page. This is quite
                    // unusual though.

                    InternalDestination internalDestination = gotoAction.Destination as InternalDestination;
                    if (internalDestination == null)
                    {
                        NamedDestination namedDestination = gotoAction.Destination as NamedDestination;
                        if (namedDestination != null)
                        {
                            // Resolve the name to an internal destination. Here, 'document' is the
                            // Document that owns these bookmarks (for example, a field of this class).
                            internalDestination = document.NamedDestinations[namedDestination.Name];
                        }
                    }

                    if (internalDestination != null)
                    {
                        Console.WriteLine(
                            string.Format("internal -> page {0}, position ({1},{2})",
                                internalDestination.Page.Index, internalDestination.Left, internalDestination.Top));
                    }
                }
            }

            // List sub-bookmarks.
            listBookmarks(bookmark.Bookmarks);
        }
    }

Please also note that an internal destination may have undefined values (NaN) for its destination rectangle. Some destinations have no valid rectangle at all and refer only to a page index.
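
Because NaN never compares equal to anything, including itself, a check such as `left == double.NaN` is always false. The following is a minimal sketch of how such coordinates could be reported safely. It assumes the coordinates are of type `double`; the `DestinationFormatter` class and its `Describe` method are hypothetical helpers for illustration and are not part of PDFKit.NET.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of PDFKit.NET.
public static class DestinationFormatter
{
    // Formats a page index and a position whose coordinates may be NaN.
    public static string Describe(int pageIndex, double left, double top)
    {
        return string.Format("page {0}, left: {1}, top: {2}",
            pageIndex, FormatCoordinate(left), FormatCoordinate(top));
    }

    // Uses double.IsNaN because comparing with == double.NaN is always false.
    private static string FormatCoordinate(double value)
    {
        return double.IsNaN(value)
            ? "unspecified"
            : value.ToString(CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // A fully defined position.
        Console.WriteLine(Describe(3, 72, 700.5));

        // A destination that refers only to a page, with no valid position.
        Console.WriteLine(Describe(5, double.NaN, double.NaN));
    }
}
```

In the listing above, the values passed to such a helper would be `internalDestination.Page.Index`, `internalDestination.Left`, and `internalDestination.Top`.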