Extract images from PDF

Content extraction, Generate PDF, Shapes, Images
4/15/2011


This sample shows how to extract images from a PDF document.

Extract images from an existing PDF document

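This code sample illustrates how to iterate through the existing content of a PDF document and save each image found on every page as a new image file. Shapes can be nested inside shape collections, so the sample recurses into every collection it encounters.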

C# code sample

using System.IO;
// Also import the namespaces of the PDF library that provide
// Document, Page, Shape, ShapeCollection and ImageShape.

static void Main(string[] args)
{
    using (FileStream fileIn =
        new FileStream(@"..\..\..\inputdocuments/PackingLightBrochure.pdf", FileMode.Open, FileAccess.Read))
    {
        // Open the PDF document and cycle through all pages.
        Document document = new Document(fileIn);
        // Running image counter, shared by all pages so every file name is unique.
        int imageIndex = 0;

        foreach (Page page in document.Pages)
        {
            ShapeCollection shapes = page.CreateShapes();
            // Save all images found on this page.
            saveImageShapes(shapes, ref imageIndex);
        }
    }
}

static void saveImageShapes(ShapeCollection shapes, ref int imageIndex)
{
    foreach (Shape shape in shapes)
    {
        ImageShape imageShape = shape as ImageShape;

        if (imageShape != null)
        {
            // The current shape is an image: save it under the next free index.
            using (System.Drawing.Bitmap bitmap = imageShape.CreateBitmap())
            {
                bitmap.Save(string.Format(@"..\..\Image_{0}.png", imageIndex));
            }
            imageIndex++;
        }
        else
        {
            // If the current shape is a ShapeCollection, recurse into it.
            ShapeCollection shapeCollection = shape as ShapeCollection;
            if (shapeCollection != null)
            {
                saveImageShapes(shapeCollection, ref imageIndex);
            }
        }
    }
}
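
One way to give every extracted image its own file name is to thread a single counter through the recursion. The sketch below exercises that traversal without a PDF file. It rests on assumptions: the Shape, ImageShape and ShapeCollection classes defined here are hypothetical stand-ins, not the PDF library's types, and CollectImageNames is an illustrative helper that records file names instead of saving bitmaps.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-ins for the PDF library's shape types (illustration only).
class Shape { }
class ImageShape : Shape { }
class ShapeCollection : Shape, IEnumerable<Shape>
{
    private readonly List<Shape> items = new List<Shape>();
    public void Add(Shape shape) { items.Add(shape); }
    public IEnumerator<Shape> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class TraversalSketch
{
    // Depth-first walk that records one file name per image, at any nesting depth.
    // The counter is passed by reference so nested calls keep advancing it.
    public static void CollectImageNames(ShapeCollection shapes, ref int imageIndex, List<string> names)
    {
        foreach (Shape shape in shapes)
        {
            if (shape is ImageShape)
            {
                names.Add(string.Format("Image_{0}.png", imageIndex));
                imageIndex++;
            }
            else
            {
                ShapeCollection nested = shape as ShapeCollection;
                if (nested != null)
                {
                    CollectImageNames(nested, ref imageIndex, names);
                }
            }
        }
    }

    static void Main()
    {
        // One page: an image, a non-image shape, and a group holding two more images.
        ShapeCollection group = new ShapeCollection { new ImageShape(), new ImageShape() };
        ShapeCollection page = new ShapeCollection { new ImageShape(), new Shape(), group };

        int imageIndex = 0;
        List<string> names = new List<string>();
        CollectImageNames(page, ref imageIndex, names);

        // Prints: Image_0.png, Image_1.png, Image_2.png
        Console.WriteLine(string.Join(", ", names));
    }
}
```

Passing the counter by reference means an image inside a nested group never reuses an index already taken by a sibling or by an earlier page, so no file is overwritten.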

VB.NET code sample

Imports System.IO
' Also import the namespaces of the PDF library that provide
' Document, Page, Shape, ShapeCollection and ImageShape.

Sub Main()
    Using fileIn As New FileStream("..\..\..\inputdocuments/PackingLightBrochure.pdf", FileMode.Open, FileAccess.Read)
        ' Open the PDF document and cycle through all pages.
        Dim document As New Document(fileIn)
        ' Running image counter, shared by all pages so every file name is unique.
        Dim imageIndex As Integer = 0

        For Each page As Page In document.Pages
            Dim shapes As ShapeCollection = page.CreateShapes()
            ' Save all images found on this page.
            saveImageShapes(shapes, imageIndex)
        Next
    End Using
End Sub

Private Sub saveImageShapes(shapes As ShapeCollection, ByRef imageIndex As Integer)
    For Each shape As Shape In shapes
        Dim imageShape As ImageShape = TryCast(shape, ImageShape)

        If imageShape IsNot Nothing Then
            ' The current shape is an image: save it under the next free index.
            Using bitmap As System.Drawing.Bitmap = imageShape.CreateBitmap()
                bitmap.Save(String.Format("..\..\Image_{0}.png", imageIndex))
            End Using
            imageIndex += 1
        Else
            ' If the current shape is a ShapeCollection, recurse into it.
            Dim shapeCollection As ShapeCollection = TryCast(shape, ShapeCollection)
            If shapeCollection IsNot Nothing Then
                saveImageShapes(shapeCollection, imageIndex)
            End If
        End If
    Next
End Sub