Your slide deck is a zip file in disguise

Show me!

Want to see for yourself? Open a .pptx file in your favorite text editor (here’s a sample file) and look at the first 4 bytes. In hexadecimal notation, a zip file typically starts with “50 4b 03 04”; the first two bytes are the letters “PK”, which stand for Phil Katz, who developed the zip format.
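
If you would rather not squint at binary data in a text editor, the same check can be sketched in the shell. This assumes the standard od and xargs utilities are available; firstFourBytes is a made-up helper name, and sample.pptx stands in for whichever deck you are inspecting.

```shell
# firstFourBytes is a hypothetical helper, not part of any tool:
# print the first four bytes of a file as space-separated hex pairs.
firstFourBytes() {
    # -An: no offset column, -tx1: one hex byte per item, -N4: read 4 bytes only.
    # xargs collapses the padding that od puts around its output.
    od -An -tx1 -N4 "$1" | xargs
}
```

Running firstFourBytes sample.pptx on a regular deck should print 50 4b 03 04.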

I need more proof

Since filename extensions are really just suggestions to your operating system, rename the file from sample.pptx to sample.zip (you may need to enable the display of file extensions in your OS before it lets you do this). You can then unzip it like any other archive.
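
On the command line, that step might look like the sketch below. It assumes the unzip utility is installed; unpackDeck is a made-up helper name, and it copies the deck instead of renaming it so the original stays intact.

```shell
# unpackDeck is a hypothetical helper: give the deck a .zip name, list it, extract it.
unpackDeck() {
    local DECK="$1"              # expected to end in .pptx, e.g. sample.pptx
    local BASE="${DECK%.pptx}"   # the name without the extension, e.g. sample
    local OUT="${2:-$BASE}"      # output folder, defaults to sample/
    cp "$DECK" "$BASE.zip"       # copy rather than rename, so the original stays put
    unzip -l "$BASE.zip"         # list the archive's contents
    unzip -q -o "$BASE.zip" -d "$OUT"   # extract quietly, overwriting earlier runs
}
```

Strictly speaking, unzip does not look at the extension, so unzip -l sample.pptx works too; the copy is only there to mirror the rename described above.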

Neat. But how is this useful?

Before discarding this as a mere party trick, it’s worth thinking through why this can be powerful:

Use Case 1: Extracting all images from a slide deck

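After unzipping your PowerPoint file, you may have noticed a folder called ppt/media/. It holds the images embedded in the deck, so copying that folder out gives you every image in one go.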
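
To grab only that folder without unpacking the whole deck, something like the following should do. It is a minimal sketch that assumes the unzip utility is installed; extractImages and the default folder name images are made up for illustration.

```shell
# extractImages is a hypothetical helper: copy everything under ppt/media/
# out of a deck and into a flat folder.
extractImages() {
    local DECK="$1"            # path to the .pptx file
    local OUT="${2:-images}"   # destination folder
    mkdir -p "$OUT"
    # -j drops the ppt/media/ prefix, -o overwrites without prompting.
    # The quotes keep the shell from expanding the wildcard itself.
    unzip -q -j -o "$DECK" 'ppt/media/*' -d "$OUT"
}
```

Usage would be extractImages sample.pptx ~/Desktop/images. If the deck contains no media at all, unzip reports that nothing matched the pattern.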

Use Case 2: Extracting notes from each slide

Another use of this trick is to extract all the notes in your PowerPoint presentation. As it turns out, slide notes are stored in separate XML files in the folder ppt/notesSlides:

function extractSlideNotes()
{
    local i=0
    local DIR="${1:-.}"   # Default is current folder
    local FILE NOTES
    # XML path to slide notes: <p:txBody> -> <a:p> -> <a:r> -> <a:t> (see notesSlide*.xml files)
    local XPATH="//*[local-name()='txBody']/*[local-name()='p']/*[local-name()='r']/*[local-name()='t']/text()"
    # Loop through each notes file (the glob sorts names as text, so notesSlide10.xml comes before notesSlide2.xml)
    for FILE in "${DIR}"/ppt/notesSlides/*.xml
    do
        (( i += 1 ))
        # Fetch the notes, keeping them on one line; xmllint complains on stderr when a file has no note text
        NOTES=$(xmllint --xpath "${XPATH}" "${FILE}" 2>/dev/null | tr '\n' ' ')
        # Output the counter and the notes, separated by a tab
        printf '%s\t%s\n' "${i}" "${NOTES}"
    done
}

$ extractSlideNotes ~/Desktop/sample/
1 Notes on slide 1
2 Notes on slide 2
3 Notes on slide 3
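
One caveat for larger decks: the glob in the loop sorts file names as text, so notesSlide10.xml is visited before notesSlide2.xml and the counter drifts away from the file numbers. A possible workaround is to sort on the number embedded in each name. The sketch below assumes the files follow the notesSlide<number>.xml naming seen in the unzipped sample; notesFilesInOrder is a made-up helper name.

```shell
# notesFilesInOrder is a hypothetical helper: print the notes files of an
# unzipped deck, ordered by the number in their file name.
notesFilesInOrder() {
    local DIR="${1:-.}"   # folder the deck was unzipped into
    local FILE NUM
    for FILE in "${DIR}"/ppt/notesSlides/notesSlide*.xml
    do
        [ -e "${FILE}" ] || continue   # the glob matched nothing
        NUM="${FILE##*notesSlide}"     # e.g. "10.xml"
        NUM="${NUM%.xml}"              # e.g. "10"
        printf '%s\t%s\n' "${NUM}" "${FILE}"
    done | sort -n | cut -f2-          # sort on the number, then drop it again
}
```

The output is one path per line, so it can feed a while read loop in place of the glob.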

What about adding slides?

If, like me, you often find yourself demoing your software tools in PowerPoint, you may be wondering whether you can automate inserting hundreds of new slides, each containing one screenshot taken from a specified folder. Doing that manually can be painful (although the Windows version of PowerPoint seems to let you do Insert -> Photo Album).

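Doing this by editing the unzipped files by hand is possible but fiddly. Adding a fourth slide, for example, means creating or updating all of the following:

  • [Content_Types].xml
  • ppt/presentation.xml
  • ppt/_rels/presentation.xml.rels
  • ppt/slides/slide4.xml
  • ppt/slides/_rels/slide4.xml.rels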
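
To see some of that bookkeeping in an existing deck, you can search the unzipped folder for mentions of a slide's file name. This is a rough sketch that assumes a grep supporting -r (GNU and BSD grep both do); findSlideReferences is a made-up helper name.

```shell
# findSlideReferences is a hypothetical helper: list the files in an unzipped
# deck that mention a given slide by file name.
findSlideReferences() {
    local DIR="${1:-.}"   # folder the deck was unzipped into
    local N="${2:-1}"     # slide number to look for
    # -r: recurse into subfolders, -l: print matching file names only,
    # -F: treat the pattern as a literal string instead of a regular expression
    grep -rlF "slides/slide${N}.xml" "${DIR}"
}
```

Usage would be findSlideReferences ~/Desktop/sample/ 3. Note that ppt/presentation.xml points at slides through relationship ids, not file names, so it will not show up in the results even though it also needs an entry for each slide.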
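
An alternative that avoids editing XML by hand is a short VBA macro run from within PowerPoint. This one appends five slides, each with one image:

Sub Test()
    Dim pptSlide As Slide, pptLayout As CustomLayout
    Dim Index As Integer, slideID As Integer

    ' Reuse the layout of the first slide for the new slides
    Set pptLayout = ActivePresentation.Slides(1).CustomLayout

    For Index = 1 To 5
        ' Position of the new slide: one past the current last slide
        slideID = ActivePresentation.Slides.Count + 1

        ' Create new slide
        Set pptSlide = ActivePresentation.Slides.AddSlide(slideID, pptLayout)

        ' Add a new image to that slide from /Users/robert/Desktop/images/figure*.png
        ' (the path is written here with colons as separators)
        pptSlide.Shapes.AddPicture FileName:="Users:robert:Desktop:images:figure" & Index & ".png", LinkToFile:=msoTrue, SaveWithDocument:=msoTrue, Left:=0, Top:=100
    Next
End Sub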

It’s a wrap

This concludes our exploration into the wonders hiding behind the .pptx file. If you have ideas about how this trick can be used for other practical purposes, feel free to share them below.

Robert Aboukhalil

Bioinformatics Software Engineer, Author of Level up with WebAssembly book.