Dissertation: Refactoring HTML and CSS with help from wkpdf
I’ve been thinking about how wkpdf could be useful when refactoring code in order to avoid inadvertently changing the appearance of a page.
Say you want to move some inline HTML styles into a CSS block. Since a PDF represents only the rendered content, and not the underlying structure, comparing PDFs of the before and after states should be a reliable way to confirm that the page layout hasn’t changed.
You’d start by creating a reference copy of the page before the refactoring.
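As a rough sketch, the reference copy could be produced from a small Python wrapper around wkpdf. The `--source` and `--output` flag names are an assumption about the wkpdf command line and should be checked against `wkpdf --help`. The helper names are hypothetical.

```python
import subprocess


def wkpdf_command(source, output):
    """Build the wkpdf argument list.

    The --source and --output flag names are assumptions; check them
    against your installed version of wkpdf.
    """
    return ["wkpdf", "--source", source, "--output", output]


def render_pdf(source, output):
    """Render `source` (a URL or local HTML file) to a PDF at `output`."""
    # check=True raises CalledProcessError if wkpdf exits with a non-zero status.
    subprocess.run(wkpdf_command(source, output), check=True)
```

Calling `render_pdf("page.html", "before.pdf")` before the refactoring would give the reference copy, where `page.html` is a placeholder for the page being refactored.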
You’d then carry out the refactoring and run the same command to create an after.pdf.
A quick visual check should show that both look identical, but the aim is to do this programmatically.
My first thought was to compare the MD5 hashes of the before and after pages:
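A minimal sketch of that check in Python, using only the standard library. The function names are hypothetical, and the file names follow the before.pdf and after.pdf convention used above.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_md5(first, second):
    """Return True when both files have the same MD5 digest."""
    return md5_of_file(first) == md5_of_file(second)
```

Here `same_md5("before.pdf", "after.pdf")` returns True only if the two files are byte-for-byte identical, which is a much stricter test than "renders the same".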
The hashes didn’t match, so it seems these PDF files must have some subtle difference.
I then tried converting each to images:
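One way to do the conversion, sketched here on the assumption that ImageMagick is installed and provides the `convert` command (newer releases name it `magick`). The tool choice, the 150 DPI density and the helper names are illustrative and not necessarily what was used originally.

```python
import subprocess


def convert_command(pdf_path, png_path, density=150):
    """Build an ImageMagick command that rasterises a PDF to PNG."""
    # -density sets the rendering resolution in DPI and is given before
    # the input file so that it applies when the PDF is read.
    return ["convert", "-density", str(density), pdf_path, png_path]


def pdf_to_png(pdf_path, png_path, density=150):
    """Rasterise `pdf_path` to `png_path` using ImageMagick."""
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(convert_command(pdf_path, png_path, density), check=True)
```

For a multi-page PDF, ImageMagick writes one numbered PNG per page, so each pair of pages would need comparing in turn.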
But I was surprised to see they still differ:
I looked for other ways to compare PDFs and found i-net PDFC.
That can be run from the command line as:
So that demonstrates that the page’s layout is unchanged.
I’m still thinking of how this would fit into a refactoring workflow but it feels like a useful start.