Removing PDFs from the web. One at a time.

As local authorities (LAs), we have a lot of PDFs on our websites. I would guess thousands per Welsh LA.

The Government Digital Service has done a lot of work in the past 10 years to remove PDFs from central government services. You can read their guidelines for publishing accessible documents on gov.uk. 

HTML is best

Put simply, HTML content is better, and all content we publish to local government websites should be in HTML.

If your documents do not meet accessibility standards, you could be breaking the Equality Act 2010.

Here are the reasons to use HTML, and why PDFs are a poor fit for the web:

  • Better experience across devices (PDFs are not responsive and will not change size to fit the browser)
  • Accessible to all, across assistive technologies, devices, browsers and software
  • Machine readable, which is good for SEO (so people can find the information easily), data extraction and reuse
  • HTML uses browser settings, so people with custom settings can access the content immediately
  • Futureproof
  • PDFs are not designed for reading on screens
  • PDFs are much harder to track, which makes it harder to use data to inform design and iterations
  • PDFs distract from the user experience. Depending on the user’s device and browser, PDFs might open in a new browser window, new tab or a separate app. Sometimes they automatically download to the user’s device. Whatever happens, the user is taken away from the website when they open a PDF. This is even more of an issue if the user goes directly to the PDF from a search engine. Without the context of the site the PDF is hosted on, they can’t easily browse to related content or search the website.
  • Although many devices and browsers have PDF viewers built in, and viewers are freely available to download, there are still users who do not have them or cannot download them.
  • Compared with HTML, it’s harder to update a PDF once it’s been created and published. PDFs are also less likely to be actively maintained, which can lead to broken links and users getting the wrong information.
  • It can be very difficult to reuse content from a PDF by copying and pasting it. The design and layout of the PDF can produce unexpected results, particularly if it has multiple columns, hasn’t been structured correctly, or uses incompatible fonts.

Action

The best advice is to delete your PDFs and replace them with web content (HTML).

It’s simpler to create HTML from an OpenDocument or Word document than it is to try to make a PDF accessible for all users.

Turn existing PDFs into: 

  • HTML if it is to be viewed
  • an OpenDocument if it is to be edited, such as a form (but still better to have an online form!)
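
Once a PDF has been turned into HTML, a couple of the basics can be checked automatically. The following is a minimal Python sketch, not a full accessibility audit. It uses only the standard library’s html.parser to flag images that have no alt attribute and links whose text makes no sense out of context. The names AccessibilityAudit and audit_html, and the list of vague phrases, are illustrative assumptions rather than part of any official tool.

```python
from html.parser import HTMLParser

# Hypothetical list of link phrases that mean nothing when read out of context.
VAGUE_LINK_TEXT = {"click here", "here", "read more", "more", "link"}


class AccessibilityAudit(HTMLParser):
    """Collects images without an alt attribute and links with vague text."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._in_link = False
        self._link_text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            # An empty alt="" is allowed: it marks an image as decorative.
            src = attributes.get("src", "(no src)")
            self.issues.append(f"image without alt attribute: {src}")
        elif tag == "a":
            self._in_link = True
            self._link_text = []

    def handle_data(self, data):
        if self._in_link:
            self._link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Collapse whitespace so "click   here" still matches.
            text = " ".join("".join(self._link_text).split())
            if text.lower() in VAGUE_LINK_TEXT:
                self.issues.append(f"vague link text: '{text}'")
            self._in_link = False


def audit_html(html):
    """Return a list of human-readable issues found in an HTML string."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    return parser.issues


if __name__ == "__main__":
    sample = (
        '<p><img src="map.png"> <a href="/bins">click here</a> '
        '<a href="/bins">Check your bin collection day</a></p>'
    )
    for issue in audit_html(sample):
        print(issue)
```

A check like this only catches what is missing. It cannot tell you whether the alt text that is present actually describes the image, so a human review is still needed.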

HTML is easier than making an accessible PDF but if you need to…

Simply saying to publish everything in HTML is far easier than actually doing it.

If you need to keep the PDF, you must publish an accessible version with it. If you do not, you may be breaking the law.

If you must continue to create and publish PDFs, then as an interim measure create an accessible Word document (following the accessibility guidelines for documents below) and save it as a PDF. Remember to add alternative text (alt text) to all images and check for accessibility once you’ve created the document.

How to make your documents accessible

  • Give the document a meaningful title.
  • Keep sentences and paragraphs short. Aim for around 25 words or fewer per sentence.
  • Use a sans serif font like Arial or Helvetica. Use a minimum size of 12 points.
  • Use sentence case. Avoid all caps text and italics.
  • Make sure the text is left aligned, not justified.
  • Avoid underlining, except for links.
  • Make sure link text clearly describes where the link goes. It should also be understandable on its own, even if read out of context. This is because some screen reader users list links on a page to find what they need quickly.
  • Documents with single continuous columns of text are easier to make accessible than documents with a complex layout.
  • Only use tables for data. Keep tables simple: avoid splitting or merging cells.
  • Do not use things like colour or shape alone to show meaning. Instructions like ‘click the big green button’ rely on the user to see the page and someone who is colour blind may not see the green button.
  • If you’re using images or charts, think about how you’ll make the content accessible to people with a visual impairment. Two options are:
    1. Make the same point in the text of the document (so people with visual impairments get the information they need – the image or chart is there as an extra for people who are able to see it)
    2. Give the person converting or uploading the document for you alt text (‘alternative text’) for the image or chart
  • Do not use images containing text, as it’s not possible to resize the text in the image and screen readers cannot read text which is part of an image.
  • Avoid footnotes where possible. Provide explanations inline instead.
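
Two of the guidelines above, sentence length and avoiding all caps, are easy to check by machine. The following is a rough Python sketch using only the standard library. The function names, the naive sentence splitter and the rule of flagging two or more consecutive capitalised words are all assumptions made for illustration. It will miss some cases and may flag a few legitimate ones, such as two acronyms side by side.

```python
import re

MAX_WORDS = 25  # the "around 25 words" guideline from the list above


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def check_text(text, max_words=MAX_WORDS):
    """Return a list of (problem, sentence) tuples for a block of plain text."""
    problems = []
    for sentence in split_sentences(text):
        words = sentence.split()
        if len(words) > max_words:
            problems.append((f"{len(words)} words", sentence))
        # Flag runs of two or more all-caps words in a row.
        # A single all-caps word is often just an acronym (PDF, HTML).
        if re.search(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})+\b", sentence):
            problems.append(("all caps", sentence))
    return problems


if __name__ == "__main__":
    draft = "PLEASE NOTE the office is closed on Friday. Download the HTML version."
    for problem, sentence in check_text(draft):
        print(f"{problem}: {sentence}")
```

This is a prompt for an editor to look again, not a replacement for reading the document with the guidelines in hand.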

How to identify accessible vs non-accessible PDFs

A Two-Minute Quick Check for a digital PDF document 

  1. Open the PDF document.
  2. Can you select and highlight the text?
  3. Can you have the text read aloud to you? Go to Adobe Reader’s View menu, select Read Out Loud and activate it, then select Read this page only or Read to the end of document, and listen to your document.
  4. If the document includes any images, photos or diagrams, have you provided alternative text or captions to explain their key message? A screen reader cannot convey an image if its alternative text is missing.

If the answers to questions 2 to 4 are all “yes”, your document is as accessible as it can be.

A toolkit for making your content better

Readability:

Browser testing:

Accessibility testing:

More information