Government publications at risk: Gaps in the collection and preservation of Ontario government publications, by Simone O'Byrne

Simone O'Byrne is a Library and Information Specialist at the Ontario Ministry of the Environment, Conservation and Parks. She serves as a Director on the Ontario Government Libraries Council, and is Chair of that organisation's Working Group on Ontario Government Publications. Simone can be contacted by email at simone.obyrne@ontario.ca

The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of the Ontario government.






Digital publishing has proved to be game changing in jurisdictions around the world. Online publications are hard to track, can change without notice, disappear, and often lack adequate metadata. Collecting government publications is challenging; in Ontario, there are concerns that many documents will either be lost or become inaccessible over time.

Canada comprises ten provinces and three territories, each with a sub-national government. Ontario is the most populous province, with around 13.5 million people. The Ontario Public Service has around 60,000 full-time employees and 20 libraries or information centres. Not every Ministry, Agency, Board or Commission has a library.

Until 2012, Library and Archives Canada (LAC) maintained a legal deposit program that included publications from all levels of government. In 2012, provincial and territorial government publications were specifically exempted from LAC’s legal deposit mandate.

Only six sub-national jurisdictions within Canada have a legislated mandate to collect and provide long-term access to their own publications: one is a Provincial Library, one is a Provincial Archives and four are Legislative Libraries. Ontario is not one of the six. With no legal deposit, and with “published works” specifically excluded under Ontario’s Archives and Recordkeeping Act [1], our publications are clearly at risk.

2012 was a pivotal year in other ways too. The Ontario government launched a policy of moving away from print towards digital-only publishing. Individual ministry web sites began to be consolidated into a single site (www.ontario.ca), with an emphasis on providing “current” information in plain English and a requirement to comply fully with the Accessibility for Ontarians with Disabilities Act [2]. Cabinet Office designated HTML as the accessible format of choice, and alternate non-compliant file formats were no longer permitted. As a result, hundreds of non-compliant PDF files were not migrated to the new site, and many simply disappeared. No support was available for remediating existing files (usually PDF) or for creating new files that complied with the accessibility legislation.

The Ministry of the Environment, Conservation and Parks, where I work, published an average of 400 reports annually from the mid-1980s to the mid-1990s. By the early 2000s the number of print and electronic publications had dropped dramatically; 2013 saw only 57 electronic (PDF) publications, with just one title produced in print. For librarians, relying on the PDF file format as a shortcut to collection was no longer effective.

HTML-only publishing introduces a new set of challenges. It is virtually impossible to download an HTML page as a digital object or to print it with format integrity [see sidebar case study]. It is also increasingly difficult to define what constitutes a publication. Annual reports tend to retain their publication type in their title and remain easily identifiable, but guidance documents, fact sheets, and research or scientific reports are now web pages, undifferentiated from splash pages and ephemera. Insufficient metadata makes discovery difficult, and content is increasingly generated on the fly. Publication is now a relatively obsolete term for describing what our libraries wish to collect, but for now we persist in using it. New descriptive terms such as ‘artifacts’, ‘content pieces’, ‘content objects’ and ‘content objects meaningfully rendered’ are somehow inadequate.

Government publications should be easy to find, easy to use, easy to share and available for the long term. They are primary source material, and preserving point-in-time information is critical to ensuring accountability and maintaining the historical record.

What is being done in Ontario?

Legislative Assembly of Ontario (also known as Queen’s Park, Ontario Parliament and others, but properly Legislative Assembly of Ontario)

The Ontario Government Libraries Council membership includes librarians and information professionals working in Ministries, Agencies, Boards, Commissions, the Archives of Ontario, and the Ontario Legislative Library.  The Council formed a Working Group on Ontario Government Publications including representatives from the Queen’s Printer and Open Government offices to explore how policy and logistical issues might be addressed.

The Working Group established a broader community of practice by holding roundtable events with representatives from government, university and public libraries, and academic and not-for-profit government document repositories. Just over a year ago Ontario’s then Secretary of the Cabinet, Steve Orsini, spoke at one of the Roundtables. Following an animated discussion he requested a full briefing on the gaps in the collection and preservation of Ontario government publications.

The briefing materials opened with a quote from Adam Farquhar, Head of Digital Scholarship at The British Library:

“If we’re not careful, we will know more about the beginning of the 20th century than the beginning of the 21st century” [3]

During the briefing we discussed four main gaps and challenges:

  1. No public or private organization has a mandate to collect and provide long-term access to Ontario’s print and digital publications
  2. Downloadable and printable file formats are rarely available
  3. Many scientific, technical and older publications lack an online home
  4. University libraries face significant licensing and copyright hurdles to collecting and sharing digital content

The briefing was well received, with notable support from the Secretary. As a result, a committee was formed with senior executives from the Archives of Ontario, Open Government, Queen’s Printer and Digital offices.

The internal guidance document Government of Ontario websites - Recordkeeping and Archival Requirements [4] was issued by the Archivist of Ontario shortly thereafter, confirming that government websites would now be treated as records. The document outlines the triggers that dictate when web archives should be captured, such as prior to a change in government (as happened a few months later). Informally, we have been assured that archives of public-facing sites will be fully indexed and posted publicly over the long term.

The promise of web archiving is a welcome step towards comprehensive collection of, and access to, Ontario government publications. With time, we may be able to rely on a robust web archive similar to the UK Government Web Archive (nationalarchives.gov.uk/webarchive). However, the technology and business practices for web archiving in Ontario are still under development with no clear implementation date, and publications are still being lost. Also, web archiving captures whole web sites rather than individual web pages, so following a collection development policy is not practicable.

Compliance with accessibility legislation remains a considerable hurdle. Curated collection is thwarted by the absence of downloadable, printable file formats that maintain the integrity of the original formatting. The Working Group’s focus is now on advocating for additional file formats to complement the accessible HTML content.

Page capture tools like WebCite (webcitation.org), archive.today (archive.is) and the Internet Archive’s Wayback Machine and Archive-It tools (archive.org/web) were investigated, but each had limitations. Webrecorder (webrecorder.io) and Perma.cc (perma.cc) come the closest to meeting our requirements. Perma.cc is a service built by Harvard’s Library Innovation Lab. It is simple to use, captures a date- and time-stamped, unalterable copy of a web page, and hosts it permanently. The copy maintains formatting and compliance with our accessibility requirements, but the banner it imposes at the top does not (see a test page at https://perma.cc/7NMZ-8T7B).

The WARC (Web ARChive) format merits further research; however, the need for a WARC viewer/player is an extra hurdle to overcome. Archivematica (archivematica.org), an open source application, is also under review. If suitable, it would depend on the involvement of developers for customization, and funding would need to be approved.
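
To illustrate what a date- and time-stamped, unalterable capture implies in practice, the following is a minimal Python sketch of a fixity record: a timestamp plus a SHA-256 checksum stored alongside a captured page, so that any later change to the copy can be detected. It is an illustration only and does not describe how Perma.cc, Webrecorder or any other tool named above actually works. The names CaptureRecord, record_capture and is_unaltered, and the example URL, are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CaptureRecord:
    """Hypothetical fixity record kept alongside a captured page."""
    url: str
    captured_at: str   # ISO 8601 timestamp in UTC
    sha256: str        # checksum of the captured bytes
    size_bytes: int


def record_capture(url: str, content: bytes,
                   when: Optional[datetime] = None) -> CaptureRecord:
    """Describe a capture: when it was taken and a checksum of its bytes."""
    # Default to the current time; a naive datetime is treated as local time.
    when = when or datetime.now(timezone.utc)
    return CaptureRecord(
        url=url,
        captured_at=when.astimezone(timezone.utc).isoformat(timespec="seconds"),
        sha256=hashlib.sha256(content).hexdigest(),
        size_bytes=len(content),
    )


def is_unaltered(record: CaptureRecord, content: bytes) -> bool:
    """Return True if the stored copy still matches the recorded checksum."""
    return hashlib.sha256(content).hexdigest() == record.sha256


if __name__ == "__main__":
    # Assumed example content; in practice this would be the saved page bytes.
    page = b"<html><body>Annual report</body></html>"
    rec = record_capture("https://example.org/report", page)
    print(rec.captured_at, rec.sha256[:12], is_unaltered(rec, page))
```

A checksum of this kind only shows that a copy has not changed since capture; it does not address rendering, hosting or accessibility, which are the harder requirements discussed above.
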
I’m still optimistic that there is a relatively simple solution. I would appreciate any comments, suggestions or questions. Please contact me at simone.obyrne@ontario.ca

Sidebar / case study

For many years Ontario’s Chief Drinking Water Inspector’s annual reports were published both in print and as Adobe PDF files. From 2015 onward, the report was only published as HTML.


The 2017-2018 Chief Drinking Water Inspector Annual Report [5] in HTML

The report was published as an HTML page that meets all of Ontario’s accessibility legislation requirements.



The same page “printed to PDF”

Printing to PDF or saving as PDF ignores the contents/navigation section, and results in loss of formatting. In addition, the file is no longer accessible.



“Save page as” Webpage, Complete

“Save page as” Webpage, Complete (*.htm; *.html) results in an HTML file that cannot be displayed without the associated folder of 28 files. Different browsers or versions of browsers will render the page differently, and sometimes not at all.
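
The dependency on an associated folder can be made visible with a short script. The following is a minimal Python sketch, using only the standard library, that lists the files a saved HTML page refers to (images, scripts, stylesheets and similar). The class and function names and the sample markup are hypothetical; the sample is not taken from the actual report page, and the set of tags checked is a simplification.

```python
from html.parser import HTMLParser


class ResourceLister(HTMLParser):
    """Collect references to other files from an HTML document."""

    # Tag -> attribute that points at another file. This is a simplification:
    # every <link href> is counted, and references inside CSS are not followed.
    RESOURCE_ATTRS = {
        "img": "src",
        "script": "src",
        "link": "href",
        "source": "src",
        "iframe": "src",
    }

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append(value)


def list_resources(html_text):
    """Return the referenced files in the order they appear in the page."""
    parser = ResourceLister()
    parser.feed(html_text)
    parser.close()
    return parser.resources


if __name__ == "__main__":
    # Assumed sample markup standing in for a page saved as "Webpage, Complete".
    sample = (
        '<html><head><link rel="stylesheet" href="report_files/site.css">'
        '<script src="report_files/app.js"></script></head>'
        '<body><img src="report_files/logo.png" alt="logo"></body></html>'
    )
    for path in list_resources(sample):
        print(path)
```

Every file listed this way has to travel with the HTML file for the saved copy to display as intended, which is why a single downloadable format is easier to collect.
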



[1] Archives and Recordkeeping Act, SO 2006, c 34, Sch A. https://www.ontario.ca/laws/statute/06a34
[2] Accessibility for Ontarians with Disabilities Act, SO 2005, c 11. https://www.ontario.ca/laws/statute/05a11
[3] “History flushed: Digital archiving”, Economist, 28 April 2012, http://www.economist.com/international/2012/04/28/history-flushed
[4] Ontario, Information, Privacy and Archives Portfolio Management Office, Government of Ontario websites - Recordkeeping and Archival Requirements (Toronto, ON: Ministry of Government and Consumer Services, 2018), unpublished internal document.
[5] Ontario, Ministry of the Environment, Conservation and Parks, 2017-2018 Chief Drinking Water Inspector Annual Report (Toronto, ON: Ministry of the Environment, Conservation and Parks, 2018). www.ontario.ca/page/2017-2018-chief-drinking-water-inspector-annual-report.
