Datasets

Saved by Daniel Pett
on January 19, 2011 at 3:13:05 pm
 

Added more PAS data and feeds to the list for people to hack around with.

OS OpenData

 

http://ckan.net/package/ordnance_survey

 

You can download lots of Ordnance Survey vector and raster mapping datasets for free from the OS OpenData site, including a digital terrain model and a place name gazetteer. As well as maps, it also includes postcode and administrative boundary data.

 

Data comes in TIFF, ESRI Shape Files, CSV, DXF and ASCII text formats.

 

CKAN - http://ckan.net

 

http://ckan.net (Comprehensive Knowledge Archive Network) already lists a good number of datasets and APIs in relation to historical (and related) material. It's an open wiki-like registry so you can also amend entries there.

 

Many of the datasets and APIs listed on this wiki already have a page there and it is easy to use a bespoke tag (e.g. history-hackday-2011) to create a simple shared list. Some useful queries:

 

http://ckan.net/package?tags=history

http://ckan.net/package?q=history
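
Both query forms can also be built in code. This is a minimal sketch that assumes only the two URL patterns shown above; the function name `ckan_search_url` is hypothetical, combining `tags` and `q` in one URL is an untested assumption, and nothing here contacts the CKAN site.

```python
from urllib.parse import urlencode

# Package listing URL used by the two example queries above.
CKAN_BASE = "http://ckan.net/package"


def ckan_search_url(tags=None, q=None):
    """Build a CKAN package listing URL filtered by tag and/or free text.

    Hypothetical helper: it only assembles the URL and does not fetch it.
    Passing both arguments at once is an assumption, not something the
    wiki text confirms.
    """
    params = {}
    if tags:
        params["tags"] = tags
    if q:
        params["q"] = q
    if not params:
        return CKAN_BASE
    return CKAN_BASE + "?" + urlencode(params)


print(ckan_search_url(tags="history"))
print(ckan_search_url(q="history"))
print(ckan_search_url(tags="history-hackday-2011"))
```

The last call shows how a bespoke tag such as history-hackday-2011 would give a shared list.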

 

Open Plaques

 

http://ckan.net/package/open-plaques

 

Open Plaques is an open source project (both code and data) with data on over three and a half thousand commemorative plaques, and new plaques are being added regularly. Where available, the data for each plaque includes the:

 

  • Inscription
  • Person being commemorated
  • Geo location
  • Address / country
  • Organisation that created the plaque (around 250 so far) 
  • Colour
  • Roles (e.g. author, inventor, actor, prime minister, namer of clouds etc.)
  • Verbs (e.g. lived, worked, died, visited, built, founded) that link the person to the location
  • Link to creative commons photo of plaque on Flickr

 

Most pages on the site also have .xml, .json and .kml views; just add the extension to the URL.

Another very useful query is the 'box' query. Specify two geopoints, top-left and bottom-right, and all the plaques within that bounding box are returned, e.g. http://openplaques.org/plaques.kml?box=[51.5482,-0.1617],[51.5282,-0.1217] or http://openplaques.org/plaques.json?box=[51.5482,-0.1617],[51.5282,-0.1217]

 

A KML view can be piped directly into Google Maps, e.g. http://maps.google.co.uk/maps?q=http://openplaques.org/plaques.kml?box=[51.5482,-0.1617],[51.5282,-0.1217]
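
The box query URLs above follow a simple pattern, which could be sketched roughly as below. The helper names `box_query_url` and `google_maps_url` are hypothetical, and the code only builds strings in the same form as the examples; it does not call the site. The square brackets are left unencoded to match the examples, although some HTTP clients may want them percent-encoded.

```python
# Base of the plaques listing used in the example URLs above.
OPENPLAQUES_BASE = "http://openplaques.org/plaques"

# Extensions the wiki text says can be appended to page URLs.
FORMATS = ("xml", "json", "kml")


def box_query_url(top_left, bottom_right, fmt="json"):
    """Return the plaques URL for a bounding box.

    top_left and bottom_right are (latitude, longitude) pairs.
    Hypothetical helper that mirrors the URL shape in the examples.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    (lat1, lon1), (lat2, lon2) = top_left, bottom_right
    return f"{OPENPLAQUES_BASE}.{fmt}?box=[{lat1},{lon1}],[{lat2},{lon2}]"


def google_maps_url(kml_url):
    """Wrap a KML URL in the Google Maps query form shown above."""
    return "http://maps.google.co.uk/maps?q=" + kml_url


top_left = (51.5482, -0.1617)
bottom_right = (51.5282, -0.1217)

print(box_query_url(top_left, bottom_right, "json"))
print(google_maps_url(box_query_url(top_left, bottom_right, "kml")))
```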

 

The API is far from complete, but it is usable as we have two main users: Flickr and the Open Plaques iPhone app. Simon, Jez and Tom should be at the hack day, so chat with them if you have any thoughts.

 

Tom Morris has recently been doing some work to add RDF data so that records can be linked to Freebase, etc.

 

Open Economics

 

http://ckan.net/package/open-economics

 

Facebook

Facebook's Graph API gives you access to the entire Facebook history of a user and that user's friends: photos, checkins, likes and wall posts, each with a timestamp. It's not ancient history, but increasingly Facebook contains a large facet of a user's personal online history. Get started with the APIs at http://developers.facebook.com/docs/ or drop me a line on @sicross or facebook.com/sicross.

 

Events - all the events a user and their friends have ever RSVP'd to, including: start/end time, date and location. https://graph.facebook.com/me/events 

Checkins - all the checkins a user and their friends have ever made including: time, location (place id, address, lat long), who they were with, and the checkin's message. https://graph.facebook.com/me/checkins

Photos - all the photos a user has ever been tagged in including: datetime of when the photo was uploaded, who uploaded it, who else is tagged in it, and the coordinates of the face of each tagged person. https://graph.facebook.com/me/photos
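
The three endpoints above share one URL shape, so a small helper can build them. This is only a sketch: the function name `graph_url` is hypothetical, and it assumes an OAuth access token passed as the `access_token` query parameter, which the text above does not cover. It assembles URLs and makes no requests.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"

# The connections listed above: events, checkins and photos.
CONNECTIONS = ("events", "checkins", "photos")


def graph_url(connection, access_token, user="me"):
    """Build a Graph API URL for one of the connections listed above.

    Hypothetical helper. The access_token argument is an assumption
    about authentication and is not described in the wiki text.
    """
    if connection not in CONNECTIONS:
        raise ValueError(f"unknown connection: {connection!r}")
    query = urlencode({"access_token": access_token})
    return f"{GRAPH_BASE}/{user}/{connection}?{query}"


# "TOKEN" is a placeholder, not a real credential.
for name in CONNECTIONS:
    print(graph_url(name, "TOKEN"))
```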

 

 

Newspaper and Journal archives

Archives of newspapers and science journals (e.g. http://www.nature.com/nature/archive/index.html, unfortunately not free) are useful for cross-referencing (@zzgavin)

 

History data from the Guardian

There's tonnes of data on the Guardian site now - you can access most of it at: Guardian.co.uk/data

A lot of the data is time series. Here are a few to get started with:

London Blitz 1940: the first day's bomb attacks listed in full

Every prisoner of war camp in the UK mapped and listed

UK marriage rates back to 1862

UK inflation since 1948

Interest rates since 1694

 

English Heritage monument data

English Heritage monument data is available as ESRI shapefiles. Their use is 'generally unrestricted' as long as it is not 'used for purposes which may lead to damage to archaeological sites, historic buildings and landscapes'. Includes listed buildings, scheduled monuments etc, with a little bit of data (date of scheduling, listing grade etc). http://services.english-heritage.org.uk/NMRDataDownload/ Needs a login, but is easy to obtain (and I have one!) @mdgreaney

 

I have this data as points for the centre of each NMR, which has also been converted to lat/lon and WOEID, if anyone is interested. Currently using it to work out if detecting is happening in the vicinity of scheduled sites. @portableant

 

Historical theatrical data

http://theatricalia.com/ contains information about nearly 20,000 productions involving over 60,000 people at over 1,500 theatres. The earliest production currently in the system is from 1660: http://theatricalia.com/play/8/othello/production/ncj - possibly the first time a professional actress appeared on a public stage in England.

 

http://ckan.net/package/theatricalia

 

Ordnance Survey gazetteer

OS 1:50K gazetteer - has antiquities and Roman sites. Also converted to Yahoo WOEID. You can get the original dataset from http://parlvid.mysociety.org:81/os/ and I can send you my enhanced data. @portableant

 

UK Parliament data

http://hansard.millbanksystems.com has speeches, questions, written answers and ministerial statements from parliament going from 1802-2005 - it's about 95% complete. It also has a basic REST API at http://hansard.millbanksystems.com/api.

 

Current Commons Hansard can be found, officially, at http://www.parliament.uk/business/publications/hansard/commons/, and current Lords Hansard can be found at http://www.parliament.uk/business/publications/hansard/lords/.  Scottish Parliamentary reports can be found at http://www.scottish.parliament.uk/business/officialReports/. Welsh Assembly reports can be found at http://www.assemblywales.org/bus-home/bus-record-of-proceedings.htm. The Northern Ireland Assembly is at http://www.niassembly.gov.uk/ .

 

If what you want is within Commons debates back to 1935, Commons written answers/statements/public bill committees back to 2001, Lords back to 1999, or anything in the Northern Ireland Assembly debates or Scottish Parliament official report, you can use TheyWorkForYou and its API. It also has knowledge of MPs back to 1805ish, with a few corrections on top of the hansard.millbanksystems.com data.

 

UK government data

There's the smorgasbord which is http://data.gov.uk/.

 

Also of interest are the Gazettes (London, Edinburgh, Belfast) which are the records of things happening - legislation enacted, company insolvencies and all sorts of interesting bits and bobs which have to be officially recorded and published:

 

 

 

Old Bailey Online

 

http://ckan.net/package/oldbaileyonline

 

Old Bailey Online (http://www.oldbaileyonline.org/) has records of about 200,000 criminal trials conducted at the Old Bailey between 1674 and 1913.

 

London Lives

London Lives (http://www.londonlives.org/) has a collection of 3.35 million names associated with 240,000 documents, dating from 1690 to 1800.

 

The National Maritime Museum

The National Maritime Museum could make the following datasets available if there's interest (@foe):

 

Research databases

  • Maritime Memorials (approx. 5,000 records, which can be retrieved via a simple API by ID, word/phrase or KML endpoint)
  • Marine Society’s registers for boys sent to the Royal Navy during the Seven Years War. They're interesting as a history of the poor and for genealogists in search of an ancestor who went to sea as a boy (approx. 5,000 records, available as CSV)
  • Database of Royal Navy victualling during the French Revolutionary and Napoleonic Wars 1793–1815 (Sustaining the Empire, over 4,000 records, available as an Access database)
  • Warship histories circa 1500 to 1950, detailing captain, where the ships went and the vessels they encountered (approx. 65,000 records, available as CSV)

 

Transcripts

 

The National Archives

http://www.nationalarchives.gov.uk/ is the UK government's official archive, containing over 1,000 years of history, with detailed guidance to government departments and the public sector on information management and links to other historical archives.

 

We are currently working to make available the following data sets. (Note: This is a provisional list and may change). Please do get in touch with me @mentionthewar if there's something you're particularly interested in working with so we can focus our efforts a bit.

 

  • Complete dump of our catalogue (hold tight it's apparently 10.5GB) [confirmed]
  • Selected (more manageable) catalogue subsets covering serious crime, Victorian photographers and Medieval petitioners (now available from our Labs website)
  • Full OCR'd text of the Cabinet Papers (covering Cabinet Meetings 1916-79) [confirmed]
  • Full transcribed text of MH 12, Victorian poor law and workhouse records [confirmed]
  • Our growing gazetteer relating historic placenames to modern ones (now available from our Labs website)
  • Domesday Book placenames dataset (now available from our Labs website)
  • Output from our @ukwarcabinet twitter feed representing 1940 in real time and linking to original documents (now available from our Labs website)

 

(National Archives also has many statutes freely available online, although not a complete set. Good material for crowd-sourcing completion and data-mining. John Levin, @anterotesis)

 

Good point. These are at http://www.legislation.gov.uk/ (@mentionthewar)

 

Pleiades

 

http://ckan.net/package/pleiades

 

http://pleiades.stoa.org/ is a gazetteer for ancient world studies operated by NYU's Institute for the Study of the Ancient World and supported by the US National Endowment for the Humanities. It is derived originally from the Barrington Atlas of the Greek and Roman World and continually adds new resources. Features include:

 

 

BBC

 

All of the data from On This Day, including links to people, places and themes in Wikipedia. For an idea of the shape of this data and the links it contains have a look at this prototype.

 

We will also be able to share on the day details and prototype content for a major upcoming history site on the Northern Ireland Troubles, working title A State Apart. This will include early drafts and edits of some text, image and video content.

 

The Portable Antiquities Scheme (British Museum based)

 

The Scheme (PAS) records archaeological objects found by the public in England and Wales online. This dataset (http://finds.org.uk) now has:

 

Culture Grid (from Collections Trust)

 

Culture Grid provides over 1.2 million records covering a huge range of topics, places, periods and media, from a wide range of UK museum, library and archive collections. Most records reference images, but there is also a growing amount of audio and video material, as well as tens of thousands of collection and institution records.

 

The Culture Grid APIs are openly available to encourage innovation and feedback through events such as History Hack, Culture Hack, JISC Dev8D and the Culture Grid Hack Day.

 

Culture Grid is funded by the Museums, Libraries and Archives Council (MLA) to address digital priorities for the cultural sector and by the European Commission to support the European Cultural portal (Europeana).  Culture Grid has been developed by Collections Trust with technical partners Knowledge Integration Ltd.
