Datasets

Page history last edited by eatyourgreens 13 years, 1 month ago

Crafts Council – Maker Objects Data, photostore.org.uk

 

The data covers the objects shown on the www.photostore.org.uk website, around 60,000 records in total. The website features approximately 1,000 selected makers. Each record includes the maker, dimensions, date of creation etc., plus a link to an image, and the images have been provided too. The original folder structure of the images has been maintained to allow easy access from the data.

 

OS OpenData

 

http://ckan.net/package/ordnance_survey

 

You can download lots of Ordnance Survey vector and raster mapping datasets for free from the OS OpenData site, including a digital terrain model and a place name gazetteer. As well as maps, it also includes post code and administrative boundary data. 

 

Data comes in TIFF, ESRI Shape Files, CSV, DXF and ASCII text formats.

 

 

CKAN - http://ckan.net

 

http://ckan.net (Comprehensive Knowledge Archive Network) already lists a good number of datasets and APIs in relation to historical (and related) material. It's an open wiki-like registry so you can also amend entries there.

 

Many of the datasets and APIs listed on this wiki already have a page there and it is easy to use a bespoke tag (e.g. history-hackday-2011) to create a simple shared list. Some useful queries:

 

http://ckan.net/package?tags=history

http://ckan.net/package?q=history
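
The two queries above differ only in the query-string parameter (tags or q). A minimal Python sketch for building such URLs, e.g. for a shared hack day tag, might look like the following. The helper name ckan_search_url is made up for illustration and is not part of CKAN; the code only builds the URL and does not contact the site.

```python
from urllib.parse import urlencode

# Base URL taken from the example queries above.
CKAN_PACKAGE_URL = "http://ckan.net/package"


def ckan_search_url(**params):
    """Build a CKAN package-listing URL (hypothetical helper, not a CKAN API).

    Only the 'tags' and 'q' parameters appear on this page; anything else
    passed in is an assumption on the caller's part.
    """
    return f"{CKAN_PACKAGE_URL}?{urlencode(params)}"


by_tag = ckan_search_url(tags="history")
by_text = ckan_search_url(q="history")
hackday = ckan_search_url(tags="history-hackday-2011")
print(by_tag)
print(by_text)
print(hackday)
```

Whether the registry returns anything for a bespoke tag depends on someone having tagged packages with it first.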

 

Open Plaques

 

http://ckan.net/package/open-plaques

 

Open Plaques is an open source project (both code and data) with data on over three and a half thousand commemorative plaques, and new plaques are being added regularly. Where available, the data for each plaque includes the:

 

  • Inscription
  • Person being commemorated
  • Geo location
  • Address / country
  • Organisation that created the plaque (around 250 so far) 
  • Colour
  • Roles (e.g. author, inventor, actor, prime minister, namer of clouds etc.)
  • Verbs (e.g. lived, worked, died, visited, built, founded) that link the person to the location
  • Link to creative commons photo of plaque on Flickr

 

Most pages on the site also have .xml, .json and .kml views: just add the extension to the URL.

Another very useful query is the 'box' query. Specify two geopoints, top-left and bottom-right, and you get back all the plaques in that bounding box, e.g. http://openplaques.org/plaques.kml?box=[51.5482,-0.1617],[51.5282,-0.1217] or http://openplaques.org/plaques.json?box=[51.5482,-0.1617],[51.5282,-0.1217]

 

A KML view can be piped directly into Google Maps, e.g. http://maps.google.co.uk/maps?q=http://openplaques.org/plaques.kml?box=[51.5482,-0.1617],[51.5282,-0.1217]
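
The box query and the Google Maps trick above are both just string patterns, so they can be sketched in a few lines of Python. The function names here are made up for illustration and are not part of any Open Plaques library. The code only builds URLs in the shape shown in the examples above and does not fetch anything.

```python
# Sketch only: builds Open Plaques box-query URLs; nothing is downloaded.
PLAQUES_BASE = "http://openplaques.org/plaques"
FORMATS = ("xml", "json", "kml")  # the extensions mentioned on this page


def plaques_box_url(top_left, bottom_right, fmt="json"):
    """Return a box-query URL for two (lat, lon) corner points.

    Hypothetical helper: it mirrors the example URLs on this page and
    makes no claim about other parameters the site may accept.
    """
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    (lat1, lon1), (lat2, lon2) = top_left, bottom_right
    return f"{PLAQUES_BASE}.{fmt}?box=[{lat1},{lon1}],[{lat2},{lon2}]"


def google_maps_url(kml_url):
    """Wrap a KML URL in the Google Maps URL pattern shown above."""
    return "http://maps.google.co.uk/maps?q=" + kml_url


kml_url = plaques_box_url((51.5482, -0.1617), (51.5282, -0.1217), fmt="kml")
print(kml_url)
print(google_maps_url(kml_url))
```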

 

The API is far from complete, but it is usable, as we have two main users: Flickr and the openplaques iPhone app. Simon, Jez and Tom should be at the hack day so chat with them if you have any thoughts.

 

Tom Morris has recently been doing some work to add RDF data so that records can be linked to Freebase, etc.

 

Open Economics

 

http://ckan.net/package/open-economics

 

Facebook

Facebook's Graph API gives you access to the entire Facebook history of a user and that user's friends: photos, checkins, likes and wall posts, each with a timestamp. It's not ancient history, but increasingly Facebook contains a large facet of a user's personal online history. Get started with the APIs at http://developers.facebook.com/docs/ or drop me a line on @sicross or facebook.com/sicross.

 

Events - all the events a user and their friends have ever RSVP'd to, including: start/end time, date and location. https://graph.facebook.com/me/events 

Checkins - all the checkins a user and their friends have ever made including: time, location (place id, address, lat long), who they were with, and the checkin's message. https://graph.facebook.com/me/checkins

Photos - all the photos a user has ever been tagged in including: datetime of when the photo was uploaded, who uploaded it, who else is tagged in it, and the coordinates of the face of each tagged person. https://graph.facebook.com/me/photos
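
The three endpoints above share one pattern: https://graph.facebook.com/me/ followed by the connection name. A rough Python sketch of building them follows. It assumes the request carries an access_token query parameter obtained through Facebook's login flow, which is not shown; the helper name graph_url and the "TOKEN" placeholder are illustrative only. Check http://developers.facebook.com/docs/ for the permissions each connection needs.

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"
# Connections listed on this page; others exist but are not covered here.
CONNECTIONS = ("events", "checkins", "photos")


def graph_url(connection, access_token, user="me"):
    """Build a Graph API URL for one of a user's connections.

    Hypothetical helper. The access_token parameter is an assumption
    about how the request is authorised; obtaining one is out of scope.
    """
    if connection not in CONNECTIONS:
        raise ValueError(f"connection not covered by this sketch: {connection!r}")
    query = urlencode({"access_token": access_token})
    return f"{GRAPH_BASE}/{user}/{connection}?{query}"


# "TOKEN" is a placeholder, not a real credential.
events_url = graph_url("events", "TOKEN")
print(events_url)
```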

 

 

Newspaper and Journal archives

Archives of newspapers and science journals (e.g. http://www.nature.com/nature/archive/index.html, not free unfortunately) are useful for cross-references (@zzgavin)

 

History data from the Guardian

There's tonnes of data on the Guardian site now - you can access most of it at: Guardian.co.uk/data

A lot of the data is time series. Here are a few to get started with:

London Blitz 1940: the first day's bomb attacks listed in full

Every prisoner of war camp in the UK mapped and listed

UK marriage rates back to 1862

UK inflation since 1948

Interest rates since 1694

 

English Heritage monument data

English Heritage monument data is available as ESRI shapefiles. Their use is 'generally unrestricted' as long as it is not 'used for purposes which may lead to damage to archaeological sites, historic buildings and landscapes'. Includes listed buildings, scheduled monuments etc, with a little bit of data (date of scheduling, listing grade etc). http://services.english-heritage.org.uk/NMRDataDownload/ Needs a login, but is easy to obtain (and I have one!) @mdgreaney

 

I have this data as points for the centre of each NMR, which has also been converted to lat/lon and WOEID, if anyone is interested. Currently using it to work out if detecting is happening in the vicinity of scheduled sites. @portableant
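
For anyone wanting to try a similar "is this find near a scheduled site" check with lat/lon centre points, here is a minimal Python sketch using the haversine great-circle distance. It is not @portableant's actual method. The site names, coordinates and the 1 km radius are arbitrary placeholders, not real monument data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def sites_near(find, sites, radius_km=1.0):
    """Return names of sites whose centre point is within radius_km of a find.

    'sites' maps a name to a (lat, lon) centre point; the radius is an
    arbitrary illustrative threshold.
    """
    lat, lon = find
    return [name for name, (slat, slon) in sites.items()
            if haversine_km(lat, lon, slat, slon) <= radius_km]


# Placeholder points only, not real scheduled monuments.
sites = {"site-a": (51.5482, -0.1617), "site-b": (51.5282, -0.1217)}
find = (51.5480, -0.1620)
print(sites_near(find, sites))
```

A centre point is a crude stand-in for a monument's real boundary, so the shapefile polygons would give a better answer where precision matters.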

 

Historical theatrical data

http://theatricalia.com/ contains information about nearly 20,000 productions involving over 60,000 people at over 1,500 theatres. The earliest production currently in the system is from 1660: http://theatricalia.com/play/8/othello/production/ncj - possibly the first time a professional actress appeared on a public stage in England.

 

http://ckan.net/package/theatricalia

 

Ordnance Survey gazetteer

OS 1:50K gazetteer - has antiquities and Roman sites. Also converted to Yahoo WOEID. You can get the original dataset from http://parlvid.mysociety.org:81/os/ and I can send you my enhanced data. @portableant

 

UK Parliament data

http://hansard.millbanksystems.com has speeches, questions, written answers and ministerial statements from parliament going from 1802-2005 - it's about 95% complete. It also has a basic REST API at http://hansard.millbanksystems.com/api.

 

Current Commons Hansard can be found, officially, at http://www.parliament.uk/business/publications/hansard/commons/, and current Lords Hansard can be found at http://www.parliament.uk/business/publications/hansard/lords/.  Scottish Parliamentary reports can be found at http://www.scottish.parliament.uk/business/officialReports/. Welsh Assembly reports can be found at http://www.assemblywales.org/bus-home/bus-record-of-proceedings.htm. The Northern Ireland Assembly is at http://www.niassembly.gov.uk/ .

 

If what you want is within Commons debates back to 1935, Commons written answers/statements/public bill committees back to 2001, Lords back to 1999, or anything in the Northern Ireland Assembly debates or Scottish Parliament official report, you can use TheyWorkForYou and its API. It also has knowledge of MPs back to 1805ish, with a few corrections on top of the hansard.millbanksystems.com data.

 

UK government data

There's the smorgasbord which is http://data.gov.uk/.

 

Also of interest are the Gazettes (London, Edinburgh, Belfast) which are the records of things happening - legislation enacted, company insolvencies and all sorts of interesting bits and bobs which have to be officially recorded and published:

 

 

 

Old Bailey Online

 

http://ckan.net/package/oldbaileyonline

 

Old Bailey Online (http://www.oldbaileyonline.org/) has records of about 200,000 criminal trials conducted at the Old Bailey between 1674 and 1913.

 

London Lives

London Lives (http://www.londonlives.org/) has a collection of 3.35 million names associated with 240,000 documents, coming from 1690 to 1800. 

 

The National Maritime Museum

The National Maritime Museum could make the following datasets available if there's interest (@foe):

 

Research databases

  • Maritime Memorials (approx. 5,000 records, which can be retrieved via a simple API by ID, word/phrase or KML endpoint)
  •  Marine Society’s registers for boys sent to the Royal Navy during the Seven Years War. They're interesting as a history of the poor and for genealogists in search of an ancestor who went to sea as a boy (approx. 5,000 records, available as CSV)
  • Database of Royal Navy victualling during the French Revolutionary and Napoleonic Wars 1793–1815 (Sustaining the Empire, over 4,000 records, available as an Access database)
  • Warship histories circa 1500 to 1950, detailing captain, where the ships went and the vessels they encountered (approx. 65,000 records, available as CSV)

 

Transcripts

 

There are also YQL tables to search the Maritime Museum collections, returning results as JSON or XML. See http://www.nmm.ac.uk/collections/feeds/docs (@pekingspring)

 

Fiction

Project Gutenberg has many out-of-copyright fictional accounts of the world, many of which reference real-world places and events.

 

The National Archives

http://www.nationalarchives.gov.uk/ is the UK government's official archive, containing over 1,000 years of history, with detailed guidance to government departments and the public sector on information management and links to other historical archives.

 

We are currently working to make available the following data sets. (Note: This is a provisional list and may change). Please do get in touch with me @mentionthewar if there's something you're particularly interested in working with so we can focus our efforts a bit.

 

  • Complete dump of our catalogue (hold tight it's apparently 10.5GB) [confirmed]
  • Selected (more manageable) catalogue subsets covering serious crime, Victorian photographers and Medieval petitioners (now available from our Labs website)
  • Full OCR'd text of the Cabinet Papers (covering Cabinet Meetings 1916-79) [confirmed]
  • Full transcribed text of MH 12, Victorian poor law and workhouse records [confirmed]
  • Our growing gazetteer relating historic placenames to modern ones (now available from our Labs website)
  • Domesday Book placenames dataset (now available from our Labs website)
  • Output from our @ukwarcabinet twitter feed representing 1940 in real time and linking to original documents (now available from our Labs website)

 

(National Archives also has many statutes freely available online, although not a complete set. Good material for crowd-sourcing completion and data-mining. John Levin, @anterotesis)

 

Good point. These are at http://www.legislation.gov.uk/ (@mentionthewar)

 

Pleiades

 

http://ckan.net/package/pleiades

 

http://pleiades.stoa.org/ is a gazetteer for ancient world studies operated by NYU's Institute for the Study of the Ancient World and supported by the US National Endowment for the Humanities. It is derived originally from the Barrington Atlas of the Greek and Roman World and continually adds new resources. Features include:

 

 

BBC

 

All of the data from On This Day, including links to people, places and themes in Wikipedia. For an idea of the shape of this data and the links it contains have a look at this prototype.

 

We will also be able to share on the day details and prototype content for a major upcoming history site on the Northern Ireland Troubles, working title A State Apart. This will include early drafts and edits of some text, image and video content.

 

The Portable Antiquities Scheme (British Museum based)

 

The Scheme (PAS) records archaeological objects found by the public in England and Wales online. This dataset (http://finds.org.uk) now has:

 

Culture Grid (from Collections Trust)

 

Culture Grid provides over 1.2 million records covering a huge range of topics, places, periods and media, from a wide range of UK museum, library and archive collections. Most records reference images, but there is also a growing amount of audio and video material, as well as tens of thousands of collection and institution records.

 

The Culture Grid APIs are openly available to encourage innovation and feedback through events such as History Hack, Culture Hack, JISC Dev8D and the Culture Grid Hack Day.

 

Culture Grid is funded by the Museums, Libraries and Archives Council (MLA) to address digital priorities for the cultural sector and by the European Commission to support the European Cultural portal (Europeana).  Culture Grid has been developed by Collections Trust with technical partners Knowledge Integration Ltd.

 

Nomisma.org

The site http://nomisma.org publishes stable URIs for concepts in numismatics, with a focus on the ancient Mediterranean. Right now, it has the complete set of hoards from the volume Inventory of Greek Coin Hoards. There is also a fairly comprehensive set of ancient Greek "mints" with geographic co-ordinates. The whole dataset can be downloaded from http://nomisma.org/nomisma.org.xml . There is a KML file at http://nomisma.org/nomisma.org.xml . Individual hoards have KML files as well. There is RDFa embedded in the data that establishes links, such as those between a hoard and the mints that struck the coins in that hoard. It's all under development, of course.

 

The Diary of Samuel Pepys

The data behind the decade-long project to publish the diary of the 17th century London diarist is now available in JSON format. You can read a brief description on the site, the more detailed README, or download the 6MB zip file of data and images.

 

UK Film Council - Historical Screening Data

This data set contains raw screening data on every film shown publicly at over 2,500 venues in the UK between August 2007 and December 2010 (over 3 million unique screening records). It is the first time that any sort of comprehensive historical data on public film provision in the UK has been made openly available. It gives a complete overview of what types of film are available to watch 'theatrically' across the UK, and where, because it covers not only the approx. 750 full-time cinemas but also screenings held in universities, village halls, galleries, arts venues, one-off film society events etc.

 

  • Data fields include Venue (full address, geo-coded location, number of screens etc), Film / Content (genre, rating, release date, principal language, whether the film is specialised - non-mainstream, documentary, foreign language etc), Event - viewed by film or venue (ticket prices, film title, all screening dates per film, all screening times per film, the week after national release date that a film is shown at a venue, and how long it played there for)
  • Questions we think can now be answered include: what is the distribution of foreign language films in the UK, what are the programming patterns for cinemas, how many cinemas play specialised film etc.

 

Data to go up today (in .csv files), together with details of fields and examples. Other historical film data can be found on the UKFC research website.

 

The British Library

Some historical datasets are discoverable through www.searchbeta.bl.uk

 

 

Museum of London

 

London Archaeological Archive API (a bit limited, needs improving)

Bioarchaeology data dumps

 

Newspapers

 

Richmond Daily Dispatch from 1800s: http://www.perseus.tufts.edu/hopper/collection?collection=Perseus%3Acollection%3ARichTimes

 

Comments (4)

Simon Rogers said

at 3:54 pm on Jan 6, 2011

I've added some of the Guardian's many history datasets to the page above - but there's lots more on the site

Morena Fiore said

at 1:56 am on Jan 19, 2011

Do you know if the data set from the National Maritime Museum has data from Columbus' travels? thanks

Daniel Knell said

at 10:08 am on Jan 19, 2011

Ordnance Survey Gazeteer link is down.

Mia said

at 1:26 pm on May 13, 2011

There are lots of museum APIs listed at http://museum-api.pbworks.com/w/page/21933420/Museum%C2%A0APIs
