Exporting and Re-using Data from the MMM Portal – Mapping Manuscript Migrations

Author: Toby Burrows

My goal is to export data relating to manuscripts formerly owned by Sir Thomas Phillipps (1792-1872) from the MMM Portal, and then to import these data into a separate nodegoat database of Phillipps manuscripts. This has involved the following steps.

In the MMM Portal:

Filter for Thomas Phillipps as an owner: result = 8,750 records
Export these results, with the accompanying SPARQL query, to the Yasgui SPARQL service, where they can be downloaded as a CSV spreadsheet

In Yasgui:

Edit the SPARQL query from MMM:
- Remove unwanted elements – chiefly the IDs and some provenance events; keep the labels
- Add two missing elements to the query: Phillipps number and number of miniatures
- Remove the 10-manuscript limit in the query
Re-run the SPARQL query (21 variables)
- Result = 149,777 rows and 21 columns (81.9 seconds) – an average of about 17 rows per manuscript
Download the spreadsheet
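
Removing the result limit can also be done programmatically rather than by hand. The following is a minimal Python sketch, not the method used above: the query text is an invented, heavily shortened stand-in for the one exported from the portal, and `remove_limit` is a hypothetical helper. It only strips a LIMIT clause at the very end of the query and leaves any LIMIT inside a subquery untouched.

```python
import re


def remove_limit(query: str) -> str:
    """Drop a trailing LIMIT clause (e.g. 'LIMIT 10') from a SPARQL query string."""
    return re.sub(r"\s*\bLIMIT\s+\d+\s*$", "", query, flags=re.IGNORECASE)


# Invented stand-in for the exported query; the real one has 21 variables.
exported_query = """
SELECT ?manuscript ?label
WHERE {
  ?manuscript ?p ?label .
}
LIMIT 10
"""

full_query = remove_limit(exported_query)
print(full_query)
```

The edited query can then be pasted back into Yasgui and re-run as described above.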

In Google Sheets: (OpenRefine could also be used here)

Upload and open the CSV file
Fix UTF-8 character display problems
With the Power Tools add-on, use Merge and Combine to combine the rows relating to a single manuscript:
- Use a semicolon delimiter to merge values
- Remove the extra semicolons left at the beginning and end of merged values where there was empty content
Result = 8,750 rows / manuscripts – 6,882 of them with Phillipps numbers
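
For anyone who prefers a script to a spreadsheet add-on, the merge step can be approximated in Python. This is a sketch under assumptions, not a reproduction of what Power Tools does: the column names and sample values are invented, and the choices to skip empty cells and to drop repeated values are mine. Skipping empty cells means no stray semicolons appear at the start or end of a merged value.

```python
import csv
import io


def merge_rows(rows, key, delimiter=";"):
    """Collapse rows sharing the same `key` value into one row per key.

    For every other column, distinct non-empty values are joined with
    `delimiter`, in order of first appearance.
    """
    merged = {}
    for row in rows:
        target = merged.setdefault(row[key], {})
        for column, value in row.items():
            if column == key:
                continue
            values = target.setdefault(column, [])
            value = (value or "").strip()
            # Skip empty cells and values already recorded for this manuscript.
            if value and value not in values:
                values.append(value)
    return [
        {key: k, **{col: delimiter.join(vals) for col, vals in cols.items()}}
        for k, cols in merged.items()
    ]


# Invented sample data; column names are hypothetical.
sample_csv = """manuscript,phillipps_number,owner
ms-1,1234,Thomas Phillipps
ms-1,,Another Owner
ms-2,,Thomas Phillipps
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
merged = merge_rows(rows, key="manuscript")
for row in merged:
    print(row)
```

In this sample, three input rows become two output rows, one per manuscript, with the two owners of `ms-1` joined by a semicolon.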

In nodegoat:

Upload the amended CSV file
Create Import Templates for each section of the import process
Import the objects (manuscripts) and descriptive fields to nodegoat
Import production and transfer events as sub-objects – use the MMM URI as the field for matching with objects / manuscripts
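
If a separate file with one row per event is wanted for the sub-object import, the merged values can be split apart again while keeping the MMM URI on every row for matching. The Python sketch below is only one possible way to prepare such a file; it says nothing about how nodegoat's Import Templates are configured. The column names, URIs and event values are invented, and `explode_events` is a hypothetical helper.

```python
def explode_events(rows, uri_column, event_column, delimiter=";"):
    """Return one row per delimited value in `event_column`, each carrying the URI."""
    exploded = []
    for row in rows:
        for value in row.get(event_column, "").split(delimiter):
            value = value.strip()
            if value:  # ignore empty fragments
                exploded.append({uri_column: row[uri_column], event_column: value})
    return exploded


# Invented sample rows; real data would come from the merged CSV file.
manuscripts = [
    {"mmm_uri": "http://example.org/ms/1", "transfer_event": "Sale A;Sale B"},
    {"mmm_uri": "http://example.org/ms/2", "transfer_event": ""},
]

events = explode_events(manuscripts, "mmm_uri", "transfer_event")
for event in events:
    print(event)
```

Each output row repeats the manuscript's URI, so the events can be matched back to the objects already imported.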

I will document the nodegoat process in more detail on my Phillipps project blog.
