Dataset Creation

Across the University, experts are available to consult on creating data and datasets using tools like MTurk, Qualtrics, and other surveys and field experiments. Whether your research involves creation of new data or conversion of print text and visual materials through OCR and machine learning tools, Harvard experts can partner with you. These same experts can help with data transformation, curation, and storage.

Providers are listed below by eligibility: those whose offerings are available to the entire Harvard community, and those that serve a specific unit or appointment.

University-wide

Harvard Library

The Digital Scholarship Support Group offers consultations and advice on creating datasets for data analysis and data mining. We also offer advanced workshops on topics such as web APIs and the International Image Interoperability Framework (IIIF).

Audience

All Harvard affiliates

Service Provider

Digital Scholarship Support Group

Service Fee

None

Service Website

https://dssg.fas.harvard.edu/initiatives/research/

Contact Information

dssg@fas.harvard.edu

Harvard College Library, Services for Academic Programs

Harvard College Library's Services for Academic Programs (SAP) offers consultations on the available options for digitizing texts and using optical character recognition (OCR) software to create machine-readable texts or datasets.

Audience

All Harvard community; focus on FAS undergraduates, graduate students, and faculty

Service Provider

Harvard College Library, Services for Academic Programs (SAP)

Service Fee

None

Service Website

https://guides.library.harvard.edu/digitalhumanities

Contact Information

Hugh Truslow: truslow@fas.harvard.edu

Unit/Appointment-specific

Baker Library (Harvard Business School)

Baker Library will download, scrape, merge, and clean data from disparate sources. Full service for HBS faculty (and consultation for students) includes:

  • Downloading data provided by licensed and publicly available sources
  • Batch article searching to capture counts of keywords or article citations
  • Scraping data from the web
  • Pulling historical data provided by disparate electronic and print sources and collating it into databases (e.g. company founding dates)
  • Screening, merging, and subsetting existing data sources (excludes surveys and human subjects experiments)

Audience

  • School Faculty
  • School Graduate Students

Service Provider

Baker Library

Service Fee

Yes for HBS Faculty.

Service Website

https://www.library.hbs.edu/Services/Data-Creation-Data-Collection

Contact Information

Alex Caracuzzo: acaracuzzo@hbs.edu

Harvard Law School

Audience

  • School Faculty (full service)
  • School Graduate Students (support)

Service Provider

Harvard Law School Library

Service Fee

None

Service Website

Contact Information