Data Science and Software Engineering play an important role in research by creating new capabilities to process and analyze data, helping ensure reproducibility, and aiding researchers in extracting knowledge and insight from the data. The term software is used broadly here to include all the ways in which one creates and analyzes data. Researchers use software in their research through scripts, tools, open-source software, and licensed software. Data science also covers a wide range of skills and techniques for data cleaning (also known as wrangling), processing, and statistics that are typically beyond what a researcher from a specific domain might have. Because research evolves rapidly, code does not always exist for every function needed, nor are there always clean data sources; software and data pipelines are therefore often developed specifically for a given project. Traditionally, this development was done by researchers (graduate students and postdocs) or independent contractors. This approach poses several issues in terms of maintenance, optimization, reproducibility, and cost. A Research Software Engineering (RSE) or Data Scientist team can work closely with other Research Computing Systems teams to design, develop, deploy, optimize, and maintain software packages/tools and data pipelines that are paired with specific hardware architectures to accelerate cutting-edge research at Harvard University.
Data Science and Research Software Engineering Collaboration
Eligibility information is outlined below based on providers with offerings that are available to the entire Harvard community or a specific unit/appointment.
University-wide
Faculty of Arts and Sciences, Research Computing
The following services are offered by the RSE team:
- Development of scientific software packages
- Development of functional and robust UI/UX
- Add critical features to existing codebases
- Maintenance of the current codebases developed by researchers
- Development of Machine Learning/Big Data/Deep Learning apps and platforms
- Development of data acquisition and analysis automation platforms
- Improve the performance of existing software packages
- Complex database design and deployment
Tier 1: Free small single tasks
Tier 2: Individual Project, defined SOW start-end-dates
Tier 3: Product, on-going development and operations, SLA
Audience
Available to all research groups with a FASRC account.
Service Provider
Faculty of Arts and Sciences, Research Computing (FASRC)
Service Fee
None
Service Website
Submit all RSE requests by booking a consultation appointment at https://www.rc.fas.harvard.edu/consulting-calendar/ or by emailing rchelp@rc.fas.harvard.edu.
Contact Information
Contact Mahmood Shad at rchelp@rc.fas.harvard.edu
Institute for Quantitative Social Science
Extended support over the lifecycle of a research project by embedding a data science specialist in your research team. We can design and implement a data analysis pipeline for many stages of your research project, and/or develop a prototype of your research focused software tool. Specifically, we can help with the following:
- Writing reproducible, version-controlled code (R, Python, C, C++)
- Data organization and cleaning
- Model estimation and post-estimation
- Visualization of raw data and model output
- Interpretation of results
- Writing methods and results sections of papers
- Responding to peer reviews of our analyses
- Developing tool prototypes in R / Python
Audience
All Social Science researchers
Service Provider
Institute for Quantitative Social Science
Service Fee
$100/hr
Service Website
https://www.iq.harvard.edu/data-science-services
Contact Information
Contact Steve Worthington at help@iq.harvard.edu
Unit/Appointment-specific
Harvard Business School
The following services are offered by the RSE team:
- Development of scientific software codes
- Add critical features to existing codebases
- Maintenance of the current codebases developed by researchers
- Development of Machine Learning/Big Data/Deep Learning codes
- Development of data acquisition and analysis automation
- Improve the performance of existing software codes
- Complex database design and deployment
Tier 1: Free small single tasks
Tier 2: Individual Project, defined SOW start-end-dates
Tier 3: Product, on-going development and operations, SLA
Audience
Available to all research groups at HBS
Service Provider
Harvard Business School
Service Fee
None
Service Website
None
Contact Information
Contact Bob Freeman at research@hbs.edu
Quantitative Biomedical Research Center (Harvard Chan School)
The Quantitative Biomedical Research Center can assist with:
- Software design, implementation, optimization, refactoring, and maintenance: Every phase of the software development life cycle from pre-publication to post-publication
- Cloud computing: From budgeting for grant applications to implementing, securing, and maintaining infrastructure as code
- Data management: Storage and sharing of large ‘omics data sets
- Data visualization: Interactive dashboards using R Shiny and other tools
- Software package management: Containers or Conda packages for your software tools to facilitate distribution
- Customized analysis pipeline construction: Leveraging the Center's Cloud Native Application Platform (CNAP), this customized solution offers fast deployment, easy dissemination, and the ability to process large data volumes in a secure and highly reproducible environment.
- Many other areas of scientific computing: HPC, web development, etc.
Audience
Harvard Community, other academic institutions, and industry
Service Provider
Quantitative Biomedical Research Center
Service Fee
From $145/hour
Service Website
https://www.hsph.harvard.edu/qbrc/services/
Contact Information