
Subject and data repositories

Brunel has its own institutional repository, BURA, where research staff are required to archive their research publications (for more information on both our research database and institutional repository, please see the BRAD and BURA page). However, researchers also have the option to use external subject and data repositories to disseminate their work further.

Subject repositories are subject-oriented collections of open access e-publications and research data from multiple institutions, which are free to users.

Useful resources include ROAR-MAP and OpenDOAR, which provide details on the largest subject repositories on the web. These resources are an excellent way of locating subject repositories relevant to your discipline.

Social media platforms like ResearchGate and Academia.edu are useful tools for connecting with researchers and sharing research; however, they are not considered the best means of preserving academic research for long-term access. Please see the details in the tabs below.


Document repositories are open access archives that hold academic literature. Typically these are self-archived by authors. If the document is a published work, authors often archive a manuscript draft after peer review. These might be subject to an embargo period, but in such cases there is usually a metadata record, an indication of the date of release and sometimes even a means to request copies directly from the authors.

Repositories also hold grey literature: output produced at all levels of government, academia, business and industry in print and electronic formats, but which is not controlled by commercial publishers. Libraries, museums and heritage organisations are working hard to bring out-of-copyright works into repositories for public access. Most universities disseminate PhD theses, and sometimes Masters dissertations.

The available types of repositories are broadly classified in the sections below.

Preprint servers

Publication of manuscripts in a peer-reviewed journal often takes weeks, months or even years from the time of initial submission, owing to the time required by editors and reviewers to evaluate and critique manuscripts, and the time required by authors to address critiques. The need to quickly circulate current results within a scholarly community has led researchers to distribute documents known as preprints, which are manuscripts that have yet to undergo peer review. They may be considered as grey literature. The immediate distribution of preprints allows authors to receive early feedback from their peers, which may be helpful in revising and preparing articles for submission.

A preprint server is a repository that focuses on collecting preprints before submission to a journal. The preprint may persist, often as a non-typeset version available for free, after a paper is published in a journal.

These repositories take full advantage of copyright law. Before a copyright transfer agreement has been signed, and especially before the manuscript has been submitted to a commercial publisher, the author retains full rights over their intellectual property. This means they can disseminate their research early and receive and respond to comments from contributors.

Many researchers and funding organisations are recognising the value of early involvement by peers, the process acting as a kind of informal peer review. The MRC and some other funders now consider preprint publications in grant applications.

Perhaps the largest and best-known preprint server is arXiv, which serves mathematics, physics and computer science. Other preprint servers have been set up for other disciplines, such as biology (bioRxiv).

Subject repositories

A disciplinary repository (or subject repository) is an online archive containing the works of scholars in a particular subject area, or data associated with those works. Disciplinary repositories can accept work from scholars from any institution. A disciplinary repository shares the roles of collecting, disseminating, and archiving work with other repositories, but is focused on a particular subject area. These collections can include academic and research papers.

Disciplinary repositories can acquire their content in many ways. Many rely on author or organization submissions, such as SSRN. Others such as CiteSeerX crawl the web for scholar and researcher websites and download publicly available academic papers from those sites.

A disciplinary repository generally covers one broad-based discipline, with contributors from many different institutions supported by a variety of funders; the repositories themselves are likely to be funded from one or more sources within the subject community. Deposit of material in a disciplinary repository is sometimes mandated by research funders.

Disciplinary repositories can also act as stores of data related to a particular subject, allowing documents along with data associated with that work to be stored in the repository.

Social networking repositories

The most popular social networking sites for researchers are Academia.edu and ResearchGate. These are both private companies that offer services to the research community. In addition to the ability to build social networks, much in the same fashion as LinkedIn or Facebook, users may publish versions of their papers much as they would in other repositories.

We strongly recommend archiving your publications in subject or institutional repositories, which are not-for-profit and guarantee public access in perpetuity. There are many instances of ResearchGate users inadvertently publishing versions of papers for which they no longer hold the copyright. Subject or institutional repositories usually employ a copyright verification workflow with dedicated, trained staff who ensure that legal requirements are met and, therefore, that papers have no reason to be removed.


Open access repositories work within copyright law to legally disseminate academic works, making them freely accessible to researchers and the public. Working within copyright law matters because, when copyright has been transferred to a commercial publisher, the publisher, as the rightsholder, often restricts how the work may be used in repositories.

This diagram from HEFCE highlights the publishing workflow and indicates some of the kinds of document versions you would expect to find regarding current research.

It is not uncommon to find all these document versions in repositories; it depends on what the rightsholder allows. The accepted and published versions are the most valuable, because they are subject to formal peer review.



 How do I access repositories relating to my subject?

Often a Google search is the quickest way to find these repositories. Repository contents are typically indexed by Google Scholar. ROAR-MAP and OpenDOAR provide details on where to find the largest subject repositories on the web. These resources are an excellent way of locating subject repositories relevant to your discipline.

In 2018 experts from UK Research and Innovation (UKRI) contributed to a new Google search tool to help scientists, policy makers and other user groups easily find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.

There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. As part of the UKRI commitment to easy access to data, its experts worked with Google to help develop the Dataset Search, launched on 6 September 2018.

Similar to how Google Scholar works, Dataset Search lets users find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. (Source:

More information on Research Data repositories will be available shortly.

NERC/STFC JASMIN computer. (Credit: STFC)

This content was authored by Katie Fortney and Justin Gonder (University of California) and is gratefully reused under the terms of © 2014 The Regents of the University of California [Creative Commons CC-BY License]   Site text by the University of California Office of Scholarly Communication is licensed under Creative Commons Attribution 4.0 International License. Article originally available at

A social networking site is not an open access repository

“What’s the difference between ResearchGate, Academia.edu, and the institutional repository?”

“I put my papers in ResearchGate, is that enough for the open access policy?”

These and similar questions have been common at open access events over the past couple of years. Authors want to better understand the differences between these platforms and when they should use one, the other, or some combination.

First, a brief primer on what each service has to offer:

ResearchGate and Academia.edu

ResearchGate and Academia.edu are social networking platforms whose primary aim is to connect researchers with common interests. Users create profiles on these services, and are then encouraged to list their publications and other scholarly activities, upload copies of manuscripts they’ve authored, and build connections with scholars they work or co-author with. Essentially these services provide a Facebook or LinkedIn experience for the research community.

Both services are commercial companies. Although Academia.edu has a “.edu” URL, it isn’t run by a higher education institution. The domain name was registered before the rules that would now prohibit this use went into effect, and the address was grandfathered in and later sold to the company. On its filings with the Securities and Exchange Commission it uses the legal name Academia Inc.

Open access repositories

Open access repositories come in two basic flavors:

  • Institutional repositories (IRs) are generally library-run websites that enable authors to upload a version of their manuscripts for public “open access” display. Brunel’s is called BURA. The primary aim of institutional repositories is to make the scholarly outputs of the university as widely available as possible and to ensure long-term preservation of these outputs.
  • Subject-based repositories collect publications in a particular discipline or a range of disciplines, so that authors in a field can share and solicit feedback on their work from colleagues in that field, regardless of where they work.

Next, let’s take a closer look at some of the major differences between these two kinds of services:

 | Open access repositories | Academia.edu | ResearchGate
Supports export or harvesting | Yes | No | No
Long-term preservation | Yes | No | No
Business model | Nonprofit (usually) | Commercial. Sells job posting services. Hopes to sell data. | Commercial. Sells ads, job posting services.
Sends you lots of emails (by default) | No | Yes | Yes
Wants your address book | No | Yes | Yes
Fulfills requirements of Brunel, REF and funder policies | Yes | No | No

Openness and interoperability

We are often asked by researchers already using ResearchGate or Academia.edu why they should use a repository operated or recommended by the library instead (or as well), or alternatively: Why can’t the library just take my information from ResearchGate or Academia.edu and use that to populate the institutional repository?

The simple answer is: ResearchGate and Academia.edu do not permit their users to take their own data and reuse it elsewhere, nor do their terms of service permit the library to extract that data on the authors’ behalf.

  • ResearchGate: “Users must not misuse the Service. Misuse of the Service includes, without limitation: … automated or massive manual retrieval of other Users’ profile data (‘data harvesting’).”
  • Academia.edu: “You agree not to do any of the following: … Attempt to access or search the Site, … through the use of any engine, software, tool, agent, device or mechanism (including spiders, robots, crawlers, data mining tools or the like).”

Interestingly, ResearchGate permits you to import publications from other applications, but provides no method for getting that same data out of the ResearchGate ecosystem (well, not without some creative acrobatics). Similarly, Academia.edu previously supported import, but now makes it impossible to bring data in or out of their system.

Institutional repositories, on the other hand, are largely committed to complete openness and re-use of data.

  • They get involved with efforts like SHARE, a mission to build a comprehensive, free open data set of research activities and outputs. SHARE participants include sites like the Social Science Research Network, the Smithsonian Digital Repository, and DataONE: Data Observation Network for Earth – see a full list here.
  • They make their metadata – the information about what’s in the repository – interoperable and open by using standards like OAI-PMH. PubMed Central, arXiv, and BURA are all OAI-PMH providers – see more examples here. These kinds of activities make open access repositories good places for publications you want people to be able to find.
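
To give a feel for what OAI-PMH interoperability means in practice, here is a minimal Python sketch of how a harvester might ask a repository for its records and read the titles out of the reply. It is an illustration only: the endpoint address repository.example.org, the helper names list_records_url and extract_titles, and the sample response are all hypothetical and do not describe BURA or any real service. The ListRecords verb, the oai_dc metadata prefix and the two XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint.

    base_url is the repository's OAI-PMH address (hypothetical in this sketch).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return the first Dublin Core title of each record in a ListRecords reply."""
    root = ET.fromstring(response_xml)
    titles = []
    for record in root.iterfind(".//oai:record", NS):
        title = record.find(".//dc:title", NS)
        if title is not None and title.text:
            titles.append(title.text.strip())
    return titles


# A hand-written, illustrative response; a real one would come from the repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Because any harvester can issue a request of this shape without special permission, search services and aggregators are able to collect repository metadata routinely, which is the kind of reuse that the social networking sites' terms of service rule out.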

Some IRs also offer APIs that further expand what researchers can do with the publication data they provide. ResearchGate previously discussed offering an API so that the data they collect could help foster open science, but over two years after announcing this intention, no progress seems to have been made.

Long-term preservation and access

Open access repositories are usually managed by universities, government agencies, or nonprofit associations. Affiliation with a larger institution (with a public service mission) means that repositories are likely to be around for a long time. They often employ librarians and data specialists who specialize in ensuring long-term archiving. Academia.edu and ResearchGate are independent for-profit companies that could theoretically close up shop at any time. Both sites disavow any duty to warn users if they shut down:

  • Academia.edu “reserves the right, at its sole discretion, to discontinue or terminate the Site and Services and to terminate these Terms, at any time and without prior notice.”
  • ResearchGate “reserves the right to change, reduce, interrupt or discontinue the Service or parts of it at any time.”

Business models

Less theoretical is the likelihood of a shift in these sites’ profit strategy. ResearchGate and Academia.edu are commercial sites, whereas most open access repositories are non-profits.

These academic social networking sites have each raised large amounts of initial funding: $17.8 million for Academia.edu, and $35 million for ResearchGate. They share funders with Uber, Snapchat, and Upworthy. Academia.edu’s largest funder is in a prolonged battle with the Surfrider Foundation and the California Coastal Commission over preventing public beach access. This isn’t particularly notable for a startup company, but it’s unusual for an “academic” site.

And as Kathleen Fitzpatrick recently pointed out when writing about Academia.edu, venture capital funds don’t last forever. “There are a limited number of options for the network’s future: at some point, it will be required to turn a profit, or it will be sold for parts, or it will shut down.” What are their options?

ResearchGate offers to help companies “reach the right professionals in science and research with targeted, on-page advertising.” Academia.edu hopes to be able to track what topics and articles are trending with their users and sell that information to R&D companies. [Note: subsequent to this post, Academia.edu’s CEO Richard Price told the Chronicle of Higher Education that the company is no longer planning on pursuing that idea.] Both companies host job listings, and either charge for premium placement of the job ad or the ability to list it at all.

Open access repositories, as mentioned above, usually get their funding from a host entity like a university or a government agency.

Use of your contacts and personal data

ResearchGate and Academia.edu don’t have a lot in common with open access repositories, but they do have a lot in common with other social networking sites like Facebook, LinkedIn, and Twitter. They even encourage users to connect those and other services and contacts to their ResearchGate and Academia.edu accounts – sometimes aggressively.

Part of Academia.edu’s account setup process automatically tries to connect to a user’s Facebook account. If the user is signed in to Facebook, a pop up appears, saying “Academia.edu will receive the following info: your public profile, friend list, email address, work history and education history.” The options for moving past this screen are “Find Facebook Friends,” “Back,” or “I don’t have a Facebook account.” No “No Thanks” or “Skip this step” – you have to fib or fork over your data.

Both sites have a long list of possible types of email notifications, all of which can be turned off, and all of which appear to be turned on as a default. ResearchGate faced criticism in the past for sending unwanted emails not only to users themselves, but also to users’ co-authors that claimed, erroneously, to be from the users themselves.

For better or worse, open access repositories are not social networking sites. Users can search for work by a particular author, but authors can’t build a friend or collaborator list, and usually can’t manage a profile page. The success of ResearchGate and Academia.edu demonstrates that this is a functionality that scholars find valuable, and new efforts like MLA Commons are trying to fill the gap.

The fine print

Whenever you sign up for a service, it’s a good idea to read the Terms of Use. Academia.edu’s terms give the company a license to make derivative works (like translations?) based on articles users upload to the site “in connection with operating and providing the Services and Content to you and to other Members.” ResearchGate’s terms include an agreement to have the user’s relationship with the company be governed by German law. And both sites have an indemnification clause, asserting that if the site faces any legal claims arising from things users upload to the site, the user will bear the cost.

Ok, great. But really: what should I use?

In the end, both types of services have unique offerings, and both likely hold some value for researchers. Academic social networking sites, such as ResearchGate or Academia.edu, might be valuable when trying to find others in your field conducting related research, or for providing access to your papers to those people you know use the site.

The value provided by the institutional repository, however, particularly the long-term preservation and commitment to open access, should not be overlooked. Until some public commitment has been made, it should not be assumed that the other services provide this, and they will not be considered “open access repositories” that meet the requirements of participating in Brunel’s open access policies.

If your colleagues find a social networking site useful and you can manage the email notification settings, that site might be worth your time. On the other hand, as Kathleen Fitzpatrick writes, “everything that’s wrong with Facebook is wrong with Academia.edu.” If the typical behavior of commercial social networking sites bothers you – gathering users’ information for their own purposes – be as wary of those that target academics as you are of those with a more general audience. Whether or not you decide these social networking sites are right for you, remember that institutional repositories (such as BURA) enable you to share your research widely without trying to mine your address book. If you’re not already using eScholarship or another open access repository, take a few minutes to check out the services available to you, at no charge, from organizations who offer similar tools for broadening access to your publications, but who have no interest in making a profit from your work.