
Subject and data repositories

Subject repositories are subject-oriented collections of open access e-publications and research data from multiple institutions, and they are free to use.

ROAR-MAP and OpenDOAR give details on the largest subject repositories on the web and are excellent resources for finding subject repositories relevant to your discipline.

Brunel University London has its own institutional repository, BURA, which showcases staff research publications and doctoral theses.  Researchers also have the option to use external subject and data repositories to disseminate their work further. For information on how to deposit your research in BURA, visit BRAD and BURA.



 

Document repositories are open access archives that hold a range of literature, including academic publications, pre-prints, working papers, postgraduate dissertations and theses. Typically these are self-archived by authors or archived on their behalf by publishers.

If the document is a published work, authors often archive the final draft manuscript following peer review, with any corrections applied. If a work has not been published, there may be an embargo period before the paper can be released for public viewing. However, a public metadata record is released, which should contain information about the expected release date and sometimes a means of requesting access directly from the author.

Repositories also hold grey literature: outputs produced by government, academic institutions, business and industry in print and electronic formats, but which are not controlled by commercial publishers. These may include working papers.

Some libraries, museums and heritage organisations work to provide public access to out-of-copyright works in repositories and preserve collections of cultural, social or scientific value. 

The section below describes various types of repositories.

Pre-print servers

Publication of manuscripts in a peer-reviewed journal often takes weeks, months or even years from the time of initial submission, owing to the time required by editors and reviewers to evaluate and critique manuscripts, and the time required by authors to address critiques.

The need to quickly circulate current results within a scholarly community has led researchers to distribute documents known as pre-prints, which are manuscripts that have yet to undergo peer review.  The immediate distribution of preprints allows authors to receive early feedback from their peers, which may be helpful in revising and preparing articles for submission.

A pre-print server is a repository that has a special focus on collecting pre-prints before submission to a journal. The preprint may persist, often as a non-typeset version available free, after a paper is published in a journal.

These repositories take full advantage of the rights authors hold under copyright law. Before a copyright transfer agreement has been signed, and especially before the manuscript has been submitted to a commercial publisher, the author (or their institution, depending on local policy and employment contract) retains full rights over their intellectual property. This means they can disseminate their research early and receive and respond to comments from peers.

Many researchers and funding organisations are recognising the value of early involvement by peers, the process acting as a kind of informal peer review. Many funders now consider pre-print publications when assessing grant applications.

Perhaps the best known and largest preprint server is arXiv, which serves mathematics, physics and computer science. bioRxiv is a discipline-specific preprint server covering biology. OSF hosts several pre-print servers at https://osf.io/preprints.

Subject repositories

A subject repository, also called a disciplinary repository, is an online archive containing scholarly works, or data associated with those works, in a particular subject area. Disciplinary repositories can accept work from scholars from any institution. A disciplinary repository shares the roles of collecting, disseminating, and archiving work with other repositories, but is focused on a particular subject area. These collections can include academic and research papers.

Disciplinary repositories can acquire their content in many ways. Many rely on author or organization submissions, such as SSRN. Others such as CiteSeerX crawl the web for scholar and researcher websites and download publicly available academic papers from those sites.

A disciplinary repository generally covers one broad based discipline, with contributors from many different institutions supported by a variety of funders; the repositories themselves are likely to be funded from one or more sources within the subject community. Deposit of material in a disciplinary repository is sometimes mandated by research funders.

Disciplinary repositories can also act as stores of data related to a particular subject, allowing documents along with data associated with that work to be stored in the repository.

Academic social networking websites

Among the most popular social networking sites for researchers are Academia.edu and ResearchGate, which offer services to the global research community.

In addition to building research social networks, users may post versions of their research papers in the same way as in other repositories.

Academic social media platforms are not considered the best means to preserve academic research for long-term access, as they do not currently meet the technical standards required by funders and institutions. However, they can be useful tools for connecting with researchers and sharing research. Publishers do not always permit scholarly sharing on such platforms under their copyright terms, or they may attach conditions, so it is important to check each publisher's guidance for authors before posting your research papers on these websites.

The University requires that all research publications are archived in BURA, in addition to any other repositories that are required by funders or are standard for a discipline, to maximise the reach and visibility of your research.

Subject or institutional repositories follow copyright verification workflows that ensure that copyright and access requirements are met and, therefore, that papers have no reason to be removed.

 

Open access repositories work within copyright law to legally disseminate academic works, making them freely accessible to researchers and the public. This matters because, when copyright has been transferred to a commercial publisher, the publisher, as the rightsholder, often puts restrictions on how the work may be used in repositories.

This diagram, originally published by HEFCE, highlights the publishing workflow and indicates some of the kinds of document versions you would expect to find for current research.

It is not uncommon to find all these document versions in repositories; it depends on what the rightsholder allows. The accepted post-print and published versions are the most valuable, because these have been subject to formal peer review.

HEFCE

 

 How do I access repositories relating to my subject?

Often a Google search is the quickest way to find these repositories. The contents of many repositories are also indexed by Google Scholar. ROAR-MAP and OpenDOAR provide details on the largest subject repositories on the web and are an excellent way of locating subject repositories relevant to your discipline.

In 2018 experts from UK Research and Innovation (UKRI) contributed to a new Google search tool to help scientists, policy makers and other user groups easily find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.

There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. As part of the UKRI commitment to easy access to data, its experts worked with Google to help develop the Dataset Search, launched on 6 September 2018.

Similar to how Google Scholar works, Dataset Search lets users find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. (Source: https://www.ukri.org/news/ukri-contributes-to-new-google-search-tool/).

More information on Research Data repositories will be available shortly.

NERC/STFC JASMIN computer. (Credit: STFC)

This content was authored by Katie Fortney and Justin Gonder (University of California) and is gratefully reused under the terms of © 2014 The Regents of the University of California [Creative Commons CC-BY License]   Site text by the University of California Office of Scholarly Communication is licensed under Creative Commons Attribution 4.0 International License. Article originally available at https://osc.universityofcalifornia.edu/2015/12/a-social-networking-site-is-not-an-open-access-repository/

A social networking site is not an open access repository

“What’s the difference between ResearchGate, Academia.edu, and the institutional repository?”

“I put my papers in ResearchGate, is that enough for the open access policy?”

These and similar questions have been common at open access events over the past couple of years. Authors want to better understand the differences between these platforms and when they should use one, the other, or some combination.

First, a brief primer on what each service has to offer:

ResearchGate and Academia.edu

ResearchGate and Academia.edu are social networking platforms whose primary aim is to connect researchers with common interests. Users create profiles on these services, and are then encouraged to list their publications and other scholarly activities, upload copies of manuscripts they’ve authored, and build connections with scholars they work or co-author with. Essentially these services provide a Facebook or LinkedIn experience for the research community.

Both services are commercial companies. Although Academia.edu has a “.edu” URL, it isn’t run by a higher education institution. The domain name was registered before the rules that would now prohibit this use went into effect, and the address was grandfathered in and later sold to the company. On its filings with the Securities and Exchange Commission it uses the legal name Academia Inc.

Open access repositories

Open access repositories come in two basic flavors:

  • Institutional repositories (IRs) are generally library-run websites that enable authors to upload a version of their manuscripts for public “open access” display. Brunel’s is called BURA. The primary aim of institutional repositories is to make the scholarly outputs of the university as widely available as possible and to ensure long-term preservation of these outputs.
  • Subject-based repositories collect publications in a particular discipline or a range of disciplines, so that authors in a field can share and solicit feedback on their work from colleagues in that field, regardless of where they work.

Next, let’s take a closer look at some of the major differences between these two kinds of services:

|  | Open access repositories | Academia.edu | ResearchGate |
| --- | --- | --- | --- |
| Supports export or harvesting | Yes | No | No |
| Long-term preservation | Yes | No | No |
| Business model | Nonprofit (usually) | Commercial. Sells job posting services. Hopes to sell data. | Commercial. Sells ads, job posting services |
| Sends you lots of emails (by default) | No | Yes | Yes |
| Wants your address book | No | Yes | Yes |
| Fulfills requirements of Brunel, REF and funder policies | Yes | No | No |

Openness and interoperability

We are often asked by researchers already using ResearchGate or Academia.edu why they should use a repository operated or recommended by the library instead (or as well), or alternatively: Why can’t the library just take my information from ResearchGate or Academia.edu and use that to populate the institutional repository?

The simple answer is: ResearchGate and Academia.edu do not permit their users to take their own data and reuse it elsewhere, nor do their terms of service permit the library to extract that data on the authors’ behalf.

  • ResearchGate: “Users must not misuse the Service. Misuse of the Service includes, without limitation: … automated or massive manual retrieval of other Users’ profile data (‘data harvesting’).”
  • Academia.edu: “You agree not to do any of the following: … Attempt to access or search the Site, … through the use of any engine, software, tool, agent, device or mechanism (including spiders, robots, crawlers, data mining tools or the like).”

Interestingly, ResearchGate permits you to import publications from other applications, but provides no method for getting that same data out of the ResearchGate ecosystem (well, not without some creative acrobatics). Similarly, Academia.edu previously supported import, but now makes it impossible to bring data in or out of their system.

Institutional repositories, on the other hand, are largely committed to complete openness and re-use of data.

  • They get involved with efforts like SHARE, a mission to build a comprehensive, free open data set of research activities and outputs. SHARE participants include sites like the Social Science Research Network, the Smithsonian Digital Repository, and DataONE: Data Observation Network for Earth – see a full list here.
  • They make their metadata – the information about what’s in the repository – interoperable and open by using standards like OAI-PMH. PubMed Central, arXiv, and BURA are all OAI-PMH providers – see more examples here. These kinds of activities make open access repositories good places for publications you want people to be able to find.
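
To give a sense of what OAI-PMH harvesting looks like in practice, here is a minimal Python sketch. It builds a standard ListRecords request URL and pulls the Dublin Core titles out of a response. The base URL (repository.example.org), the helper function names, and the sample record are all made up for illustration; a real repository publishes its own OAI-PMH base URL, and a real harvester would also need to handle paging (resumption tokens) and errors.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used for Dublin Core elements in oai_dc metadata.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every Dublin Core title found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Invented sample response, trimmed to a single record for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # The URL is only printed here; no network request is made.
    print(build_list_records_url("https://repository.example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Because the request is an ordinary URL and the response is plain XML, any library, aggregator or search service can collect a repository's metadata this way without special permission, which is the kind of reuse the social networking sites' terms of service prohibit.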

Some IRs also offer APIs that further expand what researchers can do with the publication data they provide. ResearchGate previously discussed offering an API so that the data they collect could help foster open science, but over two years after announcing this intention, no progress seems to have been made.

Long-term preservation and access

Open access repositories are usually managed by universities, government agencies, or nonprofit associations. Affiliation with a larger institution (with a public service mission) means that repositories are likely to be around for a long time. They often employ librarians and data specialists who specialize in ensuring long term archiving.

Academia.edu and ResearchGate are independent for-profit companies that could theoretically close up shop at any time (anyone remember pets.com?). Both sites disavow any duty to warn users if they shut down:

  • Academia.edu “reserves the right, at its sole discretion, to discontinue or terminate the Site and Services and to terminate these Terms, at any time and without prior notice.”
  • ResearchGate “reserves the right to change, reduce, interrupt or discontinue the Service or parts of it at any time.”

Business models

Less theoretical is the likelihood of a shift in these sites’ profit strategy. ResearchGate and Academia.edu are commercial sites, whereas most open access repositories are non-profits.

These academic social networking sites have each raised large amounts of initial funding: $17.8 million for Academia.edu, and $35 million for ResearchGate. They share funders with Uber, Snapchat, and Upworthy. Academia.edu’s largest funder is in a prolonged battle with the Surfrider Foundation and the California Coastal Commission over preventing public beach access. This isn’t particularly notable for a startup company, but it’s unusual for an “academic” site.

And as Kathleen Fitzpatrick recently pointed out when writing about Academia.edu, venture capital funds don’t last forever. “There are a limited number of options for the network’s future: at some point, it will be required to turn a profit, or it will be sold for parts, or it will shut down.” What are their options?

ResearchGate offers to help companies “reach the right professionals in science and research with targeted, on-page advertising.” Academia.edu hopes to be able to track what topics and articles are trending with their users and sell that information to R&D companies. [Note: subsequent to this post, Academia.edu’s CEO Richard Price told the Chronicle of Higher Education that the company is no longer planning on pursuing that idea.] Both companies host job listings, and either charge for premium placement of the job ad or the ability to list it at all.

Open access repositories, as mentioned above, usually get their funding from a host entity like a university or a government agency.

Use of your contacts and personal data

ResearchGate and Academia.edu don’t have a lot in common with open access repositories, but they do have a lot in common with other social networking sites like Facebook, LinkedIn, and Twitter. They even encourage users to connect those and other services and contacts to their ResearchGate and Academia.edu accounts – sometimes aggressively.

Part of Academia.edu’s account setup process automatically tries to connect to a user’s Facebook account. If the user is signed in to Facebook, a pop up appears, saying “Academia.edu will receive the following info: your public profile, friend list, email address, work history and education history.” The options for moving past this screen are “Find Facebook Friends,” “Back,” or “I don’t have a Facebook account.” No “No Thanks” or “Skip this step” – you have to fib or fork over your data.

Both sites have a long list of possible types of email notifications, all of which can be turned off, and all of which appear to be turned on as a default. ResearchGate faced criticism in the past for sending unwanted emails not only to users themselves, but also to users’ co-authors that claimed, erroneously, to be from the users themselves.

For better or worse, open access repositories are not social networking sites. Users can search for work by a particular author, but authors can’t build a friend or collaborator list, and usually can’t manage a profile page. The success of ResearchGate and Academia.edu demonstrates that this is a functionality that scholars find valuable, and new efforts like MLA Commons are trying to fill the gap.

The fine print

Whenever you sign up for a service, it’s a good idea to read the Terms of Use. Academia.edu’s terms give the company a license to make derivative works (like translations?) based on articles users upload to the site “in connection with operating and providing the Services and Content to you and to other Members.” ResearchGate’s terms include an agreement to have the user’s relationship with the company be governed by German law. And both sites have an indemnification clause, asserting that if the site faces any legal claims arising from things users upload to the site, the user will bear the cost.

Ok, great. But really: what should I use?

In the end, both types of services have unique offerings, and both likely hold some value for researchers. Academic social networking sites, such as ResearchGate or Academia.edu, might be valuable when trying to find others in your field conducting related research, or for providing access to your papers to those people you know use the site.

The value provided by the institutional repository, however, particularly its long-term preservation and commitment to open access, should not be overlooked. Until some public commitment has been made, it should not be assumed that the other services provide this, and they will not be considered “open access repositories” that meet the requirements of Brunel’s open access policies.

If your colleagues find a social networking site useful and you can manage the email notification settings, that site might be worth your time. On the other hand, as Kathleen Fitzpatrick writes, “everything that’s wrong with Facebook is wrong with Academia.edu.” If the typical behavior of commercial social networking sites bothers you – gathering users’ information for their own purposes – be as wary of those that target academics as you are of those with a more general audience. Whether or not you decide these social networking sites are right for you, remember that institutional repositories (such as BURA) enable you to share your research widely without trying to mine your address book. If you’re not already using BURA or another open access repository, take a few minutes to check out the services available to you, at no charge, from organizations that offer similar tools for broadening access to your publications, but that have no interest in making a profit from your work.