Research Data Management Highlights: Digital Infrastructure Summit 2014 Summary Document

This week, a very significant document regarding the future of research data management and digital stewardship landed on my desktop. This is a PDF all academic librarians in Canada must read – whether or not you are tasked with RDM. If you are in IT, Research Facilitation, REB, Industry Compliance, or are a researcher or an administrator, then you should read this, too. It conveys the pressing importance of RDM to the profession, and it shows that we have an opportunity at hand if we take it – or a storm brewing if we turn it away.

The document is the Summary Report for the Digital Infrastructure Summit 2014. This conference was hosted by the Leadership Council for Digital Infrastructure in January 2014. Group representation included CARL, CRKN, CANARIE, TC3+, and CUCCIO; in all, 140 participants took part (p. 1). The document outlines the outcomes of the summit, which argued that RDM is lacking in Canada, that a sincere commitment to digital stewardship and not just technology is required to move forward, and that the time to act is now (p. 1). If you are a Canadian academic librarian, download the document and read it now.

Note: I was not a participant in this summit and am only summarizing the PDF with regard to RDM in Canada for librarians. I’m standing on the shoulders of giants when I write this post.

This document asks “What is Digital Infrastructure (DI)?”, considers the existing problems that are hampering the development of an effective DI in Canada, and traces a clear path forward for the Canadian research enterprise. Research Data Management and the people involved in it are front and centre in this document, and this means academic librarians and preservationists. The library has a significant role to play, and we are expected to contribute.

Digital Infrastructure and Soft Skills

One of the document’s biggest takeaways – and what I argue should be one of the first talking points you use when discussing research data management – is that digital infrastructure (DI) is far more than technology alone. The executive summary states in clear, plain language that digital infrastructure includes “our ability to capture, manage, preserve, and use data . . . data are infrastructure, as are the highly skilled personnel who facilitate access to data, computational power and networks” (p. 1). DI requires “skilled knowledge management personnel” (p. 1) who have technical capacity, but who, as we see elsewhere in the document, can also participate in local and national policy formulation and interpretation, understand project management, and have the capacity to collaborate and lead in their own field and in others. This is a suite of advanced “soft skills” that is concomitant with IT knowledge and experience, and these skills are bound together with other essential criteria such as sustained funding and ongoing government and industry support, which allow research data management to flourish rather than wither on the vine. A successful solution that addresses near-term and long-term RDM issues requires skilled, committed resources on the ground who are leading the way. DI cannot be left to colleagues on limited term appointments or to our grad students. It demands institutional memory and it requires organizational vision.

I’ve mentioned the argument in the above paragraph in a post long ago, but I’ll take this opportunity to link out again. Chuck Humphrey states this in clear terms when he explains that RDM is the “what” and the “how”, and digital stewardship is the “who”, and both are necessary requirements in RDM infrastructure. If you are a librarian, then read Chuck’s website. If you are a Canadian librarian, then read it again. And again.

What’s wrong with Canada’s Digital Infrastructure?  

The Leadership Council has cut right to the chase in their document. They want you, the reader, to know right away that there are real issues affecting digital scholarship in Canada:

  1. Our research data are a national asset, and they are not stewarded properly (p. 1). Canada needs to get up to speed, quickly. It needs RDM and it needs it now. It requires data storage infrastructure it doesn’t have at present. It requires better skills training. It requires better software development. (Fellow Librarians: This is all about us.)
  2. There is very little governance and coordination (p. 1-2). There are many, many players, from funding agencies to libraries to standards organizations to researchers themselves. We are all trying hard to fix this, but we’re not working together. Our governance model is weak. Time is lost, efforts are duplicated, and we are spinning our wheels. (Fellow Librarians: This is very much about us. Get in there and make it happen.)
  3. There is very little federal policy regarding DI (p. 2). This is related to but distinct from the second point. With little direction from government, the community is looking in all directions all at once. Greater coordination, planning, and sustained, reliable investment would be beneficial to the national research enterprise. (Fellow Librarians: This, too, is about us. Do. Take part. Take charge.)

I suppose it's not a blog post if you don't add a Wordle.

How to act. How to improve RDM. How to solve this crisis.

Note: I am focusing mainly on RDM and digital stewardship in this post; the original document gives equal attention to other areas such as governance, policy, and funding.

Research data management/stewardship is as yet the weakest link in the Canadian DI landscape, despite the massive increases in the amount of data being created daily through the research process. There is currently no agreed-upon strategy and/or the capacity to protect this valuable public asset, with little capacity to support access, use and reuse by a wide range of users. (p. 6)

The document makes a strong case not just for increased technical infrastructure but for greater knowledge management, project management, and policy analysis. We simply cannot allow ourselves to dump data files one after another onto a server and then hope that serendipity or an as-of-yet uncoded search algorithm will help us organize, preserve, and provide access to these files in the future. Research data – especially publicly funded research data – are a public good, and they require maintenance, management, and care.

The document highlights significant RDM gaps in Canada that must be addressed. These are:

  • Lack of a core RDM resource (p. 8)
    • Canada requires a national data service, which can lead in stewardship, policy, and education. RDC, CARL, and CRKN all have assets to contribute in this regard; RDC has shown incredible strength in this area already
  • Lack of strategy (p. 9)
    • Canada has no high-level strategy framework guiding debate and decisions on standards, infrastructure and distribution access networks, obligations to existing international agreements, and funding
  • Lack of Policy leadership (p. 9)
    • Tri-Council should take the next step and implement RDM policy under consideration.
  • Weak RDM culture (p. 9-10)
    • The benefits that RDM brings must be better articulated.
  • Lack of understanding of Digital Infrastructure (p. 10)
    • It is incumbent that stakeholders demonstrate to the greater community that digital infrastructure necessarily includes the data, and the professionals who steward them
  • Lack of training (p. 10)
    • RDM training is inconsistent at present and must be improved in the short-term for practitioners and researchers alike
  • Weak policy on long-term data lifecycle management (p. 11)
    • Like any collection, data must be managed in part because its supports are not without cost. Management will include asking tough questions like what should be preserved, if we have the means and capacity to preserve it, and for what length of time. I recommend that we all have discussions about data collection policies as soon as possible. Locally, in our consortia, and nationally.
  • Lack of Storage (p. 11)
    • Storage capacity for all disciplines must be addressed. RDM is in no way an “X not Y” proposition. We must serve all disciplines, departments, faculties, and researchers.
  • Means to foster acceptance (p. 11)
    • This is a tricky issue. We need our researchers to accept and be a part of RDM. Compliance should be required, but strict policies at the outset may provoke too much pushback. There will be give-and-take in the beginning.
    • Note: The original document refers to “compliance” here. I don’t want to use that term. Do we need sticks? Yes. Do we want to use them? Only if we have to. But from the outset, we must have the attitude that everyone is a partner in this venture.

Good data stewardship is not just a researcher’s responsibility, but it [is] also needed at institutional, organizational, national, and disciplinary levels. (p. 10)

Making things happen and getting things done.

The LC provides a roadmap for action and results in its summary report from its 2014 Digital Infrastructure Summit. I am focusing on RDM-related activities and policy in this post since both are so important to me, but I do encourage you to read the document yourselves to see the entire action plan.

The LC’s ways forward for RDM and policy include:

  • Maintain the Leadership Council and analyze its organizational structure (p. 17-18)
    • A steering committee is required and the LC has done a good job thus far. That said, there are calls, and a need, for greater representation. Consider increasing membership, developing an executive committee, forming working groups, and establishing a Charter and Secretariat
  • Engage government (p. 19)
    • The LC has developed a strong community-driven response to RDM challenges. That said, push government – again – for improved coordination of policy and funding
  • Establish a national RDM network (p. 20)
    • Building on CARL and CRKN’s leadership and experience in this area, establish a network focusing on services, tools and tech, and education
  • Create an RDM pilot (p. 21)
    • Develop pilot discipline-based RDM programmes in three domains: astronomy, social sciences, and medical genomics
  • Coordinate with CRKN’s Integrated Digital Scholarship Ecosystem (IDSE) (p. 22)
    • Engage with this initiative that will enable next-gen library collaboration for seamless access, and improved infrastructure
  • Develop an RDM metrics pilot (p. 22-23)
    • For assessment and for understanding performance

If you have made it this far in the post, then I offer you my congratulations. There is a lot of information to synthesize, but it is vital that academic librarians in Canada understand what is on the horizon for our profession, and what role will be expected of us. As this post shows, the work that follows – the opportunity we can take hold of – is as much resource-related and people-related as it is tech-related. To discuss digital infrastructure is to discuss the people who make it happen. Research Data Management doesn’t happen on its own. RDM requires careful planning, policy interpretation, technical capacity, and a thorough understanding of resource management.  

And yes, this is an opportunity for us. But we must be ready for what is to come. RDM will soon become the coordinated response to big data in Canada as it is elsewhere in the developed world, and it will mean work. But this is our work. It is our field. Take heed, take note, ask questions, and get set. Plan for this, and get set to play a leading role, because things are going to get busy.

tl;dr : read this now.  apply it to your work.


September Projects, 2013

I’m stating the obvious by telling you that September is a busy month in academia.  The start of the school calendar changes the mood, tempo, and pulse of a university campus, and it shifts things at the library into full gear as we roll out all programming and services. Here are a few of the things I’ve been contributing to lately, which have kept me busy in a good way (as opposed to the bad kind of busy).

Changing liaison duties

I’ve taken on liaison duties in Sociology and Social Work while a colleague is on sabbatical this year, and I’m also part of a group that is expanding the Library’s services to the University’s students who are cross-registered at the Balsillie School of International Affairs.  This has already translated to a large increase in my in-class instruction, especially for Social Work, which is its own Faculty and has its own small library. Taking on these subject-based duties has been a great opportunity since they’ve given me greater everyday contact with faculty and researchers whose work touches on socio-economic data or would benefit from research data management support and consultation.  Simply put, it’s a lot easier to push data management planning when you already have a built-in relationship with the researcher, so I expect my subject-based work in Sociology and Social Work to benefit our RDM programme.

Outreach to Faculty and Students

This term, I’m offering a full slate of seminars on research data management, bibliometrics, and data access through the library. I developed these seminars with a graduate student/faculty audience in mind, partly to help the Library increase its presence within graduate programming and in the university’s research enterprise. While I don’t expect large numbers, because this is the first time in a few terms that we have developed a suite of seminars with graduate or faculty research in mind, I do hope they begin to build on our growing profile as a centre of research facilitation on campus.  (This could be a blog post in its own right; I may have to write more on it in the future.)

Citation management

Our Library has taken a close look at RefWorks and has also considered what kinds of citation management systems our users use.  What we’ve known all along is that many people use RefWorks and many people do not.  We asked ourselves why we commit our support to only one service when our users will always work with their personal preferences in mind, and we decided that giving information and advice on RefWorks alone just doesn’t cut it.  If we are to support or know something about citation management and research management, then we shouldn’t limit ourselves to only one tool.  Going forward, our library is supporting RefWorks, Zotero, and Mendeley by offering instructional sessions, consultation, and in some cases, even research collaboration between the researcher and the librarian through these tools.

Data and Statistics

Don’t think that this project or the next (RDM) are subordinated to the others I’ve mentioned because they’re at the bottom of this list.  That’s far from the truth, as both areas have seen significant change in the past few months.  On the Data and Stats side alone, data librarians in Canada are busy dealing with a new EULA for Canada Post postal code products (e.g., the PCCF) and what it means for researcher access and use. This issue alone has eaten up probably half of my time in the past two weeks, since the new EULA changes long-standing practices for researchers who use these products and for the library, which administers licences on their behalf. A lot of time has been given to consultation within the data community and within the library to produce new practices, and I’m now rolling out a PR and education campaign.  If you are Laurier faculty and use postal code products, be on the lookout for more news on this very shortly.  If you’re faculty at another university in Canada, you may want to contact your own data librarian.

Research data management

RDM has been the largest part of my work at Laurier.  We’ve been developing a research data management programme, based largely on consultative support that helps researchers and research groups learn about and then develop flexible data management plans that speak to their current and future research needs.  Next month, I’ll be attending CASRAI’s ReConnect2013 conference to build upon my current knowledge and to see and learn what others are doing in this area across Canada and in other jurisdictions.  We have a couple of RDM projects on the go already, and I would like to increase that number on campus since the service the Library provides offers clear benefits to the researchers in terms of meeting funding obligations, providing research management planning, and improving access to and citation of produced work after the research has ended.  Laurier researchers: let’s chat.

Link: Measuring a library’s holdings based on its “uniqueness”

Here’s a Monday morning link for all y’all. Dan Cohen notes an interesting way to measure a library’s holdings: by evaluating the collection’s “uniqueness.”

This may be an interesting metric that could be useful at the local-consortial level? I’ll let the Collections Librarians answer that, though.  Read it here:

Dan Cohen: Visualizing the Uniqueness, and Conformity, of Libraries