COVID19 etc.

It’s the return of the blog! I’ve been thinking about doing this for a couple of months now. #COVID19 wasn’t the impetus, but it definitely had an effect and pushed me forward to type the text below. Our social, labour, government, and economic structures have been thrown upside-down, I’m exhausted from work, and super-caffeinated 18 hours a day, so here we are.

  • COVID19 Ontario Summary File

For the past week, I’ve been collecting the summary stats posted by the Government of Ontario at this link and throwing them into a spreadsheet, hosted here. The Gov’t and our public health agencies in Canada are doing a great job in this time of crisis, but this work I’m doing is a required step right now because the provincial numbers are only cumulative snapshots of the provincial case count at the date and time that they’re posted. In other words, there is no room for historical analysis on the existing page. If you want to track the data as a trendline, you need the file I’ve posted.

So, by using the Internet Archive’s Wayback Machine, I’ve been harvesting older versions of the page, scraping out the data, and throwing it into the spreadsheet. Until the province has the time to put together a better document on their own, I’ll continue doing this. When this crisis ends, I’ll likely pull an ATI request to get a complete dataset right from the source. In the meantime, this is what we’ve got, provincially.
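For anyone who wants to repeat the harvesting step, here is a minimal sketch of how the archived copies of a page could be listed through the Wayback Machine's CDX search endpoint. It is not the process I actually used (mine was mostly manual), the function names are made up for illustration, and `example.org/status-page` is a placeholder rather than the province's real address.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url, start, end):
    """Build a CDX query URL that asks for at most one capture per day.

    start and end are YYYYMMDD strings. page_url is whatever page you
    want the history of (a placeholder is used in the example below).
    """
    params = {
        "url": page_url,
        "output": "json",
        "from": start,
        "to": end,
        "filter": "statuscode:200",   # skip redirects and error captures
        "collapse": "timestamp:8",    # first 8 digits of the timestamp = one per day
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_urls(cdx_json_text):
    """Turn the CDX JSON response (a header row plus capture rows) into
    the URLs of the archived copies."""
    rows = json.loads(cdx_json_text)
    if not rows:
        return []
    header, *captures = rows
    ts = header.index("timestamp")
    original = header.index("original")
    return [f"https://web.archive.org/web/{row[ts]}/{row[original]}" for row in captures]


# Placeholder address, not the real provincial page.
query = build_cdx_query("example.org/status-page", "20200301", "20200331")
```

Fetching `query` and passing the response text to `snapshot_urls` gives one archived page per day to scrape. Pulling the numbers out of each page still depends on how the page was laid out on that date, which is the part I did by hand.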

I’ve done a lot of this work by hand and need to automate some of it in the future. There are better methods and functions out there, but given the timeliness of the issue, I’m choosing to post now and re-learn the skills I used to have later. (See the last bullet below for context.)
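One piece that is easy to automate is turning the cumulative snapshots into a trendline. The sketch below assumes each harvested snapshot has been reduced to a (date, cumulative count) pair; the function name and the numbers are invented for illustration and are not Ontario's figures.

```python
from datetime import date


def new_cases_between_snapshots(snapshots):
    """Convert cumulative snapshots into changes between consecutive snapshots.

    snapshots: iterable of (date, cumulative_count) pairs, in any order.
    Returns a list of (date, new_cases_since_previous, days_elapsed) tuples.
    days_elapsed is reported so that a jump after a missing day is not
    mistaken for a one-day spike.
    """
    ordered = sorted(snapshots)
    changes = []
    for (prev_day, prev_total), (day, total) in zip(ordered, ordered[1:]):
        changes.append((day, total - prev_total, (day - prev_day).days))
    return changes


# Made-up numbers, with one day missing from the archive.
example = [
    (date(2020, 3, 23), 190),
    (date(2020, 3, 20), 100),
    (date(2020, 3, 21), 130),
]
for day, new_cases, gap in new_cases_between_snapshots(example):
    print(day.isoformat(), new_cases, f"over {gap} day(s)")
```

Keeping the gap length next to each difference matters here because the archive does not capture the page every day, so some rows cover more than one day of new cases.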

  • RDM Policy in Canada

So, the last time I posted about this topic (around the year 1961), RDM policy in Canada was still largely a set of intentions, motherhood statements and ideal states we’d like to get to. We knew what worked and what didn’t by way of looking at what other nations had instituted, but policy-setting and implementation – two very big, distinct, and slow-moving things – were still in their infancy.

Now, in spring 2020, we were expecting to see a policy announcement through Tri-Agency, but #COVID19 has got in the way. Word on the street is that a policy, based on the existing draft pillars of institutional strategy, data management planning, and data deposit, will still hit implementation this spring (or this year), but we need to give the Agencies space and time to deal with COVID19 themselves. As a firm believer in social distancing, I’ll give them that.

  • Where have all my coding skills gone?

Related to the first bullet above. There was a time I had enough harvesting skills by way of rudimentary tools and apps to easily harvest text from the web. I suppose that time ended about 5 years ago as my position responsibilities shifted, so that’s fine. But I’m really saddened to have lost these skills. I’ve been feeling a bit out of touch on this front for about a year now, to be honest, and this COVID19 harvesting has shone a spotlight on the issue. When things are all said and done, I think I’m going to allocate some leave time to re-learn what I used to know.

New Article on RDM and Collaboration (and Canada)

Image CC @ Wikipedia

This week, my article on research data management and collaboration inside and outside the academic library was published in Partnership. And here’s my shameless plug: you should go read it now. The article examines the different facets of research data management – collection, access, use, and preservation – and it locates them within the different parts of the academic library. It also advocates for real collaboration with our peers and stakeholders across the entire university, such as our colleagues in Research Offices and Research Ethics Boards (IRBs for our American friends).

The article also examines the current policy gap regarding RDM in Canada, as well as ongoing efforts by different groups to develop RDM provisions in our granting formulas, and to provide resources and share expertise in order to ensure that we don’t create a paper tiger. What’s needed is not just policy but action, and both must be considered in the same breath.

Here’s the article’s abstract:

Research data management (RDM) has become a professional imperative for Canada’s academic librarians. Recent policy considerations by our national research funding agencies that address the ability of Canadian universities to effectively manage the massive amounts of research data they now create has helped library and university administrators recognize this gap in the research enterprise and identify RDM as a solution. RDM is not new to libraries, though. Rather, it draws on existing and evolving organizational functions in order to improve data collection, access, use, and preservation. A successful research data management service requires the skills and knowledge found in a library’s research liaisons, collections experts, policy analysts, IT experts, archivists and preservationists. Like the library, research data management is not singular but multi-faceted. It requires collaboration, technology and policy analysis skills, and project management acumen.

This paper examines research data management as a vital information, technical, and policy service in academic libraries today. It situates RDM not only as actions and services but also as a suite of responsibilities that require a high level of planning, collaboration, and judgment, thereby binding people to practice. It shows how RDM aligns with the skill sets and competencies of librarianship and illustrates how RDM spans the library’s organizational structure and intersects with campus stakeholders allied in the research enterprise.

 

For what it’s worth, collaboration has been a real buzzword at IASSIST40, and I’ve already been to a few presentations that make arguments similar to mine and definitely share the same spirit. I hope we’re all on to something with this, and I hope that we in Canada can get up to speed with our counterparts in other countries.

Finally, this paper began in part from an Introduction to RDM session that I co-presented with Jeff Moon of Queen’s University at OLA in January 2014 (details here). Jeff has also written a great article on research data management, and it appears in the same issue of Partnership. He is on the forefront of RDM in Canada and knows how to get things done, so be sure to read his work, too.

-ms

2014 OLA presentation on RDM

(n.b.:  This post about my RDM co-presentation at the 2014 OLA SuperConference is very, very late. I had decided to let it go in February and not publish it. However, now that the term is winding down and I’m finding time to look back on the past three months, I’ve noticed that the event still sticks out in my mind, so I’ve decided to present the slides to everyone.   -m.)

What's a blog post without a wordle?

In January 2014, I co-presented a session, Research Data Management: an Introduction, at the OLA SuperConference in Toronto. Working with Jeff Moon of the Queen’s University Library, we led a thorough session on RDM basics to the crowd. Jeff, always the great teacher that he is, spoke first by introducing RDM – what it means in terms of stewardship and infrastructure, its place within Canadian librarianship, how it is implemented, etc. The Queen’s Library has developed a great model for RDM through the work of Jeff and Alex Cooper: it is scalable, it is built on local and consortial resources, and it develops in-house knowledge. It is one of several RDM units that Canadian librarians should investigate when considering RDM, in my opinion.

I followed Jeff by discussing RDM within the organizational context of the library and the university. In many ways, RDM touches upon all the “traditional” library functions: acquisitions and collections, access, reference and research, preservation, IT, etc., and I firmly believe that an RDM programme can’t fully succeed without engendering collaboration with colleagues from all these areas. Not all of our libraries have a wealth of resources to develop our RDM programmes, but we certainly aren’t going to get good mileage if we can’t bring together all library functions to develop our data collection infrastructure and use our information management expertise to serve this pressing need within the Canadian research enterprise.

There were two other major points I touched on in the presentation, which I want to reiterate in text. First, a strong RDM programme is going to require buy-in from your entire team of librarians, including liaison librarians. As others have noted (e.g., Witt; Hswe and Holt; Gabridge), liaisons are our colleagues with subject expertise and with developed networks within their departments. Furthermore, when RDM eventually becomes a Tri-Council obligation (as I expect it to become), we are going to require help from liaisons to make sure that we don’t drown in a deluge of work. Second, make sure you reach out to your campus Office of Research Services, your IT Services, and to your Research Ethics Boards. These groups are allies, they are valued stakeholders on campus, and they have a wealth of expertise in the research enterprise. Research Data Management is not the domain only of librarians, and we can’t forget that. RDM requires our colleagues across campus, both in the faculties and in administrative units. I hope to write more on this during the spring since I’m working on this particular area at the moment myself.

You can find slides from our presentation at this link, under Session 413, and I’ve embedded mine below.  (And yes, it was a damn fine cup of coffee.)

[slideshare id=33005685&doc=20140130olardmfinal02-140401160730-phpapp02]

Research Data Management Highlights: Digital Infrastructure Summit 2014 Summary Document

This week, a very significant document regarding the future of research data management and digital stewardship landed on my desktop. This is a PDF all academic librarians in Canada must read – whether or not you are tasked with RDM. If you are in IT, Research Facilitation, REB, Industry Compliance, or are a researcher or an administrator, then you should read this, too. It conveys the pressing importance of RDM to the profession, and it shows that we have an opportunity at hand if we take it – or a storm brewing if we turn it away.

The document is the Summary Report for the Digital Infrastructure Summit 2014. This conference was hosted by the Leadership Council for Digital Infrastructure in January 2014. Group representation included CARL, CRKN, CANARIE, TC3+, and CUCCIO; in all, 140 participants took part (p. 1). This document outlines the outcomes of the summit, which argued that RDM is lacking in Canada, that a sincere commitment to digital stewardship and not just technology is required to move forward, and that the time to act is now (p. 1). If you are a Canadian academic librarian, download the document and read it now.

Note: I was not a participant of this summit and am only summarizing the PDF in regards to RDM in Canada for librarians. I’m standing on the shoulders of giants when I write this post.

This document asks What is Digital Infrastructure (DI)?, considers the existing problems that are hampering the development of an effective DI in Canada, and traces a clear path forward on which the Canadian research enterprise should move. Research Data Management and the people involved in it are front and centre in this document, and this means academic librarians and preservationists. The library has a significant role to play, and we are expected to contribute.

Digital Infrastructure and Soft Skills

One of the document’s biggest takeaways – and what I argue should be one of the first talking points you should use when discussing research data management – is that digital infrastructure (DI) is far more than technology alone. The executive summary states in clear, plain language that digital infrastructure includes “our ability to capture, manage, preserve, and use data . . . data are infrastructure, as are the highly skilled personnel who facilitate access to data, computational power and networks” (p. 1). DI requires “skilled knowledge management personnel” (p. 1) who have technical capacity but who, as we see elsewhere in the document, can also participate in local and national policy formulation and interpretation, understand project management, and have the capacity to collaborate and lead in their own field and in others. These are a suite of advanced “soft skills” that are concomitant with IT knowledge and experience, and they are bound together with other essential criteria such as sustained funding and ongoing government and industry support, which allow research data management to flourish rather than wither on the vine. A successful solution that addresses near-term and long-term RDM issues requires skilled, committed resources on the ground who are leading the way. DI cannot be left to colleagues on limited term appointments or to our grad students. It demands institutional memory and it requires organizational vision.

I’ve mentioned the argument in the above paragraph in a post long ago, but I’ll take this opportunity to link out again. Chuck Humphrey states this in clear terms when he explains that RDM is the “what” and the “how”, and digital stewardship is the “who”, and both are necessary requirements in RDM infrastructure. If you are a librarian, then read Chuck’s website. If you are a Canadian librarian, then read it again. And again.

What’s wrong with Canada’s Digital Infrastructure?  

The Leadership Council has cut right to the chase in their document. They want you, the reader, to know right away that there are real issues affecting digital scholarship in Canada:

  1. Our research data are a national asset, and they are not stewarded properly (p. 1). Canada needs to get up to speed, quickly. It needs RDM and it needs it now. It requires data storage infrastructure it doesn’t have at present. It requires better skills training. It requires better software development. (Fellow Librarians: This is all about us.)
  2. There is very little governance and coordination (p. 1-2). There are many, many players, from funding agencies to libraries to standards organizations to researchers themselves. We are all trying hard to fix this, but we’re not working together. Our governance model is weak. Time is lost, efforts are duplicated, and we are spinning our wheels. (Fellow Librarians: This is very much about us. Get in there and make it happen.)
  3. There is very little federal policy regarding DI (p. 2). This is related to but distinct from the second point. With little direction from government, the community is looking in all directions all at once. Greater coordination, planning, and sustained, reliable investment would be beneficial to the national research enterprise. (Fellow Librarians: This, too, is about us. Do. Take part. Take charge.)

I suppose it's not a blog post if you don't add a wordle.

How to act. How to improve RDM. How to solve this crisis.

Note: I am focusing mainly on RDM and digital stewardship in this post; the original document gives equal attention to other areas such as governance, policy, and funding.

Research data management/stewardship is as yet the weakest link in the Canadian DI landscape, despite the massive increases in the amount of data being created daily through the research process. There is currently no agreed-upon strategy and/or the capacity to protect this valuable public asset, with little capacity to support access, use and reuse by a wide range of users. (p. 6)

The document makes a strong case not just for increased technical infrastructure but for greater knowledge management, project management, and policy analysis. We simply cannot allow ourselves to dump data files one after another onto a server and then hope that serendipity or an as-of-yet uncoded search algorithm will help us organize, preserve, and provide access to these files in the future. Research data – especially publicly funded research data – are a public good, and they require maintenance, management, and care.

The document highlights significant RDM gaps in Canada that must be addressed. These are:

  • Lack of a core RDM resource (p. 8)
    • Canada requires a national data service, which can lead in stewardship, policy, and education. RDC, CARL, and CRKN all have assets to contribute in this regard; RDC has shown incredible strength in this area already.
  • Lack of strategy (p. 9)
    • Canada has no high-level strategy framework guiding debate and decisions on standards, infrastructure and distribution access networks, obligations to existing international agreements, and funding.
  • Lack of Policy leadership (p. 9)
    • Tri-Council should take the next step and implement RDM policy under consideration.
  • Weak RDM culture (p. 9-10)
    • The benefits that RDM brings must be better articulated.
  • Lack of understanding of Digital Infrastructure (p. 10)
    • It is incumbent that stakeholders demonstrate to the greater community that digital infrastructure necessarily includes the data, and the professionals who steward them
  • Lack of training (p. 10)
    • RDM training is inconsistent at present and must be improved in the short-term for practitioners and researchers alike
  • Weak policy on long-term data lifecycle management (p. 11)
    • Like any collection, data must be managed in part because its supports are not without cost. Management will include asking tough questions like what should be preserved, if we have the means and capacity to preserve it, and for what length of time. I recommend that we all have discussions about data collection policies as soon as possible. Locally, in our consortia, and nationally.
  • Lack of Storage (p. 11)
    • Storage capacity for all disciplines must be addressed. RDM is in no way an “X not Y” proposition. We must serve all disciplines, departments, faculties, and researchers.
  • Means to foster acceptance (p. 11)
    • This is a tricky issue. We need our researchers to accept and be a part of RDM. Compliance should be required, but strict policies at the outset may provoke too much pushback. There will be give-and-take in the beginning.
    • Note: The original document refers to “compliance” here. I don’t want to use that term. Do we need sticks? Yes. Do we want to use them? Only if we have to. But from the outset, we must have the attitude that everyone is a partner in this venture.

Good data stewardship is not just a researcher’s responsibility, but it also needed at institutional, organizational, national, and disciplinary levels. (p. 10)

Making things happen and getting things done.

The LC provides a roadmap for action and results in its summary report from its 2014 Digital Infrastructure Summit. I am focusing on RDM-related activities and policy in this post since they are both so important to me, so I do encourage you to read the entire document yourselves to see the entire action plan.

The LC’s ways forward for RDM and policy include:

  • Maintain the Leadership Council and analyze its organizational structure (p. 17-18)
    • A steering committee is required and the LC has done a good job thus far. That said, there are clamours and a need for greater representation. Consider increasing membership, developing an executive committee, forming working groups, and establishing a Charter and Secretariat.
  • Engage government (p. 19)
    • The LC has developed a strong community-driven response to RDM challenges. That said, push government – again – for improved coordination of policy and funding.
  • Establish a national RDM network (p. 20)
    • Building on CARL and CRKN’s leadership and experience in this area, establish a network focusing on services, tools and tech, and education.
  • Create an RDM pilot (p. 21)
    • Develop pilot discipline-based RDM programmes in three domains: astronomy, social sciences, and medical genomics
  • Coordinate with CRKN’s Integrated Digital Scholarship Ecosystem (IDSE) (p. 22)
    • Engage with this initiative that will enable next-gen library collaboration for seamless access, and improved infrastructure
  • Develop an RDM metrics pilot (p. 22-23)
    • For assessment, understanding performance

If you have made it this far in the post, then I offer you my congratulations. There is a lot of information to synthesize, but it is vital that academic librarians in Canada understand what is on the horizon for our profession, and what role will be expected of us. As this post shows, the work that follows – the opportunity we can take hold of – is as much resource-related and people-related as it is tech-related. To discuss digital infrastructure is to discuss the people who make it happen. Research Data Management doesn’t happen on its own. RDM requires careful planning, policy interpretation, technical capacity, and a thorough understanding of resource management.  

And yes, this is an opportunity for us. But we must be ready for what is to come. RDM will soon become the coordinated response to big data in Canada as it is elsewhere in the developed world, and it will mean work. But this is our work. It is our field. Take heed, take note, ask questions, and get set. Plan for this, and get set to play a leading role, because things are going to get busy.

tl;dr : read this now.  apply it to your work.

 

Required Reading, 9 January 2014

Required Reading:

  • Joyce, R. (ed.) (2013) : Research Data Management: Practical Strategies for Information Professionals
    • “This volume provides a framework to guide information professionals in academic libraries, presses, and data centers through the process of managing research data from the planning stages through the life of a grant project and beyond. It illustrates principles of good practice with use-case examples and illuminates promising data service models through case studies of innovative, successful projects and collaborations.”
  • Vines, T.H., et al. (2013) : The Availability of Research Data Declines Rapidly with Article Age
    • This link truly is required reading. Vines et al. conduct a statistical analysis that shows the persistent decline over time in the availability of and access to research data in the scholarly literature (as well as the lowered chances of finding a working PI e-mail address). This is the proof you can give to doubting Thomases about the need for proper research data management and digital preservation principles.
    • “Our results reinforce the notion that, in the long term, research data cannot be reliably preserved by individual researchers, and further demonstrate the urgent need for policies mandating data sharing via public archives.”
    • [Mendeley link]

Required Reading, 8 January 2014

Required Reading:

  • OCLC : Starting the Conversation: University-wide Research Data Management Policy
    • “A call for action that summarizes the benefits of systemic data management planning and identifies the stakeholders and their concerns. It suggests that the library director proactively initiate a conversation among these stakeholders to get buy-in for a high-level, responsible data planning and management policy that is proactive, rather than reactive. It also addresses the various topics that should be discussed and provides a checklist of issues to help the discussion result in a supportable and sustainable policy.”
  • Chronicle : Born Digital, Projects Need Attention to Survive
    • “A team . . . often based in academic libraries or digital-scholarship centers – has to conduct regular inspections and make sure that today’s digital scholarship doesn’t become tomorrow’s digital junk.
      . . .
      Mr. Daigle advises scholars who want to pursue digital-humanities work to consult with their librarians and put long-term archiving strategies in place early on. ‘Think about the life cycle of preservation,’ he says. ‘The more you do that, the longer it’s going to be around, and that is time well spent.’”
  • The Tyee : What’s Driving Chaotic Dismantling of Canada’s Science Libraries?
    • On the ongoing dismantling of government research libraries in Canada
    • “I saw a private consultant firm working for Manitoba Hydro back up a truck and fill it with Manitoba data and materials that the public had paid for. I was profoundly saddened and appalled.”

Thoughts on CARL’s Research Data Management Course

Last month, I attended CARL’s 4-day course on Research Data Management Services in Toronto. (Jargon alert: CARL is the Canadian Association of Research Libraries). This was an intensive week of collaborating on research data management (RDM) practices and creating a community of practice within Canadian academic librarianship. Our concern for sound RDM practices at Canadian universities brought together librarians with all kinds and levels of expertise so that we could share tools and develop action plans that will make a positive impact in this field.

1. Research Data Management, Data Lifecycles, and Research Data Lifecycles

What is research data management? I won’t go into textbook detail; suffice it to say we’re talking about systematic practices that govern how research data are defined, organized, collected, used and conserved before, during, and after the research process. That sentence is a mouthful and it covers a lot of ground, so I suggest you look to Chuck Humphrey’s Research Data Management Infrastructure (RDMI) site for a more focused definition. Chuck is hailed in Canada for his data management expertise, and he led many sessions at the workshop. He explains that:

Research data management involves the practices and activities across the research lifecycle that involve the operational support of data through design, production, processing, documentation, analysis, preservation, discovery and reuse.  Collectively, these data-related activities span the stages of project-based research as well as the extended stages that tend to be institutionally based.  The activities are about the “what” and “how” of research data. (source)

Chuck’s website is a great introduction to the existing RDM gap in Canada, and we referred to it several times in the course. It neatly summarizes key information such as the shaky progress and history of RDM in Canada, where the Canadian RDM community stands in the world today, the differences between data management and data stewardship, and why the Canadian research community should focus its attention on building infrastructure to support RDM as opposed to building a national institution to guide it.

The Data Lifecycle (Source: UK Data Archive)

Beyond talking about what RDM is and isn’t, we spent a lot of time studying where RDM sits within the research lifecycle. Many people are familiar with the data lifecycle model since it introduces us to the many facets of data management. However, the CARL course proposed that we instead examine data management practices as an integral part of the larger research lifecycle. Rather than focusing only on data at the expense of the larger research project, the course facilitators asked us to apply RDM within the entire research process, using the following model from the University of Virginia:

Research Lifecycle (Source: UVa Library)

The salient point is that research data management isn’t limited to only the data life cycle; it affects the entire research process. (A simple example: data management strategies should be discussed well before data are created or collected.) Furthermore, if we want to develop sound RDM practices, we need to think like the researcher, understand the researcher’s needs, and include our work within their processes. If you’re not working with the researcher, then your RDM plan isn’t working.

2. Local RDM Drivers and Activities

If understanding what research data management is and where it affects the research process was one takeaway of the course, analyzing our local data environments was another:

  • RDM drivers, such as your library’s consortial collaborations, number of staff, existing IT relationships, administrative support, etc., are the parameters that shape and support your local RDM programme.
  • The activities in your RDM programme, meanwhile, can be broadly categorized into four areas: collection, access, use, and preservation (note: activities can fall into more than one category, and the order is not linear).

Discussing the things that affect our data landscapes and the activities we could perform helped us understand what is possible at our own libraries. I think a lot of us found this useful because all of our unique circumstances (e.g., library and university sizes, existing infrastructure and knowledge, etc.) can make RDM a bit nebulous at times. Although our focus is the same – RDM – our individual goals and aims might be different – are we building our technical capability, or are we designing soft systems that focus on relationships? Are we only collecting new locally created data, or will we also gather existing, completed projects?  The answers are going to depend on your local situation.

RDM activities within the research process.

The course facilitators were careful to help participants understand RDM as a necessarily scalable enterprise. Don’t create a monster RDM plan. Instead, contextualize your local RDM drivers and your library’s capabilities and desires so that you can mitigate the risks of creating an RDM plan that doesn’t fit your organization. The aim is to create a system and process that brings clear benefits to the researchers.

[youtube=http://www.youtube.com/watch?v=xos2MnVxe-c]

3. Planning… and Doing

The final takeaway from the CARL RDM course, which you may have noticed I’ve been building up to, was straight-up, no-nonsense, get’er-done planning. The course facilitators built opportunities for real action into the course, which is probably one of the best parts of the week. Generally speaking, the academic enterprise undertakes a lot of talk and high-level planning before things happen. This is often a good thing (read: I demand critical inquiry), but it can also stifle action (read: I despise institutional inertia). However, this CARL course found a way to bring together discussion and action. It gave us theory, but it demanded practice. Before the week was out, we had all talked about 3-year planning, considered how such a plan might look locally, and started to write one. Of course, these drafts aren’t ready for prime time, but my point is that before I came back to the office on Monday, I already had written the skeleton of a research data management plan that shows my library’s potential RDM activities and stakeholders, outlines activities and scopes, and offers timelines and deliverables. It didn’t make me an expert (and neither do I claim to be one), but it did offer some tools to help the library step out and make positive change.

So was the CARL RDM course money well spent? It sure was. It’s not too often you come back from an event with a new community of practice, insight on a vital part of the research enterprise, and a plan to put everything in action. Hats off to the course facilitators for putting on such a great week – I think you’ve started something necessary, and good, for Canadian research.

(And some time next week, I’ll start gathering up some of the key readings from some of the bibliographies they presented us…  I’ll try not to turn the next post into a lit review, but it may come close to it.)