Big Data or Big Brother? Information collection at an academic library

Academic libraries are interesting places. Right now there is a huge push towards assessment and linking library use to student success. In the library world, library use goes beyond book checkout, and encompasses online use (whether on or off campus), instruction classes, tutorials, physical use of the space, reference questions asked, and the use of special services. So basically, anything library related that can be quantitatively or qualitatively measured. Student success can be defined in many different ways as well: graduation rate, graduation rate within a time frame, grades, or even how quickly a graduate lands a job. So why is this important to libraries? Well, if a library can prove that it is essential to the university and the university’s continued existence, if it can prove that it will increase (or at least maintain) the university’s reputation and attraction of new students, then the university will be more likely to allocate funds to said library. Yes. No matter how you dress it up, it’s all about the Benjamins.

Ok, it may not be completely finance-driven. I personally feel that I make a difference in the lives of students, faculty and staff. I get all warm and fuzzy just thinking about it. But am I really making a difference? And what if I’m wasting my time doing something when I should be doing the exact opposite? You can see why data-driven decisions are important.

But there is another factor in the works. Where do we get this data? From our users. This means monitoring what they do, what they check out, and how they use our services. This type of monitoring has become ubiquitous in today’s society. My Fitbit literally knows every step I take. My phone knows where I’m going before I do, and has a much better sense of direction. My Amazon account knows me better than my husband. As consumers, we have become accustomed to this kind of data collection and think it is the norm.

But libraries are a little different. All libraries, academic libraries included, have a strong commitment to intellectual freedom and against censorship. The very idea of being observed may influence a user’s behavior. One way to think of it is this: The book that goes missing from the shelves the most in our library is a very large book about human sexuality. We’ve had to replace it several times. It doesn’t get checked out…it just disappears. Our users don’t want to be seen checking out this book about human sexuality, because they don’t want to be observed by their peers and library staff. What happens, then, to circulation when students know that we are observing their behavior? Will it change their behavior? Will they self-censor and only check out books they think are “appropriate”?

Now let’s move a step further into predicting user behavior, or even steering user behavior. Is this appropriate for an academic institution where the focus is on learning and critical thinking? This has ethical implications that are not often considered in the business world.

The library I currently work for made a decision in 2004, when it changed its ILS (integrated library system), to deliberately blind itself to certain kinds of data. We are unable to access records of patron activity once a patron has returned an item (as long as there is no fine, etc). So while we can see that John Doe has checked out 23 books since the account was created, that is the only information we can access. As someone who wanted to attempt to make a simple recommender system, similar to (but more primitive than) what you would find on Amazon or Netflix, I found this both frustrating and disappointing.

But there was a history there. The decision was made in the wake of 9/11, with the passage of the Patriot Act and the fear of government surveillance. This is not an unfounded fear, even up to today, although as of 2016, section 215, the “libraries provision” was allowed to expire. But even so the prevailing method of prevention in this case was to sidestep the issue. If we don’t collect the information in the first place, then we can’t provide it. This is not standing up for what is right, this is avoiding the issue, and avoiding trouble.

There is nothing wrong with wanting to avoid trouble! But this avoidance is hamstringing future possibilities. We are basically saying as librarians that we cannot be trusted with data.

My library director, my supervisor and I have had discussions about data collection, most recently in relation to a change in our interlibrary loan article service. We are working in cooperation with other libraries, and as such had to make data collection decisions together. I, of course, recommended getting the data in its raw form, before it had been anonymized. I was asked why we would want the requesting patron’s identification if the details are just for statistical purposes. I replied with the following (via email…I’m not this rhetorical in person):

I’ve been thinking about this for a while now… and while I understand and respect the historical reasons for the reluctance to collect this kind of information (Patriot Act, etc), I think we need to also look at what is happening with data all around us every day, and have a conversation about what we are potentially missing out on by not collecting this kind of data.

In the past we’ve had patrons that request the exact same article multiple times- this may be a way to identify these patterns in a systematic way. This information also could be useful for collection development purposes: This journal has been requested 8 times…by the same person. This journal has been requested 7 times…by 7 different people. I would think that this would make a difference. It wouldn’t have to be the patron’s name, but some unique identifier.
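As a rough sketch of that distinction (in Python, with a made-up request log and pseudonymous IDs standing in for "some unique identifier"; the function name `summarize_requests` is just a placeholder of mine):

```python
from collections import defaultdict

# Hypothetical request log: (pseudonymous patron ID, journal title).
# "Journal A" is requested 8 times by one person,
# "Journal B" is requested 7 times by 7 different people.
requests = [("p01", "Journal A")] * 8 + [
    (f"p{n:02d}", "Journal B") for n in range(2, 9)
]


def summarize_requests(log):
    """Return {journal: (total_requests, distinct_patrons)}."""
    totals = defaultdict(int)
    patrons = defaultdict(set)
    for patron_id, journal in log:
        totals[journal] += 1
        patrons[journal].add(patron_id)
    return {journal: (totals[journal], len(patrons[journal])) for journal in totals}


summary = summarize_requests(requests)
for journal, (total, distinct) in summary.items():
    print(f"{journal}: {total} requests from {distinct} patron(s)")
```

Without any identifier at all, both journals would simply show up as "requested 7 or 8 times," and the difference between one heavy user and broad demand would be invisible.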

This can be used for data analysis purposes as well- Is there a correlation between student success and library usage (in this case interlibrary loan usage)? Jane Smith has ordered xxx articles and has xxx GPA. John Doe has ordered 1 article and has xxx GPA. Is it totally necessary? No, but I would argue that by collecting this kind of information we can answer some of the broader questions that allow us to justify budgets.
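A rough sketch of what that analysis could look like, in Python with entirely made-up numbers (no real patron data; `ill_requests` and `gpa` are placeholder names of mine). It computes a plain Pearson correlation by hand, which would only show whether the two move together, not whether one causes the other:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented values, one entry per (pseudonymous) student.
ill_requests = [0, 1, 3, 5, 8, 12]
gpa = [2.9, 3.4, 2.7, 3.1, 3.6, 3.3]

r = pearson(ill_requests, gpa)
print(f"Pearson r = {r:.2f}")
```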

Much of our consortial reporting is currently confusing. If the information is pulled one way there is one answer, and if it is pulled another way the numbers can be completely different. But currently we are unable to identify why the numbers are so different- the process is opaque (even after speaking with consortial staff). By being able to track a single person, it may be easier to see where the discrepancies and variances happen, and thus get a more accurate report.

I would just like to add that I strongly urge us to recommend this, but considering how conservative the cluster has been historically about collecting data, I don’t anticipate it being adopted. This could start a conversation, however. As universities rely increasingly on metrics and data collection, as public libraries use past checkouts to drive recommendation systems, and as society grows accustomed to data collection by corporations in our everyday lives, why do we not trust that entities such as academic libraries can be ethically responsible with this type of data collection? Why do we continue a “no touch” policy when this information could be used to impact budgets? We are missing opportunities to demonstrate our relevance, get buy-in and help our students, faculty and staff by building recommender systems, or even just giving our users access to their past checkout history for their own edification.

We can’t just ignore what is happening with data and data collection today. If we do, we will be in danger of becoming obsolete in order to maintain the status quo. But it is our responsibility to consider the ethical implications that data collection may have on user behavior. It is important to think about and come up with a plan. Can we foresee unintended consequences? Who is responsible for data security? What kind of information will we make publicly available? These are all considerations that need to be taken into account.

Posted in Data Science, etc., Library Trends

Organizing the Badass Networker: Part 3 of 3

In the past couple of posts I’ve described my changing attitude towards networking, and even discovered that I’m not the horrible networker that I had once thought. But taking all that learning and translating it into action, that is always the most difficult part.

In one part of the Networking for Success class (shout out to Alana Muller, instructor extraordinaire), we had one activity where we wrote down a few people who were already in our network (or in other words, our friends and colleagues). I wrote down people whom I would be comfortable contacting, but that I didn’t necessarily stay in touch with all the time. I thought that this would be a great time to rekindle friendships and get back in touch with colleagues I had fallen out of touch with. I plan on trying to connect with one of them a week.

One connection a week is pretty much the lowest bar I could set. And the first two days after that class, that was not my initial goal. Riding high on endorphins and caffeine, I was ready to go out and network the pants off of Kansas City. And then two more days went by and I hadn’t even started the reflection assignments due for the class. Two more days, and nothing but some guilt. So I realize that I could, and possibly should, be more ambitious, but I also want to actually follow through.  This means setting reasonable, attainable goals.

I recently listened to a fascinating podcast about David Halpern and the Nudge Unit in the United Kingdom (I would link to the podcast, but I honestly can’t remember if I heard it on Freakonomics, the Ted Radio Hour, or one of the data science/machine learning podcasts I listen to), and then I read Amy Cuddy’s book, Presence, which talks about self-nudges. I really want to improve the communication with people in my life and the future people I haven’t met. It would be so easy to take all the enthusiasm I have right now and make very elaborate and detailed plans. So, so easy. (Actually, I’ve already done it.)

But I don’t want to create a plan that would be completely unmanageable. So instead, I’m going to work on small behavioral changes, and setting up systems that make it easy for me to follow through on my plans. In order to make it as easy as possible, I thought that I could create a database (decided to use Airtable because I like their mobile interface…yes, still using it!), and set up a system in Zapier so that it automatically connects with my calendar so that I can get updates and reminders. Automation is a wonderful thing, my friends. It does the heavy lifting of remembering dates, times, and information so that I can do the actual connecting.

A side note: I thought about using some of my coding chops to create my own program that would do this work instead of using these third-party sites, and I might still eventually, but this gets me up and running right now, using programs I am familiar with.
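If I did roll my own, the core of it might look something like this minimal Python sketch. To be clear, this is not what Airtable or Zapier do under the hood, and it doesn't call either service; the contact records, field names, and the `next_contact` and `reminder` functions are all invented for illustration. The idea is simply to nudge me toward whoever I've gone the longest without contacting:

```python
from datetime import date

# Hypothetical contact records; in my setup these would live in the Airtable base.
contacts = [
    {"name": "Ada Example", "last_contacted": date(2017, 1, 10)},
    {"name": "Ben Sample", "last_contacted": None},  # never contacted
    {"name": "Cy Placeholder", "last_contacted": date(2016, 11, 2)},
]


def next_contact(records):
    """Pick the person contacted longest ago; never-contacted people come first."""
    return min(records, key=lambda r: r["last_contacted"] or date.min)


def reminder(records, today):
    """Build the one-line weekly reminder text."""
    person = next_contact(records)
    return f"{today.isoformat()}: reach out to {person['name']}"


print(reminder(contacts, date.today()))
```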

I’ve included an example of my networking database. All names and information are fictional, of course. So that I don’t forget, I named the database “What Can I Do For You” and used a coffee mug icon. The picture is of the web version, but there is also a phone app, so this information will be available to me on the go, as well.


The best part is that when going to a networking event or conference, I can connect the person to the event. It should be easy at an event to find a spare moment and use my phone to create a new “card” with someone’s name, then go back and put in all relevant information later. If there is a person with whom I want to connect, I can add them beforehand. Also, instead of having a separate app for cards, why not take pictures of business cards and upload them to a person’s file as well? I’m still in the tweaking phase with this, but once the structure is set up, adding and updating people is a snap.

It has been a few weeks since I’ve taken the class. I will admit to cheating a little with my networking goals- there were two weeks where instead of reaching out, I instead followed up with incoming networking requests to me. In the past, I may not have been quite as proactive with my response and attempts to connect, but now I recognize these requests as what they are: opportunities to help other people and grow.

What system do you use to organize your networking?  If you have a system that works, I’d love to hear it!


Posted in Networking

Networking is Badass: Part 2 of 3

In my previous post, I talked about a networking course I attended through Rockhurst University. This course was presented/instructed by Alana Muller of Coffee Lunch Coffee.

I walked into class on Saturday with a giant cup of caffeine. We were supposed to be there at 8 a.m. and some of those brown-nosers known as my classmates got there at 7:30. Actually, that was a really smart move, especially if they wanted to talk with Muller one-on-one, and I’m jealous because I thought of it, but…Saturday. Don’t ask for miracles.

The first thing we were tasked to do was to reflect on the previous day’s session and our thoughts about networking. I had a mental image of networking as a gray-suited transaction, with a power element involved. Kind of hard and cold. Or like that one scene in Jerry Maguire. Not THAT scene. The one where Tom Cruise is networking. Eh. Perhaps I need to watch that again.

But Alana Muller stressed that networking was just the opposite of my impression. She stressed that the idea was to create opportunities to connect with interesting people and have interesting conversations. The connection may lead to opportunities, or it may not, but no connection was wasted.

And looking at it this way, I actually do an amazing amount of networking; I just never realized it. I called it having friends. I called it reaching out to my professional community. I called it coaching or mentoring, something I’m very passionate about. Connecting with my university community- students, faculty and staff. Building community. Caring about people. Reciprocity. Being nice. I called it anything but that horrible “networking” word.

The weirdest part? I’m really kind of good at it. I think meeting strangers is always intimidating, but I don’t meet strangers at work. I don’t meet strangers at library conferences. I don’t meet strangers while taking classes. I meet library users (or potential library users), colleagues and classmates.

During this workshop there were panel discussions, which I thought were helpful because even though there was a lot of agreement with Muller about the idea of networking, there were also distinct differences in how individuals approached networking. The differences had to do with age, personality, and what industry each person was involved in, and so I was able to see how approaches vary. As in so many things, there is no one right way.

One of the panelists said that when you are meeting with someone, especially when you have an “ask”, or are going to ask them to provide you with something (help, contacts, what-have-you), you should first ask the question, “How can I help you?” This one phrase encapsulates my new understanding of networking.

And when I think of networking like that, it’s easy. It is in keeping with my personal values, with a core tenet of my teaching philosophy, with my experience and inclination as a librarian (we LOVE to help people), and even with the Jesuit values of the institution where I work (cura personalis, anyone?). I do it every day, and it is so simple and easy to extend that out to other aspects of my life.

The practical components in this course were great. I know that incorporating some of these ideas into my life will shine up the rough spots, and help me to find connections with people in a natural way. But taking my worldview and subverting the dominant paradigm of “networking is horrible” into “networking is about real connections and real people?” That is just badass.

Posted in Networking

Networking is a Vampire: Part 1 of 3

I call myself an introvert, and people are usually very surprised. I like talking to people. I talk all the time. I have no problem speaking up and asking questions (actually, I have to restrain myself). I actually enjoy getting up in front of a crowd of people and giving a rockin’ presentation. I have been shushed in the library, people, and I’m a librarian! So why do I consider myself an introvert? To me, extroversion is where a person becomes energized by being around people, and introversion is where she is drained by it. I have been known to recharge by spending an entire weekend without saying a word. But more than that, I know that I am an introvert because I hate, hate, hate networking.*

Networking is awkward. It’s exhausting. And I’m horrible at it. Talk to random strangers at networking events? Nope. And because I apparently like to throw myself into stuff that I am bad at, last weekend I spent Friday night and Saturday taking a networking class.

Imagine this: it’s Friday afternoon, and this is the first day of sun after two weeks of rain. I’ve had a full day at work, and all I want to do is go and sit outside on the patio with a book and maybe an alcoholic beverage. I’m tired, I’m cranky, and I’m going to spend 3.5 hours in a big room full of other tired and cranky people.

But somehow, I walked out of the Friday class energized, and feeling that maybe, just maybe, networking is not the horrible, soul-killing endeavour that I had thought it was. The great Alana Muller of Coffee Lunch Coffee brought a ton of presence to the workshop and positivity to the subject. Her energy was off the charts, and she turned my attitude around.

Muller was really great at breaking down the term “networking” into achievable components: here’s how you put together your introduction, here’s how you get started with contacting people. I feel like I took away some tangible tools that I can work with, but I know that putting them into practice will be the difficult part.

On a side note, some people in the class were already really good at networking, which was surprising. They were involved in networking every day because they were recruiters or in sales. Hearing these experts talk about feeling awkward, or needing areas to work on was a bit of an eye-opener to me. In retrospect, it seems silly. If your livelihood depends on connecting with other people, why wouldn’t you jump at the chance to work at it?

We broke for the day, and I resolved to immediately do the “homework” that she suggested we do, which was basically internet creeping on people with whom we were going to network at an event the next day. I went home and promptly fell asleep on the couch, because introvert + people = really tired.

*People networking. Not computer networking…that’s kind of cool.

Posted in Networking, Uncategorized

When You Ghost Your Own Blog

Hello all my dedicated readers. All three of you. You know who you are. (Hi mom!) So maybe it has been a year and a half since I have posted. So maybe the last time someone visited this site was in February (and I think it was me). So maybe I’m guilty of letting this blog languish like my high-school diary, and any crochet project I’ve ever attempted. But I have good news! I’m back and ready to begin again.

I feel like I owe an explanation for my abandonment, so here it is: I decided to double down on my MBA. As an employee at Rockhurst University, I was able to take Rockhurst up on its incredibly generous tuition remission program in order to work on an MBA with an emphasis in Data Science, and that decision was awesome, but it took on a life of its own. A new degree program, the Master of Science in Business Intelligence and Data Analytics, opened up and I took the opportunity to go the dual degree route. Because why get one degree when you can get two? In order to graduate within a reasonable time frame, I had to take more hours than I was planning. And quite frankly, some of the classes were extremely challenging and took an inordinate amount of time.

Google the term data science and one of the first things you find out is that it is one of the fastest growing fields everywhere. This actually includes libraries. And it’s super cool, and apparently one of the sexiest jobs for three years running. Who doesn’t want a little more sexy?

A data science librarian is an actual position. (I know, right?) Does a person have to go to school to learn this stuff? Actually no- there is a lot of information out there on the web. Did I have to go to school to learn this stuff? Yes. Oh yes. One thing that I have learned about myself is that I’m deadline-driven, so if there is not some kind of external accountability structure, it doesn’t get done. Cough. <this blog> Cough.

So in a nutshell, I was busy. I was beyond busy. As I’m growing in my R and Python coding and learning about predictive models, text analysis, etc, I’ve been able to apply that to my job. This is beyond amazing, and I hope to discuss these projects at a later date, but working on them was (and is) time-consuming. Not only was I learning when and how to use different algorithms, but also how to translate what I did into plain English. Predictive modeling, text mining, and Hadoop, oh my!

I have learned so many things that I feel like my head is going to explode. I’m afraid to talk about all the ideas about working with data that I have, because what if I talk and the ideas start piling up and I start talking faster and faster and as I get excited my voice gets higher and higher and all of a sudden there is a whoosh, some glitter and I turn into a giant chipmunk? And more importantly, will I be a sexy data science chipmunk? But I’m almost at the end of this crazy journey with my sanity (almost) intact. I am graduating in May, and I am starting to look ahead into what is next. It is going to be a wild ride.


A note: There were also some personal reasons that came up over the summer and in the fall of 2016. I hesitate to get too personal online, but this was something that I needed to hear, and so I will pass it along in hopes that it will help someone else: If you have to deal with the deaths of close loved ones within a couple of months of each other, the second one will be exponentially worse than the first. It has to do with your grief and the grieving process, NOT whether you loved one person more. There is nothing wrong with you.

Posted in Data Science, etc.

Airtable, A Cool New Toy

The other day, I was diligently working. Yes, when you are an academic librarian, you can consider cruising the internet as work.  It’s a perk.  Anyway, I was online and I found an interesting new toy called Airtable.  

Airtable is a database that is tailor-made for people who don’t know databases, and a great way to get an intuitive feel for how relational databases work.  Hold up, my fellow librarians!  I am not talking about the databases that we use every day, otherwise known as the catalog, or Academic Search Complete, or any other EBSCO, JSTOR or ProQuest database.  Of course we know how to get knowledge out of these databases (or at least how to use the help function).  But there are databases, and then there are databases.  Although really when we define a database, it boils down to a bunch of organized information.  That organization can be text files or an Excel spreadsheet, or it can be much fancier like Oracle or MySQL.

When I say people who don’t know databases, I mean people who don’t know how to create databases or use fancy query languages in order to extract the information.  And again, librarians, when I say fancy query languages, I’m going a little beyond the realm of Boolean Operators.

The reason I’m excited about Airtable is because you can create little databases that can be accessed from any device.  When you create a database, it looks and feels like an Excel or Google Sheet spreadsheet that we are all familiar with using, but it has more advanced features so  you can really get some functionality.  You can add records and values to your list using the same keyboard shortcuts you’re used to with Excel. However, Airtable also offers additional features that aren’t possible with typical spreadsheets, like expandable note fields and file attachments. Another interesting feature that Airtable provides is the ability to create different tables that link to one another.

So for example, I am testing out the database to use it as a repository for photos taken in the library.  Library staff take pictures all the time during events, for marketing purposes, or just to mark changes to the layout.  In a small library, everyone does a little bit of everything, and that means that we have several different people who have a ton of random photos in various places.  Some are on computers, some are on phones, some are in shared drives; it’s a total mess.  I wanted to create a repository for the library photos, but be able to cross-reference them with different events, or sort via where or when the picture was taken, or even whether it has already been posted on Facebook.


I created a database where I uploaded a picture or a grouping of pictures to each entry, gave it a somewhat arbitrary descriptive title and then tried to do some description (yes, we are talking about controlled metadata).  It is really easy to attach pictures, and you can attach items from your computer, a URL, your phone, Dropbox, Google Drive, Facebook, and the list goes on.



A lot of our pictures centered around events, so I made a separate tab for the Events, and then when it fits, I link the record to the event tab entries and add groupings of pictures to each event.  So if our Christmas Reception happens every year, I can describe it and attach pictures from this year, last year and the year before, with the click of a button.
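For anyone curious what that linking looks like in a "fancy" relational database, here is a minimal sketch using Python's built-in sqlite3 module. This is an analogy, not how Airtable is implemented; the table names, column names, photo titles and years are all made up for illustration. Each photo record points at an event record, and a join pulls back every photo attached to that event across years:

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY,
        title TEXT,
        year INTEGER,
        event_id INTEGER REFERENCES events(id)  -- the "link" to the Events tab
    );
    """
)
conn.execute("INSERT INTO events (id, name) VALUES (1, 'Christmas Reception')")
conn.executemany(
    "INSERT INTO photos (title, year, event_id) VALUES (?, ?, ?)",
    [
        ("Tree in lobby", 2014, 1),
        ("Cookie table", 2015, 1),
        ("New shelving", 2015, None),  # not tied to any event
    ],
)

# Every photo linked to the reception, oldest first.
rows = conn.execute(
    """
    SELECT photos.title, photos.year
    FROM photos
    JOIN events ON photos.event_id = events.id
    WHERE events.name = ?
    ORDER BY photos.year
    """,
    ("Christmas Reception",),
).fetchall()
print(rows)
```

The photo that isn't linked to an event simply drops out of the join, which is the same behavior I get when I look at an event's linked records in Airtable.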


But what is really cool about it is the amount of control I have over the database.  I can grant access to specific people or to anyone to be able to upload photos.  If I were feeling brave and optimistic, I could share a link or embed this form on our website and invite anyone to upload library photos!  And yes, it’s a really boring form, but you can customize the questions you want to put on it, or descriptors.   I can also share the database with anyone and determine the amount of access they have, whether it’s view only, or whether they can edit the information or create new columns, etc.

Also, you can export all your data to a .csv file, or use the API to do other cool data science things with it.

If you have an iPhone, there is also an app that lets you access Airtable, and let me tell you that app is pretty cool, although I wish that it had the ability to upload several photos to a record instead of making me upload them one at a time.

Oh and the best part?  The price.  It starts at free.  Since this is a relatively new company/product, I don’t know how long that will last, but until then, the party is on!


PLEASE NOTE: I was not paid in any way, shape, or form for this review, although Airtable people,  if you are listening, please know that I would not have any objections to that.

Posted in Tools

Missouri Public Libraries: An argument with data and graphs. Part 2

Locally Sourced Libraries and State Funding Issues

In my last post, part 1 of the Missouri Public Libraries: An argument with data and graphs series, I tried to show through library coverage areas and usage how much Missourians, especially children, utilize the library when it is available to them.

But how much do all these fabulous services cost?   The average Missourian pays about $46.58 per year in taxes for the privilege of having a library in their community.  That is about 13 cents a day.  And that pays for 7.7 visits to the library, 11 checkouts of books or other materials, 1.3 internet uses and .7 reference transactions.  That is a lot of bang for your buck.
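The arithmetic, spelled out as a quick Python check. The dollar figure and the usage counts are the ones quoted above; lumping visits, checkouts, internet uses and reference transactions together into a single "use" count is my own simplification, not something from the state data:

```python
ANNUAL_TAX = 46.58  # dollars per Missourian per year (figure quoted above)

# Average yearly use per person, as quoted above.
uses_per_year = {
    "visits": 7.7,
    "checkouts": 11,
    "internet uses": 1.3,
    "reference transactions": 0.7,
}

per_day = ANNUAL_TAX / 365
total_uses = sum(uses_per_year.values())
cost_per_use = ANNUAL_TAX / total_uses

print(f"Cost per day: about {per_day * 100:.0f} cents")
print(f"Total uses per year: {total_uses:.1f}")
print(f"Cost per use: about ${cost_per_use:.2f}")
```

That works out to about 13 cents a day and, under my lumped-together definition, roughly $2.25 per use.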

At this point it’s pretty important to note how libraries are funded.  Keep in mind that I am discussing public libraries only, and not academic libraries, school libraries covering grades K-12 (although there are some public library drops and stops in school buildings), or special libraries like prison libraries, private libraries such as the Linda Hall Library, and presidential libraries. I bet you didn’t know there are so many libraries in Missouri!

The good news is that public libraries are some of the most cost effective public institutions in the country today. You can check the value of your favorite library here.

Public libraries are the ultimate locally sourced resource. As shown below, in 2014, about 97% of funding was local, and only 3% was from state and federal. But these numbers are lazy statistics and don’t actually tell the whole story (I included a graph anyway, because, well, never let a graph go to waste).


For the 2015 fiscal year (that’s this year), the state legislature voted to decrease funding in state aid by about $3.3 million or 82%. Let me say that again. 82%.

This is a huge decrease. And Jason Kander, that guy I wrote about a few years ago?  He stepped up and supported/recommended that the full budget be approved. So he gets a librarian shout out: Hey Kander– good job!

What does this really mean? It’s actually part of a trend that started with the recession and has continued.  From 2009 to 2015, you can see library funding per capita dropping each successive year.  The median amount of state funds spent on libraries was $0.67 per person. In FY15, you can see this sad outlier on the bottom of the graph, where the amount drops to approximately $0.12 per person.
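As a sanity check, the per-person figures line up with the 82% cut. The sketch below (Python) uses only the numbers quoted in this post; the "implied" totals are back-of-the-envelope values I derived from them, not figures from the state data, and comparing FY15 against the multi-year median rather than the previous year's amount is an approximation on my part:

```python
median_per_capita = 0.67  # dollars per person, median quoted above
fy15_per_capita = 0.12    # dollars per person in FY15, quoted above

drop = (median_per_capita - fy15_per_capita) / median_per_capita
print(f"Per-capita drop: {drop:.1%}")

cut_dollars = 3_300_000   # "about $3.3 million"
cut_fraction = 0.82       # "82%"

# Derived, not sourced: what the numbers imply about total state aid.
implied_prior = cut_dollars / cut_fraction
implied_remaining = implied_prior - cut_dollars
print(f"Implied state aid before the cut: ${implied_prior:,.0f}")
print(f"Implied state aid after the cut:  ${implied_remaining:,.0f}")
```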


But that’s ok, right?  It’s not like state funding has that much of an impact on library budgets, because it’s only 3% at most of the total budget.  Actually, it’s not ok, and it does have a larger impact on the budget.


The devil is in the details. When you look at the distribution of library populations, you can see that the vast majority of libraries serve a population that is under 100,000.  Only a few libraries serve larger population areas. These smaller, rural libraries are small town America.  They are less well funded because, as the library populations are smaller, there are fewer people to contribute to the local tax base.

These smaller libraries disproportionately rely on state aid, up to an average of 4% of their overall budget, whereas the larger libraries rely less and less on state aid as a percentage of their overall budget as their population area increases. Ironically, the larger libraries get the larger portion of the state aid budget, because funds are distributed per capita.
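To see how both things can be true at once, here is a small Python sketch with two invented libraries. The per-capita rate, populations and local incomes are all hypothetical, chosen only to illustrate the mechanism; they are not taken from the Missouri data:

```python
AID_PER_CAPITA = 0.50  # hypothetical state aid, dollars per resident

# Two made-up library systems.
libraries = [
    {"name": "Small rural", "population": 5_000, "local_income": 60_000},
    {"name": "Large urban", "population": 500_000, "local_income": 25_000_000},
]


def aid_summary(library, rate):
    """Return (state aid in dollars, state aid as a share of total income)."""
    aid = library["population"] * rate
    share = aid / (library["local_income"] + aid)
    return aid, share


for lib in libraries:
    aid, share = aid_summary(lib, AID_PER_CAPITA)
    print(f"{lib['name']}: ${aid:,.0f} in state aid, {share:.1%} of total income")

small_aid, small_share = aid_summary(libraries[0], AID_PER_CAPITA)
large_aid, large_share = aid_summary(libraries[1], AID_PER_CAPITA)
```

In this toy setup the large library collects a hundred times more state dollars, yet the small library leans on state aid for a several-times-larger share of its budget, so the same percentage cut hurts it more.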

Another way to see this is by looking at a scatterplot of the library systems.


Some of these smaller libraries, with a total income that barely registers on the scale, rely on state aid for 25%, 21%, or 16% of their overall budget. Below is the same graph that eliminates any library with an annual income greater than $1.5 million. And, by the way, check out how many libraries have an annual budget under $100,000 a year.


Any cut in state funding will have a drastic effect on these smaller, local, and mostly rural libraries. These libraries may not close, but they will offer fewer and fewer services. They may cut their hours, not buy new books, or eliminate their computer services.  They may cut staff (who, by the way, pay taxes and help keep the Missouri economy running), or eliminate programming.

There are few public programs that provide so much good for such a little amount. The state legislature’s failure to support public libraries is incredibly short-sighted, and ultimately reflects poorly on them. But it is not the state legislature that will suffer; it’s the next generation of Missourians who will not have the support structure their parents had when it comes to early literacy and learning to love reading.  It’s librarians and library staff who may find themselves out of jobs or with reduced hours and who struggle to contribute to the local economy.  It’s Missouri that will ultimately pay the price, as people move out of state in search of better opportunities.

Note: All of the data about Missouri Library usage came from this wonderful Missouri State Library statistics webpage. The data is provided in PDF AND Excel format, so thanks, Missouri State Library, for being so awesome. All graphs and charts were made with Tableau.

Posted in Uncategorized