Tag Archives: Newspaper digitization

Fair Use and Copyright Research for Newspaper Digitization: What You Need to Know

This article is based on a talk I gave at the Digital Public Library of America’s DPLA Fest conference on September 21, 2018.

Disclaimer: I am not a lawyer and this is not professional legal advice. This article is for educational purposes only. Please consult counsel concerning any potential digitization projects your institution is interested in pursuing.

Introduction

Good afternoon. Thank you very much for attending this session. I’m Justin Clark, Project Manager of Hoosier State Chronicles, our state-wide historic digital newspaper program at the Indiana State Library. We are a part of the National Digital Newspaper Program (NDNP), a joint venture between the Library of Congress and the National Endowment for the Humanities. To date, we’ve digitized nearly a million pages of historic Indiana newspapers, of which over 300,000 have gone into NDNP’s Chronicling America database of nearly 14 million digitized newspaper pages from across the country.

When digitizing historic newspapers for NDNP, one of the most important things to consider is whether the paper is under copyright. You could have picked the perfect title, had it approved by your institution, and completed all of the arduous work of collation, but if you don’t check its copyright status, your work could all be for naught. This is why a basic understanding of fair use, the public domain, copyright, and conducting copyright research is essential to any newspaper digitization project. This talk will provide a general overview of what fair use is, how it relates to newspaper titles, and how you can complete the necessary research to ensure your desired title for digitization is acceptable. Doing this work not only expands the scope of potential titles for digitization but also provides peace of mind that you won’t hear from any lawyers in the future, besides your institution’s counsel, of course.

Now, before we begin our stroll through copyright, I must say this. I AM NOT A LAWYER . . . nor have I played one on TV. This talk is only an educational overview of what I’ve learned about copyright research for digitizing newspapers. Other materials such as photographs, 3D objects, and written documents may not follow the same procedures or guidelines. It is imperative that you consult your institution’s legal counsel before making any concrete decisions to digitize anything. This saves you a visit from an irate lawyer who is upset that you’ve digitized materials that are still in copyright. And this little disclaimer saves ME a visit from an irate lawyer who got the call from the other one about copyrighted materials. In short, the only lawyer you want visiting your office should come from your institution. Now, with that out of the way, let’s start with fair use.

What Is Fair Use?

The Fair Use Logo, Wikipedia.

In the United States, copyright holders possess considerable legal rights for the protection of their intellectual property. This is a great thing – copyright holders can use their hard work to ensure an income and that scammers will keep their greedy hands off of work that doesn’t belong to them. But there are exceptions. One such exception to US copyright law plays a vital role in our emerging digital landscape: fair use. Fair use, according to the U.S. Copyright Office, “is a legal doctrine that promotes freedom of expression by permitting the unlicensed use of copyright-protected works in certain circumstances.” Essentially, fair use allows someone to use a copyrighted work for a completely different purpose than the copyright holder originally intended, which usually falls in the categories of “criticism, comment, news reporting, teaching, scholarship, and research.” These protections fall under Section 107 of the Copyright Act.

To determine whether or not a use of a copyrighted work is fair use, four general guidelines are followed. The first is the “purpose and character of the use.” Most of the time, if a person is using a copyrighted work for non-profit and/or educational purposes, it generally falls under fair use. This is especially the case if the use is “transformative,” meaning that it “add[s] something new, with a further purpose or different character, and do[es] not substitute for the original use of the work.” In NDNP’s case, taking a newspaper which was originally created for immediate public consumption at a profit and transforming it into a digital historical artifact at no cost to the researcher usually falls under fair use. This guideline is not ironclad; sometimes, a copyright holder will object to their work being used in this way. Nevertheless, this guideline is generally applicable to NDNP and newspaper digitization as a whole.

Second, the “nature of the copyrighted work” is considered when determining fair use. This guideline is a little harder to pin down, but it basically means whether or not your use of a copyrighted work is too close to the original to be considered fair use. Specifically, “using a more creative or imaginative work (such as a novel, movie, or song) is less likely to support a claim of a fair use than using a factual work (such as a technical article or news item).” For our purposes, taking informational works such as newspapers and digitizing them for researchers changes the nature of the work, from a paid periodical into a free primary source document. In most cases, this would count as a fair use.

Third, the “amount and substantiality of the portion used in relation to the copyrighted work as a whole” plays a role in deciding fair use. In other words, if a person just blatantly copied the entirety of a copyrighted work and then sold it for their own benefit, it would not be fair use. However, for material that falls under the public domain (more on that below), recreating the entirety of the work is more than fine and falls under fair use. NDNP projects often have syndicated columns and cartoons that are copyrighted but the newspaper as a whole is not copyrighted. In those instances, the amount of non-copyrighted work outweighs the copyrighted work and the digitization of a newspaper is then considered fair use. We will unpack this more in the copyright research section.

Finally, fair use is determined by the “effect of the use upon the potential market for or value of the copyrighted work.” Put simply, does the use of a copyrighted work ruin its value in the marketplace? In the case of digitizing newspapers, a newspaper’s value stemmed from its original sale date, which was years or decades before. If a newspaper title is already in the public domain, its original market value is already gone and can be used by others in a myriad of ways. For NDNP projects, turning a newspaper into a primary source historical document does not destroy the market value of the original paper nor does it harm copyrighted works therein (syndicated columns and cartoons). Potential researchers are using the digitized newspapers for scholarly purposes, not for the resale of copyrighted material. As with the other three guidelines, the “market value” guideline is generally met.

This overview of fair use is not exhaustive. Definitely review material on fair use from the U.S. Copyright Office and the Copyright Alliance for more information.

What is “Public Domain”?

Public Domain Logo

Alongside fair use, a clear conception of public domain is essential for working on NDNP-related projects. Works in the public domain, according to the Stanford University Library, are:

. . . creative materials that are not protected by intellectual property laws such as copyright, trademark, or patent laws. The public owns these works, not an individual author or artist. Anyone can use a public domain work without obtaining permission, but no one can ever own it.

A work enters into the public domain via three avenues: it can’t be copyrighted (i.e., titles, names, facts, ideas, government works), the creator of the work places it in the public domain, or its copyright term has expired. With NDNP, the last of these three is the most important.

Have you ever wondered why the vast majority of NDNP’s content, and most digitized newspaper content, ends around 1923? It’s for a very simple reason: all works published in the United States before 1923 are in the public domain. No copyright research is necessary for this material; it’s free and clear for you to use. However, NDNP announced in 2016 that it had expanded its date range for newspaper titles, from 1836-1922 to 1690-1963. Works published in 1923 or later are in the public domain if a copyright claim was never filed from 1923 through 1977 or if the copyright was never renewed from 1923 through 1963. NDNP projects that follow these public domain guidelines can determine whether a potential title is ready for digitization.

To learn more about public domain, visit these online resources from the Stanford University and Cornell University libraries.

Conducting Copyright Research

Now that you know how fair use and the public domain work, you can begin the necessary research to determine the copyright status of a newspaper title. Here in Indiana, we wanted to know the copyright status of one of Indianapolis’s premier papers of the 20th Century: the Indianapolis Times. The Times ran from 1888 (when it was titled the Sun) until 1965, a pretty impressive run for a daily metropolitan newspaper. From 1922 until its end, the Times was owned and operated by Scripps-Howard, a major publishing corporation based out of Cincinnati, Ohio. Knowing that such an influential publishing company owned the Times from 1922 until 1965 put an increased responsibility on us to make sure that the paper was either in the public domain and/or that its digitization would be considered fair use.

Indianapolis Times, October 11, 1965, Indiana State Library Newspaper Microfilm Collection.

To figure this out, we examined its copyright as a complete title as well as the copyright of individual articles and/or syndicated content, to get a sense of how much material within the newspaper was copyrighted. Three resources allow you to complete this research: the Catalog of Copyright Entries (1906-1977) (published by the Library of Congress), the Public Catalog of Copyright Entries (1978-present) (online; published by the Library of Congress), and the Indianapolis Times newspaper microfilm collection (courtesy of the Indiana State Library).

Catalog of Copyright Entries, Internet Archive.

The Catalog of Copyright Entries (1906-1977) is available at Internet Archive (www.archive.org) in a readable PDF format. It comes with Optical Character Recognition (OCR), so it is text-and-word searchable. To begin, view the 1923 Catalog of Copyright Entries, Part 2, which provides the copyright and copyright renewal for all periodicals published in the United States that year. For all the following years, look for the volume devoted to periodicals. In the search field, type the name of your title. If nothing comes up, search the catalog’s index for the title. If nothing is there, check the title within the book in the new copyright section as well as the renewal section. If nothing comes up, your newspaper title filed neither a new copyright nor a copyright renewal and it is in the public domain. Consult all remaining years of the catalog (in the periodical section) for any new copyright notices or copyright renewals. If you do find that your title was published with a copyright notice and a renewal from 1923-1963, it is not in the public domain and will remain under copyright for 95 years after the publication date. However, if the title was published from 1923-1963 with an initial copyright notice but was not renewed during that time, it is in the public domain and you are free to digitize.
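If you download a volume's OCR text, a small script can double-check the built-in search, since OCR often breaks a title across lines. This is only a sketch: the function name `find_title_mentions`, the sample text, and the title "Example Daily Gazette" are all made up for illustration, and a script like this does not replace reading the index and the renewal section yourself.

```python
import re


def find_title_mentions(ocr_text: str, title: str, context: int = 40) -> list[str]:
    """Return short snippets around each case-insensitive match of `title`.

    Any run of whitespace (including line breaks) between the words of the
    title is accepted, because OCR output often splits a title across lines.
    """
    words = [re.escape(word) for word in title.split()]
    if not words:
        return []
    pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
    snippets = []
    for match in pattern.finditer(ocr_text):
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(ocr_text))
        # Collapse whitespace so each snippet prints on one line.
        snippets.append(" ".join(ocr_text[start:end].split()))
    return snippets


# Made-up sample text standing in for a page of catalog OCR.
sample = "PERIODICALS\nExample Daily\n  Gazette. v. 1, no. 1. (made-up entry)"

for snippet in find_title_mentions(sample, "example daily gazette"):
    print(snippet)
```

An empty result only means the exact words were not found in that text; OCR misspellings can hide a real entry, which is why the index and section checks described above still matter.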

Catalog of Copyright Entries, Library of Congress/Internet Archive. This is an example of the periodicals section of the catalog.

If you need to check anything after 1977, use the online Public Catalog of Copyright Entries, which covers 1978 to the present. This search is much easier than combing through the scanned versions at the Internet Archive. All you have to do is type in your title in the search bar; if you get no results, no copyright renewals were filed and you’re good to move forward. If there are copyright renewals, the title will remain under copyright for 95 years after its initial publication date.
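The catalog checks above boil down to a short decision rule. Here is a minimal sketch of that rule exactly as this article states it, for titles in NDNP's range up to 1963. The names `CatalogFindings` and `assess` are hypothetical, and the output is a research note for your counsel, not a legal determination.

```python
from dataclasses import dataclass

PUBLIC_DOMAIN_CUTOFF = 1923  # per the article: published before 1923 means public domain
LAST_RENEWAL_ERA_YEAR = 1963  # per the article: the renewal rule covers 1923-1963
TERM_YEARS = 95               # per the article: term for works that were renewed


@dataclass
class CatalogFindings:
    publication_year: int
    registration_found: bool  # an original copyright entry turned up in the catalogs
    renewal_found: bool       # a renewal entry turned up in the catalogs


def assess(findings: CatalogFindings) -> str:
    """Summarize the likely status using only the rules stated in this article."""
    year = findings.publication_year
    if year < PUBLIC_DOMAIN_CUTOFF:
        return "public domain (published before 1923)"
    if year > LAST_RENEWAL_ERA_YEAR:
        return "outside the 1923-1963 rules covered here; ask counsel"
    if not findings.registration_found:
        return "public domain (no copyright entry found)"
    if not findings.renewal_found:
        return "public domain (copyright not renewed)"
    return f"under copyright until about {year + TERM_YEARS}"


print(assess(CatalogFindings(1940, registration_found=True, renewal_found=False)))
print(assess(CatalogFindings(1940, registration_found=True, renewal_found=True)))
```

The function only encodes what you found in the catalogs, so a missed entry gives a wrong answer. Treat it as a way to keep your reasoning consistent across many titles and years.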

Online Catalog of Copyright Entries, Library of Congress.

For our research, we started with 1922, the year that Scripps-Howard Newspapers purchased the Times and the final year it could have been in the public domain (this research was done in 2017, before the public domain covered 1923). According to listings in the Catalog of Copyright Entries and the Public Catalog of Copyright Entries, Scripps-Howard Newspapers never filed the Times for copyright between 1922 and 1965 or for subsequent renewals from 1965 to the present. Therefore, the Times as a complete newspaper is within the public domain and eligible for digitization.

Online Catalog of Copyright Entries, Library of Congress. A search for “Indianapolis Times” yields no results, which means that its copyright was never renewed after 1978.

But your search doesn’t end there! The copyright of individual articles and syndicated content also needs to be established. Library of Congress policy for NDNP has generally been that individually-copyrighted content within the “context” of an entire newspaper in the public domain is not a problem, so long as it doesn’t account for over 50% of the entire work. This rule is a recommendation and not an absolute policy. It is still up to you as an NDNP awardee, your institution, and your legal counsel to establish the proper procedures for such content.

Start with the scanned Catalog of Copyright Entries at the Internet Archive. However, instead of viewing the volumes devoted to complete periodicals, look at the volumes usually devoted to books or pamphlets. These volumes include copyright information on individual pieces published in periodicals. Then search the online Catalog of Copyright Entries. Remember to check for both an original copyright notice and a copyright renewal. As with the newspaper title as a whole, if the article was published with a copyright notice and a renewal from 1923-1963, it is not in the public domain and will remain under copyright for 95 years after the publication date. Additionally, articles published from 1923-1963 with an initial copyright notice but no renewal are in the public domain and you are free to digitize.

Catalog of Copyright Entries, Library of Congress/Internet Archive. This is an example of the book and/or pamphlet section of the catalog, where copyright information on contributions to periodicals is located.

With our research of the Times, one type of syndicated content that showed up right away within copyright research was the Sunday supplemental, with PARADE magazine being an applicable example in the Times. From 1963-1965, PARADE was published with Sunday issues of the Times; it was copyrighted when it originally ran (and included in the Catalog of Copyright Entries) and was subsequently renewed (and included in the Public Catalog of Copyright Entries). As such, we decided not to include this supplemental in our NDNP deliverables. Regarding individual articles, we found 32 copyright listings in the Catalog of Copyright Entries from 1922-1965; only the initial copyright was listed and no renewals were found. These were then cross-referenced in the online Public Catalog of Copyright Entries to check for post-1978 renewals; none were found. These articles accounted for less than 10% of the entire field of research, well below the 50% guideline described above. (As always, consult your institution and its legal counsel.)
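To make the 50% guideline concrete, here is a rough sketch of the arithmetic. The counts are made up, the function names are hypothetical, and the unit you count in (articles, pages, column inches) is an assumption you should settle with your institution and counsel.

```python
def copyrighted_share(copyrighted_items: int, total_items: int) -> float:
    """Fraction of the reviewed content that carries its own copyright."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    if not 0 <= copyrighted_items <= total_items:
        raise ValueError("copyrighted_items must be between 0 and total_items")
    return copyrighted_items / total_items


def within_guideline(share: float, threshold: float = 0.50) -> bool:
    """True when the share does not exceed the recommended 50% ceiling."""
    return share <= threshold


# Made-up counts for illustration only.
share = copyrighted_share(copyrighted_items=8, total_items=120)
print(f"{share:.1%} copyrighted; within guideline: {within_guideline(share)}")
```

Because the guideline is a recommendation rather than an absolute policy, a result near the threshold is a reason to talk to counsel, not a green light.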

An example of PARADE magazine’s copyright notice from 1964. Supplementals like this are not in the public domain.

Now that you’ve thoroughly gone through the Catalogs, it’s also good policy to review the title’s microfilm. Here’s what we did. We chose three reels from each decade of the Times from 1923 to 1965 and scoured them for copyrighted content. We concluded that the vast majority of material on these reels fell within the public domain, in keeping with the Times’s policy on copyright. As for what was copyrighted, it was mostly advertisements for still-existing products (Columbia Records, Bayer Aspirin), syndicated cartoons (individual cartoons scattered throughout the paper as well as one full page an issue), serialized fiction, and syndicated columns. These materials contained a copyright symbol and text, indicating their status. We concluded that these entries constituted a small minority of the newspaper content and largely will not affect the proprietary interests of the copyright holders (seeing as the content in question was digitized from second-generation microfilm, which itself comes from first-generation preservation microfilm based on photographed pages; the loss in resolution and quality should not prompt copyright holders to pursue legal action). You can do more or less with your title’s microfilm than we have, but this should be enough to establish a broad consensus on your title’s copyright status.
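If you want to pick your review reels in a repeatable way, a short script can draw a fixed number per decade. This is a sketch under assumptions: the reel IDs are invented, and random selection is my illustration here, not necessarily how we chose our reels.

```python
import random
from collections import defaultdict


def sample_reels_by_decade(
    reel_years: dict[str, int], per_decade: int = 3, seed: int = 0
) -> dict[int, list[str]]:
    """Pick up to `per_decade` reels from each decade, reproducibly."""
    rng = random.Random(seed)  # fixed seed so the same reels come back each run
    by_decade: dict[int, list[str]] = defaultdict(list)
    for reel_id, year in reel_years.items():
        by_decade[year // 10 * 10].append(reel_id)
    picks = {}
    for decade in sorted(by_decade):
        reels = sorted(by_decade[decade])
        count = min(per_decade, len(reels))
        picks[decade] = sorted(rng.sample(reels, count))
    return picks


# Invented inventory: two reels per year from 1923 through 1965.
reels = {f"reel-{year}-{n}": year for year in range(1923, 1966) for n in (1, 2)}

for decade, chosen in sample_reels_by_decade(reels).items():
    print(decade, chosen)
```

Recording the seed alongside your notes lets anyone reproduce exactly which reels were reviewed.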

A Bayer Aspirin ad from 1925. This was a copyrighted aspect of the Indianapolis Times that we reviewed when combing the microfilm collection.

Once you’ve done all of these procedures, it is best to draft a full report of your research and findings to your NDNP advisory board, as well as your institution’s legal counsel. Make sure to be as detailed as possible – this ensures they fully understand what you’ve done and saves you the trouble of having to answer a bunch of follow-up questions. For our research on the Times, my project director and I drafted our report and then sent it to the aforementioned parties. From there, we received approval to digitize the Times.

An example of syndicated and copyrighted cartoons from the Indianapolis Times.
An example of copyrighted serialized fiction in the Indianapolis Times.

One more tip for your research: make sure to keep detailed notes of everything you do. You will be going through a lot of newspapers, so it will help you keep things straight. It also provides a paper trail that your institution’s leadership and legal counsel can consult if necessary. I suggest using Google Sheets and Docs to complete this research. It will be in the Cloud and can be easily shared with anyone who would like to see it. If Google is not your fancy, use Microsoft Office and back up your work to the Cloud or another hard drive. You don’t want to work diligently for months to have all of it lost because of computer issues.
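If you prefer a script to a spreadsheet, the same notes can be kept as a CSV file, which opens in both Google Sheets and Excel. This is one possible layout, not the one I used: the column names, the `SearchRecord` class, and the sample rows are all made up for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class SearchRecord:
    source: str        # which catalog or microfilm reel was checked
    volume_year: int   # year of the catalog volume or reel
    search_term: str   # what was typed into the search field
    result: str        # what was found
    notes: str = ""


def records_to_csv(records: list[SearchRecord]) -> str:
    """Serialize the research log to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SearchRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up rows for illustration only.
log = [
    SearchRecord("Catalog of Copyright Entries", 1923, "Example Daily Gazette", "no entry found"),
    SearchRecord("Catalog of Copyright Entries", 1924, "Example Daily Gazette", "no entry found", "checked index too"),
]
print(records_to_csv(log))
```

Write the returned text to a file in a backed-up folder, and you get the same paper trail for leadership and counsel.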

Examples of how I documented all my work.

Conclusion

Digitizing newspapers has been one of the most rewarding things I’ve worked on in the public history and cultural heritage space. Seeing a title like the Indianapolis Times digitized and made available for researchers to use, for free, has been a real privilege. But all of this could not have happened without doing the long and often-tedious work of copyright research. Researching a title’s copyright ensures that it is free and clear for you to digitize, and that a lawyer from King Features or PARADE magazine won’t come knocking on your door. Yet, copyright research can also be very rewarding. It gives you a big-picture view of the title you’re considering for digitization. You’ll see who its original audience may have been, the kinds of stories it covered, and how it fits in the context of your state’s, and the country’s, history. This, among many other things, makes copyright research worth it. Thank you.

New Batch Available!

Hey there Chroniclers!

We have a new batch available for you through Chronicling America (http://chroniclingamerica.loc.gov/).

This batch comprises 977 issues (totaling 9,957 pages) and brings our total page count in Chronicling America to 299,200!

Here’s the paper and dates available:

Richmond Palladium And Sun-Telegram (Daily): April 1, 1912-November 20, 1915.

As always, happy searching!

This project has been assisted by a grant from the National Endowment for the Humanities.

New Batch Available!

Greetings chroniclers!

We have another new batch available for you at Chronicling America.

This batch contains issues from:

This batch adds 1,166 issues (8,878 pages), growing Indiana’s total number of pages in Chronicling America to 288,102!

Have fun with all these new pages, and as always, happy searching!

This project has been assisted by a grant from the National Endowment for the Humanities.

New Issues Available!

Hello again Chroniclers!

Another batch of issues has been added to Hoosier State Chronicles!

Titles updated:

Richmond Palladium [Weekly], January 1831-June 1837.

Richmond Palladium [Daily], 1907-1910, April 1912-June 1912, October 1912-September 1913, and 1914-November 1915.

As always, happy searching!

This project has been assisted by a grant from the National Endowment for the Humanities.

Another New Batch Available!

Hello again, fellow chroniclers!

Another 10,000+ pages of Indiana newspapers have been added to The Library of Congress’s Chronicling America, thanks to a grant from the National Endowment for the Humanities. Our total page count is now 268,827! Check them out here.

Titles available:

Indianapolis Journal [1887-1888]

Richmond Daily Palladium [1874-1898]

Richmond Weekly Palladium [1831-1874]

Also, check out these great institutions on Facebook:

The Library of Congress

National Endowment for the Humanities

Indiana State Library

Morrisson-Reeves Library

As always, happy searching!

New Batch Available!

Greetings chroniclers!

Another newspaper batch from Hoosier State Chronicles has been added to the Library of Congress’s national newspaper repository, Chronicling America. Our total page count is now 258,563!

Check them all out here: http://bit.ly/2mF4b7r.

Furthermore, Chronicling America’s total page count is now 11,687,970.

As always, happy searching!

Check out these great institutions on Facebook:

National Endowment for the Humanities

Indiana State Library 

The Library of Congress

NDNP Conference 2016 Highlights

The US Capitol. Courtesy of Justin Clark.

This past week, I went to the National Digital Newspaper Program (NDNP) Awardee Conference in Washington, D.C., with my colleague Jill Weiss. It was an informative and inspiring conference. The first day, we met at the National Constitution Center and were welcomed by the chairman of the National Endowment for the Humanities, Dr. William D. Adams. In his brief remarks, he emphasized the commitment that NEH has to the program and his belief in its importance to the public good. As a public historian, I was motivated by his call to make Chronicling America (the national digital newspaper repository) more accessible to the public. He also shared with us the big news about the program: the date range is expanding! This new date expansion will cover 1690-1963, which means that awardee states can do so much more for Chronicling America.

NEH Chairman Dr. William D. Adams speaking at the NDNP Conference. Courtesy of Justin Clark.
Learning about the technical specifications for the NDNP with Tonijala Penn from the Library of Congress. Courtesy of Justin Clark.

During the first day, we learned about the specific program needs for Chronicling America, including newspaper essays that explain the history of a title, deliverable products submitted to the Library of Congress, and the ins-and-outs of preparing newspaper titles for microfilm and digital preservation. These talks were especially important to a new program assistant like myself, who needs to know all the important tasks for the NDNP. Additionally, we watched a live-stream of the swearing-in of the new Librarian of Congress, Dr. Carla Hayden. In her speech, she called for the Library of Congress to make its own history by making materials more easily available to the public. With NDNP, we are doing just that.

Dr. Carla Hayden, the 14th Librarian of Congress, during her confirmation ceremony. Courtesy of District Dispatch.

In the afternoon of the first day, winners of the NDNP’s Data Challenge Awards presented on the innovative and creative ways they are using digital newspapers through Chronicling America. George Mason University professor Lincoln Mullen shared his research on the use of the Bible in American newspapers and how it showed religious trends during the late nineteenth and early twentieth centuries. Andrew Bales, a doctoral student from the University of Cincinnati, created a database for chronicling the horrific history of lynching in the American South. Ending the first session, Amy Giroux, Marcy Galbreath, and Nathan Giroux from the University of Central Florida explored agricultural trends through their own aggregator of newspapers called Historical Agricultural News.

IUPUI librarians Caitlyn Pollock (second from left), Ted Polley, and Kristi Palmer accepting their NDNP Data Challenge Award for their work on “Chronicling Hoosier.” Courtesy of Justin Clark.

However, my favorite presentation (and maybe I’m biased since I’m from Indiana) was Chronicling Hoosier, presented by IUPUI’s own Kristi Palmer, Ted Polley, and Caitlyn Pollock. Their research looked into the history and geographical usage of the word “Hoosier.” While they didn’t learn the clear origin of the word (we may never really know), they did learn that its usage extended beyond just Indiana, from Virginia and Kentucky all the way down the Mississippi River to Louisiana. Originally a term of derision, meaning “country bumpkin” or “backwoodsman,” Hoosier became a beloved moniker by the late nineteenth century for those who lived in the State of Indiana. Listening to their presentation brought back memories of fourth grade Indiana History Class and the tall tales my teacher, Mrs. Hall, would share with the class about “Hoosiers.”

NDNP Conference attendees during a break. Courtesy of Justin Clark.

History teacher Ray Palin and student Virgile Bissonnette-Blais from Sunapee High School in New Hampshire displayed their project chronicling pivotal events in American history such as Plessy v. Ferguson. Ending the data challenge winner presentations, Professor Claudio Saunt and engineer Trevor Goodyear from Georgia shared with us their winning project, USNewsMap.com, which provides a timeline-based “heat map” on newspapers based on search queries. For those interested, it does work on proper nouns as well as regular search terms (I asked).

The Library of Congress, Madison Building, where days two and three of the conference were hosted. Courtesy of Justin Clark.

The second day mostly focused on working with bilingual and multilingual newspapers, copyright issues, and the production aspects of NDNP. The main session that day for me was the production session, where awardees that are new to the program learn the basics of microfilm and digital preservation. We learned how to organize film, apply the correct technical specifications for digital files, and prepare those files for the Library of Congress and Chronicling America. While it was a lot to take in for a two-hour session, the production talks were vital to my understanding of all the tasks necessary for working on the NDNP.

Our last day involved a nice, open-ended morning session for brainstorming marketing and outreach. We learned different marketing strategies for Twitter, Facebook, and other social media outlets, as well as other fun ways to get people to Chronicling America. My Hoosier State Chronicles colleague, Jill Weiss, asked questions about how we could get a podcast off the ground (something we’re working on for the future). The group shared some of their favorite podcasts to check out for ideas and seemed very receptive to our idea. Like with the Data Challenge winners, I loved learning about all the creative ways that we can use NDNP content to reach users.

Overall, this was a very fun and informative conference and I look forward to applying much of what I learned to my tasks on this program. Stay tuned for more, and as always, happy searching!

Data Challenge Winner Links

America’s Public Bible: http://americaspublicbible.org/.

American Lynching: http://www.americanlynchingdata.com/.

Historical Agricultural News: http://ag-news.net/.

Chronicling Hoosier: http://centerfordigschol.github.io/chroniclinghoosier/.

Digital APUSH: https://apush.omeka.net/.

USNewsMap.com: http://usnewsmap.com/.

Over 55,000 More Pages

The Rising Sun

Hey, readers.  Just a quick news flash.  Here’s a list of new content added to Hoosier State Chronicles over the last few days.

Check out some colorful titles — like Wabash Scratches — and a hilarious and witty antebellum paper from Indianapolis, The Locomotive.  A further decade of this comical weekly, one of the best papers ever published in the Hoosier State, is coming soon.

Additionally, we just added some early titles going back to 1807, when the sun was just rising on printing in Indiana Territory.  A huge run of Greencastle’s Daily Banner, digitized at DePauw University, brings us up to 1968.  Enjoy!