Thursday, November 28, 2024

Breaking Down Barriers: How Automated Tools Can Increase Faculty Participation in Open Access

Build Your Own AI Tool: Scripting with Google's PaLM and Python for Libraries

Presented by Eric Silverberg, Librarian at Queens College, City University of New York



Introduction

In this presentation, Eric Silverberg shares his journey in developing an automated tool to assist faculty at Queens College in depositing their scholarly articles into the institutional repository. Recognizing the low participation of faculty in the School of Education, he sought to simplify the process by leveraging Google's PaLM API and Python scripting.

Background and Motivation

The Importance of Open Access

  • Personal Commitment: Eric emphasizes the significance of making educational research openly accessible, aligning with his values and background as a classroom teacher.
  • University Mission Alignment: As a public institution, the City University of New York aims to make its research available to the public.
  • Impact on Education: Open access to research empowers policymakers, administrators, and teachers by providing them with valuable insights and data.

Challenges with Faculty Participation

  • Faculty were generally unaware of the institutional repository or found the process too cumbersome.
  • Understanding open access policies for each journal can be complex and time-consuming.
  • Manually checking policies via Sherpa Romeo for numerous publications is inefficient.

Problem Statement

The core problem was how to automatically extract journal names from faculty citations so that each journal's open access policy could be retrieved from Sherpa Romeo's API without manual intervention.

Initial Approach

  • Coding APA Rules: Attempted to parse citations by coding the rules of APA formatting.
  • Encountering Exceptions: Faculty citations varied significantly, with inconsistencies and creative deviations from standard formats.
  • Limitations: The approach became impractical due to the numerous exceptions, leading to excessive coding for edge cases.

Leveraging Google's PaLM API

Discovering PaLM

  • He learned about Google's PaLM API, which provides programmatic access to the language model behind Bard (now Gemini).
  • Recognized its potential for natural language understanding and processing.

Implementing PaLM for Journal Extraction

  • Simple Prompting: Used straightforward prompts like "What is the name of the journal in this citation?"
  • High Accuracy: PaLM effectively extracted journal names even from inconsistently formatted citations.
  • Automation: Enabled batch processing of citations without manually coding for formatting exceptions.

Technical Implementation

Setting Up the Environment

  1. API Key Connection: Established a connection to PaLM's API using a free API key.
  2. Selecting the Model: Chose the text generation model suitable for processing text inputs.
  3. Python Scripting: Used Python to write functions for automating the process.

Key Components of the Script

Part A: Connecting to PaLM

# Connect to PaLM API
import google.generativeai as palm
palm.configure(api_key='YOUR_API_KEY')

# Select the text generation model
models = [model for model in palm.list_models() if 'generateText' in model.supported_generation_methods]
model = models[0].name

Part B: Extracting Journal Names

# Function to get journal name
def get_journal_name(citation):
    prompt = f"What is the name of the journal in this citation?\n{citation}"
    completion = palm.generate_text(model=model, prompt=prompt, temperature=0, max_output_tokens=800)
    return completion.result

  • Temperature Parameter: Set to 0 to minimize randomness and ensure consistent outputs.
  • Max Output Tokens: Defined to control the length of the response.

Automating the Entire Process

  1. Input Data: Collected faculty citations in a spreadsheet.
  2. Journal Extraction: Used the `get_journal_name` function to populate journal names next to citations.
  3. OA Policy Retrieval: Sent journal names to Sherpa Romeo's API to get open access policies.
  4. Output Report: Generated a comprehensive report detailing OA policies for each publication.
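
To make these steps concrete, a minimal sketch of the batch workflow in Python might look like the following. It reuses the get_journal_name function from Part B; the CSV column names, the output file, and the exact Sherpa Romeo request parameters are illustrative assumptions rather than Eric's exact script (Sherpa Romeo's v2 API issues free keys at v2.sherpa.ac.uk).

# Minimal sketch of the batch workflow (assumed column names and file paths)
import pandas as pd
import requests

SHERPA_KEY = 'YOUR_SHERPA_ROMEO_KEY'  # free key from v2.sherpa.ac.uk

def get_oa_policies(journal_name):
    # Query Sherpa Romeo's v2 API for a journal's open access policies
    response = requests.get(
        'https://v2.sherpa.ac.uk/cgi/retrieve',
        params={
            'item-type': 'publication',
            'api-key': SHERPA_KEY,
            'format': 'Json',
            'filter': f'[["title","equals","{journal_name}"]]',
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

# 1. Input data: faculty citations collected in a spreadsheet
citations = pd.read_csv('faculty_citations.csv')

# 2. Journal extraction: populate journal names next to each citation
citations['journal'] = citations['citation'].apply(get_journal_name)

# 3. OA policy retrieval: send each journal name to Sherpa Romeo
citations['oa_policies'] = citations['journal'].apply(get_oa_policies)

# 4. Output report: save the combined results for review
citations.to_csv('oa_policy_report.csv', index=False)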

Example Output

An example of the output report includes:

  • Citation: Full citation provided by the faculty.
  • Journal Name: Extracted using PaLM.
  • OA Policies: Detailed information on preprint, accepted manuscript, and final version policies.

Citation 4:
[Full Citation Here]

Journal: African Journal of Teacher Education

OA Policies:
- Submitted Manuscript: [Policy Details]
- Accepted Manuscript: [Policy Details]
- Final Version of Record: [Policy Details]

Challenges and Considerations

Dealing with Sherpa Romeo's API

  • Data Structure: The API returns data nested in complex ways, requiring careful parsing.
  • Error Handling: Implemented to manage cases where OA data was missing or incomplete.
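
A defensive parsing helper makes that nesting manageable. The field names below (items, publisher_policy, permitted_oa, article_version) follow Sherpa Romeo's v2 response structure as commonly documented, but treat the exact paths as an assumption to verify against the live API; the function simply returns an empty list when data is missing.

# Hedged sketch: summarize permitted OA terms per article version from a
# Sherpa Romeo v2 response, tolerating missing or incomplete records
def summarize_policies(sherpa_json):
    summaries = []
    for item in sherpa_json.get('items', []):
        for policy in item.get('publisher_policy', []):
            for permitted in policy.get('permitted_oa', []):
                summaries.append({
                    'versions': permitted.get('article_version', []),  # e.g. ['accepted']
                    'embargo': permitted.get('embargo', {}).get('amount'),
                    'locations': permitted.get('location', {}).get('location', []),
                })
    return summaries  # an empty list signals "no OA data found"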

Faculty Engagement

  • Planned to share the generated reports with faculty to encourage repository deposits.
  • Recognized the need for feedback to refine the tool and process.

Next Steps and Potential Enhancements

  • User Feedback: Gather input from faculty like Professor N'Dri T. Assié-Lumumba, who agreed to pilot the tool.
  • Automation of Deposits: Consider scripting the submission of articles into the repository, pending faculty permission.
  • Exploring Other APIs: Investigate alternatives like OpenAlex for OA policy data, potentially simplifying the process.
  • Improving PDF Handling: Explore methods to reverse engineer formatted PDFs back into Word documents for easier repository submissions.

Audience Questions and Responses

Is there a template available?

Answer: Yes, the code shared is largely based on Google's documentation. You can access Eric's script on GitHub and modify it for your needs.

How are citations received from faculty?

Answer: Currently, citations are obtained directly from faculty CVs. The process may evolve based on faculty feedback and scalability considerations.

Does the tool handle abbreviated journal names?

Answer: Yes, PaLM effectively recognizes and extracts abbreviated journal names, which is particularly useful in fields where abbreviations are common.

Why use Sherpa Romeo instead of OpenAlex?

Answer: Familiarity with Sherpa Romeo's API led to its initial use. OpenAlex may offer a more streamlined API, and exploring it could be beneficial for future iterations.
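
For anyone who wants to explore the OpenAlex route, a request like the one below is a reasonable starting point. It uses OpenAlex's public sources endpoint, which needs no API key; the response fields read here (is_oa, is_in_doaj) are coarse venue-level indicators and should be checked against the current API documentation.

# Hedged sketch: look up a journal in OpenAlex and read open access indicators
import requests

def openalex_source(journal_name):
    response = requests.get(
        'https://api.openalex.org/sources',
        params={'search': journal_name, 'per-page': 1},
        timeout=30,
    )
    response.raise_for_status()
    results = response.json().get('results', [])
    if not results:
        return None
    source = results[0]
    return {
        'display_name': source.get('display_name'),
        'is_oa': source.get('is_oa'),            # fully open access venue?
        'is_in_doaj': source.get('is_in_doaj'),  # listed in DOAJ?
    }

print(openalex_source('African Journal of Teacher Education'))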

Can ChatGPT be used for journal name extraction?

Answer: While ChatGPT could perform similar tasks, using PaLM's API allows for automation within the script, eliminating the need for manual input and handling larger batches efficiently.

Could the process be further automated to deposit articles?

Answer: Automating the entire submission process is an intriguing idea. It would require careful consideration of repository submission protocols and faculty permissions.

Conclusion

Eric Silverberg's innovative approach demonstrates how AI tools like Google's PaLM can address practical challenges in academic libraries. By automating the extraction of journal names and retrieval of OA policies, the process becomes more efficient, encouraging greater faculty participation in open access initiatives.

The project underscores the potential of AI in streamlining workflows and enhancing access to scholarly research. Ongoing feedback and collaboration with faculty will be essential in refining the tool and maximizing its impact.

Resources and Contact Information

Eric welcomes questions, collaborations, and feedback on the project.

Acknowledgments

Special thanks to Natalie Swanberg for participating in the pilot and to all attendees for their insightful questions and engagement.

Building AI Competency in Library Staff: The Key to Success

At the Helm of Innovation: Librarians at the Forefront of AI Engagement and Integration

Presented by the Library Team at Georgetown University's International Campus in Qatar



Introduction

The advent of artificial intelligence (AI) has ushered in a new era of opportunities and challenges in the academic landscape. Recognizing the transformative potential of AI, the library team at Georgetown University's International Campus in Qatar embarked on a proactive journey to engage with and integrate AI tools across the campus. This article delves into their comprehensive approach, highlighting staff development initiatives, experimentation with AI, faculty outreach, and the incorporation of AI into daily operations.

Staff Development: Building AI Competency

The foundation of the library's AI integration strategy was robust staff development. The acting director of library services emphasized the importance of equipping the team with the necessary resources, time, and training to navigate the evolving AI landscape.

Workshops and Training Sessions

  • ALA's AI Literacy Workshop: The team participated in the American Library Association's workshop on "AI Literacy Using ChatGPT and Artificial Intelligence in Instruction," which provided valuable insights into AI applications in educational settings.
  • Collaborative Learning: The library facilitated special sessions and collaborations with colleagues to foster a culture of continuous learning and shared expertise.

Access to AI Tools

  • ChatGPT Account: A dedicated ChatGPT account was secured for the librarians, serving as a sandbox environment to explore and understand the capabilities and limitations of AI language models.
  • Skilltype Investment: The library invested in Skilltype, a talent management and development platform that provided personalized learning paths, including AI-related courses through LinkedIn Learning.

Experimenting with AI: Collaborative Exploration

Understanding the importance of hands-on experience, the library team engaged in active experimentation with AI tools.

Inter-Institutional Collaboration

The team collaborated with other institutions within Education City, including the Qatar National Library and neighboring universities like Texas A&M and Virginia Commonwealth University. These collaborative sessions focused on:

  • Demonstrating AI Tools: Sharing knowledge about various AI applications and how they can be utilized effectively.
  • Discussing Challenges: Identifying pitfalls and limitations of AI tools to develop best practices for their use.

Creative Applications of AI

The library leveraged AI creatively to enhance their services and outreach efforts:

  • Marketing Initiatives: AI tools were used to develop innovative marketing campaigns and materials, showcasing the library's commitment to embracing new technologies.
  • Workshop Development: AI was utilized to design a series of workshops aimed at exploring AI's creative potential, catering to faculty members who were hesitant to integrate AI directly into their courses.

Faculty Outreach: Bridging the Gap

Recognizing the varying levels of acceptance and familiarity with AI among faculty, the library undertook a strategic outreach initiative.

Understanding Faculty Perspectives

The team reached out to faculty members to gauge their plans and comfort levels regarding AI integration in their courses. They discovered that:

  • Some faculty were resistant to incorporating AI, often due to a lack of familiarity or concerns about academic integrity.
  • There was a trend toward eliminating traditional research papers in favor of in-class assessments to mitigate potential misuse of AI tools.

Adaptive Support and Resources

In response, the library developed alternative strategies to support faculty and students:

  • New Workshop Offerings: They created workshops that complemented and supplemented existing information literacy sessions, focusing on ethical and effective use of AI in research.
  • Alternative Assignments: The library assisted faculty in designing alternative assignments, such as podcasting and video discussions, that leveraged technology while addressing concerns about AI misuse.

Incorporating AI into Daily Operations

The library team integrated AI tools into their everyday workflows to enhance efficiency and innovation.

Brainstorming and Content Creation

  • Utilizing AI Language Models: Tools like ChatGPT and Claude were used for brainstorming ideas, drafting content, and refining communications.
  • Enhancing Marketing Efforts: AI-generated content and images were incorporated into marketing materials, increasing engagement and showcasing the library's forward-thinking approach.

AI-Driven Projects

One notable project involved using AI to recreate book covers for a library display:

  • Image Generation: Using tools like Leonardo AI, the team reimagined existing book covers, demonstrating the creative capabilities of AI.
  • Community Engagement: The display sparked interest among students and faculty, serving as a conversation starter about the role of AI in creativity and design.

Instructional Integration: AI in the Pre-Research Process

The Instructional Services Librarian took significant steps to integrate AI into the research instruction provided to students.

Addressing Citation and Academic Integrity

By the summer of 2023, the major citation styles (APA, MLA, and Chicago) had issued guidelines on citing AI tools. The library responded on several fronts:

  • Collaboration with the Writing Center: Partnered to create a cheat sheet on how to cite AI content and tools correctly.
  • Resolving Challenges: Addressed issues with citation management tools like Zotero, which lacked specific item types for AI-generated content.
  • Promoting Ethical Use: Emphasized the importance of attribution and academic integrity when using AI tools in research.

Overcoming Faculty Resistance

Some faculty members prohibited the use of AI in their syllabi. To navigate this:

  • Educational Frameworks: Utilized the CLEAR framework and UNESCO publications to demonstrate ethical and effective ways to incorporate AI into academic work.
  • Non-Generative AI Tools: Introduced tools like Research Rabbit, which assist in literature mapping without generating text, alleviating concerns about plagiarism.

Integrating AI into Lesson Plans

The librarian incorporated AI tools into instruction sessions, focusing on:

  • Free and Privacy-Conscious Tools: Selected AI applications like Copilot in Microsoft Edge that protect student data and are accessible without cost.
  • Parallel with Existing Tools: Demonstrated how AI can perform similar functions to familiar tools like Credo's concept mapping, easing the transition for both faculty and students.

AI Workshop Series: Empowering the Campus Community

To further AI literacy on campus, the library launched a futuristic-themed workshop series titled "AI's Creative Edge."

Workshop Offerings

  1. Advanced Prompt Engineering: Taught participants how to use AI for brainstorming keywords and concepts to enhance database searches.
  2. Citing AI Content: Provided hands-on training on using Zotero and Grammarly to correctly cite AI-generated material.
  3. Student Perspectives: Invited students to share their experiences and discuss ethical uses of AI tools.

Engagement and Outcomes

The workshop on citing AI content saw the highest attendance, indicating a strong interest in understanding how to use AI ethically within the bounds of academic integrity. This response highlighted the need for ongoing education and support in navigating AI's role in academia.

AI Across the Research Process

The library team developed a comprehensive framework illustrating how AI tools can be integrated at various stages of the research process:

  • Brainstorming: Tools for organizing tasks, defining topics, and generating ideas (e.g., Copilot, ChatGPT).
  • Literature Review: Non-generative AI tools for mapping literature and identifying key sources (e.g., Research Rabbit).
  • Evaluation: Using AI to verify sources, assess credibility, and filter results based on journal rankings (e.g., Consensus).
  • Citing: AI-assisted citation tools for proper attribution (e.g., Grammarly add-on with ChatGPT, integrated with Zotero).

Leadership in AI Engagement: A Collaborative Effort

The Data, Media, and Web Librarian discussed the library's leadership role in advancing AI engagement on campus.

Proactive Initiatives

  • AI Literacy Development: Embraced AI as an area of intellectual curiosity and practical application, positioning the library as a knowledge hub.
  • Workshop Series: Expanded offerings to include topics like generative AI in images, music, and video, as well as AI's impact on career development.

Creative Projects and Experimentation

  • AI-Generated Book Covers: Created a library display featuring AI-generated reimaginings of existing book covers, engaging the community in discussions about AI and creativity.
  • Teaching AI Skills: Offered instruction on prompt engineering and image generation, enabling students and staff to interact effectively with AI tools.

Advanced AI Applications

  • GPT-4 and Claude 3 Vision Features: Explored the use of AI to transcribe and analyze handwritten historical documents, enhancing access to primary sources.
  • Support for Course Development: Participated in a pilot course on learning processes and AI, addressing the ethical considerations and potential of AI in education.

Campus Collaboration and Conversations

The library facilitated campus-wide discussions and collaborations regarding AI:

  • Campus Conversations: Organized events where faculty, IT staff, admissions officers, and finance team members shared perspectives on AI's impact in their areas.
  • Faculty Workshops: Engaged with faculty to discuss AI's role in teaching and learning, offering support and resources for integration.
  • Increased Course Support: Provided enhanced support for courses incorporating AI, ensuring that students and faculty have the necessary tools and knowledge.

Overcoming Challenges and Resistance

Throughout their journey, the library encountered challenges, including resistance from faculty and staff hesitant to adopt AI tools.

Addressing Faculty Concerns

  • Demonstrating Value: Showed faculty how AI could enhance research and learning without compromising academic integrity.
  • Alternative Assignments: Assisted in designing assignments that leveraged technology while mitigating concerns about AI misuse.

Engaging Resistant Staff

  • Demonstrations and Training: Conducted sessions to showcase the practical benefits of AI, highlighting efficiency gains and new capabilities.
  • Collaborative Approach: Encouraged open dialogue and shared experiences to ease apprehensions and build confidence in using AI tools.

Conclusion

The library team at Georgetown University's International Campus in Qatar exemplifies proactive leadership in AI engagement and integration. Through dedicated staff development, innovative experimentation, strategic faculty outreach, and the incorporation of AI into daily operations, they have positioned themselves at the forefront of academic innovation.

Their efforts not only enhance the library's services but also contribute significantly to the campus's overall readiness to navigate the evolving landscape of AI in education. By fostering a culture of ethical use, continuous learning, and collaborative exploration, they are shaping a future where AI is harnessed to enrich learning, research, and creativity.

Questions and Engagement

During their presentations and workshops, the library team actively engaged with students and faculty, addressing questions such as:

  • How can AI tools be used ethically in academic work?
  • What are effective strategies for citing AI-generated content?
  • How can resistance to AI adoption among staff and faculty be overcome?

Their willingness to share resources, such as cheat sheets for citing AI content, and to collaborate across departments underscores their commitment to supporting the campus community in embracing AI responsibly and effectively.

Unlocking Hidden Treasures: The Transformative Potential of AI in Special Collections

What Can AI Do for Special Collections? Improving Access and Enhancing Discovery

Presenters: Sonia Yaco and Bala Singu



In this enlightening presentation, Sonia Yaco and Bala Singu explore the transformative potential of Artificial Intelligence (AI) in the realm of special collections. Drawing from a year-long study conducted at Rutgers University, they delve into how AI can significantly improve access to and enhance the discovery of rich archival materials.

Introduction

Special collections in libraries house a wealth of historical and cultural artifacts. However, accessing and extracting meaningful insights from these collections can be challenging due to the nature of the materials, which often include handwritten documents, rare photographs, and other hard-to-process formats.

The presenters highlight a "golden opportunity" at the intersection of rich collections, an ever-expanding set of AI tools, and a strong desire to maximize the utility of these collections. By applying AI in meaningful ways, they aim to mine this wealth of information and make it more accessible to scholars and the public alike.

The William Elliot Griffis Collection

The focal point of the study is the William Elliot Griffis Papers at Rutgers University. This extensive collection documents the lives and work of the Griffis family, who were educators and missionaries in East Asia during the Meiji period (1868-1912). The collection includes manuscripts, photographs, and published materials and is heavily utilized by scholars from Asia, the United Kingdom, and the United States.

Margaret Clark Griffis

The study specifically focuses on Margaret Clark Griffis, the sister of William Elliot Griffis. She holds historical significance as one of the first Western women to educate Japanese women. By centering on her diaries, biographies, and photographs, the presenters aim to shed light on her contributions and experiences.

Strategies for Mining the Collection

To unlock the wealth of information within the Griffis collection, the presenters employed several strategies:

  1. Extracting Text to Improve Readability: Utilizing AI tools to transcribe handwritten and typewritten documents into machine-readable text.
  2. Finding Insights in Digitized Text and Photographs: Applying natural language processing and image analysis to gain deeper understanding.
  3. Connecting Text to Images: Linking textual content with corresponding images to create a richer narrative.

Software Tools Utilized

The project explored a variety of AI tools, categorized into:

  • Generative AI for Text and Images
  • Natural Language Processing Tools
  • Optical Character Recognition (OCR) Tools
  • Other Analytical Tools

In total, they examined about 26 software tools, assessing each based on cost and learning curve. The tools ranged from free and user-friendly applications like ChatGPT 3.5 to more complex and subscription-based services like ChatGPT 4.0 and the DALL·E API.

Project Demonstrations

The presenters showcased three key demonstrations to illustrate the capabilities of AI in handling special collections:

1. Improving Readability

One of the primary challenges with special collections is the difficulty in reading handwritten and typewritten documents, especially those written in old cursive styles. To address this, the team used OCR tools to convert these documents into ASCII text, making them more accessible for computational analysis.

Handwritten Material

The team focused on transcribing Margaret Griffis's handwritten diary entries. They used tools like eScriptorium, Transkribus (AM Digital), and ChatGPT-4 to process the text. Each tool had varying levels of accuracy and challenges:

  • eScriptorium: A free tool with a moderate learning curve, it achieved an initial accuracy of around 89%.
  • Transkribus (AM Digital): A commercial tool with a higher cost but offered competitive accuracy.
  • ChatGPT-4: While powerful, it faced issues with "hallucinations," generating text not present in the original material.

By combining these tools, they improved the transcription accuracy significantly. For instance, feeding the eScriptorium output into ChatGPT-4 enhanced the accuracy to approximately 96%.
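
That "combine the tools" step can itself be scripted. The sketch below assumes the eScriptorium transcription has already been exported to a plain text file and sends it to GPT-4 through OpenAI's Python client for conservative cleanup; the model name, prompt wording, and file names are illustrative, not the presenters' exact pipeline.

# Hedged sketch: feed a raw eScriptorium transcription to GPT-4 for cleanup
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def correct_transcription(raw_text):
    prompt = (
        "The following text is an automated transcription of a 19th-century "
        "handwritten diary. Correct obvious recognition errors, but do not "
        "add, remove, or paraphrase content:\n\n" + raw_text
    )
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # keep the correction conservative
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

with open("escriptorium_output.txt") as f:
    print(correct_transcription(f.read()))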

Typewritten Material

For typewritten documents, such as William Griffis's biography of his sister, tools like Adobe Acrobat provided efficient OCR capabilities with high accuracy. These documents were easier to process compared to handwritten materials.

2. Finding Insights with AI

Once the text was extracted, the next step was to derive meaningful insights using AI techniques:

Translation

To make the content accessible to international scholars, the team utilized translation tools:

  • Google Translate: A free tool suitable for smaller text volumes.
  • Googletrans API: An unofficial Python library that wraps Google Translate, which had reliability issues and limitations on volume.
  • Google Cloud Translation API: A paid service offering high reliability for large-scale translations.
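
For larger batches, the paid Cloud Translation service can be called from the same Python scripts. The snippet below uses the google-cloud-translate package's basic (v2) client and assumes application credentials are already configured; the sample text and target language are illustrative.

# Hedged sketch: translate transcribed excerpts with Google Cloud Translation
# (basic v2 client). Assumes GOOGLE_APPLICATION_CREDENTIALS is configured.
from google.cloud import translate_v2 as translate

client = translate.Client()

excerpts = [
    "Example diary excerpt to translate for international scholars.",
]
for text in excerpts:
    result = client.translate(text, target_language='ja')
    print(result['translatedText'])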

Text Analysis and Visualization

Using natural language processing tools, the team performed analyses such as named entity recognition and topic modeling. They employed Voyant Tools, a free, open-source platform that offers various analytical capabilities:

  • Identifying key entities like names, places, and dates.
  • Visualizing word frequencies and relationships.
  • Creating interactive geographic maps based on the text.
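
Voyant is point-and-click, but the same kind of entity extraction can be scripted for reuse. The sketch below uses spaCy, which was not among the tools the presenters named, as a stand-in for pulling people, places, and dates out of the transcribed diary text.

# Hedged sketch: named entity recognition over transcribed diary text with
# spaCy (pip install spacy; python -m spacy download en_core_web_sm)
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

with open("griffis_diary_transcription.txt") as f:
    doc = nlp(f.read())

# Count people, places, and dates mentioned in the text
entities = Counter(
    (ent.label_, ent.text)
    for ent in doc.ents
    if ent.label_ in {"PERSON", "GPE", "LOC", "DATE"}
)
for (label, text), count in entities.most_common(20):
    print(f"{label:7s} {text:30s} {count}")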

Photographic Grouping

With over 427 photographs in the collection, the team sought to group images programmatically based on content similarities. By leveraging Python scripts and AI algorithms, they clustered photographs that shared visual characteristics, such as shapes, subjects, and themes.
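
One way to script that grouping is to embed every image with a pretrained vision model and cluster the embeddings. The sketch below uses a CLIP model via sentence-transformers and k-means from scikit-learn; the presenters did not name their exact model or clustering method, so treat this as one plausible implementation.

# Hedged sketch: cluster photographs by visual similarity using CLIP
# embeddings (sentence-transformers) and k-means (scikit-learn)
from pathlib import Path
from PIL import Image
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

model = SentenceTransformer('clip-ViT-B-32')

paths = sorted(Path('griffis_photos').glob('*.jpg'))
embeddings = model.encode([Image.open(p) for p in paths], batch_size=32)

# The number of clusters is a judgment call; 20 is purely illustrative
kmeans = KMeans(n_clusters=20, random_state=0, n_init='auto')
labels = kmeans.fit_predict(embeddings)

for path, label in zip(paths, labels):
    print(label, path.name)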

3. Connecting Text and Images

One of the most innovative aspects of the project was linking textual content with corresponding images to enrich the narrative:

Describing Photographs Using AI

The team used ChatGPT to generate detailed descriptions of photographs. For example, given a photograph with minimal metadata labeled "small Japanese print," ChatGPT produced an extensive description, identifying elements like traditional attire, expressions, and possible historical context.

This process significantly enhances the discoverability of images, providing researchers with richer information than previously available.
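
Generating such descriptions can be automated with a vision-capable chat model. The sketch below sends one image to OpenAI's API as a base64 data URL; the model name and prompt wording are assumptions, and outputs still need review by a curator before they enter any records.

# Hedged sketch: ask a vision-capable model to describe a photograph that has
# only minimal metadata. Model name and prompt are illustrative.
import base64
from openai import OpenAI

client = OpenAI()

def describe_photo(path):
    with open(path, 'rb') as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this photograph for an archival catalog: "
                         "subjects, attire, setting, and likely period."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

print(describe_photo("small_japanese_print.jpg"))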

Adding Metadata and Generating MARC Records

Beyond descriptions, the AI tools were used to generate metadata and even create MARC records for cataloging purposes. This automation can streamline library workflows and improve access to collections.

Generating Images from Text and Matching to Real Images

Taking the connection a step further, the team explored generating images based on extracted text and then matching these AI-generated images to real photographs in the collection:

  1. Extract Text Descriptions: Using ChatGPT to identify descriptive passages from the diary.
  2. Generate Images: Employing tools like DALL·E to create images based on these descriptions.
  3. Match to Real Images: Programmatically comparing AI-generated images to actual photographs in the collection to find potential matches.
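
A compact version of that loop might look like the following: generate an image from a diary passage with the DALL·E API, then rank the digitized photographs by CLIP similarity to the generated image. Model names, file paths, and the example passage are illustrative, and as the presenters note, candidate matches still require human review.

# Hedged sketch: generate an image from a diary passage, then rank real
# photographs by CLIP similarity to the generated image
import io
from pathlib import Path
import requests
from PIL import Image
from openai import OpenAI
from sentence_transformers import SentenceTransformer, util

client = OpenAI()
clip = SentenceTransformer('clip-ViT-B-32')

passage = "A schoolroom of young Japanese women in traditional dress, 1870s."

# 1. Generate an image from the extracted text description
gen = client.images.generate(model="dall-e-3", prompt=passage, n=1)
gen_image = Image.open(io.BytesIO(requests.get(gen.data[0].url).content))

# 2. Embed the generated image and the real photographs
photo_paths = sorted(Path('griffis_photos').glob('*.jpg'))
photo_emb = clip.encode([Image.open(p) for p in photo_paths])
gen_emb = clip.encode([gen_image])

# 3. Rank real photographs by cosine similarity to the generated image
scores = util.cos_sim(gen_emb, photo_emb)[0]
for score, path in sorted(zip(scores.tolist(), photo_paths), reverse=True)[:5]:
    print(f"{score:.3f}  {path.name}")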

While not perfect, this method opens up new avenues for discovering connections within archival materials that might not be immediately apparent.

Limitations and Takeaways

Limitations

  • Infrastructure Needs: AI requires significant resources, including computational power, software costs, and staff time.
  • Technical Expertise: A background in programming and software development is highly beneficial. Collaboration with technical staff is often necessary.
  • Learning Curves: Many AI tools, even free ones, come with steep learning curves that can be challenging to overcome.
  • Human Intervention: AI tools are not fully autonomous and require human oversight to ensure accuracy and relevance.

Takeaways

  • Combining Tools Enhances Effectiveness: Using multiple AI tools in conjunction can yield better results than using them in isolation.
  • Start with Accessible Tools: Begin with user-friendly software like Adobe Acrobat for OCR and Google Translate for initial forays into AI applications.
  • Incorporate AI into Workflows: Integrate AI tools into existing library processes to improve efficiency and output quality.
  • Partnerships are Crucial: Collaborate with technical staff, data scientists, and computer science departments to leverage expertise.

Recommendations for Libraries

The presenters offer practical advice for libraries interested in leveraging AI for their special collections:

  1. Begin with Easy-to-Use Software: Tools like Adobe Acrobat and Google Translate can have an immediate impact with minimal investment.
  2. Experiment with Text Analysis: Use platforms like Voyant Tools to gain insights into your collections and explore new research possibilities.
  3. Enhance Metadata Creation: Utilize AI to generate or enrich metadata, improving searchability and access.
  4. Seek Funding Opportunities: Apply for grants to support more extensive AI projects, such as large-scale photograph organization.
  5. Collaborate with Technical Experts: Engage with technical staff within or outside your institution to support complex AI initiatives.

Conclusion

The presentation underscores the significant potential of AI in unlocking the hidden treasures within special collections. By improving readability, finding insights, and connecting text with images, AI tools can make collections more accessible and enhance scholarly research.

The journey involves challenges, particularly in terms of resources and expertise, but the rewards can be substantial. As AI technology continues to evolve, libraries have an opportunity to embrace these tools, transform their workflows, and open their collections more fully to the world.

Questions and Further Discussion

During the Q&A session, attendees posed several insightful questions:

  • Tools for MARC Records: The presenters used ChatGPT-4 to generate MARC records from photographs, finding it effective for creating initial catalog entries.
  • Batch Processing: When asked about processing multiple images, they noted that while interactive interfaces might limit batch sizes, using APIs and programmatic approaches allows for processing larger volumes.
  • Applying Techniques to Other Formats: The techniques discussed are applicable to manuscripts, maps, and even video materials. Tools like Whisper can transcribe audio and video content, enhancing accessibility.
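
As a quick illustration of that last point, a minimal transcription pass with the open-source whisper package might look like this (file name and model size are illustrative):

# Hedged sketch: transcribe an audio recording with the open-source Whisper
# package (pip install openai-whisper)
import whisper

model = whisper.load_model("base")  # small, CPU-friendly model
result = model.transcribe("oral_history_interview.mp3")
print(result["text"])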