LearnLM is our new family of models fine-tuned for learning, and grounded in educational research to make teaching and learning experiences more active, personal and engaging.
We often talk about what Generative AI will do for coders, healthcare, science or even finance, but what about the benefits for the next generation? Permit me, if you will: here I’m thinking about teachers and students.
It’s no secret that some of the most active users of ChatGPT in its heyday were students. But how are other major tech firms thinking about this?
I actually think one of the best products with the highest ceiling from Google I/O 2024 is LearnLM. It has to be way more than a chatbot; it has to feel like a multimodal tutor. I can imagine frontier model agents (H) doing this fairly well.
What if everyone, everywhere could have their own personal AI tutor, on any topic?
ChatGPT4o Is the TikTok of AI Models — from nickpotkalitsky.substack.com by Nick Potkalitsky In Search of Better Tools for AI Access in K-12 Classrooms
Nick makes the case that we should pause the use of OpenAI products in classrooms:
In light of these observations, it’s clear that we must pause and rethink the use of OpenAI products in our classrooms, except for rare cases where accessibility needs demand it. The rapid consumerization of AI, epitomized by GPT4o’s transformation into an AI salesperson, calls for caution.
[On 5/21/24] at Microsoft Build, Microsoft and Khan Academy announced a new partnership that aims to bring these time-saving and lesson-enhancing AI tools to millions of educators. By donating access to Azure AI-optimized infrastructure, Microsoft is enabling Khan Academy to offer all K-12 educators in the U.S. free access to the pilot of Khanmigo for Teachers, which will now be powered by Azure OpenAI Service.
The two companies will also collaborate to explore opportunities to improve AI tools for math tutoring in an affordable, scalable and adaptable way with a new version of Phi-3, a family of small language models (SLMs) developed by Microsoft.
Khan Academy’s AI assistant, Khanmigo, has earned praise for helping students to understand and practice everything from math to English, but it can also help teachers devise lesson plans, formulate questions about assigned readings, and even generate reading passages appropriate for students at different levels. More than just a chatbot, the software offers specific AI-powered tools for generating quizzes and assignment instructions, drafting lesson plans, and formulating letters of recommendation.
…
Having a virtual teaching assistant is especially valuable in light of recent research from the RAND Corporation that found teachers work longer hours than most working adults, which includes administrative and prep work outside the classroom.
Copilot+ PCs are the fastest, most intelligent Windows PCs ever built. With powerful new silicon capable of an incredible 40+ TOPS (trillion operations per second), all-day battery life and access to the most advanced AI models, Copilot+ PCs will enable you to do things you can’t on any other PC. Easily find and remember what you have seen in your PC with Recall, generate and refine AI images in near real-time directly on the device using Cocreator, and bridge language barriers with Live Captions, translating audio from 40+ languages into English.
From DSC: As a first, off-the-cuff look, Recall could be fraught with security and privacy issues. But what do I know? The Neuron states “Microsoft assures that everything Recall sees remains private.” Ok…
From The Rundown AI concerning the above announcements:
The details:
A new system enables Copilot+ PCs to run AI workloads up to 20x faster and 100x more efficiently than traditional PCs.
Windows 11 has been rearchitected specifically for AI, integrating the Copilot assistant directly into the OS.
New AI experiences include a new feature called Recall, which allows users to search for anything they’ve seen on their screen with natural language.
Copilot’s new screen-sharing feature allows AI to watch, hear, and understand what a user is doing on their computer and answer questions in real-time.
Copilot+ PCs will start at $999, and ship with OpenAI’s latest GPT-4o models.
Why it matters: Tony Stark’s all-powerful JARVIS AI assistant is getting closer to reality every day. Once Copilot, ChatGPT, Project Astra, or anyone else can not only respond but start executing tasks autonomously, things will start getting really exciting — and likely initiate a whole new era of tech work.
AI’s New Conversation Skills Eyed for Education — from insidehighered.com by Lauren Coffey The latest ChatGPT’s more human-like verbal communication has professors pondering personalized learning, on-demand tutoring and more classroom applications.
ChatGPT’s newest version, GPT-4o (the “o” standing for “omni,” meaning “all”), has a more realistic voice and quicker verbal response time, both aiming to sound more human. The version, which should be available to free ChatGPT users in coming weeks (a change also hailed by educators), allows people to interrupt it while it speaks, simulates more emotions with its voice and translates languages in real time. It also can understand instructions in text and images and has improved video capabilities.
…
Ajjan said she immediately thought the new vocal and video capabilities could allow GPT to serve as a personalized tutor. Personalized learning has been a focus for educators grappling with the looming enrollment cliff and for those pushing for student success.
There’s also the potential for role playing, according to Ajjan. She pointed to mock interviews students could do to prepare for job interviews, or, for example, using GPT to play the role of a buyer to help prepare students in an economics course.
A Guide to the GPT-4o ‘Omni’ Model — from aieducation.substack.com by Claire Zau The closest thing we have to “Her” and what it means for education / workforce
Today, OpenAI introduced its new flagship model, GPT-4o, that delivers more powerful capabilities and real-time voice interactions to its users. The letter “o” in GPT-4o stands for “Omni”, referring to its enhanced multimodal capabilities. While ChatGPT has long offered a voice mode, GPT-4o is a step change in allowing users to interact with an AI assistant that can reason across voice, text, and vision in real-time.
Facilitating interaction between humans and machines (with reduced latency) represents a “small step for machine, giant leap for machine-kind” moment.
Everyone gets access to GPT-4: “the special thing about GPT-4o is it brings GPT-4 level intelligence to everyone, including our free users,” said CTO Mira Murati. Free users will also get access to custom GPTs in the GPT store, Vision and Code Interpreter. ChatGPT Plus and Team users will be able to start using GPT-4o’s text and image capabilities now.
ChatGPT launched a desktop macOS app: it’s designed to integrate seamlessly into anything a user is doing on their keyboard. A PC Windows version is also in the works (it is notable that a Mac version is being released first, given the $10B Microsoft relationship).
In a surprise launch, OpenAI dropped GPT-4 Omni, their new leading model. They also made a bunch of paid features in ChatGPT free and announced a new desktop app. Pete breaks down what you should know and what this says about AI.
Generative AI is fundamentally changing how we’re approaching learning and education, enabling powerful new ways to support educators and learners. It’s taking curiosity and understanding to the next level — and we’re just at the beginning of how it can help us reimagine learning.
Today we’re introducing LearnLM: our new family of models fine-tuned for learning, based on Gemini.
On YouTube, a conversational AI tool makes it possible to figuratively “raise your hand” while watching academic videos to ask clarifying questions, get helpful explanations or take a quiz on what you’ve been learning. This even works with longer educational videos like lectures or seminars thanks to the Gemini model’s long-context capabilities. These features are already rolling out to select Android users in the U.S.
… Learn About is a new Labs experience that explores how information can turn into understanding by bringing together high-quality content, learning science and chat experiences. Ask a question and it helps guide you through any topic at your own pace — through pictures, videos, webpages and activities — and you can upload files or notes and ask clarifying questions along the way.
The Gemini era
A year ago on the I/O stage we first shared our plans for Gemini: a frontier model built to be natively multimodal from the beginning, that could reason across text, images, video, code, and more. It marks a big step in turning any input into any output — an “I/O” for a new generation.
Google is integrating AI into all of its ecosystem: Search, Workspace, Android, etc. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model was also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.
Google just casually announced Veo, a new rival to OpenAI’s Sora.
It can generate insanely good 1080p video up to 60 seconds.
Today at Google I/O we’re announcing new, powerful ways to get more done in your personal and professional life with Gemini for Google Workspace. Gemini in the side panel of your favorite Workspace apps is rolling out more broadly and will use the 1.5 Pro model for answering a wider array of questions and providing more insightful responses. We’re also bringing more Gemini capabilities to your Gmail app on mobile, helping you accomplish more on the go. Lastly, we’re showcasing how Gemini will become the connective tissue across multiple applications with AI-powered workflows. And all of this comes fresh on the heels of the innovations and enhancements we announced last month at Google Cloud Next.
Google is improving its AI-powered chatbot Gemini so that it can better understand the world around it — and the people conversing with it.
At the Google I/O 2024 developer conference on Tuesday, the company previewed a new experience in Gemini called Gemini Live, which lets users have “in-depth” voice chats with Gemini on their smartphones. Users can interrupt Gemini while the chatbot’s speaking to ask clarifying questions, and it’ll adapt to their speech patterns in real time. And Gemini can see and respond to users’ surroundings, either via photos or video captured by their smartphones’ cameras.
Generative AI in Search: Let Google do the searching for you — from blog.google With expanded AI Overviews, more planning and research capabilities, and AI-organized search results, our custom Gemini model can take the legwork out of searching.
Hello GPT-4o — from openai.com We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.
Providing inflection, emotions, and a human-like voice
Understanding what the camera is looking at and integrating it into the AI’s responses
Providing customer service
With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.
This demo is insane.
A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.
Imagine giving this to every student in the world.
Trends
As a first activity, we asked the Horizon panelists to provide input on the macro trends they believe are going to shape the future of postsecondary teaching and learning and to provide observable evidence for those trends. To ensure an expansive view of the larger trends serving as context for institutions of higher education, panelists provided input across five trend categories: social, technological, economic, environmental, and political. Given the widespread impacts of emerging AI technologies on higher education, we are also including in this year’s report a list of “honorary trends” focused on AI. After several rounds of voting, the panelists selected the following trends as the most important:
The Ethical and Emotional Implications of AI Voice Preservation
Legal Considerations and Voice Rights From a legal perspective, the burgeoning use of AI in voice cloning also introduces a complex web of rights and permissions. The recent passage of Tennessee’s ELVIS Act, which allows legal action against unauthorized recreations of an artist’s voice, underscores the necessity for robust legal frameworks to manage these technologies. For non-celebrities, the idea of a personal voice bank brings about its own set of legal challenges. How do we regulate the use of an individual’s voice after their death? Who holds the rights to control and consent to the usage of these digital artifacts?
To safeguard against misuse, any system of voice banking would need stringent controls over who can access and utilize these voices. The creation of such banks would necessitate clear guidelines and perhaps even contractual agreements stipulating the terms under which these voices may be used posthumously.
Should we all consider creating voice banks to preserve our voices, allowing future generations the chance to interact with us even after we are gone?
Microsoft’s new ChatGPT competitor… — from The Rundown AI
The Rundown: Microsoft is reportedly developing a massive 500B-parameter in-house LLM called MAI-1, aiming to compete with top AI models from OpenAI, Anthropic, and Google.
Hampton runs a private community for high-growth tech founders and CEOs. We asked our community of founders and owners how AI has impacted their business and what tools they use.
Here’s a sneak peek of what’s inside:
The budgets they set aside for AI research and development
The most common (and obscure) tools founders are using
Measurable business impacts founders have seen through using AI
Where they are purposefully not using AI and much more
To help leaders and organizations overcome AI inertia, Microsoft and LinkedIn looked at how AI will reshape work and the labor market broadly, surveying 31,000 people across 31 countries, identifying labor and hiring trends from LinkedIn, and analyzing trillions of Microsoft 365 productivity signals as well as research with Fortune 500 customers. The data points to insights every leader and professional needs to know—and actions they can take—when it comes to AI’s implications for work.
the internet eliminated time and place as barriers to education, and
generative AI eliminates access to expertise as a barrier to education.
Just as instructional designs had to be updated to account for all the changes in affordances of online learning, they will need to be dramatically updated again to account for the new affordances of generative AI.
The Curious Educator’s Guide to AI | Strategies and Exercises for Meaningful Use in Higher Ed — from ecampusontario.pressbooks.pub by Kyle Mackie and Erin Aspenlieder; via Stephen Downes
This guide is designed to help educators and researchers better understand the evolving role of Artificial Intelligence (AI) in higher education. This openly-licensed resource contains strategies and exercises to help foster an understanding of AI’s potential benefits and challenges. We start with a foundational approach, providing you with prompts on aligning AI with your curiosities and goals.
The middle section of this guide encourages you to explore AI tools and offers some insights into potential applications in teaching and research. Along with exposure to the tools, we’ll discuss when and how to effectively build AI into your practice.
The final section of this guide includes strategies for evaluating and reflecting on your use of AI. Throughout, we aim to promote use that is effective, responsible, and aligned with your educational objectives. We hope this resource will be a helpful guide in making informed and strategic decisions about using AI-powered tools to enhance teaching and learning and research.
Annual Provosts’ Survey Shows Need for AI Policies, Worries Over Campus Speech — from insidehighered.com by Ryan Quinn Many institutions are not yet prepared to help their faculty members and students navigate artificial intelligence. That’s just one of multiple findings from Inside Higher Ed’s annual survey of chief academic officers.
Only about one in seven provosts said their colleges or universities had reviewed the curriculum to ensure it will prepare students for AI in their careers. Thuswaldner said that number needs to rise. “AI is here to stay, and we cannot put our heads in the sand,” he said. “Our world will be completely dominated by AI and, at this point, we ain’t seen nothing yet.”
Is GenAI in education more of a Blackberry or iPhone? — from futureofbeinghuman.com by Andrew Maynard There’s been a rush to incorporate generative AI into every aspect of education, from K-12 to university courses. But is the technology mature enough to support the tools that rely on it?
In other words, it’s going to mean investing in concepts, not products.
This, to me, is at the heart of an “iPhone mindset” as opposed to a “Blackberry mindset” when it comes to AI in education — an approach that avoids hard wiring in constantly changing technologies, and that builds experimentation and innovation into the very DNA of learning.
…
For all my concerns here though, maybe there is something to being inspired by the Blackberry/iPhone analogy — not as a playbook for developing and using AI in education, but as a mindset that embraces innovation while avoiding becoming locked in to apps that are detrimentally unreliable and that ultimately lead to dead ends.
Randomized-controlled experiments investigating novice and experienced teachers’ ability to identify AI-generated texts.
Generative AI can simulate student essay writing in a way that is undetectable for teachers.
Teachers are overconfident in their source identification.
AI-generated essays tend to be assessed more positively than student-written texts.
Can Using a Grammar Checker Set Off AI-Detection Software? — from edsurge.com by Jeffrey R. Young A college student says she was falsely accused of cheating, and her story has gone viral. Where is the line between acceptable help and cheating with AI?
ChatGPT shaming is a thing – and it shouldn’t be — from futureofbeinghuman.com by Andrew Maynard There’s a growing tension between early and creative adopters of text based generative AI and those who equate its use with cheating. And when this leads to shaming, it’s a problem.
Excerpt (emphasis DSC):
This will sound familiar to anyone who’s incorporating generative AI into their professional workflows. But there are still many people who haven’t used apps like ChatGPT, are largely unaware of what they do, and are suspicious of them. And yet they’ve nevertheless developed strong opinions around how they should and should not be used.
From DSC: Yes…that sounds like how many faculty members viewed online learning, even though they had never taught online before.
OpenAI rolls out Memory feature for ChatGPT
OpenAI has introduced a cool update for ChatGPT (rolling out to paid and free users – but not in the EU or Korea), enabling the AI to remember user-specific details across sessions. This memory feature enhances personalization and efficiency, making your interactions with ChatGPT more relevant and engaging.
Key Features
Automatic Memory Tracking
ChatGPT now automatically records information from your interactions such as preferences, interests, and plans. This allows the AI to refine its responses over time, making each conversation increasingly tailored to you.
Enhanced Personalization
The more you interact with ChatGPT, the better it understands your needs and adapts its responses accordingly. This personalization improves the relevance and efficiency of your interactions, whether you’re asking for help with daily tasks or discussing complex topics.
Memory Management Options
You have full control over this feature. You can view what information is stored, toggle the memory on or off, and delete specific data or all memory entries, ensuring your privacy and preferences are respected.
Memory is now available to all ChatGPT Plus users. Using Memory is easy: just start a new chat and tell ChatGPT anything you’d like it to remember.
Memory can be turned on or off in settings and is not currently available in Europe or Korea. Team, Enterprise, and GPTs to come.
From DSC: The ability of AI-based applications to remember things about us will have major, positive ramifications for learning-related applications of AI.