AI’s New Conversation Skills Eyed for Education — from insidehighered.com by Lauren Coffey The latest ChatGPT’s more human-like verbal communication has professors pondering personalized learning, on-demand tutoring and more classroom applications.
ChatGPT’s newest version, GPT-4o (the “o” standing for “omni,” meaning “all”), has a more realistic voice and quicker verbal response time, both aiming to sound more human. The version, which should be available to free ChatGPT users in coming weeks—a change also hailed by educators—allows people to interrupt it while it speaks, simulates more emotions with its voice and translates languages in real time. It also can understand instructions in text and images and has improved video capabilities.
…
Ajjan said she immediately thought the new vocal and video capabilities could allow GPT to serve as a personalized tutor. Personalized learning has been a focus for educators grappling with the looming enrollment cliff and for those pushing for student success.
There’s also the potential for role playing, according to Ajjan. She pointed to mock interviews students could do to prepare for job interviews, or, for example, using GPT to play the role of a buyer to help prepare students in an economics course.
Generative AI is fundamentally changing how we’re approaching learning and education, enabling powerful new ways to support educators and learners. It’s taking curiosity and understanding to the next level — and we’re just at the beginning of how it can help us reimagine learning.
Today we’re introducing LearnLM: our new family of models fine-tuned for learning, based on Gemini.
On YouTube, a conversational AI tool makes it possible to figuratively “raise your hand” while watching academic videos to ask clarifying questions, get helpful explanations or take a quiz on what you’ve been learning. This even works with longer educational videos like lectures or seminars thanks to the Gemini model’s long-context capabilities. These features are already rolling out to select Android users in the U.S.
… Learn About is a new Labs experience that explores how information can turn into understanding by bringing together high-quality content, learning science and chat experiences. Ask a question and it helps guide you through any topic at your own pace — through pictures, videos, webpages and activities — and you can upload files or notes and ask clarifying questions along the way.
The Gemini era
A year ago on the I/O stage we first shared our plans for Gemini: a frontier model built to be natively multimodal from the beginning, that could reason across text, images, video, code, and more. It marks a big step in turning any input into any output — an “I/O” for a new generation.
Google is integrating AI into all of its ecosystem: Search, Workspace, Android, etc. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model is also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.
Google just casually announced Veo, a new rival to OpenAI’s Sora.
It can generate insanely good 1080p video up to 60 seconds.
Today at Google I/O we’re announcing new, powerful ways to get more done in your personal and professional life with Gemini for Google Workspace. Gemini in the side panel of your favorite Workspace apps is rolling out more broadly and will use the 1.5 Pro model for answering a wider array of questions and providing more insightful responses. We’re also bringing more Gemini capabilities to your Gmail app on mobile, helping you accomplish more on the go. Lastly, we’re showcasing how Gemini will become the connective tissue across multiple applications with AI-powered workflows. And all of this comes fresh on the heels of the innovations and enhancements we announced last month at Google Cloud Next.
Google is improving its AI-powered chatbot Gemini so that it can better understand the world around it — and the people conversing with it.
At the Google I/O 2024 developer conference on Tuesday, the company previewed a new experience in Gemini called Gemini Live, which lets users have “in-depth” voice chats with Gemini on their smartphones. Users can interrupt Gemini while the chatbot’s speaking to ask clarifying questions, and it’ll adapt to their speech patterns in real time. And Gemini can see and respond to users’ surroundings, either via photos or video captured by their smartphones’ cameras.
Generative AI in Search: Let Google do the searching for you — from blog.google With expanded AI Overviews, more planning and research capabilities, and AI-organized search results, our custom Gemini model can take the legwork out of searching.
Hello GPT-4o — from openai.com We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.
Providing inflection, emotions, and a human-like voice
Understanding what the camera is looking at and integrating it into the AI’s responses
Providing customer service
With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.
This demo is insane.
A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.
Imagine giving this to every student in the world.
The Ethical and Emotional Implications of AI Voice Preservation
Legal Considerations and Voice Rights
From a legal perspective, the burgeoning use of AI in voice cloning also introduces a complex web of rights and permissions. The recent passage of Tennessee’s ELVIS Act, which allows legal action against unauthorized recreations of an artist’s voice, underscores the necessity for robust legal frameworks to manage these technologies. For non-celebrities, the idea of a personal voice bank brings about its own set of legal challenges. How do we regulate the use of an individual’s voice after their death? Who holds the rights to control and consent to the usage of these digital artifacts?
To safeguard against misuse, any system of voice banking would need stringent controls over who can access and utilize these voices. The creation of such banks would necessitate clear guidelines and perhaps even contractual agreements stipulating the terms under which these voices may be used posthumously.
Should we all consider creating voice banks to preserve our voices, allowing future generations the chance to interact with us even after we are gone?
Last week a behemoth of a paper was released by AI researchers in academia and industry on the ethics of advanced AI assistants.
It’s one of the most comprehensive and thoughtful papers on developing transformative AI capabilities in socially responsible ways that I’ve read in a while. And it’s essential reading for anyone developing and deploying AI-based systems that act as assistants or agents — including many of the AI apps and platforms that are currently being explored in business, government, and education.
The paper, The Ethics of Advanced AI Assistants, is written by 57 co-authors representing researchers at Google DeepMind, Google Research, Jigsaw, and a number of prominent universities that include Edinburgh University, the University of Oxford, and Delft University of Technology. Coming in at 274 pages, this is a massive piece of work. And as the authors persuasively argue, it’s a critically important one at this point in AI development.
Key questions for the ethical and societal analysis of advanced AI assistants include:
What is an advanced AI assistant? How does an AI assistant differ from other kinds of AI technology?
What capabilities would an advanced AI assistant have? How capable could these assistants be?
What is a good AI assistant? Are there certain values that we want advanced AI assistants to evidence across all contexts?
Are there limits on what AI assistants should be allowed to do? If so, how are these limits determined?
What should an AI assistant be aligned with? With user instructions, preferences, interests, values, well-being or something else?
What issues need to be addressed for AI assistants to be safe? What does safety mean for this class of technologies?
What new forms of persuasion might advanced AI assistants be capable of? How can we ensure that users remain appropriately in control of the technology?
How can people – especially vulnerable users – be protected from AI manipulation and unwanted disclosure of personal information?
Is anthropomorphism for AI assistants morally problematic? If so, might it still be permissible under certain conditions?
A new company called Archetype is trying to tackle that problem: It wants to make AI useful for more than just interacting with and understanding the digital realm. The startup just unveiled Newton — “the first foundation model that understands the physical world.”
What’s it for?
A warehouse or factory might have 100 different sensors that have to be analyzed separately to figure out whether the entire system is working as intended. Newton can understand and interpret all of the sensors at the same time, giving a better overview of how everything’s working together. Another benefit: You can ask Newton questions in plain English without needing much technical expertise.
How does it work?
Newton collects data from radar, motion sensors, and chemical and environmental trackers
It uses an LLM to combine each of those data streams into a cohesive package
It translates that data into text, visualizations, or code so it’s easy to understand
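The three steps above can be pictured with a small sketch. This is not Archetype's code or API, and it uses no LLM: it is a minimal illustration, assuming hypothetical sensor names and a made-up `Reading` type, of how several sensor streams might be grouped and turned into a plain-text summary.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    """One measurement from one sensor (hypothetical structure)."""
    sensor: str   # illustrative sensor name, e.g. "radar"
    value: float  # measurement in the sensor's own unit
    unit: str


def summarize(readings: list[Reading]) -> str:
    """Group readings by sensor and describe each stream in one line of text."""
    by_sensor: dict[str, list[Reading]] = {}
    for r in readings:
        by_sensor.setdefault(r.sensor, []).append(r)

    lines = []
    for sensor in sorted(by_sensor):
        group = by_sensor[sensor]
        avg = mean(r.value for r in group)
        lines.append(
            f"{sensor}: {len(group)} readings, mean {avg:.1f} {group[0].unit}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        Reading("temperature", 20.0, "C"),
        Reading("temperature", 22.0, "C"),
        Reading("motion", 1.0, "events"),
    ]
    print(summarize(sample))
```

A real system would replace the simple averaging with a learned model. The sketch only shows the overall shape: many streams in, one readable description out.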
Apple has entered into a significant agreement with stock photography provider Shutterstock to license millions of images for training its artificial intelligence models. According to a Reuters report, the deal is estimated to be worth between $25 million and $50 million, placing Apple among several tech giants racing to secure vast troves of data to power their AI systems.
AWS, Educause partner on generative AI readiness tool — from edscoop.com by Skylar Rispens Amazon Web Services and the nonprofit Educause announced a new tool designed to help higher education institutions gauge their readiness to adopt generative artificial intelligence.
Amazon Web Services and the nonprofit Educause on Monday announced they’ve teamed up to develop a tool that assesses how ready higher education institutions are to adopt generative artificial intelligence.
Through a series of curated questions about institutional strategy, governance, capacity and expertise, AWS and Educause claim their assessment can point to ways that operations can be improved before generative AI is adopted to support students and staff.
“Generative AI will transform how educators engage students inside and outside the classroom, with personalized education and accessible experiences that provide increased student support and drive better learning outcomes,” Kim Majerus, vice president of global education and U.S. state and local government at AWS, said in a press release. “This assessment is a practical tool to help colleges and universities prepare their institutions to maximize this technology and support students throughout their higher ed journey.”
Speaking of AI and our learning ecosystems, also see:
At a moment when the value of higher education has come under increasing scrutiny, institutions around the world can be exactly what learners and employers both need. To meet the needs of a rapidly changing job market and equip learners with the technical and ethical direction needed to thrive, institutions should familiarize students with the use of AI and nurture the innately human skills needed to apply it ethically. Failing to do so can create enormous risk for higher education, business and society.
What is AI literacy?
To effectively utilize generative AI, learners will need to grasp the appropriate use cases for these tools, understand when their use presents significant downside risk, and learn to recognize abuse to separate fact from fiction. AI literacy is a deeply human capacity. The critical thinking and communication skills required are muscles that need repeated training to be developed and maintained.
[Report] The Top 100 AI for Work – April 2024 — from flexos.work; with thanks to Daan van Rossum for this resource AI is helping us work up to 41% more effectively, according to recent Bain research. We review the platforms to consider for ourselves and our teams.
Following our AI Top 150, we spent the past few weeks analyzing data on the top AI platforms for work. This report shares key insights, including the AI tools you should consider adopting to work smarter, not harder.
While there is understandable concern about AI in the work context, the platforms in this list paint a different picture. They show a future of work where people can do what humans are best suited for while offloading repetitive, digital tasks to AI.
This will fuel the notion that it’s not AI that takes your job but a supercharged human with an army of AI tools and agents. This should be a call to action for every working person and business leader reading this.
What about course videos? Professors can create them (by lecturing into a camera for several hours hopefully in different clothes) from the readings, from their interpretations of the readings, from their own case experiences – from anything they like. But now professors can direct the creation of the videos by talking – actually describing – to a CustomGPT about what they’d like the video to communicate with their or another image. Wait. What? They can make a video by talking to a CustomGPT and even select the image they want the “actor” to use? Yes. They can also add a British accent and insert some (GenAI-developed) jokes into the videos if they like. All this and much more is now possible. This means that a professor can specify how long the video should be, what sources should be consulted and describe the demeanor the professor wants the video to project.
From DSC: Though I wasn’t crazy about the clickbait type of title here, I still thought that the article was solid and thought-provoking. It contained several good ideas for using AI.
Excerpt from a recent EdSurge Higher Ed newsletter:
There are darker metaphors though — ones that focus on the hazards for humanity of the tech. Some professors worry that AI bots are simply replacing hired essay-writers for many students, doing work for a student that they can then pass off as their own (and doing it for free).
From DSC: Hmmm…the use of essay writers was around long before AI became mainstream within higher education. So we already had a serious problem where students didn’t see the why in what they were being asked to do. Some students still aren’t sold on the why of the work in the first place. The situation seems to involve ethics, yes, but it also seems to say that we haven’t sold students on the benefits of putting in the work. Students seem to be saying I don’t care about this stuff…I just need the degree so I can exit stage left.
My main point: The issue didn’t start with AI…it started long before that.
This financial stagnation is occurring as we face a multitude of escalating challenges. These challenges include, but are in no way limited to, chronic absenteeism, widespread student mental health issues, critical staff shortages, rampant classroom behavior issues, a palpable sense of apathy for education in students, and even, I dare say, hatred towards education among parents and policymakers.
…
Our current focus is on keeping our heads above water, ensuring our students’ safety and mental well-being, and simply keeping our schools staffed and our doors open.
What is Ed? An easy-to-understand learning platform designed by Los Angeles Unified to increase student achievement. It offers personalized guidance and resources to students and families 24/7 in over 100 languages.
Also relevant/see:
Los Angeles Unified Bets Big on ‘Ed,’ an AI Tool for Students — by Lauraine Langreo
The Los Angeles Unified School District has launched an AI-powered learning tool that will serve as a “personal assistant” to students and their parents. The tool, named “Ed,” can provide students from the nation’s second-largest district information about their grades, attendance, upcoming tests, and suggested resources to help them improve their academic skills on their own time, Superintendent Alberto Carvalho announced March 20. Students can also use the app to find social-emotional-learning resources, see what’s for lunch, and determine when their bus will arrive.
Could OpenAI’s Sora be a big deal for elementary school kids? — from futureofbeinghuman.com by Andrew Maynard Despite all the challenges it comes with, AI-generated video could unleash the creativity of young children and provide insights into their inner worlds – if it’s developed and used responsibly
Like many others, I’m concerned about the challenges that come with hyper-realistic AI-generated video. From deep fakes and disinformation to blurring the lines between fact and fiction, generative AI video is calling into question what we can trust, and what we cannot.
And yet despite all the issues the technology is raising, it also holds quite incredible potential, including as a learning and development tool — as long as we develop and use it responsibly.
I was reminded of this a few days back while watching the latest videos from OpenAI created by their AI video engine Sora — including the one below generated from the prompt “an elephant made of leaves running in the jungle”
…
What struck me while watching this — perhaps more than any of the other videos OpenAI has been posting on its TikTok channel — is the potential Sora has for translating the incredibly creative but often hard to articulate ideas someone may have in their head, into something others can experience.
Can AI Aid the Early Education Workforce? — from edsurge.com by Emily Tate Sullivan During a panel at SXSW EDU 2024, early education leaders discussed the potential of AI to support and empower the adults who help our nation’s youngest children.
While the vast majority of the conversations about AI in education have centered on K-12 and higher education, few have considered the potential of this innovation in early care and education settings.
At the conference, a panel of early education leaders gathered to do just that, in a session exploring the potential of AI to support and empower the adults who help our nation’s youngest children, titled, “ChatECE: How AI Could Aid the Early Educator Workforce.”
Hau shared that K-12 educators are using the technology to improve efficiency in a number of ways, including to draft individualized education programs (IEPs), create templates for communicating with parents and administrators, and in some cases, to support building lesson plans.
Educators are, perhaps rightfully so, cautious about incorporating AI in their classrooms. With thoughtful implementation, however, AI image generators, with their ability to use any language, can provide powerful ways for students to engage with the target language and increase their proficiency.
While AI offers numerous benefits, it’s crucial to remember that it is a tool to empower educators, not replace them. The human connection between teacher and student remains central to fostering creativity, critical thinking, and social-emotional development. The role of teachers will shift towards becoming facilitators, curators, and mentors who guide students through personalized learning journeys. By harnessing the power of AI, educators can create dynamic and effective classrooms that cater to each student’s individual needs. This paves the way for a more engaging and enriching learning experience that empowers students to thrive.
In this article, seven teachers across the world share their insights on AI tools for educators. You will hear a host of varied opinions and perspectives on everything from whether AI could hasten the decline of learning foreign languages to whether AI-generated lesson plans are an infringement on teachers’ rights. A common theme emerged from those we spoke with: just as the internet changed education, AI tools are here to stay, and it is prudent for teachers to adapt.
Even though it’s been more than a year since ChatGPT made a big splash in the K-12 world, many teachers say they are still not receiving any training on using artificial intelligence tools in the classroom.
More than 7 in 10 teachers said they haven’t received any professional development on using AI in the classroom, according to a nationally representative EdWeek Research Center survey of 953 educators, including 553 teachers, conducted between Jan. 31 and March 4.
From DSC: This article mentioned the following resource:
How Early Adopters of Gen AI Are Gaining Efficiencies — from knowledge.wharton.upenn.edu by Prasanna (Sonny) Tambe and Scott A. Snyder; via Ray Schroeder on LinkedIn Enterprises are seeing gains from generative AI in productivity and strategic planning, according to speakers at a recent Wharton conference.
Its unique strengths in translation, summation, and content generation are especially useful in processing unstructured data. Some 80% of all new data in enterprises is unstructured, he noted, citing research firm Gartner. Very little of that unstructured data that resides in places like emails “is used effectively at the point of decision making,” he noted. “[With gen AI], we have a real opportunity” to garner new insights from all the information that resides in emails, team communication platforms like Slack, and agile project management tools like Jira, he said.
Here are 6 YouTube channels I watch to stay up to date with AI. This list will be useful whether you’re a casual AI enthusiast or an experienced programmer.
1. Matt Wolfe: AI for non-coders
This is a fast-growing YouTube channel focused on artificial intelligence for non-coders. On this channel, you’ll find videos about ChatGPT, Midjourney, and any AI tool that’s gaining popularity.
#3 Photomath
Photomath is a comprehensive math help app that provides step-by-step explanations for a wide range of math problems, from elementary to college level. Photomath is only available as a mobile app.
Features:
Get step-by-step solutions with multiple methods to choose from
Scan any math problem, including word problems, using the app’s camera
Access custom visual aids and extra “how” and “why” tips for deeper understanding
Google researchers have developed a new artificial intelligence system that can generate lifelike videos of people speaking, gesturing and moving — from just a single still photo. The technology, called VLOGGER, relies on advanced machine learning models to synthesize startlingly realistic footage, opening up a range of potential applications while also raising concerns around deepfakes and misinformation.
I’m fascinated by the potential of these tools to augment and enhance our work and creativity. There’s no denying the impressive capabilities we’re already seeing with text generation, image creation, coding assistance, and more. Used thoughtfully, AI can be a powerful productivity multiplier.
At the same time, I have significant concerns about the broader implications of this accelerating technology, especially for education and society at large. We’re traversing new ground at a breakneck pace, and it’s crucial that we don’t blindly embrace AI without considering the potential risks.
My worry is that by automating away too many tasks, even seemingly rote ones like creating slide decks, we risk losing something vital—humanity at the heart of knowledge work.
Nvidia has announced a partnership with Hippocratic AI to introduce AI “agents” aimed at replacing nurses in hospitals. These AI “nurses” come at a significantly lower cost than human nurses and are purportedly intended to address staffing issues by handling “low-risk,” patient-facing tasks via video calls. However, concerns have been raised regarding the ethical implications and effectiveness of replacing human nurses with AI, particularly given the complex nature of medical care.
A glimpse of the future of AI at work:
I got early access to Devin, the “AI developer” – it is slow & breaks often, but you can start to see what an AI agent can do.
It makes a plan and executes it autonomously, doing research, writing code & debugging, without you watching. pic.twitter.com/HHBQQDQZ9q
What if, for example, the corporate learning system knew who you were and you could simply ask it a question and it would generate an answer, a series of resources, and a dynamic set of learning objects for you to consume? In some cases you’ll take the answer and run. In other cases you’ll pore over the content. And in other cases you’ll browse through the course and take the time to learn what you need.
And suppose all this happened in a totally personalized way. So you didn’t see a “standard course” but a special course based on your level of existing knowledge?
This is what AI is going to bring us. And yes, it’s already happening today.
NVIDIA Digital Human Technologies Bring AI Characters to Life
Leading AI Developers Use Suite of NVIDIA Technologies to Create Lifelike Avatars and Dynamic Characters for Everything From Games to Healthcare, Financial Services and Retail Applications
Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning.
A new study from the University of Tokyo has highlighted the positive effect that immersive virtual reality experiences have for depression anti-stigma and knowledge interventions compared to traditional video.
…
The study found that depression knowledge improved for both interventions; however, only the immersive VR intervention reduced stigma. In the VR-powered intervention, depression knowledge scores were positively associated with a neural response in the brain that is indicative of empathetic concern. The traditional video intervention saw the inverse, with participants demonstrating a brain response that suggests a distress-related reaction.
From DSC: This study makes me wonder why we haven’t heard of more VR-based uses in diversity training. I’m surprised we haven’t heard of situations where we are put in someone else’s moccasins, so to speak. We could have a lot more empathy for someone — and better understand their situation — if we were to experience life as others might experience it. In the process, we would likely uncover some hidden biases that we have.