24x7x365 access

we think there is room to make search much better than it is today.

we are launching a new prototype called SearchGPT: https://t.co/A28Y03X1So

we will learn from the prototype, make it better, and then integrate the tech into ChatGPT to make it real-time and maximally helpful.

— Sam Altman (@sama) July 25, 2024

Also see:

“Who to follow in AI” in 2024? [Part II] — from ai-supremacy.com by Michael Spencer [some of posting is behind a paywall]
Part II – #19-34
“Who to follow in AI” in 2024? [Part III] — from ai-supremacy.com by Michael Spencer [some of posting is behind a paywall]
Part III – #35-55

Along these lines, also see:

Lots of folks talk about AI influencer content, but these AI influencers are also just damn good humans: @rowancheung @nonmayorpete @decisionleader @rachel_l_woods @emollick @conorgrennan @sineadbovell @DonAllenIII @ManuVision @karenxcheng

*non-exhaustive, too many to include

— Allie K. Miller (@alliekmiller) July 13, 2024

AI In Medicine: 3 Future Scenarios From Utopia To Dystopia — from medicalfuturist.com by Andrea Koncz
There’s a vast difference between baseless fantasizing and realistic forward planning. Structured methodologies help us learn how to “dream well”.

Key Takeaways

We’re often told that daydreaming and envisioning the future is a waste of time. But this notion is misguided.
We all instinctively plan for the future in small ways, like organizing a trip or preparing for a dinner party. This same principle can be applied to larger-scale issues, and smart planning does bring better results.
We show you a method that allows us to think “well” about the future on a larger scale so that it better meets our needs.

Adobe Unveils Powerful New Innovations in Illustrator and Photoshop Unlocking New Design Possibilities for Creative Pros — from news.adobe.com

Latest Illustrator and Photoshop releases accelerate creative workflows, save pros time and empower designers to realize their visions faster
New Firefly-enabled features like Generative Shape Fill in Illustrator along with the Dimension Tool, Mockup, Text to Pattern, the Contextual Taskbar and performance enhancement tools accelerate productivity and free up time so creative pros can dive deeper into the parts of their work they love
Photoshop introduces all-new Selection Brush Tool and the general availability of Generate Image, Adjustment Brush Tool and other workflow enhancements empowering creators to make complex edits and unique designs

Nike is using AI to turn athletes’ dreams into shoes — from axios.com by Ina Fried

Zoom in: Nike used genAI for ideation, including using a variety of prompts to produce images with different textures, materials and color to kick off the design process.

What they’re saying: “It’s a new way for us to work,” Nike lead footwear designer Juliana Sagat told Axios during a media tour of the showcase on Tuesday.

AI meets ‘Do no harm’: Healthcare grapples with tech promises — from finance.yahoo.com by Maya Benjamin

Major companies are moving at high speed to capture the promises of artificial intelligence in healthcare while doctors and experts attempt to integrate the technology safely into patient care.

“Healthcare is probably the most impactful utility of generative AI that there will be,” Kimberly Powell, vice president of healthcare at AI hardware giant Nvidia (NVDA), which has partnered with Roche’s Genentech (RHHBY) to enhance drug discovery in the pharmaceutical industry, among other investments in healthcare companies, declared at the company’s AI Summit in June.

Mistral reignites this week’s LLM rivalry with Large 2 (source) — from superhuman.ai

Today, we are announcing Mistral Large 2, the new generation of our flagship model. Compared to its predecessor, Mistral Large 2 is significantly more capable in code generation, mathematics, and reasoning. It also provides a much stronger multilingual support, and advanced function calling capabilities.

Meta releases the biggest and best open-source AI model yet — from theverge.com by Alex Heath
Llama 3.1 outperforms OpenAI and other rivals on certain benchmarks. Now, Mark Zuckerberg expects Meta’s AI assistant to surpass ChatGPT’s usage in the coming months.

Back in April, Meta teased that it was working on a first for the AI industry: an open-source model with performance that matched the best private models from companies like OpenAI.

Today, that model has arrived. Meta is releasing Llama 3.1, the largest-ever open-source AI model, which the company claims outperforms GPT-4o and Anthropic’s Claude 3.5 Sonnet on several benchmarks. It’s also making the Llama-based Meta AI assistant available in more countries and languages while adding a feature that can generate images based on someone’s specific likeness. CEO Mark Zuckerberg now predicts that Meta AI will be the most widely used assistant by the end of this year, surpassing ChatGPT.

4 ways to boost ChatGPT — from wondertools.substack.com by Jeremy Caplan & The PyCoach
Simple tactics for getting useful responses

To help you make the most of ChatGPT, I’ve invited & edited today’s guest post from the author of a smart AI newsletter called The Artificial Corner. I appreciate how Frank Andrade pushes ChatGPT to produce better results with four simple, clever tactics. He offers practical examples to help us all use AI more effectively.
…
Frank Andrade: Most of us fail to make the most of ChatGPT.

We omit examples in our prompts.
We fail to assign roles to ChatGPT to guide its behavior.
We let ChatGPT guess instead of providing it with clear guidance.

If you rely on vague prompts, learning how to create high-quality instructions will get you better results. It’s a skill often referred to as prompt engineering. Here are several techniques to get you to the next level.
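As a rough illustration of the three habits above (give examples, assign a role, provide clear guidance), here is a minimal sketch that assembles a structured prompt as plain text. The `PromptSpec` class, the `build_prompt` function, and the section labels are hypothetical names chosen for this sketch; they are not taken from the guest post or from any ChatGPT API, and the resulting string would simply be pasted or sent as an ordinary message.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the pieces of a structured prompt."""
    role: str                                   # who the model should act as
    task: str                                   # what you want done
    examples: list[tuple[str, str]] = field(default_factory=list)  # (input, output) pairs
    guidance: list[str] = field(default_factory=list)              # explicit constraints


def build_prompt(spec: PromptSpec) -> str:
    """Join role, task, guidance, and examples into one prompt string."""
    sections = [f"You are {spec.role}.", f"Task: {spec.task}"]
    if spec.guidance:
        # Clear guidance instead of letting the model guess.
        sections.append("Guidelines:\n" + "\n".join(f"- {g}" for g in spec.guidance))
    if spec.examples:
        # Worked examples show the expected shape of the answer.
        shots = "\n\n".join(f"Input: {i}\nOutput: {o}" for i, o in spec.examples)
        sections.append("Examples:\n" + shots)
    return "\n\n".join(sections)


if __name__ == "__main__":
    spec = PromptSpec(
        role="a patient writing tutor",
        task="Rewrite the sentence in plain English.",
        examples=[("Utilize the apparatus.", "Use the tool.")],
        guidance=["Keep it under 15 words."],
    )
    print(build_prompt(spec))
```

The point of the sketch is only that each missing ingredient named above becomes an explicit, labeled part of the prompt; the exact wording and ordering are assumptions you would tune for your own use.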

School 3.0: Reimagining Education in 2026, 2029, and 2034 [Borish]

On 07/19/2024, in 21st century, 24x7x365 access, A/V -- audio/visual, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, education, education technology, emerging technologies, engagement / engaging students, experimentation, face-to-face, flipping -- inverted learning, future, game-changing environment, heutagogy, human-computer interaction (HCI), ideas, innovation, instructional design, intelligent systems, intelligent tutoring, K-12 related, learning, learning agents, learning ecosystem, Learning Experience Design, Learning from the Living [Class] Room, online learning, online media, online tutoring, pace of change, reinvent, skills, staying relevant, teachers, teaching & learning, United States, vision/possibilities, by Daniel Christian

School 3.0: Reimagining Education in 2026, 2029, and 2034 — from davidborish.com by David Borish

The landscape of education is on the brink of a profound transformation, driven by rapid advancements in artificial intelligence. This shift was highlighted recently by Andrej Karpathy’s announcement of Eureka Labs, a venture aimed at creating an “AI-native” school. As we look ahead, it’s clear that the integration of AI in education will reshape how we learn, teach, and think about schooling altogether.
…
Traditional textbooks will begin to be replaced by interactive, AI-powered learning materials that adapt in real-time to a student’s progress.
…
As we approach 2029, the line between physical and virtual learning environments will blur significantly.

Curriculum design will become more flexible and personalized, with AI systems suggesting learning pathways based on each student’s interests, strengths, and career aspirations.
…
The boundaries between formal education and professional development will blur, creating a continuous learning ecosystem.

“Content can be turned into interactive learning games” [Cheung] + other items re: AI in our LE’s

On 07/11/2024, in 21st century, 24x7x365 access, adjunct faculty, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, assessment, Canvas - and other Instructure-related items, CMS/LMS, colleges, community colleges, content development, aggregation, repositories, digital learning, education technology, emerging technologies, experimentation, faculty, future of higher education, game-changing environment, higher education, human-computer interaction (HCI), innovation, instructional design, intelligent systems, intelligent tutoring, IT in HE, law schools, learning ecosystem, liberal arts, online media, personalized/customized learning, platforms, student-related, teachers, teaching & learning, tools, United States, universities, vendors, vision/possibilities, by Daniel Christian

Higher Education Has Not Been Forgotten by Generative AI — from insidehighered.com by Ray Schroeder
The generative AI (GenAI) revolution has not ignored higher education; a whole host of tools are available now and more revolutionary tools are on the way.

Some of the apps that have been developed for general use can be customized for specific topical areas in higher ed. For example, I created a version of GPT, “Ray’s EduAI Advisor,” that builds onto the current GPT-4o version with specific updates and perspectives on AI in higher education. It is freely available to users. With few tools and no knowledge of the programming involved, anyone can build their own GPT to supplement information for their classes or interest groups.

Excerpts from Ray’s EduAI Advisor bot:

AI’s global impact on higher education, particularly in at-scale classes and degree programs, is multifaceted, encompassing several key areas:

1. Personalized Learning…

2. Intelligent Tutoring Systems…

3. Automated Assessment…

4. Enhanced Accessibility…

5. Predictive Analytics…

6. Scalable Virtual Classrooms

7. Administrative Efficiency…

8. Continuous Improvement…

Instructure and Khan Academy Announce Partnership to Enhance Teaching and Learning With Khanmigo, the AI Tool for Education — from instructure.com
Shiren Vijiasingam and Jody Sailor make an exciting announcement about a new partnership sure to make a difference in education everywhere.

“AI will be a vital tool in overcoming the access to justice challenge.” [Susskind] + other legaltech-related items

On 07/09/2024, in 21st century, 24x7x365 access, Access to Justice (A2J), Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, career development, change, law schools, legal reform, legal technologies, legislatures / government / legal, tools, UK, United States, vendors, by Daniel Christian

Rethinking Legal Ops Skills: Generalists Versus Specialists — from abovethelaw.com by Silvie Tucker and Brandi Pack
This ongoing conversation highlights the changing demands on legal ops practitioners.

A thought-provoking discussion is unfolding in the legal operations community regarding one intriguing question: Should legal operations professionals strive to be generalists or specialists?

The conversation is timely as the marketplace consolidates and companies grapple with the best way to fill valuable and limited headcount allotments. It also highlights the evolving landscape of legal operations and the changing demands on its practitioners.
…
The Evolution of Legal Ops
Over the past decade, the field of legal operations has undergone significant transformation. Initially strictly focused on streamlining processes and reducing costs, the role has expanded to include various responsibilities driven by technological advancements and heightened industry expectations. Key areas of expansion include:

He added: “I have for long been of the view – for decades – that AI will be a vital tool in overcoming the access to justice challenge. Existing and emerging technologies are now very promising.”

Richard Susskind

In A First for Law Practice Management Platforms, Clio Rolls Out An Integrated E-Filing Service in Texas — from lawnext.com by Bob Ambrogi

Last October, during its annual Clio Cloud Conference, the law practice management company Clio announced its plan to roll out an e-filing service, called Clio File, during 2024, starting with Texas, which would make it the first law practice management platform with built-in e-filing. Today, it delivered on that promise, launching Clio File for e-filing in Texas courts.

“Lawyers can now seamlessly submit court documents directly from our flagship practice management product, Clio Manage, streamlining their workflows and simplifying the filing process,” said Chris Stock, vice president of legal content and migrations at Clio. “This is an exciting step in expanding the capabilities of our platform, providing a comprehensive solution for legal documents, from drafting to court filing.”

Just-Launched Quench Uses Gen AI to Bring Greater Speed and Accuracy to Medico-Legal Records Review — from lawnext.com by Bob Ambrogi

A cardiologist with a background in medical technology, computer science and artificial intelligence has launched a product for legal professionals and physician expert witnesses that targets the tedious task of reviewing and analyzing thousands of pages of medical records.

The product, Quench SmartChart, uses generative AI to streamline the medico-legal review process, enabling users to quickly extract, summarize and create chronologies from large, disorganized PDFs of medical records.

The product also includes a natural language chat feature, AskQuench, that lets users interact with and interrogate records to surface essential insights.

Emerging Trends in Legal Tech [Joyner & Correia]

On 07/08/2024, in 21st century, 24x7x365 access, law schools, Legal operations, legal reform, legal technologies, legislatures / government / legal, United States, vendors, videoconferencing, Virtual courts and law firms, by Daniel Christian

Emerging Trends in Legal Tech — from legaltalknetwork.com by Rob Joyner & Jared D. Correia

Remote Work Continues to Thrive
In recent years, many firms have adopted a “just make it happen” attitude toward virtual meetings, mobility, and remote work. This has enabled law firms to reevaluate the tools and training necessary for legal professionals to utilize technology effectively, improving upon the traditional in-office setup. When executed correctly, this approach can yield long-lasting benefits for the firm. Implementing a remote work policy can help firms access a global talent pool, reduce operational costs, and create a better work-life balance for their staff.

In a recent episode of Legal Toolkit, Rob Joyner, Senior Vice President of Business Development at Centerbase, and Jared D. Correia, Esq., CEO of Red Cave Law Firm Consulting, discuss the debate between remote and in-office work, as well as the latest advancements in AI and other essential legal technology.

Infinite seamless mega meme mashup

Keyframes were used to seamlessly transition between 20 memes w/ audio @LumaLabsAI Audio on pic.twitter.com/9jzbMDUDp2

— Blaine Brown (@blizaine) June 29, 2024

Bill Gates Reveals Superhuman AI Prediction — from youtube.com by Rufus Griscom, Bill Gates, Andy Sack, and Adam Brotman

In this episode of the Next Big Idea podcast, host Rufus Griscom and Bill Gates are joined by Andy Sack and Adam Brotman, co-authors of an exciting new book called “AI First.” Together, they consider AI’s impact on healthcare, education, productivity, and business. They dig into the technology’s risks. And they explore its potential to cure diseases, enhance creativity, and usher in a world of abundance.

Key moments:

00:05 Bill Gates discusses AI’s transformative potential in revolutionizing technology.
02:21 Superintelligence is inevitable and marks a significant advancement in AI technology.
09:23 Future AI may integrate deeply as cognitive assistants in personal and professional life.
14:04 AI’s metacognitive advancements could revolutionize problem-solving capabilities.
21:13 AI’s next frontier lies in developing human-like metacognition for sophisticated problem-solving.
27:59 AI advancements empower both good and malicious intents, posing new security challenges.
28:57 Rapid AI development raises questions about controlling its global application.
33:31 Productivity enhancements from AI can significantly improve efficiency across industries.
35:49 AI’s future applications in consumer and industrial sectors are subjects of ongoing experimentation.
46:10 AI democratization could level the economic playing field, enhancing service quality and reducing costs.
51:46 AI plays a role in mitigating misinformation and bridging societal divides through enhanced understanding.

OpenAI Introduces CriticGPT: A New Artificial Intelligence AI Model based on GPT-4 to Catch Errors in ChatGPT’s Code Output — from marktechpost.com

The team has summarized their primary contributions as follows.

The team has offered the first instance of a simple, scalable oversight technique that greatly assists humans in more thoroughly detecting problems in real-world RLHF data.

Within the ChatGPT and CriticGPT training pools, the team has discovered that critiques produced by CriticGPT catch more inserted bugs and are preferred above those written by human contractors.

Compared to human contractors working alone, this research indicates that teams consisting of critic models and human contractors generate more thorough criticisms. When compared to reviews generated exclusively by models, this partnership lowers the incidence of hallucinations.

This study introduces Force Sampling Beam Search (FSBS), an inference-time sampling and scoring technique. This strategy balances the trade-off between minimizing bogus concerns and discovering genuine faults in LLM-generated critiques.

Character.AI now allows users to talk with AI avatars over calls — from techcrunch.com by Ivan Mehta

a16z-backed Character.AI said today that it is now allowing users to talk to AI characters over calls. The feature currently supports multiple languages, including English, Spanish, Portuguese, Russian, Korean, Japanese and Chinese.

The startup tested the calling feature ahead of today’s public launch. During that time, it said that more than 3 million users had made over 20 million calls. The company also noted that calls with AI characters can be useful for practicing language skills, giving mock interviews, or adding them to the gameplay of role-playing games.

Google Translate Just Added 110 More Languages — from lifehacker.com
You can now use the app to communicate in languages you’ve never even heard of.

Google Translate can come in handy when you’re traveling or communicating with someone who speaks another language, and thanks to a new update, you can now connect with some 614 million more people. Google is adding 110 new languages to its Translate tool using its AI PaLM 2 large language model (LLM), which brings the total of supported languages to nearly 250. This follows the 24 languages added in 2022, including Indigenous languages of the Americas as well as those spoken across Africa and central Asia.

Gen-3 Alpha Text to Video is now available to everyone.

A new frontier for high-fidelity, fast and controllable video generation.

Try it now at https://t.co/ekldoIshdw pic.twitter.com/miNbHdK5hX

— Runway (@runwayml) July 1, 2024

Gen-3 Alpha from Runway is now available to everyone, and it’s incredible.

This is going to change the way ads and B-roll are created.

13 mind-blowing examples:

— Nathan Lands — Lore.com (@NathanLands) July 3, 2024

Listen to your favorite books and articles voiced by Judy Garland, James Dean, Burt Reynolds and Sir Laurence Olivier — from elevenlabs.io
ElevenLabs partners with estates of iconic stars to bring their voices to the Reader App

Several items re: text-to-video (and even images-to-video)

On 06/14/2024, in 21st century, 24x7x365 access, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, Asia, communications, creativity, digital audio, digital learning, digital video, emerging technologies, engagement / engaging students, media/film, multimedia, United States, vendors, by Daniel Christian

Dream Machine is an AI model that makes high quality, realistic videos fast from text and images.

It is a highly scalable and efficient transformer model trained directly on videos making it capable of generating physically accurate, consistent and eventful shots. Dream Machine is our first step towards building a universal imagination engine and it is available to everyone now!

Luma AI just dropped a Sora-like AI video generator called Dream Machine.

But unlike Sora or KLING, it’s completely open access to the public.

Here are 10 wild examples (and how to access it):

1. pic.twitter.com/Dx5Pnbp7lg

— Rowan Cheung (@rowancheung) June 12, 2024

Text-to-Video Emergence for July 2024 — from ai-supremacy.com by Michael Spencer
Who needs Sora?

There have been some incredible teasers in the text-to-video arena of Generative AI. Namely I’m watching:

Kling AI (by Kuaishou)
Luma AI
Vidu (ShengShu Technology and Tsinghua University)
Pika Labs
Zhipu AI & ByteDance (not yet released their products)
The timeline for the release of OpenAI’s Sora

“OpenAI seems to have the ability to create video in Sora, send it to ChatGPT for a script, use Voice Engine for voice over and put it all together.”
by u/MassiveWasabi in singularity

Daniel Christian: My slides for the Educational Technology Organization of Michigan’s Spring 2024 Retreat

On 06/13/2024, in 21st century, 24x7x365 access, Access to Justice (A2J), adjunct faculty, Adobe, adult learning, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, bots, business side of he, change, cloud-based computing / apps / other cloud-related, colleges, community colleges, content development, aggregation, repositories, corporate / business world, corporate universities / corporate training, cost of getting a degree, creativity, culture, dangers of the status quo, Daniel S. Christian, data related items, design, digital audio, digital learning, digital photography, digital storytelling, digital video, education technology, emerging technologies, experimentation, faculty, future of higher education, game-changing environment, heutagogy, higher education, homeschooling/homeschoolers, human-computer interaction (HCI), ideas, innovation, instructional design, intelligent systems, intelligent tutoring, K-12 related, learning, learning agents, learning ecosystem, Learning Experience Design, Learning from the Living [Class] Room, liberal arts, library / librarians, lifelong learning, media/film, microlearning, more voice more choice more control, multimedia, Natural Language Processing (NLP), new business models, NVIDIA, online media, online tutoring, Open AI, pace of change, personalized/customized learning, platforms, productivity / tips and tricks, professional development, reinvent, skills, smart/connected TV, society, staying relevant, streams of content, student-related, surviving, teachers, technologies for your home, technology (general), the downsides of how people use tech, tools, training / L&D, United States, universities, user experience (UX), user interface design, vendors, vision/possibilities, web-based collaboration, workplace, by Daniel Christian

From DSC:
Last Thursday, I presented at the Educational Technology Organization of Michigan’s Spring 2024 Retreat. I wanted to pass along my slides to you all, in case they are helpful to you.

Topics/agenda:

Topics & resources re: Artificial Intelligence (AI)
- Top multimodal players
- Resources for learning about AI
- Applications of AI
- My predictions re: AI
The powerful impact of pursuing a vision
A potential, future next-gen learning platform
Share some lessons from my past with pertinent questions for you all now
The significant impact of an organization’s culture
Bonus material: Some people to follow re: learning science and edtech

Slides of the presentation (.PPTX)
Slides of the presentation (.PDF)

Plus several more slides re: this vision.

Doing Stuff with AI: Opinionated Midyear Edition — from oneusefulthing.org by Ethan Mollick

Every six months or so, I write a guide to doing stuff with AI. A lot has changed since the last guide, while a few important things have stayed the same. It is time for an update.
…
To learn to do serious stuff with AI, choose a Large Language Model and just use it to do serious stuff – get advice, summarize meetings, generate ideas, write, produce reports, fill out forms, discuss strategy – whatever you do at work, ask the AI to help. A lot of people I talk to seem to get the most benefit from engaging the AI in conversation, often because it gives good advice, but also because just talking through an issue yourself can be very helpful. I know this may not seem particularly profound, but “always invite AI to the table” is the principle in my book that people tell me had the biggest impact on them. You won’t know what AI can (and can’t) do for you until you try to use it for everything you do. And don’t sweat prompting too much, though here are some useful tips, just start a conversation with AI and see where it goes.

You do need to use one of the most advanced frontier models, however.

DSC: There are likely to be more “menus” that run previously fine-tuned prompts

On 05/28/2024, in 21st century, 24x7x365 access, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, Daniel S. Christian, intelligent systems, intelligent tutoring, interaction design, interactivity, interface design, user experience (UX), user interface design, vendors, by Daniel Christian

This posting is out on LinkedIn.

AI’s New Conversation Skills Eyed for Education [Coffey]

On 05/17/2024, in 21st century, 24x7x365 access, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, bots, colleges, communications, community colleges, digital audio, digital learning, digital video, education technology, emerging technologies, experimentation, faculty, future of higher education, game-changing environment, higher education, human-computer interaction (HCI), innovation, instructional design, intelligent systems, intelligent tutoring, IT in HE, law schools, learning, learning agents, learning ecosystem, Learning from the Living [Class] Room, lifelong learning, Open AI, personalized/customized learning, platforms, student-related, teaching & learning, technologies for your home, tools, United States, universities, user experience (UX), vendors, vision/possibilities, web-based collaboration, by Daniel Christian

AI’s New Conversation Skills Eyed for Education — from insidehighered.com by Lauren Coffey
The latest ChatGPT’s more human-like verbal communication has professors pondering personalized learning, on-demand tutoring and more classroom applications.

ChatGPT’s newest version, GPT-4o (the “o” standing for “omni,” meaning “all”), has a more realistic voice and quicker verbal response time, both aiming to sound more human. The version, which should be available to free ChatGPT users in coming weeks—a change also hailed by educators—allows people to interrupt it while it speaks, simulates more emotions with its voice and translates languages in real time. It also can understand instructions in text and images and has improved video capabilities.
…
Ajjan said she immediately thought the new vocal and video capabilities could allow GPT to serve as a personalized tutor. Personalized learning has been a focus for educators grappling with the looming enrollment cliff and for those pushing for student success.

There’s also the potential for role playing, according to Ajjan. She pointed to mock interviews students could do to prepare for job interviews, or, for example, using GPT to play the role of a buyer to help prepare students in an economics course.

io.google/2024


How generative AI expands curiosity and understanding with LearnLM — from blog.google
LearnLM is our new family of models fine-tuned for learning, and grounded in educational research to make teaching and learning experiences more active, personal and engaging.

Generative AI is fundamentally changing how we’re approaching learning and education, enabling powerful new ways to support educators and learners. It’s taking curiosity and understanding to the next level — and we’re just at the beginning of how it can help us reimagine learning.

Today we’re introducing LearnLM: our new family of models fine-tuned for learning, based on Gemini.

On YouTube, a conversational AI tool makes it possible to figuratively “raise your hand” while watching academic videos to ask clarifying questions, get helpful explanations or take a quiz on what you’ve been learning. This even works with longer educational videos like lectures or seminars thanks to the Gemini model’s long-context capabilities. These features are already rolling out to select Android users in the U.S.
…
Learn About is a new Labs experience that explores how information can turn into understanding by bringing together high-quality content, learning science and chat experiences. Ask a question and it helps guide you through any topic at your own pace — through pictures, videos, webpages and activities — and you can upload files or notes and ask clarifying questions along the way.

Google I/O 2024: An I/O for a new generation — from blog.google

The Gemini era
A year ago on the I/O stage we first shared our plans for Gemini: a frontier model built to be natively multimodal from the beginning, that could reason across text, images, video, code, and more. It marks a big step in turning any input into any output — an “I/O” for a new generation.

Google just announced huge Gemini updates, a Sora competitor, AI agents, and more.

The 12 most impressive announcements at Google I/O:

1. Project Astra: An AI agent that can see AND hear what you do live in real-time. pic.twitter.com/sA2YT80O5G

— Rowan Cheung (@rowancheung) May 15, 2024

Daily Digest: Google I/O 2024 – AI search is here. — from bensbites.beehiiv.com
PLUS: It’s got Agents, Video and more. And, Ilya leaves OpenAI

Google is integrating AI into all of its ecosystem: Search, Workspace, Android, etc. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model is also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.

Google just casually announced Veo, a new rival to OpenAI’s Sora.

It can generate insanely good 1080p video up to 60 seconds.

9 wild examples:

1) pic.twitter.com/rYySaeMRDa

— Proper (@ProperPrompter) May 14, 2024

New ways to engage with Gemini for Workspace — from workspace.google.com

Today at Google I/O we’re announcing new, powerful ways to get more done in your personal and professional life with Gemini for Google Workspace. Gemini in the side panel of your favorite Workspace apps is rolling out more broadly and will use the 1.5 Pro model for answering a wider array of questions and providing more insightful responses. We’re also bringing more Gemini capabilities to your Gmail app on mobile, helping you accomplish more on the go. Lastly, we’re showcasing how Gemini will become the connective tissue across multiple applications with AI-powered workflows. And all of this comes fresh on the heels of the innovations and enhancements we announced last month at Google Cloud Next.

Google’s Gemini updates: How Project Astra is powering some of I/O’s big reveals — from techcrunch.com by Kyle Wiggers

Google is improving its AI-powered chatbot Gemini so that it can better understand the world around it — and the people conversing with it.

At the Google I/O 2024 developer conference on Tuesday, the company previewed a new experience in Gemini called Gemini Live, which lets users have “in-depth” voice chats with Gemini on their smartphones. Users can interrupt Gemini while the chatbot’s speaking to ask clarifying questions, and it’ll adapt to their speech patterns in real time. And Gemini can see and respond to users’ surroundings, either via photos or video captured by their smartphones’ cameras.

Generative AI in Search: Let Google do the searching for you — from blog.google
With expanded AI Overviews, more planning and research capabilities, and AI-organized search results, our custom Gemini model can take the legwork out of searching.

A major step towards much more natural human-computer interaction: OpenAI introduces GPT-4o

On 05/14/2024, in 24x7x365 access, A/V -- audio/visual, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, assistive technologies, bots, digital audio, digital learning, digital video, education, education technology, emerging technologies, Emotion, higher education, homeschooling/homeschoolers, human-computer interaction (HCI), ideas, informal learning, innovation, instructional design, intelligent systems, intelligent tutoring, interaction design, interactivity, IT in HE, K-12 related, languages and translation, law schools, learner profiles, learning, learning agents, learning ecosystem, Learning from the Living [Class] Room, learning preferences, liberal arts, lifelong learning, mathematics, multimedia, Natural Language Processing (NLP), online tutoring, Open AI, personalized/customized learning, platforms, productivity / tips and tricks, smart classrooms, smart/connected TV, student-related, technologies for your home, television, tools, United States, universities, usability, user experience (UX), user interface design, vendors, vision/possibilities, voice recognition / voice enabled interfaces, by Daniel Christian

Hello GPT-4o — from openai.com
We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.

Example topics covered here:

Two GPT-4os interacting and singing
Languages/translation
Personalized math tutor
Meeting AI
Harmonizing and creating music
Providing inflection, emotions, and a human-like voice
Understanding what the camera is looking at and integrating it into the AI’s responses
Providing customer service

With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.

This demo is insane.

A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.

Imagine giving this to every student in the world.

The future is so, so bright. pic.twitter.com/t14M4fDjwV

— Mckay Wrigley (@mckaywrigley) May 13, 2024

From DSC:
I like the assistive tech angle here:

GPT-4o as tested by @BeMyEyes: pic.twitter.com/WeAoVmxUFH

— Greg Brockman (@gdb) May 14, 2024

It’s been less than 24 hours since the OpenAI changed the world with GPT-4o announcement.

And the Internet is a flooded with demo videos.

Here’re the 10 most jaw-dropping examples so far (Don’t miss the 6th one) pic.twitter.com/sLx1D1YSqb

— Poonam Soni (@CodeByPoonam) May 14, 2024

Voice Banks (preserving our voices for AI) [Kubicki]

On 05/12/2024, in 21st century, 24x7x365 access, A/V -- audio/visual, Artificial Intelligence / Machine Learning / Deep Learning / Algorithms, digital audio, digital video, emerging technologies, law schools, legislatures / government / legal, society, by Daniel Christian

Voice Banks (preserving our voices for AI) — from thebrainyacts.beehiiv.com by Josh Kubicki

The Ethical and Emotional Implications of AI Voice Preservation

Legal Considerations and Voice Rights
From a legal perspective, the burgeoning use of AI in voice cloning also introduces a complex web of rights and permissions. The recent passage of Tennessee’s ELVIS Act, which allows legal action against unauthorized recreations of an artist’s voice, underscores the necessity for robust legal frameworks to manage these technologies. For non-celebrities, the idea of a personal voice bank brings about its own set of legal challenges. How do we regulate the use of an individual’s voice after their death? Who holds the rights to control and consent to the usage of these digital artifacts?

To safeguard against misuse, any system of voice banking would need stringent controls over who can access and utilize these voices. The creation of such banks would necessitate clear guidelines and perhaps even contractual agreements stipulating the terms under which these voices may be used posthumously.

Should we all consider creating voice banks to preserve our voices, allowing future generations the chance to interact with us even after we are gone?
