Hello GPT-4o — from openai.com
We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.
GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.
Providing inflection, emotions, and a human-like voice
Understanding what the camera is looking at and integrating it into the AI’s responses
Providing customer service
With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.
This demo is insane.
A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.
Imagine giving this to every student in the world.
I recently created an AI version of myself—REID AI—and recorded a Q&A to see how this digital twin might challenge me in new ways. The video avatar is generated by Hour One, its voice was created by Eleven Labs, and its persona—the way that REID AI formulates responses—is generated from a custom chatbot built on GPT-4 that was trained on my books, speeches, podcasts and other content that I’ve produced over the last few decades. I decided to interview it to test its capability and how closely its responses match—and test—my thinking. Then, REID AI asked me some questions on AI and technology. I thought I would hate this, but I’ve actually ended up finding the whole experience interesting and thought-provoking.
From DSC: This ability to ask questions of a digital twin is very interesting when you think about it in terms of “interviewing” a historical figure. I believe character.ai provides this kind of thing, but I haven’t used it much.
AI Resources and Teaching | Kent State University offers valuable resources for educators interested in incorporating artificial intelligence (AI) into their teaching practices. The university recognizes that the rapid emergence of AI tools presents both challenges and opportunities in higher education.
The AI Resources and Teaching page provides educators with information and guidance on various AI tools and their responsible use within and beyond the classroom. The page covers different areas of AI application, including language generation, visuals, videos, music, information extraction, quantitative analysis, and AI syllabus language examples.
For all its jaw-dropping power, Watson the computer overlord was a weak teacher. It couldn’t engage or motivate kids, inspire them to reach new heights or even keep them focused on the material — all qualities of the best mentors.
It’s a finding with some resonance to our current moment of AI-inspired doomscrolling about the future of humanity in a world of ascendant machines. “There are some things AI is actually very good for,” Nitta said, “but it’s not great as a replacement for humans.”
His five-year journey to essentially a dead-end could also prove instructive as ChatGPT and other programs like it fuel a renewed, multimillion-dollar experiment to, in essence, prove him wrong.
…
To be sure, AI can do sophisticated things such as generating quizzes from a class reading and editing student writing. But the idea that a machine or a chatbot can actually teach as a human can, he said, represents “a profound misunderstanding of what AI is actually capable of.”
Nitta, who still holds deep respect for the Watson lab, admits, “We missed something important. At the heart of education, at the heart of any learning, is engagement. And that’s kind of the Holy Grail.”
From DSC: This is why the vision that I’ve been tracking and working on has always said that HUMAN BEINGS will be necessary — they are key to realizing this vision. Along these lines, here’s a relevant quote:
Another crucial component of a new learning theory for the age of AI would be the cultivation of “blended intelligence.” This concept recognizes that the future of learning and work will involve the seamless integration of human and machine capabilities, and that learners must develop the skills and strategies needed to effectively collaborate with AI systems. Rather than viewing AI as a threat to human intelligence, a blended intelligence approach seeks to harness the complementary strengths of humans and machines, creating a symbiotic relationship that enhances the potential of both.
Per Alexander “Sasha” Sidorkin, Head of the National Institute on AI in Society at California State University Sacramento.
NVIDIA Digital Human Technologies Bring AI Characters to Life
Leading AI Developers Use Suite of NVIDIA Technologies to Create Lifelike Avatars and Dynamic Characters for Everything From Games to Healthcare, Financial Services and Retail Applications
Today is the beginning of our moonshot to solve embodied AGI in the physical world. I’m so excited to announce Project GR00T, our new initiative to create a general-purpose foundation model for humanoid robot learning.
As the FlexOS research study “Generative AI at Work” concluded based on a survey amongst knowledge workers, ChatGPT reigns supreme. … 2. AI Tool Usage is Way Higher Than People Expect – Beating Netflix, Pinterest, Twitch. As measured by data analysis platform Similarweb based on global web traffic tracking, the AI tools in this list generate over 3 billion monthly visits.
With 1.67 billion visits, ChatGPT represents over half of this traffic and is already bigger than Netflix, Microsoft, Pinterest, Twitch, and The New York Times.
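The “over half” claim is easy to sanity-check from the figures in the excerpt above; a minimal sketch, using only the two numbers quoted:

```python
# Sanity check on the traffic-share claim from the Similarweb data:
# ChatGPT's 1.67 billion visits against the list's ~3 billion total.
total_visits = 3.0e9      # "over 3 billion monthly visits" for all tools
chatgpt_visits = 1.67e9   # ChatGPT's reported visits

share_pct = chatgpt_visits / total_visits * 100
print(f"ChatGPT's share of tracked AI traffic: {share_pct:.1f}%")
```

Since the total is “over” 3 billion, the true share is at most about 55.7 percent, so the article’s “over half” holds even with some slack in the denominator.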
Something unusual is happening in America. Demand for electricity, which has stayed largely flat for two decades, has begun to surge.
Over the past year, electric utilities have nearly doubled their forecasts of how much additional power they’ll need by 2028 as they confront an unexpected explosion in the number of data centers, an abrupt resurgence in manufacturing driven by new federal laws, and millions of electric vehicles being plugged in.
The tumult could seem like a distraction from the startup’s seemingly unending march toward AI advancement. But the tension, and the latest debate with Musk, illuminates a central question for OpenAI, along with the tech world at large as it’s increasingly consumed by artificial intelligence: Just how open should an AI company be?
…
The meaning of the word “open” in “OpenAI” seems to be a particular sticking point for both sides — something that you might think sounds, on the surface, pretty clear. But actual definitions are both complex and controversial.
In partnership with the National Cancer Institute, or NCI, researchers from the Department of Energy’s Oak Ridge National Laboratory and Louisiana State University developed a long-sequenced AI transformer capable of processing millions of pathology reports to provide experts researching cancer diagnoses and management with exponentially more accurate information on cancer reporting.
Dave told me that he couldn’t have made Borrowing Time without AI—it’s an expensive project that traditional Hollywood studios would never bankroll. But after Dave’s short went viral, major production houses approached him to make it a full-length movie. I think this is an excellent example of how AI is changing the art of filmmaking, and I came out of this interview convinced that we are on the brink of a new creative age.
We dive deep into the world of AI tools for image and video generation, discussing how aspiring filmmakers can use them to validate their ideas, and potentially even secure funding if they get traction. Dave walks me through how he has integrated AI into his movie-making process, and as we talk, we make a short film featuring Nicolas Cage using a haunted roulette ball to resurrect his dead movie career, live on the show.
In the realm of technological advancements, artificial intelligence (AI) stands out as a beacon of immeasurable potential, yet also as a source of existential angst when considering that AI might already be beyond our ability to control.
His research underscores a chilling truth: our current understanding and control of AI are woefully inadequate, posing a threat that could either lead to unprecedented prosperity or catastrophic extinction.
From DSC: This next item is for actors, actresses, and voiceover specialists:
Turn your voice into passive income. — from elevenlabs.io; via Ben’s Bites Are you a professional voice actor? Sign up and share your voice today to start earning rewards every time it’s used.
This paper presents a groundbreaking comparison between Large Language Models (LLMs) and traditional legal contract reviewers—Junior Lawyers and Legal Process Outsourcers (LPOs). We dissect whether LLMs can outperform humans in accuracy, speed, and cost-efficiency during contract review. Our empirical analysis benchmarks LLMs against a ground truth set by Senior Lawyers, uncovering that advanced models match or exceed human accuracy in determining legal issues. In speed, LLMs complete reviews in mere seconds, eclipsing the hours required by their human counterparts. Cost-wise, LLMs operate at a fraction of the price, offering a staggering 99.97 percent reduction in cost over traditional methods. These results are not just statistics—they signal a seismic shift in legal practice. LLMs stand poised to disrupt the legal industry, enhancing accessibility and efficiency of legal services. Our research asserts that the era of LLM dominance in legal contract review is upon us, challenging the status quo and calling for a reimagined future of legal workflows.
Technology can help you solve problems in your law firm, connect with clients, save time and money, and so much more. But, how do you know what tech will work best for your practice? Molly Ranns and JoAnn Hathaway talk with Colin Levy about understanding and utilizing technology, common mistakes in choosing new tools, and ways to overcome tech-related fear and anxiety.
Every time we attend Legalweek, we have a unique opportunity to tap into the collective knowledge of hundreds of legal professionals. This year at Legalweek 2024 we talked with peers in a wide variety of roles, from litigation support professionals and lawyers to partners and heads of innovation. Throughout the sessions and discussions, we started to notice a few common themes, interesting trends, and helpful insights.
Generative AI (GAI) has the potential to disrupt fields such as architecture, design, and engineering by enabling users to quickly generate digital content in response to prompts.
GAI, represented by large language models like GPT-4, has shown remarkable capabilities in natural language processing, machine translation, and content generation.
GAI’s ability to produce thoughtful content and analysis at almost zero marginal cost is causing significant impact in global politics, industry, and culture.
The architecture, engineering, and construction (AEC) industry is already experiencing the effects of GAI, with concerns about job displacement and the use of AI-generated avatars.
GAI is compute-bound, leading to a high demand for computing power, particularly GPUs. However, emerging trends suggest that future developments will establish a
Today, we’re a step closer to this vision as we introduce Gemini, the most capable and general model we’ve ever built.
Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.
So, in many ways, ChatGPT and its friends are far from as intelligent as a human; they do not have “general” intelligence (AGI).
But this will not last for long. The debate about Project Q* aside, AIs able to engage in high-level reasoning, plan ahead, and maintain long-term memory are expected in the next 2–3 years. We are already seeing AI agents that are developing the ability to act autonomously and collaborate to a degree. Once AIs can reason and plan, acting autonomously and collaborating will not be a challenge.
ChatGPT is winning the future — but what future is that? — from theverge.com by David Pierce
OpenAI didn’t mean to kickstart a generational shift in the technology industry. But it did. Now all we have to decide is where to go from here.
We don’t know yet if AI will ultimately change the world the way the internet, social media, and the smartphone did. Those things weren’t just technological leaps — they actually reorganized our lives in fundamental and irreversible ways. If the final form of AI is “my computer writes some of my emails for me,” AI won’t make that list. But there are a lot of smart people and trillions of dollars betting that’s the beginning of the AI story, not the end. If they’re right, the day OpenAI launched its “research preview” of ChatGPT will be much more than a product launch for the ages. It’ll be the day the world changed, and we didn’t even see it coming.
“AI is overhyped” — from theneurondaily.com by Pete Huang & Noah Edelman
If you’re feeling like AI is the future, but you’re not sure where to start, here’s our advice for 2024 based on our convos with business leaders:
Start with problems – Map out where your business is spending time and money, then ask if AI can help. Don’t do AI to say you’re doing AI.
Model the behavior – Teams do better in making use of new tools when their leadership buys in. Show them your support.
Do what you can, wait for the rest – With AI evolving so fast, “do nothing for now” is totally valid. Start with what you can do today (accelerating individual employee output) and keep up-to-date on the rest.
Google has unveiled a new artificial intelligence model that it claims outperforms ChatGPT in most tests and displays “advanced reasoning” across multiple formats, including an ability to view and mark a student’s physics homework.
The model, called Gemini, is the first to be announced since last month’s global AI safety summit, at which tech firms agreed to collaborate with governments on testing advanced systems before and after their release. Google said it was in discussions with the UK’s newly formed AI Safety Institute over testing Gemini’s most powerful version, which will be released next year.
So asks a recent study by two academics from Stanford Law School, David Freeman Engstrom and Nora Freeman Engstrom, on the potential impact of AI on the civil litigation landscape in the US.
It is against this landscape, the study observes, that champions of legal tech have suggested that there is an opportunity for legal tech to “democratise” litigation and put litigation’s “haves” and “have nots” on a more equal footing, by arming smaller firms and sole practitioners with the tools necessary to do battle against their better resourced opponents, and cutting the cost of legal services, putting lawyers within reach of a wider swathe of people.
But is this a real opportunity, and will AI be key to its realisation?
…
However, while AI may reduce the justice gap between the “haves” and “have-nots” of litigation, it could also exacerbate existing inequalities.
From DSC: While this article approaches things from the lawyer’s viewpoint, I’d like to see this question and the use of AI from the common man’s/woman’s viewpoint. Why? In order to provide FAR GREATER access to justice (#A2J) for those who can’t afford a lawyer as they head into the civil law courtrooms.
Should I take my case to court? Do I have a chance to win this case? If so, how?
What forms do I need to complete if I’m going to go to court?
When and how do I address the judge?
What does my landlord have to do?
How do I prevent myself from falling into a debt-collection mess and/or what options do I have to get out of this mess?
Are there any lawyers in my area who would take my case on a pro bono basis?
…and judges and lawyers — as well as former litigants — could add many more questions (and answers) to this list
Bottom line: It is my hope that technology can help increase access to justice.
A number of products are already under development, or have been launched. One example is a project that Norton Rose Fulbright is working on, together with not-for-profit legal service Justice Connect. The scope is to develop an automated natural language processing AI model that seeks to interpret the ‘everyday’ language used by clients in order to identify the client’s legal issues and correctly diagnose their legal problem. This tool is aimed at addressing the struggles that individuals often face in deciphering legal jargon and understanding the nature of their legal issue and the type of lawyer, or legal support, they need to resolve that problem.
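The triage idea described above can be illustrated with a toy sketch. The real Justice Connect model is a trained NLP classifier; this keyword lookup only shows the shape of the problem, and every category and keyword here is a hypothetical example, not taken from that project:

```python
# Toy sketch of plain-language legal-issue triage. The actual Norton Rose
# Fulbright / Justice Connect tool is a trained NLP model; this keyword
# lookup only illustrates the idea. All categories and keywords below are
# hypothetical examples.
ISSUE_KEYWORDS = {
    "tenancy": ["landlord", "rent", "evict", "lease", "bond"],
    "debt": ["debt", "collector", "owe", "repayment", "loan"],
    "employment": ["fired", "boss", "wages", "workplace", "dismissed"],
}

def triage(description: str) -> str:
    """Guess a legal category from an everyday-language description."""
    text = description.lower()
    scores = {
        issue: sum(word in text for word in words)
        for issue, words in ISSUE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

print(triage("My landlord says he will evict me over late rent"))
print(triage("A debt collector keeps calling about a loan I owe"))
```

Even this crude version shows why the approach matters for access to justice: the user never has to know the words “tenancy” or “unlawful eviction” to be routed toward the right kind of help.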
Artificial intelligence is disrupting higher education — from itweb.co.za by Rennie Naidoo; via GSV
Traditional contact universities need to adapt faster and find creative ways of exploring and exploiting AI, or lose their dominant position.
Higher education professionals have a responsibility to shape AI as a force for good.
Introducing Canva’s biggest education launch — from canva.com
We’re thrilled to unveil our biggest education product launch ever. Today, we’re introducing a whole new suite of products that turn Canva into the all-in-one classroom tool educators have been waiting for.
Also see Canva for Education. Create and personalize lesson plans, infographics, posters, video, and more. 100% free for teachers and students at eligible schools.
ChatGPT and generative AI: 25 applications to support student engagement — from timeshighereducation.com by Seb Dianati and Suman Laudari
In the fourth part of their series looking at 100 ways to use ChatGPT in higher education, Seb Dianati and Suman Laudari share 25 prompts for the AI tool to boost student engagement.
There are two ways to use ChatGPT — from theneurondaily.com
Type to it.
Talk to it (new).
… Since then, we’ve looked to it for a variety of real-world business advice. For example, Prof Ethan Mollick posted a great guide to using ChatGPT-4 with voice as a negotiation instructor.
In a similar fashion, you can consult ChatGPT with voice for feedback on:
Job interviews.
Team meetings.
Business presentations.
With a prompt, GPT-4 with voice does a pretty good job of acting as a negotiation simulator/instructor. It is not all the way there, but as someone who builds educational simulations, I can tell you this is already impressively far along towards an effective teaching tool.
Via The Rundown:Google is using AI to analyze the company’s Maps data and suggest adjustments to traffic light timing — aiming to cut driver waits, stops, and emissions.
The camera never lies. Except, of course, it does – and seemingly more often with each passing day.
In the age of the smartphone, digital edits on the fly to improve photos have become commonplace, from boosting colours to tweaking light levels.
Now, a new breed of smartphone tools powered by artificial intelligence (AI) are adding to the debate about what it means to photograph reality.
Google’s latest smartphones released last week, the Pixel 8 and Pixel 8 Pro, go a step further than devices from other companies. They are using AI to help alter people’s expressions in photographs.
Still using AI to help you mark a student’s work?
Mark a full class’s worth with one prompt.
Here’s the Whole Class Feedback Giant Prompt.
Comment & retweet & I’ll DM it to you.
Discover why thousands of teachers subscribe to the Sunday AI Educator: https://t.co/ivpXYyWNzN
— Dan Fitzpatrick – The AI Educator (@theaieducatorX) October 22, 2023
Dr. Chris Dede, of Harvard University and Co-PI of the National AI Institute for Adult Learning and Online Education, spoke about the differences between knowledge and wisdom in AI-human interactions in a keynote address at the 2022 Empowering Learners for the Age of AI conference. He drew a parallel between Star Trek: The Next Generation characters Data and Picard during complex problem-solving: while Data offers the knowledge and information, Captain Picard offers the wisdom and context that come with a leadership mantle, and determines its relevance, timing, and application.
This “decreasing obstacles” framing turned out to be helpful in thinking about generative AI. When the time came, my answer to the panel question, “how would you summarize the impact generative AI is going to have on education?” was this:
“Generative AI greatly reduces the degree to which access to expertise is an obstacle to education.”
We haven’t even started to unpack the implications of this notion yet, but hopefully just naming it will give the conversation focus, give people something to disagree with, and help the conversation progress more quickly.
How to Make an AI-Generated Film — from heatherbcooper.substack.com by Heather Cooper
Plus, Midjourney finally has a new upscale tool!
From DSC: I’m not excited about this, as I can’t help but wonder…how long before the militaries of the world introduce this into their warfare schemes and strategies?
To use Sherpa, an instructor first uploads the reading they’ve assigned, or they can have the student upload a paper they’ve written. Then the tool asks a series of questions about the text (either questions input by the instructor or generated by the AI) to test the student’s grasp of key concepts. The software gives the instructor the choice of whether they want the tool to record audio and video of the conversation, or just audio.
The tool then uses AI to transcribe the audio from each student’s recording and flags areas where the student answer seemed off point. Teachers can review the recording or transcript of the conversation and look at what Sherpa flagged as trouble to evaluate the student’s response.
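The flagging step Sherpa performs can be sketched in a few lines. This is a minimal illustration of the idea (compare a transcribed answer against a question’s key concepts and flag low overlap); the overlap heuristic, function name, and threshold are all assumptions for the sketch, not Sherpa’s actual method:

```python
# Minimal sketch of Sherpa-style answer flagging: compare a transcribed
# student answer against the key concepts for a question and flag answers
# with low concept coverage. The heuristic and 0.5 threshold are
# assumptions for illustration, not Sherpa's actual implementation.
def flag_off_point(answer: str, key_concepts: set[str], threshold: float = 0.5) -> bool:
    """Flag the answer if it covers fewer than `threshold` of the key concepts."""
    words = set(answer.lower().split())
    hits = sum(concept in words for concept in key_concepts)
    return hits / len(key_concepts) < threshold

concepts = {"photosynthesis", "chlorophyll", "sunlight"}
on_topic = "plants use sunlight and chlorophyll for photosynthesis"
off_topic = "i think the reading was mostly about rivers"

print(flag_off_point(on_topic, concepts))   # covers all three concepts
print(flag_off_point(off_topic, concepts))  # flagged for instructor review
```

The point of the design, in Sherpa as in this sketch, is that the tool does not grade: it surfaces the likely trouble spots so the teacher’s review time goes where it matters.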
“The [Ai Pin is a] connected and intelligent clothing-based wearable device uses a range of sensors that enable contextual and ambient compute interactions,” the company noted at the time. “The Ai Pin is a type of standalone device with a software platform that harnesses the power of Ai to enable innovative personal computing experiences.”
Also relevant/see:
Introducing Rewind Pendant – a wearable that captures what you say and hear in the real world!
Rewind powered by truly everything you’ve seen, said, or heard
Summarize and ask any question using AI
Private by design
ChatGPT can now see, hear, and speak — from openai.com
We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.
Voice and image give you more ways to use ChatGPT in your life. Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.
We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms.
Use your voice to engage in a back-and-forth conversation with ChatGPT. Speak with it on the go, request a bedtime story, or settle a dinner table debate.
ChatGPT can now browse the internet to provide you with current and authoritative information, complete with direct links to sources. It is no longer limited to data before September 2021.
Because it’s the first-ever on Earth, it’s hard to label what kind of entertainment Hypercinema is. While it’s marketed as a “live AI experience” that blends “theatre, film and digital technology”, Dr. Gregory made it clear that it’s not here to make movies and TV extinct.
Your face and personality are how HyperCinema sets itself apart from the art forms of old. You get 15 photos of your face taken from different angles, then answer a questionnaire – mine started by asking what my fave vegetable was and ended by demanding to know what I thought the biggest threat to humanity was. Deep stuff, but the questions are always changing, cos that’s how AI rolls.
All of this information is stored on your cube – a green, glowing accessory that you carry around for the whole experience and insert into different sockets to transfer your info onto whatever screen is in front of you. Upon inserting your cube, the “live AI experience” starts.
The AI has taken your photos and superimposed your face on a variety of made-up characters in different situations.
We are entering a new era of AI, one that is fundamentally changing how we relate to and benefit from technology. With the convergence of chat interfaces and large language models you can now ask for what you want in natural language and the technology is smart enough to answer, create it or take action. At Microsoft, we think about this as having a copilot to help navigate any task. We have been building AI-powered copilots into our most used and loved products – making coding more efficient with GitHub, transforming productivity at work with Microsoft 365, redefining search with Bing and Edge and delivering contextual value that works across your apps and PC with Windows.
Today we take the next step to unify these capabilities into a single experience we call Microsoft Copilot, your everyday AI companion. Copilot will uniquely incorporate the context and intelligence of the web, your work data and what you are doing in the moment on your PC to provide better assistance – with your privacy and security at the forefront.
DALL·E 3 understands significantly more nuance and detail than our previous systems, allowing you to easily translate your ideas into exceptionally accurate images. DALL·E 3 is now in research preview, and will be available to ChatGPT Plus and Enterprise customers in October, via the API and in Labs later this fall.