From DSC:
Whenever we’ve had a flat tire over the years, a tricky part of the repair process is jacking up the car so that no harm is done to the car (or to me!). There are some grooves underneath the Toyota Camry where one is supposed to put the jack. But as the car is very low to the ground, these grooves are very hard to find (even in good weather and light). 

 

What’s needed is a robotic jack with vision.

If the jack had “vision” and wheels, it could pinpoint the exact location of the grooves, move there, and then ask the owner whether they are ready for the car to be lifted. The owner could execute that order when ready, and the robotic jack could safely hoist the car up.
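Purely for illustration, the workflow described above (find the groove, reposition, confirm with the owner, lift) could be sketched as a simple control loop. Every function name here is hypothetical; the vision, motor, and actuator routines are stand-ins, not a real device API:

```python
def run_jack(detect_groove, move_to, confirm_with_owner, lift):
    """Drive the flat-tire workflow sketched above.

    All four callables are hypothetical stand-ins: a vision routine that
    returns groove coordinates (or None), a motor command, an owner
    confirmation prompt, and the lift actuator.
    """
    target = detect_groove()          # "vision": locate the jack groove
    if target is None:
        return "groove not found"
    move_to(target)                   # wheels: drive under the groove
    if not confirm_with_owner():      # always ask before lifting
        return "cancelled"
    lift()                            # safely hoist the car
    return "lifted"
```

The key design point is the confirmation step between positioning and lifting, which keeps the human in the loop for the one action that can cause harm.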

This type of robotic device is already out there in other areas. But this idea for assistance with replacing a flat tire represents an AI and robotic-based, consumer-oriented application that we’ll likely be seeing much more of in the future. Carmakers and suppliers, please add this one to your list!

Daniel

 

Duolingo Introduces AI-Powered Innovations at Duocon 2024 — from investors.duolingo.com; via Claire Zau

Duolingo’s new Video Call feature represents a leap forward in language practice for learners. This AI-powered tool allows Duolingo Max subscribers to engage in spontaneous, realistic conversations with Lily, one of Duolingo’s most popular characters. The technology behind Video Call is designed to simulate natural dialogue and provides a personalized, interactive practice environment. Even beginner learners can converse in a low-pressure environment because Video Call is designed to adapt to their skill level. By offering learners the opportunity to converse in real-time, Video Call builds the confidence needed to communicate effectively in real-world situations. Video Call is available for Duolingo Max subscribers learning English, Spanish, and French.


And here’s another AI-based learning item:

AI reading coach startup Ello now lets kids create their own stories — from techcrunch.com by Lauren Forristal; via Claire Zau

Ello, the AI reading companion that aims to support kids struggling to read, launched a new product on Monday that allows kids to participate in the story-creation process.

Called “Storytime,” the new AI-powered feature helps kids generate personalized stories by picking from a selection of settings, characters, and plots. For instance, a story about a hamster named Greg who performed in a talent show in outer space.

 

Gemini makes your mobile device a powerful AI assistant — from blog.google
Gemini Live is available today to Advanced subscribers, along with conversational overlay on Android and even more connected apps.

Rolling out today: Gemini Live ← Google swoops in before OpenAI can get their Voice Mode out there
Gemini Live is a mobile conversational experience that lets you have free-flowing conversations with Gemini. Want to brainstorm potential jobs that are well-suited to your skillset or degree? Go Live with Gemini and ask about them. You can even interrupt mid-response to dive deeper on a particular point, or pause a conversation and come back to it later. It’s like having a sidekick in your pocket who you can chat with about new ideas or practice with for an important conversation.

Gemini Live is also available hands-free: You can keep talking with the Gemini app in the background or when your phone is locked, so you can carry on your conversation on the go, just like you might on a regular phone call. Gemini Live begins rolling out today in English to our Gemini Advanced subscribers on Android phones, and in the coming weeks will expand to iOS and more languages.

To make speaking to Gemini feel even more natural, we’re introducing 10 new voices to choose from, so you can pick the tone and style that works best for you.


Per the Rundown AI:
Why it matters: Real-time voice is slowly shifting AI from a tool we text/prompt with, to an intelligence that we collaborate, learn, consult, and grow with. As the world’s anticipation for OpenAI’s unreleased products grows, Google has swooped in to steal the spotlight as the first to lead widespread advanced AI voice rollouts.

Beyond Social Media: Schmidt Predicts AI’s Earth-Shaking Impact — from wallstreetpit.com
The next wave of AI is coming, and if Schmidt is correct, it will reshape our world in ways we are only beginning to imagine.

In a recent Q&A session at Stanford, Eric Schmidt, former CEO and Chairman of search giant Google, offered a compelling vision of the near future in artificial intelligence. His predictions, both exciting and sobering, paint a picture of a world on the brink of a technological revolution that could dwarf the impact of social media.

Schmidt highlighted three key advancements that he believes will converge to create this transformative wave: very large context windows, agents, and text-to-action capabilities. These developments, according to Schmidt, are not just incremental improvements but game-changers that could reshape our interaction with technology and the world at large.



The rise of multimodal AI agents — from 11onze.cat
Technology companies are investing large amounts of money in creating new multimodal artificial intelligence models and algorithms that can learn, reason and make decisions autonomously after collecting and analysing data.

The future of multimodal agents
In practical terms, a multimodal AI agent can, for example, analyse a text while processing an image, spoken language, or an audio clip to give a more complete and accurate response, both through voice and text. This opens up new possibilities in various fields: from education and healthcare to e-commerce and customer service.


AI Change Management: 41 Tactics to Use (August 2024) — from flexos.work by Daan van Rossum
Future-proof companies are investing in driving AI adoption, but many don’t know where to start. The experts recommend these 41 tips for AI change management.

As Matt Kropp told me in our interview, BCG has a 10-20-70 rule for AI at work:

  • 10% is the LLM or algorithm
  • 20% is the software layer around it (like ChatGPT)
  • 70% is the human factor

This 70% is exactly why change management is key in driving AI adoption.

But where do you start?

As I coach leaders at companies like Apple, Toyota, Amazon, L’Oréal, and Gartner in our Lead with AI program, I know that’s the question on everyone’s minds.

I don’t believe in gatekeeping this information, so here are 41 principles and tactics I share with our community members looking for winning AI change management principles.


 

From DSC:
[For those folks who use Google Chrome]

If you keep getting distracted by all of the extraneous items — such as those annoying videos and advertisements — that appear when you launch a web page, there is a solution for quickly hiding all of those items. It’s called Postlight Reader. I’ve been using it for years and wanted to put this information out there for folks who might not have heard about it.

 

I highly recommend it if you are having trouble reading an article and processing the information that it contains. Instructional Designers will know all about Extraneous Load (one of the types of Cognitive Load) and how it negatively impacts one’s learning and processing of the information that really counts (i.e., the Germane Cognitive Load).

Note the differences when I used Postlight Reader on an article out at cbsnews.com:

 

The page appears with all kinds of ads and videos going on… I can hardly process the information in the article due to these items:

 

 

Then, after I enable this extension in Chrome and click on the icon for Postlight Reader, it strips away all of those items and leaves me with the article that I wanted to read:

 

 

If you aren’t using it, I highly recommend that you give it a try.

 


Postlight Reader – Clear away the clutter from all of your articles. Instantly.

The Postlight Reader extension for Chrome removes ads and distractions, leaving only text and images for a clean and consistent reading view on every site. Features:

  • Disable surrounding webpage noise and clutter with one click
  • Send To Kindle functionality
  • Adjust typeface and text size, and toggle between light or dark themes
  • Quick keyboard shortcut (Cmd + Esc for Mac users, Alt + ` for Windows users) to switch to Reader on any article page
  • Printing optimization
  • Sharing through Facebook, Twitter and Email
 

From DSC:
The above item is simply excellent!!! I love it!



Also relevant/see:

3 new Chrome AI features for even more helpful browsing — from blog.google by Parisa Tabriz
See how Chrome’s new AI features, including Google Lens for desktop and Tab compare, can help you get things done more easily on the web.


On speaking to AI — from oneusefulthing.org by Ethan Mollick
Voice changes a lot of things

So, let’s talk about ChatGPT’s new Advanced Voice mode and the new AI-powered Siri. They are not just different approaches to talking to AI. In many ways, they represent the divide between two philosophies of AI – Copilots versus Agents, small models versus large ones, specialists versus generalists.


Your guide to AI – August 2024 — from nathanbenaich.substack.com by Nathan Benaich and Alex Chalmers


Microsoft says OpenAI is now a competitor in AI and search — from cnbc.com by Jordan Novet

Key Points

  • Microsoft’s annually updated list of competitors now includes OpenAI, a long-term strategic partner.
  • The change comes days after OpenAI announced a prototype of a search engine.
  • Microsoft has reportedly invested $13 billion into OpenAI.


Excerpt from Graham Clay:

1. Flux, an open-source text-to-image creator that is comparable to industry leaders like Midjourney, was released by Black Forest Labs (the “original team” behind Stable Diffusion). It is capable of generating high quality text in images (there are tons of educational use cases). You can play with it on their demo page, on Poe, or by running it on your own computer (tutorial here).

Other items re: Flux:

How to FLUX — from heatherbcooper.substack.com by Heather Cooper
Where to use FLUX online & full tutorial to create a sleek ad in minutes


Also from Heather Cooper:

Introducing FLUX: Open-Source text to image model

FLUX… has been EVERYWHERE this week, as I’m sure you have seen. Developed by Black Forest Labs, FLUX is an open-source image generation model that’s gaining attention for its ability to rival leading models like Midjourney, DALL·E 3, and SDXL.

What sets FLUX apart is its blend of creative freedom, precision, and accessibility—it’s available across multiple platforms and can be run locally.

Why FLUX Matters
FLUX’s open-source nature makes it accessible to a broad audience, from hobbyists to professionals.

It offers advanced multimodal and parallel diffusion transformer technology, delivering high visual quality, strong prompt adherence, and diverse outputs.

It’s available in 3 models:

  • FLUX.1 [pro]: A high-performance, commercial image synthesis model.
  • FLUX.1 [dev]: An open-weight, non-commercial variant of FLUX.1 [pro].
  • FLUX.1 [schnell]: A faster, distilled version of FLUX.1, operating up to 10x quicker.

Daily Digest: Huge (in)Flux of AI videos. — from bensbites.beehiiv.com
PLUS: Review of ChatGPT’s advanced voice mode.

  1. During the weekend, image models made a comeback. Recently released Flux models can create realistic images with near-perfect text—straight from the model, without much patchwork. To get the party going, people are putting these images into video generation models to create pretty trippy videos. I can’t identify half of them as AI, and they’ll only get better. See this tutorial on how to create a video ad for your product.

 


7 not only cool but handy use cases of new Claude — from techthatmatters.beehiiv.com by Harsh Makadia

  1. Data visualization
  2. Infographic
  3. Copy the UI of a website
  4. …and more

Achieving Human Level Competitive Robot Table Tennis — from sites.google.com

 

Daniel Christian: My slides for the Educational Technology Organization of Michigan’s Spring 2024 Retreat

From DSC:
Last Thursday, I presented at the Educational Technology Organization of Michigan’s Spring 2024 Retreat. I wanted to pass along my slides to you all, in case they are helpful to you.

Topics/agenda:

  • Topics & resources re: Artificial Intelligence (AI)
    • Top multimodal players
    • Resources for learning about AI
    • Applications of AI
    • My predictions re: AI
  • The powerful impact of pursuing a vision
  • A potential, future next-gen learning platform
  • Share some lessons from my past with pertinent questions for you all now
  • The significant impact of an organization’s culture
  • Bonus material: Some people to follow re: learning science and edtech

 

Education Technology Organization of Michigan -- ETOM -- Spring 2024 Retreat on June 6-7

PowerPoint slides of Daniel Christian's presentation at ETOM

Slides of the presentation (.PPTX)
Slides of the presentation (.PDF)

 


Plus several more slides re: this vision.

 

 

Apple Intelligence: every new AI feature coming to the iPhone and Mac — from theverge.com by Wes Davis

Apple announced “Apple Intelligence” at WWDC 2024, its name for a new suite of AI features for the iPhone, Mac, and more. Starting later this year, Apple is rolling out what it says is a more conversational Siri, custom, AI-generated “Genmoji,” and GPT-4o access that lets Siri turn to OpenAI’s chatbot when it can’t handle what you ask it for.

Apple jumps into the AI arms race with OpenAI deal — from washingtonpost.com by Gerrit De Vynck
The iPhone maker has mostly stayed on the sidelines as the tech industry goes wild for AI. Not anymore.

SAN FRANCISCO — Apple officially launched itself into the artificial intelligence arms race, announcing a deal with ChatGPT maker OpenAI to use the company’s technology in its products and showing off a slew of its own new AI features.

The announcements, made at the tech giant’s annual Worldwide Developers Conference on Monday in Cupertino, Calif., are aimed at helping the tech giant keep up with competitors such as Google and Microsoft, which have boasted in recent months that AI makes their phones, laptops and software better than Apple’s. In addition to Apple’s own homegrown AI tech, the company’s phones, computers and iPads will also have ChatGPT built in “later this year,” a huge validation of the importance of the highflying start-up’s tech.

Apple Intelligence: AI for the rest of us. — from apple.com

  • Built into your iPhone, iPad, and Mac to help you write, express yourself, and get things done effortlessly.
  • Draws on your personal context while setting a brand-new standard for privacy in AI.

Introducing Apple Intelligence, the personal intelligence system that puts powerful generative models at the core of iPhone, iPad, and Mac — from apple.com
Setting a new standard for privacy in AI, Apple Intelligence understands personal context to deliver intelligence that is helpful and relevant

Apple doubles down on artificial intelligence, announcing partnership with OpenAI — from npr.org by Lola Murti and Dara Kerr

The highly anticipated AI partnership is the first of its kind for Apple, which has been regarded by analysts as slower to adopt artificial intelligence than other technology companies such as Microsoft and Google.

The deal allows Apple’s millions of users to access technology from OpenAI, one of the highest-profile artificial intelligence companies of recent years. OpenAI has already established partnerships with a variety of technology and publishing companies, including a multibillion-dollar deal with Microsoft.

 

The real deal here is that Apple is literally putting AI into the hands of >1B people, most of whom will probably be using AI for the 1st time. And it’s delivering AI that’s actually useful (forget those Genmojis, we’re talking about implanting ChatGPT-4o’s brain into Apple devices).

Noah Edelman (source)

Here’s everything Apple announced at the WWDC 2024 keynote, including Apple Intelligence, Siri makeover — from techcrunch.com by Christine Hall

It’s WWDC 2024 keynote time! Each year Apple kicks off its Worldwide Developers Conference with a few hours of straight announcements, like the long-awaited Apple Intelligence and a makeover for its AI assistant, Siri. We expected much of them to revolve around the company’s artificial intelligence ambitions, and Apple didn’t disappoint. We also bring you news about Vision Pro and lots of feature refreshes.

Here’s how to watch the archive of WWDC 2024.


Why Gamma is great for presentations — from Jeremy Caplan

Gamma has become one of my favorite new creativity tools. You can use it like Powerpoint or Google Slides, adding text and images to make impactful presentations. It lets you create vertical, square or horizontal slides. You can embed online content to make your deck stand out with videos, data or graphics. You can even use it to make quick websites.

Its best feature, though, is an easy-to-use application of AI. The AI will learn from any document you import, or you can use a text prompt to create a strong deck or site instantly.


107 Up-to-Date ChatGPT Statistics & User Numbers [April 2024] — from nerdynav.com

Top ChatGPT Statistics

  • ChatGPT has 180.5 million users out of which 100 million users are active weekly.
  • In January 2024, ChatGPT got 2.3 billion website visits and 2 million developers are using its API.
  • The highest percentage of ChatGPT users belong to USA (46.75%), followed by India (5.47%). ChatGPT is banned in 7 countries including Russia and China.
  • OpenAI’s projected revenue from ChatGPT is $2 billion in 2024.
  • Running ChatGPT costs OpenAI around $700,000 daily.
  • Sam Altman is seeking $7 trillion for a global AI chip project, while OpenAI is also listed as a major shareholder in Reddit.
  • ChatGPT offers a free version with GPT-3.5 and a Plus version with GPT-4, which is 40% more accurate and 82% safer, costing $20 per month.
  • ChatGPT is being used for automation, education, coding, data-analysis, writing, etc.
  • 43% of college students and 80% of the Fortune 500 companies are using ChatGPT.
  • A 2023 study found 25% of US companies surveyed saved $50K-$70K using ChatGPT, while 11% saved over $100K.
 
 

Introducing Copilot+ PCs — from blogs.microsoft.com

[On May 20th], at a special event on our new Microsoft campus, we introduced the world to a new category of Windows PCs designed for AI, Copilot+ PCs.

Copilot+ PCs are the fastest, most intelligent Windows PCs ever built. With powerful new silicon capable of an incredible 40+ TOPS (trillion operations per second), all–day battery life and access to the most advanced AI models, Copilot+ PCs will enable you to do things you can’t on any other PC. Easily find and remember what you have seen in your PC with Recall, generate and refine AI images in near real-time directly on the device using Cocreator, and bridge language barriers with Live Captions, translating audio from 40+ languages into English.

From DSC:
As a first off-the-hip look, Recall could be fraught with possible security/privacy-related issues. But what do I know? The Neuron states “Microsoft assures that everything Recall sees remains private.” Ok…


From The Rundown AI concerning the above announcements:

The details:

  • A new system enables Copilot+ PCs to run AI workloads up to 20x faster and 100x more efficiently than traditional PCs.
  • Windows 11 has been rearchitected specifically for AI, integrating the Copilot assistant directly into the OS.
  • New AI experiences include a new feature called Recall, which allows users to search for anything they’ve seen on their screen with natural language.
  • Copilot’s new screen-sharing feature allows AI to watch, hear, and understand what a user is doing on their computer and answer questions in real-time.
  • Copilot+ PCs will start at $999, and ship with OpenAI’s latest GPT-4o models.

Why it matters: Tony Stark’s all-powerful JARVIS AI assistant is getting closer to reality every day. Once Copilot, ChatGPT, Project Astra, or anyone else can not only respond but start executing tasks autonomously, things will start getting really exciting — and likely initiate a whole new era of tech work.


 

Hello GPT-4o — from openai.com
We’re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.

GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT-4o is especially better at vision and audio understanding compared to existing models.

Example topics covered here:

  • Two GPT-4os interacting and singing
  • Languages/translation
  • Personalized math tutor
  • Meeting AI
  • Harmonizing and creating music
  • Providing inflection, emotions, and a human-like voice
  • Understanding what the camera is looking at and integrating it into the AI’s responses
  • Providing customer service

With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.
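As a rough illustration of what “any combination of text, audio, image” means at the request level, a multimodal call bundles several content parts into a single user message. The sketch below only constructs the request payload; the part shapes follow the general content-parts pattern used by chat APIs, but treat the exact field names as an assumption rather than a definitive API reference:

```python
import base64

def build_multimodal_request(text, image_bytes=None):
    """Assemble a chat-style request whose one user message mixes text
    and, optionally, an inline image (illustrative field names only)."""
    parts = [{"type": "text", "text": text}]
    if image_bytes is not None:
        # Images are commonly inlined as base64 data URLs.
        b64 = base64.b64encode(image_bytes).decode("ascii")
        parts.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"},
        })
    return {"model": "gpt-4o",
            "messages": [{"role": "user", "content": parts}]}
```

The point of the end-to-end design is that the model consumes these mixed parts directly, rather than routing audio or images through separate transcription and captioning models first.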





From DSC:
I like the assistive tech angle here:





 

 

Enter the New Era of Mobile AI With Samsung Galaxy S24 Series — from news.samsung.com

Galaxy AI introduces meaningful intelligence aimed at enhancing every part of life, especially the phone’s most fundamental role: communication. When you need to defy language barriers, Galaxy S24 makes it easier than ever. Chat with another student or colleague from abroad. Book a reservation while on vacation in another country. It’s all possible with Live Translate, two-way, real-time voice and text translations of phone calls within the native app. No third-party apps are required, and on-device AI keeps conversations completely private.

With Interpreter, live conversations can be instantly translated on a split-screen view so people standing opposite each other can read a text transcription of what the other person has said. It even works without cellular data or Wi-Fi.


Galaxy S24 — from theneurondaily.com by Noah Edelman & Pete Huang

Samsung just announced the first truly AI-powered smartphone: the Galaxy S24.


For us AI power users, the features aren’t exactly new, but it’s the first time we’ve seen them packaged up into a smartphone (Siri doesn’t count, sorry).


Samsung’s Galaxy S24 line arrives with camera improvements and generative AI tricks — from techcrunch.com by Brian Heater
Starting at $800, the new flagships offer brighter screens and a slew of new photo-editing tools

 

Mark Zuckerberg: First Interview in the Metaverse | Lex Fridman Podcast #398


Photo-realistic avatars show future of Metaverse communication — from inavateonthenet.net

Mark Zuckerberg, CEO, Meta, took part in the first-ever Metaverse interview using photo-realistic virtual avatars, demonstrating the Metaverse’s capability for virtual communication.

Zuckerberg appeared on the Lex Fridman podcast, using scans of both Fridman and Zuckerberg to create realistic avatars instead of a live video feed. A computer model of each avatar’s face and body is put into a codec, and a headset sends an encoded version of the avatar.

The interview explored the future of AI in the metaverse, as well as the Quest 3 headset and the future of humanity.


 

Google’s AI-powered note-taking app is the messy beginning of something great — from theverge.com by David Pierce; via AI Insider
NotebookLM is a neat research tool with some big ideas. It’s still rough and new, but it feels like Google is onto something.

Excerpts (emphasis DSC):

What if you could have a conversation with your notes? That question has consumed a corner of the internet recently, as companies like Dropbox, Box, Notion, and others have built generative AI tools that let you interact with and create new things from the data you already have in their systems.

Google’s version of this is called NotebookLM. It’s an AI-powered research tool that is meant to help you organize and interact with your own notes. 

Right now, it’s really just a prototype, but a small team inside the company has been trying to figure out what an AI notebook might look like.
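Under the hood, “chatting with your notes” in tools like these is generally retrieval plus generation: find the note passages most relevant to the question, then hand them to a language model as context. Here is a toy, model-free sketch of the retrieval half, with keyword overlap standing in for the embedding-based search a real product would use (this is not a description of how NotebookLM actually works):

```python
def retrieve_note(question, notes):
    """Return the note sharing the most words with the question.

    Word overlap is a crude stand-in for embedding-based semantic
    retrieval; a real system would rank passages by vector similarity.
    """
    q = set(question.lower().split())
    return max(notes, key=lambda n: len(q & set(n.lower().split())))
```

For example, asking about a trip budget would surface the budget note rather than unrelated meeting notes; the retrieved text would then be passed to the model along with the question.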

 
 

Apple’s $3,499 Vision Pro AR headset is finally here — from techcrunch.com by Brian Heater

Image of the Vision Pro AR headset from Apple

Image Credits: Apple

Excerpts:

“With Vision Pro, you’re no longer limited by a display,” Apple CEO Tim Cook said, introducing the new headset at WWDC 2023. Unlike earlier mixed reality reports suggested, the system is far more focused on augmented reality than virtual reality. The company refers to this new paradigm as “spatial computing.”


Reflections from Scott Belsky re: the Vision Pro — from implications.com


Apple WWDC 2023: Everything announced from the Apple Vision Pro to iOS 17, MacBook Air and more — from techcrunch.com by Christine Hall



Apple unveils new tech — from therundown.ai (The Rundown)

Here were the biggest things announced:

  • A 15” MacBook Air, now the thinnest 15” laptop available
  • The new Mac Pro workstation, presumably a billion dollars
  • M2 Ultra, Apple’s new super chip
  • NameDrop, an AirDrop-integrated data-sharing feature allowing users to share contact info just by bringing their phones together
  • Journal, an ML-powered personalized journalling app
  • Standby, turning your iPhone into a nightstand alarm clock
  • A new, AI-powered update to autocorrect (finally)
  • Apple Vision Pro


Apple announces AR/VR headset called Vision Pro — from joinsuperhuman.ai by Zain Kahn

Excerpt:

“This is the first Apple product you look through and not at.” – Tim Cook

And with those famous words, Apple announced a new era of consumer tech.

Apple’s new headset will operate on VisionOS – its new operating system – and will work with existing iOS and iPad apps. The new OS is created specifically for spatial computing — the blend of digital content into real space.

Vision Pro is controlled through hand gestures, eye movements and your voice (parts of it assisted by AI). You can use apps, change their size, capture photos and videos and more.


From DSC:
Time will tell what happens with this new operating system and with this type of platform. I’m impressed with the engineering — as Apple wants me to be — but I doubt that this will become mainstream for quite some time yet. Also, I wonder what Steve Jobs would think of this…? Would he say that people would be willing to wear this headset (for long? at all?)? What about Jony Ive?

I’m sure the offered experiences will be excellent. But I won’t be buying one, as it’s waaaaaaaaay too expensive.


 
© 2024 | Daniel Christian