{"id":91669,"date":"2024-05-14T19:25:13","date_gmt":"2024-05-14T23:25:13","guid":{"rendered":"https:\/\/danielschristian.com\/learning-ecosystems\/?p=91669"},"modified":"2024-05-14T20:04:41","modified_gmt":"2024-05-15T00:04:41","slug":"a-major-step-towards-much-more-natural-human-computer-interaction-openai-introduces-gpt-4o","status":"publish","type":"post","link":"https:\/\/danielschristian.com\/learning-ecosystems\/2024\/05\/14\/a-major-step-towards-much-more-natural-human-computer-interaction-openai-introduces-gpt-4o\/","title":{"rendered":"A major step towards much more natural human-computer interaction: OpenAI introduces GPT-4o"},"content":{"rendered":"<p><a href=\"https:\/\/openai.com\/index\/hello-gpt-4o\/\" target=\"_blank\" rel=\"noopener\"><strong>Hello GPT-4o<\/strong><\/a> &#8212; from openai.com<br \/>\n<em>We\u2019re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time.<\/em><\/p>\n<p><iframe loading=\"lazy\" title=\"YouTube video player\" src=\"https:\/\/www.youtube.com\/embed\/DQacCB9tDaw?si=Z7f1IIIOc6rfF4tg\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>GPT-4o (\u201co\u201d for \u201comni\u201d) is a step towards much more natural human-computer interaction\u2014it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to\u00a0human response time in a conversation. It matches GPT-4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. 
GPT-4o is especially better at vision and audio understanding compared to existing models.<\/p>\n<p>Example topics covered <a href=\"https:\/\/openai.com\/index\/hello-gpt-4o\/\" target=\"_blank\" rel=\"noopener\">here<\/a>:<\/p>\n<ul>\n<li>Two GPT-4os interacting and singing<\/li>\n<li>Languages\/translation<\/li>\n<li>Personalized math tutor<\/li>\n<li>Meeting AI<\/li>\n<li>Harmonizing and creating music<\/li>\n<li>Providing inflection, emotions, and a human-like voice<\/li>\n<li>Understanding what the camera is looking at and integrating it into the AI&#8217;s responses<\/li>\n<li>Providing customer service<\/li>\n<\/ul>\n<blockquote><p><span style=\"color: #ff6600;\"><strong>With GPT-4o, we trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.<\/strong><\/span><\/p><\/blockquote>\n<hr \/>\n<hr \/>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">This demo is insane.<\/p>\n<p>A student shares their iPad screen with the new ChatGPT + GPT-4o, and the AI speaks with them and helps them learn in *realtime*.<\/p>\n<p>Imagine giving this to every student in the world.<\/p>\n<p>The future is so, so bright. 
<a href=\"https:\/\/t.co\/t14M4fDjwV\">pic.twitter.com\/t14M4fDjwV<\/a><\/p>\n<p>\u2014 Mckay Wrigley (@mckaywrigley) <a href=\"https:\/\/twitter.com\/mckaywrigley\/status\/1790088880919818332?ref_src=twsrc%5Etfw\">May 13, 2024<\/a><\/p><\/blockquote>\n<p><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<hr \/>\n<hr \/>\n<p><em><span style=\"color: #800000;\">From DSC:<\/span><\/em><br \/>\n<span style=\"color: #800000;\">I like the assistive tech angle here:<\/span><\/p>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"pt\">GPT-4o as tested by <a href=\"https:\/\/twitter.com\/BeMyEyes?ref_src=twsrc%5Etfw\">@BeMyEyes<\/a>: <a href=\"https:\/\/t.co\/WeAoVmxUFH\">pic.twitter.com\/WeAoVmxUFH<\/a><\/p>\n<p>\u2014 Greg Brockman (@gdb) <a href=\"https:\/\/twitter.com\/gdb\/status\/1790195202214572399?ref_src=twsrc%5Etfw\">May 14, 2024<\/a><\/p><\/blockquote>\n<p><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<hr \/>\n<hr \/>\n<blockquote class=\"twitter-tweet\">\n<p dir=\"ltr\" lang=\"en\">It&#8217;s been less than 24 hours since OpenAI changed the world with the GPT-4o announcement.<\/p>\n<p>And the Internet is flooded with demo videos.<\/p>\n<p>Here&#8217;re the 10 most jaw-dropping examples so far (Don&#8217;t miss the 6th one) <a href=\"https:\/\/t.co\/sLx1D1YSqb\">pic.twitter.com\/sLx1D1YSqb<\/a><\/p>\n<p>\u2014 Poonam Soni (@CodeByPoonam) <a href=\"https:\/\/twitter.com\/CodeByPoonam\/status\/1790414979180704211?ref_src=twsrc%5Etfw\">May 14, 2024<\/a><\/p><\/blockquote>\n<p><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<hr \/>\n<hr \/>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hello GPT-4o &#8212; from openai.com We\u2019re announcing GPT-4o, our new flagship model that can reason across audio, vision, and text in real time. 
GPT-4o (\u201co\u201d for \u201comni\u201d) is a step towards much more natural human-computer interaction\u2014it accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[329,354,356,190,817,8,286,9,343,32,210,848,3,373,391,419,149,180,119,482,124,158,276,64,46,289,825,432,102,17,7,510,228,79,15,163,23,821,219,869,69,309,196,84,66,40,466,195,518,321,214,156,312,186,367,11,822],"tags":[],"class_list":["post-91669","post","type-post","status-publish","format-standard","hentry","category-24x7x365-access","category-av-audiovisual","category-artificial-intelligence-agents-llms-and-related","category-assistive-technologies","category-bots","category-digital-audio","category-digital-learning","category-digital-video","category-education","category-education-technology","category-emerging-technologies","category-emotion","category-higher-education","category-homeschoolinghomeschoolers","category-human-computer-interaction-hci","category-ideas-teaching","category-informal-learning","category-innovation","category-instructional-design","category-intelligent-systems","category-intelligent-tutoring","category-interaction-design","category-interactivity","category-it-in-he","category-k-12-related","category-languages-and-translation","category-law-schools","category-learner-profiles","category-learning","category-learning-agents","category-learning-ecosystem","category-learning-from-the-living-class-room","category-learning-preferences","category-liberal-arts","category-lifelong-learning","category-mathematics","category-multimedia","category-natural-language-processing-nlp
","category-online-tutoring","category-open-ai","category-personalizedcustomized-learning","category-platforms","category-productivity-tips-and-tricks","category-smart-classrooms","category-student-related","category-technologies-for-your-home","category-television","category-tools","category-tvs","category-united-states","category-universities","category-usability","category-user-experience-ux","category-user-interface-design","category-vendors","category-vision-possibilities","category-voice-recognition-voice-enabled-interfaces"],"_links":{"self":[{"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/posts\/91669","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/comments?post=91669"}],"version-history":[{"count":21,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/posts\/91669\/revisions"}],"predecessor-version":[{"id":91684,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/posts\/91669\/revisions\/91684"}],"wp:attachment":[{"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/media?parent=91669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/categories?post=91669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/danielschristian.com\/learning-ecosystems\/wp-json\/wp\/v2\/tags?post=91669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}