It was a struggle to find any content for State of the AI Nation 3. Nothing has really happened in the space over the last few months…
Let’s start with Microsoft. I’ve been unashamedly critical of Microsoft over the past few years, and frankly they’ve earned every scathing word. The multi-trillion-dollar company is held up by dubious foundations: SharePoint – an online storage system that still demands you adhere to the VFAT long file-naming rules from 1995, the Azure cloud – which suffered a quickly-hushed worldwide data breach that one outlet described as “the worst you can imagine”, and Bing – a search engine (no, me neither) whose market share actually went down this year from 3.59% to 3.01%, despite getting $13bn-worth of advanced generative AI baked into it.
Disclosure: I am a Microsoft shareholder.
Breathe. But all that might be about to change…
Let’s start with pivot tables, shall we?
WTF is a Pivot Table?
A pivot table allows you to take data and reshape it around a dimension or field. For example, if you have a list of transactions like below, a pivot table allows you to aggregate data on one particular product.
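If you prefer to see that idea in code, here is a minimal sketch of what "aggregate on one particular product" means. It is plain Python rather than anything Excel does under the hood, and the transactions, column names and the `pivot_by_product` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical transactions: (date, product, quantity, amount).
# These rows are invented purely to illustrate the idea.
transactions = [
    ("2023-11-01", "Widget", 2, 10.0),
    ("2023-11-01", "Gadget", 1, 25.0),
    ("2023-11-02", "Widget", 3, 15.0),
    ("2023-11-03", "Gizmo", 1, 40.0),
    ("2023-11-03", "Gadget", 2, 50.0),
]


def pivot_by_product(rows):
    """Group rows by product and sum the quantity and amount columns."""
    totals = defaultdict(lambda: {"quantity": 0, "amount": 0.0})
    for _date, product, quantity, amount in rows:
        totals[product]["quantity"] += quantity
        totals[product]["amount"] += amount
    return dict(totals)


summary = pivot_by_product(transactions)

# One line per product, in alphabetical order.
for product, total in sorted(summary.items()):
    print(f"{product}: {total['quantity']} sold, {total['amount']:.2f} total")
```

Five rows of transactions collapse into three rows, one per product. That collapsing step is all a basic pivot really is; Excel just wraps it in a drag-and-drop interface that I forget how to use every year.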
I use pivot tables about once a year, so I always end up wasting half an hour trying to remember how to do it, usually searching somewhere other than Bing for the answer. (Seriously kids, use DuckDuckGo).
But what if you could just ask Excel to do that pivot stuff for you? Well now you can… With Excel Copilot you can simply ask for what you want and it will pivot the data for you:
Somehow, out of nowhere, Microsoft has figured out how to embed ChatGPT (and let’s face it, these “copilots” are really just ChatGPT with a fresh CSS stylesheet) so seamlessly, smartly and effectively that even the most complicated Excel data operation is just a vague question away.
There is no longer any need to remember the exact syntax for formulas: just ask Excel Copilot for what you’re after and it generally does a damned good job of satisfying your request.
Building Apps through Conversation with Power AppBuilder
The Power Platform is Microsoft’s no-code/low-code solution for making data in Microsoft 365’s “dataverse” (sorry I just threw up in my mouth a little) more useful and usable.
Building apps on Power Platform has always reminded me of designing Visual Basic 6 apps in my secondary school IT room after school (yes I’m a nerd, nice to meet you).
But now, Microsoft has embedded a Copilot into Power Apps that lets you describe an app and it will build it.
Let me repeat just so I’m sure you understood. You can describe an app, and it will build it for you:
Not only will it build it for you, if it needs to use the data already in the datav – bloik! – dataverse, it will pull it in AND it will honour all of your role-based access controls, Compliance Manager rules and Purview-assigned classifications.
You can see a walkthrough here.
The thing I enjoyed the most about the 2023 Ignite conference was Microsoft’s announcement of Loop.
SharePoint was once very powerful and helpful but by 2023 standards it is a useless bag of spanners that doesn’t even allow you to create a hierarchy of Pages without showing off the Windows 3.1 folder structure going on behind the scenes. It has been monumentally crapped on from a great height by modern, UX-obsessed tools like Confluence and Notion.
But with Loop, Microsoft is taking the fight directly to Notion, with Enterprise-wide integration into Bing Corporate Search, Purview and Azure AD, I mean Microsoft Entra, roles and access controls.
Loop is lovely, I’m totally sold on it. It is component-based, and you can share each individual block of content seamlessly across Microsoft 365 apps such as Teams (bloik!) and Word.
But forget sharing across other apps, just use Loop. Simples. Loop comes with commenting, a raft of block types (such as polls) and … oh yeah, it even has a Copilot!
Using Loop Copilot, you can ask AI to start brainstorming or creating a project:
Notion has been able to do this for a while now, but the deep integration with Microsoft’s identity and data governance platform is a huge win for Enterprise users. Almost enough to forgive the Ribbon menu.
Has anything happened in OpenAI recently? Hmmmmmm….
Ah yes. So their CEO hokey cokey aside, OpenAI held their first (and possibly only!) DevDay in November, at which they announced custom GPTs, allowing you to create customised, specialised and fine-tuned chatbots through discussion alone, and cut most prices by a third.
The new GPT-4 Turbo model that Sam Altman launched at the event is what has made the custom GPTs possible. It has shown much better reasoning and increases the context window to 128k tokens (around 300 pages of text) whilst being a third of the price of GPT-4.
OpenAI claims “multimodal capabilities”, although they still say the image creation is coming from DALL-E 3, so it seems to be a UI-level integration rather than one at a model level.
Which brings us nicely onto…
Google is fighting back.
After the lukewarm response to the original Bard, which used the LaMDA LLM, Google has finally dropped Gemini on us.
Gemini, as I mentioned in the previous State of the AI Nation, is a true multi-modal model that equally understands text, images and audio in the same context, and can mix and match accordingly.
Gemini is now available in preview on Bard and I’ve been very impressed with its capabilities.
This picture I took in Blackpool (St. Tropez of the north west) was uploaded, processed and responded to within a couple of seconds. Gemini not only described the picture correctly, it also guessed that it was sunset (not sunrise!) which is also correct. This is so seamless and seemingly benign that I think it’s finally a viable alternative to ChatGPT.
Bard is also integrated with YouTube, Maps and several other Alphabet products:
Currently, Bard doesn’t seem to use your data, such as YouTube subscriptions and saved places in Maps. Once this happens, Bard will be leaps ahead of the competition. You can ask it for things like “show me a video on youtube about chernobyl”, which brings back relevant results. While this is certainly being routed to the video search API, it feels really consistent.
Another example, the integration with Maps is pretty slick:
Oh dear, Apple.
I’ve been quite positive about Apple’s iPhones having a competitive advantage in edge AI capabilities but, to be blunt, they’ve dropped the ball.
It is said that we could expect a revamped LLM-powered Siri on the iPhone 16 launched at WWDC 2024, which isn’t until next Spring, with other outlets saying that it will be the end of 2024, and the launch of iOS 18, before AI features make it to the iPhone. That just shows how far behind the curve Apple are. Very disappointing.
Nothing to see here.
I’m not a Samsung fan: I don’t believe premium prices should mean you’re burdened with crappy partner apps on your phone that you can’t uninstall. But Samsung have leapfrogged Apple with the launch of Gauss.
Gauss, supposedly coming out for the Galaxy S24, will be a multi-modal model integrated deeply into Samsung phones. You’ll be able to use Gauss to write your emails, summarise documents, perform translations, etc. It even supports code and image generation.
It’s almost a shame people will just switch over to the Bard app instead as soon as they buy a Galaxy S24!
What fresh AI hell awaits us in 2024?
I’d put money on us seeing GPT-5 in the first half of next year. I expect it will be a more comprehensively integrated multi-modal model, and I think this will be reinforced by dropping the names Codex, Whisper and DALL-E too.
Now that Google has a competitive advantage and a model that isn’t “useless”, you can expect them to integrate it into all your Google apps – Gmail, YouTube, Maps, etc. The next big phase of Gemini will undoubtedly be bringing more of your data into the context so that responses are uniquely personal to you.
OpenAI can’t compete when it comes to YOUR data, just on the data it is trained on, so I think Google has a real competitive advantage, and Bard has a real chance of being the Killer App for AI.
I believe the Google/Apple antitrust lawsuit will have an impact on what AI looks like for Apple users. Apple is unlikely to get near what Bard and ChatGPT are already capable of in the next few months, so Apple will look at other solutions. If the antitrust case rules that there are no monopoly problems, one option for Apple would be to further integrate Google into iOS: replace Siri with Bard (but still call it Siri), and introduce a range of specialised features for iMessage, Watch, AirPods, Apple Maps, CarPlay etc that are all based around Google’s AI. Apple’s hardware and design prowess meets Google’s software dabhandery: that’s a recipe made in tech heaven!