This week, we're covering the strides made in prompting and problem-solving with large language models, scaling up speech technology to a thousand languages, improving efficiency with transformers, and a policy proposal for publicly owned tech companies in the UK.
Let's dive in.
1. Taking a Fresh Approach to Problem-Solving
Princeton and DeepMind have collaborated on "Tree of Thoughts: Deliberate Problem Solving with Large Language Models". The approach generalises methods such as chain-of-thought prompting (and its "self-consistency" extension) by searching over a tree of intermediate reasoning steps rather than committing to a single chain. It performs strongly on mathematical puzzles that require multi-step reasoning, such as the Game of 24.
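The core search loop can be sketched in a few lines. Here `propose_thoughts` and `score_thought` are hypothetical placeholders for the two prompted LLM calls the method relies on (a step generator and a step evaluator); only the breadth-first, beam-style variant is shown, with a toy deterministic heuristic so the example runs:

```python
def propose_thoughts(state):
    """Placeholder: an LLM would propose candidate next reasoning steps."""
    return [state + [step] for step in ("a", "b")]

def score_thought(state):
    """Placeholder: an LLM would rate how promising a partial solution is."""
    return -len(state)  # toy heuristic so the sketch runs deterministically

def tree_of_thoughts(root, depth=3, beam=2):
    """Breadth-first search over thoughts, keeping the best `beam` per level."""
    frontier = [root]
    for _ in range(depth):
        # Expand every state in the frontier, then keep the top candidates.
        candidates = [c for s in frontier for c in propose_thoughts(s)]
        frontier = sorted(candidates, key=score_thought, reverse=True)[:beam]
    return frontier[0]  # best line of reasoning found

best = tree_of_thoughts(root=[])
```

In the paper, both placeholders are prompted model calls, and depth-first search with backtracking is used for some tasks instead of the breadth-first loop above.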
2. Speaking the Language of the World
Meta AI has launched a massively multilingual speech project, aiming to extend speech technology to over a thousand languages. The strategy combines publicly available recordings of religious texts in many languages with self-supervised learning to build the models. The models notably outperform previous benchmarks and are available under a Creative Commons non-commercial license.
3. A Vision for the Future of AI
The authors of "VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks" provide a unified perspective on vision and language tasks, treating images as a foreign language. The framework accepts both images and instructions, using an LLM decoder to produce results in the requested format. Though general, the approach achieves results competitive with specialized models on MS COCO object detection.
4. Tackling Complex Problems with Augmented Language Models
The research paper "ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings" presents a strategy for equipping a frozen language model with large numbers of tools without lengthy few-shot prompts. Each tool is represented as a new token, or "toolken", whose embedding is learned from a wide array of demonstration data. These toolkens enhance the language model's performance, achieving significantly higher accuracy on Wikidata question answering.
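The idea can be illustrated with a toy sketch: the frozen model's output head is extended with one trainable embedding per tool, so invoking a tool reduces to predicting an extra token. All sizes, values, and the hidden state below are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab_size, n_tools = 8, 100, 3

frozen_head = rng.normal(size=(vocab_size, d))  # frozen LM output embeddings
toolkens = np.zeros((n_tools, d))               # the only trainable parameters

def next_token_logits(hidden_state):
    # The tool vocabulary is appended to the word vocabulary, so tool
    # invocation is scored like any other next-token prediction.
    full_head = np.concatenate([frozen_head, toolkens], axis=0)
    return full_head @ hidden_state

h = rng.normal(size=d)         # stand-in for the LM's last hidden state
logits = next_token_logits(h)  # shape: (vocab_size + n_tools,)
```

Because only the toolken rows are updated during training, the approach scales to many tools without touching the base model's weights.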
5. More Efficiency with Transformers
Meta AI's "MegaByte: Predicting Million-byte Sequences with Multiscale Transformers" introduces a multi-scale decoder architecture that enhances efficiency. The architecture segments sequences into patches, running a large global model over patch representations and a small local model within each patch, offering benefits like sub-quadratic self-attention and improved parallelism during decoding.
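A minimal sketch of the patching step (sizes illustrative; padding and the actual transformer stacks are omitted):

```python
import numpy as np

def to_patches(byte_seq, patch_size):
    """Split a byte sequence into fixed-size patches (pad first in practice)."""
    seq = np.asarray(byte_seq, dtype=np.uint8)
    assert len(seq) % patch_size == 0, "sequence must be padded to a multiple"
    return seq.reshape(-1, patch_size)

patches = to_patches(list(range(16)), patch_size=4)
# A global model attends over the 4 patch representations while a small local
# model predicts the 4 bytes inside each patch, so self-attention cost falls
# from O(L^2) to roughly O((L/P)^2) globally plus O(P^2) per patch.
```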
6. Salesforce Steps Up in Code Understanding and Generation
Salesforce AI Research released CodeT5+, a family of models trained with span denoising, decoder-only causal language modelling, and Seq2Seq causal language modelling objectives. An instruction-tuned variant demonstrates robust performance on the HumanEval benchmark.
7. Expert-Level Medical Question Answering from Google
Building on the PaLM 2 model, Google's "Towards Expert-Level Medical Question Answering with Large Language Models" applies medical-domain finetuning and an "ensemble refinement" prompting technique, surpassing the performance of the original Med-PaLM on MedQA. In human evaluations, the model's answers also compared favourably with those of physicians.
8. Optimal Model Design for Compute Efficiency
The "Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design" study from Google finds that small vision models can rival larger ones in performance, provided their shape is optimized. Their shape-optimized SoViT model demonstrates this, achieving competitive accuracy on ImageNet.
9. Grounded Responses through "According to" Prompting
A new technique, described in '"According to ...": Prompting Language Models Improves Quoting from Pre-Training Data', tackles the problem of hallucination in large language models. The method prompts models to ground their responses in previously observed text, increasing the share of output quoted directly from pre-training data.
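In its simplest form, the technique just appends a grounding directive to the prompt. The wording and source name below are illustrative, not the paper's exact phrasing:

```python
def grounded_prompt(question, source="Wikipedia"):
    """Append an 'according to'-style grounding directive to a question."""
    return f"{question} Respond using only content that can be found in {source}."

prompt = grounded_prompt("What causes tides?")
```

The paper then measures how much of the model's output can be matched verbatim against the pre-training corpus, with grounded prompts increasing that overlap.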
10. Verifying Real-World Claims with Evidence
The University of Texas at Austin proposes a pipeline in "Complex Claim Verification with Evidence Retrieved in the Wild" that verifies real-world claims using raw evidence retrieved from the web. Complex claims are decomposed into multiple sub-questions, and the evidence retrieved for each is then used to assess the claim's veracity.
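A rough sketch of such a decompose-then-verify pipeline, with placeholder stubs standing in for the LLM components and the web retriever:

```python
def decompose(claim):
    """Placeholder: an LLM would generate sub-questions for the claim."""
    return [f"Is it true that {claim.lower()}?"]

def retrieve_evidence(question):
    """Placeholder: a search engine would return raw web documents here."""
    return ["stub document relevant to the question"]

def judge(claim, evidence):
    """Placeholder: an LLM would label the claim given the evidence."""
    return "supported" if evidence else "not enough info"

def verify(claim):
    """Decompose the claim, gather evidence per sub-question, then judge."""
    evidence = [doc for q in decompose(claim) for doc in retrieve_evidence(q)]
    return judge(claim, evidence)

verdict = verify("The Earth orbits the Sun")
```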
11. Checking Factual Accuracy through Cross Examination
"LM vs LM: Detecting Factual Errors via Cross Examination" introduces a technique in which one language model, acting as an examiner, cross-examines the claims of another. This interactive approach helps assess factual correctness and has shown strong performance compared with existing prompting strategies.
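The interaction loop can be sketched as follows; both model calls are placeholders, and the toy consistency check stands in for the examiner's final judgment over the full transcript:

```python
def examiner_question(claim, turn):
    """Placeholder: an examiner LM would generate a probing follow-up here."""
    return f"Round {turn}: can you restate the evidence for '{claim}'?"

def examinee_answer(question):
    """Placeholder: the claim-making LM answers each follow-up question."""
    return "the same supporting evidence"

def cross_examine(claim, turns=3):
    """Run a short multi-turn examination and flag inconsistencies."""
    answers = [examinee_answer(examiner_question(claim, t)) for t in range(turns)]
    # A real examiner LM judges the whole transcript; this toy version just
    # flags any disagreement between the examinee's own answers.
    return "consistent" if len(set(answers)) == 1 else "error detected"

result = cross_examine("The claim under test")
```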
12. Funding for Anthropic
Anthropic, the startup working to develop AI in a safer and more understandable manner, has recently raised $450 million in Series C funding. Spark Capital led the funding round, with significant participation from Google, Salesforce Ventures, Sound Ventures, and Zoom Ventures.
13. AI at the Heart of Windows 11
Microsoft continues to push AI to the forefront with their latest update to Windows 11. This iteration of their popular operating system includes the AI-powered Copilot tool, which is designed to assist users across a variety of applications.
14. The Great British Cloud
Labour for the Long-Term has proposed an ambitious plan to invest £11 billion in two publicly-owned tech firms, The Great British Cloud and BritGPT. The proposal, penned by Haydn Belfield, argues that the UK should prioritize investment in public cloud infrastructure and public foundation models, rather than attempting to compete in chip production.
15. Neeva's Next Steps
The consumer search product neeva.com, developed by Neeva, will be shutting down. In a statement released on their website, the company cited the difficulty of user acquisition as a contributing factor to the closure.
16. The EU AI Act
Technomancers.ai has criticized the EU AI Act in a recent report, expressing concerns about the law's implications for open source traditional machine learning models and generative AI. Lokesh Choudhary has countered these claims, arguing that the draft AI Act does not generally target the open-source ecosystem, and is more focused on Big Tech.
17. Meta's Hardware
Meta has announced MTIA, their first-generation AI inference accelerator, an ASIC designed for AI recommendation workloads. The unveiling came alongside an overview of use cases for their AI Research SuperCluster (RSC), a powerful assembly boasting 16,000 NVIDIA A100 GPUs.
18. Together and Open-Source Models
Together, an AI startup co-founded by Stanford professors Percy Liang and Chris Ré, has raised $20 million in funding to develop open-source generative AI models.
19. Apple's Own Tech and New Feature
In response to concerns over potential data leaks, Apple is restricting employee use of ChatGPT and building its own technology in-house. The company is also introducing a new 'Personal Voice' accessibility feature that can recreate a user's voice from roughly 15 minutes of recorded audio.
AI Risk
OpenAI has published a perspective on the governance of superintelligence, suggesting the need for a proactive approach given the existential risks involved. In a related vein, the AI pioneer Yoshua Bengio has written a piece on the potential emergence of rogue AIs: autonomous AI systems that could behave in ways catastrophically harmful to humanity. Bengio explores how rogue AIs could arise even without intent from their creators, offering thought-provoking observations such as the need to avoid designing survival instincts into AI systems.
Matthew Hutson also contributes to the discussion in his New Yorker article on the challenges of mitigating runaway AI, surveying a range of perspectives on the issue. Amidst the ongoing debate, Aidan Gomez, co-founder of Cohere, remarked on the extraordinary narrative of a "literal doomsday cult" steering tech regulation discussions.
Innovative Tools
The tool roundup starts with Adobe integrating Firefly into Photoshop to enable a generative fill tool with impressive results. Meanwhile, an open-source library called Pandas AI is adding generative AI capabilities to the Pandas data-processing library.
The final edition of the VoxSRC challenge is up and running, pushing the frontiers of speaker recognition in diverse, real-world scenarios.
Podcasts and Platforms
Two podcasts to look out for: the Machine Learning Street Talk podcast with its deep dives on technical topics, and the No Priors podcast, which features discussions with AI engineers, researchers, and founders. Google's new AI-powered Colab with Codey platform is also creating a buzz with its text-driven code generation, autocompletion, and chat features.
AI Commentary
Fergal Reid poses a compelling question in his recent Medium post: "Why are so many giants of AI getting GPTs so badly wrong?" Reid highlights the underestimation of large language models by notable figures in AI, such as Yann LeCun, Rodney Brooks, and Noam Chomsky, a trend he finds concerning given the ongoing AI safety debate.
Martin Goodson suggests that "The Alan Turing Institute has failed to develop modern AI in the UK." Goodson criticizes the Institute's lack of relevance to recent AI breakthroughs and calls for radical change.
Policy and Defense
Nathan Benaich argues in the Financial Times that European governments need to take defence innovation seriously. He highlights Europe's lag in defence innovation and advocates for greater political courage and investment.
Book Recommendation
Finally, the book recommendation for this week is "The Art of Doing Science and Engineering: Learning to Learn" by Richard Hamming. This book presents valuable insights garnered over a long career as a leading engineer and scientist.
Filtir - fact-checking AI outputs
Lastly, I'm working with colleagues on a project called Filtir with the goal of catching AI hallucinations. If you’re interested in finding out more, we’re on Discord here.
If you prefer video summaries, you can find a video version of the newsletter here: