Welcome to this week's AI news roundup! Here we go…
Are Emergent Abilities of Large Language Models a Mirage? Researchers from Stanford ask whether the claimed emergent abilities of large language models are real or an artifact of measurement. They argue that emergent abilities appear primarily because the researcher chooses a metric that nonlinearly or discontinuously deforms smoothly improving per-token error rates.
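To see the core argument, here is a minimal toy sketch (my own illustration, not the paper's code or data): per-token accuracy that improves smoothly with scale looks like a sudden jump once you score answers with an all-or-nothing exact-match metric over k tokens.

```python
import numpy as np

# Toy numbers, not the paper's data: smooth per-token accuracy versus the
# exact-match score you get when an answer only counts if all k tokens are right.
scales = np.array([1e7, 1e8, 1e9, 1e10, 1e11, 1e12])   # hypothetical parameter counts
per_token_acc = np.linspace(0.50, 0.99, len(scales))    # smooth, gradual improvement

k = 20                                                  # answer length in tokens
exact_match = per_token_acc ** k                        # the nonlinear metric deforms the curve

for s, p, em in zip(scales, per_token_acc, exact_match):
    print(f"{s:>8.0e} params | per-token acc {p:.2f} | exact match {em:.4f}")
# Per-token accuracy rises smoothly, but exact match sits near zero and then
# shoots up at the largest scales, which reads as an "emergent ability" on a plot.
```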
Mosaic Releases MPT-7B: Mosaic has released MPT-7B, which it describes as a new standard for open-source, commercially usable LLMs. The model cost approximately $200,000 to train on one trillion tokens of text and code over roughly 10 days.
Non-invasive Brain Reading: A study in Nature Neuroscience from the University of Texas at Austin introduces a non-invasive decoder that reconstructs continuous language from cortical semantic representations recorded using fMRI.
DeepMind's Bipedal Soccer Robot: DeepMind has released work called "Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning". They used deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one soccer game.
StarCoder, an LLM for Code: The BigCode collaboration has released StarCoder, a large language model for code that performs strongly on HumanEval and multilingual code benchmarks.
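For context, HumanEval results are typically reported as pass@k, computed with the standard unbiased estimator from the original HumanEval paper. The numbers below are hypothetical, not StarCoder's:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator used for HumanEval-style evaluation:
    n = samples generated per problem, c = samples that pass the unit tests."""
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Hypothetical numbers: 200 samples per problem, 37 of them pass the tests.
print(pass_at_k(200, 37, 1))   # ≈ 0.185
print(pass_at_k(200, 37, 10))  # higher, since any of 10 attempts may succeed
```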
Detecting Lying in LLMs: Recent work from Ariel University and CMU suggests that the internal state of a large language model knows when it is lying. The authors propose SAPLMA, a method that uses an LLM's hidden-layer activations to predict the truthfulness of generated statements.
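A rough sketch of the general idea, not the authors' code: collect hidden-layer activations for statements labeled true or false and fit a small classifier on them. The paper trains a small feedforward network; a logistic-regression probe and random synthetic activations are used here purely as stand-ins.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for hidden-layer activations; in the real setup these
# would be extracted from a chosen hidden layer of the LLM for each statement.
rng = np.random.default_rng(0)
n_statements, hidden_size = 1000, 4096                 # sizes are assumptions
activations = rng.normal(size=(n_statements, hidden_size))
is_true = rng.integers(0, 2, size=n_statements)        # 1 = true statement, 0 = false

X_train, X_test, y_train, y_test = train_test_split(activations, is_true, test_size=0.2)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", probe.score(X_test, y_test))  # ~0.5 here, since the data is random
```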
Distilling Step-by-Step: A pre-print entitled "Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes" presents a method that extracts LLM rationales via chain-of-thought prompting and uses them as additional supervision when training small models.
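At its core this is a multi-task objective: the small student model learns both to predict the label and to generate the teacher LLM's rationale. A schematic sketch with assumed batch-field names and loss weight, following the Hugging Face seq2seq convention of returning a `.loss` when labels are passed; it is not the authors' implementation:

```python
import torch

def distill_step_by_step_loss(student, batch, rationale_weight: float = 0.5):
    """Schematic multi-task loss for a seq2seq student: one cross-entropy term
    for predicting the label, one for generating the distilled rationale."""
    # Task 1: question -> label (inputs are prefixed so the model knows the task)
    label_loss = student(input_ids=batch["label_input_ids"],
                         labels=batch["label_target_ids"]).loss
    # Task 2: question -> rationale extracted from the teacher's chain of thought
    rationale_loss = student(input_ids=batch["rationale_input_ids"],
                             labels=batch["rationale_target_ids"]).loss
    return label_loss + rationale_weight * rationale_loss
```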
CEBRA for High-Dimensional Recordings: An article in Nature proposes CEBRA, short for "consistent embeddings of high-dimensional recordings using auxiliary variables". The method combines ideas from nonlinear ICA with contrastive learning.
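Very roughly, the objective is InfoNCE-style contrastive learning in which positive pairs are selected using auxiliary variables (such as behaviour or time) rather than data augmentations. A toy sketch of that idea, not the published implementation:

```python
import torch
import torch.nn.functional as F

def auxiliary_contrastive_loss(encoder, x, x_pos, x_neg, temperature: float = 1.0):
    """InfoNCE-style loss: x_pos shares an auxiliary variable (e.g. a similar
    behaviour or time point) with x; x_neg is a batch of samples that do not."""
    z = F.normalize(encoder(x), dim=-1)            # (batch, dim)
    z_pos = F.normalize(encoder(x_pos), dim=-1)    # (batch, dim)
    z_neg = F.normalize(encoder(x_neg), dim=-1)    # (batch, negatives, dim)

    pos_sim = (z * z_pos).sum(-1, keepdim=True) / temperature      # (batch, 1)
    neg_sim = torch.einsum("bd,bnd->bn", z, z_neg) / temperature   # (batch, negatives)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    target = torch.zeros(len(logits), dtype=torch.long)            # positive sits at index 0
    return F.cross_entropy(logits, target)
```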
Shap-E: Generating Conditional 3D Implicit Functions: OpenAI introduces Shap-E, a conditional generative model for 3D assets that can directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields.
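For readers unfamiliar with implicit 3D representations, a generic example (not Shap-E's actual architecture or parameterisation) is a small network mapping a 3D coordinate to density and colour; a generative model like Shap-E produces the parameters of such a function conditioned on text or an image, rather than producing geometry directly.

```python
import torch
import torch.nn as nn

class ImplicitField(nn.Module):
    """Generic implicit 3D representation: coordinate (x, y, z) -> density + RGB.
    Illustrative only; not Shap-E's architecture."""
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # 1 density channel + 3 colour channels
        )

    def forward(self, xyz: torch.Tensor):
        out = self.net(xyz)
        return out[..., :1], torch.sigmoid(out[..., 1:])   # density, rgb

# Querying the field at random points, as a renderer would along camera rays.
density, rgb = ImplicitField()(torch.rand(1024, 3))
```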
Generative AI at Work: A paper from Stanford and MIT presents what its authors describe as the first study of the impact of generative AI when deployed at scale in the workplace. They find that access to AI assistance increases customer-support agents' productivity by 14%, measured by the number of customer issues they can resolve per hour.
OpenAI Raises at a $27–29B Valuation: TechCrunch reports that OpenAI has raised a further $300M. Relatedly, The Information reported that OpenAI's losses doubled to $540 million last year.
Italy Lifts Ban on ChatGPT: In late April, Italy's data-protection authority lifted its ban on ChatGPT after OpenAI made privacy improvements. Also in late April, Mark Zuckerberg discussed how Meta wants to introduce AI agents to billions of people.
Further Misc. News: Video generator Runway has raised $100 million at a $1.5 billion valuation. LinkedIn now uses AI to let users draft messages to hiring managers, and Bing AI will see widespread deployment on Samsung Galaxy devices via the built-in SwiftKey keyboard. Waymo has doubled its autonomous ride-hailing area in Phoenix and is expanding in San Francisco.
Modular's Unified Inference Engine and Mojo: In its product keynote, Modular announced a unified AI inference engine together with Mojo, a new programming language aimed at AI developers. The project is headed by Chris Lattner, so we can expect excellent engineering.
Bakerian Medal: Computer vision researcher Andrew Zisserman has won the Royal Society's prestigious Bakerian Medal for his pioneering research contributions.
National AI Developments: Following talks with the CEOs of AI companies, the US White House announced new AI-related actions. Additionally, the UK announced £100M for a taskforce focused on establishing the UK as a world leader in foundation models and their applications across the economy, and on acting as a global standard-bearer for AI safety.
AI-Generated Content Farms on the Rise: A study from NewsGuard found that AI-generated content farms are on the rise. It identified 49 websites across seven languages that present as typical news sites but contain clues that their content was generated by AI.
AI Risk: AI safety researcher Paul Christiano gave a two-hour-long interview on the Bankless podcast, covering the challenges of the alignment problem and his estimates of the probability of AI takeover. This interview follows up on previous discussions with Eliezer Yudkowsky and Robin Hanson. AI pioneer Geoff Hinton also weighed in on the risks, speaking to the BBC and CNN. Another AI pioneer, Yann LeCun, offered a critique on Twitter of one aspect of the latter interview. Conjecture CEO Connor Leahy spoke with CNN about the lack of regulation for powerful AI systems.
The Costs of Caution: Kelsey Piper wrote a post arguing in favour of slowing down AI development while acknowledging that this comes at a cost.
Frameworks for AGI: Richard Ngo shared a framework for thinking about AGI called the t-AGI framework, which calls a system a t-AGI if, on most cognitive tasks, it beats most human experts who are given time t to perform the task.
AI Tools Roundup: Filtir announced a fact-checking API specifically for AI-generated text. Midjourney released version 5.1, featuring improved usability and coherence and fewer unwanted artifacts. A collection of models called LaMini-LM was released for non-commercial research use. Finally, Microsoft rolled out a preview version of an updated Microsoft Designer tool.
If you prefer video summaries, you can find a video version of the newsletter here:
Acknowledgement: GPT-4 helped me with editing.