Generative AI
April 19, 2023

Foundation Models: Software Engineering’s iPhone Moment

Dhanush Ram

We’re moving into an “AI-first company” era. Have you watched the movie Her? Remember how the protagonist falls in love with an AI? However good the movie was, such a thing seemed far from becoming common back then. Fast forward almost a decade, and you wouldn’t be a non-conformist to consider it possible, now that AI-generated text and speech are becoming increasingly difficult to distinguish from those produced by humans.

The launch of AI models like OpenAI’s ChatGPT and DALL-E 2 has led to a surge of AI tools that are driving the paradigm shift from Analytical AI to Generative AI, where AI tools aren’t merely software products that analyze data to produce reports, but software products capable of doing creative work, such as generating written content or designing images.

The early days of AI focused on mimicking the activity of the brain’s neurons in artificial neural networks. However, the evolution picked up pace in 2017, when models built on the transformer architecture, which reduced training time drastically, began to displace analytical AI models. The emergence of LLMs further fueled this shift, and they now serve as the foundation for the applications built on top of them, similar to how applications were built on AWS or Azure during the cloud platform shift of the 2010s.

This new era of AI has opened up a plethora of possibilities and has rapidly gained popularity. One example is Stable Diffusion, which has surpassed Ethereum and Bitcoin in GitHub stars. PyTorch and TensorFlow are also expected to follow suit soon.

The generative AI tech stack consists of three layers: applications, model, and infrastructure. The applications layer includes end-to-end apps or third-party APIs that integrate generative AI models into user-facing products. The model layer comprises the proprietary APIs or open-source checkpoints that power AI products, and it requires a hosting solution for deployment. The infrastructure layer includes the cloud platforms and hardware that run training and inference workloads for generative AI models.

The high-level tech stack for foundation models is generally as follows:

Foundation Models

Foundation models are large AI models that can be adapted to a wide range of downstream applications. These models are usually trained on petabyte-scale data and are typically used in zero-shot or few-shot scenarios. Some major players in this space include OpenAI, Hugging Face, and Stability AI. These models are often used in one of three settings:

  • Native AI model: These foundation models are used in zero-shot scenarios via API calls or playgrounds, where they can adapt to a broad range of tasks and their general performance is sufficient for the selected tasks.
  • Fine-tuned model: These models are generally smaller than native AI models and are trained on a fairly large amount of specialized training data suited to the task at hand. This is the most widely used approach in the industry, as it allows more control over the model outputs.
  • Edge models: AI models capable of running natively on a local computer or end device are often termed edge models. They are significantly smaller and far less computationally heavy, so that they fit within the device’s limitations. The size trade-off leads to lower accuracy than the other two.
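
To make the zero-shot and few-shot terms above concrete, here is a minimal Python sketch of how the two kinds of prompt differ. It only builds the prompt text and calls no model; the function names, the "Input:/Output:" layout, and the sample reviews are illustrative assumptions, not any provider's required format.

```python
from typing import List, Tuple


def zero_shot_prompt(instruction: str, query: str) -> str:
    """Task description plus the input; no worked examples."""
    return f"{instruction}\n\nInput: {query}\nOutput:"


def few_shot_prompt(instruction: str,
                    examples: List[Tuple[str, str]],
                    query: str) -> str:
    """Same task, with a handful of labelled examples placed before the query."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{instruction}\n\n{shots}\n\nInput: {query}\nOutput:"


# Hypothetical task and examples, for illustration only.
instruction = "Classify the sentiment of the review as positive or negative."
examples = [
    ("Great battery life.", "positive"),
    ("Stopped working in a week.", "negative"),
]

zero = zero_shot_prompt(instruction, "The screen is gorgeous.")
few = few_shot_prompt(instruction, examples, "The screen is gorgeous.")
print(few)
```

Either string would then be sent to a hosted model. A fine-tuned model, by contrast, learns from such examples through training rather than receiving them in the prompt.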

Multimodal AI models have made significant progress across various domains, including text, code generation, images, speech synthesis, video, and 3D models. Text-based models are currently being used for short- and medium-form writing, but as the models improve, we can expect higher-quality outputs and better vertical-specific tuning. Code generation models like GitHub Copilot have the potential to greatly improve developer productivity. Image models have gone viral, with different aesthetic styles and techniques for editing and modifying generated images. Speech synthesis and video/3D models are also rapidly improving and have significant potential for high-end applications like films and games. Additionally, there is significant research and development happening in other fields such as audio, music, biology, and chemistry.

These advancements have resulted in diverse examples of highly lucrative use cases of such foundational models. For instance, AI models can be used for audio replication, generation of protein sequences, and even gaming. To support these AI models, infrastructure is required in the form of foundational and fine-tuned models, middleware products that assist the data pipeline and model deployment, and application layers that aim to solve widely prevalent problems across user segments.

As more applications are developed using foundation models, we are witnessing an upsurge in innovation in dev tools that enable faster and more efficient development of these application layers. A remarkable shift is underway: software (Software 1.0) has already taken over the world, and AI (Software 2.0) is taking it to the next level by “eating” software.

Developer’s Copilot Era

Developer tools have a tremendous opportunity to be transformed by AI, and Copilot represents a significant paradigm shift for software developers. While Copilot has been a massive success, offering AI-powered code suggestions and boosting productivity for millions of engineers, it may be just the first step in a larger transformation of how software engineers work.

With the rise of generative AI, tools for tasks such as code generation and autocomplete, SQL generation, automated code review and code quality improvement, and documentation generation have the potential to disrupt software development across tasks, technologies, and frameworks by significantly lowering the resources that go into building a software product. There is a wide range of opportunities for exploration and a multitude of ways to hook into developers’ workflows.

Despite the opportunities presented by AI-powered developer tools, building such DevTools comes with its own set of challenges. One of the primary challenges is obtaining labeled data for training the models. While sample code is abundant, carefully labeled code is scarce and valuable, which means that companies spend heavily on data labeling for code-based models. Other challenges include inference and training costs, data copyright concerns, and critical UI challenges. Solving these challenges can create a strong moat for companies trying to build disruptive developer tools. Companies that can train models with less labeled data, deploy them efficiently, overcome data copyright concerns, and create intuitive user interfaces will be well-positioned to dominate the developer tools market.

Thanks to language models that make it easier for teams to create intelligent features, an increasing number of products, from new startups and established players alike, will integrate or build features with LLMs, with a focus on enhancing the developer experience and workflow. We can’t wait to collaborate with engineers who are developing the tools that will enable others to take advantage of LLMs, and we believe this will have a transformative impact on the field of software development.

As the world of software continues to evolve, we firmly believe that an orchestration layer (tooling) will become an essential component of the software development process. With the increasing use of large language models (LLMs) in software development, there is a growing need for a platform that can effectively manage the use of these models in real-time code writing and deployment.

We believe that MLOps is on the verge of a paradigm shift towards Large Language Model Operations (LLMOps), a new set of practices and processes for developing, deploying, maintaining, and optimizing large language models. LLMOps offers advantages such as task annotation, real-time optimization and adjustment based on user input data, automatic handling of embeddings, real-time monitoring of performance data, and one-click fine-tuning based on previously annotated real-use data.

Working with LLMs is a challenging and constantly evolving process that demands effective state management. Prompts change often, so teams need tools for managing and iterating on them efficiently. Effective production use of LLMs also calls for a hybrid pipeline that combines APIs, databases, and filters, along with quick feedback loops, with or without human intervention. A state management tool is therefore essential when working with LLMs in production.
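
As a rough illustration of that idea, the Python sketch below wires together a versioned prompt store (the state), a lookup table standing in for a database, a stubbed model call standing in for an API, and an output filter, then records feedback per prompt version. Every name here (`PromptStore`, `run_pipeline`, `fake_llm`) is hypothetical, and the stub simply echoes part of the prompt; a real system would call a hosted model and a real datastore.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class PromptStore:
    """Keeps every prompt version plus the feedback scores recorded against it."""
    versions: List[str] = field(default_factory=list)
    feedback: Dict[int, List[int]] = field(default_factory=dict)

    def add(self, template: str) -> int:
        self.versions.append(template)
        version = len(self.versions) - 1
        self.feedback[version] = []
        return version

    def record(self, version: int, score: int) -> None:
        self.feedback[version].append(score)

    def best(self) -> int:
        """Version with the highest mean score; unrated versions count as 0."""
        def mean(v: int) -> float:
            scores = self.feedback[v]
            return sum(scores) / len(scores) if scores else 0.0
        return max(range(len(self.versions)), key=mean)


def fake_llm(prompt: str) -> str:
    """Stand-in for a hosted model API: echoes the last line of the prompt."""
    return "ANSWER: " + prompt.splitlines()[-1]


def run_pipeline(store: PromptStore, version: int, question: str,
                 docs: Dict[str, str], llm: Callable[[str], str],
                 banned: Set[str]) -> str:
    # Database step: naive keyword lookup for supporting context.
    context = [text for key, text in docs.items() if key in question.lower()]
    # API step: fill the chosen prompt version and call the model.
    prompt = store.versions[version].format(
        context=" ".join(context), question=question)
    answer = llm(prompt)
    # Filter step: block answers that contain banned words.
    if any(word in answer.lower() for word in banned):
        return "[filtered]"
    return answer


store = PromptStore()
v0 = store.add("Answer briefly.\nContext: {context}\nQuestion: {question}")
v1 = store.add("Use only the context below.\nContext: {context}\nQuestion: {question}")
docs = {"refund": "Refunds are issued within 7 days."}

answer = run_pipeline(store, v1, "How does a refund work?", docs,
                      fake_llm, banned={"password"})
store.record(v0, 2)  # feedback scores, from a human or an automated check
store.record(v1, 5)
print(answer, "| best prompt version:", store.best())
```

Keeping prompts, feedback, and pipeline steps as explicit state is what lets a team iterate on a prompt and see which version is doing better.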

The inference cost of LLMs differs significantly from that of traditional ML models, and inference may require a chain of models to ensure optimal results. Therefore, while some MLOps tools may look similar, they may not be appropriate for fine-tuning and deploying LLMs. In general, LLMOps provides a more transparent and easily monitored application management process, helping team members understand how the application operates better than traditional MLOps approaches do.
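
The chaining point can be sketched in a few lines of Python. Each step below is a plain function standing in for a model, and usage is tallied per step with a whitespace word count as a crude token proxy. The step functions, the sample ticket, and the counting rule are all assumptions made for illustration; real tokenizers and pricing differ.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[str], str]]


def count_tokens(text: str) -> int:
    """Crude proxy: whitespace-separated words. Real tokenizers differ."""
    return len(text.split())


def run_chain(steps: List[Step], text: str) -> Tuple[str, List[Tuple[str, int]]]:
    """Feed each step's output into the next and tally tokens in plus out."""
    usage = []
    for name, model in steps:
        output = model(text)
        usage.append((name, count_tokens(text) + count_tokens(output)))
        text = output
    return text, usage


def summarize(text: str) -> str:
    """Stub 'model': keeps the first five words."""
    return " ".join(text.split()[:5])


def classify(text: str) -> str:
    """Stub 'model': routes by keyword."""
    return "billing" if "invoice" in text.lower() else "other"


steps = [("summarize", summarize), ("classify", classify)]
ticket = "The invoice I received last month was charged twice by mistake"
label, usage = run_chain(steps, ticket)
total = sum(tokens for _, tokens in usage)
print(label, usage, total)
```

Because every link in the chain consumes and produces tokens, cost and latency add up across steps, which is one reason per-step monitoring matters more here than in a single-model deployment.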

Furthermore, frameworks for prompt engineering could be another area of opportunity in the LLMOps space. Getting good results from LLMs often involves crafting effective prompts that generate the desired outputs, and tools that aid in prompt engineering can accelerate development and improve the quality of the results. We anticipate the emergence of more platforms for fine-tuning and deploying LLMs, including options for no-code and code-first development, as well as supporting tools that simplify and standardize various aspects of the workflow.

The Generative AI landscape is undergoing a significant platform shift, with massive investments pouring into this domain. At Speciale Invest, we are deeply committed to exploring and understanding these emerging technologies, predicting their capabilities, and supporting ambitious founders who are driving innovation in this field. If you are building applications, models, tooling, or infra in the space of foundation models and would like to connect, please reach out to us at dhanush.ram@specialeinvest.com.