Exploring Generative AI

Generative AI and particularly LLMs (Large Language Models) have exploded
into the public consciousness. Like many software developers I am intrigued
by the possibilities, but unsure what exactly it will mean for our profession
in the long run. I have now taken on a role in Thoughtworks to coordinate our
work on how this technology will affect software delivery practices.
I’ll be posting various memos here to describe what my colleagues and I are
learning and thinking.

First Memo: The toolchain

26 July 2023

Let’s start with the toolchain. Whenever there is a new area with still evolving patterns and technology, I try to develop a mental model of how things fit together. It helps deal with the wave of information coming at me. What types of problems are being solved in the space? What are the common types of puzzle pieces needed to solve those problems? How are things fitting together?

The following are the dimensions of my current mental model of tools that use LLMs (Large Language Models) to support coding.

Assisted tasks

  • Finding information faster, and in context
  • Generating code
  • “Reasoning” about code (Explaining code, or problems in the code)
  • Transforming code into something else (e.g. documentation text or diagram)

These are the types of tasks I see most commonly tackled when it comes to coding assistance, although there would be a lot more if I expanded the scope to other tasks in the software delivery lifecycle.

Interaction modes

I’ve seen three main types of interaction modes:

  • Chat interfaces
  • In-line assistance, i.e. typing in a code editor
  • CLI

Prompt composition

The quality of the prompt obviously has a big impact on the usefulness of the tools, in combination with the suitability of the LLM used in the backend. Prompt engineering does not have to be left purely to the user though; many tools apply prompting techniques for you in the backend.

  • User creates the prompt from scratch
  • Tool composes prompt from user input and additional context (e.g. open files, a set of reusable context snippets, or additional questions to the user); see the sketch after this list
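
To make the second style more concrete, here is a minimal sketch (my own illustration, not taken from any particular product) of a tool composing a prompt from the user's question plus the currently open editor file; the file path, the question, and the complete() call are all hypothetical placeholders:

```python
# Hypothetical illustration: compose a prompt from user input plus editor context.
def compose_prompt(user_question: str, open_file_path: str) -> str:
    with open(open_file_path) as f:
        file_contents = f.read()
    return (
        "You are a coding assistant.\n"
        "The developer currently has this file open:\n"
        f"{file_contents}\n"
        f"Question: {user_question}\n"
        "Answer with reference to the code above."
    )

prompt = compose_prompt(
    "Why does parse_order() return None for empty input?",  # user input
    "src/orders.py",                                         # context picked up from the editor
)
# answer = complete(prompt)  # hypothetical call to whatever LLM API the tool uses
```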

Properties of the model

  • What the model was trained with
    • Was it trained specifically with code, and coding tasks? Which languages?
    • When was it trained, i.e. how current is its information?
  • Size of the model (it is still much debated in what way this matters, though, and what a “good” size is for a specific task like coding)
  • Size of the context window supported by the model, which is basically the number of tokens it can take as the prompt (see the token-counting sketch after this list)
  • What filters have been added to the model, or the backend where it is hosted
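
As an aside on context windows: the limit is counted in tokens, not characters or lines. Here is a minimal sketch, assuming the tiktoken library and its cl100k_base encoding (used by OpenAI's recent GPT models), of checking whether a file would still fit into a context window before adding it to a prompt; the 8,192-token limit and the file name are just illustrative:

```python
import tiktoken

CONTEXT_WINDOW = 8192  # illustrative limit; the real number depends on the model

def fits_in_context(text: str, reserved_for_answer: int = 1024) -> bool:
    # Count tokens the way OpenAI's recent models do, and leave room for the reply.
    encoding = tiktoken.get_encoding("cl100k_base")
    return len(encoding.encode(text)) + reserved_for_answer <= CONTEXT_WINDOW

with open("src/orders.py") as f:  # hypothetical file, as in the sketch above
    print(fits_in_context(f.read()))
```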

Origin and hosting

  • Commercial products, with LLM APIs hosted by the product company
  • Open source tools, connecting to LLM API services
  • Self-built tools, connecting to LLM API services
  • Self-built tools connecting to a fine-tuned, self-hosted LLM API (a minimal sketch of these last two options follows this list)
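
The last two options often differ only in where the self-built tool points its API client. A minimal sketch, assuming the openai Python client (v1.x) and a self-hosted model exposed behind an OpenAI-compatible endpoint; the internal URL and model names are placeholders, not recommendations:

```python
from openai import OpenAI

# Option: LLM API service hosted by a provider (reads OPENAI_API_KEY from the environment)
client = OpenAI()

# Option: fine-tuned, self-hosted model behind an OpenAI-compatible API
# client = OpenAI(base_url="http://llm.internal:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # or the name of the self-hosted / fine-tuned model
    messages=[{"role": "user", "content": "Explain what the regex ^\\d{4}-\\d{2}$ matches."}],
)
print(response.choices[0].message.content)
```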

Examples

Here are some common examples of tools in the space, and how they fit into this model. (The list is not an endorsement of these tools, or a dismissal of other tools; it is just meant to help illustrate the dimensions.)

Tool | Tasks | Interaction | Prompt composition | Model | Origin / Hosting
GitHub Copilot | Code generation | In-line assistance | Composed by IDE extension | Trained with code, vulnerability filters | Commercial
GitHub Copilot Chat | All of them | Chat | Composed of user chat + open files | Trained with code | Commercial
ChatGPT | All of them | Chat | All done by user | Trained with code | Commercial
GPT Engineer | Code generation | CLI | Prompt composed based on user input | Choice of OpenAI models | Open source, connecting to OpenAI API
“Team AIs” | All of them | Web UI | Prompt composed based on user input and use case | Most commonly OpenAI’s GPT models | Maintained by a team for their use cases, connecting to OpenAI APIs
Meta’s CodeCompose | Code generation | In-line assistance | Composed by editor extension | Model fine-tuned on internal use cases and codebases | Self-hosted

What are people using today, and what’s next?

Today, people are most commonly using combinations of direct chat interaction (e.g. via ChatGPT or Copilot Chat) with coding assistance in the code editor (e.g. via GitHub Copilot or Tabnine). In-line assistance in the context of an editor is probably the most mature and effective way to use LLMs for coding assistance today, compared to other approaches. It supports the developer in their natural workflow with small steps. Smaller steps make it easier to follow along and review the quality more diligently, and it’s easy to just move on in the cases where it does not work.

There is a lot of experimentation going on in the open source world with tooling that provides prompt composition to generate larger pieces of code (e.g. GPT Engineer, Aider). I’ve seen similar usage of small prompt composition applications tuned by teams for their specific use cases, e.g. by combining a reusable architecture and tech stack definition with user stories to generate task plans or test code, similar to what my colleague Xu Hao is describing here. Prompt composition applications like this are most commonly used with OpenAI’s models today, as they are the most easily available and relatively powerful. Experiments are moving more and more towards open source models and models hosted by the big hyperscalers though, as people look for more control over their data.

As a next step forward, beyond advanced prompt composition, people are putting lots of hopes for future improvements into the model component. Do larger models, or smaller but more specifically trained models work better for coding assistance? Will models with larger context windows enable us to feed them with more code to reason about the quality and architecture of larger parts of our codebases? At what scale does it pay off to fine-tune a model with your organization’s code? What will happen in the space of open source models? Questions for a future memo.

Thanks to Kiran Prakash for his input.



