Highlights:

  • Code Llama by Meta was created by training the original Llama 2 neural network on a vast dataset of code samples and related files.
  • Code Llama has three variants: a standard edition and two specialized options.

Code Llama is an open-source large language model that can generate code snippets and explain how they work, according to a release from Meta Platforms Inc.

The model is free to use commercially.

Code Llama is based on Llama 2, another open-source language model that Meta released just last month. Llama 2 serves a broader range of purposes: in addition to generating code, it can summarize documents, translate text, and answer simple questions.

Llama 2 is one of the most sophisticated open-source language models available. It outperformed several other freely available neural networks in a series of benchmark tests conducted by Meta researchers. Code Llama, the model Meta introduced today, is a specialized variant of Llama 2 with vastly improved programming capabilities.

Meta created Code Llama by training the original Llama 2 neural network on a sizable dataset of code samples and “code-related” files. This training dataset comprised 500 billion tokens, the company says. In artificial intelligence projects, a token is a fundamental unit of information that typically consists of a few letters or numbers.
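To make the notion of a token concrete, the hedged sketch below splits a short code snippet into tokens using the Hugging Face transformers library. The checkpoint name is the publicly listed Code Llama 7B model on Hugging Face; its availability in your environment is an assumption, since Meta gates access behind a license agreement.

```python
# A minimal sketch of what a "token" looks like in practice, using the
# Hugging Face transformers library. The checkpoint name below is the
# publicly listed Code Llama 7B model; treat its availability as an
# assumption, since Meta gates access behind a license agreement.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")

snippet = "def add(a, b):\n    return a + b"
tokens = tokenizer.tokenize(snippet)

# Each token is a short text fragment: a keyword, part of an identifier,
# punctuation, or a whitespace marker.
print(tokens)
print(f"{len(tokens)} tokens for {len(snippet)} characters")
```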

There are three versions of Code Llama: a standard edition and two specialized versions.

The first specialized version is optimized for generating software in the Python programming language. It was trained on a dataset containing the equivalent of 100 billion tokens of Python code.

The other specialized edition is called Code Llama – Instruct. It is designed to produce code in response to natural-language instructions from users. The model also explains the functionality of the code it produces.
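A hedged sketch of that workflow appears below, using the transformers text-generation pipeline. The checkpoint name and the [INST] … [/INST] chat format follow Meta’s published Llama 2 conventions; both are assumptions rather than details from this article.

```python
# A hedged sketch of prompting Code Llama - Instruct through the
# transformers text-generation pipeline. The checkpoint name and the
# [INST] ... [/INST] chat format follow Meta's published Llama 2
# conventions; both are assumptions, not details from this article.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="codellama/CodeLlama-7b-Instruct-hf",
)

prompt = "[INST] Write a Python function that reverses a string. [/INST]"
result = generator(prompt, max_new_tokens=128, do_sample=False)

# The pipeline returns the prompt plus the model's code and explanation.
print(result[0]["generated_text"])
```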

Each of the three Code Llama editions comes in three sizes, with 7 billion, 13 billion, and 34 billion parameters. Parameters are the configuration settings that determine how an AI turns input data into decisions.
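For a sense of what a parameter count means in practice, the sketch below loads a checkpoint and tallies its trainable weights with PyTorch. Loading the full model needs tens of gigabytes of memory, so this is illustrative, and the checkpoint name is again an assumption.

```python
# A quick sketch of what "7 billion parameters" means: load a checkpoint
# and count its trainable weights. The checkpoint name is an assumption,
# and loading the full model requires substantial memory.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e9:.1f} billion parameters")
```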

According to Meta, the Code Llama variants with 7 billion and 13 billion parameters are quicker than the 34-billion-parameter edition. That speed makes them more suitable for latency-sensitive tasks. A business could, for instance, use them to build a development tool that generates code autocomplete suggestions for programmers in real time.
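The sketch below illustrates that autocomplete use case: give the smaller 7B model a code prefix and ask for a short continuation. The checkpoint name and generation settings are assumptions chosen for illustration, not Meta’s recommended configuration.

```python
# A minimal sketch of the latency-sensitive autocomplete use case: give
# the smaller 7B model a code prefix and ask for a short continuation.
# The checkpoint name and generation settings are illustrative
# assumptions, not Meta's recommended configuration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prefix = "def fibonacci(n):\n    "
inputs = tokenizer(prefix, return_tensors="pt")

# Cap the completion length so the round trip stays fast enough for an
# editor plugin.
output = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```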

Code Llama’s 34-billion-parameter edition trades speed for accuracy. It should therefore be more helpful in scenarios where maximizing response quality is the top priority.

Another significant feature that distinguishes Code Llama from Llama 2, the general-purpose language model on which it is based, is its context window.

The amount of data users can include in a single prompt depends on the context window of an AI. In the case of Llama 2, that amount of data is 4,096 tokens. The maximum context window for Code Llama is 100,000 tokens.
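The sketch below shows why that difference matters: it estimates whether a large source file fits in a single prompt. The four-characters-per-token ratio is a common rule of thumb, not an exact property of Code Llama’s tokenizer.

```python
# A rough sketch of why the larger context window matters: estimate
# whether a large source file fits in one prompt. The four-characters-
# per-token ratio is a common rule of thumb, not an exact property of
# Code Llama's tokenizer.
CODE_LLAMA_WINDOW = 100_000  # tokens, per Meta
LLAMA_2_WINDOW = 4_096       # tokens

def fits(text: str, window: int, chars_per_token: float = 4.0) -> bool:
    """Estimate whether `text` fits within a model's context window."""
    return len(text) / chars_per_token <= window

source = "print('hello')\n" * 5_000  # stand-in for a large codebase file
print("Fits in Llama 2:", fits(source, LLAMA_2_WINDOW))        # False
print("Fits in Code Llama:", fits(source, CODE_LLAMA_WINDOW))  # True
```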

The larger context window will allow the model to perform certain programming tasks more efficiently than Llama 2. According to Meta, Code Llama will be better at debugging software errors. The company also believes the feature can help developers refine AI-generated code.

Meta’s researchers wrote in a blog post, “For example, users can provide the model with more context from their codebase to make the generations more relevant.”

Meta assessed Code Llama’s performance using two well-known coding benchmarks, HumanEval and Mostly Basic Python Problems (MBPP). The company claims the model outperformed a number of cutting-edge alternatives from the open-source ecosystem. On some tasks, it also performed better than GPT-3.5, the predecessor to OpenAI LP’s flagship GPT-4 language model.

Meta’s researchers wrote, “Our benchmark testing showed that Code Llama performed better than open-source, code-specific LLMs and outperformed Llama 2. Code Llama 34B, for example, scored 53.7% on HumanEval and 56.2% on MBPP, the highest compared with other state-of-the-art open solutions.”
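For readers unfamiliar with how benchmarks like HumanEval and MBPP are scored, the simplified sketch below mimics the core idea: execute the model’s generated solution against the problem’s unit tests, and count a pass only if every assertion holds. The problem and solution here are illustrative stand-ins, not items from the real benchmarks.

```python
# A simplified sketch of how HumanEval-style benchmarks score a model:
# run the generated solution against the problem's unit tests and count
# a pass only if every assertion holds. The problem and solution below
# are illustrative stand-ins, not items from the real benchmarks.
generated_code = """
def add(a, b):
    return a + b
"""

test_code = """
assert add(1, 2) == 3
assert add(-1, 1) == 0
"""

namespace = {}
try:
    exec(generated_code, namespace)  # define the generated function
    exec(test_code, namespace)       # run the benchmark's assertions
    passed = True
except Exception:
    passed = False

print("pass" if passed else "fail")
```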