AI Glossary for Java Developers




Most Java developers encounter problems when learning how to integrate AI into their applications using Spring AI, LangChain4j, or another library. AI introduces many new terms, acronyms, and techniques that you must understand to build a good system.

I ran into the same issue when I started learning about AI.

In this article, I explain the most important terms and acronyms for Java developers who are not already AI experts.

Generative AI

Generative AI refers to models that are trained on huge data sets to learn patterns, which they then use to generate content such as text, images, audio, and video.

LLM – Large Language Model

LLMs are a specific kind of generative AI model that specializes in text-based data.

NLP – Natural Language Processing

NLP aims to enable programs to understand and generate human language. It’s a subfield of computer science that uses linguistics, statistical modeling, and machine learning.

AI Prompt

A prompt is the input provided to an AI system to achieve a desired result. This can be any form of question, instruction, or information.

You can often tweak the result provided by an AI system by rephrasing your prompt or by providing additional information.

Zero-shot prompting / direct prompting

Zero-shot or direct prompting means giving an AI system a prompt without any examples and letting it perform a task its model hasn’t been specifically trained for. The model then relies on the patterns it learned during training to create the result.

This is the simplest form of a prompt.
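To make this more concrete, here is a minimal sketch of how you could assemble a zero-shot prompt in plain Java. The class name, the build method, and the "Input:/Output:" layout are my own assumptions for this illustration and not part of any library. Sending the prompt to a model with Spring AI, LangChain4j, or another library is left out on purpose.

```java
// Hypothetical helper for illustration only; not part of any AI library.
class ZeroShotPrompt {

    // Combines an instruction and the input. The prompt contains no examples.
    static String build(String instruction, String input) {
        return instruction + "\n\nInput: " + input + "\nOutput:";
    }

    public static void main(String[] args) {
        String prompt = build(
                "Classify the sentiment of the review as POSITIVE or NEGATIVE.",
                "The new release fixed all my performance problems.");
        // You would send this string to the model using the library of your choice.
        System.out.println(prompt);
    }
}
```

The model only gets the instruction and the input. It has to figure out the expected format of the answer on its own.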

One-shot prompting / few-shot prompting / multi-shot prompting

A prompt with one or more examples that lets an AI system perform a task its model hasn’t been specifically trained for is called a one-shot, few-shot, or multi-shot prompt. The model uses the provided examples to infer the structure of the expected result and relies on the patterns it has learned to create it.
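The following sketch shows one way to put the examples into the prompt. As before, the class, the Example record, and the prompt layout are assumptions I made for this illustration. The call that sends the prompt to a model is omitted.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of any AI library.
class FewShotPrompt {

    // One example pair that shows the model the expected input and output.
    record Example(String input, String output) {}

    // Puts the instruction first, then all examples, then the actual input.
    static String build(String instruction, List<Example> examples, String input) {
        StringBuilder prompt = new StringBuilder(instruction).append("\n\n");
        for (Example example : examples) {
            prompt.append("Input: ").append(example.input()).append('\n')
                  .append("Output: ").append(example.output()).append("\n\n");
        }
        return prompt.append("Input: ").append(input).append("\nOutput:").toString();
    }

    public static void main(String[] args) {
        List<Example> examples = List.of(
                new Example("I love the new query API.", "POSITIVE"),
                new Example("The upgrade broke my build.", "NEGATIVE"));
        System.out.println(build(
                "Classify the sentiment of the review.",
                examples,
                "The documentation is excellent."));
    }
}
```

With a single example in the list, this is a one-shot prompt. With several examples, it becomes a few-shot or multi-shot prompt.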

RAG – Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) retrieves facts from a trusted source that was not part of the training data and provides them to the LLM together with the prompt. This is often used to improve the quality of the generated result by ensuring that the LLM uses the most current and reliable information.

Because the LLM can base its answer on the provided facts, RAG also reduces the risk of hallucinations.
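Here is a heavily simplified sketch of the two steps: retrieving a matching document and adding it to the prompt. Please note the assumptions. I use a naive word-overlap score as a stand-in for the embedding-based search that real RAG setups typically use. The class, its methods, the sample documents, and the prompt wording are all made up for this example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified RAG flow; not part of any AI library.
class SimpleRag {

    // Counts the distinct question words that also appear in the document.
    // This is a naive stand-in for a real similarity search.
    static long overlap(String question, String document) {
        Set<String> docWords =
                new HashSet<>(Arrays.asList(document.toLowerCase().split("\\W+")));
        return Arrays.stream(question.toLowerCase().split("\\W+"))
                .distinct()
                .filter(docWords::contains)
                .count();
    }

    // Retrieval step: picks the document with the highest overlap score.
    static String retrieve(String question, List<String> documents) {
        return documents.stream()
                .max(Comparator.comparingLong((String doc) -> overlap(question, doc)))
                .orElseThrow();
    }

    // Augmentation step: adds the retrieved facts to the prompt.
    static String augment(String question, String context) {
        return "Answer the question using only the context below.\n\n"
                + "Context: " + context + "\n\n"
                + "Question: " + question;
    }

    public static void main(String[] args) {
        // Made-up documents that stand in for your trusted data source.
        List<String> documents = List.of(
                "The ACME order service runs on port 8081.",
                "The ACME billing service is maintained by the payments team.");
        String question = "Which port does the order service use?";
        String context = retrieve(question, documents);
        // The generation step, sending this prompt to the LLM, is omitted.
        System.out.println(augment(question, context));
    }
}
```

The LLM then generates its answer based on the prompt that contains the retrieved context.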

Fine-tuning

Fine-tuning is a technique for adapting an already trained LLM to a specific task. It continues the training of that model on a smaller, task-specific dataset.

Embeddings

Embeddings are numerical vector representations of data, such as words, images, or documents. Similar data is mapped to vectors that are close to each other, which enables a model to compare data and find similar items.
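A common way to compare two embeddings is cosine similarity. The following sketch calculates it for a few made-up vectors. The values are invented for this illustration and were not produced by an embedding model, and the class name is my own choice.

```java
// Illustration only; the vectors below are invented, not real embeddings.
class EmbeddingSimilarity {

    // Cosine similarity: values close to 1.0 mean the vectors point in a similar
    // direction, values close to 0.0 mean they have little in common.
    // Returns NaN if one of the vectors contains only zeros.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Made-up 3-dimensional vectors; real embeddings typically have
        // hundreds or thousands of dimensions.
        double[] cat = {0.9, 0.8, 0.1};
        double[] kitten = {0.85, 0.75, 0.2};
        double[] car = {0.1, 0.2, 0.95};
        System.out.println("cat vs. kitten: " + cosineSimilarity(cat, kitten));
        System.out.println("cat vs. car:    " + cosineSimilarity(cat, car));
    }
}
```

With these sample values, "cat" and "kitten" get a much higher similarity score than "cat" and "car".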

Hallucination

A hallucination is false information that an AI system presents as fact.

You can use RAG to reduce the risk of hallucinations.

Summary

There are many new terms, concepts, and abbreviations you need to know when learning how to integrate AI into your application.

I tried explaining the most important ones in this article. If you want me to add something, please post a comment below.