What is AI?

The term "Artificial Intelligence" (AI) covers a wide range of software capabilities that enable applications to exhibit human-like behavior. AI has been around for many years, and its definition has varied as the technology and use cases associated with it have evolved. In today's technological landscape, AI solutions are built on machine learning models that encapsulate semantic relationships found in huge quantities of data, enabling applications to appear to interpret input in various formats, reason over the input data, and generate appropriate responses and predictions.

Common AI capabilities that developers can integrate into a software application include:

Capability Description
Generative AI and agents
Generative AI is based on large language models (LLMs) that can generate original responses to natural language prompts, for example to power interactive chat applications and AI-assisted content creation. Increasingly, generative AI is used as the foundation for agentic AI solutions, in which AI agents combine LLMs with focused instructions that define task responsibilities and with tools that the agent can use to find relevant knowledge and automate the tasks for which it is responsible.
Natural language processing
Modern LLMs evolved from a well-established area within AI called natural language processing (NLP). NLP makes use of statistical and semantic models to make sense of language in documents, emails, social media messages, and other sources of text. While many common NLP tasks can now be performed by generative AI LLMs, there are some specialized uses of NLP, particularly within the realm of text analysis, that can benefit from statistical NLP techniques built on term-frequency algorithms and from task-specific models for text classification, sentiment analysis, and summarization.
Computer speech
The ability to recognize and synthesize speech enables AI apps and agents to engage more naturally with users through voice input and spoken responses. Computer speech is another well-established area of AI, and recent advances enable it to handle complex conversational interactions, coping with background noise, interruptions, and multiple languages and accents. Beyond interactive conversational solutions, computer speech is an integral component of AI solutions for transcription and analysis of live or recorded speech, and for the synthesis of speech from text for simultaneous translation or "read aloud" interfaces.
Computer vision
Computer vision refers to the ability of AI applications and agents to accept, interpret, and process visual input from images, videos, and live camera streams. For example, an automated checkout in a grocery store might use computer vision to identify which products a customer has in their shopping basket, eliminating the need to scan a barcode or manually enter the product and quantity. Increasingly, generative AI models are multimodal, and can not only process visual input, but also generate visual output in the form of images and videos.
Information extraction
The ability to combine generative AI models for language reasoning, natural language techniques for document understanding, and computer vision and speech for media analysis enables the development of AI solutions that can extract key information from documents, forms, images, recordings, and other kinds of content. For example, an automated expense claims processing application might extract purchase dates, individual line item details, and total costs from a scanned receipt.
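The natural language processing entry above mentions statistical techniques built on term-frequency algorithms. As a minimal sketch of that idea, the following Python example scores terms with TF-IDF (term frequency multiplied by inverse document frequency). TF-IDF is an assumption chosen here as a representative technique, since the text does not name a specific algorithm; the function names (tokenize, tf_idf, top_terms) and the sample documents are hypothetical and for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: score} dictionary per document.

    TF  = occurrences of the term / total tokens in the document
    IDF = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]

    # Count how many documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(documents)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(score.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical sample documents, invented for this illustration.
docs = [
    "The delivery was late and the package was damaged.",
    "The support team was friendly and the refund was fast.",
    "The package arrived early and the delivery was smooth.",
]

scores = tf_idf(docs)
for doc, score in zip(docs, scores):
    print(top_terms(score), "<-", doc)
```

Words that appear in every document (such as "the") receive a score of zero, while words that are distinctive to one document rank highest. This is why term-frequency techniques can still be useful for lightweight text analysis without calling an LLM.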

Determining the specific AI capabilities you want to include in your application can help you identify the most appropriate AI services that you'll need to provision, configure, and use in your solution.