Cloud, Analytics & Data Glossary

A

Anomaly Detection

Anomaly detection involves identifying rare events which can raise suspicions by deviating significantly from other observations, and may indicate issues such as...
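
As a minimal sketch of the idea, the example below flags values that sit far from the mean of a small sample, measured in standard deviations (a z-score). The function name, the threshold of 2.0, and the sample readings are illustrative assumptions, not a recommended production method.

```python
from statistics import mean, stdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 95]
anomalies = find_anomalies(readings)
print(anomalies)
```

Real systems often use more robust statistics or learned models, because a single extreme value also inflates the mean and standard deviation it is compared against.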

Artificial Intelligence (AI)

Artificial intelligence (AI), a subfield of computer science, is the study of designing and implementing intelligent machines that can complete jobs that are usually done...

API

An Application Programming Interface (API) is a well-defined interaction through which a program offers services to other applications...

Automation Bias

Automation bias is the tendency for humans to over-rely on automated systems against their own judgements and to dismiss contradictory information. This bias can lead...

B

Batch Processing

Batch data processing is designed to efficiently process large volumes of data in batches from a specific timespan, rather than in a continuous stream...
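
A rough sketch of the batch idea, assuming events arrive as hypothetical (date, amount) pairs: records are first grouped by day, and each day's batch is then processed in one pass rather than record by record as it arrives.

```python
from collections import defaultdict

# Hypothetical events collected over three days: (date, order amount).
events = [
    ("2024-01-01", 12.5),
    ("2024-01-01", 7.0),
    ("2024-01-02", 3.25),
    ("2024-01-02", 9.0),
    ("2024-01-03", 1.25),
]

def process_in_daily_batches(events):
    """Group events by day, then total each day's batch in one pass."""
    by_day = defaultdict(list)
    for day, amount in events:
        by_day[day].append(amount)
    return {day: sum(amounts) for day, amounts in sorted(by_day.items())}

daily_totals = process_in_daily_batches(events)
print(daily_totals)
```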

Business Intelligence (BI)

Business Intelligence (BI) is the range of tools, programs, and procedures that allow businesses to gather data from both internal...

Big Data

The ever-growing data sets that more and more people, businesses and organizations are looking to mine are called Big Data. In the...

Behavioral Analytics

Behavioral analytics uses user data generated online to identify patterns and insights into consumer behavior and predict how consumers are likely to act in the future...

C

Classification Analysis

Classification analysis is the process of identifying and categorizing a collection of data using mathematical techniques, effectively gathering a "summary" of that data for...

Cloud Application

Web-based software that uses cloud computing and related resources to store, manipulate and display data is called a Cloud Application. Local devices...

Cloud Migration

Moving on-premises IT infrastructure, including databases, apps, and other components, to the cloud is referred to as Cloud Migration. Organizations can use migrations to...

Correlation Analysis

Correlation analysis is a statistical method used to discover whether there is a relationship (usually linear) between two variables or datasets, and how strongly associated...
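
To make the "how strongly associated" part concrete, here is a small sketch of the Pearson correlation coefficient, one common measure of linear association. The function name and the sample figures are assumptions for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical data: sales rise in step with ad spend.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]
r = pearson(ad_spend, sales)
print(round(r, 6))
```

A result near 1 suggests a strong positive linear relationship, near -1 a strong negative one, and near 0 little linear relationship. Correlation alone does not show that one variable causes the other.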

D

Dark Data

Dark data refers to data that is collected, processed and stored by an organization, but remains unused, unknown and untapped...

Dashboard

A Dashboard is a software feature that provides summary data from one or more reports in a variety of visual formats, including charts...

Data Lake

The system that keeps data in its unprocessed form is called a Data Lake. There are no format or size limits for the files stored...

Data Lakehouse

A data lakehouse is a new big-data storage architecture that combines elements of the data warehouse (data structure and management features) with those of the data lake...

Database

A database is a structured group of data that is digitally stored and accessible to be manipulated and interpreted using BI tools. Small databases can be stored on a...

E

E-Commerce Analytics

E-commerce analytics applies analytics practices to the field of e-commerce retail, which generates massive amounts of data across a wide variety of data points, using it to...

ETL

The process of consuming and combining data from several sources into a single, consolidated data storage is known as Extract, Transform, and Load (ETL). ETL is crucial...
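
The three stages can be sketched in a few lines. This is only an illustration: the two CSV sources, the column names, and the `customer_totals` table are hypothetical, and an in-memory SQLite database stands in for the consolidated store.

```python
import csv
import io
import sqlite3

# Hypothetical source data; real pipelines read from files, APIs, or databases.
CRM_CSV = "id,name\n1, Ada \n2,Grace\n"
BILLING_CSV = "id,amount\n1,19.99\n2,5.00\n"

def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(customers, invoices):
    """Transform: clean names, convert types, and combine both sources."""
    amounts = {row["id"]: float(row["amount"]) for row in invoices}
    return [
        (int(c["id"]), c["name"].strip(), amounts.get(c["id"], 0.0))
        for c in customers
    ]

def load(rows, conn):
    """Load: write the consolidated rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO customer_totals VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(CRM_CSV), extract(BILLING_CSV)), conn)
result = conn.execute(
    "SELECT id, name, amount FROM customer_totals ORDER BY id"
).fetchall()
print(result)
```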

ELT

ELT (“Extract, Load, and Transform”) refers to the set of processes used by a data pipeline to replicate data from a source system into a target repository, then prepare it for...

F

Failover

When a primary system fails, its functions are immediately switched to a secondary system, which is called...
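
A minimal sketch of the switch-over, assuming two interchangeable request handlers. The exception type and handler names are hypothetical; production failover usually also involves health checks and state replication, which are omitted here.

```python
class ServiceUnavailable(Exception):
    """Raised by a handler that cannot serve requests."""

def call_with_failover(primary, secondary, request):
    """Try the primary handler and switch to the secondary if it fails."""
    try:
        return primary(request)
    except ServiceUnavailable:
        return secondary(request)

def broken_primary(request):
    raise ServiceUnavailable("primary is down")

def healthy_secondary(request):
    return f"secondary handled {request}"

response = call_with_failover(broken_primary, healthy_secondary, "GET /status")
print(response)
```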

Financial Analytics

Financial Analytics processes volumes of a company’s financial data to discover patterns, make forecasts, improve business performance and guide...

Full Load

Full Load is one of the two options of uploading data into a data warehouse (the other being Incremental Load). When a Full Load is executed, each record in the data source...

Fuzzy Logic

Fuzzy logic is an approach to computing designed to represent and manipulate uncertain information. It's based on "degrees of truth" between 0.0 and 1.0, rather than...
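
The "degrees of truth" idea can be illustrated with a membership function and the commonly used min/max operators. The temperature ramp below (0 at 15 °C, 1 at 25 °C) is an arbitrary assumption chosen for the example.

```python
def warm_membership(temp_c):
    """Degree to which a temperature counts as 'warm', from 0.0 to 1.0.

    Hypothetical ramp: 0.0 at or below 15 C, 1.0 at or above 25 C.
    """
    return max(0.0, min(1.0, (temp_c - 15.0) / 10.0))

def fuzzy_and(a, b):
    """Fuzzy conjunction using the minimum operator."""
    return min(a, b)

def fuzzy_or(a, b):
    """Fuzzy disjunction using the maximum operator."""
    return max(a, b)

def fuzzy_not(a):
    """Fuzzy negation."""
    return 1.0 - a

warm = warm_membership(20.0)   # partly warm, not simply true or false
humid = 0.8                    # assumed degree of 'humid'
muggy = fuzzy_and(warm, humid)
print(warm, muggy, fuzzy_not(warm))
```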

G

Gamification

Gamification applies elements of game design (such as scoring points, competition, or achievements) to increase user engagement, motivation and loyalty...

Geo Analytics

Geospatial analytics incorporates geo-location and other spatial data to provide contextual awareness and uncover hidden geographic trends, patterns...

Governance

Governance refers to the implementation of suitable processes and oversight for IT infrastructure. Organizations keep track of the accuracy of their applications...

H

Hybrid Cloud

A Hybrid Cloud, also known as a Cloud Hybrid, is a computing environment that combines a private cloud with a public cloud, enabling the sharing of data...

Hadoop Distributed File System

The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications. Designed for high fault tolerance to...

HR Analytics

HR analytics, also known as people analytics or talent analytics, is the process of collecting, applying and reporting human resources data to improve hiring...

I

In-Memory

In contrast to databases that store data on disk or SSDs, In-Memory databases are designed to rely on computer memory for data storage. By removing the need to access...

Incremental Load

Incremental Load is one of the two options of uploading data into a data warehouse (the other being Full Load). When an Incremental...
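
The contrast between the two options can be sketched as follows. This is an assumption-laden illustration: the warehouse is modeled as a dictionary keyed by record id, and change detection relies on a hypothetical `updated_at` field compared against the time of the previous load.

```python
def full_load(source):
    """Rebuild the target from scratch with a copy of every source row."""
    return {row["id"]: dict(row) for row in source}

def incremental_load(target, source, last_loaded_at):
    """Insert or update only rows changed since the previous load."""
    changed = [row for row in source if row["updated_at"] > last_loaded_at]
    for row in changed:
        target[row["id"]] = dict(row)
    return len(changed)

source = [
    {"id": 1, "updated_at": 1, "value": "a"},
    {"id": 2, "updated_at": 1, "value": "b"},
]
warehouse = full_load(source)

# Later, one row changes and one new row appears in the source.
source[1] = {"id": 2, "updated_at": 5, "value": "B"}
source.append({"id": 3, "updated_at": 6, "value": "c"})
changed_count = incremental_load(warehouse, source, last_loaded_at=1)
print(changed_count, sorted(warehouse))
```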

Index

An Index is a data structure used to hold the values for a single table column. Indexing is a technique for grouping several records based on various fields...

Infrastructure-as-a-Service (IaaS)

One of the main categories of cloud services that offer consumers rapid access to computing, storage, and other IT infrastructure via the Internet is infrastructure-as-a-service...

Integration

The process of combining several software subsystems into a single, cohesive and centralized system is known...

Internet of Things (IoT)

The aggregate group of physical objects that may connect to the Internet and communicate with one another is referred to as the Internet of Things (IoT)...

J

JSON

JavaScript Object Notation, also known as JSON, is a data exchange standard that makes it simple for applications to store and send data across the web in a way that...
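
For illustration, the snippet below serializes a small record to JSON text and parses it back using Python's standard `json` module. The order record itself is made up.

```python
import json

# Hypothetical record to exchange between two applications.
order = {"id": 42, "items": ["pen", "ink"], "paid": True}

payload = json.dumps(order)      # Python object to JSON text
restored = json.loads(payload)   # JSON text back to a Python object

print(payload)
print(restored == order)
```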

Journey Analytics

Journey analytics applies advanced analytics techniques with big data technology to map out customer behaviour across the customer journey and lifecycle, understanding...

K

Kafka Streams

Kafka Streams is a library for building applications that integrate real-time data from various source systems and transform it into message sequences and...

Key Value Stores

A key-value store (or key-value database) is a type of data storage in which records are stored in a format that uses unique keys to retrieve the record...
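
A toy in-memory version shows the access pattern: every record is written and read by its unique key. The class and method names are assumptions for the sketch and do not mirror any particular product's API.

```python
class KeyValueStore:
    """Minimal in-memory key-value store backed by a dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store a value under a unique key, replacing any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or the default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key if present."""
        self._data.pop(key, None)

store = KeyValueStore()
store.put("user:1", {"name": "Ada"})
store.put("user:2", {"name": "Grace"})
store.delete("user:2")
print(store.get("user:1"), store.get("user:2"))
```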

KPI

The metrics used to assess business performance and health are referred to as KPIs. Depending on their product, service or vertical, each organization...

Kubernetes

To manage containerized workloads and services, organizations use the open-source tool Kubernetes. This platform offers...

L

Latency

The term Latency describes the difference in time between the cause and the response of a physical change in a system. When two platforms communicate...

Load Balancing

Load balancing optimizes performance by distributing a workload over a computer network or cluster, in order to reduce response time and avoid unevenly overloading...
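
One simple distribution strategy is round-robin, sketched below with hypothetical server names. Real load balancers often use other policies as well, such as least connections or weighted routing.

```python
from collections import Counter
from itertools import cycle

# Hypothetical pool of application servers.
servers = ["app-1", "app-2", "app-3"]
next_server = cycle(servers)

# Assign six incoming requests to servers in rotation.
assignments = [next(next_server) for _ in range(6)]
counts = Counter(assignments)
print(assignments)
print(counts)
```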

M

Machine Learning

A subfield of artificial intelligence called "Machine Learning" focuses on creating adaptive, intelligent computer algorithms. Machine learning is used by businesses...

Metadata

Metadata is data that describes other data rather than its content, such as the message's text or the image itself...

Microservices

Microservices architecture refers to the system design that organizes services in a loose and granular way. The premise of using the microservices model is for teams...

Multi-tenant

Multi-tenant architecture, often known as multi-tenancy, is a model of software architecture frequently used in cloud computing to deliver multiple separate instances of...

N

Natural Language Processing (NLP)

Natural language processing refers to the ability of computer programs to understand and process written and spoken human language. It integrates the...

Normalization

The process of reorganizing a database to improve the quality of the data and get rid of redundant information or other undesirable...

NoSQL

NoSQL (for "Not only SQL") is an approach to the management of databases that prioritizes flexibility and scalability. It does not use the standard...

O

Online Analytical Processing (OLAP)

Online Analytical Processing (OLAP) is a process for performing multidimensional data analysis. It uses three operators—drill-down, consolidation, and slice & dice—to...

Online Transactional Processing (OLTP)

Online Transactional Processing (OLTP) is a process that supports real-time execution of a large number of small, non-complex transactions with concurrent access by...

On-prem

Technology that is run on computers inside the premises (in the same building as the person or business employing it) is referred to as on-premises technology...

Orchestration

The practice of planning and combining automated tasks across several systems is known as orchestration in computing...

P

Platform-as-a-Service (PaaS)

The cloud computing paradigm in which a vendor provides the users with the hardware and software tools necessary to create, deploy, and manage applications at scale...

Predictive Analytics

Predictive analytics refers to the application of statistical modeling and machine learning to historical and current data in order to forecast future outcomes...

Prescriptive Analytics

Prescriptive analytics applies advanced processes to not only predict future outcomes, but also recommend prescribed actions and often quantify their expected impacts...

Private Cloud

An organization's sole use of a cloud environment and its associated resources is referred to as a private cloud. Private clouds can either be hosted by a...

Public Cloud

A cloud system that is owned and managed by a third-party provider is referred to as a public cloud. Public cloud resources are made available to "tenants" who all use...

R

Relational Database

The term "relational database" refers to a particular kind of database that stores data by linking the data points. In a relational database each record is individually identified...

Retail Analytics

Retail analytics applies data analysis practices to the field of retail, incorporating data on sales trends, inventory levels and supply chain movement, and...

REST

APIs that adhere to the constraints of the REST architectural style and permit communication with RESTful web services are known as REpresentational State Transfer...

S

Scaling

Scaling in cloud computing is the act of adding or eliminating compute, storage, and network services to accommodate the demands a workload places on the resources...

Serverless Computing

The term serverless computing refers to the cloud paradigm in which the cloud provider allocates machine resources as needed and manages the servers on behalf...

SLA

A Service Level Agreement (SLA) refers to the binding commitment a service provider makes towards a customer. Specific service...

Software as a Service (SaaS)

A method of distributing programs online as a service is known as Software as a Service or SaaS. Through SaaS solutions users avoid complicated software and hardware...

SQL

The most popular language for extracting and manipulating data from relational databases is SQL, which stands for Structured Query...
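
As a small illustration, the snippet below runs SQL statements against an in-memory SQLite database through Python's standard `sqlite3` module. The `orders` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ada", 20.0), ("grace", 5.0), ("ada", 12.5)],
)

# Extract and aggregate data with a SELECT query.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```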

Storage

Cloud Storage is a kind of computer data storage which places digital data in logical pools that are said to be "in the cloud". The physical environment is owned and...

Stored Procedure

A stored procedure defines a collection of SQL statements that may be utilized to complete different operations on data and are distributed among numerous clients...

Stream Processing

Stream Processing is a data processing model that focuses on the real-time processing of continuous streams of data. By limiting the performance of parallel computing...

Single Source of Truth

A "single source of truth" is a central, authoritative repository of information or data that serves as the definitive reference point for an organization or project. It ensures that...

T

Text Analytics

Text analytics applies statistical, linguistic and machine learning techniques to process large volumes of text-based data in order to extract meaning and provide insights...

Trend Analytics

Trend analytics examines data collected over time to identify consistent patterns and directional movements, and to project how they are likely to continue...

U

Unstructured Data

Information in a database that isn't properly formatted or standardized is referred to as Unstructured Data. Usually the information that comes from various sensors or direct...

V

View

A query that is conducted on one or more database tables forms the basis of a database View. Database views may be used to store sophisticated and frequently used...
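
The sketch below defines a view over a hypothetical `orders` table in an in-memory SQLite database, then queries the view like an ordinary table. The table, view, and column names are assumptions for illustration.

```python
import sqlite3

view_conn = sqlite3.connect(":memory:")
view_conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO orders (customer, amount)
        VALUES ('ada', 20.0), ('grace', 5.0), ('ada', 12.5);

    -- The view stores the query, not a copy of the data.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
    """
)

view_rows = view_conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()
print(view_rows)
```

Because this view is evaluated when queried, rows added to `orders` later show up in `customer_totals` without redefining it.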

VPN

The term "Virtual Private Network," or VPN, refers to the possibility of creating a secure network connection when utilizing public networks. VPNs mask...

Virtual Machine

An electronic computing environment that functions like a real computer is called a virtual machine. Software, not hardware, is used by virtual machines to execute apps...

Visual Analytics

Visual analytics uses interactive, visual interfaces (such as charts, graphs and maps) powered by sophisticated analytics tools and processes to help users...

W

Web Analytics

The measurement, gathering, analysis, and reporting of web data for the purpose of comprehending and improving web usage is known as web analytics...

X

XML

Extensible Markup Language or XML is a file format and markup language used to store, send, and create data. It establishes a set of...
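
For illustration, the snippet below parses a small XML document with Python's standard `xml.etree.ElementTree` module. The `order` and `item` element names and the `sku` attribute are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document describing an order.
document = (
    "<order id='42'>"
    "<item sku='pen'>2</item>"
    "<item sku='ink'>1</item>"
    "</order>"
)

root = ET.fromstring(document)
order_id = root.get("id")
quantities = {item.get("sku"): int(item.text) for item in root.findall("item")}
print(order_id, quantities)
```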