What is Latency? | How to Reduce Latency

By Gcore

April 27, 2022

There are many factors that affect the speed of a web resource. One of them is network latency. Let’s take a closer look at what latency is, how it affects application performance, and how it can be reduced.

What is latency?

Broadly speaking, latency is any delay in the execution of an operation. There are different types of latency: network latency, audio latency, latency in video broadcasting during livestreams, storage latency, and so on.

Ultimately, every type of latency stems from the limited speed at which a signal can be transmitted.

Most latency types, though not all, are measured in milliseconds. For example, latency between a CPU and an SSD is measured in microseconds.

This article will focus on network latency, hereinafter referred to as “latency”.

Network latency (response time) is the delay that occurs when information is transferred across the network from point A to point B.

Imagine a web application deployed in a data center in Paris and accessed by a user in Rome. The browser sends a request at 9:22:03.000 CET (UTC+1), and the server receives it at 9:22:03.174 CET. The latency of this request is 174 ms.

This is a somewhat simplified example. Note that latency does not depend on data volume. Transferring 1,000 MB of data takes longer than transferring 1 KB, but the data transfer rate can be the same in both cases, and so can the latency.
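
To make the distinction concrete, here is a minimal sketch. It assumes a simplified model in which total transfer time is the one-way latency plus the payload size divided by the throughput. The function name, the 100 Mbps throughput, and the model itself are illustrative assumptions, not measurements.

```python
def transfer_time_ms(size_bytes: int, latency_ms: float, throughput_mbps: float) -> float:
    """Estimate total transfer time under a simplified model.

    Assumed model: the first bit arrives after `latency_ms`, and the rest of
    the payload streams in at a constant `throughput_mbps`.
    """
    bits = size_bytes * 8
    streaming_ms = bits / (throughput_mbps * 1_000_000) * 1000
    return latency_ms + streaming_ms


LATENCY_MS = 174        # the delay from the Rome-to-Paris example
THROUGHPUT_MBPS = 100   # hypothetical link speed

small = transfer_time_ms(1_000, LATENCY_MS, THROUGHPUT_MBPS)          # 1 KB
large = transfer_time_ms(1_000_000_000, LATENCY_MS, THROUGHPUT_MBPS)  # 1,000 MB

print(f"1 KB:     {small:.2f} ms")
print(f"1,000 MB: {large:.2f} ms")
```

In this model the latency term is identical for both payloads. Only the streaming term grows with the amount of data.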

The concept of network latency is mainly used when discussing interactions between user devices and a data center. The lower the latency, the faster users will get access to the application that is hosted in the data center.

It is impossible to transmit data with no delays since nothing can travel faster than the speed of light.

What does network latency depend on?

The main factor that affects latency is distance. The closer the information source is to users, the faster the data will be transferred.

For example, a request from Rome to Naples (a little less than 200 km) takes about 10 ms, while the same request sent under the same conditions from Rome to Miami (a little over 8,000 km) takes about 120 ms.
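
Distance sets a hard floor on latency. The sketch below estimates that floor, assuming the signal travels through optical fiber at roughly 200,000 km/s (about two thirds of the speed of light in a vacuum) along a straight line. The constant and the function name are assumptions made for illustration.

```python
# Approximate speed of light in optical fiber (an assumption for this sketch).
FIBER_SPEED_KM_PER_S = 200_000


def min_one_way_delay_ms(distance_km: float) -> float:
    """Lower bound on one-way delay: distance divided by signal speed."""
    return distance_km / FIBER_SPEED_KM_PER_S * 1000


# Rounded distances from the example above.
for route, km in [("Rome-Naples", 200), ("Rome-Miami", 8_000)]:
    one_way = min_one_way_delay_ms(km)
    print(f"{route}: at least {one_way:.0f} ms one way, {2 * one_way:.0f} ms round trip")
```

The figures quoted above (about 10 ms and about 120 ms) are well above these bounds. That is expected: real cable routes are longer than the straight-line distance, and every router along the way adds processing time.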

There are other factors that affect network latency.

Network quality. At speeds above 10 Gbps, copper cables and connections suffer too much signal attenuation even over distances of a few meters. This is why fiber-optic cables are mainly used as interface speeds increase.

Route. Data on the Internet usually travels across more than one network, passing through several autonomous systems. At each transition from one autonomous system to another, routers process the data and forward it toward its destination. This processing takes time, so the more networks and Internet exchange points (IXs) there are on a packet’s path, the longer the transfer takes.

Router performance. The faster the routers process data, the faster the information will reach its destination.

In some sources, the concept of network latency also includes the time the server needs to process a request and send a response. In this case, the server configuration, its capacity, and operation speed will also affect the latency. However, we will stick to the above definition, which includes only the time it takes to send the signal to its destination.

What is affected by network latency?

Latency affects other parameters of web resource performance, for example, the RTT and TTFB.

RTT (Round-Trip Time) is the time it takes for sent data to reach its destination, plus the time to confirm that the data has been received. Roughly speaking, this is the time it takes for data to travel back and forth.

TTFB (Time to First Byte) is the time from the moment the request is sent to the server until the first byte of information is received from it. Unlike the RTT, this indicator includes not only the time spent on delivering data but also the time the server takes to process it.
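
The relationship between the two indicators can be sketched with simple arithmetic. This is a deliberately simplified model: it assumes a symmetric path and a hypothetical server processing time of 50 ms, and it leaves out DNS lookup and connection setup, which TTFB measurements often include in practice.

```python
def round_trip_ms(forward_ms: float, return_ms: float) -> float:
    """RTT: time for data to reach the destination plus time for the reply to come back."""
    return forward_ms + return_ms


def time_to_first_byte_ms(rtt_ms: float, server_processing_ms: float) -> float:
    """TTFB in this simplified model: network round trip plus server processing."""
    return rtt_ms + server_processing_ms


one_way = 174     # one-way latency from the Rome-to-Paris example
processing = 50   # hypothetical server processing time

rtt = round_trip_ms(one_way, one_way)  # assumes the return path takes equally long
ttfb = time_to_first_byte_ms(rtt, processing)

print(f"RTT:  {rtt} ms")
print(f"TTFB: {ttfb} ms")
```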

These indicators, in turn, affect the perception of speed and the user experience as a whole. The faster a web resource works, the more actively users will use it. Conversely, a slow application can negatively affect your online business.

What is considered optimal latency and how to measure it?

The easiest way to estimate your resource’s latency is to measure a related speed indicator such as the RTT, which is the closest parameter to latency. In many cases, the RTT equals twice the latency (when the travel time in one direction equals the travel time in the other).

The RTT is easy to measure with the ping command. Open a command prompt and type “ping” followed by the resource’s IP address or domain name.

Let’s try to ping www.google.com as an example.

C:\Users\username>ping www.google.com

Pinging www.google.com [216.58.207.228] with 32 bytes of data:

Reply from 216.58.207.228: bytes=32 time=24ms TTL=16

The time parameter is the RTT. In our example, it is 24 ms.

The optimal RTT value depends on the specifics of your project. As a rule of thumb, most specialists consider an RTT below 100 ms to be good.

RTT value	Its meaning
<100 ms	Very good, no improvements required
100–200 ms	Acceptable, but can be improved
>200 ms	Unsatisfactory, improvements are required

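If you collect ping results regularly, the check against the table above can be automated. The sketch below pulls the time value out of a ping reply line and rates it using the same thresholds. The function names and the regular expression are illustrative assumptions, and the exact wording of ping output varies between operating systems.

```python
import re
from typing import Optional

# Matches "time=24ms" and "time<1ms" (Windows style) as well as "time=24.3 ms" (Linux style).
_TIME_RE = re.compile(r"time[=<](\d+(?:\.\d+)?)\s*ms")


def parse_rtt_ms(line: str) -> Optional[float]:
    """Return the RTT in milliseconds from one ping reply line, or None if absent.

    For "time<1ms" this returns 1.0, which is only an upper bound.
    """
    match = _TIME_RE.search(line)
    return float(match.group(1)) if match else None


def rate_rtt(rtt_ms: float) -> str:
    """Rate an RTT value using the thresholds from the table."""
    if rtt_ms < 100:
        return "Very good, no improvements required"
    if rtt_ms <= 200:
        return "Acceptable, but can be improved"
    return "Unsatisfactory, improvements are required"


line = "Reply from 216.58.207.228: bytes=32 time=24ms TTL=16"
rtt = parse_rtt_ms(line)
if rtt is not None:
    print(f"RTT {rtt:.0f} ms: {rate_rtt(rtt)}")
```
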
How to reduce latency?

Here are some basic guidelines:

Reduce the distance between the data origin and the users. Try to place servers as close to your clients as possible.
Improve network connectivity. The more peering partners (networks you can exchange traffic with) and route options you have, the better the route you can build and the faster the data will be transferred.
Improve traffic balancing. Distributing large amounts of data over different routes will help reduce the network load. In that way, information will be transferred faster.

A CDN (Content Delivery Network) helps with the first and second points. It is a network of many connected servers that collect information from the origin, cache it, and deliver it over the shortest route. A global network with good connectivity will help you significantly reduce latency.
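
To illustrate the first guideline, the sketch below picks the server location geographically closest to a user, using the haversine great-circle distance. The server list is hypothetical and the coordinates are approximate city centers. Real CDNs choose a server based on network routing rather than on geography alone, so treat this only as an illustration of the distance factor.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical server locations: name -> (latitude, longitude), approximate.
SERVERS = {
    "Paris": (48.8566, 2.3522),
    "Naples": (40.8518, 14.2681),
    "Miami": (25.7617, -80.1918),
}


def nearest_server(user: tuple, servers: dict) -> str:
    """Return the name of the server with the smallest great-circle distance to the user."""
    return min(servers, key=lambda name: haversine_km(*user, *servers[name]))


rome = (41.9028, 12.4964)
print(nearest_server(rome, SERVERS))
```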

However, keep in mind that latency is only one factor affecting users’ perception of application performance. In some cases, the latency is very low, but the website still loads slowly. This happens, for example, when the server is slow in processing requests.

As a rule, comprehensive optimization is required to significantly speed up an application. You can find the main acceleration tips in the article “How to increase your web resource speed”.

Summary

Latency is the time it takes to deliver data across the network from one point to another.
The main factor it depends on is distance. It is also affected by the network quality and the route (number of networks and traffic exchange points).
Latency affects other parameters of the web resource performance, such as RTT and TTFB. They, in turn, affect conversion rates and search engine rankings.
The easiest way to determine the latency of a resource is to measure the RTT. This can be done using the ping command. An optimal RTT is less than 100 ms.
The most effective way to reduce latency is to enable a CDN. A content delivery network reduces the distance between the client and the data origin and improves routing. As a result, information is transferred faster.

Gcore CDN provides excellent data transfer speed. We deliver heavy files with minimal delays anywhere in the world.

We have a free plan. Test our network and see how your resource will speed up.
