PoPs (Points of Presence): CDN PoPs are strategically located data centers that serve users in their geographic vicinity. Their main function is to reduce round-trip time by bringing content closer to the website's visitors. Each CDN PoP typically contains numerous caching servers.
Caching servers: Caching servers are responsible for the storage and delivery of cached files. Their main function is to accelerate website load times and reduce bandwidth consumption. Each CDN caching server typically holds multiple storage drives and a large amount of RAM.
SSD/HDD + RAM: Inside CDN caching servers, cached files are stored on solid-state and hard-disk drives (SSD and HDD) or in random-access memory (RAM), with the more commonly used files hosted on the faster media. Being the fastest of the three, RAM is typically used to store the most frequently accessed items.
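The tiering idea above can be sketched in a few lines. This is only an illustration, not how any particular CDN implements it: the class name `TieredCache` is hypothetical, SSD and HDD are collapsed into a single "disk" tier, and "most frequently accessed" is approximated with least-recently-used (LRU) eviction.

```python
from collections import OrderedDict


class TieredCache:
    """Toy two-tier cache: a small fast tier ("ram") in front of a larger slow tier ("disk")."""

    def __init__(self, ram_capacity: int, disk_capacity: int):
        self.ram_capacity = ram_capacity
        self.disk_capacity = disk_capacity
        # In both tiers the most recently used entry sits at the end.
        self.ram = OrderedDict()
        self.disk = OrderedDict()

    def get(self, key):
        """Return (value, tier), where tier is 'ram', 'disk' or 'miss'."""
        if key in self.ram:
            self.ram.move_to_end(key)
            return self.ram[key], "ram"
        if key in self.disk:
            value = self.disk.pop(key)
            self._put_ram(key, value)  # promote a hit from the slow tier
            return value, "disk"
        return None, "miss"

    def put(self, key, value):
        """Store a new item in the fast tier, demoting older items if needed."""
        self.disk.pop(key, None)
        self._put_ram(key, value)

    def _put_ram(self, key, value):
        self.ram[key] = value
        self.ram.move_to_end(key)
        if len(self.ram) > self.ram_capacity:
            old_key, old_value = self.ram.popitem(last=False)
            self._put_disk(old_key, old_value)  # demote the least recently used item

    def _put_disk(self, key, value):
        self.disk[key] = value
        self.disk.move_to_end(key)
        if len(self.disk) > self.disk_capacity:
            self.disk.popitem(last=False)  # evicted from the cache entirely
```

With `TieredCache(ram_capacity=2, disk_capacity=2)`, storing three items pushes the oldest one down to the disk tier, and reading it again promotes it back into RAM.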
Reverse proxy
Receiving a user connection request
Completing a TCP three-way handshake, terminating the initial connection
Connecting with the origin server and forwarding the original request
Forward proxy
Block employees from visiting certain websites
Monitor employee online activity
Block malicious traffic from reaching an origin server
Improve the user experience by caching external site content
CDN using reverse proxy servers
Content caching: Reverse proxies are placed in several geographically dispersed locations, where mirror versions of website pages are compressed and cached. This facilitates rapid content delivery based on client geolocation, helping to reduce page load times and improve your user experience.
Traffic scrubbing: Prevents DDoS attacks and other security threats from outside. Located in front of your backend servers, reverse proxies are ideally situated to scrub all incoming application traffic before it's sent on to your backend servers.
IP masking: When routing your incoming traffic through a reverse proxy server, connections are first terminated by the proxy and then reopened with the backend server. From your users' perspective, their requests are resolved via the proxy IP.
Load balancing: Because reverse proxy servers are the gateway between users and your application's origin server, they're able to determine where to route individual HTTP sessions. For applications using multiple backend servers, this means the reverse proxy can efficiently distribute the load, thereby improving overall user experience and helping ensure high availability. In the event that a server goes down, reverse proxies act as a failover solution, rerouting traffic to ensure continued site availability.
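The load-balancing and failover behaviour described above can be illustrated with a small sketch. It assumes plain round-robin selection and a health flag set by some external health check; the class name `RoundRobinBalancer` and the backend names are hypothetical, and real reverse proxies usually offer more elaborate policies.

```python
class RoundRobinBalancer:
    """Pick backends in rotation, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        """Record that a backend failed its health check."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is healthy again."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none is available."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backend available")
```

When a backend is marked down, `pick()` simply skips it, so traffic keeps flowing to the remaining servers without any change on the client side.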
2. CDN Architecture
Four pillars of a CDN: Performance (location, low latency, high bandwidth), Scalability (high-bandwidth resources, DDoS protection), Reliability (high availability, no single point of failure), Responsiveness (quick configuration propagation, sync)
Caching: HDD < SSD < RAM (slowest to fastest)
Topology: The Scattered CDN, The Consolidated CDN (mainly about cost/DDoS)
3. CDN caching
static file: images/videos/music/javascript/css (cut down cost/improve user experience/reliable delivery)
caching header: Web developers use HTTP cache headers to mark cacheable web content and set cache durations. Using cache headers, you can control your caching strategy by establishing optimum cache policies that ensure the freshness of your content. For example: "Cache-Control: max-age=3600" means that the file is considered fresh for up to an hour (3600 seconds), after which a cache must refetch or revalidate it with the origin server.
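As a rough illustration of the max-age rule, the sketch below extracts the directive and compares it with the age of a cached copy. The function names are hypothetical, and the parsing is deliberately minimal (it ignores quoted values and other directives such as s-maxage).

```python
import re


def max_age_seconds(cache_control: str):
    """Return the max-age value in seconds, or None if the directive is absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: float) -> bool:
    """A cached copy is fresh while its age is below max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age
```

For "Cache-Control: max-age=3600", a copy that is 30 minutes old is still fresh, while one that has reached the full hour is not.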
cache control
Cache-Control: public – enables caching by public platforms such as CDNs.
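Cache-Control: private – marks the response as intended for a single user; it may be stored by that user's browser cache but not by shared caches such as CDNs.
Cache-Control: no-cache – allows the response to be stored, but requires revalidation with the origin server before each reuse. (To forbid storing altogether, use no-store.)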
Expires: Similar to Cache-Control: max-age, but sets an absolute date and time after which the content is considered stale. When both are present, Cache-Control: max-age takes precedence.
Surrogate (Surrogate-Control): Gives you increased control over cache policies for surrogate caches such as CDNs and reverse proxies, which act with the authority of the origin server.
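ETag: Provides each version of your web content with a unique identifier, so a cache can revalidate a stored copy with the origin (via If-None-Match) and receive a 304 Not Modified response if the content is unchanged.
Pragma: Largely supplanted by Cache-Control, Pragma is an HTTP/1.0 header that was previously used to handle caching instructions for browsers.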
Vary (use with caution): Some browsers still struggle with supporting the Vary header. When used properly, Vary can be a powerful tool for managing delivery of multiple file versions, especially for compressed files cached alongside their uncompressed counterparts.
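To make the headers above more concrete, here is a minimal sketch of how a shared cache such as a CDN might interpret them. All function names are hypothetical, and the logic is simplified: `shared_cache_may_store` only checks the two directives that forbid storing, the ETag comparison ignores weak validators, and the Vary handling treats header values as opaque strings.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_may_store(cache_control: str) -> bool:
    """Simplified rule: 'private' and 'no-store' keep a response out of shared caches."""
    directives = parse_cache_control(cache_control)
    return "private" not in directives and "no-store" not in directives


def must_revalidate_before_reuse(cache_control: str) -> bool:
    """'no-cache' permits storing but demands revalidation on every reuse."""
    return "no-cache" in parse_cache_control(cache_control)


def revalidation_status(if_none_match: str, current_etag: str) -> int:
    """Return 304 if the client's ETag still matches the current one, else 200."""
    return 304 if if_none_match == current_etag else 200


def cache_key(url: str, vary: str, request_headers: dict) -> tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)
```

With "Vary: Accept-Encoding", a request sending "Accept-Encoding: gzip" and one sending no such header produce different cache keys, so the compressed and uncompressed versions of the same file are stored side by side.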