[pkg-discuss] The future of the pkg mirror

  • From: Erik Trauschke
  • To:
  • Subject: [pkg-discuss] The future of the pkg mirror
  • Date: Tue, 30 Apr 2013 09:07:09 -0700

Hi folks,

Lately we had some talks with other groups, and the issue of properly supporting a content delivery network (CDN) popped back up.

At the moment it is possible to do that, but it's kind of a hack. One could configure the CDN as a pkg mirror in the publisher setup, and if (and only if) the CDN is faster than the main origin, the file content is retrieved from it.

Another thing is that "pkg mirror" is a rather confusing term, since one would expect a mirror to be a complete alternate repo, which it is not. I think that is the main reason why Shawn was contemplating getting rid of it altogether.

Pkg should support a proper way of offloading pkg content to a CDN though, since this will make pkg downloads faster for a lot of people who do not live in the US. It also allows transporting bulk data unencrypted (and therefore faster), even though the corresponding metadata will still be transmitted encrypted and secure.

So I'm proposing the following:

- provide a mechanism to let a pkg client download from an alternate source

- the mechanism shall be invisible to the user, so no additional setup of source URIs in set-publisher shall be necessary. Also, the output of 'pkg publisher' shall not be cluttered by it.

- The only data on the alternate source shall be bulk data; I think at the moment that would only be content under file/1. Metadata shall only come from an origin.

- The pkg client shall try to retrieve bulk data from the alternate source only. However, if the client encounters too many errors, it will fall back to downloading bulk data from the origin.

- The pkg client shall support a list of alternate sources and decide which one to download from by determining which is fastest.
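
To make the selection rule in the list above concrete, here is a rough sketch. It is not existing pkg code: pick_fastest, source_for, the "file" request kind and the probe callback are all made-up names for illustration, and returning the origin when no alternate sources are configured is my assumption.

```python
from typing import Callable, List


def pick_fastest(sources: List[str], probe: Callable[[str], float]) -> str:
    """Return the source with the lowest measured probe time.

    `probe` is a hypothetical callback that returns the time (in seconds)
    a small test request to the given URI took, or raises OSError.
    """
    timings = {}
    for uri in sources:
        try:
            timings[uri] = probe(uri)
        except OSError:
            # Unreachable sources are simply left out of the comparison.
            continue
    if not timings:
        raise LookupError("no alternate source reachable")
    return min(timings, key=timings.get)


def source_for(kind: str, origin: str, alt_sources: List[str],
               probe: Callable[[str], float]) -> str:
    """Metadata always comes from the origin; bulk file data comes from
    the fastest alternate source."""
    # Assumption: with no alternate sources configured, use the origin.
    if kind != "file" or not alt_sources:
        return origin
    return pick_fastest(alt_sources, probe)
```

Note that the origin never takes part in the speed comparison for file data here, which is the difference from how mirrors behave today.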



I think it shouldn't take that much effort to add this functionality into the current pkg client because most of it is already there in one form or another. What I'm thinking is putting a new property in the publisher object. In particular that would be a list called content_source (or alternate_file_source, alt_file_src, whatever, let the bikeshedding begin).
The pkg client will get this list from the repo and use the entries to determine the fastest one to download its data from. However, contrary to the current behavior, it will not consider the origin for downloading file data.

If the client encounters a lot of errors from all alternate sources, it will at some point ignore them and fall back to retrieving content only from the origin.
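
The fallback could look roughly like the sketch below. Again, this is only an illustration and not the current transport code: the class name, the per-source error counter and the max_errors threshold of 3 are assumptions, since "a lot of errors" is not defined yet.

```python
from typing import Dict, List


class ContentSourceSelector:
    """Tracks errors per alternate source and falls back to the origin
    once every alternate source has failed too often."""

    def __init__(self, origin: str, alt_sources: List[str],
                 max_errors: int = 3) -> None:
        self.origin = origin
        # Hypothetical threshold; the real value would need tuning.
        self.max_errors = max_errors
        self.errors: Dict[str, int] = {uri: 0 for uri in alt_sources}

    def record_error(self, uri: str) -> None:
        """Count one failed download against an alternate source."""
        if uri in self.errors:
            self.errors[uri] += 1

    def file_sources(self) -> List[str]:
        """Alternate sources still considered usable, or only the origin
        if all of them have exceeded the error threshold."""
        usable = [u for u, n in self.errors.items() if n < self.max_errors]
        return usable if usable else [self.origin]
```

As long as at least one alternate source stays under the threshold, file_sources() does not include the origin, so the origin only sees file traffic after all alternates have been given up on.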

This way we'd have a supported, simple mechanism that is transparent to the user for deploying the split model and offloading some of the traffic to CDNs like Akamai.

Let me know what you guys think.

Thanks
Erik


