Making Metalinks

By Darius Liktorius (Metalink Creator), A. Bram Neijt (Metalinks tools project), Manuel Subredu (RoPkg::Metalink), Anthony Bryan

What are Metalinks? Why would your organization or site want to use them?

Metalink is a system that improves the download process by increasing availability and guaranteeing integrity. It gives your users a more reliable download by providing multiple links to the same file, so a download program can switch to another link if one server is down or fails during transmission. It can also make downloads faster by using multiple resources at once.

Organizations use mirror servers in multiple locations to provide redundancy and to place content near users in the hope that downloads will be faster for them. Examples would be mirrors on each continent or near a concentration of users. Downloaders in Europe would manually choose a mirror located in a European country, preferably their own, and people in Japan or other countries would choose mirrors close to them. If those mirrors are down, mirrors in other locations can be used as a backup or second option. Metalink lists mirrors with machine-readable information on priority and location so that download programs can automate their efficient use. A Metalink can list mirrors around the world, but download programs will default to mirrors closer to you and with a higher priority.

Download programs (also known as download managers) that are Metalink-aware read the information from a Metalink file (one with a .metalink extension), which is just plain XML. The download programs split the file transfer into independently verifiable segments, which are then reassembled. By downloading multiple segments from multiple servers, download speeds are usually three to four times faster on an average broadband connection and much faster on business-grade connections. The checksum verification process, usually manual and arcane to most people, is automated with Metalink, so the file you downloaded is verified to be an exact, error-free copy of the original. Metalinks can also contain publisher information, operating system and architecture, language, file descriptions, multiple files (to be added to a download queue) and so on. All this extra information allows download programs to do interesting things. It lets you have one download link for multiple operating systems and languages, and downloaders will get the correct version for their system.
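The idea of independently verifiable segments can be illustrated with a minimal Python sketch. It assumes a fixed, hypothetical piece size and one SHA-1 hash per piece; the function names and the piece length are made up for illustration and are not taken from any particular download program.

```python
import hashlib
import io

PIECE_LENGTH = 256 * 1024  # hypothetical piece size in bytes (an assumption)


def piece_hashes(stream, piece_length=PIECE_LENGTH):
    """Return one SHA-1 hex digest per fixed-size piece of the stream."""
    hashes = []
    while True:
        piece = stream.read(piece_length)
        if not piece:
            break
        hashes.append(hashlib.sha1(piece).hexdigest())
    return hashes


def bad_pieces(stream, expected, piece_length=PIECE_LENGTH):
    """Return the indexes of pieces whose hash differs from the expected list."""
    actual = piece_hashes(stream, piece_length)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Demo on in-memory data: corrupt one byte and locate the damaged piece.
original = bytes(range(256)) * 4096          # 1 MiB of sample data, 4 pieces
expected = piece_hashes(io.BytesIO(original))
damaged = bytearray(original)
damaged[300_000] ^= 0xFF                     # this offset lies in the second piece
print(bad_pieces(io.BytesIO(bytes(damaged)), expected))
```

Only the damaged piece has to be fetched again, possibly from a different mirror, instead of restarting the whole download.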

A few of the projects and organizations that use Metalink include: OpenOffice.org, PC-BSD, openSUSE, Arch Linux, DesktopBSD, TrueBSD, blag linux, Ubuntu Christian Edition, StartCom Linux, Berry Linux, redWall Firewall, GoboLinux, and Eiffel Software.

Here is a very short example Metalink (ubuntu-6_10-desktop-i386_iso.metalink) listing only three mirrors and one checksum.

<?xml version="1.0" encoding="UTF-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/">
  
  <files>  
   <file name="ubuntu-6.10-desktop-i386.iso">
    <os>Linux-x86</os>
    <size>732293120</size>
    <verification>
     <hash type="md5">b950a4d7cf3151e5f213843e2ad77fe3</hash>
    </verification>
    <resources>     
      <url type="http"
           location="ro"
           preference="90">
       http://ftp.iasi.roedu.net/mirrors/ubuntulinux.org/releases/.pool/ubuntu-6.10-desktop-i386.iso
      </url>
      <url type="http"
           location="jp"
           preference="100">
       http://ftp.yz.yamagata-u.ac.jp/pub/linux/ubuntu/releases/.pool/ubuntu-6.10-desktop-i386.iso
      </url>
      <url type="http"
           location="us"
           preference="90">
       http://ftp.osuosl.org/pub/ubuntu/.pool/ubuntu-6.10-desktop-i386.iso
      </url>      
    </resources>
   </file>
  </files> 
</metalink>

(The full XML implementation and more information can be found at Metalinker.org).
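To show how a download program might use the location and preference attributes, here is a minimal Python sketch that parses an abbreviated copy of the example above (XML declaration and checksum omitted) and orders its mirrors. The ordering rule (same country first, then highest preference) and the function name are assumptions for illustration; each client is free to choose mirrors its own way.

```python
import xml.etree.ElementTree as ET

NS = {"ml": "http://www.metalinker.org/"}

SAMPLE = """<metalink version="3.0" xmlns="http://www.metalinker.org/">
  <files>
    <file name="ubuntu-6.10-desktop-i386.iso">
      <resources>
        <url type="http" location="ro" preference="90">
          http://ftp.iasi.roedu.net/mirrors/ubuntulinux.org/releases/.pool/ubuntu-6.10-desktop-i386.iso
        </url>
        <url type="http" location="jp" preference="100">
          http://ftp.yz.yamagata-u.ac.jp/pub/linux/ubuntu/releases/.pool/ubuntu-6.10-desktop-i386.iso
        </url>
        <url type="http" location="us" preference="90">
          http://ftp.osuosl.org/pub/ubuntu/.pool/ubuntu-6.10-desktop-i386.iso
        </url>
      </resources>
    </file>
  </files>
</metalink>"""


def ordered_mirrors(xml_text, user_country):
    """Return the mirrors of a Metalink, best candidate first (assumed rule)."""
    root = ET.fromstring(xml_text)
    mirrors = []
    for url in root.findall(".//ml:resources/ml:url", NS):
        mirrors.append({
            "url": (url.text or "").strip(),
            "location": url.get("location", ""),
            "preference": int(url.get("preference", "0")),
        })
    # Mirrors in the user's country sort first, then by descending preference.
    mirrors.sort(key=lambda m: (m["location"] != user_country, -m["preference"]))
    return mirrors


for mirror in ordered_mirrors(SAMPLE, "us"):
    print(mirror["location"], mirror["preference"], mirror["url"])
```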

To take advantage of the benefits of Metalink, you need to make the .metalink files and add them to your site. You could manually create the files since they are XML text. But there are a few programs available which make the process easier. There are different tools and they range from being suitable for manually creating one Metalink at a time or automatically creating many. A list of files, their checksums, and the mirrors you want listed in the Metalinks are necessary.
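If you prefer to script this yourself, the three ingredients (file, checksum, mirrors) map directly onto the XML structure. The following Python sketch is one possible way to do it, not the method used by any of the tools described below; the function name and its arguments are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_metalink(filename, size, md5_hex, mirrors):
    """Build a small Metalink 3.0 document.

    mirrors is a list of (url, location, preference) tuples.
    """
    root = ET.Element("metalink", {
        "version": "3.0",
        "xmlns": "http://www.metalinker.org/",
    })
    files = ET.SubElement(root, "files")
    file_el = ET.SubElement(files, "file", {"name": filename})
    ET.SubElement(file_el, "size").text = str(size)
    verification = ET.SubElement(file_el, "verification")
    ET.SubElement(verification, "hash", {"type": "md5"}).text = md5_hex
    resources = ET.SubElement(file_el, "resources")
    for url, location, preference in mirrors:
        attrs = {
            "type": url.split(":", 1)[0],   # "http" or "ftp", taken from the URL
            "location": location,
            "preference": str(preference),
        }
        ET.SubElement(resources, "url", attrs).text = url
    return ET.tostring(root, encoding="unicode")


print(build_metalink(
    "ubuntu-6.10-desktop-i386.iso",
    732293120,
    "b950a4d7cf3151e5f213843e2ad77fe3",
    [("http://ftp.osuosl.org/pub/ubuntu/.pool/ubuntu-6.10-desktop-i386.iso", "us", 90)],
))
```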

Making a Few Metalinks (Manual)

What is the Metalink Creator?

The Metalink Creator (a.k.a. "Metalink Generator") is a web-based tool for assisting in the initial creation of a Metalink. Instead of coding everything by hand, you can get a jump start by making use of the Creator. You can view the resulting Metalink as XML text and download a .metalink file created from your input.

How Metalink Creator Works

Metalink Creator takes the input you provide and runs it through a Metalink-building engine. This engine was built using Microsoft's ASP.NET 2.0 application server environment. The Metalink Creator does its best to validate your input and prevent you from entering erroneous values into the fields which might invalidate the resulting Metalink. You can access the Metalink Creator at the following URL: http://www.metalinker.org/generator

How to use the Metalink Creator

Upon the initial load of the Creator, you can pre-populate all of the fields with sample data using the link just above the form fields; or just simply start entering information relevant to the Metalink you would like to generate. If you need to enter more than the first five Metalink mirror locations provided, simply click on the "Add more URLs..." button.

When you are done, click on the "Create Metalink" button. If any of the input fields contain invalid values the post-back will not succeed and you will see an error message next to each field whose value falls outside of the Metalink specification. Upon the successful creation of a Metalink, you will be presented with a new screen displaying the XML text version of the Metalink, along with a download link so you can save the actual .metalink file.

You can highlight and select the contents of the Metalink and copy them to your clipboard, or you can click on the "Download this Metalink as a File" button to download the Metalink file with your browser. If for some reason you need to modify your input data, simply click on the "Modify Input" button to do so. All of your previously entered fields will contain your original input and allow you to modify them, keeping you from having to re-enter everything.

Making Many Metalinks (Automated)

Apart from writing your own scripts to create Metalinks, there are currently two ways to automate their creation: A command line program called metalink, and an automated mirroring system's module called RoPkg::Metalink.

What is metalink?

metalink is the first tool to come out of the Metalinks tools project. It is a simple command line utility to generate Metalinks from local files or md5sum lists. Its tasks include hashing files and generating links from a mirror list.

Installing and running metalink

Here we will consider version 0.3.1 of the metalink program. For both Windows and Linux, there is a single binary to download and run. You can also install it from source: on most Unix-like systems, the A-A-P build tool is the easiest route (sudo aap install; sudo aap cleanALL). The build depends on various Boost libraries and the GCrypt library (apt-get install libgcrypt-dev libboost-regex-dev libboost-program-options-dev libboost-filesystem-dev libboost-dev). More convenient installation methods are still on the TODO list, and any suggestions can be posted on the Sourceforge project page.

All files can be downloaded from the Sourceforge project download page.

Using metalink: an example

The simplest examples are part of the metalink --help output (partly due to the absence of a manual page). Here we will show you a full-fledged example. What you need for a simple Metalink is a list of servers with their properties and a file to hash. For this, we will be using the Ubuntu desktop ISO from Ubuntu.com.

First we create the mirror list. This file contains all the mirror information for our Metalink, with one mirror on each line. To start us off, we'll extract the links from the download page using some GNU tool magic:

wget -O - 'http://www.ubuntu.com/products/GetUbuntu/download?action=show&redirect=download' | \
egrep -o '(http|ftp)://[^"]+ubuntu-6.10-desktop-i386.iso' | \
sed -e 's/^/ % /;s/ubuntu-6.10-desktop-i386.iso//' > mirror.lst

Now we have a perfectly acceptable mirror list for metalink, albeit a little simple. An example of a line is: % http://ubuntu.uz/releases/edgy/. Before the percent mark, space separated attributes can be added to the mirror.

To generate a Metalink, we now execute:

metalink --somedigests ubuntu-6.10-desktop-i386.iso < mirror.lst > ubuntu-6.10-desktop-i386.iso.metalink

And after some intense hashing (MD5, SHA1 and ED2K), we are done and can post ubuntu-6.10-desktop-i386.iso.metalink online.
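To give a feel for what this hashing step involves, here is a minimal Python sketch that computes several digests in a single pass over a file. It is not the code of the metalink program; the function name is hypothetical, and ED2K is left out because it builds on MD4, which Python's hashlib does not always provide.

```python
import hashlib
import io


def digests(stream, algorithms=("md5", "sha1"), chunk_size=1024 * 1024):
    """Read the stream once and return a hex digest for each algorithm."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    # Read in chunks so that a large ISO never has to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Demo on in-memory data; for a real file, pass open(path, "rb") instead.
print(digests(io.BytesIO(b"hello")))
```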

More advanced uses of metalink

Of course, there is more to Metalinks than what we just did. Now we will show you how to add more information to your Metalink. First, we add publisher information. Create a separate file, head.txt, with the following content:

  <publisher>
    <name>Ubuntu.com</name>
    <url>http://www.ubuntu.com/</url>
  </publisher>

Then, looking at the Metalink, we see that some mirrors don't have a location attribute. This is because metalink can only detect a mirror's country from a country-code top-level domain. For mirrors under .com, .net, .org and so on, we should add a country manually. We open our mirror.lst file and add more information using the syntax "[location [preference] [type] % ] <mirror base url>". We use this to add country codes to servers without them in their domain name and possibly add a preference and type (http, ftp, bittorrent).

Most download pages will tell you "please choose a location near you". By adding the location attribute, the download client can do just that and make the right choice at the client side. When it runs out of servers inside the country, the download client could use the preference to make a good choice (although this is totally up to the client implementation).
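The mirror list syntax described above can be read with a few lines of code. The following Python sketch is only an interpretation of that syntax for illustration, not the parser used by metalink; the function name is hypothetical, and treating any two-letter top-level domain as a country code is a simplifying assumption.

```python
from urllib.parse import urlparse


def parse_mirror_line(line):
    """Parse '[location [preference] [type] % ] <mirror base url>'."""
    attrs, sep, rest = line.partition("%")
    if not sep or "://" in attrs:
        # No attribute separator: the whole line is the base URL.
        attrs, rest = "", line
    base_url = rest.strip()
    fields = attrs.split()
    parsed = urlparse(base_url)

    location = fields[0] if len(fields) > 0 else None
    preference = int(fields[1]) if len(fields) > 1 else None
    mirror_type = fields[2] if len(fields) > 2 else parsed.scheme

    if location is None:
        # Fall back to the top-level domain when it looks like a country code.
        tld = (parsed.hostname or "").rsplit(".", 1)[-1]
        if len(tld) == 2 and tld.isalpha():
            location = tld

    return {
        "url": base_url,
        "location": location,
        "preference": preference,
        "type": mirror_type,
    }


print(parse_mirror_line(" % http://ubuntu.uz/releases/edgy/"))
print(parse_mirror_line("nl 15 % http://nl.releases.ubuntu.com/releases/edgy/"))
```

A mirror under a generic domain, such as ftp.osuosl.org, comes back with no location, which is exactly the case where a country has to be added by hand.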

For our example, find the following mirrors and add some attributes:

uk % http://ftp.ticklers.org/releases.ubuntu.org/releases/edgy/
nl 15 % http://nl.releases.ubuntu.com/releases/edgy/
it 10 % http://ubuntu.fastbull.org/ubuntu-releases/edgy/

This will put a higher preference on NL and IT servers and identify the Ticklers.org server as being in the UK. After this change to the mirror list, we regenerate our Metalink:

metalink --somedigests --headerfile head.txt --desc 'Ubuntu Linux 6.10 Desktop Live CD for i386 arch' \
 ubuntu-6.10-desktop-i386.iso < mirror.lst > ubuntu-6.10-desktop-i386.iso.metalink

Now we have a Metalink with our publisher in the header (--headerfile head.txt), a description of the content of the Metalink (--desc 'description'), and our new set of mirrors with country and preference set (< mirror.lst).

As shown, metalink allows you to create Metalink records fast. As the first tool to come out of the Metalinks tools project, there is still some work to be done. So the best thing to do for now is get it and post your feature requests at https://sourceforge.net/projects/metalinks.

What is RoPkg::Metalink?

RoPkg::Metalink is, in essence, a Perl module used to automatically generate Metalink files. RoPkg::Metalink stores information about files in a database.

The main advantages of RoPkg::Metalink (at this moment) are the automatically generated Metalinks and the high level of customization for generated Metalinks.

RoPkg::Metalink is well suited to owners of large online archives with many files who want to let their users download large files using Metalink. For example, it is used to make the official OpenOffice.org Metalinks, along with other Metalinks for openSUSE, Ubuntu, the Linux Kernel, Fedora, and Arch Linux.

Installing RoPkg::Metalink

Prerequisites for installing RoPkg::Metalink

After you have installed all the required modules, it is time to install the main module, RoPkg::Metalink.

Download the latest version from http://download.packages.ro/perl-foundry/ .

wget http://download.packages.ro/perl-foundry/RoPkg-Metalink-0.2.5.tar.gz

Unpack the archive, then build and install the module.

tar xvfz RoPkg-Metalink-0.2.5.tar.gz

cd RoPkg-Metalink-0.2.5
perl Makefile.PL
make
make install

Now, copy mt-gen somewhere on your file system (/usr/local/bin for example) and edit it. Search for the line where the configuration file is defined and modify the path to point to the location where you want to put your configuration file.

Next, we should create a directory for configuration files (and the template) used and copy the files there.

mkdir /usr/local/etc/metalink/
cp mt-gen.cfg metalink.tmpl /usr/local/etc/metalink/

Create a database (e.g. MySQL), grant access to it, and import the initial data.

mysqladmin create mtgen
mysql -e "grant all on mtgen.* to mtgen@localhost identified by 'mt-gen-passwd';"
mysql mtgen < metalink.mysql

The next step is to modify the configuration file (/usr/local/etc/metalink/mt-gen.cfg) to reflect your setup. After editing the configuration file, you're all set.

To run it, simply type mt-gen. Of course, you still need to add some projects and some mirrors to the database.

How RoPkg::Metalink works

RoPkg::Metalink is based on plugins. It has been designed for serving very large online archives. The main goal of RoPkg::Metalink is to automatically generate Metalink files for very popular projects (Fedora, Kernel, OpenOffice, etc).

RoPkg::Metalink does this by using plugins. This way, new functionality can be added without modifying the code base, just by adding new plugins. The main disadvantage of this approach is that for each project someone has to write a plugin. The main advantage is that Metalink files can be created specifically for that project.

So, how does the whole thing work? Each plugin specifies some conditions that have to be met by files (usually filenames) and the base directory in which the files will be searched. When new files are found, each file is hashed and the hash and file information is inserted in the database. After the file system has been searched, Metalink files are generated based on the information from the database.
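The notion of a plugin as "a base directory plus conditions on file names" can be sketched in a few lines. RoPkg::Metalink itself is written in Perl; the Python below is only an illustration of the idea, and the Plugin class, its fields and its method are hypothetical, not the RoPkg::Metalink plugin interface.

```python
import os
import re
import tempfile
from dataclasses import dataclass


@dataclass
class Plugin:
    """Hypothetical plugin: where to search and which file names qualify."""
    name: str
    base_dir: str
    pattern: str  # regular expression matched against each file name

    def matching_files(self):
        regex = re.compile(self.pattern)
        found = []
        for dirpath, _dirnames, filenames in os.walk(self.base_dir):
            for filename in filenames:
                if regex.search(filename):
                    found.append(os.path.join(dirpath, filename))
        return sorted(found)


# Demo in a temporary directory: only the ISO image meets the condition.
with tempfile.TemporaryDirectory() as base:
    for name in ("ubuntu-6.10-desktop-i386.iso", "README.txt"):
        open(os.path.join(base, name), "w").close()
    plugin = Plugin(name="ubuntu", base_dir=base, pattern=r"\.iso$")
    matches = [os.path.basename(path) for path in plugin.matching_files()]

print(matches)
```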

What happens when a file changes? Each time RoPkg::Metalink runs, it compares the file information from the database with the one on the file system. If the file modification time differs then the information from the database is modified and the Metalink is regenerated.

At the same time, RoPkg::Metalink checks if the files from the database still exist on the file system. If they don't, the files are deleted from the database and the Metalinks are removed too.

This way RoPkg::Metalink keeps only the latest information available for users.
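The update cycle described above (new or modified files are rehashed, vanished files are dropped) can be summarized in a short Python sketch. Plain dictionaries mapping a path to its modification time stand in for the database and the file system; this is an assumption made for illustration and not how RoPkg::Metalink stores its data.

```python
def sync(db, fs):
    """Bring db in line with fs and report what had to be done.

    db: {path: modification time recorded at the last run}
    fs: {path: modification time currently on disk}
    Returns (regenerate, remove) as sorted lists of paths.
    """
    # New files are absent from db; changed files have a different mtime.
    regenerate = sorted(p for p, mtime in fs.items() if db.get(p) != mtime)
    # Files recorded in db that no longer exist on disk.
    remove = sorted(p for p in db if p not in fs)

    for path in regenerate:
        db[path] = fs[path]      # rehash and regenerate the Metalink here
    for path in remove:
        del db[path]             # delete the record and its Metalink here
    return regenerate, remove


db = {"a.iso": 100, "b.iso": 200, "old.iso": 50}
fs = {"a.iso": 100, "b.iso": 250, "new.iso": 300}
print(sync(db, fs))
```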

Extending RoPkg::Metalink

As you have seen, RoPkg::Metalink is plugin based. If you want to extend RoPkg::Metalink, you should take a look at the source code of any plugin. Adding new plugins is easy. The only things you have to know are Perl programming and regular expressions.

Conclusion

There are a variety of solutions for generating Metalinks, from making a few quickly and easily to more automated systems for making many. One should fit the needs of your project. If you have any special needs or suggestions, please inquire at the specific tool's website.

Now that you've made your Metalinks, you can upload them to your site and link to them. You can set a MIME type of application/metalink+xml for .metalink files on your Web server. If you don't change the MIME type, the Metalink will display as text in Web browsers.
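How the MIME type is set depends on your Web server; on Apache, for instance, an AddType directive does this. As a neutral illustration of the effect, the following Python sketch uses the standard library's http.server to serve a directory with the Metalink MIME type added. The class and function names are hypothetical, and this is meant for local testing, not as a production server.

```python
import http.server


class MetalinkHandler(http.server.SimpleHTTPRequestHandler):
    # Copy the default extension map and add the Metalink MIME type.
    extensions_map = {
        **http.server.SimpleHTTPRequestHandler.extensions_map,
        ".metalink": "application/metalink+xml",
    }


def serve(port=8000):
    """Serve the current directory until interrupted (this call blocks)."""
    with http.server.HTTPServer(("", port), MetalinkHandler) as httpd:
        httpd.serve_forever()
```

Calling serve() from the directory that holds your .metalink files lets you check that a Metalink-aware download program receives the expected Content-Type header.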

Resources:

[1] - http://www.metalinker.org (Metalink WebSite)

[2] - http://www.metalinker.org/generator/ (Metalink Creator)

[3] - http://metalinks.sourceforge.net (Metalink tools project)

[4] - http://metalink.packages.ro (Metalink files for popular projects)

[5] - http://download.packages.ro (Various downloads including RoPkg::Metalink)

[6] - http://www.packages.ro (Announcements)

[7] - http://www.iasi.roedu.net (RoEduNet Iasi - Main RoPkg::Metalink sponsor)

[8] - http://ftp.iasi.roedu.net (RoEduNet Iasi Online Archive - large online archive, the first to implement Metalink)

[9] - http://www.linux.com/article.pl?sid=06/11/01/1641247 ("Downloading bliss with Metalink" by Mayank Sharma. Also translated into Japanese.)

[10] - http://www.freesoftwaremagazine.com/node/1779 ("Using Metalinks" by Anthony Bryan (For end users))

Last update: 2006-11-29 11:19
Author: Darius Liktorius, A. Bram Neijt, Manuel Subredu, Anthony Bryan
Revision: 1.0
