Supporting custom URL protocols



Introduction


In a prior post, I described an alternative to URLClassLoader (see that post) named URLClassLoaderX. While implementing URLClassLoaderX, I had to build in support for URLs with a custom protocol (“path”). I was not completely happy with the ad-hoc way I ended up supporting the “path” protocol, but since that post was not primarily about custom URL protocols, I did not spend much time improving that part of the code. In this post, I present a set of classes illustrating how to rapidly build support for generic URLs.


First of all, what exactly do I mean by “custom” or “generic” URLs? Java supports certain URL protocols (such as “file:” and “http:”) out of the box, so one can directly construct a URL for a file or an HTTP site as follows:


URL fileURL = new URL("file:/C:/var/tmp");


URL httpURL = new URL("http://www.cnn.com");


But try constructing a URL using new URL("mem:foo"), and you get a MalformedURLException, because no handler is registered for the “mem” protocol. It turns out that some extra work is required to support such “custom” URLs (see the documentation for “java.net.URLStreamHandlerFactory” for more information). Basically, one has to create a class implementing the URLStreamHandlerFactory interface, which hands out a URLStreamHandler for each protocol, and another class that extends the URLConnection class. In the code presented in this post, I illustrate a scheme that minimizes the number of classes that must be implemented to build this support.
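To make the failure concrete, here is a minimal sketch that checks which URL strings the JVM accepts on its own, and shows that the same “mem:foo” string parses once a handler is passed to the URL constructor explicitly. The class name ProtocolCheck and the stub handler are assumptions made for illustration; they are not part of the classes described in this post.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical helper class, used only for this illustration.
class ProtocolCheck {
    /** Returns true if new URL(spec) succeeds using only the JVM's own handlers. */
    static boolean parses(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false; // typically "unknown protocol"
        }
    }

    /** Returns true if the spec parses once an explicit handler is supplied. */
    static boolean parsesWithHandler(String spec) {
        // Stub handler: enough to let the URL be constructed, but it cannot open anything.
        URLStreamHandler stub = new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) throws IOException {
                throw new IOException("stub handler: opening is not implemented");
            }
        };
        try {
            new URL(null, spec, stub);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("file:/C:/var/tmp   -> " + parses("file:/C:/var/tmp"));
        System.out.println("mem:foo            -> " + parses("mem:foo"));
        System.out.println("mem:foo + handler  -> " + parsesWithHandler("mem:foo"));
    }
}
```

Recent JDKs mark the URL constructors as deprecated, so expect compiler warnings there; the calls still behave as described.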


There are two central classes in the codebase presented below, namely an interface named “com.subhajit.url.Content” and a class named “com.subhajit.url.GenericURLStreamHandlerFactory” that implements URLStreamHandlerFactory.


The Content interface, shown below, describes objects that are capable of accessing some data from somewhere:


public interface Content {
    /**
     * Returns an {@link InputStream} from which the content of this object can
     * be read.
     *
     * @return {@link InputStream} to the content of this object.
     * @throws IOException
     */
    InputStream openStream() throws IOException;


    /**
     * Returns a reference to the content of this object that is of a type equal
     * to the first match of the given {@link Class}s, or <tt>null</tt> if the
     * content cannot be returned as any of the given {@link Class}s.
     *
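     * @param klass candidate types, in order of preference
     * @return the content as an instance of the first matching type, or
     *         <tt>null</tt> if none of the types match
     * @throws IOException if the content cannot be read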
     */
    Object getContent(Class<?>... klass) throws IOException;


    /**
     * Returns the {@link URL} representing this object.
     *
     * @return
     * @throws MalformedURLException
     */
    URL toURL() throws MalformedURLException;


    /**
     * Establishes a connection to the content of this object.
     *
     * @throws IOException
     */
    void connect() throws IOException;
}


The Content interface has a default abstract implementation named “AbstractContent”, which defines some abstract methods that subclasses must implement, and some methods that subclasses may optionally override:


public abstract class AbstractContent implements Content {
    protected abstract String getProtocol();


    protected abstract String getUrlFile();


    protected URL url;


    protected String getHost() {
        return null;
    }


    protected int getPort() {
        // -1 tells java.net.URL that no explicit port is set.
        return -1;
    }


    public void connect() throws IOException {
    }


    public final URL toURL() throws MalformedURLException {
        if (url == null) {
            url = new URL(getProtocol(), getHost(), getPort(),
                    getUrlFile(),
                    new GenericURLStreamHandlerFactory()
                            .createURLStreamHandler(
                                    getProtocol()));
            new URLMapperFactory().getURLMapper().putData(
                    url.toString(), this);
        }
        return url;
    }
}


The GenericURLStreamHandlerFactory class, which is another piece of the puzzle, is shown below:


public class GenericURLStreamHandlerFactory implements URLStreamHandlerFactory {
    public URLStreamHandler createURLStreamHandler(String protocol) {
        return new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) throws IOException {
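                final Content data = new URLMapperFactory().getURLMapper()
                        .getData(url);
                if (data == null) {
                    // Fail clearly instead of with a NullPointerException later on.
                    throw new IOException("No content registered for URL " + url);
                }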
                return new URLConnection(url) {
                    @Override
                    public void connect() throws IOException {
                        data.connect();
                    }


                    @Override
                    public Object getContent() throws IOException {
                        return data.getContent();
                    }


                    @SuppressWarnings("unchecked")
                    @Override
                    public Object getContent(Class[] classes)
                            throws IOException {
                        return data.getContent(classes);
                    }


                    @Override
                    public InputStream getInputStream() throws IOException {
                        return data.openStream();
                    }
                };
            }
        };
    }
}


Note how the overridden methods of the anonymous URLConnection delegate invocations to the enclosed “data” object, and how the “data” object is obtained from the URL via a “URLMapperFactory” class.


The URLMapperFactory class creates URLMapper instances, which map URLs to Content objects. URLMapper is actually an interface:


public interface URLMapper {
    Content getData(URL url);
    void putData(String urlStr, Content data);
}


The default implementation of this interface is GenericURLMapper, which is a glorified wrapper around a concurrent map:


class GenericURLMapper implements URLMapper {
    private static final GenericURLMapper instance = new GenericURLMapper();
    private final ConcurrentMap<String, Content> map = new ConcurrentHashMap<String, Content>();


    public Content getData(URL url) {
        return map.get(url.toString());
    }


    public void putData(String urlStr, Content data) {
        map.put(urlStr, data);
    }


    public static URLMapper getInstance() {
        return instance;
    }
}
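The URLMapperFactory class itself is not listed in this post. As a rough sketch of how it could tie in, the code below assumes that the factory simply hands out the shared GenericURLMapper instance, so that content stored through one factory object is visible through another. The body of URLMapperFactory, the marker-only Content interface, and the MapperDemo class are all assumptions made so that the example compiles on its own.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical driver class for this illustration.
class MapperDemo {
    /** Stores content through one factory object and looks it up through another. */
    static boolean sharedAcrossFactories() {
        try {
            // A built-in protocol is used here so that no custom handler is needed;
            // the path is purely illustrative and is never opened.
            URL url = new URL("file:/tmp/example.txt");
            Content content = new Content() { };
            new URLMapperFactory().getURLMapper().putData(url.toString(), content);
            return new URLMapperFactory().getURLMapper().getData(url) == content;
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sharedAcrossFactories());
    }
}

// Marker stand-in for com.subhajit.url.Content, reduced so the sketch compiles alone.
interface Content { }

interface URLMapper {
    Content getData(URL url);
    void putData(String urlStr, Content data);
}

class GenericURLMapper implements URLMapper {
    private static final GenericURLMapper INSTANCE = new GenericURLMapper();
    private final ConcurrentMap<String, Content> map = new ConcurrentHashMap<>();

    public Content getData(URL url) { return map.get(url.toString()); }
    public void putData(String urlStr, Content data) { map.put(urlStr, data); }
    static URLMapper getInstance() { return INSTANCE; }
}

// Assumption: the real class is not shown in the post; this version returns the singleton.
class URLMapperFactory {
    URLMapper getURLMapper() { return GenericURLMapper.getInstance(); }
}
```

One consequence of this design is that entries stay in the map for the life of the JVM unless something removes them, which may matter if many short-lived Content objects are turned into URLs.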



Application


So, given these classes, how do you go about implementing a custom URL? It turns out that all you have to do is implement the Content interface. Before looking at implementations, here is an example of how you might create a custom URL with the "jarentry" protocol:



URL jarEntryUrl = new JarEntryContent(new File("pathToJarFile"), "my/company/classes/Class1.class").toURL();

Note that custom URLs are obtained indirectly from their Content objects, not constructed directly. The entry name has no leading slash, because entry names inside a JAR file are relative paths.

As an example of implementing the Content interface, let us assume that you wish to implement URLs with a “mem” protocol. These URLs access content stored in memory. The “MemoryContent” class shown below shows you how to do this:


public class MemoryContent extends AbstractContent implements Content {
    private final byte[] bytes;
    public MemoryContent(byte[] bytes) {
        super();
        this.bytes = bytes;
    }


    public Object getContent(Class<?>... klasses) throws IOException {
        if (klasses.length == 0) {
            return openStream();
        }
        for (Class<?> klass : klasses) {
            if (klass == byte[].class) {
                byte[] copy = new byte[bytes.length];
                System.arraycopy(bytes, 0, copy, 0, bytes.length);
                return copy;
            } else if (klass == InputStream.class) {
                return openStream();
            }
        }
        return null;
    }


    public InputStream openStream() throws IOException {
        return new ByteArrayInputStream(bytes);
    }


    @Override
    protected String getProtocol() {
        return "mem";
    }


    @Override
    protected String getUrlFile() {
        return this.toString();
    }
}
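To show the whole round trip in one place, here is a condensed, self-contained sketch of the same idea: bytes are registered under a “mem” URL, and reading the URL later fetches them back through the handler. It does not use the classes from this post. MemUrlSketch, its static registry, and the URL names are stand-ins invented for the example, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical condensed version of the scheme; not the post's actual classes.
class MemUrlSketch {
    // Stand-in for the URLMapper: URL string -> registered bytes.
    private static final ConcurrentMap<String, byte[]> REGISTRY = new ConcurrentHashMap<>();

    // Stand-in for the handler produced by GenericURLStreamHandlerFactory.
    private static final URLStreamHandler HANDLER = new URLStreamHandler() {
        @Override
        protected URLConnection openConnection(URL url) {
            return new URLConnection(url) {
                @Override
                public void connect() { }

                @Override
                public InputStream getInputStream() throws IOException {
                    byte[] bytes = REGISTRY.get(getURL().toString());
                    if (bytes == null) {
                        throw new IOException("No content registered for " + getURL());
                    }
                    return new ByteArrayInputStream(bytes);
                }
            };
        }
    };

    /** Plays the role of toURL(): build the URL with the handler, then register the bytes. */
    static URL register(String name, byte[] bytes) throws MalformedURLException {
        URL url = new URL("mem", null, -1, name, HANDLER);
        REGISTRY.put(url.toString(), bytes);
        return url;
    }

    /** Reads everything behind the URL as UTF-8 text. */
    static String read(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Registers text under the given name and reads it back through the URL. */
    static String roundTrip(String name, String text) {
        try {
            return read(register(name, text.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("greeting", "hello from memory"));
    }
}
```

Because the handler travels with the URL object, code that only receives the URL can call openStream() on it without knowing anything about the “mem” protocol.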


As another example, the “JarEntryContent” class, shown below, provides access to the bytes of a named entry in a specified JAR file.


public class JarEntryContent extends AbstractContent implements Content {
    private final File jarFile;
    private final String entry;
    private final List<byte[]> holder;


    public JarEntryContent(File jarFile, String entry) {
        super();
        this.jarFile = jarFile;
        this.entry = entry;
        this.holder = new ArrayList<byte[]>();
    }


    private byte[] getBytes() throws ZipException, IOException {
        if (holder.isEmpty()) {
            ZipFile zipFile = null;
            try {
                zipFile = new ZipFile(jarFile);
                InputStream in = zipFile.getInputStream(new ZipEntry(entry));
                if (in == null) {
                    throw new IOException("Zip file contains no such entry - "
                            + jarFile.getAbsolutePath() + "\t" + entry);
                }
                try {
                    holder.add(StreamUtils.readFully(in));
                } finally {
                    in.close();
                }
            } finally {
                if (zipFile != null) {
                    zipFile.close();
                }
            }
        }
        return holder.get(0);
    }


    public Object getContent(Class<?>... klasses) throws IOException {
        if (klasses.length == 0) {
            return getContent(byte[].class);
        }
        for (Class<?> klass : klasses) {
            if (klass == byte[].class) {
                byte[] copy = new byte[getBytes().length];
                System.arraycopy(getBytes(), 0, copy, 0, copy.length);
                return copy;
            } else if (klass == InputStream.class) {
                return new ByteArrayInputStream(getBytes());
            }
        }
        return null;
    }


    public InputStream openStream() throws IOException {
        return (InputStream) getContent(InputStream.class);
    }


    @Override
    protected String getProtocol() {
        return "jarentry";
    }


    @Override
    protected String getUrlFile() {
        return jarFile.getName() + "#" + entry;
    }


}
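Since getBytes() passes the entry name straight to ZipFile, the name has to match the stored entry exactly. The sketch below, which uses only java.util.zip, writes a throwaway archive and shows that a lookup with a leading slash does not find an entry stored without one. The class JarEntryNameCheck and the temporary file prefix are made up for this illustration.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, used only for this illustration.
class JarEntryNameCheck {
    /** Creates a temporary archive holding a single small entry with the given name. */
    static File writeArchive(String entryName) throws IOException {
        File file = File.createTempFile("jarentry-demo", ".jar");
        file.deleteOnExit();
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(file))) {
            out.putNextEntry(new ZipEntry(entryName));
            out.write(new byte[] {1, 2, 3});
            out.closeEntry();
        }
        return file;
    }

    /** Returns true if the archive contains an entry matching the name exactly. */
    static boolean hasEntry(File archive, String name) throws IOException {
        try (ZipFile zip = new ZipFile(archive)) {
            return zip.getEntry(name) != null;
        }
    }

    /** Stores storedName in a temporary archive, then looks up lookupName. */
    static boolean found(String storedName, String lookupName) {
        try {
            return hasEntry(writeArchive(storedName), lookupName);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String stored = "my/company/classes/Class1.class";
        System.out.println("exact name:    " + found(stored, stored));
        System.out.println("leading slash: " + found(stored, "/" + stored));
    }
}
```

With JarEntryContent as written, a name that does not match would surface as the "Zip file contains no such entry" IOException on first read, not when the URL is created.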




Source Code


Complete source code for these classes can be found here. To use, download the zip file into a temporary directory and unzip it. This should create the following files:


a) generic-url.jar, which contains a built version of these classes,


b) src.zip, which contains the source code, and


c) util-lite.jar, which is a small utility library that I use. I will be happy to post complete source code for util-lite.jar upon request.




 
