Java NIO编程 : Buffer

Published on 2017 - 05 - 04

An Example Client

Although the new I/O APIs aren’t specifically designed for clients, they do work for them. I’m going to begin with a client program using the new I/O APIs because it’s a little simpler. In particular, many clients can be implemented with one connection at a time, so I can introduce channels and buffers before talking about selectors and nonblocking I/O.

A simple client for the character generator protocol defined in RFC 864 will demonstrate the basics. This protocol is designed for testing clients. The server listens for connections on port 19. When a client connects, the server sends a continuous sequence of characters until the client disconnects. Any input from the client is ignored. The RFC does not specify which character sequence to send, but recommends that the server use a recognizable pattern. One common pattern is rotating, 72-character carriage return/linefeed delimited lines of the 95 ASCII printing characters, like this:

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghi
#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijk
%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijkl
&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklm

I picked this protocol for the examples in this chapter because both the protocol for transmitting the data and the algorithm to generate the data are simple enough that they won’t obscure the I/O. However, chargen can transmit a lot of data over a relatively few connections and quickly saturate a network. It’s thus a good candidate for the new I/O APIs.

When implementing a client that takes advantage of the new I/O APIs, begin by invoking the static factory method SocketChannel.open() to create a new java.nio.channels.SocketChannel object. The argument to this method is a java.net.SocketAddress object indicating the host and port to connect to. For example, this fragment connects the channel to rama.poly.edu on port 19:

SocketAddress rama = new InetSocketAddress("rama.poly.edu", 19);
SocketChannel client = SocketChannel.open(rama);

The channel is opened in blocking mode, so the next line of code won’t execute until the connection is established. If the connection can’t be established, an IOException is thrown.

If this were a traditional client, you’d now ask for the socket’s input and/or output streams. However, it’s not. With a channel you write directly to the channel itself. Rather than writing byte arrays, you write ByteBuffer objects. You’ve got a pretty good idea that the lines of text are 74 ASCII characters long (72 printable characters followed by a carriage return/linefeed pair) so you’ll create a ByteBuffer that has a 74-byte capacity using the static allocate() method:

ByteBuffer buffer = ByteBuffer.allocate(74);

Pass this ByteBuffer object to the channel’s read() method. The channel fills this buffer with the data it reads from the socket. It returns the number of bytes it successfully read and stored in the buffer:

int bytesRead = client.read(buffer);

By default, this will read at least one byte or return –1 to indicate the end of the data, exactly as an InputStream does. It will often read more bytes if more bytes are available to be read. Shortly you’ll see how to put this client in nonblocking mode where it will return 0 immediately if no bytes are available, but for the moment this code blocks just like an InputStream. As you could probably guess, this method can also throw an IOException if anything goes wrong with the read.

Assuming there is some data in the buffer—that is, n > 0—this data can be copied to System.out. There are ways to extract a byte array from a ByteBuffer that can then be written on a traditional OutputStream such as System.out. However, it’s more informative to stick with a pure, channel-based solution. Such a solution requires wrapping the OutputStream System.out in a channel using the Channels utility class, specifically, its newChannel() method:

WritableByteChannel output = Channels.newChannel(System.out);

You can then write the data that was read onto this output channel connected to System.out. However, before you do that, you have to flip the buffer so that the output channel starts from the beginning of the data that was read rather than the end:

buffer.flip();
output.write(buffer);

You don’t have to tell the output channel how many bytes to write. Buffers keep track of how many bytes they contain. However, in general, the output channel is not guaranteed to write all the bytes in the buffer. In this specific case, though, it’s a blocking channel and it will either do so or throw an IOException.

You shouldn’t create a new buffer for each read and write. That would kill the performance. Instead, reuse the existing buffer. You’ll need to clear the buffer before reading into it again:

buffer.clear();

This is a little different than flipping. Flipping leaves the data in the buffer intact, but prepares it for writing rather than reading. Clearing resets the buffer to a pristine state. (Actually that’s a tad simplistic. The old data is still present; it’s not overwritten, but it will be overwritten with new data read from the source as soon as possible.)

Example 1 puts this together into a complete client. Because chargen is by design an endless protocol, you’ll need to kill the program using Ctrl-C.

import java.nio.*;
import java.nio.channels.*;
import java.net.*;
import java.io.IOException;

public class ChargenClient {

  public static int DEFAULT_PORT = 19;

  public static void main(String[] args) {

    if (args.length == 0) {
      System.out.println("Usage: java ChargenClient host [port]");
      return;
    }

    int port;
    try {
      port = Integer.parseInt(args[1]);
    } catch (RuntimeException ex) {
      port = DEFAULT_PORT;
    }

    try {
      SocketAddress address = new InetSocketAddress(args[0], port);
      SocketChannel client = SocketChannel.open(address);

      ByteBuffer buffer = ByteBuffer.allocate(74);
      WritableByteChannel out = Channels.newChannel(System.out);

      while (client.read(buffer) != -1) {
        buffer.flip();
        out.write(buffer);
        buffer.clear();
      }
    } catch (IOException ex) {
      ex.printStackTrace();
    }
  }
}

Here’s the output from a sample run:

$ java ChargenClient rama.poly.edu
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghi
#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghij
$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijk
%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijkl
&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklm
...

So far, this is just an alternate vision of a program that could have easily been written using streams. The really new feature comes if you want the client to do something besides copying all input to output. You can run this connection in either blocking or nonblocking mode in which read() returns immediately even if no data is available. This allows the program to do something else before it attempts to read. It doesn’t have to wait for a slow network connection. To change the blocking mode, pass true (block) or false (don’t block) to the configureBlocking() method. Let’s make this connection nonblocking:

client.configureBlocking(false);

In nonblocking mode, read() may return 0 because it doesn’t read anything. Therefore, the loop needs to be a little different:

while (true) {
  // Put whatever code here you want to run every pass through the loop
  // whether anything is read or not
  int n = client.read(buffer);
  if (n > 0) {
     buffer.flip();
     out.write(buffer);
     buffer.clear();
  } else if (n == -1) {
    // This shouldn't happen unless the server is misbehaving.
    break;
  }
}

There’s not a lot of call for this in a one-connection client like this one. Perhaps you could check to see if the user has done something to cancel input, for example. However, as you’ll see in the next section, when a program is processing multiple connections, this enables code to run very quickly on the fast connections and more slowly on the slow ones. Each connection gets to run at its own speed without being held up behind the slowest driver on the one-lane road.

An Example Server

Clients are well and good, but channels and buffers are really intended for server systems that need to process many simultaneous connections efficiently. Handling servers requires a third new piece in addition to the buffers and channels used for the client. Specifically, you need selectors that allow the server to find all the connections that are ready to receive output or send input.

To demonstrate the basics, this example implements a simple server for the character generator protocol. When implementing a server that takes advantage of the new I/O APIs, begin by calling the static factory method ServerSocketChannel.open() method to create a new ServerSocketChannel object:

ServerSocketChannel serverChannel = ServerSocketChannel.open();

Initially, this channel is not actually listening on any port. To bind it to a port, retrieve its ServerSocket peer object with the socket() method and then use the bind() method on that peer. For example, this code fragment binds the channel to a server socket on port 19:

ServerSocket ss = serverChannel.socket();
ss.bind(new InetSocketAddress(19));

In Java 7 and later, you can bind directly without retrieving the underlying java.net.ServerSocket:

serverChannel.bind(new InetSocketAddress(19));

The server socket channel is now listening for incoming connections on port 19. To accept one, call the accept() method, which returns a SocketChannel object:

SocketChannel clientChannel = serverChannel.accept();

On the server side, you’ll definitely want to make the client channel nonblocking to allow the server to process multiple simultaneous connections:

clientChannel.configureBlocking(false);

You may also want to make the ServerSocketChannel nonblocking. By default, this accept() method blocks until there’s an incoming connection, like the accept() method of ServerSocket. To change this, simply call configureBlocking(false) before calling accept():

serverChannel.configureBlocking(false);

A nonblocking accept() returns null almost immediately if there are no incoming connections. Be sure to check for that or you’ll get a nasty NullPointerException when trying to use the socket.

There are now two open channels: a server channel and a client channel. Both need to be processed. Both can run indefinitely. Furthermore, processing the server channel will create more open client channels. In the traditional approach, you assign each connection a thread, and the number of threads climbs rapidly as clients connect. Instead, in the new I/O API, you create a Selector that enables the program to iterate over all the connections that are ready to be processed. To construct a new Selector, just call the static Selector.open() factory method:

Selector selector = Selector.open();

Next, you need to register each channel with the selector that monitors it using the channel’s register() method. When registering, specify the operation you’re interested in using a named constant from the SelectionKey class. For the server socket, the only operation of interest is OP_ACCEPT; that is, is the server socket channel ready to accept a new connection?

serverChannel.register(selector, SelectionKey.OP_ACCEPT);

For the client channels, you want to know something a little different—specifically, whether they’re ready to have data written onto them. For this, use the OP_WRITE key:

SelectionKey key = clientChannel.register(selector, SelectionKey.OP_WRITE);

Both register() methods return a SelectionKey object. However, you’re only going to need to use that key for the client channels, because there can be more than one of them. Each SelectionKey has an attachment of arbitrary Object type. This is normally used to hold an object that indicates the current state of the connection. In this case, you can store the buffer that the channel writes onto the network. Once the buffer is fully drained, you’ll refill it. Fill an array with the data that will be copied into each buffer. Rather than writing to the end of the buffer, and then rewinding to the beginning of the buffer and writing again, it’s easier just to start with two sequential copies of the data so every line is available as a contiguous sequence in the array:

byte[] rotation = new byte[95*2];
for (byte i = ' '; i <= '~'; i++) {
  rotation[i - ' '] = i;
  rotation[i + 95 - ' '] = i;
}

Because this array will only be read from after it’s been initialized, you can reuse it for multiple channels. However, each channel will get its own buffer filled with the contents of this array. You’ll stuff the buffer with the first 72 bytes of the rotation array, then add a carriage return/linefeed pair to break the line. Then you’ll flip the buffer so it’s ready for draining, and attach it to the channel’s key:

ByteBuffer buffer = ByteBuffer.allocate(74);
buffer.put(rotation, 0, 72);
buffer.put((byte) '\r');
buffer.put((byte) '\n');
buffer.flip();
key2.attach(buffer);

To check whether anything is ready to be acted on, call the selector’s select() method. For a long-running server, this normally goes in an infinite loop:

while (true) {
  selector.select ();
  // process selected keys...
}

Assuming the selector does find a ready channel, its selectedKeys() method returns a java.util.Set containing one SelectionKey object for each ready channel. Otherwise, it returns an empty set. In either case, you can loop through this with a java.util.Iterator:

Set<SelectionKey> readyKeys = selector.selectedKeys();
Iterator iterator = readyKeys.iterator();
while (iterator.hasNext()) {
  SelectionKey key = iterator.next();
  // Remove key from set so we don't process it twice
  iterator.remove();
  // operate on the channel...
}

Removing the key from the set tells the Selector that you’ve dealt with it, and the Selector doesn’t need to keep giving it back every time you call select(). The Selector will add the channel back into the ready set when select() is called again if the channel becomes ready again. It’s really important to remove the key from the ready set here, though.

If the ready channel is the server channel, the program accepts a new socket channel and adds it to the selector. If the ready channel is a socket channel, the program writes as much of the buffer as it can onto the channel. If no channels are ready, the selector waits for one. One thread, the main thread, processes multiple simultaneous connections.

In this case, it’s easy to tell whether a client or a server channel has been selected because the server channel will only be ready for accepting and the client channels will only be ready for writing. Both of these are I/O operations, and both can throw IOExceptions for a variety of reasons, so you’ll want to wrap this all in a try block:

try {
  if (key.isAcceptable()) {
    ServerSocketChannel server = (ServerSocketChannel) key.channel();
    SocketChannel connection = server.accept();
    connection.configureBlocking(false);
    connection.register(selector, SelectionKey.OP_WRITE);
    // set up the buffer for the client...
  } else if (key.isWritable()) {
    SocketChannel client = (SocketChannel) key.channel();
    // write data to client...
  }
}

Writing the data onto the channel is easy. Retrieve the key’s attachment, cast it to ByteBuffer, and call hasRemaining() to check whether there’s any unwritten data left in the buffer. If there is, write it. Otherwise, refill the buffer with the next line of data from the rotation array and write that.

ByteBuffer buffer = (ByteBuffer) key.attachment();
if (!buffer.hasRemaining()) {
  // Refill the buffer with the next line
  // Figure out where the last line started
  buffer.rewind();
  int first = buffer.get();
  // Increment to the next character
  buffer.rewind();
  int position = first - ' ' + 1;
  buffer.put(rotation, position, 72);
  buffer.put((byte) '\r');
  buffer.put((byte) '\n');
  buffer.flip();
}
client.write(buffer);

The algorithm that figures out where to grab the next line of data relies on the characters being stored in the rotation array in ASCII order. buffer.get() reads the first byte of data from the buffer. From this number you subtract the space character (32) because that’s the first character in the rotation array. This tells you which index in the array the buffer currently starts at. You add 1 to find the start of the next line and refill the buffer.

In the chargen protocol, the server never closes the connection. It waits for the client to break the socket. When this happens, an exception will be thrown. Cancel the key and close the corresponding channel:

catch (IOException ex) {
  key.cancel();
  try {
    key.channel().close();
  } catch (IOException cex) {
    // ignore
  }
}

Example 2 puts this all together in a complete chargen server that processes multiple connections efficiently in a single thread.

import java.nio.*;
import java.nio.channels.*;
import java.net.*;
import java.util.*;
import java.io.IOException;

public class ChargenServer {

  public static int DEFAULT_PORT = 19;

  public static void main(String[] args) {

    int port;
    try {
      port = Integer.parseInt(args[0]);
    } catch (RuntimeException ex) {
      port = DEFAULT_PORT;
    }
    System.out.println("Listening for connections on port " + port);

    byte[] rotation = new byte[95*2];
    for (byte i = ' '; i <= '~'; i++) {
      rotation[i -' '] = i;
      rotation[i + 95 - ' '] = i;
    }

    ServerSocketChannel serverChannel;
    Selector selector;
    try {
      serverChannel = ServerSocketChannel.open();
      ServerSocket ss = serverChannel.socket();
      InetSocketAddress address = new InetSocketAddress(port);
      ss.bind(address);
      serverChannel.configureBlocking(false);
      selector = Selector.open();
      serverChannel.register(selector, SelectionKey.OP_ACCEPT);
    } catch (IOException ex) {
      ex.printStackTrace();
      return;
    }

    while (true) {
      try {
        selector.select();
      } catch (IOException ex) {
        ex.printStackTrace();
        break;
      }
     Set<SelectionKey> readyKeys = selector.selectedKeys();
      Iterator<SelectionKey> iterator = readyKeys.iterator();
      while (iterator.hasNext()) {

        SelectionKey key = iterator.next();
        iterator.remove();
        try {
          if (key.isAcceptable()) {
            ServerSocketChannel server = (ServerSocketChannel) key.channel();
            SocketChannel client = server.accept();
            System.out.println("Accepted connection from " + client);
            client.configureBlocking(false);
            SelectionKey key2 = client.register(selector, SelectionKey.
                                                                    OP_WRITE);
            ByteBuffer buffer = ByteBuffer.allocate(74);
            buffer.put(rotation, 0, 72);
            buffer.put((byte) '\r');
            buffer.put((byte) '\n');
            buffer.flip();
            key2.attach(buffer);
          } else if (key.isWritable()) {
            SocketChannel client = (SocketChannel) key.channel();
            ByteBuffer buffer = (ByteBuffer) key.attachment();
            if (!buffer.hasRemaining()) {
              // Refill the buffer with the next line
              buffer.rewind();
              // Get the old first character
              int first = buffer.get();
              // Get ready to change the data in the buffer
              buffer.rewind();
              // Find the new first characters position in rotation
              int position = first - ' ' + 1;
              // copy the data from rotation into the buffer
              buffer.put(rotation, position, 72);
              // Store a line break at the end of the buffer
              buffer.put((byte) '\r');
              buffer.put((byte) '\n');
              // Prepare the buffer for writing
              buffer.flip();
            }
            client.write(buffer);
          }
        } catch (IOException ex) {
          key.cancel();
          try {
            key.channel().close();
          }
          catch (IOException cex) {}
        }
      }
    }
  }
}

socket只要send buffer不满就可以写,刚开始send buffer为空,socket总是可以写,于是server.select()立即返回可写状态

This example only uses one thread. There are situations where you might still want to use multiple threads, especially if different operations have different priorities. For instance, you might want to accept new connections in one high-priority thread and service existing connections in a lower-priority thread. However, you’re no longer required to have a 1:1 ratio between threads and connections, which improves the scalability of servers written in Java.

Buffers

I recommended that you always buffer your streams. Almost nothing has a greater impact on the performance of network programs than a big enough buffer. In the new I/O model, you’re no longer given the choice. All I/O is buffered. Indeed, the buffers are fundamental parts of the API. Instead of writing data onto output streams and reading data from input streams, you read and write data from buffers. Buffers may appear to be just an array of bytes as in buffered streams. However, native implementations can connect them directly to hardware or memory or use other, very efficient implementations.

From a programming perspective, the key difference between streams and channels is that streams are byte-based whereas channels are block-based. A stream is designed to provide one byte after the other, in order. Arrays of bytes can be passed for performance. However, the basic notion is to pass data one byte at a time. By contrast, a channel passes blocks of data around in buffers. Before bytes can be read from or written to a channel, the bytes have to be stored in a buffer, and the data is written or read one buffer at a time.

The second key difference between streams and channels/buffers is that channels and buffers tend to support both reading and writing on the same object. This isn’t always true. For instance, a channel that points to a file on a CD-ROM can be read but not written. A channel connected to a socket that has shutdown input could be written but not read. If you try to write to a read-only channel or read from a write-only channel, an UnsupportedOperationException will be thrown. However, more often than not network programs can read from and write to the same channels.

Without worrying too much about the underlying details (which can vary hugely from one implementation to the next, mostly as a result of being tuned very closely to the host operating system and hardware), you can think of a buffer as a fixed-size list of elements of a particular, normally primitive data type, like an array. However, it’s not necessarily an array behind the scenes. Sometimes it is; sometimes it isn’t. There are specific subclasses of Buffer for all of Java’s primitive data types except boolean: ByteBuffer, CharBuffer, ShortBuffer, IntBuffer, LongBuffer, FloatBuffer, and DoubleBuffer. The methods in each subclass have appropriately typed return values and argument lists. For example, the DoubleBuffer class has methods to put and get doubles. The IntBuffer class has methods to put and get ints. The common Buffer superclass only provides methods that don’t need to know the type of the data the buffer contains. (The lack of primitive-aware generics really hurts here.) Network programs use ByteBuffer almost exclusively, although occasionally one program might use a view that overlays the ByteBuffer with one of the other types.

Besides its list of data, each buffer tracks four key pieces of information. All buffers have the same methods to set and get these values, regardless of the buffer’s type:

position

The next location in the buffer that will be read from or written to. This starts counting at 0 and has a maximum value equal to the size of the buffer. It can be set or gotten with these two methods:

public final int    position()
public final Buffer position(int newPosition)

capacity

The maximum number of elements the buffer can hold. This is set when the buffer is created and cannot be changed thereafter. It can be read with this method:

public final int capacity()

limit

The end of accessible data in the buffer. You cannot write or read at or past this point without changing the limit, even if the buffer has more capacity. It is set and gotten with these two methods:

public final int    limit()
public final Buffer limit(int newLimit)

mark

A client-specified index in the buffer. It is set at the current position by invoking the mark() method. The current position is set to the marked position by invoking reset():

public final Buffer mark()
public final Buffer reset()

If the position is set below an existing mark, the mark is discarded.

Unlike reading from an InputStream, reading from a buffer does not actually change the buffer’s data in any way. It’s possible to set the position either forward or backward so you can start reading from a particular place in the buffer. Similarly, a program can adjust the limit to control the end of the data that will be read. Only the capacity is fixed.

The common Buffer superclass also provides a few other methods that operate by reference to these common properties.

The clear() method “empties” the buffer by setting the position to zero and the limit to the capacity. This allows the buffer to be completely refilled:

public final Buffer clear()

However, the clear() method does not remove the old data from the buffer. It’s still present and could be read using absolute get methods or changing the limit and position again.

The rewind() method sets the position to zero, but does not change the limit:

public final Buffer rewind()

This allows the buffer to be reread.

The flip() method sets the limit to the current position and the position to zero:

public final Buffer flip()

It is called when you want to drain a buffer you’ve just filled.

Finally, there are two methods that return information about the buffer but don’t change it. The remaining() method returns the number of elements in the buffer between the current position and the limit. The hasRemaining() method returns true if the number of remaining elements is greater than zero:

public final int     remaining()
public final boolean hasRemaining()

Creating Buffers

The buffer class hierarchy is based on inheritance but not really on polymorphism, at least not at the top level. You normally need to know whether you’re dealing with an IntBuffer or a ByteBuffer or a CharBuffer or something else. You write code to one of these subclasses, not to the common Buffer superclass.

Each typed buffer class has several factory methods that create implementation-specific subclasses of that type in various ways. Empty buffers are normally created by allocate methods. Buffers that are prefilled with data are created by wrap methods. The allocate methods are often useful for input, and the wrap methods are normally used for output.

Allocation

The basic allocate() method simply returns a new, empty buffer with a specified fixed capacity. For example, these lines create byte and int buffers, each with a size of 100:

ByteBuffer buffer1 = ByteBuffer.allocate(100);
IntBuffer  buffer2 = IntBuffer.allocate(100);

The cursor is positioned at the beginning of the buffer (i.e., the position is 0). A buffer created by allocate() will be implemented on top of a Java array, which can be accessed by the array() and arrayOffset() methods. For example, you could read a large chunk of data into a buffer using a channel and then retrieve the array from the buffer to pass to other methods:

byte[] data1 = buffer1.array();
int[]  data2 = buffer2.array();

The array() method does expose the buffer’s private data, so use it with caution. Changes to the backing array are reflected in the buffer and vice versa. The normal pattern here is to fill the buffer with data, retrieve its backing array, and then operate on the array. This isn’t a problem as long as you don’t write to the buffer after you’ve started working with the array.

Direct allocation

The ByteBuffer class (but not the other buffer classes) has an additional allocateDirect() method that may not create a backing array for the buffer. The VM may implement a directly allocated ByteBuffer using direct memory access to the buffer on an Ethernet card, kernel memory, or something else. It’s not required, but it’s allowed, and this can improve performance for I/O operations. From an API perspective, allocateDirect() is used exactly like allocate():

ByteBuffer buffer = ByteBuffer.allocateDirect(100);

Invoking array() and arrayOffset() on a direct buffer will throw an UnsupportedOperationException. Direct buffers may be faster on some virtual machines, especially if the buffer is large (roughly a megabyte or more). However, direct buffers are more expensive to create than indirect buffers, so they should only be allocated when the buffer is expected to be around for a while. The details are highly VM dependent. As is generally true for most performance advice, you probably shouldn’t even consider using direct buffers until measurements prove performance is an issue.

Wrapping

If you already have an array of data that you want to output, you’ll normally wrap a buffer around it, rather than allocating a new buffer and copying its components into the buffer one at a time. For example:

byte[] data = "Some data".getBytes("UTF-8");
ByteBuffer buffer1 = ByteBuffer.wrap(data);
char[] text = "Some text".toCharArray();
CharBuffer buffer2 = CharBuffer.wrap(text);

Here, the buffer contains a reference to the array, which serves as its backing array. Buffers created by wrapping are never direct. Again, changes to the array are reflected in the buffer and vice versa, so don’t wrap the array until you’re finished with it.

Filling and Draining

Buffers are designed for sequential access. Recall that each buffer has a curent position identified by the position() method that is somewhere between zero and the number of elements in the buffer, inclusive. The buffer’s position is incremented by one when an element is read from or written to the buffer. For example, suppose you allocate a CharBuffer with capacity 12, and fill it by putting five characters into it:

CharBuffer buffer = CharBuffer.allocate(12);
buffer.put('H');
buffer.put('e');
buffer.put('l');
buffer.put('l');
buffer.put('o');

The position of the buffer is now 5. This is called filling the buffer.

You can only fill the buffer up to its capacity. If you tried to fill it past its initially set capacity, the put() method would throw a BufferOverflowException.

If you now tried to get() from the buffer, you’d get the null character (\u0000) that Java initializes char buffers with that’s found at position 5. Before you can read the data you wrote in out again, you need to flip the buffer:

buffer.flip();

This sets the limit to the position (5 in this example) and resets the position to 0, the start of the buffer. Now you can drain it into a new string:

String result = "";
while (buffer.hasRemaining()) {
  result += buffer.get();
}

Each call to get() moves the position forward one. When the position reaches the limit, hasRemaining() returns false. This is called draining the buffer.

Buffer classes also have absolute methods that fill and drain at specific positions within the buffer without updating the position. For example, ByteBuffer has these two:

public abstract byte       get(int index)
public abstract ByteBuffer put(int index, byte b)

These both throw an IndexOutOfBoundsException if you try to access a position at or past the limit of the buffer. For example, using absolute methods, you can put the same text into a buffer like this:

CharBuffer buffer = CharBuffer.allocate(12);
buffer.put(0, 'H');
buffer.put(1, 'e');
buffer.put(2, 'l');
buffer.put(3, 'l');
buffer.put(4, 'o');

However, you no longer need to flip before reading it out, because the absolute methods don’t change the position. Furthermore, order no longer matters. This produces the same end result:

CharBuffer buffer = CharBuffer.allocate(12);
buffer.put(1, 'e');
buffer.put(4, 'o');
buffer.put(0, 'H');
buffer.put(3, 'l');
buffer.put(2, 'l');

Bulk Methods

Even with buffers, it’s often faster to work with blocks of data rather than filling and draining one element at a time. The different buffer classes have bulk methods that fill and drain an array of their element type.

For example, ByteBuffer has put() and get() methods that fill and drain a ByteBuffer from a preexisting byte array or subarray:

public ByteBuffer get(byte[] dst, int offset, int length)
public ByteBuffer get(byte[] dst)
public ByteBuffer put(byte[] array, int offset, int length)
public ByteBuffer put(byte[] array)

These put methods insert the data from the specified array or subarray, beginning at the current position. The get methods read the data into the argument array or subarray beginning at the current position. Both put and get increment the position by the length of the array or subarray. The put methods throw a BufferOverflowException if the buffer does not have sufficient space for the array or subarray. The get methods throw a BufferUnderflowException if the buffer does not have enough data remaining to fill the array or subarrray. These are runtime exceptions.

Data Conversion

All data in Java ultimately resolves to bytes. Any primitive data type—int, double, float, etc.—can be written as bytes. Any sequence of bytes of the right length can be interpreted as a primitive datum. For example, any sequence of four bytes corresponds to an int or a float (actually both, depending on how you want to read it). A sequence of eight bytes corresponds to a long or a double. The ByteBuffer class (and only the ByteBuffer class) provides relative and absolute put methods that fill a buffer with the bytes corresponding to an argument of primitive type (except boolean) and relative and absolute get methods that read the appropriate number of bytes to form a new primitive datum:

public abstract char       getChar()
public abstract ByteBuffer putChar(char value)
public abstract char       getChar(int index)
public abstract ByteBuffer putChar(int index, char value)
public abstract short      getShort()
public abstract ByteBuffer putShort(short value)
public abstract short      getShort(int index)
public abstract ByteBuffer putShort(int index, short value)
public abstract int        getInt()
public abstract ByteBuffer putInt(int value)
public abstract int        getInt(int index)
public abstract ByteBuffer putInt(int index, int value)
public abstract long       getLong(int index)
public abstract ByteBuffer putLong(int index, long value)
public abstract float      getFloat()
public abstract ByteBuffer putFloat(float value)
public abstract float      getFloat(int index)
public abstract ByteBuffer putFloat(int index, float value)
public abstract double     getDouble()
public abstract ByteBuffer putDouble(double value)
public abstract double     getDouble(int index)
public abstract ByteBuffer putDouble(int index, double value)

In the world of new I/O, these methods do the job performed by DataOutputStream and DataInputStream in traditional I/O. These methods do have an additional ability not present in DataOutputStream and DataInputStream. You can choose whether to interpret the byte sequences as big-endian or little-endian ints, floats, doubles, and so on. By default, all values are read and written as big endian (i.e., most significant byte first. The two order() methods inspect and set the buffer’s byte order using the named constants in the ByteOrder class. For example, you can change the buffer to little-endian interpretation like so:

if (buffer.order().equals(ByteOrder.BIG_ENDIAN)) {
  buffer.order(ByteOrder.LITTLE_ENDIAN);
}

Suppose instead of a chargen protocol, you want to test the network by generating binary data. This test can highlight problems that aren’t apparent in the ASCII chargen protocol, such as an old gateway configured to strip off the high-order bit of every byte, throw away every 230 byte, or put into diagnostic mode by an unexpected sequence of control characters. These are not theoretical problems. I’ve seen variations on all of these at one time or another.

You could test the network for such problems by sending out every possible int. This would, after almost 4.3 billion iterations, test every possible four-byte sequence. On the receiving end, you could easily test whether the data received is expected with a simple numeric comparison. If any problems are found, it is easy to tell exactly where they occurred. In other words, this protocol (call it Intgen) behaves like this:

  1. The client connects to the server.
  2. The server immediately begins sending four-byte, big-endian integers, starting with 0 and incrementing by 1 each time. The server will eventually wrap around into the negative numbers.
  3. The server runs indefinitely. The client closes the connection when it’s had enough.

The server would store the current int in a four-byte-long direct ByteBuffer. One buffer would be attached to each channel. When the channel becomes available for writing, the buffer is drained onto the channel. Then the buffer is rewound and the content of the buffer is read with getInt(). The program then clears the buffer, increments the previous value by one, and fills the buffer with the new value using putInt(). Finally, it flips the buffer so it will be ready to be drained the next time the channel becomes writable. Example 3 demonstrates.

import java.nio.*;
import java.nio.channels.*;
import java.net.*;
import java.util.*;
import java.io.IOException;

public class IntgenServer {

  public static int DEFAULT_PORT = 1919;

  public static void main(String[] args) {

    int port;
    try {
      port = Integer.parseInt(args[0]);
    } catch (RuntimeException ex) {
      port = DEFAULT_PORT;
    }
    System.out.println("Listening for connections on port " + port);
    ServerSocketChannel serverChannel;
    Selector selector;
    try {
      serverChannel = ServerSocketChannel.open();
      ServerSocket ss = serverChannel.socket();
      InetSocketAddress address = new InetSocketAddress(port);
      ss.bind(address);
      serverChannel.configureBlocking(false);
      selector = Selector.open();
      serverChannel.register(selector, SelectionKey.OP_ACCEPT);
    } catch (IOException ex) {
      ex.printStackTrace();
      return;
    }

    while (true) {
      try {
        selector.select();
      } catch (IOException ex) {
        ex.printStackTrace();
        break;
      }

      Set<SelectionKey> readyKeys = selector.selectedKeys();
      Iterator<SelectionKey> iterator = readyKeys.iterator();
      while (iterator.hasNext()) {
        SelectionKey key = iterator.next();
        iterator.remove();
        try {
          if (key.isAcceptable()) {
            ServerSocketChannel server = (ServerSocketChannel) key.channel();
            SocketChannel client = server.accept();
            System.out.println("Accepted connection from " + client);
            client.configureBlocking(false);
            SelectionKey key2 = client.register(selector, SelectionKey.
                                                                     OP_WRITE);
            ByteBuffer output = ByteBuffer.allocate(4);
            output.putInt(0);
            output.flip();
            key2.attach(output);
          } else if (key.isWritable()) {
            SocketChannel client = (SocketChannel) key.channel();
            ByteBuffer output = (ByteBuffer) key.attachment();
            if (! output.hasRemaining()) {
              output.rewind();
              int value = output.getInt();
              output.clear();
              output.putInt(value + 1);
              output.flip();
            }
            client.write(output);
          }
        } catch (IOException ex) {
          key.cancel();
          try {
            key.channel().close();
          }
          catch (IOException cex) {}
        }
      }
    }
  }
}

View Buffers

If you know the ByteBuffer read from a SocketChannel contains nothing but elements of one particular primitive data type, it may be worthwhile to create a view buffer. This is a new Buffer object of appropriate type (e.g., DoubleBuffer, IntBuffer, etc.), that draws its data from an underlying ByteBuffer beginning with the current position. Changes to the view buffer are reflected in the underlying buffer and vice versa. However, each buffer has its own independent limit, capacity, mark, and position. View buffers are created with one of these six methods in ByteBuffer:

public abstract ShortBuffer  asShortBuffer()
public abstract CharBuffer   asCharBuffer()
public abstract IntBuffer    asIntBuffer()
public abstract LongBuffer   asLongBuffer()
public abstract FloatBuffer  asFloatBuffer()
public abstract DoubleBuffer asDoubleBuffer()

For example, consider a client for the Intgen protocol. This protocol is only going to read ints, so it may be helpful to use an IntBuffer rather than a ByteBuffer. Example 4 demonstrates. For variety, this client is synchronous and blocking, but it still uses channels and buffers.

import java.nio.*;
import java.nio.channels.*;
import java.net.*;
import java.io.IOException;

public class IntgenClient {

  public static int DEFAULT_PORT = 1919;

  public static void main(String[] args) {

    if (args.length == 0) {
      System.out.println("Usage: java IntgenClient host [port]");
      return;
    }

    int port;
    try {
      port = Integer.parseInt(args[1]);
    } catch (RuntimeException ex) {
      port = DEFAULT_PORT;
    }

    try {
      SocketAddress address = new InetSocketAddress(args[0], port);
      SocketChannel client  = SocketChannel.open(address);
      ByteBuffer    buffer  = ByteBuffer.allocate(4);
      IntBuffer     view    = buffer.asIntBuffer();

      for (int expected = 0; ; expected++) {
        client.read(buffer);
        int actual = view.get();
        buffer.clear();
        view.rewind();

        if (actual != expected) {
          System.err.println("Expected " + expected + "; was " + actual);
          break;
        }
        System.out.println(actual);
      }
    } catch(IOException ex) {
      ex.printStackTrace();
    }
  }
}

There’s one thing to note here. Although you can fill and drain the buffers using the methods of the IntBuffer class exclusively, data must be read from and written to the channel using the original ByteBuffer of which the IntBuffer is a view. The SocketChannel class only has methods to read and write ByteBuffers. It cannot read or write any other kind of buffer. This also means you need to clear the ByteBuffer on each pass through the loop or the buffer will fill up and the program will halt. The positions and limits of the two buffers are independent and must be considered separately. Finally, if you’re working in nonblocking mode, be careful that all the data in the underlying ByteBuffer is drained before reading or writing from the overlaying view buffer. Nonblocking mode provides no guarantee that the buffer will still be aligned on an int, double, or char. boundary following a drain. It’s completely possible for a nonblocking channel to write half the bytes of an int or a double. When using nonblocking I/O, be sure to check for this problem before putting more data in the view buffer.

Compacting Buffers

Most writable buffers support a compact() method:

public abstract ByteBuffer   compact()
public abstract IntBuffer    compact()
public abstract ShortBuffer  compact()
public abstract FloatBuffer  compact()
public abstract CharBuffer   compact()
public abstract DoubleBuffer compact()

(If it weren’t for invocation chaining, these six methods could have been replaced by one method in the common Buffer superclass.) Compacting shifts any remaining data in the buffer to the start of the buffer, freeing up more space for elements. Any data that was in those positions will be overwritten. The buffer’s position is set to the end of the data so it’s ready for writing more data.

Compacting is an especially useful operation when you’re copying—reading from one channel and writing the data to another using nonblocking I/O. You can read some data into a buffer, write the buffer out again, then compact the data so all the data that wasn’t written is at the beginning of the buffer, and the position is at the end of the data remaining in the buffer, ready to receive more data. This allows the reads and writes to be interspersed more or less at random with only one buffer. Several reads can take place in a row, or several writes follow consecutively. If the network is ready for immediate output but not input (or vice versa), the program can take advantage of that. This technique can be used to implement an echo server as shown in Example 5. The echo protocol simply responds to the client with whatever data the client sent. Like chargen, it’s useful for network testing. Also like chargen, echo relies on the client to close the connection. Unlike chargen, however, an echo server must both read and write from the connection.

import java.nio.*;
import java.nio.channels.*;
import java.net.*;
import java.util.*;
import java.io.IOException;

public class EchoServer {

  public static int DEFAULT_PORT = 7;

  public static void main(String[] args) {

    int port;
    try {
      port = Integer.parseInt(args[0]);
    } catch (RuntimeException ex) {
      port = DEFAULT_PORT;
    }
    System.out.println("Listening for connections on port " + port);

    ServerSocketChannel serverChannel;
    Selector selector;
    try {
      serverChannel = ServerSocketChannel.open();
      ServerSocket ss = serverChannel.socket();
      InetSocketAddress address = new InetSocketAddress(port);
      ss.bind(address);
      serverChannel.configureBlocking(false);
      selector = Selector.open();
      serverChannel.register(selector, SelectionKey.OP_ACCEPT);
    } catch (IOException ex) {
      ex.printStackTrace();
      return;
    }
    while (true) {
      try {
        selector.select();
      } catch (IOException ex) {
        ex.printStackTrace();
        break;
      }

      Set<SelectionKey> readyKeys = selector.selectedKeys();
      Iterator<SelectionKey> iterator = readyKeys.iterator();
      while (iterator.hasNext()) {
        SelectionKey key = iterator.next();
        iterator.remove();
        try {
          if (key.isAcceptable()) {
            ServerSocketChannel server = (ServerSocketChannel) key.channel();
            SocketChannel client = server.accept();
            System.out.println("Accepted connection from " + client);
            client.configureBlocking(false);
            SelectionKey clientKey = client.register(
                selector, SelectionKey.OP_WRITE | SelectionKey.OP_READ);
            ByteBuffer buffer = ByteBuffer.allocate(100);
            clientKey.attach(buffer);
          }
          if (key.isReadable()) {
            SocketChannel client = (SocketChannel) key.channel();
            ByteBuffer output = (ByteBuffer) key.attachment();
            client.read(output);
          }
          if (key.isWritable()) {
            SocketChannel client = (SocketChannel) key.channel();
            ByteBuffer output = (ByteBuffer) key.attachment();
            output.flip();
            client.write(output);
            output.compact();
          }
        } catch (IOException ex) {
          key.cancel();
          try {
            key.channel().close();
          } catch (IOException cex) {}
        }
      }
    }
  }
}

One thing I noticed while writing and debugging this program: the buffer size makes a big difference, although perhaps not in the way you might think. A big buffer can hide a lot of bugs. If the buffer is large enough to hold complete test cases without being flipped or drained, it’s very easy to not notice that the buffer isn’t being flipped or compacted at the right times because the test cases never actually need to do that. Before shipping your program, try turning the buffer size down to something significantly lower than the input you’re expecting. In this case, I tested with a buffer size of 10. This test degrades performance, so you shouldn’t ship with such a ridiculously small buffer, but you absolutely should test your code with small buffers to make sure it behaves properly when the buffer fills up.

Duplicating Buffers

It’s often desirable to make a copy of a buffer to deliver the same information to two or more channels. The duplicate() methods in each of the six typed buffer classes do this:

public abstract ByteBuffer   duplicate()
public abstract IntBuffer    duplicate()
public abstract ShortBuffer  duplicate()
public abstract FloatBuffer  duplicate()
public abstract CharBuffer   duplicate()
public abstract DoubleBuffer duplicate()

The return values are not clones. The duplicated buffers share the same data, including the same backing array if the buffer is indirect. Changes to the data in one buffer are reflected in the other buffer. Thus, you should mostly use this method when you’re only going to read from the buffers. Otherwise, it can be tricky to keep track of where the data is being modified.

The original and duplicated buffers do have independent marks, limits, and positions even though they share the same data. One buffer can be ahead of or behind the other buffer.

Duplication is useful when you want to transmit the same data over multiple channels, roughly in parallel. You can make duplicates of the main buffer for each channel and allow each channel to run at its own speed.

Slicing Buffers

Slicing a buffer is a slight variant of duplicating. Slicing also creates a new buffer that shares data with the old buffer. However, the slice’s zero position is the current position of the original buffer, and its capacity only goes up to the source buffer’s limit. That is, the slice is a subsequence of the original buffer that only contains the elements from the current position to the limit. Rewinding the slice only moves it back to the position of the original buffer when the slice was created. The slice can’t see anything in the original buffer before that point. Again, there are separate slice() methods in each of the six typed buffer classes:

public abstract ByteBuffer   slice()
public abstract IntBuffer    slice()
public abstract ShortBuffer  slice()
public abstract FloatBuffer  slice()
public abstract CharBuffer   slice()
public abstract DoubleBuffer slice()

This is useful when you have a long buffer of data that is easily divided into multiple parts such as a protocol header followed by the data. You can read out the header, then slice the buffer and pass the new buffer containing only the data to a separate method or class.

Marking and Resetting

Like input streams, buffers can be marked and reset if you want to reread some data. Unlike input streams, this can be done to all buffers, not just some of them. For a change, the relevant methods are declared once in the Buffer superclass and inherited by all the various subclasses:

public final Buffer mark()
public final Buffer reset()

The reset() method throws an InvalidMarkException, a runtime exception, if the mark is not set. The mark is also unset when the position is set to a point before the mark.

Object Methods

The buffer classes all provide the usual equals(), hashCode(), and toString() methods. They also implement Comparable, and therefore provide compareTo() methods. However, buffers are not Serializable or Cloneable.

Two buffers are considered to be equal if:

  1. They have the same type (e.g., a ByteBuffer is never equal to an IntBuffer but may be equal to another ByteBuffer).
  2. They have the same number of elements remaining in the buffer.
  3. The remaining elements at the same relative positions are equal to each other.

Note that equality does not consider the buffers’ elements that precede the position, the buffers’ capacity, limits, or marks. For example, this code fragment prints true even though the first buffer is twice the size of the second:

CharBuffer buffer1 = CharBuffer.wrap("12345678");
CharBuffer buffer2 = CharBuffer.wrap("5678");
buffer1.get();
buffer1.get();
buffer1.get();
buffer1.get();
System.out.println(buffer1.equals(buffer2));

The hashCode() method is implemented in accordance with the contract for equality. That is, two equal buffers will have equal hash codes and two unequal buffers are very unlikely to have equal hash codes. However, because the buffer’s hash code changes every time an element is added to or removed from the buffer, buffers do not make good hash table keys.

Comparison is implemented by comparing the remaining elements in each buffer, one by one. If all the corresponding elements are equal, the buffers are equal. Otherwise, the result is the outcome of comparing the first pair of unequal elements. If one buffer runs out of elements before an unequal element is found and the other buffer still has elements, the shorter buffer is considered to be less than the longer buffer.

The toString() method returns strings that look something like this:

java.nio.HeapByteBuffer[pos=0 lim=62 cap=62]

These are primarily useful for debugging. The notable exception is CharBuffer, which returns a string containing the remaining chars in the buffer.

Reference