Monday, March 8, 2010

Thoughts on Google Fiber for Appleton

Tonight I attended a public hearing at Appleton City Hall (PDF) on whether the city should submit a response to Google's request for information on the Google Fiber for Communities experiment. (Don't miss Google's project overview and other linked pages.) Also, please join the Google Fiber for Appleton Facebook group.

I was pleasantly surprised to see this public hearing draw the attention of Green Bay TV stations WFRV / CBS channel 5 and WGBA / NBC channel 26. The public hearing started late due to another meeting in the same room. There were probably a few more than 20 people in attendance, and the response was overwhelmingly positive. This echoed the current status of the City of Appleton's Survey, which was also reported to be overwhelmingly in favor of submitting a proposal. Most of the responses given at the hearing argued that Appleton should submit a favorable proposal in order to remain competitive as a community and to bring additional competition and choices for Internet service, among many other convincing reasons.

There were only 3 responses against: 2 from AT&T representatives, in attendance for the somewhat obvious reason of concern for their business, and 1 from a gentleman concerned with the physical cost of connecting gigabit networking to his Apple computer. (He apparently assumed that the fiber would need to be connected directly to his computer. Most computers sold in the past few years already have gigabit Ethernet cards, or they are readily available for less than $50. Additionally, there are many potential uses beyond a computer, such as video and other multimedia.) As also mentioned by another resident at the meeting, even being able to connect at only 100 Mbps (vs. the 1 Gbps / 1,000 Mbps being advertised) would still be about 10x faster than most consumer broadband connections available today.

My Own Thoughts

In general, additional competition for residential Internet service can only be a good thing - whether that competition is from Google or another provider. At my current residence in an apartment just outside the official city limits (in Grand Chute, near the Fox River Mall), my choices for broadband Internet are currently limited to Time Warner Cable and DSL. Wireless / "3G" was tried, and is not currently viable for primary / serious use, as previously detailed. Interestingly, the reps from AT&T at the hearing used their wireless network as one of their primary arguments against submitting a proposal to Google. When I looked into Time Warner Cable, they presented themselves as one of the shadiest operations in town, at least based upon my experience at their local office. I'm currently using AT&T's DSL. While I would like to sign up for AT&T U-Verse, I'm told that it is not available to my particular building, despite several of my neighbors in very close proximity having the service.

Unlike many of the comments and hype, I would hope that Google's offering is not all about the speed - even though, unfortunately, this is all many residential consumers are aware of or take into consideration. A few questions to consider of any ISP:

  • In addition to the speed, what is the latency (or lag)?
  • What is the support for IPv6?
  • Are industry standards properly followed, such as RFC 2308 - Negative Caching of DNS Queries? Or is the ISP involved in DNS hijacking?
  • Is service advertised as "unlimited" truly unlimited, or are there limits involved that are only shown in fine print, if at all?
  • What is the ISP's stance on network neutrality?
  • What support / allowance is there for operating as a server - either a web server, or something more "residential" such as allowing remote desktop connections, allowing for peer-to-peer file transfers, or playing games that require being able to accept incoming network connections?
  • What are the results from the ICSI Netalyzr hosted by UC Berkeley, which tests for most of the above as well as other issues?

Google, in particular, has a positive record for properly supporting the above requirements and avoiding the listed issues.

I can't imagine that Google wouldn't uphold the same principles in providing their own Internet service.

(On a humorous note, I can’t help but recall Google's previous ISP / fiber offering: Google TiSP.)

Even if Appleton submits a proposal for and is accepted as a location for Google Fiber, there is no guarantee that I would be in the service area - especially being in a neighboring town. However, I have no doubt that I would still benefit from the increased competition. Additionally, once we're back in the market to buy a house in the area, the availability of Google Fiber would be a serious consideration. (Someone please buy our house for sale in Wausau!)

The Need for IPv6

Almost a year ago, I brought IPv6 connectivity to my home network / LAN. Details on the setup to follow in a future post.

Background

Much like the past Y2K issue, the Internet is facing a problem that just hasn't been publicized much in the mainstream media yet: exhaustion of the IPv4 addresses currently being used. I found a very interesting and detailed IPv4 Address Report by Geoff Huston that is auto-generated daily. There are various estimates as to the numbers and dates, but all the predictions are currently falling in the range of years 2011-2012. This shortage of IPv4 addresses will certainly be a much larger issue than other predictions and myths for the year 2012.

The only real solution to the IPv4 address shortage is upgrading to IPv6. IPv4 allows for 2^32, or 4,294,967,296 addresses. With most computers, servers, and even cell phones each being assigned a unique address, the shortage should not be surprising. However, when the IPv4 specification was published back in 1981 (RFC 791), I'm sure 4+ billion addresses was considered more than sufficient. IPv6 solves this shortage by increasing the number of possible addresses to 2^128, or about 3.4×10^38. Written out, this is 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses. Beyond the increased address space, IPv6 also brings a number of other features, including mandatory support for advanced security, simplified processing, and support for network mobility.
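The address-space figures can be checked with a few lines of Java. This is only an illustrative sketch; the class and method names are made up for this example.

```java
import java.math.BigInteger;

public class AddressSpace {

  /** Number of distinct addresses representable with the given number of bits. */
  static BigInteger addressCount(int bits) {
    return BigInteger.ONE.shiftLeft(bits); // 2^bits
  }

  public static void main(String[] args) {
    System.out.println("IPv4 (32-bit):  " + addressCount(32));
    System.out.println("IPv6 (128-bit): " + addressCount(128));
    // Number of complete IPv4-sized address spaces that fit into IPv6: 2^128 / 2^32 = 2^96.
    System.out.println("Ratio:          " + addressCount(128).divide(addressCount(32)));
  }
}
```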

IPv6 became active for production use on the Internet in June 2006. Unfortunately, it seems that many organizations and much of the Internet have not yet committed to converting, and the shortage will have to be dealt with. I see this causing more problems for regular at-home users than anyone else. Most individuals are not aware of the issue, and have little choice but to accept however their ISP handles it - short of possibly switching providers. Already, most consumers are only leased one IP address per Internet account, which usually must be shared between several computers and other Internet-connected devices. This is almost always accomplished through network address translation (NAT), which already causes complications with file transfers, remote assistance applications, VPN software, online gaming, and many other typical Internet uses. As the shortage becomes more significant, expect an increasing number of ISPs to no longer lease a public IP address at all, but instead lease only a private IP, where multiple private IPs share one public IP - essentially nesting one NAT network within another, which will only further complicate matters. The same "public IP per Internet account" that we are accustomed to today may still be available - but only for an added fee.
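One rough way to tell whether an ISP has leased a private rather than a public IPv4 address is to test the address on the WAN interface against the private (RFC 1918) ranges. The sketch below does this with the JDK's InetAddress.isSiteLocalAddress(); the class name and sample addresses are only illustrative, and it assumes IPv4 literals as input.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class PrivateAddressCheck {

  /** True if the given IPv4 literal falls in 10/8, 172.16/12, or 192.168/16. */
  static boolean isPrivate(String ipv4Literal) {
    try {
      // For a literal IP address, getByName only parses the text; no DNS lookup is made.
      return InetAddress.getByName(ipv4Literal).isSiteLocalAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Not a valid IP literal: " + ipv4Literal, e);
    }
  }

  public static void main(String[] args) {
    for (String ip : new String[] {"192.168.1.10", "10.20.30.40", "172.16.5.1", "8.8.8.8"}) {
      System.out.println(ip + " private? " + isPrivate(ip));
    }
  }
}
```

If the address handed out by the ISP itself tests as private, the connection is already behind the provider's NAT, before any home router is involved.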

Use of NAT and private IPs is also in conflict with the fundamental design of the Internet and prevents end-to-end connectivity. Overall, it increasingly seems that Internet providers are only guaranteeing limited "web access" vs. fuller "Internet access", i.e., if it doesn't run in a web browser, it is probably not supported. I already experienced this first-hand in my previous dealings with Alltel / Verizon in regards to my wireless Internet issues. Consumers need to start demanding more from their ISPs, and need at least a limited understanding of the facts to do so. One tool that can help with this is the ICSI Netalyzr hosted by UC Berkeley. Guarantee of a public IP - either IPv4 or IPv6 - is also something that should be investigated and demanded.

IPv6 Choices

The best and easiest way to utilize IPv6 is by connecting to an ISP that provides IPv6 support. Unfortunately, finding such an ISP is still a nearly impossible task - especially when limited to those that provide local access. There are a number of transition mechanisms that should be able to provide IPv6 even without ISP support, but all have their own issues. For example, Microsoft Windows Vista, Windows 7, and most other modern operating systems support 6to4, Teredo, and ISATAP as tunneling mechanisms. However, I have not had any real success with any of these - at least not under Windows and while behind a NAT.

6to4 actually seems like an ideal solution to provide IPv6 access to a LAN, as long as there is a capable device to serve as a router that also has access to a public IPv4 address. Unfortunately, the address of the IPv6 subnet is based on the IPv4 address. While this may be a feasible solution for those with static IPv4 addresses (rare, more expensive, and only becoming worse), use on a dynamic IPv4 address requires an insanely short lifetime on IPv6 addresses, and requires the entire LAN to be re-addressed whenever the hosting IPv4 address is updated.
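To illustrate why a changed IPv4 address forces re-addressing: 6to4 (RFC 3056) forms the site's /48 prefix by placing the 32 bits of the public IPv4 address directly after the fixed 2002::/16 prefix. The following is a minimal sketch of that derivation; the class name is made up, and the sample addresses come from the documentation ranges.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class SixToFourPrefix {

  /** Derives the 6to4 /48 prefix (2002:AABB:CCDD::/48) for a public IPv4 literal. */
  static String prefixFor(String ipv4Literal) {
    byte[] b;
    try {
      b = InetAddress.getByName(ipv4Literal).getAddress();
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException("Not a valid IP literal: " + ipv4Literal, e);
    }
    if (b.length != 4) {
      throw new IllegalArgumentException("Not an IPv4 address: " + ipv4Literal);
    }
    // Each IPv4 octet becomes two hexadecimal digits of the prefix.
    return String.format("2002:%02x%02x:%02x%02x::/48",
        b[0] & 0xff, b[1] & 0xff, b[2] & 0xff, b[3] & 0xff);
  }

  public static void main(String[] args) {
    System.out.println(prefixFor("192.0.2.4"));   // 2002:c000:0204::/48
    System.out.println(prefixFor("203.0.113.9")); // 2002:cb00:7109::/48
  }
}
```

Every host on the LAN takes its address from this prefix, so a new IPv4 lease means a new prefix for every device.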

This pretty much leaves me with tunneled IPv6 access through a tunnel broker, using either configured 6in4 or AYIYA protocols. The best I have found - at least for free - are Hurricane Electric's Tunnel Broker, SixXS, and gogoNET (previously go6.net).

SixXS has the largest list of available "Points of Presence" - 35 over 18 countries. However, access pretty much requires the AICCU client, which is becoming a bit outdated and has a number of issues under Windows. (As of this writing, the last update for Windows was 2008-05-25.) Additionally, while free, SixXS has had much difficulty maintaining uptimes - particularly the one in Chicago as well as other POPs in the US.

gogoNET currently has much better support for Windows (using their gogoClient - with versions for most *nix versions as well), but the available tunnels are limited to 3, and with nothing local to the US: Montreal, Amsterdam, and Sydney.

Overall, I've had the most success with Hurricane Electric. HE provides 24 tunnels across 10 countries, including 12 within the US. However, unlike SixXS and gogoNET, HE provides no visible support for use behind a non-owned firewall, such as for mobile use on other public networks.

Setup details to follow in a future post.

Saturday, February 20, 2010

JMX Secure Connections / Avoiding Java System Properties, RMI

I spent much of my weekend working on adding support for Java Management Extensions (JMX) into a large enterprise application. Security was appropriately a primary concern, and I needed to ensure that all connections were properly encrypted. The most significant observation I've made during this work is that Java system properties are often overly depended upon / misused. Dependencies within RMI are a prime example of where the use of system properties causes some severe limitations, and are an area that could certainly use some improvement.

System properties are global to a JVM. Especially in a large application, conflicts can quickly arise if alternate configuration methods aren't available. For example, a necessary configuration may require one section of code or a referenced library to have a given system property set to one value, while another section or library requires the same system property set to another value. This is possible if they are split into separate programs, running on separate JVMs, but not within the same JVM. System properties can be set and changed at runtime by calling System.setProperty(...), but this should not be taken lightly and should usually be avoided. When they do need to be set outside of the java command-line, system properties should only be set within an application's "main" method, or other top-level code. I previously had to fix an issue where a JSP was switching the "javax.xml.transform.TransformerFactory" value between the interpretive and compiled (XSLTC) versions, which caused interesting issues (a.k.a. failures) elsewhere throughout the application, as the switch was causing different processors to be used for various functions, depending upon the timing between calls to the JSP (review: concurrency, thread safety). The same primary issue is shared with environment variables, as they are also globally shared by the JVM (or any process). However, unlike Java's system properties, the environment variables of the current runtime are not modifiable by Java.
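The JVM-wide visibility is easy to demonstrate. In this sketch, a value set on one thread is immediately seen by another, which is exactly how one component's "private" setting leaks into the rest of an application. The class and the property name "example.demo.setting" are hypothetical and used only for illustration.

```java
public class GlobalPropertyDemo {

  // Hypothetical property name, used only for this illustration.
  static final String KEY = "example.demo.setting";

  /** Reads the property from a separate thread and returns what that thread saw. */
  static String readFromOtherThread() {
    final String[] seen = new String[1];
    Thread reader = new Thread(() -> seen[0] = System.getProperty(KEY));
    reader.start();
    try {
      reader.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.setProperty(KEY, "A"); // e.g. set by one library
    System.out.println("Other thread sees: " + readFromOtherThread());
    System.setProperty(KEY, "B"); // e.g. overwritten by another library
    System.out.println("Other thread sees: " + readFromOtherThread());
  }
}
```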

Specific to my work were the Java Secure Socket Extension (JSSE) customizations, particularly the "javax.net.ssl.keyStore*" and "javax.net.ssl.trustStore*" system properties. Note that these are referred to as "defaults", with some default values provided for the default parameters, meaning that there should be a way to use a non-default value when needed. Another limitation of these specific customizations is that they are only single-valued, with no way to provide support for multiple key stores, etc., short of providing an overridden implementation class, which is full of issues in itself. Especially in a large enterprise application, calls need to be made to different services that require different certificates, and especially with limitations around automatic certificate selection, there needs to be a way to hook into this through flexible code when required.

HttpsURLConnection is an excellent example of a class that provides exactly this. In addition to the static (JVM-global) setDefaultSSLSocketFactory(...) method, it also provides a setSSLSocketFactory(...) method that can be used to provide customized SSL socket factories on a per-connection basis.
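A minimal sketch of the per-connection approach follows. The class name, the URL, and the empty in-memory trust store are placeholders for illustration; a real caller would load the certificates appropriate to the service being called.

```java
import java.io.IOException;
import java.net.URL;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManagerFactory;

public class PerConnectionSsl {

  /** Placeholder: an empty in-memory store standing in for a service-specific trust store. */
  static KeyStore emptyStore() {
    try {
      KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
      ks.load(null, null);
      return ks;
    } catch (GeneralSecurityException | IOException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Builds a socket factory that trusts only the certificates in the given store. */
  static SSLSocketFactory factoryFor(KeyStore trustStore) {
    try {
      TrustManagerFactory tmf =
          TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
      tmf.init(trustStore);
      SSLContext context = SSLContext.getInstance("TLS");
      context.init(null, tmf.getTrustManagers(), null);
      return context.getSocketFactory();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Prepares (without connecting) an HTTPS connection that uses only the given factory. */
  static HttpsURLConnection open(String url, SSLSocketFactory factory) {
    try {
      HttpsURLConnection conn = (HttpsURLConnection) new URL(url).openConnection();
      conn.setSSLSocketFactory(factory); // affects this connection only, not the JVM default
      return conn;
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    SSLSocketFactory factory = factoryFor(emptyStore());
    HttpsURLConnection conn = open("https://service-a.example/", factory);
    System.out.println("Per-connection factory in use: " + (conn.getSSLSocketFactory() == factory));
  }
}
```

No system properties are touched, so two connections in the same JVM can use entirely different trust stores.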

Unfortunately, JMX and RMI currently provide no such hooks, relying exclusively on system properties or the default socket factory. In the case of RMI, things only get more interesting and complicated. Everything exposed through the RMI public API is protocol-generic, with no concept of TCP, IP addresses, or port numbers. These are only handled by internal UnicastRef, LiveRef, and eventually TCPEndpoint classes. Only a "stub" is communicated to the client, through a registry or other means, and this stub contains an optional, serialized RMIClientSocketFactory for creating a connection from the client to the host. That's right - the server controls the socket factory that will be used by the client to connect back to the server. I haven't found any clean way to override this behavior at the client. For JMX over RMI, both the server and client factories are set through an environment map with property keys defined on RMIConnectorServer. (RMI is the only JMX connector included with Java 5 and 6.)

For the application I was working on, changing the system properties to control the certificates used for secure communications is simply not feasible, if even an option at all. So I created an alternate SslRMIClientSocketFactory implementation, as even mentioned in the JDK's source code:

// If someone needs to have different SslRMIClientSocketFactory factories
// with different underlying SSLSocketFactory objects using different key
// and trust stores, he can always do so by subclassing this class and
// overriding createSocket(String host, int port).

This alternate implementation returned sockets from a socket factory on a custom-initialized SSLContext. This worked great when connecting between different instances of the same application (different JVMs on the same node, as well as different nodes). However, this requires the alternate class (and any and all other referenced classes) to be available on the classpath of any client making the connection - making things difficult for connections from other standard clients such as JConsole or VisualVM. It is possible, however, by setting the classpath with the "-J-Djava.class.path=..." arguments, which work the same for both JConsole and VisualVM. Both these programs utilize native launchers, so the "-J" prefix means to pass the trailing argument to the JVM and not the native launcher itself. When doing this, "<java.home>/lib/jconsole.jar" must be included as well for JConsole, or JConsole won't even start. tools.jar is also necessary for connecting to local processes and possibly other features, but apparently isn't required for remote connections like those being attempted here.
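For illustration, a subclass along the lines suggested by the JDK comment might look like the following. This is a sketch under stated assumptions, not the implementation used in my application: the class name and the loadTrustStore() hook are hypothetical, and the empty store is only a placeholder for wherever a real client keeps its certificates.

```java
import java.io.IOException;
import java.net.Socket;
import java.security.GeneralSecurityException;
import java.security.KeyStore;

import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import javax.rmi.ssl.SslRMIClientSocketFactory;

public class CustomTrustSslRmiClientSocketFactory extends SslRMIClientSocketFactory {

  private static final long serialVersionUID = 1L;

  /** Hypothetical hook: where the trust store comes from is application-specific. */
  protected KeyStore loadTrustStore() throws GeneralSecurityException, IOException {
    KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
    ks.load(null, null); // empty placeholder; a real client would load its own certificates
    return ks;
  }

  @Override
  public Socket createSocket(String host, int port) throws IOException {
    try {
      // Build the context at connect time; SSLContext itself is not serializable,
      // so it must not be held in a (non-transient) field of this class.
      TrustManagerFactory tmf =
          TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
      tmf.init(loadTrustStore());
      SSLContext context = SSLContext.getInstance("TLS");
      context.init(null, tmf.getTrustManagers(), null);
      return context.getSocketFactory().createSocket(host, port);
    } catch (GeneralSecurityException e) {
      throw new IOException("Unable to initialize SSL context", e);
    }
  }
}
```

Because the factory travels to the client inside the serialized stub, it has to stay serializable and free of references to classes the client may not have.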

The solution I'm proceeding with is two-fold. First, I'm registering two JMXConnectorServers per JVM - one that uses the standard SslRMIClientSocketFactory, and one that uses a customized factory class. The first can be used by any client, assuming that the certificates are valid in the default trust store, or that the proper references using system properties can be made. The second can be used by any client that has the customized class available on the classpath, including other instances of the application itself. Fortunately, this does not require any additional network ports to be kept open. Each instance shares the same port, with a non-visible (at least not publicly or easily) ObjID being used to distinguish between them - which is also included in the serialized connection stub used by the client.

For my customized SslRMIClientSocketFactory class, simplest is best. By using only classes native to the JDK, only this one class needs to be available on the client's classpath so that it can be deserialized from the RMI stub. To avoid the issues with global system properties as described above, it also needed to be customizable, ideally being able to provide alternate socket factory implementations from within the client. Unfortunately, even if this class had a setSocketFactory method available, the rest of the RMI and JMX API doesn't provide any apparent opportunity to call such a method. While a bit of a hack, my solution was to use a ThreadLocal. Here is my entire class:

import java.io.IOException;
import java.io.Serializable;
import java.net.Socket;
import java.rmi.server.RMIClientSocketFactory;

import javax.rmi.ssl.SslRMIClientSocketFactory;

public class ThreadLocalSslRmiClientSocketFactory
    implements RMIClientSocketFactory, Serializable{
  
  private static final long serialVersionUID = 1L;
  
  public static final ThreadLocal<RMIClientSocketFactory> SOCKET_FACTORY
      = new InheritableThreadLocal<RMIClientSocketFactory>(){
    @Override
    protected RMIClientSocketFactory initialValue(){
      return new SslRMIClientSocketFactory();
    }
  };
  
  public Socket createSocket(String host, int port) throws IOException{
    return SOCKET_FACTORY.get().createSocket(host, port);
  }
  
}

This allows for the socket factory to be configured by calling ThreadLocalSslRmiClientSocketFactory.SOCKET_FACTORY.set(...) somewhere within the current thread prior to making the connection, and without having requirements on or otherwise impacting the rest of the application. If this customization is not called, then the default SslRMIClientSocketFactory is used, and the global system properties should be referenced to determine the trust store, etc. Starting the servers then looks like this:

import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXServiceURL;
import javax.management.remote.rmi.RMIConnectorServer;
import javax.rmi.ssl.SslRMIClientSocketFactory;

import org.slf4j.Logger;

...

MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
JMXServiceURL url = new JMXServiceURL("rmi", null, 0);

Map<String, ? super Object> serverEnv = new HashMap<String, Object>();
serverEnv.put(
  RMIConnectorServer.RMI_SERVER_SOCKET_FACTORY_ATTRIBUTE,
  new JmxSslRmiServerSocketFactory());
serverEnv.put(
  RMIConnectorServer.RMI_CLIENT_SOCKET_FACTORY_ATTRIBUTE,
  new SslRMIClientSocketFactory());

JMXConnectorServer sslConnector = JMXConnectorServerFactory.newJMXConnectorServer(url, serverEnv, mbs);
sslConnector.start();
LOGGER.info("JMX SSL server started: {}", sslConnector.getAddress());

serverEnv.put(
  RMIConnectorServer.RMI_CLIENT_SOCKET_FACTORY_ATTRIBUTE,
  new ThreadLocalSslRmiClientSocketFactory());

JMXConnectorServer threadLocalSslConnector = JMXConnectorServerFactory.newJMXConnectorServer(url, serverEnv, mbs);
threadLocalSslConnector.start();
LOGGER.info("JMX thread-local SSL server started: {}", threadLocalSslConnector.getAddress());
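The thread-local behavior that ThreadLocalSslRmiClientSocketFactory relies on can be seen in isolation with a plain String in place of the socket factory. This is only a sketch with made-up names: it shows the default from initialValue(), a per-thread override, and inheritance by a thread created after the override.

```java
public class ThreadLocalDefaultDemo {

  static final ThreadLocal<String> SETTING = new InheritableThreadLocal<String>() {
    @Override
    protected String initialValue() {
      return "default"; // used by any thread that has neither set nor inherited a value
    }
  };

  /** Starts a new child thread and returns the value that thread observed. */
  static String valueSeenInNewThread() {
    final String[] seen = new String[1];
    Thread child = new Thread(() -> seen[0] = SETTING.get());
    child.start();
    try {
      child.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.out.println(SETTING.get());          // default
    SETTING.set("custom");
    System.out.println(SETTING.get());          // custom
    System.out.println(valueSeenInNewThread()); // custom, copied when the child was created
    SETTING.remove();
    System.out.println(SETTING.get());          // default again
  }
}
```

One caveat of this design: only threads created after the value is set inherit it, so threads that already exist (for example, in a pool) will still see the default.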

What about other connectors? As listed in the connector chapter of the JMX overview documentation:

An optional part of the JMX Remote API, which is not included in the Java SE platform, is a generic connector. This connector can be configured by adding pluggable modules to define the following:

  • The transport protocol used to send requests from the client to the server, and to send responses and notifications from the server to the clients
  • The object wrapping for objects that are sent from the client to the server and whose class loader can depend on the target MBean

The JMX Messaging Protocol (JMXMP) connector is a configuration of the generic connector where the transport protocol is based on TCP and the object wrapping is native Java serialization. Security is more advanced than for the RMI connector. Security is based on the Java Secure Socket Extension (JSSE), the Java Authentication and Authorization Service (JAAS), and the Simple Authentication and Security Layer (SASL).

The generic connector and its JMXMP configuration are optional, which means that they are not always included in an implementation of the JMX Remote API. The Java SE platform does not include the optional generic connector.

There is also a Web Services (WS) connector being worked on, as described in JSR 262: Web Services Connector for Java Management Extensions (JMX) Agents and the reference implementation project at https://ws-jmx-connector.dev.java.net/. I've read that the WS connector is planned to be included with Java 7. Fortunately, it appears that the WS connector supports a custom SSLContext, configurable on the client, as shown in Securing JMX Web Services Connector (Jean-Francois Denise, 2007-08-16, blogs.sun.com).

Other related JMX references that have already proved useful:

Wednesday, February 17, 2010

MarkUtils-IO: Performant Java Streams, Readers, and Writers

MarkUtils-IO is another high-performance addition to MarkUtils, and is a collection of utility classes that I've found myself frequently reusing over the past number of years.

Work on this library was also the driving factor for another recent post concerning some of Java's built-in code, Redundant argument validation code in Java IO classes.

com.ziesemer.utils.io is available on ziesemer.dev.java.net under the GPL license, complete with source code, a compiled .jar, generated JavaDocs, and a suite of 40+ JUnit tests. Download the com.ziesemer.utils.io-*.zip distribution from here. Please report any bugs or feature requests on the java.net Issue Tracker.

Saturday, January 30, 2010

DynDNS Update Client Shell Script

I'm sharing the shell script I wrote and have been using for the past number of months to update my Dynamic DNS account on DynDNS.com with my dynamic IP address on my DSL connection, and previously with Alltel / Verizon Wireless.

There are many other update clients available. However, I had specific issues with every other one I tried, and this one meets my needs perfectly. Additionally, I wrote this script with a few particular design goals, as commented in the code below.

This is a Unix/Linux shell script, and is not designed to work with other environments such as Windows. It may work under Cygwin or another such environment as long as all the dependencies are available and any necessary changes are made. These dependencies include: wget, date, logger, sed, awk, tail, at, and parseable output from ifconfig - all available by default on most Linux installations.

#!/bin/sh

# Mark A. Ziesemer, http://www.ziesemer.com
# 2009-08-30, 2009-10-26
# http://blogger.ziesemer.com/2010/01/dyndns-update-client-shell-script.html
# With thanks to "ferret" and "pgas" on #bash (IRC) for some general bash-related questions.

# Design goals, in order of priority / importance:
#    1) 100% compliance with DynDNS Update API (http://www.dyndns.com/developers/specs/),
#      including all policies and required timings.
#    2) Maintain and update IP as fast as allowed by the specification, minimizing "downtime".
#     Properly update in cases currently missed by other update clients, e.g.
#        when multiple updates are requested within an otherwise arbitrarily-defined time limit.
#    3) Run efficiently, respecting CPU, RAM, and disk requirements.
#      Use external scheduling (atd) and hooks to lessen required in-memory process time as much as possible.
#      Should be re-usable on embedded systems, e.g. OpenWrt, with only minor modifications necessary.
#    4) No dependencies on other "large" runtime libraries, e.g. Perl or Python.

# Exit status codes:
#   0   Update completed successfully.
#   11  Update completed successfully, but unnecessarily. (NOCHG)
#   21  IP has not changed.
#   31  Recognized temporary failure.
#   41  Assumed (unrecognized) temporary failure.
#   51  Temporary failure due to $lastUpdateResult="TEMP_FAIL".
#   101  Permanent failure.
#   111  Permanent failure due to $lastUpdateResult="FATAL".
#   112  Permanent failure due to unrecognized $lastUpdateResult.
#   201  Couldn't kill existing script.
#   255  Other unexpected failure.

### variables accepted as command line arguments
configFile="/etc/ddIpUpdate/ddIpUpdate.config"
forceUpdate=
forceRetry=

### variables persisted in $configFile
dynIF=

username=
password=
hostname=

varDir="/var/lib/ddIpUpdate"
stateFile="${varDir}/ddIpUpdate.state"
pidFile="/var/run/ddIpUpdate.pid"

### variables persisted in $stateFile

lastIP=
lastUpdateTime=
# GOOD, TEMP_FAIL, FATAL
lastUpdateResult=
futureJobNum=

### other internal variables
userAgent="com.ziesemer.ddIpUpdate 2009.10.26"
callback=$0

### Helper functions.

_log(){
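  # Pass the message as an argument (%b still expands "\n") so a "%" in it is not treated as a format.
  printf '%s %b\n' "$(date --rfc-3339=seconds)" "$*" >&2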
  logger $0 "$*"
}

_setArgs(){
  while [ "$1" != "" ]; do
    case $1 in
      "-c" | "--configFile")
        shift
        configFile=$1
        ;;
      "-f" | "--forceUpdate")
        forceUpdate=true
        ;;
      "-r" | "--forceRetry")
        forceRetry=true
        ;;
    esac
    shift
  done
}

_exitErr(){
  local exitStatus=$?
  _log "Error! $exitStatus"
  _exitNormal
  return $exitStatus
}

_exitNormal(){
  trap - EXIT
  _writeConfig
  rm -f "$pidFile"
}

_checkRun(){ # cmd
  local cmd status out
  cmd=$*
  out=$(eval $cmd)
  status=$?
  if [ $status -ne 0 ]; then
    _log \
      "\nUnexpected return status: $status, exiting." \
      "\nCommand: $cmd" \
      "\nOutput: $out"
    return $status
  else
    echo "$out"
  fi
}

_schedule(){ # cmd, time
  if [ -n "$futureJobNum" ]; then
    _log "Removing existing scheduled job: $futureJobNum"
    atrm $futureJobNum
  fi

  local at_out
  at_out=$(_checkRun "echo $1 | at $2 2>&1")
  _log "Scheduled command: \"$1\", $(echo "$at_out" | tail -n 1)"
  local at_id=$(echo "$at_out" | sed -n "s/^job \([0-9]*\) at .*$/\1/p")
  echo $at_id
}

### Core functions.

_findIP(){
  local ip=$(ifconfig $dynIF | sed -n 's/ *inet addr:\([0-9.]*\).*/\1/p')
  _log "Detected IP $ip on interface $dynIF."
  echo $ip
}

_checkInstances(){
  if [ -e $pidFile ]; then
    local pid=$(cat $pidFile)
    command kill -TERM $pid
    # 'wait' only works for child processes
    sleep 1
    # Run kill directly; "[ kill -0 ... ]" is a malformed test expression.
    if kill -0 "$pid" 2>/dev/null; then
      _log "Existing script with PID $pid didn't stop; exiting..."
      exit 201
    fi
  fi

  echo $$ > $pidFile
}

_checkLastStatus(){
  case "$lastUpdateResult" in
    "FATAL")
      _log "Last update resulted in a fatal condition; user intervention required."
      exit 111
      ;;
    "TEMP_FAIL")
      if [ -z "$forceRetry" ] ; then
        if [ $(( $(date +%s) < ($lastUpdateTime + 1800) )) -ne 0 ] ; then
          _log "Temporary timeout not yet expired, or user intervention required."
          _rescheduleTempFail
          exit 51
        fi
      fi
      ;;
    "GOOD" | "")
      # Continue
      ;;
    *)
      _log "Unrecognized lastUpdateResult: $lastUpdateResult; exiting..."
      exit 112
      ;;
  esac
}

_readConfig(){
  case "$configFile" in
    *"/"*) . $configFile ;;
    *) . ./$configFile ;;
  esac
  if [ -r $stateFile ]; then
    case "$stateFile" in
      *"/"*) . $stateFile ;;
      *) . ./$stateFile ;;
    esac
  fi
}

_writeConfig(){
  echo "#This file is automatically re-written!" >$stateFile
  for name in "lastIP" "lastUpdateTime" "lastUpdateResult" "futureJobNum"; do
    echo $name=\"$(eval "echo \$$name")\" >> $stateFile
  done
  echo >> $stateFile
}

_checkIPChanged(){ # ip
  if [ "$lastIP" = "$1" ]; then
    _log "IP has not changed from $lastIP; exiting..."
    exit 21
  fi
}

_rescheduleTempFail(){
  futureJobNum=$(_schedule "$0 -c $configFile" "now + 30 minutes")
}

_updateIP(){ # ip
  local returnStatus=255
  lastIP=

  # Could do without writing the temporary files, but good to save anyway for debugging / troubleshooting.
  local updateResult=0
  echo "https://$username:$password@members.dyndns.org/nic/update?hostname=$hostname&myip=$1" \
    | wget -i - -O - -U "$userAgent" --save-headers \
      2>${varDir}/response.err >${varDir}/response.out || updateResult=$?
  
  if [ $updateResult -eq 0 ]; then
    # DynDNS requires action based on return codes only, not HTTP status:
    #    http://www.dyndns.com/developers/specs/guidelines.html
    :
  else
    _log "Error result from web service: $updateResult"
  fi

  # Previous bashism (bash array):
  # declare -a updateResponse=($(tail -n 1 ${varDir}/response.out))
  # ${updateResponse[0]}

  local updateResponse="$(tail -n 1 ${varDir}/response.out)"
  local updateResponse1="$(echo $updateResponse | awk '{print $1}')"
  # 2nd token is the returned IP, which really doesn't offer anything.
  # local updateResponse2="$(echo $updateResponse | awk '{print $2}')"

  _log "Received response: $updateResponse"

  case "$updateResponse1" in
    "good")
      lastUpdateResult="GOOD"
      returnStatus=0
      ;;
    "nochg")
      lastUpdateResult="GOOD"
      returnStatus=11
      ;;
    "dnserr" | "911")
       # Temporary issue, prevent any further requests for at least 30 minutes or until user manually clears error.
      lastUpdateResult="TEMP_FAIL"
      returnStatus=31
      ;;
    "badauth" | "!donator" | "notfqdn" | "nohost" | "numhost" | "abuse" | "badagent" | *)
      if ( [ $updateResult -ne 0 ] && [ -z "$updateResponse1" ] ); then
        # Temporary network failure or other assumed-temporary issue.
        lastUpdateResult="TEMP_FAIL"
        returnStatus=41
      else
        # Known permanent failure, or other completely unexpected result.
        # Prevent any further requests until user manually clears error.
        lastUpdateResult="FATAL"
        returnStatus=101
      fi
      ;;
  esac

  lastUpdateTime=$(date +%s)
  case "$lastUpdateResult" in
    "GOOD")
      lastIP=$1
      futureJobNum=$(_schedule "$0 -f -c $configFile" "now + 28 days")
      ;;
    "TEMP_FAIL")
      _rescheduleTempFail
      ;;
  esac

  return $returnStatus
}

_runUpdate(){
  _checkLastStatus
  local ip=$(_findIP)
  if [ -z "$forceUpdate" ]; then
    _checkIPChanged $ip
  fi
  _updateIP $ip
  return $?
}

### "Main"

set -e
trap _waitAbort TERM
trap _exitErr EXIT

_setArgs "$@"
_readConfig
_checkInstances

mkdir -p "$varDir"

result=0
_runUpdate || result=$?

_exitNormal
_log "Exiting with status: $result"
exit $result

To use, just save the script as an executable file at the location of your choice. Create a configuration file containing the 4 required parameters: dynIF, username, password, and hostname, e.g.:

dynIF="ppp0"
username="someUsername"
password="somePassword"
hostname="someDomainName.dyndns.org"

View the source above for the other optional parameters accepted, as well as their default values. The permissions for this file should be set so that it is only readable by the user account that will execute the script, in order to protect the password. By default, the script looks for this file at "/etc/ddIpUpdate/ddIpUpdate.config", but this can be overridden with the "--configFile" or "-c" command line arguments.

When run, this script maintains state in a file, the location of which defaults to "/var/lib/ddIpUpdate/ddIpUpdate.state", but can be overridden in the config file. This file is created automatically on first run, and looks like this:

#This file is automatically re-written!
lastIP="1.2.3.4"
lastUpdateTime="1264001234"
lastUpdateResult="GOOD"
futureJobNum="1"

There are many options available for having this script executed. On my Ubuntu Karmic (9.10) system, I created a link to this script in the "/etc/ppp/ip-up.d" directory so that it is executed every time after my PPP connection starts, or is restarted.

I don't write shell scripts for a living, so while I have been using this myself for several months now without an issue, it is certainly possible that there may be a bug or other room for improvement. Please leave a comment here if you have a correction or a suggestion, but please "cite your source" to a supporting reference related to the issue, particularly for any shell script semantics. Please also remember to follow typical best practices for bug reporting.

Thursday, January 28, 2010

Redundant argument validation code in Java IO classes

Today I noticed some almost comical redundant checks in a number of the java.io InputStream, OutputStream, Reader, and Writer classes. I'm looking at the read and write methods with a signature of ([] x, int off, int len), where "[] x" is either a byte or char array, and "x" is named "b" for a byte array, or "c" or "cbuf" for a char array. One example is InputStream.read(byte[], int, int).

Below is a portion of the source code used within StringReader, StringWriter, BufferedReader, BufferedWriter, CharArrayReader, CharArrayWriter, and ByteArrayOutputStream, identical between all versions of Java checked from 1.3 - 1.6 (6.0), and only slightly reformatted for readability:

if ( (off < 0) || (off > x.length) || (len < 0)
    || ((off + len) > x.length) || ((off + len) < 0) ) {
  throw new IndexOutOfBoundsException();
}

Keep in mind that these methods are typically called within a loop, and depending upon the chosen array/buffer size and the amount of data being read or written, these methods and the shown checks may be executed many times. Any unnecessary statements will only hinder performance. Notice any redundancies in the check?

First, consider "((off + len) < 0)". If neither "off" nor "len" is less than 0 (both already checked), it may look as though the sum could never be less than 0. However, Java int arithmetic wraps on overflow, so two large non-negative values can produce a negative sum (e.g. off = 1 and len = Integer.MAX_VALUE). This check is what catches that case, so it is not redundant and has to stay.

Second, if the sum of "off" and "len" is not greater than the length of the array and has not overflowed, and if both "off" and "len" are non-negative (all already checked), it is guaranteed that "off" alone will also never be greater than the length of the array. This makes the "(off > x.length)" check completely redundant, and it should be removed.

This would simplify and shorten the above checks to the following:

if ( (off < 0) || (len < 0)
    || ((off + len) > x.length) || ((off + len) < 0) ) {
  throw new IndexOutOfBoundsException();
}

While I haven't checked, I suppose it is possible that this type of issue could be optimized away by the compiler or even the JVM at runtime - but I wouldn't hold my breath.

BufferedOutputStream in all the same versions works a little differently and doesn't perform any of its own argument validation.

Someone apparently realized this issue on ByteArrayInputStream, and made an optimization. In Java 1.5 / 5.0 and previous, the check was the same as all the others above. In Java 1.6 / 6.0, the check is now written as:

if ( off < 0 || len < 0 || len > x.length - off ) {
  throw new IndexOutOfBoundsException();
}

This is essentially my optimized version above, just with "off" moved to the other side of the comparison, and the operator properly switched from '+' to '-' to match. Written this way, the subtraction cannot overflow once "off" is known to be non-negative, so large values of "len" are still rejected correctly. However, all versions of the read method in ByteArrayInputStream add an explicit check for "(b == null)", only to manually throw a NullPointerException. A NullPointerException will already be thrown on the attempt to read the "length" field of a null array in the above check. This makes the additional check largely redundant (it only changes which exception is thrown when the array is null and "off" or "len" is also negative), and it could be removed.
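To compare the sum-based and subtraction-based variants concretely, here is a small sketch that runs each form against a few inputs, including one where "off + len" exceeds Integer.MAX_VALUE and wraps around. The class and method names are made up for illustration and are not from the JDK; each method returns true when the arguments would be rejected.

```java
// Hypothetical demo class; the method names are illustrative and not from the JDK.
class BoundsCheckDemo {

  // The check as quoted from the pre-1.6 classes.
  static boolean originalCheck(int off, int len, int length) {
    return (off < 0) || (off > length) || (len < 0)
        || ((off + len) > length) || ((off + len) < 0);
  }

  // Sum-based check with the "(off + len) < 0" term left out.
  static boolean sumOnlyCheck(int off, int len, int length) {
    return (off < 0) || (len < 0) || ((off + len) > length);
  }

  // The Java 6 ByteArrayInputStream form, which subtracts instead of adding.
  static boolean subtractionCheck(int off, int len, int length) {
    return off < 0 || len < 0 || len > length - off;
  }

  public static void main(String[] args) {
    int length = 16;
    int[][] cases = {
      {0, 16},                 // exactly fills the array
      {8, 9},                  // runs one element past the end
      {-1, 4},                 // negative offset
      {1, Integer.MAX_VALUE},  // off + len wraps around to Integer.MIN_VALUE
    };
    for (int[] c : cases) {
      System.out.printf("off=%d len=%d: original=%b sumOnly=%b subtraction=%b%n",
          c[0], c[1],
          originalCheck(c[0], c[1], length),
          sumOnlyCheck(c[0], c[1], length),
          subtractionCheck(c[0], c[1], length));
    }
  }
}
```

On the last input the three forms do not all agree: the sum wraps to a negative number, which the original and subtraction forms reject but the sum-only form lets through.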

BufferedInputStream takes an interesting, alternative approach:

if ( (off | len | (off + len) | (x.length - (off + len))) < 0) {
  throw new IndexOutOfBoundsException();
}

Notice that bitwise ORs are being used instead of logical ORs. Any negative value bitwise OR'd with any other value(s) will produce a negative result, so a single comparison against 0 covers every term. The "(off + len)" term may look redundant given that "off" and "len" are already included, but it is what catches int overflow: two large non-negative values can wrap around to a negative sum, which would otherwise make "(x.length - (off + len))" come out positive. None of the four terms can be dropped here.

I'm not certain which of the approaches (bitwise or logical ORs) should be faster. Successive logical ORs can be skipped once an earlier portion evaluates to true. However, for almost all (assuming valid) calls to these methods, these expressions should almost always evaluate to false, which then requires an entire logical OR expression to be evaluated regardless. I'd be interested to hear any detailed arguments for one way or another.
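Speed aside (nothing is measured here), the two styles should at least accept and reject exactly the same inputs. The sketch below checks that over a small grid of values, including the extremes of the int range. The class and method names are hypothetical.

```java
// Hypothetical demo class; names are illustrative only.
class SignBitCheckDemo {

  // Logical form: evaluation stops at the first operand that is true.
  static boolean logicalCheck(int off, int len, int length) {
    int end = off + len;
    return off < 0 || len < 0 || end < 0 || (length - end) < 0;
  }

  // Bitwise form as quoted above: the OR of the operands is negative
  // exactly when at least one operand has its sign bit set.
  static boolean bitwiseCheck(int off, int len, int length) {
    return (off | len | (off + len) | (length - (off + len))) < 0;
  }

  // Counts the (off, len) pairs on which the two forms disagree.
  static int countMismatches(int[] samples, int length) {
    int mismatches = 0;
    for (int off : samples) {
      for (int len : samples) {
        if (logicalCheck(off, len, length) != bitwiseCheck(off, len, length)) {
          mismatches++;
        }
      }
    }
    return mismatches;
  }

  public static void main(String[] args) {
    int[] samples = {Integer.MIN_VALUE, -1, 0, 1, 7, 8, 9, Integer.MAX_VALUE};
    System.out.println("mismatches: " + countMismatches(samples, 8));
  }
}
```

Since the two forms are equivalent by construction, any difference between them comes down to how the JIT compiles the branches, which would need a proper benchmark to settle.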

I will report this to the Sun (Oracle) bug database, and post an update if and when I receive a public bug ID.

Sunday, November 15, 2009

Capturing complete HTTP requests - Echo Server

Background

I recently had a need to capture and inspect a complete HTTP request in preparation for developing a new web service. The main reason for this is that there were no real requirements for the requested service. It wasn't clear which parameters would be sent in the request, or exactly how the parameters would be named. It also wasn't clear how the parameters would be split between GET and POST parameters, or even additional HTTP request headers. Additionally, there was also some history with issues around various character encodings, so I needed to be able to capture a byte-accurate copy of the entire request, including the headers and the body.

Initially, I did not have good luck finding an existing tool or solution for this. My first attempt was to just host a basic web server, then to capture the data using Wireshark. Unfortunately, Wireshark primarily works at the packet level. It supports higher-level viewing of many protocols including HTTP. It even includes options for re-assembly of both HTTP headers and bodies, re-assembly of chunked transfer-coded bodies, and decompression of entity bodies. However, while I'm sure there are additional options and methods for getting it to work more like I desired, it just didn't seem like the right tool for this particular job - and that is no fault of Wireshark.

My other early attempt was to use the mod_dumpio module in Apache HTTP Server. The first issue with this was that all the output from all requests is simply mixed-in to the same error log file (along with other debugging / outputs), which would make the data very difficult for proper extraction. The second issue was that at least as far as I can tell, there can only be one error log file per <VirtualHost/>, which would have resulted in an excessive amount of data being captured.

I then started to look at a simple Java HTTP server to capture the desired data. I've written trivial HTTP servers before, but it quickly becomes non-trivial to properly handle and respond to all the possible options and variations - even just to accept a complete request (including body) from a client. Trying to avoid duplicating previous work, I looked at a number of existing web servers including Apache Tomcat, but did not find any that provided the desired logging options.

My solution

I started looking further into Jetty. (I previously used and blogged about Jetty in regards to a test platform for my MarkUtils-Web project.) I found that I could intercept the incoming requests byte-by-byte by extending Jetty's default Connector - SelectChannelConnector, and then overriding the newEndPoint(...) method to return an extended SelectChannelEndPoint. Hooking into the SelectChannelEndPoint's fill(Buffer buffer) method allows for capturing of the complete HTTP request. Kudos to the Jetty developers for not making this difficult or impossible by marking everything as private or otherwise overly-restricted, as compared to an unfortunate practice followed by many other projects and companies!
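The same capture-as-you-read idea can be sketched without Jetty. The class below is a hypothetical stand-in (CapturingInputStream and captureAll are names made up here, and this is not how HttpEchoServer itself is implemented): it wraps any InputStream and copies every byte it hands to the consumer into a second stream, so the log stays byte-accurate without buffering the whole request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class for illustration; it is not part of Jetty or of HttpEchoServer.
// skip() and mark()/reset() are not handled in this sketch.
class CapturingInputStream extends FilterInputStream {

  private final OutputStream capture;

  CapturingInputStream(InputStream in, OutputStream capture) {
    super(in);
    this.capture = capture;
  }

  @Override
  public int read() throws IOException {
    int b = super.read();
    if (b != -1) {
      capture.write(b);
    }
    return b;
  }

  // read(byte[]) in FilterInputStream delegates here, so it is covered as well.
  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    int n = super.read(buf, off, len);
    if (n > 0) {
      capture.write(buf, off, n);
    }
    return n;
  }

  // Demo helper: drains the given bytes through the wrapper and returns what was captured.
  static byte[] captureAll(byte[] request) {
    ByteArrayOutputStream log = new ByteArrayOutputStream();
    try (InputStream in = new CapturingInputStream(new ByteArrayInputStream(request), log)) {
      byte[] buf = new byte[16];
      while (in.read(buf) != -1) {
        // A real consumer would parse the request here.
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return log.toByteArray();
  }

  public static void main(String[] args) {
    byte[] request = "POST /someUrlPath HTTP/1.1\r\nHost: localhost:8080\r\n\r\n"
        .getBytes(StandardCharsets.ISO_8859_1);
    System.out.println("byte-accurate: " + Arrays.equals(request, captureAll(request)));
  }
}
```

Because the copy happens inside read(...), the consumer never sees a difference, and binary payloads are captured exactly as they arrive.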

With only a little extra code, each HTTP request is logged to a chosen directory as a pair of files, grouped by a time-based session ID. The first is a "meta" file that contains details that would not ordinarily be captured as part of the HTTP capture, including the session ID, server date, and remote address, host, and port. The second is the "content" file that contains the actual byte-by-byte capture of the HTTP request. While it is named as a ".txt" file for easy viewing, it is treated as binary and will accurately capture all requests, including those with binary payloads. The format also allows for easily re-playing the request to a server for additional testing, debugging, or analysis.

Finally, by implementing and registering an associated Handler, the request is not only captured, but is efficiently echoed back to the client - without ever needing to buffer or store the entire request. This echoed response starts with the contents of the "meta" file, including the session ID that can be used by the client to easily refer to the saved log file back on the server. The contents of a sample echoed response are shown below, and would appear in the body of a viewing web browser:

Session ID: 124f67a451d-6313f5e0
Date: Sun Nov 15 00:14:18 CST 2009
remoteAddr: 127.0.0.1
remoteHost: 127.0.0.1
remotePort: 23349
==========
POST /someUrlPath?someGetKey=someGetValue HTTP/1.1
Host: localhost:8080
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5 GTB5 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 25

somePostKey=somePostValue

On the server-side, the first portion (above the '=' divider) is saved as 124f67a451d-6313f5e0-meta.txt, with the last portion (below the divider) saved as 124f67a451d-6313f5e0-content.txt.

The main class for this project is com.ziesemer.httpEchoServer.HttpEchoServer. It is written to be suitable for inclusion into other projects or uses, as visible from the included JUnit test. It also includes a main() method for direct use from the command-line, supporting arguments to control the port to listen on ("--port") and the directory to use to store the log files ("--logDir"). By default, Jetty is configured to listen on port 8080. Add "--allowDynamicPort" to have the process fall back to a dynamically-chosen port if the specified port is already in use.

Fiddler: Another alternative

Another alternative I later considered was Microsoft's Fiddler, an HTTP Debugging Proxy. While not open-source, it is freeware and extensible. It is also certainly a better match for my requirements than either Wireshark or Apache's mod_dumpio, and arguably even my solution described here. However, Fiddler still requires a server to answer the requests before it can monitor the traffic, and doesn't support echoing the request back in the response. Fiddler does have many other features to offer that may prove useful, and is at least worth testing out.

Download

com.ziesemer.httpEchoServer is available on ziesemer.dev.java.net under the GPL license, complete with source code, a compiled .jar, generated JavaDocs, and JUnit tests. Download the com.ziesemer.httpEchoServer-*.zip distribution from here. Please report any bugs or feature requests on the java.net Issue Tracker.

Friday, October 30, 2009

Executing JavaScript from HTML AJAX

AJAX stands for "asynchronous JavaScript and XML". However, the use of XMLHttpRequest is not limited to XML, with JSON being a popular alternative. This post will focus on another alternative: standard HTML. Using HTML may be necessary when the HTML is already being generated server-side, with no available XML or JSON equivalent. If the response is to be viewable by the user, HTML may already be the desired result, and can avoid additional complexities of transforming XML or JSON into HTML.

Simply retrieving HTML from an AJAX request and inserting into the existing DOM of a web page is relatively easy. For my purpose, I used the .responseText attribute of the response on the XMLHttpRequest. Once the HTML string is obtained, it can be injected into an existing DOM node in the document by assigning it to the node's .innerHTML property.

Another possibility to consider is if the returned HTML is guaranteed to be valid XHTML. In this case, it may be possible to continue treating the response as XML by using the .responseXML attribute, importing the nodes with the document's .importNode(...) method, and then appending the imported copy to an existing DOM node in the document.

In either case, there is the complication of scripts that may be embedded within the returned HTML. These embedded scripts are not reliably executed, especially not consistently across web browsers. This is particularly complicated by Internet Explorer's handling of the <script/> tags as "NoScope" elements, which is detailed by Microsoft's John Sudds in MSDN's .innerHTML property documentation and referenced forum thread. In summary, embedded scripts should have the "defer" property set to true, and must follow a scoped element (e.g. <input type="hidden"/>) to work with IE. I have seen several pages that suggest setting the .innerHTML on a cloned node, including the above provisions for IE, then inserting back into the HTML DOM using methods such as .appendChild(...) or .replaceChild(...) to work across browsers. However, even this is not guaranteed to always work, and fails with Safari in particular.

Current Working Solution:

Rather than using a combination of browser "hacks" and/or browser detection, this solution instead recognizes and assumes that embedded scripts will not be executed automatically by the browser. After inclusion into the document, any embedded scripts are efficiently searched for and executed through JavaScript's eval(...) method. Shown below is an example / test case:

HTML source code: (equivalent to using this page's View/Source)

<script type="text/javascript">
  // http://blogger.ziesemer.com/2007/10/respecting-javascript-global-namespace.html
  if(!com) var com={};
  if(!com.ziesemer) com.ziesemer={};
  if(!com.ziesemer.demo) com.ziesemer.demo={};
  
  com.ziesemer.demo.htmlAjax = function(){
    var pub = {};
    var contentDiv, oldContent;
    
    var html = "<div>New 1st-level dynamic text, presumably from AJAX response.</div>"
      + "<div id=\"com.ziesemer.demo.htmlAjax.middle\">More 1st-level dynamic text that should be replaced.</div>"
      + "<div>More 1st-level dynamic text, presumably from an AJAX response.</div>"
      + "<script type=\"text/javascript\">"
      + "document.getElementById(\"com.ziesemer.demo.htmlAjax.middle\").innerHTML = "
      + "\"2nd-level dynamic text, controlled by JavaScript presumably from an AJAX response.\";"
      // http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/
      + "<" + "/script>";
    
    pub.run = function(){
      contentDiv = document.getElementById("com.ziesemer.demo.htmlAjax.output");
      oldContent = contentDiv.innerHTML;
      contentDiv.innerHTML = html;
      
      var scripts = contentDiv.getElementsByTagName("script");
      // .text is necessary for IE.
      for(var i=0; i < scripts.length; i++){
        eval(scripts[i].innerHTML || scripts[i].text);
      }
    };
    
    pub.reset = function(){
      if(contentDiv && oldContent){
        contentDiv.innerHTML = oldContent;
      }
    };
    
    return pub;
  }();
</script>

<div style="border:1px solid; padding:0.5em;">
  <div><b>Dynamic test area:</b></div>
  <div id="com.ziesemer.demo.htmlAjax.output" style="border:1px solid;">
    Original static HTML text that should be replaced.
  </div>
  <p>
    <input type="button" value="Go!" onclick="com.ziesemer.demo.htmlAjax.run();"/>
    <input type="button" value="Reset" onclick="com.ziesemer.demo.htmlAjax.reset();"/>
  </p>
  
  <div>
    <b>Expected result:</b> (of what the above should appear as after the &quot;Go!&quot; button is clicked)
  </div>
  <div style="border:1px solid;">
    <div>New 1st-level dynamic text, presumably from AJAX response.</div>
    <div>2nd-level dynamic text, controlled by JavaScript presumably from an AJAX response.</div>
    <div>More 1st-level dynamic text, presumably from an AJAX response.</div>
  </div>
</div>

Demo:

Dynamic test area:
Original static HTML text that should be replaced.

Expected result: (of what the above should appear as after the "Go!" button is clicked)
New 1st-level dynamic text, presumably from AJAX response.
2nd-level dynamic text, controlled by JavaScript presumably from an AJAX response.
More 1st-level dynamic text, presumably from an AJAX response.

This was tested on Mozilla Firefox 3.5; Internet Explorer 6, 7, and 8; Google Chrome 3; and Apple Safari 4 for Windows. If both blocks of above text don't match after clicking "Go!", you've probably found an issue.

A few things to note:

  • This does not work with externally sourced scripts, i.e., those with a "src" attribute. This should normally not be a concern, as there shouldn't be any reason why these types of scripts can't be included in the page before the AJAX call is made. However, if supporting this is necessary, YUI's Get Utility will probably prove useful.
  • As with most similar approaches, this does not work with scripts that utilize document.write(...). If document.write is used in this method, it will likely result in all existing content of the page being removed. Again, this should normally not be a concern, as use of document.write should generally be discouraged and avoided for a number of reasons. The suggested alternative is to use existing DOM elements as placeholders, as shown in my example above.
  • Note the use of the "broken" closing </script> tag. This would not be necessary if the HTML string was actually obtained externally, e.g. from an AJAX call. It is only necessary as it is contained within a JavaScript string literal (as in the above example), as excellently described by Walt Stoneburner in his blog at http://www.wwco.com/~wls/blog/2007/04/25/using-script-in-a-javascript-literal/ (2007-04-25).
  • Not really specific to this example, but the for loop around the returned "scripts" collection would ideally be replaced by Array's forEach(...). Note that getElementsByTagName(...) returns a live node collection rather than a true Array, so forEach has to be borrowed from Array.prototype, e.g.:
    Array.prototype.forEach.call(contentDiv.getElementsByTagName("script"), function(script){
      // .text is necessary for IE.
      eval(script.innerHTML || script.text);
    });
    However, .forEach isn't supported until JavaScript 1.6. Firefox 3.5 supports JS 1.8.1. Chrome 1.0 and Safari 3.2 support JS 1.7. Even version 8 of IE still only supports JS 1.5!

Tuesday, September 22, 2009

MarkUtils-Codec: Base64, URL, and other byte/char conversions

This is an overdue introduction of my latest addition to MarkUtils. MarkUtils-Codec could be considered a high-performance replacement for Apache Commons Codec. Like Commons Codec, this implementation has support for Base64, URL (a.k.a. Percent, and covered previously), and Hexadecimal encodings and decodings. Also like Commons Codec, this implementation utilizes a number of interfaces that allow various codecs to be used interchangeably. Unlike Commons Codec, this implementation is designed to be higher performing, as it is written for streaming use with the Buffer classes. The most significant advantage to this design is lower memory requirements and usage, especially when working with longer lengths of data.

MarkUtils-Codec is really a follow-up to one of my previous posts, Improving URLEncoder/URLDecoder Performance in Java. While the API I proposed and sample code I provided solved an immediate need, the lack of proper interfaces made it difficult to replace with other codecs, such as Base64. The options to plug in to other standard streaming classes were also limited. For example, there was no clear way to create an InputStream that would read decoded data from encoded data. This library is meant as a complete replacement, as I have placed the "urlCodec" library in archival status.

Until I have a suitable place to host the Javadocs online, please reference them in the downloads available at ziesemer.dev.java.net.

The highest-level API interface is com.ziesemer.utils.codec.ICoder. Verbatim from the Javadoc, this is the "Base API for high-performance encoding and decoding between various Buffers. Supports conversions between ByteBuffers and CharBuffers through the IByteToCharEncoder and ICharToByteDecoder child interfaces. This API is similar in design to CharsetEncoder and CharsetDecoder."

Do note that the relation to the Charset classes may seem a bit backwards. When a character set is decoded, the input is bytes and the output is characters. The purpose of this library is to encode any data (as bytes) into character data that can safely be sent through various non-byte transports, e.g. HTTP forms. For this purpose, decoding takes characters as input and produces bytes as output.
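The difference in direction can be seen side by side using only JDK classes. This is a sketch, not MarkUtils-Codec code: java.util.Base64 and StandardCharsets (both from later Java releases) stand in for the library's codecs, and the class and method names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class using only JDK classes (not MarkUtils-Codec).
class CodecDirectionDemo {

  // Charset direction: DEcoding consumes bytes and produces characters.
  static String charsetDecode(byte[] bytes) {
    CharBuffer chars = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes));
    return chars.toString();
  }

  // Transport-codec direction: ENcoding consumes bytes and produces characters...
  static String transportEncode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  // ...and decoding consumes those characters and produces the original bytes.
  static byte[] transportDecode(String encoded) {
    return Base64.getDecoder().decode(encoded);
  }

  public static void main(String[] args) {
    byte[] raw = {(byte) 0xFF, 0x00, 0x10};
    String encoded = transportEncode(raw);
    System.out.println(encoded); // prints "/wAQ"
    System.out.println(Arrays.equals(raw, transportDecode(encoded)));
    System.out.println(charsetDecode(new byte[] {0x4F, 0x4B})); // prints "OK"
  }
}
```

In both cases the "encoded" side is whatever the transport can carry: bytes for a charset, characters for a codec such as Base64 or URL encoding.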

Here is a simple example of supported direct usage, taking no advantage of streaming capabilities. This is included as one of the JUnit tests within the com.ziesemer.utils.codec.DemoTest class:

/**
 * Simplest usage, taking no advantage of streaming capabilities.
 */
@Test
public void testDirectSimple() throws Exception{
  IByteToCharEncoder encoder = new URLEncoder();
  ICharToByteDecoder decoder = new URLDecoder();
  
  // Random test data.
  byte[] rawData = new byte[1 << 10];
  new Random().nextBytes(rawData);
  
  // Encode.
  CharBuffer cbOut = encoder.code(ByteBuffer.wrap(rawData));
  
  // Decode (round-trip).
  ByteBuffer bbOut = decoder.code(cbOut);
  
  // Verify.
  byte[] result = new byte[bbOut.remaining()];
  bbOut.get(result);
  Assert.assertArrayEquals(rawData, result);
}

A number of input/output wrappers are also included in the "com.ziesemer.utils.codec.io" package, allowing for transparent use as a standard Java IO reader, writer, or stream. The signatures of the required constructors are also shown. Each class also provides an alternate constructor that can be used to fine-tune the read buffer size.

  • CharDecoderInputStream(ICharToByteDecoder decoder, Reader reader)

    Reads raw bytes from encoded characters. Counter-part to CharEncoderReader. This is a pull-interface; CharDecoderWriter is the equivalent push-interface.

    Can be adapted to read characters to the consumer (instead of raw bytes) by wrapping in an InputStreamReader. This is only valid if the decoded form of the data is known to only contain valid characters. Can also be adapted to read bytes from a provider (instead of characters) by using an InputStreamReader as the Reader.

  • CharDecoderWriter(ICharToByteDecoder decoder, OutputStream outputStream)

    Accepts encoded characters, and writes the raw bytes. Counter-part to CharEncoderOutputStream. This is a push-interface; CharDecoderInputStream is the equivalent pull-interface.

    Can be adapted to accept bytes from a provider (instead of characters) by wrapping in an OutputStreamWriter.

  • CharEncoderOutputStream(IByteToCharEncoder encoder, Writer writer)

    Accepts raw bytes, and writes the encoded characters. Counter-part to CharDecoderWriter. This is a push-interface; CharEncoderReader is the equivalent pull-interface.

    Can be adapted to write bytes to the consumer (instead of characters) by using an OutputStreamWriter as the Writer. Can also be adapted to accept characters from the provider (instead of raw bytes) by wrapping in an OutputStreamWriter.

  • CharEncoderReader(IByteToCharEncoder encoder, InputStream reader)

    Reads encoded characters from raw bytes. Counter-part to CharDecoderInputStream. This is a pull-interface; CharEncoderOutputStream is the equivalent push-interface.

    Can be adapted to read characters to the consumer (instead of raw bytes) by wrapping in an InputStreamReader.
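The pull/push distinction used in the list above is the same one the JDK's own charset wrappers follow. As a rough illustration, the sketch below uses InputStreamReader (pull) and OutputStreamWriter (push) as stand-ins; it does not use any MarkUtils-Codec classes, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; JDK charset wrappers stand in for the codec wrappers.
class PullPushDemo {

  // Pull: the consumer asks for characters, and the wrapper reads bytes on demand.
  static String pull(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    try (Reader reader = new InputStreamReader(
        new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
      char[] buf = new char[64];
      int n;
      while ((n = reader.read(buf)) != -1) {
        sb.append(buf, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sb.toString();
  }

  // Push: the producer hands over characters, and the wrapper writes bytes to the sink.
  static byte[] push(String text) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
      writer.write(text);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    byte[] bytes = push("somePostKey=somePostValue");
    System.out.println(bytes.length);
    System.out.println(pull(bytes));
  }
}
```

Which side to use mostly depends on who drives the loop: a pull wrapper fits code that reads until end-of-stream, while a push wrapper fits code that is handed data as it is produced.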

Also included are a number of "character lists" (in the com.ziesemer.utils.codec.charLists package), particularly to support the different Base64 variations.

Please refer to the included JUnit tests (currently 161) for usage examples.

Download

com.ziesemer.utils.codec is available on ziesemer.dev.java.net under the GPL license, complete with source code, a compiled .jar, generated JavaDocs, and a suite of JUnit tests. Download the com.ziesemer.utils.codec-*.zip distribution from here. Please report any bugs or feature requests on the java.net Issue Tracker.

Monday, September 7, 2009

"networking restart" issues, VLANs under Ubuntu

For this post, I'm using Ubuntu Linux 9.04 / "Jaunty Jackalope". This is somewhat a follow-up to my Ubuntu Linux Router Upgrade Project.

Errors during "/etc/init.d/networking restart"

First, assuming a statically-configured LAN for server use, with NetworkManager disabled:

/etc/network/interfaces:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
  address 192.168.1.1
  netmask 255.255.255.0

$ sudo /etc/init.d/networking restart
 * Reconfiguring network interfaces...                                   [ OK ]

The networking simply restarts, without showing any errors or warnings. However, this quickly changes once an additional network adapter is configured. This could be an additional physical adapter, but for my purposes, I had added a virtual LAN (VLAN) to work with my Dell PowerConnect 2716. For Ubuntu, this simply requires installing the "vlan" package, and defining the virtual LAN with an additional entry in /etc/network/interfaces. Note that this is done using a "<base name>.<VLAN ID>" naming scheme. As shown below, I'm adding an interface for VLAN 2:

auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
  address 192.168.1.1
  netmask 255.255.255.0

auto eth0.2
iface eth0.2 inet static
  address 192.168.2.1
  netmask 255.255.255.0

Please add a comment if you can find any official documentation for this functionality of the interfaces file, as I can't. However, it appears to be driven by "/etc/network/if-pre-up.d/vlan" and "/etc/network/if-post-down.d/vlan".

This also requires the "8021q" module, but at least in Jaunty, it is already available by default, as shown by "lsmod | grep 8021q".

This is where I started running into errors:

$ sudo /etc/init.d/networking restart
 * Reconfiguring network interfaces...
RTNETLINK answers: No such process
Removed VLAN -:eth0.2:-
 * if-up.d/mountnfs[eth0]: waiting for interface eth0.2 before doing NFS mounts
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 2 to IF -:eth0:-
                                                                         [ OK ]

This doesn't actually cause any issues, but it isn't right and should be fixed. I don't see any way to make "networking restart" give more verbose output. Additionally, nothing helpful appears in the logs. However, most of what the networking script does is call "ifdown -a --exclude=lo" and "ifup -a", where "-a" is "affect all interfaces marked auto". Fortunately, repeating this with "-v" added for verbose mode yields some details:

$ sudo ifdown -av --exclude=lo && sudo ifup -av
Configuring interface eth0=eth0 (inet)
run-parts --verbose /etc/network/if-down.d
run-parts: executing /etc/network/if-down.d/avahi-autoipd
run-parts: executing /etc/network/if-down.d/wpasupplicant

ifconfig eth0 down
run-parts --verbose /etc/network/if-post-down.d
run-parts: executing /etc/network/if-post-down.d/avahi-daemon
run-parts: executing /etc/network/if-post-down.d/bridge
run-parts: executing /etc/network/if-post-down.d/vlan
run-parts: executing /etc/network/if-post-down.d/wireless-tools
run-parts: executing /etc/network/if-post-down.d/wpasupplicant
Configuring interface eth0.2=eth0.2 (inet)
run-parts --verbose /etc/network/if-down.d
run-parts: executing /etc/network/if-down.d/avahi-autoipd
RTNETLINK answers: No such process
run-parts: executing /etc/network/if-down.d/wpasupplicant

ifconfig eth0.2 down
run-parts --verbose /etc/network/if-post-down.d
run-parts: executing /etc/network/if-post-down.d/avahi-daemon
run-parts: executing /etc/network/if-post-down.d/bridge
run-parts: executing /etc/network/if-post-down.d/vlan
Removed VLAN -:eth0.2:-
run-parts: executing /etc/network/if-post-down.d/wireless-tools
run-parts: executing /etc/network/if-post-down.d/wpasupplicant
Configuring interface eth0=eth0 (inet)
run-parts --verbose /etc/network/if-pre-up.d
run-parts: executing /etc/network/if-pre-up.d/bridge
run-parts: executing /etc/network/if-pre-up.d/dhclient3-apparmor
run-parts: executing /etc/network/if-pre-up.d/vlan
run-parts: executing /etc/network/if-pre-up.d/wireless-tools
run-parts: executing /etc/network/if-pre-up.d/wpasupplicant

ifconfig eth0 192.168.1.1 netmask 255.255.255.0       up

run-parts --verbose /etc/network/if-up.d
run-parts: executing /etc/network/if-up.d/avahi-autoipd
run-parts: executing /etc/network/if-up.d/avahi-daemon
run-parts: executing /etc/network/if-up.d/ip
run-parts: executing /etc/network/if-up.d/mountnfs
 * if-up.d/mountnfs[eth0]: waiting for interface eth0.2 before doing NFS mounts
run-parts: executing /etc/network/if-up.d/ntpdate
run-parts: executing /etc/network/if-up.d/wpasupplicant
Configuring interface eth0.2=eth0.2 (inet)
run-parts --verbose /etc/network/if-pre-up.d
run-parts: executing /etc/network/if-pre-up.d/bridge
run-parts: executing /etc/network/if-pre-up.d/dhclient3-apparmor
run-parts: executing /etc/network/if-pre-up.d/vlan
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 2 to IF -:eth0:-
run-parts: executing /etc/network/if-pre-up.d/wireless-tools
run-parts: executing /etc/network/if-pre-up.d/wpasupplicant

ifconfig eth0.2 192.168.2.1 netmask 255.255.255.0       up

run-parts --verbose /etc/network/if-up.d
run-parts: executing /etc/network/if-up.d/avahi-autoipd
run-parts: executing /etc/network/if-up.d/avahi-daemon
run-parts: executing /etc/network/if-up.d/ip
run-parts: executing /etc/network/if-up.d/mountnfs
run-parts: executing /etc/network/if-up.d/ntpdate
run-parts: executing /etc/network/if-up.d/wpasupplicant

So the error is apparently coming from "/etc/network/if-down.d/avahi-autoipd". (Avahi is a free Zero configuration networking (zeroconf) implementation.) I temporarily edited the "avahi-autoipd" script to add "-x" to enable debugging by changing the first line to "#!/bin/sh -ex". What is happening is that this script first checks for the existence of a route matching "169.254.0.0/16", but doesn't check for what interface it is on. If it finds a matching route, even on another interface, it makes a call to delete the route, but only for the current interface. When this combination doesn't exist, the "RTNETLINK answers: No such process" message is given.

Resolution: The scripts could just be fixed to include the interface in the search. I filed a bug report with Ubuntu for this: #425854. However, a "server"-type system shouldn't be handling this kind of automatic network routing anyway, so I just disabled these scripts:

sudo chmod -x /etc/network/if-up.d/avahi-autoipd
sudo chmod -x /etc/network/if-down.d/avahi-autoipd

Shown fixed:

$ sudo /etc/init.d/networking restart
 * Reconfiguring network interfaces...
Removed VLAN -:eth0.2:-
 * if-up.d/mountnfs[eth0]: waiting for interface eth0.2 before doing NFS mounts
Set name-type for VLAN subsystem. Should be visible in /proc/net/vlan/config
Added VLAN with VID == 2 to IF -:eth0:-
                                                                         [ OK ]

DHCP server binding issues

Another issue I noticed with running "/etc/init.d/networking restart" is that the DHCP server ("dhcpd" / "dhcp3-server") quit responding to requests. (Previous posting on dhcp3-server.) Using Wireshark, I noticed that DHCP requests were still being received, but no responses were being sent. Nothing applicable is shown in any of the logs, other than that dhcpd isn't even acknowledging the requests after the interface is brought back up.

Tools such as "lsof" and "netstat" show that dhcpd is still properly bound to the interfaces. Other applications such as the SSH server (sshd) don't have this issue, and continue accepting connections even after the interface goes down and comes back up, so I'm not sure why this is an issue for the DHCP server. However, I can't get things to resume working properly short of restarting the DHCP server using "sudo /etc/init.d/dhcp3-server restart". As I'm starting to work with IPv6, I noticed the exact same issue with radvd.

I wasn't exactly sure where the best place was to automate these restarts. I could probably add a script to "/etc/network/if-up.d/", but would then have to include a check that I wasn't unnecessarily restarting the servers for every interface, as they should only be concerned with the interface they are serving requests on, e.g. eth0. For now, I just added these commands to "/etc/network/interfaces":

# ...

auto eth0
iface eth0 inet static
  address 192.168.1.1
  netmask 255.255.255.0
  post-up /etc/init.d/dhcp3-server restart
  post-up /etc/init.d/radvd restart

# ...
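For comparison, the "/etc/network/if-up.d/" alternative mentioned above could be sketched roughly as follows. This is only an illustration under a few assumptions: the script name, the SERVED_IFACE variable, and the two function names are made up for this example. It relies on ifupdown exporting the IFACE environment variable to hook scripts.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/network/if-up.d/restart-servers
# (the file name is an assumption, not an existing script).

# Interface that dhcpd and radvd serve requests on; adjust as needed.
SERVED_IFACE="eth0"

# Succeeds only when the interface passed as $1 is the served one.
should_restart() {
    [ "$1" = "$SERVED_IFACE" ]
}

# Restart the daemons that stop responding after the interface is cycled.
restart_servers() {
    /etc/init.d/dhcp3-server restart
    /etc/init.d/radvd restart
}

# ifupdown sets IFACE to the interface currently being brought up,
# so other interfaces such as eth0.2 do not trigger a restart.
if should_restart "${IFACE:-}"; then
    restart_servers
fi
```

The script would need to be executable, and its name should avoid dots, since run-parts skips such file names by default. The "post-up" lines shown above achieve the same result with less machinery, which is why I went with them for now.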

Wednesday, August 19, 2009

Scripted hiding of Windows Updates under Vista

Similar to my last post, here is another UI issue with Windows Vista. Fortunately, this time I have a solution to offer.

Starting with Windows Vista, the "Windows Update" functionality is provided through Control Panel rather than Internet Explorer. In both versions, there is the ability to hide updates. While hidden updates can easily be restored, this feature allows for ignoring unnecessary updates so that they don't continually count towards the number of available updates that are displayed. For me, this includes the 34 "Windows Vista Ultimate Language Packs" that are currently available. Unfortunately, multiple-selection is not enabled in the "View available updates" dialog. There are checkboxes, including a checkbox on the header that can be used to select/unselect all shown updates, but the checkbox selections are used for the "Install" button only. The other options available from the context menu - "View details", "Copy details", and "Hide update" - can currently be applied only one-at-a-time. This means that hiding just the 34 language packs would require no fewer than 68 clicks!

Originally, I assumed that these hidden updates and other preferences would be stored in the Windows registry, or possibly in a file on the file system. They are in a file, but a database-type file that isn't directly editable: %SystemRoot%\SoftwareDistribution\DataStore\DataStore.edb. Fortunately, there is a comprehensive Windows API for viewing and editing this information, and it is even easily available to scripting through the Windows Scripting Host and languages such as JScript. Microsoft's reference is located on MSDN: Windows Update Agent API.

Here is my resulting script that automatically hides all the "Windows Vista Ultimate Language Packs":

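// Create a Windows Update Agent session and a searcher for it.
var updateSession = WScript.CreateObject("Microsoft.Update.Session");
var updateSearcher = updateSession.CreateUpdateSearcher();
// Don't go online; search using the locally cached update data.
updateSearcher.Online = false;

// Find all not-yet-installed updates in the language pack category.
var searchResult = updateSearcher.Search("CategoryIDs Contains 'a901c1bd-989c-45c6-8da0-8dde8dbb69e0' And IsInstalled=0");

// Mark each matching update as hidden.
for(var i=0; i<searchResult.Updates.Count; i++){
  var update = searchResult.Updates.Item(i);
  WScript.echo("Hiding update: " + update.Title);
  update.IsHidden = true;
}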

If you're not familiar with WSH, this can be executed simply by saving it as a *.js file, then double-clicking it. A better option is to execute the file from a command line with cscript. This will cause the output messages to be written to the standard output, instead of popping up a message box that must be acknowledged for each message. Also, since this script is making administrative changes to the system, it must be executed as an administrator.

"a901c1bd-989c-45c6-8da0-8dde8dbb69e0" is the ICategory.CategoryID for "Windows Vista Ultimate Language Packs". (This ICategory happens to have a .Type of "Product".) A similar script can easily be used to perform operations on other sets of updates by simply modifying the search query.

For the above example, the changes can be reverted by updating the script to execute update.IsHidden = false; (instead of true), then re-executing the script. Alternatively, here the Windows Vista GUI works a little better: By clicking on "Restore hidden updates" from the side panel in Windows Update, the "Restore" button operates on the checkbox selection - allowing all hidden updates to quickly be restored with 2 clicks if desired.

Finally, here is an extended example that doesn't change anything, but displays some of the many details that are available through this API. First, it displays all the updates grouped and nested by category. Note that some updates belong to more than one category. It then displays all available updates in a "flat" view, without using categories.

var updateSession = WScript.CreateObject("Microsoft.Update.Session");
var updateSearcher = updateSession.CreateUpdateSearcher();
updateSearcher.Online = false;

var searchResult = updateSearcher.Search("IsInstalled=1 or IsInstalled=0");

var describeCategory = function(cat, depth){
  var pad = new Array(depth + 1).join("  ");
  WScript.echo(pad + depth + ": " + cat + ", " + cat.CategoryID + ", " + cat.Name + ", " + cat.Type);

  for(var i=0; i<cat.Children.Count; i++){
    var child = cat.Children.Item(i);
    describeCategory(child, depth + 1);
  }
  
  for(var i=0; i<cat.Updates.Count; i++){
    var update = cat.Updates.Item(i);
    WScript.echo(pad + "  " + describeUpdate(update, pad + "  "));
  }
};

var describeUpdate = function(update, pad){
  var u = update;
  var np = "\n" + (pad || "") + "  ";
  return u.Title
    + np + "Type: " + u.Type
    //+ np + "Description: " + u.Description
    + np + "IsInstalled: " + u.IsInstalled
    + np + "IsDownloaded: " + u.IsDownloaded
    + np + "IsHidden: " + u.IsHidden
    + np + "AutoSelectOnWebSites: " + u.AutoSelectOnWebSites;
};

for(var i=0; i<searchResult.RootCategories.Count; i++){
  var category = searchResult.RootCategories.Item(i);
  describeCategory(category, 1);
}

WScript.echo("\n");

for(var i=0; i<searchResult.Updates.Count; i++){
  var update = searchResult.Updates.Item(i);
  WScript.echo(describeUpdate(update));
}

According to the IUpdateSearcher.Search documentation, the default search criteria is "IsInstalled = 0 and IsHidden = 0". Unfortunately, there doesn't seem to be a simple option to short-circuit the evaluator to just return all available updates, e.g. "" or "1=1". So for now, "IsInstalled=1 or IsInstalled=0" results in all updates being displayed. The only other note concerning the above example is that the "Description" line is commented out in the describeUpdate function only because it can be rather verbose and make the overall output difficult to read. Feel free to uncomment it to view the details, and to add lines for the other properties available from IUpdate.