posted by Tony Bourke on Mon 23rd Feb 2004 21:54 UTC

"SPARC Optimizations With GCC, Page 3/3"
SSH is Slow, but Why?
On many of the operating systems I evaluated, NetBSD 1.6.1 among them, logging in via SSH was inordinately slow. It could take 3 or more seconds to get a password prompt. I knew it wasn't the hardware, as logging into Solaris via SSH would return a password prompt in less than a second. So what was the culprit? A Google search turned up these two items of particular note:

  • http://sparclinux.net/faq/cache/22.html
  • http://marc.theaimsgroup.com/?l=linux-sparc&m=101663078400634&w=2

    When I ran the OpenSSL speed test on NetBSD, I got extremely poor performance:

    OpenSSL 0.9.6g 9 Aug 2002
    built on: NetBSD 1.6.1
    options:bn(32,32) md2(int) rc4(ptr,int) des(ptr,risc1,16,int) blowfish(idx) 
    compiler: gcc version 2.95.3 20010315 (release) (NetBSD nb3)
                      sign    verify    sign/s verify/s
    rsa  512 bits   0.0248s   0.0022s     40.2    449.9
    rsa 1024 bits   0.1279s   0.0076s      7.8    131.7
    rsa 2048 bits   0.9217s   0.0276s      1.1     36.2
    rsa 4096 bits   6.4647s   0.0928s      0.2     10.8
                      sign    verify    sign/s verify/s
    dsa  512 bits   0.0224s   0.0281s     44.7     35.5
    dsa 1024 bits   0.0750s   0.0927s     13.3     10.8
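
To compare one build against another, it helps to pull a single figure out of the speed output. This is a minimal sketch, assuming the row layout shown above and assuming the test was run with something like "openssl speed rsa dsa" (the exact command is not given here). The name signs_per_sec is a hypothetical helper, not part of OpenSSL.

```shell
# signs_per_sec ALGO BITS: read "openssl speed" output on stdin and print
# the sign/s column for the matching row (for example "rsa 1024").
# Hypothetical helper; it assumes each result row looks like
#   <algo> <bits> bits <sign time> <verify time> <sign/s> <verify/s>
signs_per_sec() {
    awk -v algo="$1" -v bits="$2" \
        '$1 == algo && $2 == bits && $3 == "bits" { print $(NF - 1) }'
}

# Example invocation (not run here):
#   openssl speed rsa dsa 2>/dev/null | signs_per_sec rsa 1024
```

Fed the NetBSD results above, the helper prints the sign/s value of the requested row, which can then be compared with the same row from an optimized build.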
    

    This was even slower than my OpenSSL 0.9.7c tests with the V7 instruction set on Solaris 9. Performing a "/usr/bin/true" through SSH on NetBSD showed the lengthy delay:

    > time ssh 192.168.0.19 "/usr/bin/true"
    0:02.79
    Almost 3 seconds! And it wasn't just NetBSD, either. A few others suffered the same problem.

    To fix this, I compiled OpenSSL 0.9.6l from NetBSD's pkgsrc and compiled OpenSSH 3.7.1p2. I made sure to include "-mcpu=ultrasparc", and ran the "/usr/bin/true" test again.

    > time ssh tony@192.168.0.19 "/usr/bin/true"
    0:01.35

    I was able to cut the time almost in half with that optimization.

    I ran the same test on Solaris 9, using OpenSSH 3.7.1p2 with OpenSSL 0.9.7c libraries compiled for -mcpu=v7 and for -mcpu=ultrasparc.

    For -mcpu=v7, the login took just over a second and a half.

    > time ssh tony@192.168.0.19 "/bin/true"
    0:01.56

    With -mcpu=ultrasparc, however, it took less than a second.

    > time ssh tony@192.168.0.19 "/bin/true"
     0:00.95
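
The timings above translate into speedup factors with a line of arithmetic. This is a small sketch that uses awk for the floating-point division. The name speedup is a hypothetical helper.

```shell
# speedup BEFORE AFTER: print how many times faster the second run was.
# Hypothetical helper; both arguments are wall-clock times in seconds.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

speedup 2.79 1.35   # NetBSD: stock install vs. rebuild with -mcpu=ultrasparc
speedup 1.56 0.95   # Solaris 9: -mcpu=v7 vs. -mcpu=ultrasparc
```

With the figures measured above, this works out to roughly 2.07 times faster on NetBSD and 1.64 times faster on Solaris 9.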

    Where To Add The Optimizations
    There are a few ways to add optimizations at compile time. For many applications, you can go into the Makefile and look for the CFLAG entry, such as this for OpenSSL 0.9.6l on NetBSD 1.6.1:
    CFLAG= -fPIC -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -O2 -Wall
    Here is where I would add -mcpu=ultrasparc, probably at the end:
    CFLAG= -fPIC -DDSO_DLFCN -DHAVE_DLFCN_H -DTERMIOS -O2 -Wall -mcpu=ultrasparc
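
The same edit can be scripted rather than made by hand. This is a minimal sketch, assuming the Makefile has a single line starting with "CFLAG=" as in the OpenSSL example above, and that the flag contains no "/" or regular-expression metacharacters. The name add_cflag is a hypothetical helper.

```shell
# add_cflag MAKEFILE FLAG: append FLAG to the "CFLAG=" line of MAKEFILE.
# Hypothetical helper; it does nothing if the flag is already on that line.
add_cflag() {
    makefile=$1
    flag=$2
    # Skip the edit if the CFLAG line already carries the flag.
    if grep -q -- "^CFLAG=.*$flag" "$makefile"; then
        return 0
    fi
    # Write to a temporary file first, since in-place sed is not portable.
    sed "/^CFLAG=/ s/\$/ $flag/" "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}

# Example invocation (not run here):
#   add_cflag Makefile -mcpu=ultrasparc
```

Running it a second time leaves the file unchanged, so it is safe to keep in a build script.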
    For applications like MySQL, there are several subdirectories with their own Makefiles, all generated when the configure script is run. Editing just the top-level Makefile probably will not affect the subdirectories, so there needs to be another way.

    Often, these applications will accept the environment variables CFLAGS (for the C compiler flags) and CXXFLAGS (for the C++ compiler flags).

    export CFLAGS="-O3 -mcpu=ultrasparc"
    Running that before you run the configure script will add those flags. This excerpt from the output of "configure --help" for MySQL 4.0.17 shows the various compiler-related environment variables it accepts:
    CC          C compiler command
    CFLAGS      C compiler flags
    LDFLAGS     linker flags, e.g. -L[lib dir] if you have libraries
                in a nonstandard directory [lib dir]
    CPPFLAGS    C/C++ preprocessor flags, e.g. -I[include dir]
                if you have headers in a nonstandard directory [include dir]
    CXX         C++ compiler command
    CXXFLAGS    C++ compiler flags
    CPP         C preprocessor
    
    This is common for the more complex open source applications.
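
Instead of exporting the variables for the whole shell session, they can be set for a single configure run. This is a minimal sketch, assuming a configure script that honors CFLAGS and CXXFLAGS as in the excerpt above. The function name configure_with_flags and the CONFIGURE override variable are hypothetical, introduced only for illustration.

```shell
# configure_with_flags FLAGS [configure arguments...]
# Hypothetical wrapper: runs the configure script with the same optimization
# flags for both the C and the C++ compiler. The CONFIGURE variable (also
# hypothetical) overrides which script is run and defaults to ./configure.
configure_with_flags() {
    flags=$1
    shift
    # Assignments placed before a command apply to that one command only.
    CFLAGS="$flags" CXXFLAGS="$flags" "${CONFIGURE:-./configure}" "$@"
}

# Example invocation (not run here):
#   configure_with_flags "-O3 -mcpu=ultrasparc" --prefix=/usr/local/mysql
```

Scoping the flags to one command keeps them from leaking into unrelated builds started later from the same shell.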

    To Optimize or Not To Optimize
    Optimization really depends on what you're compiling. If you're creating a “hello world” application, or compiling ls from GNU's fileutils, you probably don't need to squeeze out every ounce of performance. Characteristics such as how much of the work is mathematical computation versus I/O are factors in the potential benefit.

    Still, the optimizations discussed can have a huge impact on performance on SPARC systems, potentially far more dramatic than comparable optimizations on x86 systems.

    As such, adding -mcpu options for compilation is a good idea for systems that support V8 or higher. Even if you've got a mix of systems, it can very well be worth your time to keep multiple sets of binaries, one for each platform you run.
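
When building one set of binaries per platform, the flag can be chosen from the machine type. This is a minimal sketch, assuming Solaris-style "uname -m" values (sun4u, sun4m, sun4d, sun4c); other operating systems report different strings, so the mapping below is an assumption to adapt. The name mcpu_for_machine is a hypothetical helper.

```shell
# mcpu_for_machine [MACHINE]: print a GCC -mcpu flag for a SPARC machine type.
# Hypothetical helper; MACHINE defaults to the output of "uname -m".
# The mapping assumes Solaris naming and should be adjusted for other systems.
mcpu_for_machine() {
    machine=${1:-$(uname -m)}
    case $machine in
        sun4u)       echo "-mcpu=ultrasparc" ;;  # UltraSPARC systems
        sun4m|sun4d) echo "-mcpu=v8" ;;          # V8-class systems
        sun4c)       echo "-mcpu=v7" ;;          # older V7 systems
        *)           echo "" ;;                  # unknown: use compiler default
    esac
}

# Example invocation (not run here):
#   CFLAGS="-O2 $(mcpu_for_machine)" ./configure --prefix="/opt/$(uname -m)"
```

Installing each build under a prefix named after the machine type, as in the example invocation, is one way to keep the sets of binaries apart.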
